A. Startari
This study examines how syntactic constructions in expense narratives affect misclassification rates in AI-powered corporate ERP systems. We trained transformer-based classifiers on labeled accounting data to predict expense categories and observed that these models frequently relied on grammatical form rather than financial semantics. We extracted syntactic features including nominalization frequency, defined as the ratio of deverbal nouns to verbs; coordination depth, measured as the maximum depth of coordinated clauses; and subordination complexity, expressed as the number of embedded subordinate clauses per sentence. Using SHAP (SHapley Additive exPlanations), the interpretability method introduced by Lundberg and Lee in “A Unified Approach to Interpreting Model Predictions,” Advances in Neural Information Processing Systems 30 (2017): 4765–4774, we found that these structural patterns contribute significantly to false allocations and thus increase the likelihood of audit discrepancies. To mitigate these syntactic biases, we implemented a rule-based debiasing module that reparses each narrative into a standardized fair-syntax transformation, structured around a
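The three feature definitions given in the abstract can be illustrated with a minimal sketch. It is not the study's extraction pipeline: the `Token` structure, the hand-annotated example sentence, the `deverbal` flag, and the set of dependency labels treated as subordinate clauses are all assumptions made for illustration, and a real system would obtain these annotations from a parser.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    text: str
    pos: str                # coarse part-of-speech tag, e.g. "NOUN", "VERB"
    dep: str                # dependency label linking the token to its head
    head: int               # index of the head token; the root points to itself
    deverbal: bool = False  # hypothetical flag: noun derived from a verb


# Assumed label set for subordinate clauses (Universal Dependencies style).
SUBORDINATE_DEPS = {"advcl", "ccomp", "xcomp", "acl", "csubj"}


def nominalization_frequency(sentence: List[Token]) -> float:
    """Ratio of deverbal nouns to verbs."""
    deverbal = sum(1 for t in sentence if t.pos == "NOUN" and t.deverbal)
    verbs = sum(1 for t in sentence if t.pos == "VERB")
    # Assumption: with no verbs present, divide by 1 instead of failing.
    return deverbal / max(verbs, 1)


def coordination_depth(sentence: List[Token]) -> int:
    """Maximum number of nested clause-level 'conj' edges on any path to the root."""
    def conj_edges_to_root(i: int) -> int:
        count, seen = 0, set()
        while sentence[i].head != i and i not in seen:
            seen.add(i)  # guard against malformed (cyclic) annotations
            # Simplification: a verb attached by 'conj' stands for a coordinated clause.
            if sentence[i].dep == "conj" and sentence[i].pos == "VERB":
                count += 1
            i = sentence[i].head
        return count

    return max((conj_edges_to_root(i) for i in range(len(sentence))), default=0)


def subordination_complexity(sentence: List[Token]) -> int:
    """Number of embedded subordinate clauses in one sentence."""
    return sum(1 for t in sentence if t.dep in SUBORDINATE_DEPS)


# Hand-annotated, simplified parse of a hypothetical expense narrative:
# "Reimbursement approved because (the trip was) cancelled and (the hotel) refunded (the) booking."
example = [
    Token("Reimbursement", "NOUN", "nsubj", 1, deverbal=True),
    Token("approved", "VERB", "root", 1),
    Token("because", "SCONJ", "mark", 3),
    Token("cancelled", "VERB", "advcl", 1),
    Token("and", "CCONJ", "cc", 5),
    Token("refunded", "VERB", "conj", 3),
    Token("booking", "NOUN", "obj", 5, deverbal=True),
]

if __name__ == "__main__":
    print(nominalization_frequency(example))
    print(coordination_depth(example))
    print(subordination_complexity(example))
```

Each function returns one number per sentence, so the values can be placed in a feature vector alongside the classifier input and examined with an attribution method such as SHAP.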
Keywords:
Published on 22/07/25
Licence: CC BY-NC-SA