A Systematic Review of AI-Enabled Fraud Detection in Digital Financial Systems (2019–2026)
DOI:
https://doi.org/10.63125/wpj89816
Keywords:
Fraud Detection, Artificial Intelligence, Machine Learning, Digital Finance, Data Analytics
Abstract
This study presented a comprehensive quantitative systematic review of AI-enabled fraud detection models in digital financial systems, synthesizing empirical evidence from 72 peer-reviewed studies published between 2019 and 2026. The analysis focused on evaluating the comparative performance of machine learning, deep learning, ensemble, hybrid, and graph-based approaches using standardized metrics such as accuracy, precision, recall, and F1-score. The findings revealed that ensemble and hybrid models achieved the highest overall performance, with average F1-scores of 0.91 and recall values reaching 0.94, demonstrating superior capability in detecting fraudulent transactions within highly imbalanced datasets where the average fraud rate was approximately 1.8%. Deep learning models showed strong performance with an average accuracy of 0.95 and F1-score of 0.89, particularly in large-scale datasets exceeding one million transactions, which accounted for 41.7% of the reviewed studies. Traditional machine learning models, including random forest and gradient boosting, maintained competitive performance with an average accuracy of 0.93 and F1-score of 0.87, highlighting their continued relevance in structured data environments. The study further identified that advanced feature engineering improved model performance by up to 12%, while imbalance handling techniques increased recall by approximately 13.2%. Graph-based models demonstrated enhanced effectiveness in detecting fraud networks, achieving recall values of up to 0.92 in relational datasets. Statistical analysis confirmed that performance differences between model categories were significant, with effect sizes ranging from 0.52 to 0.88, indicating moderate to strong practical impact. Additionally, real-time detection systems reduced latency by up to 35% while maintaining competitive predictive performance. 
Overall, the study established that fraud detection effectiveness was influenced not only by model selection but also by data characteristics, feature optimization, and evaluation methodologies, providing a robust quantitative foundation for understanding the performance and application of AI-driven fraud detection systems in modern digital financial environments.
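
To make the reported metrics concrete, the following is a minimal sketch of how accuracy, precision, recall, and F1-score are computed from a confusion matrix, and why accuracy alone can mislead at a fraud rate near the 1.8% average noted above. The transaction counts, the "baseline" classifier, and the "detector" classifier are hypothetical illustrations, not data or models from the reviewed studies.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives; label 1 = fraud, 0 = legitimate."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(y_true, y_pred):
    """Return accuracy, precision, recall, and F1; undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical sample: 1,000 transactions, 18 fraudulent (a 1.8% fraud rate).
y_true = [1] * 18 + [0] * 982

# Hypothetical baseline that labels every transaction as legitimate.
baseline_pred = [0] * 1000

# Hypothetical detector: catches 15 of 18 frauds and raises 5 false alarms.
detector_pred = [1] * 15 + [0] * 3 + [1] * 5 + [0] * 977

baseline_scores = scores(y_true, baseline_pred)
detector_scores = scores(y_true, detector_pred)

for name, result in (("baseline", baseline_scores), ("detector", detector_scores)):
    print(name, {metric: round(value, 3) for metric, value in result.items()})
```

In this illustration the baseline reaches an accuracy of 0.982 while detecting no fraud at all (recall of 0), which is why the comparison above leans on recall and F1-score rather than accuracy alone.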
