Predicting Credit Card Fraud using Supervised Machine Learning Methods: Comparative Analysis

Altan, Güner; Zafer, Metin

doi:https://dx.doi.org/10.26650/JEPR1433315

Research Article

With advances in technology, the range of channels through which people and institutions spend on digital platforms has expanded, and payment methods have become easier. A purchase made from even a distant corner of the world is completed instantaneously over the Internet. Although the rapid and global nature of digitisation brings many advantages, ensuring transaction security can be challenging. In this context, banks have become the most crucial institutions mediating safe transactions between customers and sellers. With credit card transactions so prevalent, banks need to determine whether these transactions involve fraud, both for their profitability and for their reputation. Verifying that credit card expenditures, which are dynamic in nature, are genuine expenses of the customer takes serious effort. The aim of this study is therefore to propose a model based on supervised machine learning that uses real, current data and a small number of key features. The objective is to reduce banks’ operational burden and costs in identifying credit card fraud. To this end, the credit card transactions of a state-owned bank in January 2023 were considered, using a dataset comprising 13,050 observations. The Python programming language was used for model building, and machine learning classification algorithms with high discriminatory power were preferred: Random Forest, Logistic Regression, K-Nearest Neighbours, Decision Trees, and Gradient Boosting. The accuracy scores of the algorithms were as follows: Logistic Regression, 92.5%; Decision Tree, 93.1%; K-Nearest Neighbour, 86.4%; Random Forest, 91.8%; and Gradient Boosting, 86.9%. Performance metrics such as precision, recall, F1 score, and ROC-AUC were also examined. Based on their performance, all five algorithms are recommended in this study.

Keywords: Credit card fraud, Machine learning, Supervised learning, Random forest, Gradient boosting

JEL Classification: C60, C69, C81

Denetimli Makine Öğrenmesi Yöntemleri ile Kredi Kartı Sahteciliğini Tahmin Etme: Karşılaştırmalı Analiz

Güner Altan, Metin Recep Zafer

Today, with the development of technology, the range of channels through which individuals and institutions spend via digital platforms has expanded. At the same time, payment methods have become easier with the digital age. A purchase made over the Internet from the other side of the world is completed within seconds. While such rapid and global digitalisation offers many advantages, verifying the security of these expenditures can be correspondingly difficult. In this context, banks have undoubtedly become the most important institution mediating secure transactions between customer and seller. In a period when credit card spending is so intense, determining whether transactions are fraudulent is seen as a problem that banks must solve to protect both their profitability and their reputation. Determining that credit card expenditures, which have a dynamic structure, are genuine expenditures of the bank’s customer requires serious effort. The aim of the study is therefore to propose a model with a small number of features, based on real and current data, using supervised machine learning. In doing so, it aims to lighten the operational and cost burden on banks in detecting credit card fraud. The study is based on the January 2023 credit card transactions of a state-owned bank. The dataset consists of 13,050 observations. Python was used to build the model, and algorithms with high discriminatory power in classification were chosen from among supervised machine learning techniques: Random Forest, Logistic Regression, K-Nearest Neighbour, Decision Trees, and Gradient Boosting. The algorithms’ accuracy scores in predicting credit card fraud are Logistic Regression 92.5%, Decision Trees 93.1%, K-Nearest Neighbour 86.4%, Random Forest 91.8%, and Gradient Boosting 86.9%; performance metrics such as precision, recall, F1 score, and ROC-AUC were also examined. Owing to their performance, all five algorithms are recommended in the study.

Keywords: Credit card fraud, Machine learning, Supervised learning, Random forest, Gradient boosting

EXTENDED ABSTRACT

With the progress of digitalisation, customer behaviour in the financial sector is also being transformed. The simplest example is the evolution of consumption habits: with the expansion of digital platforms, consumption has become easier, faster, and more reliable. Credit card expenditures are foremost among these outlays. With credit card spending so intense, secure shopping has become significantly more important for both customers and banks. While banks broaden their credit card spending networks, they are simultaneously trying to maintain their security infrastructure. As online money transactions have grown, banks have abandoned traditional methods and begun using advanced artificial intelligence-based methods to detect fraudulent credit card transactions.

The introductory section outlines the study and notes that credit card expenditures have increased significantly with globalisation. In parallel with this trend, it notes that banks may face challenges because they rely on conventional methods for fraud detection in credit card transactions.

The second part of the study presents a literature review that briefly introduces machine learning and surveys studies on detecting credit card fraud. The review shows that machine learning methods are generally used to detect credit card fraud and that domestic studies are relatively fewer than international ones.

The third and final part of the study covers the application. This section provides information about the data and its preparation. Artificial intelligence-based machine learning methods were used with the aim of shedding light on banks’ ability to detect credit card fraud. Real data from January 2023, comprising 13,050 observations, were used to build the model. The dataset was obtained from a state-owned bank and has not been shared, for security reasons, because it contains customer information. The dependent variable of the model has two categories: fraud and non-fraud. The independent variables are not detailed; other studies have likewise not shared their independent variables. Unlike previous studies, the proposed model is built using only a small number of features.
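
As a rough illustration of how a two-category dependent variable might be prepared for modelling, the sketch below encodes fraud as 1 and non-fraud as 0 and splits the observations so that both parts keep the same fraud rate. The function name `stratified_split`, the 25% test fraction, and the 10% fraud rate are all assumptions made for the example; the source does not report the class balance or how the data were split.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly its share in both parts."""
    rng = random.Random(seed)

    # Group observation indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Shuffle within each class, then cut the same fraction from every class.
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 1 = fraud, 0 = non-fraud (made-up 10% fraud rate).
labels = [1] * 100 + [0] * 900
train_idx, test_idx = stratified_split(labels, test_fraction=0.25, seed=0)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print("train:", dict(train_counts))
print("test: ", dict(test_counts))
```

Stratifying matters when fraud is the minority class, because a purely random split can leave the test part with too few fraud cases to evaluate on.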

During model setup, algorithms with high predictive performance on classification problems were preferred: Logistic Regression, Decision Tree, K-Nearest Neighbour, Random Forest, and Gradient Boosting. Their accuracy scores were Logistic Regression 92.5%, Decision Tree 93.1%, K-Nearest Neighbour 86.4%, Random Forest 91.8%, and Gradient Boosting 86.9%. The models are analysed in detail and compared, and, unlike previous studies, the ROC-AUC curve and confusion matrix of each model are reported individually among the test performance metrics. The precision and F1 score of each model are also presented.
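
A comparison of this kind can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' code: the bank data are not public, so a synthetic imbalanced dataset from `make_classification` stands in for them, and the hyperparameters, feature scaling, class ratio, and 75/25 split are all assumptions. The scores it prints will therefore not match those reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the (non-public) bank data: label 1 plays the role of fraud.
X, y = make_classification(n_samples=2000, n_features=8, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# The five algorithm families named in the study, with assumed default-like settings.
# Scaling is added for the two models that are sensitive to feature scale.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbour": make_pipeline(StandardScaler(),
                                         KNeighborsClassifier(n_neighbors=5)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)[:, 1]  # estimated probability of class 1
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "roc_auc": roc_auc_score(y_test, y_score),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

for name, m in results.items():
    print(f"{name:20s} acc={m['accuracy']:.3f} prec={m['precision']:.3f} "
          f"rec={m['recall']:.3f} f1={m['f1']:.3f} auc={m['roc_auc']:.3f}")
```

On imbalanced data, accuracy alone can look high even when few fraud cases are caught, which is one reason to read it alongside precision, recall, F1, and ROC-AUC.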

The test results show that the models perform well in predicting credit card fraud. The study thus demonstrates that prediction models can be built successfully with supervised machine learning techniques, and the five algorithms are offered as alternatives. A limitation of the study is that the dataset is not publicly available. In addition, the "transaction time" information was excluded from the dataset because of the risk of anomalies. Future studies could propose a new model by adding this excluded information.
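
If a later study did bring transaction time back in, one common option is to encode the time of day cyclically, so that 23:59 and 00:01 end up close together rather than at opposite ends of a scale. The sketch below is only an assumption about how that could be done; the helper `encode_time_of_day` and the example timestamps are hypothetical, and nothing in the source says how the authors would treat this variable.

```python
import math
from datetime import datetime


def encode_time_of_day(ts):
    """Map a timestamp's time of day to a point (sin, cos) on the unit circle."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    angle = 2 * math.pi * seconds / 86_400  # 86,400 seconds in a day
    return math.sin(angle), math.cos(angle)


# Hypothetical transaction timestamps on either side of midnight.
late = encode_time_of_day(datetime(2023, 1, 15, 23, 59))
early = encode_time_of_day(datetime(2023, 1, 16, 0, 1))
noon = encode_time_of_day(datetime(2023, 1, 16, 12, 0))

# Distance in the encoded space: small across midnight, large between midnight and noon.
print("23:59 vs 00:01:", math.dist(late, early))
print("00:01 vs 12:00:", math.dist(early, noon))
```

The two resulting columns could then be supplied to any of the classifiers as ordinary numeric features.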

Citations

APA

Altan, G., & Zafer, M.R. (2024). Predicting Credit Card Fraud using Supervised Machine Learning Methods: Comparative Analysis. Journal of Economic Policy Researches, 11(2), 242-262. https://doi.org/10.26650/JEPR1433315

Volume 11, Issue 2, 2024, pp. 242-262

TIMELINE

Submitted	07.02.2024
Accepted	26.06.2024
Published Online	13.08.2024

LICENCE

Attribution-NonCommercial (CC BY-NC)

This license lets others remix, tweak, and build upon your work non-commercially, and although their new works must also acknowledge you and be non-commercial, they don’t have to license their derivative works on the same terms.
