Metaheuristic Algorithms and Modern Credit Classification Methods: A Systematic Review
(Original Turkish title: Modern Kredi Sınıflandırma Çalışmaları ve Metasezgisel Algoritma Uygulamaları: Sistematik Bir Derleme)
Hazar Altınbaş
The number of advanced analysis methods proposed to predict whether credit applicants will default has increased considerably, especially since the Global Financial Crisis. As an alternative to conventional statistical classification methods, machine learning methods, which can extract information directly from data sets without relying on constraints and assumptions, have come into use. Alongside these methods, metaheuristic algorithms that substantially improve classification performance have also gained a prominent place in the literature. The combined use of learning methods and metaheuristic algorithms, which aims to take full advantage of growing data storage and processing capacities, contributes greatly to the field of credit risk assessment. In this review, credit classification studies published after 2000 that employ metaheuristic algorithms are examined through a systematic process. The classification methods encountered in the literature, the metaheuristic algorithms applied and their intended uses, and the criteria for evaluating classification performance are addressed, and a general framework of the current state of the field is formed. The examination shows a growing interest in metaheuristic algorithms and machine learning methods; nevertheless, method preferences are concentrated on a few alternatives. Newly developed metaheuristic algorithms, as well as hybrid and combined approaches, need to be incorporated more widely into the field. Studies that progress in parallel with developments in computer and mathematical sciences can be expected to keep contributing to the literature.
Review Subject: This systematic review examines studies in the credit risk assessment field that apply metaheuristic algorithms for classification and optimization purposes. The objective of these algorithms is to improve the classification of credit applicants as good or bad, which are predicted and interpreted as non-defaulters and defaulters, respectively.
Study Questions: The review aims to shed light on examples of metaheuristic algorithm use in credit classification analyses and to present a contemporary framework. To this end, answers to the following four questions are sought:
1. What is the trend in metaheuristic algorithm applications over the 2000-2018 period, and which algorithms are used specifically?
2. What are the main classifiers in the analyses improved by metaheuristic optimization?
3. How is the performance of analyses with metaheuristic improvements evaluated?
4. What is the current state of credit risk classification with metaheuristic optimization?
Methodology: The PRISMA systematic review process is followed. Articles found in the Web of Science, Scopus and ProQuest databases are included. Reviews, conference proceedings, patents and theses are not included. Three groups of keywords are used for searching:
Group 1: “credit scoring”, “credit evaluation”, “credit assessment”, “credit risk assessment”, “credit decision”;
Group 2: “classification”, “data mining”, “machine learning”, “statistical learning”, “soft computing”, “computational intelligence”;
Group 3: “metaheuristic”, “evolutionary computing”, “heuristic”, “genetic algorithm”, “genetic programming”, “swarm intelligence”, “evolutionary programming”
The objective of every included study is classification, improvement of classification performance, or both.
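The three keyword groups above could be combined into a single Boolean search string along the following lines. This is only a sketch: the summary does not state how the groups were combined, so joining phrases with OR inside a group and joining groups with AND is an assumption, and the helper name `build_query` is hypothetical. Each database also has its own field tags and syntax, which are not modeled here.

```python
# Keyword groups copied from the Methodology paragraph.
GROUPS = [
    ["credit scoring", "credit evaluation", "credit assessment",
     "credit risk assessment", "credit decision"],
    ["classification", "data mining", "machine learning",
     "statistical learning", "soft computing", "computational intelligence"],
    ["metaheuristic", "evolutionary computing", "heuristic",
     "genetic algorithm", "genetic programming", "swarm intelligence",
     "evolutionary programming"],
]


def build_query(groups):
    """OR the phrases within each group, then AND the groups (assumed logic)."""
    clauses = []
    for terms in groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


query = build_query(GROUPS)
print(query)
```

Under this assumption, a record matches only if it contains at least one phrase from each of the three groups.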
Results and Conclusions: Following the systematic process, a total of thirty articles are found over the examination period. The publication trend shows that at least one article is published each year after 2005, with a strong upward trend after 2015. Researchers appear increasingly drawn to using metaheuristic algorithms to optimize credit classification analyses. The studies report improved performance when metaheuristic algorithms are used.
Most metaheuristic algorithms are not designed to classify observations in data sets. For this reason, they are used as optimization techniques rather than as classifiers themselves. In the examined studies, only Genetic Programming is used as both a classifier and an optimization method, by virtue of its solution representation and the way it searches the solution space. In credit risk assessment, whereas the first empirical studies used statistical learning methods such as classical linear regression, logistic regression and discriminant analysis, machine learning and unsupervised methods have become more popular in the last two decades. Neural network models, support vector machines, tree-based methods, Bayesian networks, fuzzy logic and k-nearest neighbor models are the preferred classifiers, with the first two dominating the field.
The most frequently used metaheuristic in the studies is the Genetic Algorithm, which is also well known in other data-intensive research fields. It is followed by Genetic Programming, Particle Swarm Optimization, Tabu Search and Simulated Annealing. The choice of metaheuristic seems to be limited to a few alternatives, although many others have been proposed in the optimization literature.
Metaheuristics contribute to classifier performance in several ways. Most commonly, they are used for variable selection, that is, to find the optimal set of variables on which to train the main classifier. This selection provides the classifier with a smaller set of variables and thus improves efficiency and effectiveness. The next most common use is parameter optimization, in which several parameters of the classifier are tuned. Configuring the parameters properly is important for achieving the best results. Selecting the variables to include and tuning the method parameters are both complex problems, and the best or near-best solution cannot be found by more conventional optimization techniques that aim for exact solutions. The main advantage of metaheuristics lies in their mechanisms for searching the solution space, which can provide feasible solutions in reasonable time. The implementation of metaheuristic search algorithms in these analyses therefore contributes significantly to credit risk assessment and scoring.
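As an illustration of the variable selection use described above, the following is a minimal sketch of a Genetic Algorithm that searches over binary masks of variables. It is not taken from any of the reviewed studies: the synthetic data, the nearest-centroid classifier standing in for the main classifier, the per-variable penalty, and all function names and parameter values are assumptions chosen to keep the example self-contained.

```python
import random


def make_data(n=200, n_informative=3, n_noise=7, seed=0):
    """Synthetic stand-in for credit data: label 1 = defaulter, 0 = non-defaulter."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        informative = [rng.gauss(1.5 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y


def centroid_accuracy(X_tr, y_tr, X_va, y_va, mask):
    """Validation accuracy of a nearest-centroid classifier on the masked variables."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty variable set cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_va, y_va):
        dist = {lab: sum((x[j] - c) ** 2 for j, c in zip(cols, cen))
                for lab, cen in centroids.items()}
        correct += min(dist, key=dist.get) == t
    return correct / len(y_va)


def make_fitness(X, y, penalty=0.01, train_share=0.7):
    """Fitness = validation accuracy minus a small penalty per selected variable."""
    split = int(train_share * len(X))
    cache = {}  # masks repeat often, so evaluations are memoized

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            acc = centroid_accuracy(X[:split], y[:split], X[split:], y[split:], mask)
            cache[key] = acc - penalty * sum(mask)
        return cache[key]

    return fitness


def ga_select(fitness, n_feat, pop_size=20, generations=30, p_mut=0.05, seed=1):
    """Binary-encoded GA: tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[1] * n_feat]  # baseline individual: all variables selected
    pop += [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size - 1)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: the best mask always survives
        while len(new_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, n_feat - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < p_mut else b for b in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)
    return best, fitness(best)


X, y = make_data()
fitness = make_fitness(X, y)
best_mask, best_fit = ga_select(fitness, n_feat=len(X[0]))
print("selected variables:", [j for j, bit in enumerate(best_mask) if bit])
print("fitness:", round(best_fit, 3))
```

Because the full variable set is placed in the initial population and the best mask is always carried over, the returned mask can never score worse than using all variables under this fitness function. Parameter optimization could be sketched in the same way by encoding classifier parameters, rather than variable indicators, in each individual.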
This review shows that a solid body of literature has formed on metaheuristic applications in credit classification. More studies are expected in the future. There is considerable potential in widening the choice of algorithms and in hybrid or combined forms. At this pace, credit risk management activities in banks may continue to benefit from developments in the field of data analysis.