Children of the Tree: Optimised Rule Extraction from Machine Learning Models
Hilal Meydan, Mert Bal

The “Children of the Tree” algorithm explains how an imbalanced dataset is classified by extracting rules from each tree of a Random Forest (RF) model. It converts the split made at each node of a tree into “if-then” rules and extracts a separate rule set for each tree, breaking the RF's overall ensemble (“community model”) view down into individual rules. In this way, the algorithm finds the “Children of the Tree” by converting the forest into a rule set. The study is developed on the “German Credit Data Set”, a banking dataset widely studied in the literature, and determines the rules that cause candidate customers to fall into a given class (good or bad). The bank can thus see the rules that place potential customers in the risky class and recommend alternative plans or products that suit its risk strategy. The study evaluates rule validity and reliability using association rule mining metrics (support, confidence, lift, leverage, and conviction), calculates the “Minimum Description Length” (MDL) cost, and ranks rules by support and MDL cost to extract the simplest rules for each class. It addresses risk management and marketing needs in banking, and its use of MDL cost together with SMOTE to handle imbalanced datasets sets it apart from other algorithms.
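
The rule-extraction and scoring steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tree is a hand-built nested dictionary standing in for one tree of a Random Forest, the feature names (`duration`, `amount`), thresholds, and eight sample rows are invented for the example, and the metric formulas are the standard association rule definitions. The MDL cost and SMOTE steps are left out because the abstract does not specify how they are computed.

```python
# Hypothetical toy tree (not from the paper): each internal node tests
# feature <= threshold; each leaf carries a predicted class label.
tree = {
    "feature": "duration", "threshold": 24,
    "left": {"label": "good"},
    "right": {
        "feature": "amount", "threshold": 5000,
        "left": {"label": "good"},
        "right": {"label": "bad"},
    },
}

def extract_rules(node, path=()):
    """Return one (conditions, predicted_class) rule per root-to-leaf path."""
    if "label" in node:  # leaf: the accumulated path is the rule antecedent
        return [(list(path), node["label"])]
    f, t = node["feature"], node["threshold"]
    left = extract_rules(node["left"], path + ((f, "<=", t),))
    right = extract_rules(node["right"], path + ((f, ">", t),))
    return left + right

def matches(conditions, row):
    """True if the row satisfies every condition of the rule."""
    return all(row[f] <= t if op == "<=" else row[f] > t
               for f, op, t in conditions)

def rule_metrics(conditions, label, rows, labels):
    """Standard association rule metrics for 'IF conditions THEN label'."""
    n = len(rows)
    covered = [matches(conditions, r) for r in rows]
    p_a = sum(covered) / n                                # P(antecedent)
    p_c = sum(1 for y in labels if y == label) / n        # P(consequent)
    p_ac = sum(1 for c, y in zip(covered, labels)
               if c and y == label) / n                   # P(both)
    confidence = p_ac / p_a if p_a else 0.0
    lift = confidence / p_c if p_c else 0.0
    leverage = p_ac - p_a * p_c
    conviction = (float("inf") if confidence == 1.0
                  else (1 - p_c) / (1 - confidence))
    return {"support": p_ac, "confidence": confidence, "lift": lift,
            "leverage": leverage, "conviction": conviction}

# Invented sample data, for illustration only.
rows = [
    {"duration": 12, "amount": 1000}, {"duration": 18, "amount": 6000},
    {"duration": 24, "amount": 3000}, {"duration": 36, "amount": 2000},
    {"duration": 48, "amount": 8000}, {"duration": 30, "amount": 7000},
    {"duration": 6, "amount": 500},   {"duration": 60, "amount": 9000},
]
labels = ["good", "good", "bad", "good", "bad", "bad", "good", "good"]

rules = extract_rules(tree)
scored = [(conds, label, rule_metrics(conds, label, rows, labels))
          for conds, label in rules]
# Rank rules by support, highest first (the MDL cost ranking is omitted here).
ranked = sorted(scored, key=lambda item: item[2]["support"], reverse=True)

for conds, label, m in ranked:
    antecedent = " AND ".join(f"{f} {op} {t}" for f, op, t in conds)
    print(f"IF {antecedent} THEN {label} | support={m['support']:.3f} "
          f"confidence={m['confidence']:.3f} lift={m['lift']:.3f}")
```

On this toy data the single rule for the "bad" class (`duration > 24 AND amount > 5000`) covers three rows, two of which are actually bad, giving a support of 0.25 and a confidence of about 0.67. Applying the same path-to-rule conversion to every tree in a forest would yield the per-tree rule sets the abstract describes.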