CHAPTER


DOI: 10.26650/B/T3.2024.40.016   IUP: 10.26650/B/T3.2024.40.016

Improving Cardiovascular Disease Prediction Using Ensemble Learning Techniques and Dimensionality Reduction

Hatice Koç, Ali Zıdelkhır, Seda Tolun Tayalı, Çiğdem Selçukcan Erol

Cardiovascular diseases (CVDs) are a significant global health challenge and lead to heart failure in many cases. Addressing this issue requires the development of effective strategies. In this study, we employ ensemble learning approaches, specifically “Bagging” and “Boosting”, to predict the risk of cardiovascular disease using a Kaggle dataset comprising 11 features and 70,000 observations. We examine the potential of models such as AdaBoost, Random Forest, Gradient Boosting, and Gaussian Naive Bayes to enhance prediction performance on a medical dataset, and we assess the effect of dimensionality reduction through Principal Component Analysis (PCA). Applying the Bagging and Boosting models together with dimensionality reduction yields higher accuracy, precision, recall, F1-score, and area under the curve (AUC), which underscores the critical role of dimensionality reduction in improving predictive performance.
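The five evaluation measures named in the abstract can be illustrated with a minimal, dependency-free Python sketch. The function names `binary_metrics` and `roc_auc` are hypothetical helpers introduced only for this illustration, and the labels and scores are made-up values, not results from the study. AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (1 = disease present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: eight patients, four with and four without disease.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]   # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 decision threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold (0.5 is an assumption here), whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds.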



Topluluk Öğrenme Teknikleri ve Boyut Azaltma Kullanılarak Kardiyovasküler Hastalık Tahmininin İyileştirilmesi


Cardiovascular diseases are a major global health problem that leads to heart failure in many cases. Addressing this problem requires the development of effective strategies. In this study, ensemble learning models, specifically “Bagging” and “Boosting”, are used to predict the risk of cardiovascular disease using a Kaggle dataset consisting of 11 features and 70,000 observations. Our study focuses on exploring the potential of models such as Adaptive Boosting (AdaBoost), Random Forest, Gradient Boosting, and Gaussian Naive Bayes to improve prediction performance on a medical dataset. In addition, we emphasize the importance of dimensionality reduction with Principal Component Analysis. The findings underscore the critical role of dimensionality reduction. Applying the Bagging and Boosting models together with dimensionality reduction results in higher accuracy, precision, recall, F1-score, and AUC. Using dimensionality reduction significantly improves model performance, yielding substantial gains in predictive capability.









Istanbul University Press aims to contribute to the dissemination of ever-growing scientific knowledge through the publication of high-quality scientific journals and books in accordance with international publishing standards and ethics. Istanbul University Press follows an open-access, non-commercial, scholarly publishing model.