Research Article


DOI: 10.26650/ekoist.2020.33.843564

Microeconometric Analysis of Household Consumption Expenditures: LAD-LASSO Method

Kadriye Hilal Topal, Ebru Çağlayan Akay

This study examined how supervised machine learning methods help select the relevant variables of a Household Budget Survey Consumption Expenditures dataset with outliers, in order to improve the prediction and forecasting performance of the Household Consumption Expenditures Model. To this end, the Household Budget Survey Consumption Expenditures dataset of Turkey for 2018 was analyzed using the Least Absolute Deviation (LAD), Least Absolute Shrinkage and Selection Operator (LASSO), and LAD-LASSO methods alongside the classical regression method, and the prediction and forecasting performances of the methods were compared. According to the results, the LAD-LASSO machine learning method, which enables variable selection while yielding robust estimators in the presence of long-tailed errors, was the most successful method in terms of prediction performance and forecasting accuracy. Additionally, several fundamental variables such as income, saving, and household size increase household consumption expenditures in all models. Beyond these, other variables including the structure of a room, kitchen and bathroom floors, heating and air conditioning preferences, energy sources used, ownership of a detached house, apartment, cottage, or vineyard, investment preferences, credit card usage, and internet shopping habits were selected as determinants of household consumption expenditures in the LAD-LASSO model. The results indicate that machine learning algorithms can be used to select the most appropriate variables when constructing microeconometric models.

JEL Classification: C31, C55, D12

Hanehalkı Tüketim Harcamalarının Mikroekonometrik Analizi: LAD-LASSO Yöntemi

Kadriye Hilal Topal, Ebru Çağlayan Akay

The aim of this study is to examine how supervised machine learning methods help select the relevant variables of the Household Budget Survey household dataset, which contains outliers and long-tailed errors, and to identify the model with the best prediction and forecasting performance for estimating household consumption expenditures in Turkey. To this end, the 2018 Household Budget Survey household dataset of Turkey was analyzed using the classical regression method together with the Least Absolute Deviation (LAD), Least Absolute Shrinkage and Selection Operator (LASSO), and LAD-LASSO methods, and the prediction and forecasting performances of the methods were compared. According to the results, the LAD-LASSO machine learning method, which permits variable selection while yielding robust estimators in the presence of long-tailed errors, was the most successful method in terms of prediction performance and forecasting accuracy. In addition, several fundamental variables such as income, saving, and household size increase household consumption expenditures in all models. Beyond these, variables such as the structure of the room, kitchen and bathroom floors, heating and air conditioning preferences, energy sources used, ownership of a detached house, apartment, summer house, or vineyard, investment preferences, credit card usage, and internet shopping habits were selected as determinants of household consumption expenditures in the LAD-LASSO model. The results provide evidence that machine learning algorithms can be used to select the necessary variables when constructing microeconometric models. This study was derived from a doctoral dissertation.


EXTENDED ABSTRACT


Household consumption expenditures play an important role both in providing information about the economic development levels of countries and, once their socioeconomic determinants are identified, in shaping rational production policies. There are many studies on consumption expenditures in the literature. Although these studies aimed to select the variables that determine consumption and to obtain the most appropriate statistical and econometric model, they were modeled with different variables.

Least Squares (LS) regression is one of the most widely used estimation methods, but LS estimators give unrealistic predictions in the presence of long-tailed errors, so LAD estimators are often used instead. However, because large data sets contain many variables and the number of candidate models increases exponentially with the number of variables, the best model cannot be selected by exhaustive search due to processing complexity. For this reason, Wang, Li, and Jiang (2007) developed the LAD-LASSO method, which enables selection of the best model through a LASSO-type penalty while obtaining robust estimators in the presence of outliers and long-tailed errors. The Household Budget Survey Consumption Expenditures dataset of Turkey contains both a great number of observations and many variables. Since the income distribution in Turkey is not homogeneous, household consumption expenditure does not show a homogeneous structure either. Therefore, the LAD-LASSO, a penalty-based machine learning method based on dimension reduction, was used to analyze the Household Budget Survey household data set in this study.
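As a rough illustration of why an LAD loss combined with a LASSO-type penalty is both robust and selective, the sketch below solves the simplest possible case: one regressor, no intercept, and the objective sum_i |y_i - x_i*b| + lam*|b|. In that case the penalty acts like one extra pseudo-observation at zero with weight lam, so the minimizer is a weighted median. The function names, the toy data, and the single penalty weight `lam` are assumptions made for this example only; they are not taken from the study, which uses many regressors and the method of Wang, Li, and Jiang (2007).

```python
def weighted_median(values, weights):
    """Return a minimizer b of sum_k weights[k] * |values[k] - b|."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]


def lad_lasso_1d(x, y, lam):
    """Minimize sum_i |y_i - x_i * b| + lam * |b| for a single coefficient b.

    Each observation contributes |x_i| * |y_i / x_i - b|, and the penalty is
    treated as a pseudo-observation at 0 with weight lam (illustrative only).
    """
    values = [yi / xi for xi, yi in zip(x, y) if xi != 0] + [0.0]
    weights = [abs(xi) for xi in x if xi != 0] + [lam]
    return weighted_median(values, weights)


def ls_slope(x, y):
    """Least-squares slope through the origin, for comparison."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Toy data (hypothetical): true slope near 2, last observation is an outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 100.0]

slope_ls = ls_slope(x, y)                  # pulled upward by the outlier
slope_lad = lad_lasso_1d(x, y, lam=0.0)    # plain LAD, no penalty
slope_mild = lad_lasso_1d(x, y, lam=4.0)   # mild shrinkage toward zero
slope_zero = lad_lasso_1d(x, y, lam=20.0)  # coefficient removed entirely
```

With these toy numbers the unpenalized LAD slope stays close to 2 despite the outlier, whereas the LS slope is pulled to about 10.2. A large enough `lam` sets the coefficient exactly to zero, which is how the penalty performs variable selection.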

This study examined how supervised machine learning methods help select the relevant variables of the Household Budget Survey Consumption Expenditures dataset with outliers, in order to improve the prediction and forecasting performance of the Household Consumption Expenditures Model. Since the main purpose of a penalty-based variable selection method is estimation only, and causal and statistical inferences cannot be drawn from the resulting models, the results of the LAD-LASSO regression were evaluated in terms of variable selection and modeling.

In the study, the Household Consumption Expenditure model was first estimated with the LS method, and diagnostic tests were applied to investigate deviations from the assumptions and the presence of outliers. To detect outliers, diagnostic tests were applied to the standardized and studentized residuals, and 410 observations were identified as outliers. In addition to the LASSO regression, the LAD and LAD-LASSO models were estimated, which yield robust estimators in the presence of outliers and long-tailed errors. The results were compared and interpreted. The prediction performances of the LS and LASSO models were compared using the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-Squared (R²) criteria, which gave very similar results.
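The two steps described above, flagging outliers from studentized residuals and comparing models by RMSE, MAE, and R², can be sketched in a few lines. This is a minimal illustration under stated assumptions: a single regressor with intercept, internally studentized residuals, a cutoff of 2 in absolute value, and made-up toy data. None of these choices, nor the function names, are taken from the study, which does not state its cutoff here.

```python
import math


def ls_fit(x, y):
    """Least-squares intercept and slope for a single regressor."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def studentized_residuals(x, y):
    """Internally studentized LS residuals e_i / (s * sqrt(1 - h_i))."""
    n = len(x)
    intercept, slope = ls_fit(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)  # two estimated parameters
    leverage = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]
    return [e / math.sqrt(s2 * (1.0 - h)) for e, h in zip(resid, leverage)]


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean) ** 2 for a in actual)
    return 1.0 - sse / sst


# Toy data (hypothetical): an exact line y = 2x with one contaminated point.
x = [float(i) for i in range(1, 11)]
y = [2.0 * xi for xi in x]
y[4] = 40.0  # outlier at index 4

# Assumed rule of thumb: flag |studentized residual| > 2.
flagged = [i for i, r in enumerate(studentized_residuals(x, y)) if abs(r) > 2.0]

# Accuracy criteria on a small made-up prediction vector.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 8.0]
scores = (rmse(actual, predicted), mae(actual, predicted), r_squared(actual, predicted))
```

RMSE squares the errors and therefore reacts more strongly to the single large miss than MAE does, which is one reason to report both when the data contain outliers.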

According to the results, the LAD-LASSO machine learning method, which enables variable selection while yielding robust estimators in the presence of long-tailed errors, was the most successful in terms of prediction performance and forecasting accuracy. Several fundamental variables such as income, saving, and household size increased household consumption expenditures in all models. Beyond these, other variables including the structure of a room, kitchen and bathroom floors, heating and air conditioning preferences, energy sources used, ownership of a detached house, apartment, cottage, or vineyard, investment preferences, credit card usage, and internet shopping habits were selected as determinants of household consumption expenditures in the LAD-LASSO model. The results indicate that machine learning algorithms can be used to select the most appropriate variables when constructing microeconometric models. Although penalty-based machine learning methods are successful in determining the model in data sets with a large number of variables, they should be used carefully because they make predictions based on correlation rather than causality.



References

  • Ahrens, A.; Hansen, C. B. & Schaffer, M. E. (2019). “Lassopack: Model Selection and Prediction with Regularized Regression in Stata”, IZA Institute of Labor Economics, IZA DP No. 12081.
  • Andini, M.; Ciani, E.; De Blasio, G.; D’Ignazio, A. & Salvestrini, V. (2018). “Targeting with Machine Learning: An Application to A Tax Rebate Program in Italy”, Journal of Economic Behavior and Organization, 156: 86-102.
  • Arthanari, T. S. & Dodge, Y. (1993). Mathematical Programming in Statistics. John Wiley & Sons, Inc., New York.
  • Azzopardi, D.; Fareed, F.; Lenain, P. & Sutherland, D. (2019). “Assessing Household Financial Vulnerability: Empirical Evidence from the U.S. using Machine Learning”, OECD Economic Survey of the United States: Key Research Findings 2019: 121-142.
  • Birkes, D. & Dodge, Y. (1993). Alternative Methods of Regression. John Wiley & Sons, Inc., New York.
  • Breusch, T. S. & Pagan, A. R. (1979). “A Simple Test for Heteroskedasticity and Random Coefficient Variation”, Econometrica, 47(5): 1287-1294.
  • Cook, R. D. & Weisberg, S. (1983). “Diagnostics for Heteroskedasticity in Regression”, Biometrika, 70(1): 1-10.
  • Çalmaşur, G. & Kılıç, A. (2018). “Türkiye’de Hanehalkı Tüketim Harcamalarının Analizi”, ETÜ Sosyal Bilimler Enstitüsü Dergisi, 5: 61-73.
  • Dodge, Y. (1997). “LAD Regression for Detecting Outliers in Response and Explanatory Variables”, Journal of Multivariate Analysis, 61: 144-158.
  • Gaffney, R. & Kirkby, R. (2018). “Machine Learning the Consumption Function”, EEA-ESEM Cologne 2018 Conference. https://editorialexpress.com/conference/EEAESEM2018/program/EEAESEM2018 (Accessed: 15.07.2020).
  • Hampel, F. R.; Ronchetti, E. M.; Rousseeuw, P. J. & Stahel, W. A. (2005). Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons, Inc., New York.
  • Kolmogorov, A. (1933). “Sulla Determinazione Empirica di una Legge di Distribuzione”, G. Ist. Ital. Attuari, 4: 83-91.
  • Mian, A.; Rao, K. & Sufi, A. (2013). “Household Balance Sheets, Consumption, and the Economic Slump”, The Quarterly Journal of Economics, 128(4): 1687-1726.
  • Obrizan, M.; Torosyan, K. & Pignatti, R. (2019). “Tobacco Spending in Georgia: Machine Learning Approach”, ICDSIAI 2018: Recent Developments in Data Science and Intelligent Analysis of Information, 103-114.
  • Önder, K. & Turgut, H. (2018). “Examination of the Factors Affecting Household Rental Housing Demand Through Data Mining: The Case of Turkey”, Eskişehir Osmangazi Üniversitesi İİBF Dergisi, 13(2): 227-238.
  • Pedregosa, F. (2016). “Hyperparameter Optimization with Approximate Gradient”, 33rd ICML, New York, 2016 (Eds. M. F. Balcan and K. Q. Weinberger). Proceedings of Machine Learning Research, 48: 737-746.
  • Rao, C. R. (1973). Linear Statistical Inference and its Applications. 2nd ed., John Wiley & Sons, Inc., Canada.
  • Sec, R. & Zemcik, P. (2007). “The Impact of Mortgages, House Prices and Rents on Household Consumption in the Czech Republic”, CERGE-EI Discussion Paper, 2007-2185.
  • Selim, S. & Demirkıran, E. (2020). “Türkiye’de Hanehalkı Gıda Harcamalarını Etkileyen Sosyo-Ekonomik Faktörler: Karşılaştırmalı Bir Analiz”, Hacettepe Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi, 38(2): 297-321.
  • Shapiro, S. S. & Wilk, M. B. (1965). “An Analysis of Variance Test for Normality (Complete Samples)”, Biometrika, 52(3/4): 591-611.
  • Shi, P. & Tsai, C. L. (2002). “Regression Model Selection a Residual Likelihood Approach”, J. R. Statist. Soc. B, 64: 237-252.
  • Showers, V. E. & Shotick, J. A. (1994). “The Effects of Household Characteristics on Demand for Insurance: A Tobit Analysis”, The Journal of Risk and Insurance, 61(3): 492-502.
  • Smirnov, N. (1948). “Table for Estimating the Goodness of Fit of Empirical Distributions”, Annals of Mathematical Statistics, 19(2): 279-281.
  • Tibshirani, R. (1996). “Regression Shrinkage and Selection via the Lasso”, Journal of the Royal Statistical Society, 58: 267-288.
  • TUİK (2018). Hanehalkı Bütçe İstatistikleri Mikro Veri Seti, 2018, Metaveri, Amaç. İstanbul.
  • Wang, H.; Li, G. & Jiang, G. (2007). “Robust Regression Shrinkage and Consistent Variable Selection Through the LAD-Lasso”, Journal of Business & Economic Statistics, 25: 347-355.
  • Varlamova, J. & Larionova, N. (2015). “Macroeconomic and Demographic Determinants of Household Expenditures in OECD Countries”, Procedia Economics and Finance, 24: 727-733.
  • Ylvisaker, D. (1977). “Test Resistance”, Journal of the American Statistical Association, 72(359): 551-556.

Citations




APA

Topal, K.H., & Çağlayan Akay, E. (2020). Microeconometric Analysis of Household Consumption Expenditures: LAD-LASSO Method. Ekoist: Journal of Econometrics and Statistics, 0(33), 13-31. https://doi.org/10.26650/ekoist.2020.33.843564





TIMELINE


Submitted: 08.12.2020
Accepted: 30.12.2020
Published Online: 15.01.2021

LICENCE


Attribution-NonCommercial (CC BY-NC)

This license lets others remix, tweak, and build upon your work non-commercially, and although their new works must also acknowledge you and be non-commercial, they don’t have to license their derivative works on the same terms.






Istanbul University Press aims to contribute to the dissemination of ever-growing scientific knowledge through the publication of high-quality scientific journals and books in accordance with international publishing standards and ethics. Istanbul University Press follows an open-access, non-commercial, scholarly publishing model.