Genetik Programlamanın Temelleri: Genetik Programlama ve Ekonometri İlişkisi Bağlamında Bir İnceleme

Bu çalışmanın temel amacı genetik programlamanın temelleri ve ekonometri alanındaki uygulamaları hakkında bir değerlendirme sunmaktır. Genetik programlama, sağlanan girdiler ve çıktılar arasında matematiksel gösterimler veya bilgisayar programları üretmeyi amaçlayan evrim temelli bir arama tekniğidir. Çalışmada genetik programlama-ekonometri ilişkisi bağlamında okuyucuya derin bir kavrayış kazandırma motivasyonu ile öncelikle genetik programlamanın tanımı ve gelişimine ilişkin kapsamlı bir literatür taramasına yer verilmiş ve ardından genetik programlamanın çalışma prensipleri tanıtılmıştır. Genetik programlamanın mühendislik, elektronik, görüntü işleme, işaret işleme, işletme, ekonomi, finansal zaman serileri, tıp, biyoloji, eğlence ve bilgisayar oyunları gibi pek çok alanda insanla rekabet edebilir nitelikte, yenilikçi çözümler üretme yeteneği ortaya konulmuştur. Ekonometrik problemlerin nasıl bir genetik programlama problemi olarak düşünülebileceğine ilişkin örnekler ile ekonometrik uygulamalar ve genetik programlama arasındaki ilişkiler özellikle sembolik regresyon çerçevesinde tartışılmıştır. Ayrıca model seçimi, değişken seçimi, finansal zaman serilerinin modellenmesi ve tahmini, finansal zaman serilerinde kural arama, parametreler için anlamlılık testleri konularında genetik programlamanın alternatif olarak çözüm sunabileceği durumlar da ortaya konulmuştur. Çalışmanın sonucunda tartışılan kavramlar bir arada değerlendirildiğinde, ekonometrik modelleme ve tahmin süreçlerinde araştırmacıların modelin fonksiyonel biçimi ve (veya) değişkenleri hakkında bilgisinin olmadığı ya da yetersiz olduğu durumlarda genetik programlamanın geleneksel yöntemlere alternatif olarak bir çözüm sunma konusunda başarılı olabileceği ifade edilmiştir.

Anahtar Kelimeler: Genetik Programlama, Evrimsel Programlama, Sembolik Regresyon, Yapay Zekâ, Model Seçimi, Değişken Seçimi

DOI :10.26650/B/SS10.2021.013.12 IUP :10.26650/B/SS10.2021.013.12 Tam Metin (PDF)

Basics of Genetic Programming: An Overview in the Context of the Relationship Between Genetic Programming And Econometrics

Seda Karakaş Geyik, Mehmet Hakan Satman

This study provides a short introduction and an overview of the basics of genetic programming (GP) and its applications in the context of econometrics. In other words, the aim of this study is to gain a deep insight into the relationship between GP and econometrics. Accordingly, first, the basic definition and development of GP are introduced. After the literature review, GP applications in the areas of engineering, electronics, image processing, signal processing, business, economics, financial time series, medicine, biology, entertainment, and computer games are presented. In this part, we aim to highlight the results of GP applications that produce human-competitive results and new inventions, followed by working principles of GP (the basic representation schemes, initialization, and operators used in GP). Then, we discuss how econometric problems can be handled as a GP problem. We showed that GP can successfully solve several econometric problems by producing new inventions and human-competitive results.

GP is a family of searching tools in which the basic aim is to generate mathematical expressions or computer programs between the inputs fed and output. In other words, GP explores the rules in terms of computer programs using given input and output data via evolutionary optimization. GP starts with a population of randomly chosen mathematical expressions or computer programs. The quality of an expression is measured by the fitness or cost, which indicates the difference between the generated and real outputs. Then, the best solutions are recombined to generate offspring. An offspring is a cut and paste copy of the two best candidate solutions. A good offspring is expected by recombining two fittest parents. After generating new candidates, mutations are applied on the constants or functions with small probabilities. Mutations result in small and significant changes depending on the location of the mutated elements in expressions. Small changes perform a local search, whereas significant changes are essential for searching other sides of the search space in terms of optimization. After several steps, generated expressions form a new population. The search process terminates after the meeting the stopping criteria.

GP develops computer programs by optimization, whereas genetic algorithms are generally used for problems in which the main objective is to find a global extremum of a function. Genetic algorithms encode decision variables using binary, integer, permutation, or floatingpoint encoding–decoding schemes, whereas GP generally uses abstract syntax trees, postfix and prefix representations, Java bytecodes, or assembly language opcodes. Because GP develops mathematical expressions and programs, GP tasks are computer-aided, and an interpreter is always needed to evaluate generated offspring for calculating fitness values in runtime.

In econometrics; the tasks of estimating model parameters, testing the significanceofparameters, and performing forecasts are nontrivial. In numerous cases, researchers face difficulties in selecting the best model, functional structure, correct combination of lagged values or independent variables, etc. The problems of model and variable selections are handled separately or together in the literature. When the number of independent variables in variable-pool increases, selecting the k correct variables among K variables results in a combinatorial problem, where k ≤ K. Furthermore, when the functional form is undefined by theory, nothing can be done, except trying all possible forms with unknown variables. This is a “searching the needle in the haystack” type of problem. GP is broadly used in such problems under the title of symbolic regression. In general terms, the main question of symbolic regression is: “what is the correct functional form with correct independent variables that explains a dependent variable well?”. Besides the estimation process and techniques used, numerous model suggestions can satisfy a model selection criteria used in the classic econometric literature.

Econometric models are generally represented by mathematical expressions. In programming languages, mathematical expressions are also computer programs. For example, piecewise functions are basically represented using if-then-else statements or expressions in several programming languages. Similarly, piecewise regressions, nonlinear time-series models, as well as buy and hold strategies in finance are combinations of mathematical expressions including decisions as computer programs. Developing the best strategy and determining the best time for buying a stock is a well-studied topic in the GP literature.

GP is neither the end of the whole story nor the magic that solves problems. First, GP tends to generate more than sufficient expressions as solutions; several researchers have tried deciphering this phenomenon. Second, an automatic process can almost always find an overfitted set of solutions describing the relations between variables. As in other econometric studies, researchers should be attentive to select interpretable solutions to the concerned problem. In other words, the model that minimizes selection criteria may not be the proper model. Moreover, GP is successful in model and variable selections, rule detection, discovering new location-comparison tests, and inventing patentable things in a human-competitive way.

Anahtar Kelimeler: Genetic Programming, Evolutionary Programming, Symbolic Regression, Artificial Intelligence, Model Selection, Variable Selection

Referanslar

Adalı, E. (2017). Yapay Zekâ, İstanbul Teknik Üniversitesi Yayını: İnsanlaşan Makineler ve Yapay Zekâ, 75, 8-13. google scholar
Affenzeller, M., Winkler, S., Wagner, S. & Beham A. (2009). Genetic Algorithms and GeneticProgramming Modern Concepts and Practical Applications, Taylor & Francis Group, LLC. google scholar
Akaike, H. (1973). Information Theory and an Extension of the Maximum Likelihood Principle. In B.N. Petro and F. Csaki ed. 2nd International Symposium on Information Theory, pp. 267-281. google scholar
Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans. Autom. Control, 19(6), 716723. google scholar
Akaike, H. (1978). A Bayesian analysis of the Minimum AIC Procedure. Annals of the Institute of Statistical Mathematics, 20, 9-14. google scholar
Andersen, C. M., & Bro, R. (2010). Variable selection in regression—a tutorial. Journal of Chemometrics, 24, 728-737. google scholar
Arganis, M., Val, R., Prats, J., Rodriguez, K., Dommguez, R. & Dolz, J. (2009). Genetic Programming and Standardization in Water Temperature Modelling. Hindawi Publishing Corporation Advances in Civil Engineering, Volume 2009, Article ID 353960, 10 pages. google scholar
Arnaldo, I., Krawiec, K., & O’Reilly, U. (2014). Multiple regression genetic programming. Proceedings of the 2014 Annual Conference on Genetic andEvolutionary Computation, ACM, pp. 879-886. google scholar
Aryafar, A., Khosravi, V., Zarepourfard, H. & Rooki, R. (2019). Evolving genetic programming and other AI-based models for estimating groundwater quality parameters of the Khezri plain, Eastern Iran. Environmental Earth Sciences, 78, 69. google scholar
Azaria, Y., & Sipper, M. (2005a). GP-gammon: Genetically programming backgammon players. Genetic Programming and Evolvable Machines, 6(3), 283-300. google scholar
Balcombe, K. G. (2005). Model selection using information criteria and genetic algorithms. Computational Economics, 25(3), 207-228. google scholar
Banzhaf, W., Nordin, P., Keller, R. E. & Francone, F. D. (1997). Genetic programming: An introduction: On the automatic evolution of computer programs and its applications (the morgan kaufmann series in artificial intelligence). Morgan Kaufmann Publishers. google scholar
Bauer, R. J., Jr. (1994). Genetic Algorithms and Investment Strategies. John Wiley. google scholar
Bhanu, B. & Lee, S. (1994). Genetic Learning for Adaptive Image Segmentation. Boston: Kluwer Academic Publishers. google scholar
Chadalawada J, Havlicek V., & Babovic, V. (2017). A genetic programming approach to system identification of rainfall-runoff models. Water Resources Management, 31, 3975-3992. google scholar
Chen, S. H., Duffy, J., & Yeh, C. H. (2002). Equilibrium selection via adaptation: Using genetic programming to model learning in a coordination game. The Electronic Journal ofEvolutionary Modeling and Economic Dynamics, 15 January 2002. google scholar
Corno, F., Sanchez, E., & Squillero, G. (2005). Evolving assembly programs: how games help microprocessor validation. IEEE Transactions on Evolutionary Computation, 9(6), 695-706. google scholar
Cramer, N. L. (1985). A Representation for the Adaptive Generation of Simple Sequential Programs, Proceedings ofAn International Conference On Genetic Algorithms and Their Applications, pp. 183-187. google scholar
Davidson, J. W., Savic, D. A., & Walters, G.A. (2003). Symbolic and numerical regression: experiments and applications. Information Sciences, 150, 95-117. google scholar
Dijkstra, E. W. (1961).Algol 60 translation: An algol 60 translator for the x1 and making a translator for algol 60.Stichting Mathematisch Centrum. Rekenafdeling. Stichting Mathematisch Centrum. google scholar
Dorsey, R., & Mayer, W. (1995). Genetic algorithms for estimation problems with multiple optima, nondifferentiability, and other irregular features. Journal ofBusiness & EconomicStatistics, 13(1), 53-66. google scholar
Drecourt, J. P. (1999). Application of neural networks and genetic programming to rainfall run off modeling. Water Resources Management, 13, 219-231. google scholar
Efron, B., Hastie, T., Johnstone, I. & Tibshirani, R. (2004). Least angle regression. The Annals of statistics, 32(2), 407-499. google scholar
Esfahanipour, A., & Mousavi, S. (2011). A genetic programming model to generate risk-adjusted technical trading rules in stock markets. Expert Systems with Applications, 38(7), 8438-8445. google scholar
Fischer, M. M., & Leung, Y. (1998). A genetic-algorithms based evolutionary computational neural network formodelling spatial interaction data. The Annals ofRegional Science, 32(3), 437-458. google scholar
Forsyth. B. R. (1981). A Darwinian approach to pattern recognition. Kybernetes, 10, 159-166. google scholar
Freitas, A. A. A. (2008). Review of evolutionary Algorithms for Data Mining, In: Maimon, O., Rokach, L. (Editors.), Soft Computing for Knowledge Discovery and Data Mining. Springer US, 79-111. google scholar
Giustolisi, O., & Savic, D. A. (2006). A symbolic data-driven technique based on evolutionary polynomial regression. Journal of Hydroinformatics, 8(3), 207-222. google scholar
Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimisation and Machine Learning, Addison-Wesley. google scholar
Hauptman, A., & Sipper, M. (2005). GP-endchess: Using genetic programming to evolve chess endgame players. In M. Keijzer, et al., editors, Proceedings of the 8th European Conference on Genetic Programming, volume 3447 of Lecture Notes in Computer Science, pages 120-131, Lausanne, Switzerland, 30 March - 1 April 2005. Springer. google scholar
Holland, J. H. (1975) Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. University of Michigan Press, AnnArbor, MI. google scholar
İnce, G. (2017). İnsanlığın Yapay Zekâyla İmtihanı, İstanbul Teknik Üniversitesi Yayını: İnsanlaşan Makineler ve Yapay Zekâ, 75, 14-17. google scholar
Jin, N. (2005). Equilibrium selection by co-evolution for bargaining problems under incomplete information about time preferences. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), pages 2661-2668, Edinburgh, 2-5 September 2005. google scholar
Jin, N., & Tsang, E. P. K. (2006). Co-adaptive strategies for sequential bargaining problems with discount factors and outside options. In Proceedings of the 2006 IEEE Congress on Evolutionary Computation, pages 7913-7920, Vancouver, 6-21 July 2006. google scholar
Kaboudan, M. (1999). A measure of time series’ predictability using genetic programming applied to stock returns. Journal ofForecasting, 18(5), 345-357. google scholar
Kaboudan, M. (2000). Genetic programming prediction of stock prices. Computational Economics, 16(3), 207-236. google scholar
Kaboudan, M. (2005). Extended daily exchange rates forecasts using wavelet temporal resolutions. New Mathematics and Natural Computation, 1(1), 79-107. google scholar
Kotanchek M. E., Vladislavleva, E. Y., & Smits, G.F. (2009) Symbolic regression via GP as a discovery engine: Insights on outliers and prototypes. In: Riolo RL, O’Reilly UM, McConaghy T (eds) Genetic Programming Theory and Practice VII, Genetic and Evolutionary Computation, Springer, Ann Arbor, pp.55-72. google scholar
Koza, J. R. (1992). Genetic Programming: On the Programming of Computers by Means ofNatural Selection. The MIT Press. google scholar
Koza, J. R. (1997). Genetic Programming. Encyclopedia of Computer Science and Technology to be edited by Allen Kent and James G. Williams, 1-26. google scholar
Koza, J. R., Bennett, III F. H., & Stiffelman, O. (1999) Genetic programming as a Darwinian invention machine. In R. Poli, et al., editors, Genetic Programming, Proceedings ofEuroGP’99, volume 1598 of LNCS, pages 93-108, Goteborg, Sweden, 26-27 May 1999. Springer-Verlag. google scholar
Koza, J. R., Keane, M. A., Yu, J., Bennet III, F. H., & Mydlowec, W. (2000). Automatic Creation of Human-Competitive Programs and Controllers by Means of Genetic Programming. Genetic Programming and Evolvable Machines, 1, 121-164. google scholar
Koza, R., Al-Sakran, S.H., & Jones, L. W. (2005). Automated re-invention of six patented optical lens systems using genetic programming. In H.-G. Beyer, et al., editors, GECCO 2005: Proceedings of the 2005 conference on Genetic and evolutionary computation, volume 2, pages 1953-1960, Washington DC, USA, 25-29 June 2005. ACM Press. google scholar
Koza, J. R. (2010). Human-competitive results produced by genetic programming. Genet Program Evolvable Mach, 11, 251-284. google scholar
Langdon, W. B. (1998). Genetic Programming and Data Structures: Genetic Programming + Data Structures = Automatic Programming!, Springer. google scholar
Langdon, W. B., & Poli, R. (2002). Foundations ofgenetic programming. Springer-Verlag. google scholar
Leardi, R., Boggia, R., & Terrile, M. (1992). Genetic algorithms as a strategy for feature selection. Journal of chemometrics, 6(5), 267-281. google scholar
Lee, Y. S., & Tong, L. I. (2011). Forecasting time series using a methodology based on autoregressive integrated moving average and genetic programming. Knowledge-Based Systems, 24(1), 66-72. google scholar
Lo, A. W. (2004). The Adaptive Markets Hypothesis: Market Efficiency from an Evolutionary Perspective. Journal of Portfolio Management, Forthcoming. Available at SSRN:https://ssrn.com/abstract=602222 google scholar
Lohn, J. D., Hornby, G. S., & Linden, D. S. (2004). Evolutionary antenna design for a NASA spacecraft. In U.-M. O’Reilly, et al., editors, Genetic Programming Theory and Practice II, chapter 18, pages 301-315. Springer, Ann Arbor, 13-15 May 2004. google scholar
Mallick, D., Lee, V. C. S.,& Ong, Y. S. (2008). An empirical study of Genetic Programming generated trading rules in computerized stock trading service system. 2008 International Conference on Service Systems and Service Management, Melbourne, VIC, Australia, IEEE. google scholar
Man, K. F., Tang, K. S., Kwong, S., & Halang, W. A. (1997). Genetic algorithms for control and signal processing. Springer, London. google scholar
Manahov, V., & Zhang, H. (2019). Forecasting Financial Markets Using High-Frequency Trading Data: Examination with Strongly Typed Genetic Programming. International Journal of Electronic Commerce, 23(1), 12-32. google scholar
Marney, J. P., Miller, D., Fyfe, C., & Tarbert, H. F. E. (2001). Risk adjusted returns to technical trading rules: a genetic programming approach. In 7th International Conference of Society of Computational Economics, Yale, 28-29 June 2001. google scholar
Massey, P., Clark, J. A., & Stepney, S. (2005). Evolution of a human-competitive quantum fourier transform algorithm using genetic programming. In H.-G. Beyer, et al., editors, GECCO 2005: Proceedings of the 2005 conference on Genetic and evolutionary computation, volume 2, pages 1657-1663, Washington DC, USA, 25-29 June 2005.ACM Press. google scholar
Michalewicz, Z. (1992). GeneticAlgorithms + Data Structures = Evolution Programs. Berlin: Springer-Verlag. google scholar
Mitchell, M. (1996). An IntroductiontoGeneticAlgorithms. Cambridge, MA: The MIT Press. google scholar
Moore, J., Olson, R., Chen, Y., & Sipper, M. (2019). Automated discovery of test statistics using genetic programming.16th Annual Humies Awards - 2019 - Prague, Czech Republic. URL http://www.human-competitive.org/awards google scholar
Mousavi, S., Esfahanipour A., & Zarandi, M. H. F. (2014). A novel approach to dynamic portfolio trading system using multitree genetic programming. Knowledge-Based Systems, 66, 68-81. google scholar
Neely, J., Weller, P. A., & Dittmar, R. (1997). Is technical analysis in the foreign exchange market profitable? A genetic programming approach. The Journal of Financial and Quantitative Analysis, 32(4), 405-426. google scholar
Neely, C. J., & Weller, P. A. (1999). Technical trading rules in the european monetary system. Journal of International Money and Finance, 18(3):429-458. google scholar
Neely, C. J., & Weller, P. A. (2001). Technical analysis and central bank intervention. Journal of International Money and Finance, 20(7), 949-970. google scholar
Neely, C. J. (2003). Risk-adjusted, ex ante, optimal technical trading rules in equity markets. International Review of Economics and Finance, 12(1), 69-87. google scholar
Neely, C. J., Weller, P. A., & Ulrich, J. M. (2006). The adaptive markets hypothesis: evidence from the foreign exchange market. Working Paper 2006-046B, Federal Reserve Bank of St. Louis, Research Division, P.O. Box 442, St. Louis, MO 63166, USA. google scholar
Orlov, M., & Sipper, M. (2009). Genetic programming in the wild: Evolving unrestricted bytecode.Proceedings of the 11th Annual conference on Genetic and evolutionary computation, ACM. google scholar
Pal, S. K., & Wang, P. P. (1996). Genetic Algorithms and Pattern Recognition. Boca Raton, FL: CRC Press. google scholar
Perkis, T. (1994). Stack-based genetic programming.Proceedings of the first IEEE conference on evolutionary computation. IEEE world congress on computational intelligence, IEEE. google scholar
Poli, R., Langdon, W. B., & McPhee, N. F. (2008). A field guide to genetic programming. Published via http:// lulu.com and freely available at http://www.gp-field-guide.org.uk (With contributions by J. R. Koza). google scholar
Potvin, J.Y., Soriano, P. & Vallee, M. (2004). Generating trading rules on the stock markets with genetic programming. Computers & Operations Research, 31 (7), 1033-1047. google scholar
Ruengvirayudh, P., & Brooks, G.P. (2016). Comparing Stepwise Regression Models to the Best-Subsets Models, or, the Art of Stepwise. General Linear Model Journal, 42(1), 1-14. google scholar
Schwarz, G. (1978). Estimating the Dimensions of a Model, The Annals of Statistics, 6, 461-464. google scholar
Shichel, Y., Ziserman, E. & Sipper, M. (2005). GP-robocode: Using genetic programming to evolve robocode players. In M. Keijzer, et al., editors, Proceedings of the 8th European Conference on Genetic Programming, volume 3447 of Lecture Notes in Computer Science, pages 143-154, Lausanne, Switzerland, 30 March - 1 April 2005. Springer. google scholar
Smith, S. F. (1980). A Learning System Based on Genetic Adaptive Algorithms. PhD thesis, University of Pittsburgh. google scholar
Stender, J., Hillebrand, E., & Kingdon, J. (editors). (1994). Genetic Algorithms in Optimization, Simulation, and Modeling. Amsterdam: IOS Publishing. google scholar
Tanimoto, S. L.(1987). The elements of artificial intelligence: an introduction using LISP. New York, NY, USA: Computer Science Press. google scholar
Tolvi, J. (2004). Genetic algorithms for outlier detection and variable selection in linear regression models. Soft Computing, 8, 527-533. google scholar
Tong, H. (1978). On A Threshold Model. Chen, C. (Ed.). Pattern Recognition And Signal Processing, p. 575-586, NATO ASI Series E: Applied Sc. (29), Sijthoff ve Noordhoff, Netherlands. google scholar
Tong, H., & Lim, K.S. (1980).Threshold Autoregression, Limit Cycles and Cyclical Data. Journal of the Royal Statistical Society. Series B (Methodological), 42(3), 245-292. google scholar
Tong, H. (1990). Non-Linear Time Series: A Dynamical System Approach, Oxford University Pres. google scholar
Trujillo, L., & Olague, G. (2006). Using evolution to learn how to perform interest point detection. In X. Y. T. et al., editor, ICPR 2006 18th International Conference on Pattern Recognition, volume 1, pages 211-214. IEEE, 20-24 August 2006a. google scholar
Turing, A. M. (1936). Oncomputable numbers, with an application to the Entscheidungsproblem. Proceedings of theLondon Mathematical Society, Series 2, 42 (1936-7), 230-265. google scholar
Turing, A. M. (1948). Intelligent machinery. [Reprinted in ed. by D.C. Ince, Mechanical Intelligence: Collected Works of A. M. Turing. (North Holland, Amsterdam, 1992), pp. 107-127). Also reprinted in ed. by B. Meltzer, D. Michie. 1969]. Machine Intelligence 5. (Edinburgh University Press, Edinburgh, 1948). google scholar
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433-460. [Reprinted in D. C. Ince (ed), Mechanical Intelligence: Collected Works of A. M. Turing. (North Holland, Amsterdam, 1992), pp. 133-160]. google scholar
Wong, M. L., & Leung, K.S. (2002). Data Mining Using Grammar Based Genetic Programming and Applications. Kluwer Academic Publishers New York, Boston, Dordrecht, London, Moscow. google scholar
Wu, A. S., & Banzhaf. W. (1998). Introduction to the special issue: Variable-length representation and noncoding segments for evolutionary algorithms. Evolutionary Computation, 6(4), iii-vi. google scholar
Yule, G. U. (1926). Why do we sometimes get nonsense correlations between time series? A study in sampling and the nature of time series. J. Roy. Statist. Soc., 89, 1-63. google scholar

Ekonometride Güncel Yöntemler ve Uygulamalar

BÖLÜM

Genetik Programlamanın Temelleri: Genetik Programlama ve Ekonometri İlişkisi Bağlamında Bir İnceleme

Basics of Genetic Programming: An Overview in the Context of the Relationship Between Genetic Programming And Econometrics

Referanslar

PAYLAŞ