DOI :10.26650/B/SS10.2021.013.12   IUP :10.26650/B/SS10.2021.013.12    Full Text (PDF)

Basics of Genetic Programming: An Overview in the Context of the Relationship Between Genetic Programming And Econometrics

Seda Karakaş GeyikMehmet Hakan Satman

This study provides a short introduction and an overview of the basics of genetic programming (GP) and its applications in the context of econometrics. In other words, the aim of this study is to gain a deep insight into the relationship between GP and econometrics. Accordingly, first, the basic definition and development of GP are introduced. After the literature review, GP applications in the areas of engineering, electronics, image processing, signal processing, business, economics, financial time series, medicine, biology, entertainment, and computer games are presented. In this part, we aim to highlight the results of GP applications that produce human-competitive results and new inventions, followed by working principles of GP (the basic representation schemes, initialization, and operators used in GP). Then, we discuss how econometric problems can be handled as a GP problem. We showed that GP can successfully solve several econometric problems by producing new inventions and human-competitive results.

GP is a family of searching tools in which the basic aim is to generate mathematical expressions or computer programs between the inputs fed and output. In other words, GP explores the rules in terms of computer programs using given input and output data via evolutionary optimization. GP starts with a population of randomly chosen mathematical expressions or computer programs. The quality of an expression is measured by the fitness or cost, which indicates the difference between the generated and real outputs. Then, the best solutions are recombined to generate offspring. An offspring is a cut and paste copy of the two best candidate solutions. A good offspring is expected by recombining two fittest parents. After generating new candidates, mutations are applied on the constants or functions with small probabilities. Mutations result in small and significant changes depending on the location of the mutated elements in expressions. Small changes perform a local search, whereas significant changes are essential for searching other sides of the search space in terms of optimization. After several steps, generated expressions form a new population. The search process terminates after the meeting the stopping criteria.

GP develops computer programs by optimization, whereas genetic algorithms are generally used for problems in which the main objective is to find a global extremum of a function. Genetic algorithms encode decision variables using binary, integer, permutation, or floatingpoint encoding–decoding schemes, whereas GP generally uses abstract syntax trees, postfix and prefix representations, Java bytecodes, or assembly language opcodes. Because GP develops mathematical expressions and programs, GP tasks are computer-aided, and an interpreter is always needed to evaluate generated offspring for calculating fitness values in runtime. 

In econometrics; the tasks of estimating model parameters, testing the significanceofparameters, and performing forecasts are nontrivial. In numerous cases, researchers face difficulties in selecting the best model, functional structure, correct combination of lagged values or independent variables, etc. The problems of model and variable selections are handled separately or together in the literature. When the number of independent variables in variable-pool increases, selecting the k correct variables among K variables results in a combinatorial problem, where k ≤ K. Furthermore, when the functional form is undefined by theory, nothing can be done, except trying all possible forms with unknown variables. This is a “searching the needle in the haystack” type of problem. GP is broadly used in such problems under the title of symbolic regression. In general terms, the main question of symbolic regression is: “what is the correct functional form with correct independent variables that explains a dependent variable well?”. Besides the estimation process and techniques used, numerous model suggestions can satisfy a model selection criteria used in the classic econometric literature.

Econometric models are generally represented by mathematical expressions. In programming languages, mathematical expressions are also computer programs. For example, piecewise functions are basically represented using if-then-else statements or expressions in several programming languages. Similarly, piecewise regressions, nonlinear time-series models, as well as buy and hold strategies in finance are combinations of mathematical expressions including decisions as computer programs. Developing the best strategy and determining the best time for buying a stock is a well-studied topic in the GP literature.

GP is neither the end of the whole story nor the magic that solves problems. First, GP tends to generate more than sufficient expressions as solutions; several researchers have tried deciphering this phenomenon. Second, an automatic process can almost always find an overfitted set of solutions describing the relations between variables. As in other econometric studies, researchers should be attentive to select interpretable solutions to the concerned problem. In other words, the model that minimizes selection criteria may not be the proper model. Moreover, GP is successful in model and variable selections, rule detection, discovering new location-comparison tests, and inventing patentable things in a human-competitive way.

DOI :10.26650/B/SS10.2021.013.12   IUP :10.26650/B/SS10.2021.013.12    Full Text (PDF)

Genetik Programlamanın Temelleri: Genetik Programlama ve Ekonometri İlişkisi Bağlamında Bir İnceleme

Seda Karakaş GeyikMehmet Hakan Satman

Bu çalışmanın temel amacı genetik programlamanın temelleri ve ekonometri alanındaki uygulamaları hakkında bir değerlendirme sunmaktır. Genetik programlama, sağlanan girdiler ve çıktılar arasında matematiksel gösterimler veya bilgisayar programları üretmeyi amaçlayan evrim temelli bir arama tekniğidir. Çalışmada genetik programlama-ekonometri ilişkisi bağlamında okuyucuya derin bir kavrayış kazandırma motivasyonu ile öncelikle genetik programlamanın tanımı ve gelişimine ilişkin kapsamlı bir literatür taramasına yer verilmiş ve ardından genetik programlamanın çalışma prensipleri tanıtılmıştır. Genetik programlamanın mühendislik, elektronik, görüntü işleme, işaret işleme, işletme, ekonomi, finansal zaman serileri, tıp, biyoloji, eğlence ve bilgisayar oyunları gibi pek çok alanda insanla rekabet edebilir nitelikte, yenilikçi çözümler üretme yeteneği ortaya konulmuştur. Ekonometrik problemlerin nasıl bir genetik programlama problemi olarak düşünülebileceğine ilişkin örnekler ile ekonometrik uygulamalar ve genetik programlama arasındaki ilişkiler özellikle sembolik regresyon çerçevesinde tartışılmıştır. Ayrıca model seçimi, değişken seçimi, finansal zaman serilerinin modellenmesi ve tahmini, finansal zaman serilerinde kural arama, parametreler için anlamlılık testleri konularında genetik programlamanın alternatif olarak çözüm sunabileceği durumlar da ortaya konulmuştur. Çalışmanın sonucunda tartışılan kavramlar bir arada değerlendirildiğinde, ekonometrik modelleme ve tahmin süreçlerinde araştırmacıların modelin fonksiyonel biçimi ve (veya) değişkenleri hakkında bilgisinin olmadığı ya da yetersiz olduğu durumlarda genetik programlamanın geleneksel yöntemlere alternatif olarak bir çözüm sunma konusunda başarılı olabileceği ifade edilmiştir.


  • Adalı, E. (2017). Yapay Zekâ, İstanbul Teknik Üniversitesi Yayını: İnsanlaşan Makineler ve Yapay Zekâ, 75, 8-13. google scholar
  • Affenzeller, M., Winkler, S., Wagner, S. & Beham A. (2009). Genetic Algorithms and GeneticProgramming Modern Concepts and Practical Applications, Taylor & Francis Group, LLC. google scholar
  • Akaike, H. (1973). Information Theory and an Extension of the Maximum Likelihood Principle. In B.N. Petro and F. Csaki ed. 2nd International Symposium on Information Theory, pp. 267-281. google scholar
  • Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans. Autom. Control, 19(6), 716723. google scholar
  • Akaike, H. (1978). A Bayesian analysis of the Minimum AIC Procedure. Annals of the Institute of Statistical Mathematics, 20, 9-14. google scholar
  • Andersen, C. M., & Bro, R. (2010). Variable selection in regression—a tutorial. Journal of Chemometrics, 24, 728-737. google scholar
  • Arganis, M., Val, R., Prats, J., Rodriguez, K., Dommguez, R. & Dolz, J. (2009). Genetic Programming and Standardization in Water Temperature Modelling. Hindawi Publishing Corporation Advances in Civil Engineering, Volume 2009, Article ID 353960, 10 pages. google scholar
  • Arnaldo, I., Krawiec, K., & O’Reilly, U. (2014). Multiple regression genetic programming. Proceedings of the 2014 Annual Conference on Genetic andEvolutionary Computation, ACM, pp. 879-886. google scholar
  • Aryafar, A., Khosravi, V., Zarepourfard, H. & Rooki, R. (2019). Evolving genetic programming and other AI-based models for estimating groundwater quality parameters of the Khezri plain, Eastern Iran. Environmental Earth Sciences, 78, 69. google scholar
  • Azaria, Y., & Sipper, M. (2005a). GP-gammon: Genetically programming backgammon players. Genetic Programming and Evolvable Machines, 6(3), 283-300. google scholar
  • Balcombe, K. G. (2005). Model selection using information criteria and genetic algorithms. Computational Economics, 25(3), 207-228. google scholar
  • Banzhaf, W., Nordin, P., Keller, R. E. & Francone, F. D. (1997). Genetic programming: An introduction: On the automatic evolution of computer programs and its applications (the morgan kaufmann series in artificial intelligence). Morgan Kaufmann Publishers. google scholar
  • Bauer, R. J., Jr. (1994). Genetic Algorithms and Investment Strategies. John Wiley. google scholar
  • Bhanu, B. & Lee, S. (1994). Genetic Learning for Adaptive Image Segmentation. Boston: Kluwer Academic Publishers. google scholar
  • Chadalawada J, Havlicek V., & Babovic, V. (2017). A genetic programming approach to system identification of rainfall-runoff models. Water Resources Management, 31, 3975-3992. google scholar
  • Chen, S. H., Duffy, J., & Yeh, C. H. (2002). Equilibrium selection via adaptation: Using genetic programming to model learning in a coordination game. The Electronic Journal ofEvolutionary Modeling and Economic Dynamics, 15 January 2002. google scholar
  • Corno, F., Sanchez, E., & Squillero, G. (2005). Evolving assembly programs: how games help microprocessor validation. IEEE Transactions on Evolutionary Computation, 9(6), 695-706. google scholar
  • Cramer, N. L. (1985). A Representation for the Adaptive Generation of Simple Sequential Programs, Proceedings ofAn International Conference On Genetic Algorithms and Their Applications, pp. 183-187. google scholar
  • Davidson, J. W., Savic, D. A., & Walters, G.A. (2003). Symbolic and numerical regression: experiments and applications. Information Sciences, 150, 95-117. google scholar
  • Dijkstra, E. W. (1961).Algol 60 translation: An algol 60 translator for the x1 and making a translator for algol 60.Stichting Mathematisch Centrum. Rekenafdeling. Stichting Mathematisch Centrum. google scholar
  • Dorsey, R., & Mayer, W. (1995). Genetic algorithms for estimation problems with multiple optima, nondifferentiability, and other irregular features. Journal ofBusiness & EconomicStatistics, 13(1), 53-66. google scholar
  • Drecourt, J. P. (1999). Application of neural networks and genetic programming to rainfall run off modeling. Water Resources Management, 13, 219-231. google scholar
  • Efron, B., Hastie, T., Johnstone, I. & Tibshirani, R. (2004). Least angle regression. The Annals of statistics, 32(2), 407-499. google scholar
  • Esfahanipour, A., & Mousavi, S. (2011). A genetic programming model to generate risk-adjusted technical trading rules in stock markets. Expert Systems with Applications, 38(7), 8438-8445. google scholar
  • Fischer, M. M., & Leung, Y. (1998). A genetic-algorithms based evolutionary computational neural network formodelling spatial interaction data. The Annals ofRegional Science, 32(3), 437-458. google scholar
  • Forsyth. B. R. (1981). A Darwinian approach to pattern recognition. Kybernetes, 10, 159-166. google scholar
  • Freitas, A. A. A. (2008). Review of evolutionary Algorithms for Data Mining, In: Maimon, O., Rokach, L. (Editors.), Soft Computing for Knowledge Discovery and Data Mining. Springer US, 79-111. google scholar
  • Giustolisi, O., & Savic, D. A. (2006). A symbolic data-driven technique based on evolutionary polynomial regression. Journal of Hydroinformatics, 8(3), 207-222. google scholar
  • Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimisation and Machine Learning, Addison-Wesley. google scholar
  • Hauptman, A., & Sipper, M. (2005). GP-endchess: Using genetic programming to evolve chess endgame players. In M. Keijzer, et al., editors, Proceedings of the 8th European Conference on Genetic Programming, volume 3447 of Lecture Notes in Computer Science, pages 120-131, Lausanne, Switzerland, 30 March - 1 April 2005. Springer. google scholar
  • Holland, J. H. (1975) Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. University of Michigan Press, AnnArbor, MI. google scholar
  • İnce, G. (2017). İnsanlığın Yapay Zekâyla İmtihanı, İstanbul Teknik Üniversitesi Yayını: İnsanlaşan Makineler ve Yapay Zekâ, 75, 14-17. google scholar
  • Jin, N. (2005). Equilibrium selection by co-evolution for bargaining problems under incomplete information about time preferences. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), pages 2661-2668, Edinburgh, 2-5 September 2005. google scholar
  • Jin, N., & Tsang, E. P. K. (2006). Co-adaptive strategies for sequential bargaining problems with discount factors and outside options. In Proceedings of the 2006 IEEE Congress on Evolutionary Computation, pages 7913-7920, Vancouver, 6-21 July 2006. google scholar
  • Kaboudan, M. (1999). A measure of time series’ predictability using genetic programming applied to stock returns. Journal ofForecasting, 18(5), 345-357. google scholar
  • Kaboudan, M. (2000). Genetic programming prediction of stock prices. Computational Economics, 16(3), 207-236. google scholar
  • Kaboudan, M. (2005). Extended daily exchange rates forecasts using wavelet temporal resolutions. New Mathematics and Natural Computation, 1(1), 79-107. google scholar
  • Kotanchek M. E., Vladislavleva, E. Y., & Smits, G.F. (2009) Symbolic regression via GP as a discovery engine: Insights on outliers and prototypes. In: Riolo RL, O’Reilly UM, McConaghy T (eds) Genetic Programming Theory and Practice VII, Genetic and Evolutionary Computation, Springer, Ann Arbor, pp.55-72. google scholar
  • Koza, J. R. (1992). Genetic Programming: On the Programming of Computers by Means ofNatural Selection. The MIT Press. google scholar
  • Koza, J. R. (1997). Genetic Programming. Encyclopedia of Computer Science and Technology to be edited by Allen Kent and James G. Williams, 1-26. google scholar
  • Koza, J. R., Bennett, III F. H., & Stiffelman, O. (1999) Genetic programming as a Darwinian invention machine. In R. Poli, et al., editors, Genetic Programming, Proceedings ofEuroGP’99, volume 1598 of LNCS, pages 93-108, Goteborg, Sweden, 26-27 May 1999. Springer-Verlag. google scholar
  • Koza, J. R., Keane, M. A., Yu, J., Bennet III, F. H., & Mydlowec, W. (2000). Automatic Creation of Human-Competitive Programs and Controllers by Means of Genetic Programming. Genetic Programming and Evolvable Machines, 1, 121-164. google scholar
  • Koza, R., Al-Sakran, S.H., & Jones, L. W. (2005). Automated re-invention of six patented optical lens systems using genetic programming. In H.-G. Beyer, et al., editors, GECCO 2005: Proceedings of the 2005 conference on Genetic and evolutionary computation, volume 2, pages 1953-1960, Washington DC, USA, 25-29 June 2005. ACM Press. google scholar
  • Koza, J. R. (2010). Human-competitive results produced by genetic programming. Genet Program Evolvable Mach, 11, 251-284. google scholar
  • Langdon, W. B. (1998). Genetic Programming and Data Structures: Genetic Programming + Data Structures = Automatic Programming!, Springer. google scholar
  • Langdon, W. B., & Poli, R. (2002). Foundations ofgenetic programming. Springer-Verlag. google scholar
  • Leardi, R., Boggia, R., & Terrile, M. (1992). Genetic algorithms as a strategy for feature selection. Journal of chemometrics, 6(5), 267-281. google scholar
  • Lee, Y. S., & Tong, L. I. (2011). Forecasting time series using a methodology based on autoregressive integrated moving average and genetic programming. Knowledge-Based Systems, 24(1), 66-72. google scholar
  • Lo, A. W. (2004). The Adaptive Markets Hypothesis: Market Efficiency from an Evolutionary Perspective. Journal of Portfolio Management, Forthcoming. Available at SSRN: google scholar
  • Lohn, J. D., Hornby, G. S., & Linden, D. S. (2004). Evolutionary antenna design for a NASA spacecraft. In U.-M. O’Reilly, et al., editors, Genetic Programming Theory and Practice II, chapter 18, pages 301-315. Springer, Ann Arbor, 13-15 May 2004. google scholar
  • Mallick, D., Lee, V. C. S.,& Ong, Y. S. (2008). An empirical study of Genetic Programming generated trading rules in computerized stock trading service system. 2008 International Conference on Service Systems and Service Management, Melbourne, VIC, Australia, IEEE. google scholar
  • Man, K. F., Tang, K. S., Kwong, S., & Halang, W. A. (1997). Genetic algorithms for control and signal processing. Springer, London. google scholar
  • Manahov, V., & Zhang, H. (2019). Forecasting Financial Markets Using High-Frequency Trading Data: Examination with Strongly Typed Genetic Programming. International Journal of Electronic Commerce, 23(1), 12-32. google scholar
  • Marney, J. P., Miller, D., Fyfe, C., & Tarbert, H. F. E. (2001). Risk adjusted returns to technical trading rules: a genetic programming approach. In 7th International Conference of Society of Computational Economics, Yale, 28-29 June 2001. google scholar
  • Massey, P., Clark, J. A., & Stepney, S. (2005). Evolution of a human-competitive quantum fourier transform algorithm using genetic programming. In H.-G. Beyer, et al., editors, GECCO 2005: Proceedings of the 2005 conference on Genetic and evolutionary computation, volume 2, pages 1657-1663, Washington DC, USA, 25-29 June 2005.ACM Press. google scholar
  • Michalewicz, Z. (1992). GeneticAlgorithms + Data Structures = Evolution Programs. Berlin: Springer-Verlag. google scholar
  • Mitchell, M. (1996). An IntroductiontoGeneticAlgorithms. Cambridge, MA: The MIT Press. google scholar
  • Moore, J., Olson, R., Chen, Y., & Sipper, M. (2019). Automated discovery of test statistics using genetic programming.16th Annual Humies Awards - 2019 - Prague, Czech Republic. URL google scholar
  • Mousavi, S., Esfahanipour A., & Zarandi, M. H. F. (2014). A novel approach to dynamic portfolio trading system using multitree genetic programming. Knowledge-Based Systems, 66, 68-81. google scholar
  • Neely, J., Weller, P. A., & Dittmar, R. (1997). Is technical analysis in the foreign exchange market profitable? A genetic programming approach. The Journal of Financial and Quantitative Analysis, 32(4), 405-426. google scholar
  • Neely, C. J., & Weller, P. A. (1999). Technical trading rules in the european monetary system. Journal of International Money and Finance, 18(3):429-458. google scholar
  • Neely, C. J., & Weller, P. A. (2001). Technical analysis and central bank intervention. Journal of International Money and Finance, 20(7), 949-970. google scholar
  • Neely, C. J. (2003). Risk-adjusted, ex ante, optimal technical trading rules in equity markets. International Review of Economics and Finance, 12(1), 69-87. google scholar
  • Neely, C. J., Weller, P. A., & Ulrich, J. M. (2006). The adaptive markets hypothesis: evidence from the foreign exchange market. Working Paper 2006-046B, Federal Reserve Bank of St. Louis, Research Division, P.O. Box 442, St. Louis, MO 63166, USA. google scholar
  • Orlov, M., & Sipper, M. (2009). Genetic programming in the wild: Evolving unrestricted bytecode.Proceedings of the 11th Annual conference on Genetic and evolutionary computation, ACM. google scholar
  • Pal, S. K., & Wang, P. P. (1996). Genetic Algorithms and Pattern Recognition. Boca Raton, FL: CRC Press. google scholar
  • Perkis, T. (1994). Stack-based genetic programming.Proceedings of the first IEEE conference on evolutionary computation. IEEE world congress on computational intelligence, IEEE. google scholar
  • Poli, R., Langdon, W. B., & McPhee, N. F. (2008). A field guide to genetic programming. Published via http:// and freely available at (With contributions by J. R. Koza). google scholar
  • Potvin, J.Y., Soriano, P. & Vallee, M. (2004). Generating trading rules on the stock markets with genetic programming. Computers & Operations Research, 31 (7), 1033-1047. google scholar
  • Ruengvirayudh, P., & Brooks, G.P. (2016). Comparing Stepwise Regression Models to the Best-Subsets Models, or, the Art of Stepwise. General Linear Model Journal, 42(1), 1-14. google scholar
  • Schwarz, G. (1978). Estimating the Dimensions of a Model, The Annals of Statistics, 6, 461-464. google scholar
  • Shichel, Y., Ziserman, E. & Sipper, M. (2005). GP-robocode: Using genetic programming to evolve robocode players. In M. Keijzer, et al., editors, Proceedings of the 8th European Conference on Genetic Programming, volume 3447 of Lecture Notes in Computer Science, pages 143-154, Lausanne, Switzerland, 30 March - 1 April 2005. Springer. google scholar
  • Smith, S. F. (1980). A Learning System Based on Genetic Adaptive Algorithms. PhD thesis, University of Pittsburgh. google scholar
  • Stender, J., Hillebrand, E., & Kingdon, J. (editors). (1994). Genetic Algorithms in Optimization, Simulation, and Modeling. Amsterdam: IOS Publishing. google scholar
  • Tanimoto, S. L.(1987). The elements of artificial intelligence: an introduction using LISP. New York, NY, USA: Computer Science Press. google scholar
  • Tolvi, J. (2004). Genetic algorithms for outlier detection and variable selection in linear regression models. Soft Computing, 8, 527-533. google scholar
  • Tong, H. (1978). On A Threshold Model. Chen, C. (Ed.). Pattern Recognition And Signal Processing, p. 575-586, NATO ASI Series E: Applied Sc. (29), Sijthoff ve Noordhoff, Netherlands. google scholar
  • Tong, H., & Lim, K.S. (1980).Threshold Autoregression, Limit Cycles and Cyclical Data. Journal of the Royal Statistical Society. Series B (Methodological), 42(3), 245-292. google scholar
  • Tong, H. (1990). Non-Linear Time Series: A Dynamical System Approach, Oxford University Pres. google scholar
  • Trujillo, L., & Olague, G. (2006). Using evolution to learn how to perform interest point detection. In X. Y. T. et al., editor, ICPR 2006 18th International Conference on Pattern Recognition, volume 1, pages 211-214. IEEE, 20-24 August 2006a. google scholar
  • Turing, A. M. (1936). Oncomputable numbers, with an application to the Entscheidungsproblem. Proceedings of theLondon Mathematical Society, Series 2, 42 (1936-7), 230-265. google scholar
  • Turing, A. M. (1948). Intelligent machinery. [Reprinted in ed. by D.C. Ince, Mechanical Intelligence: Collected Works of A. M. Turing. (North Holland, Amsterdam, 1992), pp. 107-127). Also reprinted in ed. by B. Meltzer, D. Michie. 1969]. Machine Intelligence 5. (Edinburgh University Press, Edinburgh, 1948). google scholar
  • Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433-460. [Reprinted in D. C. Ince (ed), Mechanical Intelligence: Collected Works of A. M. Turing. (North Holland, Amsterdam, 1992), pp. 133-160]. google scholar
  • Wong, M. L., & Leung, K.S. (2002). Data Mining Using Grammar Based Genetic Programming and Applications. Kluwer Academic Publishers New York, Boston, Dordrecht, London, Moscow. google scholar
  • Wu, A. S., & Banzhaf. W. (1998). Introduction to the special issue: Variable-length representation and noncoding segments for evolutionary algorithms. Evolutionary Computation, 6(4), iii-vi. google scholar
  • Yule, G. U. (1926). Why do we sometimes get nonsense correlations between time series? A study in sampling and the nature of time series. J. Roy. Statist. Soc., 89, 1-63. google scholar


Istanbul University Press aims to contribute to the dissemination of ever growing scientific knowledge through publication of high quality scientific journals and books in accordance with the international publishing standards and ethics. Istanbul University Press follows an open access, non-commercial, scholarly publishing.