Implementing Gene Expression Programming in the Parallel Environment for Big Datasets’ Classification

    The paper investigates a Gene Expression Programming (GEP)-based ensemble classifier constructed using the stacked generalization concept. The classifier has been implemented to enable parallel processing using Spark and SWIM, an open-source genetic programming library. The classifier has been validated in computational experiments carried out on benchmark datasets. The paper also investigates how some settings influence the results. The paper extends a previous paper by the authors.
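
    As a rough illustration of the stacked generalization concept mentioned above, the following minimal Python sketch trains a meta-level combiner on the outputs of base-level classifiers. It is not the authors' implementation and uses neither Spark nor SWIM. The threshold rules are hypothetical stand-ins for GEP-evolved expression trees, and all names (make_threshold_rule, fit_meta, predict) and the toy data are assumptions made for this example only.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Sequence[float]
BaseClassifier = Callable[[Row], int]


def make_threshold_rule(feature: int, threshold: float) -> BaseClassifier:
    """Hypothetical stand-in for a GEP-evolved expression tree:
    predicts class 1 when the chosen feature exceeds the threshold."""
    return lambda row: 1 if row[feature] > threshold else 0


def base_outputs(bases: List[BaseClassifier], row: Row) -> Tuple[int, ...]:
    """Level-0 step: collect the prediction of every base classifier."""
    return tuple(clf(row) for clf in bases)


def fit_meta(bases: List[BaseClassifier], rows: List[Row],
             labels: List[int]) -> Dict[Tuple[int, ...], int]:
    """Level-1 step: learn a lookup table that maps each vector of base
    predictions to the majority label seen on held-out rows."""
    counts: Dict[Tuple[int, ...], List[int]] = {}
    for row, label in zip(rows, labels):
        votes = counts.setdefault(base_outputs(bases, row), [0, 0])
        votes[label] += 1
    return {key: (1 if v[1] > v[0] else 0) for key, v in counts.items()}


def predict(bases: List[BaseClassifier],
            meta: Dict[Tuple[int, ...], int], row: Row) -> int:
    """Classify a row with the stacked model; fall back to a plain
    majority vote for base-output vectors never seen by the meta level."""
    key = base_outputs(bases, row)
    if key in meta:
        return meta[key]
    return 1 if 2 * sum(key) > len(key) else 0


# Toy data (assumed): class 1 when exactly one feature is above 0.5.
bases = [make_threshold_rule(0, 0.5), make_threshold_rule(1, 0.5)]
held_out_rows = [(0.2, 0.1), (0.9, 0.2), (0.1, 0.8), (0.9, 0.9)]
held_out_labels = [0, 1, 1, 0]
meta = fit_meta(bases, held_out_rows, held_out_labels)

print(predict(bases, meta, (0.7, 0.3)))
print(predict(bases, meta, (0.8, 0.6)))
```

    In this toy setting the meta level learns a combination of base outputs that a simple majority vote could not express, which is the general motivation for stacking. In a parallel setting, the base classifiers could in principle be induced independently on separate data partitions, but the details of that step are left to the paper itself.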

