Convergent Time-Varying Regression Models for Data Streams: Tracking Concept Drift by the Recursive Parzen-Based Generalized Regression Neural Networks

    One of the greatest challenges in data mining is the processing and analysis of massive data streams. In contrast to traditional static data mining problems, data streams require that each element is processed only once, that the amount of allocated memory remains constant, and that the models incorporate changes in the investigated streams. The vast majority of available methods have been developed for data stream classification, and only a few attempt to solve regression problems, using various heuristic approaches. In this paper, we develop mathematically justified regression models working in a time-varying environment. More specifically, we study incremental versions of generalized regression neural networks, called IGRNNs, and we prove their tracking properties: weak (in probability) and strong (with probability one) convergence under various concept drift scenarios. First, we present the IGRNNs, based on the Parzen kernels, for modeling stationary systems under nonstationary noise. Next, we extend our approach to modeling time-varying systems under nonstationary noise. We present several types of concept drift handled by our approach in such a way that weak and strong convergence hold under certain conditions. In a series of simulations, we compare our method with commonly used heuristic approaches to dealing with concept drift, based on forgetting mechanisms or sliding windows. Finally, we apply our concept in a real-life scenario, solving the problem of currency exchange rate prediction.
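    The core idea of a recursive Parzen-based regression estimator can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the paper's IGRNN: it maintains running kernel-weighted numerator and denominator sums of a Nadaraya–Watson-style estimate at a fixed grid of query points, updating them in constant memory as each stream element arrives, with a shrinking bandwidth sequence h_n = h0 · n^(−α). The class name, grid-based evaluation, Gaussian kernel, and bandwidth schedule are all illustrative choices; the paper's estimators and their convergence conditions are more general.

    ```python
    import numpy as np

    class RecursiveParzenRegressor:
        """Sketch of a recursive Parzen-kernel regression estimator
        (hypothetical minimal version, stationary-system case).

        Maintains running sums so each stream element (x_n, y_n) is
        processed once, in O(|grid|) time and constant memory.
        """

        def __init__(self, grid, h0=0.5, alpha=0.25):
            self.grid = np.asarray(grid, dtype=float)  # fixed query points
            self.h0, self.alpha = h0, alpha            # bandwidth schedule h_n = h0 * n^-alpha
            self.n = 0
            self.num = np.zeros_like(self.grid)        # running kernel-weighted sum of y
            self.den = np.zeros_like(self.grid)        # running kernel mass

        def update(self, x, y):
            """Incorporate one stream element; old data is never revisited."""
            self.n += 1
            h = self.h0 * self.n ** (-self.alpha)      # shrinking bandwidth
            k = np.exp(-0.5 * ((self.grid - x) / h) ** 2) / h  # Gaussian Parzen kernel
            self.num += k * y
            self.den += k

        def predict(self):
            """Current regression estimate at every grid point."""
            return np.where(self.den > 0,
                            self.num / np.maximum(self.den, 1e-12),
                            0.0)
    ```

    For time-varying systems, the heuristic alternatives mentioned above would replace the plain accumulation in `update` with exponential forgetting (e.g. `self.num = (1 - lam) * self.num + k * y`) or a sliding window of recent elements, trading the convergence guarantees for faster adaptation to drift.
    
    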

    Published: 13 November 2017