Approximating functions with multi-features by deep convolutional neural networks

    https://doi.org/10.1142/S0219530522400085
    Cited by: 17 (Source: Crossref)

    Deep convolutional neural networks (DCNNs) have achieved great empirical success in many fields such as natural language processing, computer vision, and pattern recognition. However, there is still a lack of theoretical understanding of the flexibility and adaptivity of DCNNs in various learning tasks, and of their power at feature extraction. We propose a generic DCNN structure consisting of two groups of convolutional layers associated with two downsampling operators, followed by a fully connected layer, and determined by only three structural parameters. Our generic DCNNs are capable of extracting various features, including not only polynomial features but also general smooth features. We also show that our DCNNs can circumvent the curse of dimensionality for target functions of compositional form with (symmetric) polynomial features, spatially sparse smooth features, and interaction features. These results demonstrate the expressive power of our DCNN structure, while model selection is simpler than for other deep neural networks since only three hyperparameters controlling the architecture need to be tuned.
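    The abstract describes the architecture only at a high level: two groups of convolutional layers, each followed by a downsampling operator, and a final fully connected layer, with three structural parameters governing the whole network. Below is a minimal numerical sketch of such a structure, assuming one-dimensional, dimension-expanding (zero-padded) convolutions of the kind used in related DCNN approximation theory; the function names (generic_dcnn, conv_layer, downsample), the choice of ReLU activation, and the particular structural parameters (the two group depths and the filter length) are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv_layer(x, w, b):
    # Zero-padded 1D convolution that expands the dimension: an input of
    # length d and a filter of length s + 1 give an output of length d + s,
    # followed by a bias shift and the ReLU activation.
    return relu(np.convolve(x, w) + b)

def downsample(x, m):
    # Downsampling operator: keep every m-th entry.
    return x[::m]

def generic_dcnn(x, group1, group2, fc_weight, fc_bias, m1, m2):
    # Two groups of convolutional layers, each followed by a downsampling
    # operator, then one fully connected layer.
    for w, b in group1:
        x = conv_layer(x, w, b)
    x = downsample(x, m1)
    for w, b in group2:
        x = conv_layer(x, w, b)
    x = downsample(x, m2)
    return fc_weight @ x + fc_bias

# Toy instantiation with random filters (hypothetical parameter choices).
rng = np.random.default_rng(0)
d, s, m1, m2 = 16, 3, 2, 2        # input size, filter length minus 1, pooling factors
depth1, depth2 = 2, 2             # depths of the two convolutional groups

def make_group(depth, dim):
    layers = []
    for _ in range(depth):
        layers.append((rng.standard_normal(s + 1), rng.standard_normal(dim + s)))
        dim += s
    return layers, dim

group1, dim = make_group(depth1, d)
dim = -(-dim // m1)               # length after downsampling by m1
group2, dim = make_group(depth2, dim)
dim = -(-dim // m2)               # length after downsampling by m2
fc_w, fc_b = rng.standard_normal((1, dim)), rng.standard_normal(1)

y = generic_dcnn(rng.standard_normal(d), group1, group2, fc_w, fc_b, m1, m2)
print(y.shape)  # (1,)
```

    In this toy instantiation, the architecture is controlled entirely by the two group depths and the filter length, mirroring the abstract's point that only three architectural hyperparameters need to be tuned.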

    AMSC: 68T07, 68Q32, 41A25
