Deep learning in high dimension: Neural network expression rates for generalized polynomial chaos expansions in UQ
Abstract
We estimate the expressive power of certain deep neural networks (DNNs for short) on a class of countably-parametric, holomorphic maps u : U → ℝ on the parameter domain U = [−1, 1]^ℕ. Dimension-independent rates of best n-term truncations of generalized polynomial chaos (gpc for short) approximations depend only on the summability exponent p of the sequence of their gpc expansion coefficients. So-called (b, ε)-holomorphic maps u, with b ∈ ℓ^p(ℕ) for some p ∈ (0, 1), are known to allow gpc expansions with coefficient sequences in ℓ^p(ℕ). Such maps arise for example as response surfaces of parametric PDEs, with applications in PDE uncertainty quantification (UQ) for many mathematical models in engineering and the sciences. Up to logarithmic terms, we establish the dimension-independent approximation rate s = 1/p − 1 for these functions in terms of the total number N of units and weights in the DNN. It follows that certain DNN architectures can overcome the curse of dimensionality when expressing possibly countably-parametric, real-valued maps with a certain degree of sparsity in the sequences of their gpc expansion coefficients. We also obtain rates of expressive power of DNNs for countably-parametric maps u : U → V, where V is the Hilbert space H¹₀([0, 1]).
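The rate s = 1/p − 1 quoted above originates in the theory of best n-term approximation: by Stechkin's lemma, if a coefficient sequence lies in ℓ^p with 0 < p < 1, retaining its n largest entries leaves an ℓ¹ tail bounded by ‖c‖_{ℓ^p} · n^{−(1/p−1)}. The following minimal sketch (not code from the paper; the model sequence c_k = k^{−2} and the choice p = 0.6 are illustrative assumptions) checks this bound numerically:

```python
# Illustrative check of Stechkin's lemma, which underlies best n-term
# gpc truncation rates: for (c_k) in ell^p with 0 < p < 1, the ell^1
# tail of the best n-term truncation is <= ||c||_{ell^p} * n^{-(1/p-1)}.

c = [k ** -2.0 for k in range(1, 100_001)]  # model coefficient sequence (assumption)
c_sorted = sorted(c, reverse=True)          # best n-term = keep n largest moduli

p = 0.6                                     # (k^{-2}) lies in ell^p for any p > 1/2
norm_p = sum(x ** p for x in c_sorted) ** (1.0 / p)
rate = 1.0 / p - 1.0                        # guaranteed decay rate s = 1/p - 1

for n in (10, 100, 1000):
    tail = sum(c_sorted[n:])                # ell^1 error after keeping n terms
    bound = norm_p * n ** (-rate)           # Stechkin bound
    assert tail <= bound, (n, tail, bound)
```

Note that for this particular sequence the observed tail decays like n^{−1}, i.e. faster than the guaranteed rate n^{−2/3}; Stechkin's bound is an upper estimate that holds uniformly over the whole ℓ^p ball, which is what makes the resulting DNN expression rates dimension-independent.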