One-Shot Neural Architecture Search by Dynamically Pruning Supernet in Hierarchical Order
Abstract
Neural Architecture Search (NAS), which aims to automatically design neural architectures, has recently drawn growing research interest. Unlike conventional NAS methods, in which a large number of neural architectures must be trained for evaluation, one-shot NAS methods train only a single supernet that subsumes all possible candidate architectures. As a result, search efficiency can be significantly improved by sharing the supernet's weights during the evaluation of candidate architectures. This strategy greatly speeds up the search process but suffers from the challenge that evaluation based on shared weights is not sufficiently predictive. Recently, pruning the supernet during the search has been shown to be an effective way to alleviate this problem. However, the pruning direction in complex-structured search spaces remains unexplored. In this paper, we revisit the role of the path dropout strategy, which drops neural operations instead of neurons during supernet training, and identify several interesting characteristics of supernets trained with dropout. Based on these observations, we propose a Hierarchically-Ordered Pruning Neural Architecture Search (HOPNAS) algorithm that dynamically prunes the supernet along a proper pruning direction. Experimental results indicate that our method is competitive with state-of-the-art approaches on CIFAR10 and ImageNet.
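To make the path dropout idea concrete, the following is a minimal sketch (not the paper's implementation) of a single supernet edge whose candidate operations are randomly dropped during training, with at least one path always retained. The function names (`path_dropout`, `mixed_op`) and the toy scalar operations are hypothetical stand-ins for the convolutional candidates a real supernet would hold.

```python
import random

def path_dropout(ops, drop_prob, rng=random):
    """Randomly drop candidate operations (paths), keeping at least one."""
    kept = [op for op in ops if rng.random() >= drop_prob]
    if not kept:                     # never drop every path on an edge
        kept = [rng.choice(ops)]
    return kept

def mixed_op(x, ops, drop_prob=0.0, rng=random):
    """Output of one supernet edge: average of the surviving candidates."""
    kept = path_dropout(ops, drop_prob, rng) if drop_prob > 0 else list(ops)
    return sum(op(x) for op in kept) / len(kept)

# Toy candidate operations (stand-ins for conv/pool/identity choices).
candidates = [lambda x: x, lambda x: 2 * x, lambda x: x + 1]

# With drop_prob = 0 every candidate contributes (plain weight sharing);
# with drop_prob > 0 only a random subset of paths is trained each step.
full = mixed_op(3.0, candidates)                    # all three paths
random.seed(0)
dropped = mixed_op(3.0, candidates, drop_prob=0.5)  # random subset
```

Pruning the supernet then amounts to permanently removing low-scoring entries from `candidates`; the paper's contribution concerns the order (direction) in which such removals are made across the hierarchy of edges.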