Neural Architecture Transfer

2021, IEEE Transactions on Pattern Analysis and Machine Intelligence

https://doi.org/10.1109/TPAMI.2021.3052758

Abstract

Neural architecture search (NAS) has emerged as a promising avenue for automatically designing task-specific neural networks. Existing NAS approaches require one complete search for each deployment specification of hardware or objective. This is a computationally impractical endeavor given the potentially large number of application scenarios. In this paper, we propose Neural Architecture Transfer (NAT) to overcome this limitation. NAT is designed to efficiently generate task-specific custom models that are competitive under multiple conflicting objectives. To realize this goal, we learn task-specific supernets from which specialized subnets can be sampled without any additional training. The key to our approach is an integrated online transfer learning and many-objective evolutionary search procedure. A pre-trained supernet is iteratively adapted while simultaneously searching for task-specific subnets. We demonstrate the efficacy of NAT on 11 benchmark ...
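To make the alternating procedure the abstract describes concrete, below is a minimal Python sketch of one plausible reading: a weight-sharing supernet is (notionally) adapted to the target task while an evolutionary search maintains a Pareto front of subnets under two conflicting objectives. Everything here is a toy stand-in under stated assumptions, not the authors' implementation: the integer subnet encoding, the surrogate objectives, and the simple nondominated selection are all illustrative, whereas the actual method uses many-objective selection and real accuracy/complexity measurements.

    import random

    # Illustrative encoding (an assumption, not the paper's): a subnet is a
    # list of integer choices, one per supernet layer (e.g., kernel-size or
    # expansion-ratio indices).
    NUM_LAYERS = 20
    NUM_CHOICES = 3
    POP_SIZE = 32

    def random_subnet():
        return [random.randrange(NUM_CHOICES) for _ in range(NUM_LAYERS)]

    def adapt_supernet(weights, subnets):
        """Stand-in for the online transfer-learning step: NAT would
        fine-tune the shared, pre-trained supernet weights on the target
        task, guided by the currently promising subnets."""
        return weights

    def evaluate(subnet):
        """Toy surrogate for two conflicting objectives. A real system
        would measure validation accuracy and FLOPs/latency of the
        weight-shared subnet, with no additional training."""
        acc = sum(subnet) / (NUM_LAYERS * (NUM_CHOICES - 1))  # maximize
        cost = sum(c + 1 for c in subnet)                     # minimize
        return acc, cost

    def dominates(a, b):
        """Pareto dominance: no worse in both objectives, strictly better
        in at least one (maximize acc, minimize cost)."""
        return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])

    def nondominated(pop, scores):
        return [p for p, s in zip(pop, scores)
                if not any(dominates(t, s) for t in scores if t is not s)]

    def make_child(parents):
        a, b = random.sample(parents, 2)
        child = [random.choice(genes) for genes in zip(a, b)]  # uniform crossover
        i = random.randrange(NUM_LAYERS)
        child[i] = random.randrange(NUM_CHOICES)               # point mutation
        return child

    # NAT-style outer loop: supernet adaptation alternates with
    # multi-objective evolutionary search; subnets inherit shared weights
    # and are never retrained from scratch.
    weights = None  # a pre-trained supernet checkpoint would be loaded here
    population = [random_subnet() for _ in range(POP_SIZE)]
    for generation in range(10):
        weights = adapt_supernet(weights, population)
        scores = [evaluate(p) for p in population]
        elites = nondominated(population, scores)
        parents = elites if len(elites) >= 2 else population
        population = elites + [make_child(parents)
                               for _ in range(POP_SIZE - len(elites))]

    front = nondominated(population, [evaluate(p) for p in population])
    print(f"{len(front)} subnets on the final Pareto front")

Note the design point this sketch is meant to surface: because candidate subnets inherit the supernet's weights, evaluation involves no per-candidate training, which is what lets a single adapted supernet serve many deployment specifications instead of requiring one complete search per scenario.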
