Neural Architecture Transfer
2021, IEEE Transactions on Pattern Analysis and Machine Intelligence
https://doi.org/10.1109/TPAMI.2021.3052758

Abstract
Neural architecture search (NAS) has emerged as a promising avenue for automatically designing task-specific neural networks. Existing NAS approaches require one complete search for each deployment specification of hardware or objective. This is a computationally impractical endeavor given the potentially large number of application scenarios. In this paper, we propose Neural Architecture Transfer (NAT) to overcome this limitation. NAT is designed to efficiently generate task-specific custom models that are competitive under multiple conflicting objectives. To realize this goal, we learn task-specific supernets from which specialized subnets can be sampled without any additional training. The key to our approach is an integrated online transfer learning and many-objective evolutionary search procedure. A pre-trained supernet is iteratively adapted while simultaneously searching for task-specific subnets. We demonstrate the efficacy of NAT on 11 benchmark ...
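The idea of sampling specialized subnets from a supernet without additional training can be sketched with weight sharing: the supernet holds weights at maximum capacity, and a subnet simply reuses a slice of them. The sketch below is illustrative only; the class and sampling scheme are assumptions for exposition, not NAT's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class SuperLinear:
    """A 'supernet' layer holding weights at maximum width.

    A subnet at a smaller width reuses a leading slice of these shared
    weights directly, so candidate subnets can be evaluated without any
    additional training.
    """
    def __init__(self, in_max, out_max):
        self.W = rng.standard_normal((out_max, in_max))

    def forward(self, x, out_width):
        # Subnet = leading slice of the shared weight matrix.
        return self.W[:out_width, :x.shape[0]] @ x

layer = SuperLinear(in_max=8, out_max=16)
x = rng.standard_normal(8)

# Sample two candidate subnets of different widths from the same weights.
y_small = layer.forward(x, out_width=4)
y_large = layer.forward(x, out_width=16)

assert y_small.shape == (4,)
assert y_large.shape == (16,)
# The narrow subnet's outputs coincide with the first units of the wide one,
# since both read from the same shared parameter tensor.
assert np.allclose(y_small, y_large[:4])
```

In a many-objective search, each sampled subnet would be scored on several conflicting criteria (e.g., accuracy and latency) and the supernet weights adapted between search iterations; the slicing above only illustrates the weight-sharing step.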