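The Development of Convolutional Neural Networks and Their Classic Papers
In 2012, AlexNet won the ImageNet 2012 image recognition challenge by a wide margin, prompting researchers to revisit and rethink earlier neural networks and convolutional neural networks. Since then, convolutional neural networks have led the current wave of artificial intelligence. This article briefly reviews the development of convolutional neural networks and the classic papers involved.
The papers covered here are shared in the following repository: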
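Predecessors and Early Development of Convolutional Neural Networks
The work in this period laid the theoretical foundation for the rapid growth of modern convolutional neural networks.
In 1980, the Japanese researcher Kunihiko Fukushima proposed the Neocognitron model. For this work he received the 2021 Bower Award and Prize for Achievement in Science. The award citation credits his pioneering research applying principles of neuroscience to engineering through the invention of the first deep convolutional neural network, the Neocognitron, a key contribution to the development of artificial intelligence.
Fukushima K, Miyake S. Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition[M]//Competition and cooperation in neural nets. Springer, Berlin, Heidelberg, 1982: 267-285.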
In 1989, Yann LeCun proposed the first CNN in the true sense: LeNet (1989).
LeCun Y, Boser B, Denker J S, et al. Backpropagation applied to handwritten zip code recognition[J]. Neural computation, 1989, 1(4): 541-551.
In 1998, Yann LeCun presented LeNet in more detail (also known as LeNet-5), a highly influential work.
LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
Basic Architectures and Classic Modules of Modern Convolutional Neural Networks
2012 ILSVRC (classification) winner: AlexNet, which set off the deep learning boom in computer vision
Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. 2012: 1097-1105.
2013 ILSVRC (classification) winner: ZFNet
Zeiler M D, Fergus R. Visualizing and understanding convolutional networks[C]//European conference on computer vision. Springer, Cham, 2014: 818-833.
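2014 ILSVRC (classification) winner: GoogLeNet, which introduced the Inception structure
Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 1-9.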
2014 ILSVRC (classification) runner-up: VGGNet, notable for its study of network depth
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
2015 ILSVRC (classification) winner: ResNet, which introduced the Residual structure
He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.
2016: the Google team combined the Inception and Residual structures and proposed Inception-ResNet
Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning[C]//AAAI. 2017, 4: 12.
2016: Kaiming He proposed a new idea for ResNet: Identity Mapping
He K, Zhang X, Ren S, et al. Identity mappings in deep residual networks[C]//European Conference on Computer Vision. Springer, Cham, 2016: 630-645.
2017: DenseNet
Huang G, Liu Z, Weinberger K Q, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017, 1(2): 3.
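Exploration and Refinement of Convolutional Attention Mechanisms
2017 ILSVRC (classification) winner: SENet (Squeeze-and-Excitation Networks), which introduced the Squeeze-and-Excitation (SE) block; the network combines SE blocks with residual blocks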
Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 7132-7141.
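The Squeeze-and-Excitation block named above can be sketched in a few lines. What follows is a minimal pure-Python illustration rather than the authors' implementation: the function name se_block, the toy input, and the weight matrices w1 and w2 are all hypothetical, and biases are omitted for brevity.

```python
import math


def se_block(feature_maps, w1, w2):
    """Reweight the channels of `feature_maps` (a list of C maps, each an H x W nested list).

    w1 is a (reduced x C) weight matrix and w2 is a (C x reduced) weight matrix.
    Both are hypothetical illustration weights; biases are omitted.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]
    # Excitation: FC -> ReLU -> FC -> sigmoid yields a gate in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in fmap]
        for fmap, gate in zip(feature_maps, gates)
    ]
    return scaled, gates


# Hypothetical toy input: 2 channels of size 2 x 2, reduced to 1 hidden unit.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[0.5, 0.5]]
w2 = [[1.0], [-1.0]]
y, gates = se_block(x, w1, w2)
print(gates)
```

In an SE-ResNet-style network, the recalibrated output of such a block is what gets added to the shortcut branch of a residual block.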
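Development of Lightweight Convolutional Neural Networks
Since 2016, research on lightweight convolutional neural networks has gradually emerged, making it feasible to deploy deep learning vision models on mobile devices.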
2017: MobileNet
Howard A G, Zhu M, Chen B, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv:1704.04861, 2017.
2017: ShuffleNet
Zhang X, Zhou X, Lin M, et al. Shufflenet: An extremely efficient convolutional neural network for mobile devices[J]. arXiv preprint arXiv:1707.01083, 2017.
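2016: Xception [Note: Xception does not aim to make convolutional neural networks lightweight; its goal is to improve performance without increasing network complexity. However, the depthwise convolution idea it uses is key to lightweight networks such as MobileNet, so it is listed here.]
Chollet F. Xception: Deep learning with depthwise separable convolutions[J]. arXiv preprint arXiv:1610.02357, 2017.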
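To illustrate why the depthwise convolution idea matters for lightweight networks, the sketch below compares the weight count of a standard convolution with that of a depthwise separable convolution (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution). The function names and layer sizes are hypothetical values chosen only for illustration, and biases are ignored.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution (bias ignored)."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
k, c_in, c_out = 3, 128, 256
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
print(standard, separable, round(separable / standard, 3))
```

Under these assumptions the ratio of the two counts equals 1/c_out + 1/k^2, so with 3 x 3 kernels the separable form needs roughly one eighth to one ninth of the weights.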
2016: ResNeXt [Note: ResNeXt also aims to improve performance without increasing network complexity; it is listed here for the same reason as Xception.]
Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on. IEEE, 2017: 5987-5995.
2018: MobileNet V2
Sandler M, Howard A, Zhu M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4510-4520.
2018: ESPNet [Note: the ESPNet paper is not purely about a CNN architecture; it was designed for semantic segmentation, but its CNN is also lightweight.]
Mehta S, Rastegari M, Caspi A, et al. Espnet: Efficient spatial pyramid of dilated convolutions for semantic segmentation[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 552-568.
2018: ShuffleNet V2
Ma N, Zhang X, Zheng H T, et al. Shufflenet v2: Practical guidelines for efficient cnn architecture design[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 116-131.
2018: ESPNetV2
Mehta S, Rastegari M, Shapiro L, et al. ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network[J]. arXiv preprint arXiv:1811.11431, 2018.