A Collection of Deep Learning Papers
This blog records good deep learning papers I have collected over time. Nearly 90% of the entries have citation counts in the hundreds or above; the remaining few reflect personal preference. The list will keep being updated throughout my research career; if you spot an error or want to recommend a paper, please send me a private message.
Deep Learning Books and Introductory Resources
LeCun Y, Bengio Y, Hinton G. Deep learning[J]. Nature, 2015, 521(7553): 436-444. (The most authoritative survey of deep learning)
Bengio, Yoshua, Ian J. Goodfellow, and Aaron Courville. Deep learning. An MIT Press book. (2015). (The classic deep learning textbook)
Deep Learning Tutorial (Hung-yi Lee's overview slides on deep learning, good for getting started)
Deep Learning Tutorial. LISA Lab, University of Montreal, 2014. (The deep learning tutorial that accompanies Theano)
deeplearningbook-chinese (Community-translated Chinese edition of the Deep Learning book)
Early Deep Learning
Hecht-Nielsen R. Theory of the backpropagation neural network[J]. Neural Networks, 1988, 1(Supplement-1): 445-448. (Backpropagation neural networks)
Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets[J]. Neural Computation, 2006, 18(7): 1527-1554. (DBN, the starting point of deep learning)
Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786): 504-507. (Dimensionality reduction with autoencoders)
Ng A. Sparse autoencoder[J]. CS294A Lecture notes, 2011, 72(2011): 1-19. (Sparse autoencoders)
Vincent P, Larochelle H, Lajoie I, et al. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion[J]. Journal of Machine Learning Research, 2010, 11(Dec): 3371-3408. (Stacked autoencoders, SAE)
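To make the denoising criterion above concrete, here is a minimal sketch of a denoising autoencoder; the use of PyTorch, the layer sizes, and the Gaussian corruption with an MSE reconstruction loss are my own illustrative choices, not the setup of the cited paper.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Minimal one-layer denoising autoencoder: corrupt -> encode -> decode -> reconstruct."""
    def __init__(self, in_dim=784, hidden_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden_dim, in_dim), nn.Sigmoid())

    def forward(self, x, noise_std=0.3):
        x_noisy = x + noise_std * torch.randn_like(x)  # corrupt the input
        return self.decoder(self.encoder(x_noisy))

# Training objective: reconstruct the *clean* input from the corrupted one.
model = DenoisingAutoencoder()
x = torch.rand(32, 784)                      # a dummy mini-batch
loss = nn.functional.mse_loss(model(x), x)   # reconstruction loss against the clean x
loss.backward()
```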
The Deep Learning Boom: The ImageNet Challenge
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems. 2012.(AlexNet)
Simonyan, Karen, and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014). (VGGNet)
Szegedy, Christian, et al. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. (GoogLeNet)
Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the Inception Architecture for Computer Vision[J]. Computer Science, 2015:2818-2826.(InceptionV3)
He, Kaiming, et al. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015).(ResNet)
Chollet F. Xception: Deep Learning with Depthwise Separable Convolutions[J]. arXiv preprint arXiv:1610.02357, 2016. (Xception)
Huang G, Liu Z, Weinberger K Q, et al. Densely Connected Convolutional Networks[J]. 2016. (DenseNet, CVPR 2017 best paper)
Squeeze-and-Excitation Networks. (SENet, winner of the 2017 ImageNet challenge)
Zhang X, Zhou X, Lin M, et al. Shufflenet: An extremely efficient convolutional neural network for mobile devices[J]. arXiv preprint arXiv:1707.01083, 2017.(Shufflenet)
Sabour S, Frosst N, Hinton G E. Dynamic routing between capsules[C]//Advances in Neural Information Processing Systems. 2017: 3859-3869.(Hinton, capsules)
Training Tricks
Srivastava N, Hinton G E, Krizhevsky A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. Journal of Machine Learning Research, 2014, 15(1): 1929-1958. (Dropout)
Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[J]. arXiv preprint arXiv:1502.03167, 2015.(Batch Normalization)
Lin M, Chen Q, Yan S. Network In Network[J]. Computer Science, 2014. (The inspiration for global average pooling)
Goyal P, Dollár P, Girshick R, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour[J]. 2017. (From Facebook; addresses the engineering problem of accuracy degradation when training with very large batch sizes)
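A minimal sketch that ties the tricks above together; the PyTorch API, layer sizes, and hyperparameters are my own illustrative assumptions, not taken from any of the cited papers. It shows a small convolutional block with batch normalization and dropout, global average pooling instead of a large fully connected head, and the linear learning-rate scaling rule for large minibatches.

```python
import torch
import torch.nn as nn

# Conv block with Batch Normalization and Dropout, ending in global average pooling
# (the Network-in-Network idea) instead of a large fully connected head.
net = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.BatchNorm2d(64),          # Ioffe & Szegedy 2015
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.5),           # Srivastava et al. 2014
    nn.AdaptiveAvgPool2d(1),     # global average pooling (Lin et al., Network In Network)
    nn.Flatten(),
    nn.Linear(64, 10),
)

# Linear scaling rule from Goyal et al. 2017: when the batch size grows by a factor k,
# scale the learning rate by k (typically combined with a warmup phase).
base_lr, base_batch, batch_size = 0.1, 256, 8192
lr = base_lr * batch_size / base_batch
optimizer = torch.optim.SGD(net.parameters(), lr=lr, momentum=0.9)
```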
Recurrent Neural Networks
Mikolov T, Karafiát M, Burget L, et al. Recurrent neural network based language model[C]//Interspeech. 2010, 2: 3. (A classic paper combining RNNs with language modeling)
Kamijo K, Tanigawa T. Stock price pattern recognition - a recurrent neural network approach[C]//Neural Networks, 1990 IJCNN International Joint Conference on. IEEE, 1990: 215-221. (RNNs for stock price prediction)
Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780. (The mathematical foundations of the LSTM)
Sak H, Senior A W, Beaufays F. Long short-term memory recurrent neural network architectures for large scale acoustic modeling[C]//Interspeech. 2014: 338-342. (LSTMs for speech recognition)
Chung J, Gulcehre C, Cho K H, et al. Empirical evaluation of gated recurrent neural networks on sequence modeling[J]. arXiv preprint arXiv:1412.3555, 2014. (The GRU network)
Ling W, Luís T, Marujo L, et al. Finding function in form: Compositional character models for open vocabulary word representation[J]. arXiv preprint arXiv:1508.02096, 2015. (LSTMs applied to word representations)
Huang Z, Xu W, Yu K. Bidirectional LSTM-CRF models for sequence tagging[J]. arXiv preprint arXiv:1508.01991, 2015. (Bi-LSTM for sequence tagging)
Attention Models
Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014. (The paper that introduced the attention model)
Mnih V, Heess N, Graves A. Recurrent models of visual attention[C]//Advances in neural information processing systems. 2014: 2204-2212. (Attention combined with vision)
Xu K, Ba J, Kiros R, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention[C]//ICML. 2015, 14: 77-81. (The classic paper applying attention to image captioning)
Lee C Y, Osindero S. Recursive Recurrent Nets with Attention Modeling for OCR in the Wild[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 2231-2239. (Attention applied to OCR)
Gregor K, Danihelka I, Graves A, et al. DRAW: A recurrent neural network for image generation[J]. arXiv preprint arXiv:1502.04623, 2015. (DRAW, image generation combined with attention)
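For readers new to the attention papers above, here is a minimal sketch of additive (Bahdanau-style) attention scoring; the PyTorch module, the dimensions, and the variable names are my own illustrative assumptions rather than any paper's reference implementation.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Score each encoder state against the decoder state, softmax the scores,
    and return the weighted sum of encoder states (the context vector)."""
    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        self.W_enc = nn.Linear(enc_dim, attn_dim, bias=False)
        self.W_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, enc_states, dec_state):
        # enc_states: (batch, seq_len, enc_dim); dec_state: (batch, dec_dim)
        scores = self.v(torch.tanh(self.W_enc(enc_states) + self.W_dec(dec_state).unsqueeze(1)))
        weights = torch.softmax(scores, dim=1)        # (batch, seq_len, 1) attention weights
        context = (weights * enc_states).sum(dim=1)   # (batch, enc_dim) context vector
        return context, weights

attn = AdditiveAttention(enc_dim=256, dec_dim=512, attn_dim=128)
context, weights = attn(torch.randn(4, 20, 256), torch.randn(4, 512))
```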
Generative Adversarial Networks
Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in neural information processing systems. 2014: 2672-2680. (The paper that introduced GANs and opened up the whole field)
Mirza M, Osindero S. Conditional generative adversarial nets[J]. arXiv preprint arXiv:1411.1784, 2014.(CGAN)
Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks[J]. arXiv preprint arXiv:1511.06434, 2015. (DCGAN)
Denton E L, Chintala S, Fergus R. Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks[C]//Advances in neural information processing systems. 2015: 1486-1494.(LAPGAN)
Chen X, Duan Y, Houthooft R, et al. Infogan: Interpretable representation learning by information maximizing generative adversarial nets[C]//Advances in Neural Information Processing Systems. 2016: 2172-2180. (InfoGAN)
Arjovsky M, Chintala S, Bottou L. Wasserstein GAN[J]. arXiv preprint arXiv:1701.07875, 2017. (WGAN)
Zhu J Y, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[J]. arXiv preprint arXiv:1703.10593, 2017.(CycleGAN)
Yi Z, Zhang H, Gong P T. DualGAN: Unsupervised Dual Learning for Image-to-Image Translation[J]. arXiv preprint arXiv:1704.02510, 2017. (DualGAN)
Isola P, Zhu J Y, Zhou T, et al. Image-to-image translation with conditional adversarial networks[J]. arXiv preprint arXiv:1611.07004, 2016.(pix2pix)
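As a refresher on the original GAN formulation that the papers above build on, here is a minimal sketch of the alternating discriminator/generator update; the toy MLPs, data, optimizer settings, and the non-saturating generator loss are my own illustrative choices, not any particular paper's setup.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))   # generator: noise -> sample
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))    # discriminator: sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, 2) + 3.0        # toy "real" data
    noise = torch.randn(32, 16)

    # 1) Discriminator step: push real samples toward label 1 and fakes toward label 0.
    fake = G(noise).detach()
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Generator step (non-saturating loss): make D label fresh fakes as real.
    fake = G(torch.randn(32, 16))
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```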
Object Detection
Szegedy C, Toshev A, Erhan D. Deep neural networks for object detection[C]//Advances in Neural Information Processing Systems. 2013: 2553-2561. (Early deep-learning-based object detection)
Girshick, Ross, et al. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. (R-CNN)
He K, Zhang X, Ren S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[C]//European Conference on Computer Vision. Springer International Publishing, 2014: 346-361. (Kaiming He's SPP-Net)
Girshick R. Fast r-cnn[C]//Proceedings of the IEEE International Conference on Computer Vision. 2015: 1440-1448. (Fast R-CNN, faster than R-CNN)
Ren S, He K, Girshick R, et al. Faster r-cnn: Towards real-time object detection with region proposal networks[C]//Advances in neural information processing systems. 2015: 91-99. (Faster R-CNN, faster still)
Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 779-788. (YOLO, real-time object detection)
Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]//European Conference on Computer Vision. Springer International Publishing, 2016: 21-37.(SSD)
Li Y, He K, Sun J. R-fcn: Object detection via region-based fully convolutional networks[C]//Advances in Neural Information Processing Systems. 2016: 379-387. (R-FCN)
Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[J]. arXiv preprint arXiv:1708.02002, 2017. (Focal loss)
One/Zero-Shot Learning
Fei-Fei L, Fergus R, Perona P. One-shot learning of object categories[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(4): 594-611. (One-shot learning)
Larochelle H, Erhan D, Bengio Y. Zero-data learning of new tasks[J]. 2008: 646-651. (The paper that introduced zero-shot learning)
Palatucci M, Pomerleau D, Hinton G E, et al. Zero-shot learning with semantic output codes[C]//Advances in neural information processing systems. 2009: 1410-1418. (A classic application of zero-shot learning)
Image Segmentation
Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440. (A somewhat old but very classic semantic segmentation paper, CVPR 2015)
Chen L C, Papandreou G, Kokkinos I, et al. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs[J]. arXiv preprint arXiv:1606.00915, 2016. (DeepLab)
Zhao H, Shi J, Qi X, et al. Pyramid scene parsing network[J]. arXiv preprint arXiv:1612.01105, 2016.(PSPNet)
Yu F, Koltun V, Funkhouser T. Dilated residual networks[J]. arXiv preprint arXiv:1705.09914, 2017.
He K, Gkioxari G, Dollár P, et al. Mask R-CNN[J]. arXiv preprint arXiv:1703.06870, 2017. (Kaiming He's Mask R-CNN)
Hu R, Dollár P, He K, et al. Learning to Segment Every Thing[J]. arXiv preprint arXiv:1711.10370, 2017. (An enhanced version of Mask R-CNN)
Person Re-ID
Yi D, Lei Z, Liao S, et al. Deep metric learning for person re-identification[C]//Pattern Recognition (ICPR), 2014 22nd International Conference on. IEEE, 2014: 34-39. (An early CNN-based metric-learning approach to Re-ID; the network looks very simple by today's standards)
Ding S, Lin L, Wang G, et al. Deep feature learning with relative distance comparison for person re-identification[J]. Pattern Recognition, 2015, 48(10): 2993-3003. (Triplet loss)
Cheng D, Gong Y, Zhou S, et al. Person re-identification by multi-channel parts-based cnn with improved triplet loss function[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 1335-1344. (Improved triplet loss)
Hermans A, Beyer L, Leibe B. In Defense of the Triplet Loss for Person Re-Identification[J]. arXiv preprint arXiv:1703.07737, 2017. (Triplet loss with hard example mining)
Chen W, Chen X, Zhang J, et al. Beyond triplet loss: a deep quadruplet network for person re-identification[J]. arXiv preprint arXiv:1704.01719, 2017. (Quadruplet loss)
Zheng Z, Zheng L, Yang Y. Unlabeled samples generated by gan improve the person re-identification baseline in vitro[J]. arXiv preprint arXiv:1701.07717, 2017. (The first paper to use GAN-generated images for Re-ID)
Zhang X, Luo H, Fan X, et al. AlignedReID: Surpassing Human-Level Performance in Person Re-Identification[J]. arXiv preprint arXiv:1711.08184, 2017. (AlignedReID, the first to surpass human-level performance)
(Provides a large number of papers in this field; the commonly used datasets and code can all be found on the homepage)
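Since several of the Re-ID papers above are built around triplet-style losses, here is a minimal sketch of a triplet loss with batch-hard mining in the spirit of Hermans et al.; the PyTorch code, embedding size, and margin are my own illustrative assumptions, not values taken from the papers.

```python
import torch

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """For each anchor, pick the hardest positive (farthest same-ID sample)
    and the hardest negative (closest different-ID sample) within the batch."""
    # pairwise Euclidean distances, with a small epsilon for a numerically safe sqrt
    sq = (embeddings.unsqueeze(1) - embeddings.unsqueeze(0)).pow(2).sum(dim=2)
    dist = (sq + 1e-12).sqrt()
    same = labels.unsqueeze(0) == labels.unsqueeze(1)           # (N, N) same-identity mask
    hardest_pos = (dist * same.float()).max(dim=1).values       # farthest positive per anchor
    # mask out same-identity pairs with +inf so the min picks the closest negative
    hardest_neg = dist.masked_fill(same, float('inf')).min(dim=1).values
    return torch.clamp(hardest_pos - hardest_neg + margin, min=0).mean()

emb = torch.randn(8, 128, requires_grad=True)   # 8 embeddings of dimension 128
ids = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])    # identity labels
loss = batch_hard_triplet_loss(emb, ids)
loss.backward()
```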
Reposted; original article: blog.csdn/qq_21190081/article/details/69564634