A Small Object Semantic Segmentation Algorithm Combined with Object Detection
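  • English title: A small object semantic segmentation algorithm combined with object detection
  • Authors: Hu Tai (胡太); Yang Ming (杨明)
  • Authors (English): Hu Tai; Yang Ming; School of Computer Science and Technology, Nanjing Normal University
  • Keywords: image semantic segmentation; object segmentation; convolutional neural networks; object detection
  • Keywords (English): image semantic segmentation; small objects segmentation; convolutional neural networks; object detection
  • Journal title (Chinese): NJDZ
  • Journal title (English): Journal of Nanjing University (Natural Science)
  • Institution: School of Computer Science and Technology, Nanjing Normal University
  • Publication date: 2019-01-30
  • Publisher: Journal of Nanjing University (Natural Science) (南京大学学报(自然科学))
  • Year: 2019
  • Issue: v.55; No.244
  • Funding: Key Program of the National Natural Science Foundation of China (61432008); National Natural Science Foundation of China (61876087, 61272222); CERNET Next Generation Internet Technology Innovation Project (NGII20170524)
  • Language: Chinese
  • Pages: NJDZ201901007
  • Page count: 12
  • CN: 01
  • ISSN: 32-1169/N
  • Classification number: 79-90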
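Abstract
Convolutional neural networks (CNNs) provide more powerful classifiers than traditional classification algorithms and learn deep features on their own, which has effectively improved the accuracy of image semantic segmentation. However, CNN-based semantic segmentation algorithms still face challenges; for example, strong existing methods struggle to segment small objects in complex scenes. To address the difficulty of segmenting small objects in complex scenes, a small object semantic segmentation algorithm combined with object detection is proposed. Unlike strong existing methods, it does not use a single neural network model to segment the small and larger objects of an image simultaneously; instead, it separates the small object segmentation task from the segmentation of the complete image. The algorithm first trains an object detection model to obtain small object image patches, then designs a small object segmentation network to produce segmentation results for those patches, and finally corrects the segmentation map of the whole image according to those results. The algorithm improves overall performance on the semantic segmentation dataset and effectively addresses the difficulty of segmenting small objects.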
Convolutional neural networks (CNNs) can provide classifiers that are more powerful than traditional classification methods and can automatically learn deep features, which significantly improves the accuracy of image semantic segmentation. However, CNN-based semantic segmentation methods still face challenges, such as the difficulty of segmenting small objects in complex scenes. In this paper, we propose a semantic segmentation algorithm for small objects combined with object detection, aiming to solve the segmentation challenges of small objects. This work does not directly use a single neural network to segment both small and large objects simultaneously. Instead, it separates the small object segmentation task from the complete image segmentation task and trains an object detection model to obtain small object image blocks. A small object segmentation network is designed to produce the small object segmentation results, and these results are used to modify the overall image segmentation results. The modified segmentation maps show better segmentation performance on small objects.
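The final step described in the abstract, correcting the whole-image segmentation map with the results from small object image blocks, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `refine_segmentation`, the box format `(x0, y0, x1, y1)`, the rule of simply overwriting each box region with the block's labels, and the toy threshold segmenter are all assumptions made for illustration, and plain Python lists stand in for image arrays so the example has no dependencies.

```python
def refine_segmentation(image, full_seg, boxes, segment_patch):
    """Return a copy of full_seg with each box region replaced by block labels.

    image: H x W nested list of pixel values (hypothetical single channel).
    full_seg: H x W nested list of integer class labels for the whole image.
    boxes: iterable of (x0, y0, x1, y1) pixel boxes, x1 and y1 exclusive
           (assumed detector output format).
    segment_patch: callable mapping an h x w block to an h x w label block
                   (stands in for the small object segmentation network).
    """
    refined = [row[:] for row in full_seg]  # leave the input map untouched
    height, width = len(full_seg), len(full_seg[0])
    for x0, y0, x1, y1 in boxes:
        # Clip the box to the image and skip boxes that fall outside it.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        if x1 <= x0 or y1 <= y0:
            continue
        patch = [row[x0:x1] for row in image[y0:y1]]
        patch_labels = segment_patch(patch)
        # Assumed merge rule: overwrite the box region with the block result.
        for dy, label_row in enumerate(patch_labels):
            refined[y0 + dy][x0:x1] = label_row
    return refined


# Toy data: the whole-image map missed a bright 2x2 object.
image = [[0.0] * 8 for _ in range(8)]
for y in (3, 4):
    for x in (3, 4):
        image[y][x] = 1.0
full_seg = [[0] * 8 for _ in range(8)]
boxes = [(2, 2, 6, 6)]  # hypothetical detection around the small object


def threshold_segmenter(patch):
    # Hypothetical stand-in for a trained network: label bright pixels as 1.
    return [[1 if value > 0.5 else 0 for value in row] for row in patch]


refined = refine_segmentation(image, full_seg, boxes, threshold_segmenter)
```

In this toy case the four object pixels receive label 1 in `refined` while `full_seg` is left unchanged. A real system might merge the two maps more carefully, for example only where the block-level prediction is confident, but the abstract does not specify the rule.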
