Abstract
Scene understanding is an important research direction for intelligent autonomous robots, and image segmentation is its foundation. However, incomplete training datasets and rare situations in real environments lead to incomplete prior knowledge at segmentation time, which degrades segmentation quality. This paper therefore proposes using abstract supportive semantic relationships in red-green-blue-depth (RGB-D) images to address the incomplete-prior-knowledge problem posed by the diversity of object shapes. The method targets object parts that are over-segmented during bottom-up image segmentation under incomplete prior knowledge. First, the supportive semantic relationships between object parts are modelled and their support probabilities are computed. Second, an energy function is constructed that measures the overall stability of the scene. Finally, the Swendsen-Wang cuts (SWC) stochastic graph-partitioning algorithm minimizes this energy function, converting the support probabilities between object parts into strong supportive semantic relationships and merging the corresponding parts, thereby completing image segmentation under incomplete prior knowledge. Experiments show that segmentation combined with supportive semantic relationships re-merges over-segmented parts of the same object when prior knowledge is incomplete, yielding more accurate segmentation results.
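The pipeline described above — compute pairwise support probabilities, define a stability energy over the scene, and minimize it to decide which over-segmented parts merge — can be illustrated with a minimal sketch. This is not the paper's implementation: the edge list, the energy form (merged pairs pay -ln(p), cut pairs pay -ln(1-p)), and the greedy union-find merge standing in for the SWC sampler are all simplifying assumptions for illustration.

```python
import math

def scene_energy(edges, labels):
    """Hypothetical stability energy over segment pairs.

    edges: list of (i, j, p) where p is the support probability
    between object parts i and j. Merged pairs contribute -ln(p),
    cut pairs contribute -ln(1 - p); lower energy = more stable scene.
    """
    e = 0.0
    for i, j, p in edges:
        e += -math.log(p) if labels[i] == labels[j] else -math.log(1.0 - p)
    return e

def merge_by_support(n, edges):
    """Greedy stand-in for the SWC minimization: any support probability
    above 0.5 is treated as a strong supportive relationship, and the two
    parts are merged via union-find (p > 0.5 means -ln(p) < -ln(1-p),
    so merging lowers that edge's energy term)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, j, p in edges:
        if p > 0.5:
            parent[find(i)] = find(j)
    return [find(i) for i in range(n)]

# Four over-segmented parts; parts 0-1-2 support each other strongly.
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.1)]
labels = merge_by_support(4, edges)
print(labels)  # parts 0, 1, 2 share a label; part 3 stays separate
print(scene_energy(edges, labels) < scene_energy(edges, [0, 1, 2, 3]))
```

The real method explores the space of merges stochastically with Swendsen-Wang cuts rather than greedily, which avoids committing to locally attractive but globally unstable merges.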