Multimodal deep neural network for construction waste object segmentation (多模态深度神经网络的固废对象分割)

English title: Multimodal deep neural network for construction waste object segmentation
Authors: 张剑华; 陈嘉伟; 张少波; 郭建双; 刘盛
Authors (English): Zhang Jianhua; Chen Jiawei; Zhang Shaobo; Guo Jianshuang; Liu Sheng; College of Computer Science and Technology, Zhejiang University of Technology
Keywords: multimodal information; construction waste object segmentation; convolutional neural network; conditional random field; depth gradient
Chinese journal name: ZGTB
English journal name: Journal of Image and Graphics
Institution: College of Computer Science and Technology, Zhejiang University of Technology (浙江工业大学计算机科学与技术学院)
Publication date: 2019-07-16
Publisher: Journal of Image and Graphics (中国图象图形学报)
Year: 2019
Issue: v.24; No.279
Funding: National Natural Science Foundation of China (61876167, U1509207)
Language: Chinese
Pages: ZGTB201907013
Page count: 12
CN: 07
ISSN: 11-3758/TB
Classification number: 130-141

Abstract

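Objective: Processing the construction waste produced during urban development and converting it into resources and energy is an excellent model of economic development that protects the environment. However, manual sorting is inefficient, heavily polluting, and hazardous to workers. Industry is currently exploring automatic construction waste sorting systems based on robotic-arm grasping, in which image segmentation is an essential step. Environmental factors at industrial sites, however, severely degrade the color of waste objects, which affects the final segmentation. To address the difficulty of segmenting construction waste images, this paper proposes a method based on a multimodal deep neural network. Method: First, for scenes with severe color degradation, the RGB image and the depth map are fed together into a deep convolutional neural network, which learns high-dimensional features, and a softmax classifier yields the label assignment probability of each pixel. Second, a fully connected conditional random field is built on a new energy function, and the image is segmented by minimizing the energy function to find the globally optimal solution, producing an independent segmentation block for each class of waste object. Finally, the depth gradient is computed from local contour information to accurately separate different instances of the same class. Result: On the construction waste image test set, the method achieves 90.02% mean pixel accuracy and 89.03% mean intersection over union (MIoU). It also shows superiority over several excellent current semantic segmentation algorithms. Conclusion: The method segments and classifies every construction waste object simultaneously and effectively, and provides accurate object contours and categories to an automatic construction waste sorting system, which facilitates automatic grasping by a robotic arm.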
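The two reported figures, mean pixel accuracy and mean intersection over union, are conventionally computed from a per-class confusion matrix. The sketch below shows those standard definitions on a toy label map. The class names, the toy data, and the choice to average only over classes that occur are illustrative assumptions; the record does not describe the paper's evaluation code.

```python
def confusion_matrix(truth, pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def mean_pixel_accuracy(matrix):
    """Average, over classes present in the ground truth, of correct / total pixels."""
    scores = []
    for c, row in enumerate(matrix):
        total = sum(row)
        if total:
            scores.append(row[c] / total)
    return sum(scores) / len(scores)


def mean_iou(matrix):
    """Average, over classes that occur, of intersection / union of truth and prediction."""
    n = len(matrix)
    scores = []
    for c in range(n):
        intersection = matrix[c][c]
        truth_total = sum(matrix[c])
        pred_total = sum(matrix[r][c] for r in range(n))
        union = truth_total + pred_total - intersection
        if union:
            scores.append(intersection / union)
    return sum(scores) / len(scores)


# Toy flattened label maps; hypothetical classes 0 = background, 1 = brick, 2 = wood.
truth = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
pred = [0, 0, 0, 1, 1, 1, 1, 2, 2, 0]
cm = confusion_matrix(truth, pred, num_classes=3)
print(mean_pixel_accuracy(cm), mean_iou(cm))
```

Mean IoU penalizes both missed pixels and false positives for each class, which is why it is usually lower than mean pixel accuracy on the same prediction.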
Objective: Construction waste is no longer useless. Recycling the waste generated during construction and converting it into resources and energy has become an excellent economic development model that also protects the environment. The construction waste situation in China has become increasingly severe. With urbanization, old buildings have been demolished, rebuilt, or replaced by skyscrapers, and once-inhabited areas have gradually been transformed into ever-expanding cities. These cities develop quickly but entail serious hidden dangers, and the construction waste generated by numerous construction sites has become difficult to ignore. Urban construction waste refers to all kinds of waste generated during the construction, transformation, decoration, demolition, and laying of buildings, structures, and their auxiliary facilities. It mainly includes muck, waste concrete, waste brick, waste pipe, and waste wood. According to statistics, construction waste in China now accounts for 30% to 40% of all municipal waste. Over the next 10 years, China will produce more than 1.5 billion tons of construction waste per year on average. By 2020, construction waste will reach 2.6 billion tons; by 2030, it will reach 7.3 billion tons. Resource utilization and recycling are inevitable options for dealing with construction waste. To deal with it effectively, one can start from its characteristics: construction waste is a mixture of various building material wastes, which are in fact unutilized resources. In the 1990s, several communities in California first launched single-stream recycling projects, in which paper products, plastics, glass, metal, and other wastes were all collected as one mixture. The mixture was then separated into single items by a sorting system.
In the sorting system, waste was processed mainly by a combination of hardware equipment and manpower. The system was not fully automated and relied mainly on manual work; thus, it was inefficient. The attempt was nevertheless meaningful because it demonstrated the feasibility of recycling waste. Many construction wastes, such as waste bricks, waste rock, and scrap steel, can be recycled after being sorted, rejected, or crushed. However, systems such as single-stream recycling cannot handle large volumes of construction waste. Owing to the development of artificial intelligence, intelligent robotic equipment can greatly improve the capability, efficiency, and safety of construction waste recycling. Among such equipment, the robotic arm is the most widely used automated mechanical device in industry. It can grasp objects quickly and work continuously, and it therefore offers a new and efficient solution for the automatic sorting of construction waste. Using robotic arms to sort construction waste is a revolutionary innovation for the construction waste treatment industry. The position and contour of each object are indispensable to the robotic arm's grasping task, so computer image segmentation algorithms are well suited to this scene: they can segment a construction waste image accurately and yield the position and contour of each object. Combining robotic arms with image segmentation algorithms to achieve efficient construction waste recovery is worth exploring. However, segmenting construction waste objects from construction waste images is difficult because of the characteristics of industrial sites and of the waste objects themselves.
To address the difficulty of object segmentation in construction waste images, this study proposes a construction waste object segmentation method based on a multimodal deep neural network. The method provides accurate contour and category information for each construction waste object to an automatic sorting system, so that the system can grasp objects automatically with a robotic arm. Method: First, in scenes with severe color degradation, feature learning on RGB images alone does not meet practical needs, so depth information is also needed to learn salient features. We therefore treat the RGB image and the corresponding depth image together as the input of a deep convolutional neural network. The network performs high-dimensional feature learning; the feature maps from the last convolutional layer are weighted and summed and then fed to a softmax classifier, which yields the label assignment probability of each pixel. Second, on the basis of the probability that each pixel belongs to each category, we construct a multi-label, fully connected conditional random field. The unary energy term treats each pixel as an independent item, without considering the relationships between pixels. The binary (pairwise) energy term represents the relationships among pixels: similar pixels are assigned to the same category, and pixels that differ greatly are assigned to different categories, which makes the segmented edges smooth and the segmentation results accurate. According to the actual situation, we propose an energy function suited to construction waste objects. The globally optimal solution is obtained by minimizing the energy function, which segments the objects in the image and generates an independent segmentation block for each type of construction waste object.
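The interplay of the unary and binary terms can be illustrated with a small sketch. It uses the commonly cited fully connected CRF form with a Potts penalty and Gaussian appearance and smoothness kernels, not the paper's own energy function, which the abstract does not specify. The kernel weights and bandwidths, the (gray, depth) feature, and the toy network scores are all hypothetical.

```python
import math


def softmax(scores):
    """Turn one pixel's class scores into label probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_weight(pos_i, pos_j, feat_i, feat_j, w_app=1.0, theta_pos=3.0,
                    theta_feat=0.2, w_smooth=0.5, theta_smooth=1.0):
    """Gaussian affinity: large for pixels that are close together and look alike."""
    d_pos = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    d_feat = sum((a - b) ** 2 for a, b in zip(feat_i, feat_j))
    appearance = w_app * math.exp(-d_pos / (2 * theta_pos ** 2)
                                  - d_feat / (2 * theta_feat ** 2))
    smoothness = w_smooth * math.exp(-d_pos / (2 * theta_smooth ** 2))
    return appearance + smoothness


def crf_energy(labels, probs, positions, features):
    """E(x) = sum_i -log p_i(x_i) + sum_{i<j} [x_i != x_j] * k(i, j)."""
    unary = sum(-math.log(probs[i][labels[i]]) for i in range(len(labels)))
    pairwise = 0.0
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if labels[i] != labels[j]:  # Potts model: only disagreeing pairs pay
                pairwise += pairwise_weight(positions[i], positions[j],
                                            features[i], features[j])
    return unary + pairwise


# Toy 1x4 "image": each pixel has a position and a hypothetical (gray, depth) feature.
positions = [(0, 0), (0, 1), (0, 2), (0, 3)]
features = [(0.20, 0.10), (0.22, 0.12), (0.80, 0.90), (0.82, 0.88)]
logits = [[2.0, 0.0], [0.4, 0.6], [0.0, 2.0], [0.1, 1.9]]  # stand-in network scores
probs = [softmax(s) for s in logits]

argmax = [max(range(2), key=lambda c: p[c]) for p in probs]  # per-pixel best guess
smooth = [0, 0, 1, 1]  # label change placed at the feature edge instead
print(argmax, crf_energy(argmax, probs, positions, features))
print(smooth, crf_energy(smooth, probs, positions, features))
```

In this toy case the per-pixel argmax mislabels the second pixel, whose scores are nearly tied. The labeling that follows the feature edge has a slightly higher unary cost but a much lower pairwise cost, so it has the lower total energy. Exhaustive pairwise summation is only feasible for tiny inputs; practical systems rely on approximate inference.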
Finally, local ambiguous regions are finely segmented according to depth gradient information. Ambiguous regions are the areas where construction waste objects adhere to one another and are difficult to distinguish because their visual characteristics are degraded. The depth gradient is used to obtain a local depth edge map, from which the local ambiguous regions are extracted. Within each such region, the algorithm extracts the effective internal edges to separate adhering objects that belong to the same class. Result: On the construction waste image test set, our method achieves 90.02% mean pixel accuracy and 89.03% mean intersection over union (MIoU). Compared with several excellent semantic segmentation algorithms, the proposed method performs better and improves segmentation accuracy. Conclusion: The proposed algorithm can effectively segment and classify most construction waste objects simultaneously and can provide accurate contour and category information for each object to an automatic construction waste sorting system, thereby facilitating automatic grasping by robotic arms.
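The role of the depth gradient in separating adhering objects of the same class can be sketched as follows. This is a minimal illustration that thresholds a central-difference gradient inside one semantic segment. The threshold, the toy depth values, and the function names are assumptions, and the paper's actual procedure for extracting effective internal edges is not given in the abstract.

```python
def depth_gradient_magnitude(depth):
    """Central-difference gradient magnitude of a 2D depth map (list of rows)."""
    rows, cols = len(depth), len(depth[0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Clamp neighbours at the border so every pixel gets a value.
            up, down = depth[max(r - 1, 0)][c], depth[min(r + 1, rows - 1)][c]
            left, right = depth[r][max(c - 1, 0)], depth[r][min(c + 1, cols - 1)]
            gy, gx = (down - up) / 2.0, (right - left) / 2.0
            magnitude[r][c] = (gx * gx + gy * gy) ** 0.5
    return magnitude


def depth_edges(depth, mask, threshold):
    """Mark pixels inside one class region whose depth gradient exceeds the threshold."""
    magnitude = depth_gradient_magnitude(depth)
    return [[bool(mask[r][c]) and magnitude[r][c] > threshold
             for c in range(len(depth[0]))] for r in range(len(depth))]


# Two touching objects of the same class; the left one is 10 units closer to the camera.
depth = [[50, 50, 50, 60, 60, 60],
         [50, 50, 50, 60, 60, 60],
         [50, 50, 50, 60, 60, 60]]
mask = [[1] * 6 for _ in range(3)]  # one semantic segment covers both objects
edges = depth_edges(depth, mask, threshold=2.0)
for row in edges:
    print("".join("#" if e else "." for e in row))
```

The marked column pair lies on the depth step between the two objects, so it could serve as an internal edge that splits the single class segment into two instances even where color gives no cue.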

