Abstract
In recent years, convolutional neural network (CNN) architectures have achieved very good results on 2D image tasks such as semantic segmentation, classification, and retrieval. However, because 3D shape structures are complex and irregular, the convolution and pooling operations of CNNs cannot be applied directly to 3D models. To exploit the advantages that deep learning frameworks have accumulated in 2D image analysis, we adopt a multi-view projection approach to 3D shape classification. Most existing multi-view 3D shape classification methods use fixed viewpoints, and the rendered projections collected from those viewpoints contain considerable redundant information, which interferes with the results. We propose a novel multi-view CNN framework that automatically assesses the contribution of each viewpoint during network training and discards the information from redundant viewpoints, thereby extracting the features that best characterize the shape category and improving the robustness of the network. In addition, we introduce the optimal viewpoint selection method based on viewpoint entropy into 3D shape classification. Compared with fixed-viewpoint methods, this approach retains more of the detail of the shapes and does not require the models to be orientation-aligned. Experiments on the ModelNet10 and ModelNet40 datasets verify the rationality and superiority of both applying viewpoint-entropy-based view selection to 3D shape classification and the proposed view-discrimination-based multi-view information fusion method. The experimental results also show that the proposed method achieves higher classification accuracy than existing 3D shape classification methods based on fixed multi-view projection.
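To make the viewpoint-entropy criterion concrete, the following is a minimal sketch, not the paper's implementation. It assumes the commonly used definition of viewpoint entropy as the Shannon entropy of the relative projected areas of the visible faces (optionally including the background) as seen from one viewpoint. The function names `viewpoint_entropy` and `select_best_views`, and the idea of passing precomputed projected areas as plain lists, are illustrative assumptions; how the areas are rendered and how many views are kept are not specified in the abstract.

```python
import math


def viewpoint_entropy(projected_areas, background_area=0.0):
    """Shannon entropy (in bits) of the relative projected areas for one viewpoint.

    projected_areas: projected area of each mesh face in the rendered view
                     (hypothetical input; faces that are not visible have area 0).
    background_area: projected area of the background, if it should be counted.
    """
    # Faces with zero projected area contribute nothing to the entropy.
    areas = [a for a in projected_areas if a > 0]
    if background_area > 0:
        areas.append(background_area)
    total = sum(areas)
    if total == 0:
        return 0.0
    return -sum((a / total) * math.log2(a / total) for a in areas)


def select_best_views(area_table, k):
    """Return the indices of the k candidate viewpoints with the highest entropy.

    area_table: one list of projected face areas per candidate viewpoint.
    """
    scores = [viewpoint_entropy(areas) for areas in area_table]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Three toy viewpoints: one sees a single face, one sees four faces
    # equally, one sees two faces equally.
    candidates = [[4.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0], [2.0, 2.0]]
    print(select_best_views(candidates, k=2))
```

Under this definition, a viewpoint scores highest when many faces are visible with similar projected areas, which is one way to read the claim that entropy-selected views retain more detail than fixed ones.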
1) The Princeton ModelNet. http://modelnet.cs.princeton.edu/