Dynamic Hand Gesture Recognition Based on Two-channel Hybrid 3D-2D RBM

English title: Dynamic Hand Gesture Recognition Based on Two-channel Hybrid 3D-2D RBM
Authors: LI Jinghua (李敬华); HUAI Huarui (淮华瑞); KONG Dehui (孔德慧); WANG Lichun (王立春); SUN Yanfeng (孙艳丰)
Affiliation: Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology
Keywords: 3D-2D restricted Boltzmann machine (RBM); histogram of oriented gradient (HOG); optical flow; dynamic hand gesture recognition
Journal code: BJGD
Journal: Journal of Beijing University of Technology (北京工业大学学报)
Publication date: 2019-03-20 14:43
Publisher: Journal of Beijing University of Technology
Year: 2019
Issue: v.45
Funding: National Natural Science Foundation of China (61402024, 61602486); Beijing Natural Science Foundation (4152009); Science and Technology Program of the Beijing Municipal Education Commission (KM201710005022)
Language: Chinese
Page: BJGD201905003
Page count: 8
CN: 05
ISSN: 11-2286/T
Classification number: 20-27

Abstract

To capture the intrinsic spatio-temporal representation of gestures in video-based dynamic hand gesture recognition, this paper proposes a 3D-2D restricted Boltzmann machine (RBM) model that models the spatio-temporal correlations in hand gesture video data. To describe the spatio-temporal characteristics of dynamic gestures more effectively, it further proposes a hybrid feature representation that combines traditional hand-crafted features with the 3D-2D RBM. The method has three stages. First, Canny-2D HOG features are extracted to describe appearance, and optical flow-2D HOG features are extracted to describe motion. Second, a 3D-2D RBM learns latent high-level spatio-temporal semantic features from these descriptors, which strengthens the descriptive power of the gesture representation. Finally, the discrimination results of the appearance channel and the motion channel are fused for recognition, which improves on single-channel classification. Experiments on the public Cambridge Hand Gesture Data Set demonstrate the effectiveness of the method and show that it outperforms state-of-the-art methods.
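The final stage, fusing the appearance-channel and motion-channel decisions, can be illustrated with a small sketch. The abstract does not state which fusion rule the authors use, so the weighted average of per-class scores below is an assumption, and the names `fuse_scores`, `predict`, and `weight` are hypothetical.

```python
from typing import List, Sequence


def fuse_scores(appearance: Sequence[float],
                motion: Sequence[float],
                weight: float = 0.5) -> List[float]:
    """Weighted average of per-class scores from the two channels.

    Assumption: late fusion by weighted averaging. The abstract only says
    that the two channels' discrimination results are fused.
    """
    if len(appearance) != len(motion):
        raise ValueError("both channels must score the same set of classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * a + (1.0 - weight) * m
            for a, m in zip(appearance, motion)]


def predict(appearance: Sequence[float],
            motion: Sequence[float],
            weight: float = 0.5) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(appearance, motion, weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Illustrative scores for three gesture classes (made-up numbers).
appearance_scores = [0.5, 0.4, 0.1]   # appearance channel alone prefers class 0
motion_scores = [0.2, 0.7, 0.1]       # motion channel alone prefers class 1
fused_label = predict(appearance_scores, motion_scores)
```

With equal weights the fused scores favor class 1, even though the appearance channel alone would pick class 0. This is the kind of correction a second channel can provide, under the averaging assumption stated above.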

