Research on Facial Expressions Recognition Method (面部表情识别方法的研究)


English title: Research on Facial Expressions Recognition Method
Author: Ouyang Yan (欧阳琰)
Thesis level: Doctoral
Discipline: Control Science and Engineering
Keywords: Expression recognition technology; Facial action units; Sparse representation; Fusion of multiple classifiers; Robust sparse coding
Degree year: 2013
Advisor: Sang Nong (桑农)
Discipline code: 0811
Degree-granting institution: Huazhong University of Science and Technology (华中科技大学)
Thesis submission date: 2013-05-01

Abstract

Facial expression recognition enables a computer to recognize human facial expressions and thereby supports a more natural human-computer environment, which makes it important for building friendly human-computer interfaces. The technology is already applied in several areas of daily life, including distance education systems, driver fatigue detection systems, and smile detection. This thesis therefore studies several key techniques in facial expression recognition.
     Expression recognition differs from face recognition and texture recognition; it has its own definitions and characteristics. Current research on its key techniques concentrates on two directions: (1) exploiting a property unique to expressions, facial action units, to improve classical face recognition or texture recognition methods; (2) designing recognition methods that simulate the biological visual system, so that they inherit its characteristics, namely a degree of robustness to noise and occlusion. This thesis examines the state-of-the-art techniques in both directions and improves them by combining digital image processing with ideas from biological visual perception.
     Recognition based on facial action units is one of the key techniques in current expression recognition. Building on an analysis of the classical algorithms, the thesis proposes a new feature combination strategy and adopts a classifier training method with stronger classification ability, which further improves recognition accuracy. However, methods based on facial action units do not address noise and occlusion, which are key problems in expression recognition. The thesis therefore also studies a second key technique: expression recognition that is robust to occlusion.
     Because the biological visual system easily distinguishes facial expressions corrupted by noise or occlusion, sparse representation based classification (SRC), which is modeled on that system, is increasingly applied to expression recognition. Building on a study of sparse representation, the thesis proposes several improvements for different application settings: (1) To raise the recognition rate of SRC, histogram of oriented gradients (HOG) descriptors replace the traditional features. (2) An expression recognition model that simulates biological vision is proposed. Using this model as the criterion, local binary patterns (LBP) and HOG are identified as the best feature extraction methods, and a classifier fusion method based on Bayesian theory fuses the two feature-specific classifiers at the decision level, further improving recognition accuracy. (3) To reduce the high computation time of the two preceding methods, two feature selection criteria are proposed and a new feature is selected according to them. The resulting classifier is less accurate than the two preceding methods, but its processing time per image drops substantially and its accuracy still exceeds that of existing SRC-based methods. (4) To further improve robustness, a sparse coding solver based on parameter estimation (robust sparse coding) is applied, and the results show that it improves the robustness of SRC-based expression recognition.
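The SRC idea the abstract builds on can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`soft_threshold`, `sparse_code_ista`, `src_classify`), the ISTA solver, and the parameters `lam` and `n_iter` are assumptions chosen for illustration, and the columns of `D` stand in for whatever feature vectors (for example HOG or LBP descriptors) are extracted from training images. A test sample is coded as a sparse combination of all training samples and assigned to the class whose samples reconstruct it with the smallest residual.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_code_ista(D, y, lam=0.01, n_iter=500):
    """Approximately solve min_x 0.5 * ||y - D x||_2^2 + lam * ||x||_1 with ISTA.

    D: (n_features, n_train) dictionary whose columns are training feature vectors.
    y: (n_features,) feature vector of the test sample.
    """
    # Step size 1/L, where L (squared spectral norm of D) bounds the gradient's
    # Lipschitz constant.
    L = np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x


def src_classify(D, labels, y, lam=0.01):
    """Return (predicted label, per-class residuals) for test sample y.

    labels: (n_train,) array giving the expression class of each column of D.
    """
    x = sparse_code_ista(D, y, lam)
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        # Reconstruct y using only the coefficients that belong to class c.
        residuals[c] = np.linalg.norm(y - D[:, mask] @ x[mask])
    return min(residuals, key=residuals.get), residuals
```

In this sketch the columns of `D` and the vector `y` are expected to be normalized to unit length before calling `src_classify`. Occlusion-robust variants typically change how the coding residual is weighted or modeled rather than this overall decision rule.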

