Gender recognition of speakers based on MFCC and SVM

Journal: Journal of Chongqing University (重庆大学学报)
Authors: XIAO Han-guang (肖汉光) (1, 2); HE Wei (何为) (2)
Affiliations: (1) State Key Laboratory of Power Transmission Equipment & System Security and New Technology, Chongqing University, Chongqing 400030, P.R. China; (2) School of Mathematics and Physics, Chongqing Institute of Technology, Chongqing 400054, P.R. China
Keywords: pattern recognition; classifiers; gender recognition; support vector machine; Mel-frequency cepstral coefficients (MFCC)
Publication date: 2009-07-15
Year: 2009
Issue: 07
Publisher: Journal of Chongqing University

Abstract

A Mandarin Chinese speech database was established for speaker gender recognition. A method is proposed that combines Mel-frequency cepstral coefficients (MFCC) for feature extraction with a support vector machine (SVM) for classification, and it is compared with other classification methods. The experimental results show that the proposed method reaches a speaker gender recognition accuracy of 98.7%, clearly better than the other classifiers.
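
The pipeline named in the abstract (MFCC features fed to an SVM) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the frame length, filter count, number of cepstral coefficients, RBF kernel, and the helper names `mel_filterbank`, `mfcc`, and `toy_voice` are all assumptions, and the "voices" are synthetic harmonic tones standing in for the Mandarin database, which is not available here. It assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.svm import SVC


def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale (assumed layout)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fb[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fb[i - 1, k] = (right - k) / (right - center)
    return fb


def mfcc(signal, sr, n_fft=256, hop=128, n_filters=20, n_ceps=12):
    """Return an (n_frames, n_ceps) array of MFCCs, omitting c0."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)            # windowed frames
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II of the log mel energies, coefficients 1..n_ceps.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(1, n_ceps + 1), n + 0.5) / n_filters)
    return log_mel @ dct.T


def toy_voice(f0, sr, rng, dur=0.5):
    """Synthetic harmonic tone with fundamental f0: a crude stand-in for speech."""
    t = np.arange(int(sr * dur)) / sr
    x = sum(np.sin(2 * np.pi * k * f0 * t + rng.uniform(0, 2 * np.pi)) / k
            for k in range(1, 11))
    return x + 0.01 * rng.standard_normal(t.size)


sr = 8000
rng = np.random.default_rng(0)
X, y = [], []
# Label 0: lower-pitched tones, label 1: higher-pitched tones (toy data only).
for label, (lo, hi) in enumerate([(100, 140), (190, 250)]):
    for _ in range(40):
        signal = toy_voice(rng.uniform(lo, hi), sr, rng)
        X.append(mfcc(signal, sr).mean(axis=0))   # one mean-MFCC vector per utterance
        y.append(label)
X, y = np.array(X), np.array(y)

# Simple shuffled hold-out split: 60 utterances for training, 20 for testing.
order = rng.permutation(len(y))
train, test = order[:60], order[60:]
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X[train], y[train])
accuracy = clf.score(X[test], y[test])
print(f"toy hold-out accuracy: {accuracy:.2f}")
```

The accuracy printed by this sketch describes only the synthetic tones and says nothing about the 98.7% reported in the abstract. Averaging the MFCCs over frames is one simple way to obtain a fixed-length vector per utterance; the record does not state how the paper aggregates frames.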
