Research on the Application of Support Vector Machines in Speech Recognition
Abstract
Speech recognition is an important aspect of speech signal processing and the foundation of human-computer interaction technology, with broad application prospects. Research on speech recognition therefore has significant theoretical value and practical importance.
     This paper first systematically introduces the basic principles of speech recognition, analyzes the limitations and shortcomings of the current mainstream speech recognition methods, outlines the foundations of this work (statistical learning theory and the support vector machine method), and analyzes the prospects for applying support vector machines to speech recognition. To verify how well support vector machines perform in a speech recognition system, four speaker-independent isolated-word speech recognition systems were built, based on support vector machines with a linear kernel, a radial basis function (RBF) kernel, a third-order polynomial kernel, and a sigmoid kernel, and extensive simulation experiments were carried out. The experimental results show that the first three support vector machines all achieve better recognition results than hidden Markov models and also run faster, whereas the sigmoid-kernel support vector machine gives unsatisfactory recognition results. The choice of kernel function therefore directly affects the classification performance of the support vector machine, and in turn the recognition performance of the speech recognition system.
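The abstract names the four kernels but does not give their formulas. As a minimal sketch, the functions below use the common textbook forms of the linear, RBF, third-order polynomial, and sigmoid kernels. The parameter names `gamma` and `coef0` and their default values are assumptions for illustration, not values taken from this work.

```python
import math


def dot(x, z):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    """K(x, z) = <x, z>"""
    return dot(x, z)


def rbf_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2); gamma is the kernel parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def poly3_kernel(x, z, gamma=1.0, coef0=1.0):
    """K(x, z) = (gamma * <x, z> + coef0)^3 (third-order polynomial)."""
    return (gamma * dot(x, z) + coef0) ** 3


def sigmoid_kernel(x, z, gamma=0.5, coef0=0.0):
    """K(x, z) = tanh(gamma * <x, z> + coef0)"""
    return math.tanh(gamma * dot(x, z) + coef0)


def gram_matrix(kernel, samples):
    """Kernel value for every pair of samples (the matrix an SVM trains on)."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    # Toy 2-D vectors standing in for speech feature vectors (assumption).
    samples = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for kernel in (linear_kernel, rbf_kernel, poly3_kernel, sigmoid_kernel):
        print(kernel.__name__)
        for row in gram_matrix(kernel, samples):
            print("  ", [round(v, 3) for v in row])
```

Each kernel produces a different similarity matrix for the same data, which is why the resulting classifiers, and hence the recognition results, can differ.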
     Second, to study how the kernel parameter and the error penalty parameter affect the generalization performance of a support vector machine when the kernel function is fixed, a speaker-independent isolated-word speech recognition system based on an RBF-kernel support vector machine was built. In the experiments, speech recognition was performed with three different sets of values for the kernel parameter and the error penalty parameter. The results show that the values of these two parameters also affect the generalization performance of the support vector machine, and in turn the recognition performance of the speech recognition system.
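For orientation, the two parameters can be located in the standard soft-margin formulation. The notation below ($C$ for the error penalty parameter, $\gamma$ for the RBF kernel parameter, $\xi_i$ for the slack variables) is the usual textbook one and is assumed here; the abstract itself gives no formulas.

```latex
\[
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad
y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n
\]
\[
K(x, z) = \phi(x)^{\top}\phi(z) = \exp\bigl(-\gamma\,\lVert x - z\rVert^{2}\bigr),\qquad \gamma > 0
\]
```

A larger $C$ penalizes training errors more heavily and narrows the margin, while a larger $\gamma$ makes the kernel more local. Both change the decision boundary, which is consistent with the reported sensitivity of the recognition results to these two values.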
     The kernel function type, the kernel parameter, and the error penalty parameter thus directly affect the recognition performance of a speech recognition system based on support vector machines. So far, however, there is no principled method for choosing them; they are selected by experience and by comparing the results of many repeated experiments, which is a serious limitation. As a preliminary study of this problem, this paper uses the particle swarm optimization algorithm to optimize the kernel parameter and the error penalty parameter once the kernel function type is fixed. Speech recognition experiments with a support vector machine using the optimized parameter values show some improvement in the recognition rate.
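The abstract does not describe the particle swarm configuration, so the following is only a minimal sketch of an inertia-weight PSO searching over two parameters. Everything specific here is an assumption: the function names, the swarm settings (`inertia`, `c1`, `c2`), the search ranges, and above all `stand_in_error`, a made-up smooth function that replaces the real objective (in practice, something like the cross-validated error of an SVM trained with the candidate parameter values).

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with inertia-weight PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # best position seen per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia term + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp the coordinate back into the search range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def stand_in_error(params):
    """Hypothetical error surface with its minimum at log2(C)=3, log2(gamma)=-5."""
    log2_c, log2_gamma = params
    return (log2_c - 3.0) ** 2 + (log2_gamma + 5.0) ** 2


if __name__ == "__main__":
    # Assumed search ranges for log2(C) and log2(gamma).
    bounds = [(-5.0, 15.0), (-15.0, 3.0)]
    best, err = pso_minimize(stand_in_error, bounds)
    print("C =", 2.0 ** best[0], "gamma =", 2.0 ** best[1], "error =", err)
```

Searching in log2 space is a design choice of this sketch: useful values of both parameters typically span several orders of magnitude, so a logarithmic scale lets the swarm cover that range evenly.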
