Research on Robot Control Technology Based on Speech Recognition
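Abstract
Speech, the most natural means of human communication, is an important source from which people obtain resources and information. With information technology developing rapidly, enabling computers to "understand" human speech is one of the most convenient forms of human-machine communication, and speech recognition is the technology that makes this possible. In recent years speech recognition has remained a hot topic in computer applications. Likewise, robotics has gradually become one of the hallmarks of modern automation technology. Combining speech recognition with robot control therefore represents automation at a high level of current technology.
     Using the Traveler-II robot from Borch Technology (博创科技), and aiming to achieve voice control of the robot, this work investigates feature-parameter extraction from speech signals, performance optimization of speech recognition algorithms, and robot motion testing. The specific contributions are as follows: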
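     A new method of speech feature extraction is proposed. Starting from the conventional MFCC features, which model the auditory characteristics of the human ear, it adds formant parameters, which reflect the mechanism of human speech production, to form a new feature set (MFCC + formants). Because the method draws on both speech production and auditory perception, the resulting features carry more information than the conventional ones, give higher accuracy, and are more robust to noise.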
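     The abstract does not say how the two feature types are merged. As a minimal sketch, assuming the simplest option of appending per-frame formant frequencies to the per-frame MFCC vector, the combination might look like the following. The function names, the mel-scaling of the formants, and the `scale` factor are illustrative assumptions, not the thesis's method; the MFCC and formant values are expected to come from an external front end.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale mapping: mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def combine_features(mfcc_frames, formant_frames, scale=1000.0):
    """Append mel-scaled formant frequencies to each MFCC frame.

    mfcc_frames:    list of per-frame MFCC vectors (lists of floats)
    formant_frames: list of per-frame formant frequencies in Hz, e.g. [F1, F2, F3]
    scale:          hypothetical divisor that keeps the formant terms in a
                    numeric range comparable to the cepstral coefficients
    """
    if len(mfcc_frames) != len(formant_frames):
        raise ValueError("MFCC and formant sequences must have the same frame count")
    combined = []
    for mfcc, formants in zip(mfcc_frames, formant_frames):
        extra = [hz_to_mel(f) / scale for f in formants]
        combined.append(list(mfcc) + extra)
    return combined
```

     With 12 MFCCs and 3 formants per frame, each combined vector has 15 components. In practice the relative weighting of the two parts would need tuning, which is why `scale` is left as a parameter.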
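     Two problems are addressed: the strong dependence on initial values and the tendency to fall into local optima when training a conventional hidden Markov model (HMM), and the real-time requirement placed on the DTW algorithm. A three-step hybrid algorithm (TSMS) is proposed for the first and an efficient DTW algorithm for the second. Simulation experiments show that TSMS is more adaptable, converges faster, and achieves higher recognition accuracy, while the efficient DTW algorithm reduces the amount of computation and still meets the real-time requirement. Experimental test results for both improved algorithms are given and confirm their advantages.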
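     The abstract does not describe how the efficient DTW algorithm cuts computation. One widely used way to do so, shown here only as an assumed illustration, is to restrict the warping path to a band around the diagonal (a Sakoe-Chiba style window), so that only part of the cost matrix is evaluated. The names `dtw_distance`, `recognize`, and `window` are hypothetical.

```python
import math


def frame_dist(x, y):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dtw_distance(a, b, window=None):
    """DTW distance between two sequences of feature vectors.

    window limits |i - j| along the warping path; None means unconstrained.
    The window is widened to at least |len(a) - len(b)| so the end point
    of the path stays reachable.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("sequences must be non-empty")
    w = max(n, m) if window is None else max(window, abs(n - m))
    inf = math.inf
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix matched with empty prefix
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        # Only cells inside the band are computed; the rest stay infinite.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = frame_dist(a[i - 1], b[j - 1])
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def recognize(query, templates, window=None):
    """Return (label, distance) of the template closest to the query."""
    best = min(templates.items(),
               key=lambda kv: dtw_distance(query, kv[1], window))
    return best[0], dtw_distance(query, best[1], window)
```

     With a band of half-width `w`, roughly `n * (2w + 1)` cells are evaluated instead of `n * m`, which is where the saving comes from. Keeping only two rows of the cost matrix also bounds memory use.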
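     Finally, a robot voice control system is designed. A Microsoft Speech SDK program file is created, the port (interface) program for the efficient DTW algorithm and the robot motion test program are written, and the motion test results are analyzed and explained.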
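     The step from a recognition result to a robot motion can be pictured with a small dispatch table. The sketch below is purely illustrative: the command vocabulary, the wheel-speed values, the rejection threshold, and the function name are all hypothetical, since the abstract gives none of them, and it does not call the Microsoft Speech SDK or any robot API.

```python
# Hypothetical command table: (left wheel, right wheel) speeds in arbitrary units.
COMMANDS = {
    "forward": (1.0, 1.0),
    "backward": (-1.0, -1.0),
    "left": (-0.5, 0.5),
    "right": (0.5, -0.5),
    "stop": (0.0, 0.0),
}


def command_to_wheel_speeds(label, distance, max_distance=50.0):
    """Map a recognized label to wheel speeds.

    distance is the matching score from the recognizer (lower is better,
    as with a DTW distance). Unknown labels or poor matches fall back to
    "stop" so that a misrecognition never starts a motion.
    """
    if distance > max_distance or label not in COMMANDS:
        return COMMANDS["stop"]
    return COMMANDS[label]
```

     Defaulting to "stop" on a rejected or unknown command is a deliberate safety choice in this sketch: a false rejection costs a repeated utterance, whereas a false acceptance would move the robot.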