Research on the Key Technologies of Speech Recognition for Robot Communication (面向机器人对话的语音识别关键技术的研究)

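English title: Research on the Key Technologies of Speech Recognition for Robot Communication
Author: Liu Yang (刘旸)
Degree level: Master's
Discipline: Computer Technology
Chinese keywords: 机器人 ; 语音识别 ; 隐马尔可夫模型(HMM)
English keywords: Robot ; Speech recognition ; Hidden Markov model (HMM)
Degree year: 2009
Advisors: Liu Zhijing (刘志镜) ; Qiu Guoyong (裘国永)
Discipline code: 081201
Degree-granting institution: Xidian University (西安电子科技大学)
Thesis submission date: 2009-04-01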

Abstract

Voice control is arguably the most natural and convenient way to control a robot. Research at home and abroad shows that applying speech recognition to robots, so that people can communicate with them by voice, is becoming an active research topic.
     Speech recognition lets a robot understand natural spoken language, and the recognized information can serve as voice-control signals across many areas of robotics. Applying it to robots brings great convenience to users, so researching and developing a practical robot speech recognition system is important for the wider adoption of robots. The main contents of this thesis are as follows.
     First, starting from the basic principles of speech recognition, the thesis studies the key technologies of speech recognition for robot dialogue. It covers speech signal preprocessing, including sampling, noise removal, endpoint detection, pre-emphasis, and windowing and framing; it compares and analyzes the performance of Linear Prediction Cepstral Coefficients (LPCC) and Mel Frequency Cepstral Coefficients (MFCC); and it studies the mainstream model training and pattern matching techniques that form the core of speech recognition, including the Hidden Markov Model (HMM), dynamic time warping (DTW), vector quantization (VQ), artificial neural networks (ANN), and hybrid HMM/ANN models.
     Second, a speech recognition control system for the robot is designed and implemented. The control software is written in the VC++ integrated development environment and implements a robot voice command recognition algorithm with good recognition performance and high execution efficiency. The system was tested on an AS-R robot, where combining it with sonar and PSD sensors greatly improved the robot's interactivity.
     Experimental results show that the implemented speech recognition control system achieves good recognition performance. The system is also structurally simple, cost-effective, and easy to extend and port, and it has broad application prospects.
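
The preprocessing steps named in the abstract (pre-emphasis, windowing and framing, endpoint detection) can be illustrated with a minimal sketch. This is not the thesis's VC++ implementation, which this record does not reproduce. The function names, the pre-emphasis coefficient 0.97, the Hamming window, and the energy-ratio threshold of 0.1 are all assumptions chosen for illustration, and noise removal is omitted.

```python
import math

def pre_emphasis(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1] to boost high frequencies.
    alpha=0.97 is an assumed, commonly used value."""
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out

def frame_signal(samples, frame_len, hop):
    """Split the signal into overlapping frames and apply a Hamming
    window to each frame (frame_len must be at least 2)."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append([samples[start + i] * window[i]
                       for i in range(frame_len)])
    return frames

def detect_endpoints(frames, ratio=0.1):
    """Return (first, last) indices of frames whose short-time energy
    exceeds ratio * max energy, or None if every frame is silent.
    The ratio threshold is an assumption, not a tuned value."""
    energies = [sum(x * x for x in frame) for frame in frames]
    if not energies or max(energies) == 0:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return active[0], active[-1]

# Synthetic test signal at an assumed 8 kHz sample rate:
# 0.1 s of silence, 0.2 s of a 440 Hz tone, 0.1 s of silence.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(1600)]
signal = [0.0] * 800 + tone + [0.0] * 800

# 25 ms frames (200 samples) with a 10 ms hop (80 samples).
frames = frame_signal(pre_emphasis(signal), frame_len=200, hop=80)
print(detect_endpoints(frames))
```

A single energy threshold is the simplest form of endpoint detection. Practical systems often combine short-time energy with zero-crossing rate and use two thresholds to cope with background noise.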
