Low-Delay Vector Excitation Speech Coding System Based on a BP Neural Network


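English title: Low Delay Vector Exciting Speech Coding Algorithm Based on BP Neural Network
Author: Zhao Feng (赵峰)
Thesis level: Master's
Discipline: Signal and Information Processing
Chinese keywords: 语音编码; 非线性预测; 矢量量化; BP网络
English keywords: speech coding; nonlinear prediction; BP network; vector quantization
Degree year: 2005
Advisor: Zhang Gang (张刚)
Discipline code: 081002
Degree-granting institution: Taiyuan University of Technology (太原理工大学)
Submission date: 2005-05-01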

Abstract

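An artificial neural network (ANN) is a complex information-processing network built by interconnecting a large number of processing units. Such a network has learning and memory abilities similar to those of the human brain, as well as the ability to extract features from its input. Because of their nonlinearity, adaptivity, and capacity to learn, ANNs have attracted great attention and have been applied successfully in many fields, such as pattern recognition and image processing, control and optimization, prediction, and communications.
A speech signal is inherently a non-stationary, nonlinear process, yet traditional speech processing has long relied on linear prediction, which cannot accommodate the nonlinear characteristics of speech. Existing neural-network nonlinear filtering methods, meanwhile, offer no effective solution for vector excitation speech coding.
To address the shortcomings of linear filtering, this work first introduces a neural network model into the predictor of a speech coding system. It studies the structure of a neural-network-based speech coding system, as well as the network structure and learning algorithm suited to backward prediction of speech. To meet the real-time requirement, the BP network training procedure is modified by fixing some of the slowly varying coefficients, which shortens the training time. Experiments show that the algorithm improves the signal-to-noise ratio (SNR) by 1.5-2 dB over G.721.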
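The idea of training a BP predictor while holding part of its coefficients fixed can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not say which coefficients are fixed, so the sketch freezes the input weights of half of the hidden units; the network size, learning rate, toy signal, and all function names are hypothetical, and the predictor works on the clean signal rather than on quantized, reconstructed samples.

```python
import math
import random

def forward(x, w_hidden, w_out):
    """Return (prediction, hidden activations) for one input vector x."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * hj for w, hj in zip(w_out, h)), h

def train_step(x, target, w_hidden, w_out, lr, frozen):
    """One backpropagation update; hidden units listed in `frozen` keep their input weights."""
    y, h = forward(x, w_hidden, w_out)
    err = y - target
    for j, hj in enumerate(h):
        if j not in frozen:
            # Hidden-layer gradient, computed with w_out[j] before it is updated.
            delta = err * w_out[j] * (1.0 - hj * hj)
            for i, xi in enumerate(x):
                w_hidden[j][i] -= lr * delta * xi
        w_out[j] -= lr * err * hj  # output weights are always trained

def mse(signal, order, w_hidden, w_out):
    """Mean squared one-step prediction error over the whole signal."""
    errs = [(forward(signal[n - order:n], w_hidden, w_out)[0] - signal[n]) ** 2
            for n in range(order, len(signal))]
    return sum(errs) / len(errs)

def demo(order=4, hidden=6, epochs=30, lr=0.05, seed=1):
    rng = random.Random(seed)
    # Toy stand-in for speech: a sinusoid passed through a mild nonlinearity.
    signal = [math.tanh(1.5 * math.sin(0.3 * n)) for n in range(200)]
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(order)] for _ in range(hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    frozen = set(range(hidden // 2))  # assumption: freeze half of the hidden units
    frozen_before = [list(w_hidden[j]) for j in sorted(frozen)]
    before = mse(signal, order, w_hidden, w_out)
    for _ in range(epochs):
        for n in range(order, len(signal)):
            train_step(signal[n - order:n], signal[n], w_hidden, w_out, lr, frozen)
    after = mse(signal, order, w_hidden, w_out)
    unchanged = frozen_before == [w_hidden[j] for j in sorted(frozen)]
    return before, after, unchanged

before, after, unchanged = demo()
print(f"prediction MSE before training: {before:.4f}, after: {after:.4f}")
print(f"frozen weights unchanged: {unchanged}")
```

Skipping the update for frozen units removes their share of the per-sample gradient work, which is the kind of saving the abstract attributes to fixing coefficients; how much accuracy is lost depends on which coefficients are frozen.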
An artificial neural network (ANN) is a complex information-processing network made up of many processing units. Such a network is capable of learning, memory, and extracting features from its input. ANNs have received great attention and have been applied successfully to pattern recognition and image processing, control and optimization, prediction, communications, and other fields. Speech signals are in essence non-stationary and nonlinear, but traditional speech processing methods have always used linear prediction, which does not adapt well to the nonlinear characteristics of speech. In addition, existing neural-network-based nonlinear prediction algorithms provide no effective scheme for vector excitation speech coding.
To address this shortcoming of linear prediction, this thesis studies nonlinear predictors based on ANNs, which replace the conventional linear prediction (LP) technique. First, the structure of an ANN-based speech coding system is studied. Second, the ANN structure and learning algorithm suited to speech signals are analyzed. Because ANN learning is computationally complex, real-time speech coding is difficult to implement. The thesis therefore improves the BP neural network training procedure and shortens the training time by holding part of the network coefficients fixed. Experimental results indicate that the speech SNR of this algorithm is 1.5-2 dB higher than that of CCITT G.721.
Vector quantization (VQ) is an efficient method in speech coding, but existing neural-network-based nonlinear prediction algorithms provide no effective scheme for vector excitation speech coding. This thesis presents a new concept, a nonlinear inverse filter based on a BP neural network. After off-line training, a unit-transform nonlinear filter with a center tap is obtained, and it is split at the center tap into a forward filter and an inverse filter. Speech is passed through the forward filter to form excitation vectors, and the excitation codebook is obtained by training on these waveform vectors with the LBG algorithm. The encoder searches the codebook for the optimal excitation vector; the decoder passes that vector through the nonlinear inverse filter to produce the synthesized speech. To shorten the search time, the thesis uses a fractal-based search method: the trained codebook is trained again into several sub-codebooks, and a representative codeword is derived for each sub-codebook. During the search, the representative codeword most similar to the original excitation is found first, and then only the corresponding sub-codebook is searched. This method shortens the search time by two orders of magnitude. Based on these ideas, the thesis designs and implements an 8 kbps low-delay vector excitation speech coding algorithm based on a BP neural network. Because the coefficients of the BP neural network are fixed in this algorithm, its computational complexity is 42.2 MIPS. Experiments give an SNR of 15.3323 over 30 sentences.
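The two-stage codebook search (pick the closest representative codeword, then search only its sub-codebook) can be illustrated with a small sketch. The abstract does not describe how the fractal-based method forms the sub-codebooks, so this sketch substitutes plain Lloyd (k-means style) iterations as a stand-in. The random codebook, the vector dimension of 5, the group count of 16, and all function names are assumptions made for the example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, candidates):
    """Index of the candidate vector closest to vec."""
    return min(range(len(candidates)), key=lambda i: sq_dist(vec, candidates[i]))

def assign(codebook, reps):
    """Assign every codeword index to the group of its nearest representative."""
    groups = [[] for _ in reps]
    for idx, codeword in enumerate(codebook):
        groups[nearest(codeword, reps)].append(idx)
    return groups

def split_codebook(codebook, n_groups, iterations=10, seed=0):
    """Cluster the codebook into sub-codebooks; return (representatives, groups)."""
    rng = random.Random(seed)
    reps = [list(codebook[i]) for i in rng.sample(range(len(codebook)), n_groups)]
    dim = len(codebook[0])
    for _ in range(iterations):
        for g, members in enumerate(assign(codebook, reps)):
            if members:  # an empty group keeps its previous representative
                reps[g] = [sum(codebook[i][d] for i in members) / len(members)
                           for d in range(dim)]
    groups = assign(codebook, reps)  # final grouping under the final representatives
    kept = [(r, g) for r, g in zip(reps, groups) if g]  # drop empty sub-codebooks
    return [r for r, _ in kept], [g for _, g in kept]

def two_stage_search(vec, codebook, reps, groups):
    """Return (codeword index, number of distance evaluations)."""
    members = groups[nearest(vec, reps)]
    best = min(members, key=lambda i: sq_dist(vec, codebook[i]))
    return best, len(reps) + len(members)

rng = random.Random(42)
codebook = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(256)]
reps, groups = split_codebook(codebook, n_groups=16)
target = [rng.uniform(-1, 1) for _ in range(5)]
index, evaluations = two_stage_search(target, codebook, reps, groups)
print(f"two-stage: codeword {index} after {evaluations} distance evaluations "
      f"(full search needs {len(codebook)})")
```

The trade-off is that the codeword found this way is not guaranteed to be the global optimum: the best match may sit in a neighboring sub-codebook, so the speed-up is bought with a possible small increase in distortion.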

