Research on a Synthesis Algorithm for a Low-Bit-Rate Underwater Speech Communication Method
Abstract
With advances in speech compression, broadband networking, and wireless broadband technology, the transmission rate of underwater speech communication has reached a usable level. The characteristics of the underwater acoustic channel, however, make high-rate, long-range transmission difficult. Using speech recognition and speech synthesis as the source coding and decoding stages lets information cross the channel at a low bit rate, which addresses the limited capacity of the underwater acoustic channel, but the quality of the speech synthesized at the receiver is often unsatisfactory. This paper presents a method that performs speech synthesis by means of speech coding and uses prosodic parameters to improve the synthesized result. The method offers high synthesis quality and needs only a small speech database, which makes it suitable for low-bit-rate embedded underwater speech communication systems and gives it practical engineering value.
     The paper studies the G.729A recommendation and the rules that govern prosodic parameter adjustment between words. It simulates the core algorithms of G.729A: the encoder defined by the recommendation compresses the speech database, and the decoder performs the speech synthesis. It also implements adjustment algorithms for the prosodic parameters of tone pitch, intensity, duration, and tail sound, analyzes how these parameters should be adjusted between words, and finally builds a small-vocabulary text-to-speech system.
     Subjective listening tests of the text-to-speech system show that the reconstructed speech is clear and intelligible and that the synthesized speech has good naturalness. The algorithm has moderate complexity and low delay, so it is suited to embedded systems for untethered speech communication by small underwater vehicles and divers.
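The storage and bit-rate argument in the abstract can be made concrete with a little arithmetic. The sketch below is not taken from the paper; it only applies the published G.729A figures (8 kHz sampling, 10 ms frames, 80 bits per frame, i.e. 8 kbit/s) and compares them with uncompressed 16-bit PCM. The vocabulary size, the word duration, and the idea of sending one fixed-length word index per recognized word are hypothetical values and assumptions chosen for illustration.

```python
# Rough size comparison for a small-vocabulary speech database.
# G.729A constants are from the ITU-T recommendation; everything marked
# "hypothetical" is an illustrative assumption, not a value from the paper.

SAMPLE_RATE_HZ = 8000        # G.729A input sampling rate
PCM_BITS_PER_SAMPLE = 16     # linear PCM input resolution
G729A_FRAME_MS = 10          # one coded frame covers 10 ms of speech
G729A_BITS_PER_FRAME = 80    # 80 bits per 10 ms frame = 8 kbit/s


def pcm_bits(duration_ms: int) -> int:
    """Bits needed to store the segment as uncompressed 16-bit PCM."""
    samples = duration_ms * SAMPLE_RATE_HZ // 1000
    return samples * PCM_BITS_PER_SAMPLE


def g729a_bits(duration_ms: int) -> int:
    """Bits needed to store the segment as G.729A frames (rounded up)."""
    frames = -(-duration_ms // G729A_FRAME_MS)  # ceiling division
    return frames * G729A_BITS_PER_FRAME


def index_bits(vocab_size: int) -> int:
    """Bits for one fixed-length word index (assumed channel payload)."""
    return max(1, (vocab_size - 1).bit_length())


if __name__ == "__main__":
    vocab_size = 200         # hypothetical vocabulary size
    word_ms = 500            # hypothetical average word duration

    raw = vocab_size * pcm_bits(word_ms)
    coded = vocab_size * g729a_bits(word_ms)
    print(f"PCM database:    {raw // 8} bytes")
    print(f"G.729A database: {coded // 8} bytes")
    print(f"compression:     {raw // coded}:1")
    print(f"bits per word on the channel: {index_bits(vocab_size)}")
```

Under these assumptions the coded database is one sixteenth the size of the PCM one, and each word costs a few bits on the channel instead of several thousand, which is the motivation for pairing recognition at the transmitter with codec-based synthesis at the receiver.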
