Research on Speech Endpoint Detection Algorithms in 3G and Their Implementation
Abstract
Speech endpoint detection can greatly reduce the computation required in a real-time system. The system processes only speech input and avoids wasting computation and storage on silent segments, which favors implementation on DSP systems with limited resources. This thesis first analyzes the VAD (Voice Activity Detector) algorithm in the AMR vocoder, the speech coding standard for WCDMA, with emphasis on the algorithm's characteristics and theoretical basis. Its ideas are general and can be adopted in source compression coding and in other applications that need high-precision speech endpoint detection, so they have considerable research value. The thesis then proposes an improved speech endpoint detection algorithm. Simulation data show a substantial performance gain, and the algorithm was put to practical use in the project, where it tested well. The thesis also discusses the principles of the software and hardware design, based on the implementation of a flexible, efficient multi-channel AMR vocoder on the TMS320C6203. The results of simulation and of record-and-playback verification are bit-exact with the results provided by 3GPP. Informal subjective tests under real-time processing indicate that the synthesized speech quality of the AMR vocoder is better than that of RPE-LTP in GSM and reaches toll quality, so it is fully suitable for practical use. This work lays a sound software and hardware foundation for developing vocoder equipment for third-generation mobile communications.
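To make the idea of endpoint detection concrete, here is a minimal sketch of a frame-energy detector. It is not the AMR VAD analyzed in the thesis, nor the improved algorithm the thesis proposes; it only illustrates how silent frames can be skipped. The function names, the fixed energy threshold, and the hangover count are assumptions chosen for illustration. The default frame length of 160 samples corresponds to 20 ms at an 8 kHz sampling rate.

```python
import math


def frame_energies(samples, frame_len):
    """Return the mean-square energy of each complete frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_endpoints(samples, frame_len=160, threshold=0.01, hangover=2):
    """Return (start, end) sample indices of detected speech segments.

    A frame counts as active when its energy exceeds `threshold`
    (an assumed fixed value; practical detectors adapt it to the noise).
    `hangover` is the number of quiet frames tolerated inside a segment
    before the segment is closed.
    """
    energies = frame_energies(samples, frame_len)
    segments = []
    start = None   # index of the first frame of the open segment
    quiet = 0      # consecutive quiet frames seen inside the open segment
    for idx, energy in enumerate(energies):
        if energy > threshold:
            if start is None:
                start = idx
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover:
                last_active = idx - quiet
                segments.append((start * frame_len,
                                 (last_active + 1) * frame_len))
                start = None
                quiet = 0
    if start is not None:
        # The signal ended while a segment was still open.
        last_active = len(energies) - 1 - quiet
        segments.append((start * frame_len, (last_active + 1) * frame_len))
    return segments
```

Only the samples inside the returned segments need to be passed on to the encoder, which is where the savings in computation and storage come from. A fixed threshold is the simplest possible choice and will fail in changing noise, which is one reason standardized detectors use more elaborate decision logic.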
