Automatic Recognition of Glottal Stops in Patients with Velopharyngeal Insufficiency after Cleft Palate Surgery

English title (as published): Automatic detection of glottal stop from cleft palate patients with incomplete velopharyngeal after cleft palate surgery
Authors: 谭洁; 何凌; 唐铭; 郑谦; 尹恒; 郭春丽
Authors (English): TAN Jie; HE Ling; TANG Ming; ZHENG Qian; YIN Heng; GUO Chun-li
Keywords: cleft palate speech; glottal stop; spectral energy enhancement segment; critical band; wavelet; wavelet packet
Keywords (English, as published): cleft palate speech; glottal stop; spectral energy strengthen segment; critical band; wavelet; wavelet package
Journal code: SJSJ
Journal (English): Computer Engineering and Design
Affiliations: School of Electrical Engineering and Information, Sichuan University; West China Hospital of Stomatology, Sichuan University
Publication date: 2016-08-16
Publisher: 计算机工程与设计 (Computer Engineering and Design)
Year: 2016
Volume and issue: v.37; No.356
Funding: National Natural Science Foundation of China, General Program (81371127)
Language: Chinese
Record ID: SJSJ201608053
Page count: 7
Issue: 08
CN: 11-1775/TP
Pages: 292-298

Abstract

This paper studies glottal stops in cleft palate speech and proposes an automatic glottal stop recognition algorithm based on five acoustic features: the spectral energy enhancement segment, Mel-frequency cepstral coefficients (MFCC), the critical-band power spectrum, wavelet information entropy, and wavelet packet information entropy. The extracted features are combined with a K-nearest neighbor (KNN) classifier to recognize glottal stops automatically. Experimental results show that the detection system exceeds 70% accuracy with each of the five features; wavelet information entropy and wavelet packet information entropy each reach nearly 90%, and the critical-band power spectrum reaches nearly 95%. The method can provide effective clinical diagnostic assistance to speech therapists.
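The pipeline summarized in the abstract (extract an acoustic feature, then classify it with KNN) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar wavelet, the three decomposition levels, the definition of wavelet entropy as the Shannon entropy of relative sub-band energies, k = 3, the function names, and the synthetic "steady" and "burst" toy signals, which are not speech data.

```python
import math
from collections import Counter


def haar_dwt(signal):
    """One level of the Haar wavelet transform: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def wavelet_entropy(signal, levels=3):
    """Shannon entropy of the relative energy in each wavelet sub-band.

    Assumed definition for illustration; the paper's exact formula is not
    given in the abstract.
    """
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))  # final approximation band
    total = sum(energies)
    if total == 0.0:
        return 0.0
    probs = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log(p) for p in probs)


def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training vectors nearest to query (Euclidean)."""
    order = sorted(range(len(train_features)),
                   key=lambda i: math.dist(train_features[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def impulse(position, amplitude, length=16):
    """Synthetic toy signal: a single nonzero sample."""
    return [amplitude if i == position else 0.0 for i in range(length)]


# Synthetic training data (hypothetical, not speech): constant or stepped
# signals versus isolated impulses.
train_signals = [
    ([1.0] * 16, "steady"),
    ([3.0] * 16, "steady"),
    ([1.0] * 8 + [3.0] * 8, "steady"),
    (impulse(0, 1.0), "burst"),
    (impulse(7, 0.5), "burst"),
    (impulse(12, 3.0), "burst"),
]
train_features = [[wavelet_entropy(sig)] for sig, _ in train_signals]
train_labels = [label for _, label in train_signals]

burst_prediction = knn_predict(train_features, train_labels,
                               [wavelet_entropy(impulse(5, 2.0))])
steady_prediction = knn_predict(train_features, train_labels,
                                [wavelet_entropy([2.0] * 16)])
print(burst_prediction, steady_prediction)
```

In this toy setup the feature vector has a single entry; a real system would presumably stack several features per speech frame and evaluate on labeled recordings, which this sketch does not attempt.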

