A novel speech activity detection algorithm based on the fusion of time domain and frequency domain features


English title: A novel speech activity detection algorithm based on the fusion of time domain and frequency domain features
Authors: LIU Huan (刘欢); WANG Jun (王骏); LIN Qiguang (林其光); WANG Shitong (王士同)
Affiliations: School of Digital Media, Jiangnan University; Baihu Technology Company of Wuxi
Keywords: feature fusion; feature extraction; support vector machine; speech activity detection; principal component analysis
Chinese journal name: HDCB
English journal name: Journal of Jiangsu University of Science and Technology (Natural Science Edition)
Publication date: 2017-02-09 14:46
Publisher: Journal of Jiangsu University of Science and Technology (Natural Science Edition)
Year: 2017
Issue: v.31; No.160
Funding: National Natural Science Foundation of China (61300151); Natural Science Foundation of Jiangsu Province (BK20130155); Natural Science Research Project of Jiangsu Higher Education Institutions (13KJB520001); Innovation Fund for Technology-Based Small and Medium Enterprises, Ministry of Science and Technology (14C26213201061)
Language: Chinese
Page: HDCB201701014
Page count: 6
CN: 01
ISSN: 32-1765/N
Classification number: 77-82

Abstract

To improve the adaptability and robustness of speech activity detection (SAD), a novel SAD method is proposed that fuses time-domain and frequency-domain features. After the speech signal is preprocessed, three time-domain or frequency-domain features, namely harmonicity, clarity, and periodicity, are extracted from each frame and fused with principal component analysis (PCA), and a double-threshold method yields a candidate set of speech endpoints. A support vector machine (SVM) then judges the endpoints in the candidate set to produce the final result. Simulation experiments indicate that, compared with traditional speech endpoint detection algorithms, the proposed method raises detection accuracy, effectively lowers the false-detection and missed-detection rates, and shows better adaptability and robustness, with consistent performance across various noise backgrounds and distortion levels.
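The first two stages named in the abstract, PCA fusion followed by double-threshold candidate selection, can be illustrated with a minimal sketch. It is not the authors' implementation. The feature values are synthetic placeholders standing in for per-frame harmonicity, clarity, and periodicity, the thresholds are arbitrary, and the function names `fuse_with_pca` and `double_threshold` are hypothetical. The double threshold is written here as a simple hysteresis rule on the fused score, which is an assumption about how the two thresholds interact, and the SVM verification stage is omitted.

```python
import numpy as np


def fuse_with_pca(features):
    """Fuse per-frame features (rows = frames) into one score per frame
    by projecting onto the first principal component."""
    x = np.asarray(features, dtype=float)
    # Standardize each feature so no single one dominates the projection.
    x = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-12)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    direction = vt[0]
    # The sign of a principal direction is arbitrary. Assumption: all three
    # features grow in speech frames, so orient the axis to grow with them.
    if direction.sum() < 0:
        direction = -direction
    return x @ direction


def double_threshold(score, high, low):
    """Return candidate segments as (start, end) frame indices, inclusive.
    A segment is seeded where score > high and extended in both
    directions while score stays above low."""
    segments = []
    n = len(score)
    i = 0
    while i < n:
        if score[i] > high:
            start = i
            while start > 0 and score[start - 1] > low:
                start -= 1
            end = i
            while end + 1 < n and score[end + 1] > low:
                end += 1
            segments.append((start, end))
            i = end + 1
        else:
            i += 1
    return segments


# Synthetic placeholder data: 20 frames, "speech" in frames 6..13 with a
# half-level ramp at frames 5 and 14. These are not real measurements.
t = np.arange(20)
base = np.zeros(20)
base[6:14] = 1.0
base[5] = base[14] = 0.5
features = np.column_stack([
    base + 0.05 * np.sin(t),              # stand-in for harmonicity
    0.8 * base + 0.1 + 0.05 * np.cos(t),  # stand-in for clarity
    0.9 * base + 0.05 * np.sin(2 * t),    # stand-in for periodicity
])

score = fuse_with_pca(features)
segments = double_threshold(score, high=1.0, low=-0.5)
print(segments)
```

In the method described in the abstract, each candidate endpoint would next be passed to a trained SVM for the final decision. That step needs labeled training data, so it is left out of this sketch.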

