Fast speaker recognition based on hierarchical recognition (基于分层识别的快速说话人识别研究)

English title: Fast speaker recognition based on hierarchical recognition
Authors: MAO Zheng-chong (茅正冲); TU Wen-hui (涂文辉)
Affiliation: Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, Jiangnan University
Keywords: Gaussian mixture model; speaker recognition; KL divergence; model clustering
Journal code: JSJK
Journal: Computer Engineering & Science (计算机工程与科学)
Publication date: 2018-07-15
Publisher: Computer Engineering & Science
Year: 2018
Volume and number: v.40; No.283
Funding: National Natural Science Foundation of China (60973095); Natural Science Foundation of Jiangsu Province (BK20131107)
Language: Chinese
Record ID: JSJK201807015
Page count: 6
Issue: 07
CN: 43-1258/TP
Pages: 102-107

Abstract

As the number of speaker models grows, the recognition speed of a speaker recognition system falls until it can no longer meet real-time requirements. To address this problem, we propose a fast speaker recognition method based on a hierarchical recognition model. The approximate KL divergence obtained by the variational method serves as the similarity measure between speaker models, and a speaker model clustering method is designed around it. Experimental results show that the proposed method produces valid speaker model clusters and greatly improves recognition speed at the cost of only a small loss in recognition rate.
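The similarity measure named in the abstract can be illustrated with a minimal sketch of the variational approximation to the KL divergence between two Gaussian mixture models, in the form published by Hershey and Olsen (2007). This is not the authors' code. The sketch assumes diagonal covariances and strictly positive mixture weights, and the data layout (a list of `(weight, mean, variance)` tuples), the function names, and the symmetrized distance are all hypothetical choices made for illustration.

```python
import math


def gaussian_kl_diag(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) between two diagonal-covariance Gaussians."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


def _log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def variational_kl(f, g):
    """Variational approximation of KL(f || g) for two GMMs.

    Each GMM is a list of (weight, mean, variance) components.
    This layout is an assumption of the sketch; weights must be > 0.
    """
    total = 0.0
    for w_a, mu_a, var_a in f:
        # Numerator: component a compared against every component of f itself.
        log_num = _log_sum_exp(
            [math.log(w) - gaussian_kl_diag(mu_a, var_a, mu, var) for w, mu, var in f]
        )
        # Denominator: component a compared against every component of g.
        log_den = _log_sum_exp(
            [math.log(w) - gaussian_kl_diag(mu_a, var_a, mu, var) for w, mu, var in g]
        )
        total += w_a * (log_num - log_den)
    return total


def symmetric_distance(f, g):
    """Symmetrized distance (an assumption here; KL itself is asymmetric)."""
    return variational_kl(f, g) + variational_kl(g, f)


# Two toy two-component speaker models in a 2-D feature space.
speaker_a = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 2.0])]
speaker_b = [(0.3, [0.5, 0.0], [1.5, 1.0]), (0.7, [4.0, 2.0], [1.0, 1.0])]
print(symmetric_distance(speaker_a, speaker_b))
```

A pairwise matrix of such distances could then be passed to any standard clustering routine to group similar speaker models, so that recognition first selects a cluster and then scores only the models inside it. How the paper itself performs the clustering is not specified in this record.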

