Dialect Identification Model Based on Improved Long Short-term Memory Neural Network
  • Title: Dialect Identification Model Based on Improved Long Short-term Memory Neural Network
  • Authors: AI Hu; LI Fei
  • Affiliations: Department of Criminal Technology, Guizhou Police College; Faculty of Humanities, The Education University of Hong Kong
  • Keywords: Chinese dialect identification; Mel frequency cepstral coefficients; regional pet phrase; singular value decomposition; long short-term memory neural network
  • Journal: Science Technology and Engineering (科学技术与工程; journal code KXJS)
  • Publication date: 2019-01-18
  • Publisher: Science Technology and Engineering
  • Year: 2019
  • Volume/issue: v.19; No.471
  • Funding: Guizhou Provincial Science and Technology Program (Qiankehe [2016] Zhicheng 2847)
  • Language: Chinese
  • Record ID: KXJS201902027
  • Issue: 02
  • CN: 11-4688/T
  • Pages: 168-174
  • Page count: 7
Abstract
Dialect identification can provide important clues in criminal investigations. To identify Chinese dialects, a dialect identification model based on an improved long short-term memory (LSTM) neural network is proposed. Speech samples, including regional pet phrases, were collected from six regions of Guizhou Province, and Mel frequency cepstral coefficients (MFCC) were extracted from them. The MFCC of the corresponding regional pet phrase were appended to the MFCC of each speech sample, and a sliding window was then used to split the result into overlapping blocks. Singular value decomposition was applied to each block in both the horizontal and vertical directions, the feature vectors with high contribution rates were retained, and the blocks were merged to form the input to the dialect identification model. The LSTM was first improved, and the dialect identification model was then built on it. The model was trained and validated through cross experiments, which were also used to optimize the width of the sliding window, and it was compared with a recurrent neural network (RNN). The experimental results show that the model based on the improved LSTM is efficient for Chinese dialect identification.
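The feature-preparation steps in the abstract (append the pet-phrase MFCC, cut overlapping blocks with a sliding window, reduce each block by singular value decomposition) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the 13-coefficient MFCC shape, the window width and step, the 0.9 contribution threshold, and the reading of "horizontal and vertical" as keeping both the left and right singular vectors are all hypothetical, and random matrices stand in for real MFCC features.

```python
import numpy as np


def sliding_blocks(features, width, step):
    """Split a (frames, coefficients) matrix into overlapping blocks of `width` frames."""
    n_frames = features.shape[0]
    return [features[start:start + width]
            for start in range(0, n_frames - width + 1, step)]


def rank_for_contribution(singular_values, threshold):
    """Smallest k whose leading singular values reach `threshold` of the total energy."""
    energy = singular_values ** 2
    total = energy.sum()
    if total == 0.0:
        return 1
    cumulative = np.cumsum(energy) / total
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(singular_values))


def reduce_block(block, threshold=0.9):
    """Keep the leading left (frame-wise) and right (coefficient-wise) singular vectors."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    k = rank_for_contribution(s, threshold)
    return u[:, :k], vt[:k].T


# Synthetic stand-ins for real MFCC matrices (assumed shape: frames x 13 coefficients).
rng = np.random.default_rng(0)
sample_mfcc = rng.standard_normal((100, 13))
pet_phrase_mfcc = rng.standard_normal((20, 13))

# Append the pet-phrase MFCC after the sample MFCC, then block and reduce.
combined = np.vstack([sample_mfcc, pet_phrase_mfcc])
blocks = sliding_blocks(combined, width=30, step=15)  # step < width gives overlap
reduced = [reduce_block(block) for block in blocks]
```

With a step smaller than the width, consecutive blocks share frames, which is one plausible reading of the "information overlapping" blocking. The abstract reports that the window width was tuned experimentally, so `width` here is a placeholder rather than a recommended value.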
References
1 Baker W, Eddington D, Nay L. Dialect identification: The effects of region of origin and amount of experience[J]. American Speech, 2009, 84(1): 48-71
2 Alam M J, Kinnunen T, Kenny P, et al. Multitaper MFCC and PLP features for speaker verification using i-vectors[J]. Speech Communication, 2013, 55(2): 237-251
3 Dehak N, Torres-Carrasquillo P A, Reynolds D A, et al. Language recognition via i-vectors and dimensionality reduction[C]//Proceedings of Conference of the International Speech Communication Association, Florence: International Speech Communication Association, 2011: 857-860
4 Pucher M, Schabus D, Yamagishi J, et al. Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis[J]. Speech Communication, 2010, 52(2): 164-179
5 Zaidan O F, Callison-Burch C. Arabic dialect identification[J]. Computational Linguistics, 2014, 40(1): 171-202
6 Sundermeyer M, Schlüter R, Ney H. LSTM neural networks for language modeling[J]. Interspeech, 2012, 31(43): 601-608
7 Qian Shengyou, Xu Huiyan. Dialect identification based on dynamic time warping and neural network[J]. Computer Engineering and Applications, 2008, 44(10): 211-213 (in Chinese)
8 Zhu Ying, Qian Shengyou, Zhao Xinmin. Dialect identification based on SOM and SVM[J]. Computer Engineering and Applications, 2009, 45(22): 200-201 (in Chinese)
9 Peng Xiangling, Qian Shengyou, Zhao Xinmin. Chinese dialects identification based on mixed characteristic parameters and BP_Adaboost[J]. Computer Engineering and Applications, 2013, 49(3): 152-155 (in Chinese)
10 Jing Yapeng, Zheng Jun, Hu Wenxin. Belongingness of Chinese dialect speech recognition based on deep neural network[J]. Journal of East China Normal University (Natural Science), 2014(1): 60-67 (in Chinese)
11 Cui Ruilian, Song Yan, Jiang Bing, et al. Language identification based on deep neural network[J]. Pattern Recognition and Artificial Intelligence, 2015, 28(12): 1093-1099 (in Chinese)
12 Press W H, Flannery B P, Teukolsky S A, et al. Numerical recipes in C: The art of scientific computing[M]. Cambridge: Cambridge University Press, 1988
13 Schuster M, Paliwal K K. Bidirectional recurrent neural networks[J]. IEEE Transactions on Signal Processing, 1997, 45(11): 2673-2681
14 Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780
15 Greff K, Srivastava R K, Koutník J, et al. LSTM: A search space odyssey[J]. IEEE Transactions on Neural Networks & Learning Systems, 2016, 28(10): 2222-2232
16 Gers F A, Schmidhuber J, Cummins F. Learning to forget: Continual prediction with LSTM[J]. Neural Computation, 2000, 12(10): 2451-2471
17 Sak H, Senior A, Beaufays F. Long short-term memory recurrent neural network architectures for large scale acoustic modeling[C]//15th Annual Conference of the International Speech Communication Association. Singapore: ISCA, 2014: 338-342
18 Goodfellow I, Bengio Y, Courville A. Deep learning[M]. Zhao Shenjian, translated. 3rd ed. Beijing: The People's Posts and Telecommunications Press, 2017: 248-255 (in Chinese)
