Speaker adaptation based on regularized speaker-dependent eigenphone matrix estimation

Authors: Wen-Lin Zhang (1)
Wei-Qiang Zhang (2)
Dan Qu (1)
Bi-Cheng Li (1)

1. Zhengzhou Information Science and Technology Institute ; Zhengzhou ; 450000 ; China
2. Tsinghua National Laboratory for Information Science and Technology ; Department of Electronic Engineering ; Tsinghua University ; Beijing ; 100084 ; China
Keywords: Eigenphones ; Speaker adaptation ; Regularization methods ; Sparse group lasso
Journal: EURASIP Journal on Audio, Speech, and Music Processing
Publication year: 2014
Publication date: December 2014
Year: 2014
Volume: 2014
Issue: 1
Full-text size: 280 KB
Journal subject: Signal, Image and Speech Processing
Publisher: Springer International Publishing
ISSN: 1687-4722
