Speaker adaptation based on regularized speaker-dependent eigenphone matrix estimation
  • Authors: Wen-Lin Zhang (1)
    Wei-Qiang Zhang (2)
    Dan Qu (1)
    Bi-Cheng Li (1)

    1. Zhengzhou Information Science and Technology Institute, Zhengzhou 450000, China
    2. Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
  • Keywords: Eigenphones; Speaker adaptation; Regularization methods; Sparse group lasso
  • Journal: EURASIP Journal on Audio, Speech, and Music Processing
  • Publication year: 2014
  • Publication date: December 2014
  • Volume: 2014
  • Issue: 1
  • Full-text size: 280 KB
  • Journal subject: Signal, Image and Speech Processing
  • Publisher: Springer International Publishing
  • ISSN: 1687-4722
