Research and Practice of Non-stationary Language Modeling Techniques
Abstract
A language model is a mathematical description of natural language: an abstract, formalized system built to explain and exploit the regularities of natural language. Research on language models is foundational work in natural language processing. Its results apply directly to the Chinese Pinyin-to-Character conversion task, and are widely used in speech recognition, handwriting recognition, optical character recognition, machine translation, information retrieval, multi-level corpus annotation, and many other natural language applications.
     Today, with the rapid growth of online information, large volumes of electronic text are easy to obtain, and statistical methods, with their high accuracy and strong robustness, have become the dominant approach to language modeling; the statistical language model is now the mainstream language model. However, statistical language models view natural language purely from a statistical standpoint, as a random sequence of language units, and ignore the regularities and characteristics of the language itself. How to exploit linguistic knowledge within statistical language models is thus one of the open problems in the field. Combining linguistic knowledge directly with statistical language modeling currently faces two difficulties: (1) precise linguistic knowledge is hard to acquire automatically; (2) linguistic knowledge is hard to integrate with existing statistical modeling techniques.
     To address these problems, this thesis proposes to capture syntactic and semantic information indirectly, by studying the positions that language units occupy in natural language sequences and the regularities of those positions. Because language units differ in their syntactic and semantic properties, they serve as different constituents and play different roles in sentences and discourse, so the positions and ranges in which they occur in text exhibit regularities; these regularities are a reflection of the syntax and semantics of natural language. Building on the theory of stochastic processes, the thesis relaxes the time-homogeneity (stationarity) assumption and proposes the non-stationary language modeling hypothesis: the probability of the current language unit depends on its position in the natural language sequence. On this basis, the thesis studies the theory, techniques, methods, and related issues of non-stationary language modeling, and applies them to the Chinese Pinyin-to-Character conversion task to improve the performance of Chinese keyboard input systems. The work consists of the following four parts:
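As a rough illustration of the hypothesis (a sketch in standard n-gram notation, not the thesis's exact formulation): a conventional n-gram model is time-homogeneous, i.e. the conditional distribution of the current word does not depend on where in the sequence it occurs, whereas the non-stationary hypothesis conditions on the position pos(t) as well:

    P(w_t \mid w_1^{t-1}) \approx P(w_t \mid w_{t-n+1}^{t-1})                   % stationary n-gram
    P(w_t \mid w_1^{t-1}) \approx P(w_t \mid w_{t-n+1}^{t-1}, \mathrm{pos}(t))  % non-stationary hypothesis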
     First, as resource preparation for the language modeling research, the thesis proposes an automatic lexicon-generation algorithm oriented to Chinese language modeling. It couples lexicon generation with language modeling in a unified iterative framework, improving the performance of an existing language model by optimizing its lexicon. Within this framework, a lexicon-generation strategy combining statistical features with word-formation features is adopted to improve the quality of the generated lexicon. Finally, two heuristic methods are proposed that let the system adapt automatically to the domain of the training corpus.
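A minimal sketch of such an iterate-segment-and-grow loop follows. The segmentation, candidate generation, and scoring here are simplified stand-ins (greedy maximum matching and raw frequency), not the thesis's actual statistical and word-formation features or its model-driven acceptance criterion.

    # Sketch of an iterative lexicon-construction loop for language modeling.
    # Scoring is a simplified stand-in (raw frequency); a real system would
    # combine statistical features (mutual information, boundary entropy)
    # with word-formation features, and keep a candidate word only if it
    # improves the language model.
    from collections import Counter

    def segment(text, lexicon, max_len=4):
        """Greedy forward maximum-match segmentation with the current lexicon."""
        words, i = [], 0
        while i < len(text):
            for L in range(min(max_len, len(text) - i), 0, -1):
                if L == 1 or text[i:i + L] in lexicon:
                    words.append(text[i:i + L])
                    i += L
                    break
        return words

    def build_lexicon(corpus, iterations=5, top_k=2000):
        lexicon = set()
        for _ in range(iterations):
            seg = [w for line in corpus for w in segment(line, lexicon)]
            # Merge adjacent segments to form new-word candidates.
            cands = Counter(a + b for a, b in zip(seg, seg[1:]) if len(a + b) <= 4)
            lexicon |= {w for w, _ in cands.most_common(top_k)}
        return lexicon

    corpus = ["语言模型是自然语言的数学描述", "统计语言模型成为当前的主流语言模型"]
    print(sorted(build_lexicon(corpus, iterations=2, top_k=20)))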
     Second, the thesis studies the theory and methods of non-stationary language modeling. It first discusses how to represent the non-stationary (positional) properties of language units quantitatively, and on that basis analyzes their statistical regularities. It then combines these regularities with existing language modeling techniques, proposing a non-stationary Ngram model and a non-stationary Maximum Entropy Markov Model, and discusses model construction, training, parameter smoothing, and model complexity. Finally, both models are evaluated on the Pinyin-to-Character conversion and part-of-speech tagging tasks.
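A toy illustration of the non-stationary idea applied to a bigram is sketched below. The coarse three-zone position bucketing (begin/middle/end) and the additive smoothing are assumptions for illustration only; the thesis's quantization of positional information and its smoothing are more elaborate.

    # Toy position-conditioned ("non-stationary") bigram model.
    from collections import defaultdict

    def pos_bucket(i, n):
        """Map position i in a length-n sequence to 0=begin, 1=middle, 2=end."""
        return min(2, 3 * i // max(n, 1))

    counts = defaultdict(int)          # (context, word) -> count
    context_totals = defaultdict(int)  # context -> count

    def train(sentences):
        for s in sentences:
            words = ["<s>"] + s
            for i in range(1, len(words)):
                ctx = (words[i - 1], pos_bucket(i, len(words)))
                counts[(ctx, words[i])] += 1
                context_totals[ctx] += 1

    def prob(word, prev, i, n, alpha=0.5, vocab=10000):
        """Additively smoothed P(word | prev word, position bucket)."""
        ctx = (prev, pos_bucket(i, n))
        return (counts[(ctx, word)] + alpha) / (context_totals[ctx] + alpha * vocab)

    train([["我", "爱", "北京"], ["北京", "欢迎", "你"]])
    print(prob("爱", "我", i=2, n=4))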
     Third, to combat the data sparseness problem in language models, the thesis proposes semantic-based smoothing algorithms. Chinese semantic information is extracted from linguistic resources such as HowNet and TongyiciCilin (a Chinese thesaurus) and combined with back-off and interpolation smoothing respectively, yielding semantic back-off and semantic interpolation algorithms that improve the performance of the smoothed language model. In addition, an iterative parameter-optimization method is designed to tune the parameters of the smoothing algorithms automatically.
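The sketch below shows the interpolation flavor of the idea: an unseen word bigram borrows probability mass from a bigram over semantic classes. The tiny synonym table is a made-up stand-in for HowNet/TongyiciCilin, and the fixed interpolation weight stands in for the iteratively tuned parameters.

    # Sketch of semantic interpolation smoothing: mix the word bigram with
    # a bigram over semantic classes so that unseen word pairs still score.
    SEM_CLASS = {"美丽": "BEAUTY", "漂亮": "BEAUTY", "城市": "PLACE", "都市": "PLACE"}

    def interp_prob(w, prev, word_bigram, class_bigram, unigram, lam=0.7):
        cw, cp = SEM_CLASS.get(w, w), SEM_CLASS.get(prev, prev)
        p_word = word_bigram.get((prev, w), 0.0)
        p_class = class_bigram.get((cp, cw), unigram.get(w, 1e-7))
        return lam * p_word + (1.0 - lam) * p_class

    word_bigram = {("美丽", "城市"): 0.2}
    class_bigram = {("BEAUTY", "PLACE"): 0.15}
    unigram = {"都市": 0.001}
    # The unseen pair ("漂亮", "都市") still receives probability mass
    # through the shared semantic classes (BEAUTY, PLACE).
    print(interp_prob("都市", "漂亮", word_bigram, class_bigram, unigram))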
     Fourth, the thesis applies these language modeling techniques to Chinese keyboard input. For pinyin input methods on mobile devices such as phones, it formulates the Key-to-Pinyin conversion problem, presents two solutions, and verifies them experimentally. It then exploits the pinyin typed by the user to improve Pinyin-to-Character conversion: a class-based Maximum Entropy Markov Model is used to build the conversion system efficiently, drawing on both the user's pinyin input and the constraints between Chinese characters. Experiments show that the pinyin information effectively improves the performance of Chinese Pinyin-to-Character conversion.
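For orientation, the toy Viterbi decoder below illustrates the core of sentence-level Pinyin-to-Character conversion: choosing the character sequence that maximizes a language model score (here a bigram) over the candidates each pinyin syllable admits. The candidate table and probabilities are made-up miniatures; the thesis's class-based MEMM additionally folds pinyin-derived features into the per-step distributions.

    # Toy Viterbi decoder for Pinyin-to-Character conversion.
    import math

    CANDS = {"yi": ["一", "以", "已"], "ge": ["个", "各"], "ren": ["人", "任"]}
    BIGRAM = {("一", "个"): 0.5, ("个", "人"): 0.6}  # toy P(next | prev)

    def bigram_p(prev, cur, floor=1e-4):
        return BIGRAM.get((prev, cur), floor)

    def viterbi(pinyins):
        # paths: best (log-prob, sequence) ending in each candidate character
        paths = {c: (0.0, [c]) for c in CANDS[pinyins[0]]}
        for py in pinyins[1:]:
            paths = {
                c: max(
                    ((lp + math.log(bigram_p(prev, c)), seq + [c])
                     for prev, (lp, seq) in paths.items()),
                    key=lambda t: t[0],
                )
                for c in CANDS[py]
            }
        return max(paths.values(), key=lambda t: t[0])[1]

    print("".join(viterbi(["yi", "ge", "ren"])))  # -> 一个人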