Research on a Semantic Retrieval System Based on Domain Ontology
Abstract
With the rapid development of network technology and the surge of online information resources, retrieving the resources a user needs quickly and accurately has become a key problem in information retrieval. Traditional information retrieval systems rely mainly on keyword matching or subject classification. They often return many irrelevant results, and neither their recall nor their precision satisfies users.
     As ontologies have come into wide use, researchers have tried to exploit the relations among ontology concepts to improve the semantic capability of retrieval systems. An ontology serves as a conceptual modeling tool that describes an information system at the semantic and knowledge levels, and it also offers a well-defined concept hierarchy and support for logical reasoning. Integrating ontology technology into a traditional retrieval system moves retrieval from the keyword level to the knowledge level.
     This thesis studies semantic retrieval based on domain ontology. It first presents the background and significance of ontology-based semantic retrieval systems and the state of research in China and abroad, and outlines the relevant concepts and theory of semantic retrieval and ontology. It then examines in depth the two key techniques that retrieval requires: semantic annotation of documents and query expansion. For semantic annotation, it proposes an improved annotation algorithm based on domain ontology, which draws on two sources of information: the semantic context of the domain ontology knowledge and the structure of the resource document. For query expansion, it processes the user's query mainly through the concept and attribute relations in the domain ontology, so that the query is expanded semantically. Building on this theoretical work, the thesis takes 100 documents on clothing as experimental data and designs and preliminarily implements a semantic retrieval system based on a clothing domain ontology. Test results show that the system achieves some improvement in both recall and precision. Finally, the thesis summarizes the work and discusses directions for future research.
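The abstract does not give the details of the query expansion algorithm, so the following is only a minimal sketch of the general idea: a query term is expanded by following concept relations in a domain ontology, and each expanded term carries a weight that decays with every relation traversed. The toy clothing ontology, the relation names (`synonym`, `subclass`), the weights, and the function name `expand_query` are all assumptions made for illustration and do not come from the thesis.

```python
from collections import deque

# Hypothetical toy clothing ontology (illustrative only, not from the thesis).
# Each concept maps a relation name to the concepts it is related to.
ONTOLOGY = {
    "coat": {"subclass": ["overcoat", "jacket"], "synonym": ["outerwear"]},
    "jacket": {"subclass": ["down jacket"], "synonym": []},
    "overcoat": {"subclass": [], "synonym": []},
    "down jacket": {"subclass": [], "synonym": []},
}

# Assumed per-relation weights: how much relevance survives one hop.
RELATION_WEIGHT = {"synonym": 0.9, "subclass": 0.7}


def expand_query(terms, ontology=ONTOLOGY, max_depth=2, min_weight=0.4):
    """Return a dict mapping each original or expanded term to a weight in (0, 1]."""
    expanded = {}
    queue = deque()
    for term in terms:
        expanded[term] = 1.0          # original query terms keep full weight
        queue.append((term, 1.0, 0))  # (concept, weight, depth)

    # Breadth-first walk over ontology relations, up to max_depth hops.
    while queue:
        concept, weight, depth = queue.popleft()
        if depth == max_depth:
            continue
        for relation, neighbours in ontology.get(concept, {}).items():
            hop_weight = weight * RELATION_WEIGHT.get(relation, 0.0)
            for neighbour in neighbours:
                # Keep a neighbour only if it is relevant enough and this
                # path gives it a higher weight than any path seen so far.
                if hop_weight >= min_weight and hop_weight > expanded.get(neighbour, 0.0):
                    expanded[neighbour] = hop_weight
                    queue.append((neighbour, hop_weight, depth + 1))
    return expanded


if __name__ == "__main__":
    for term, weight in sorted(expand_query(["coat"]).items(), key=lambda kv: -kv[1]):
        print(f"{term}: {weight:.2f}")
```

In a full system the weighted terms would be passed to the underlying index as a boosted query. Terms that are absent from the ontology are simply returned unchanged with weight 1.0, so the expansion degrades to plain keyword search.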
