Personalized Web Information Service Based on Search Engines and Data Mining


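English title: Based on a Search Engine and Data Mining of Personalized Web Information Service
Author: 李明浩 (Li Minghao)
Degree level: Master's
Discipline: Software Engineering
Chinese keywords: 搜索引擎 ; 数据挖掘 ; 个性化 ; Web信息服务
English keywords: Search Engine ; Data Mining ; Personalization ; Web Information Service
Degree year: 2008
Advisors: 赫枫龄 (He Fengling) ; 闫昭 (Yan Zhao)
Discipline code: 081203
Degree-granting institution: Jilin University
Thesis submission date: 2008-10-01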

Abstract

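Since the birth of the Internet and the widespread adoption of search engines, the Web has become an important channel for obtaining information. People have since found that locating the information they need within the explosively growing body of Web information is very difficult, and today's search engines cannot meet this need, so there is an urgent demand for personalized Web information services. Providing personalization requires understanding each user's individual characteristics, and how to learn them has become a new challenge for researchers and Web information service providers.
     Data mining technology offers a breakthrough for this challenge. Building on the basic Web information services that search engines provide, this thesis studies and explores how to combine them with data mining technology to deliver personalized Web information services.
     Finally, a personalized Web information service was implemented using search engine technology, data mining technology, and the ASP, Basic, and Java languages.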
With the advent of the information age, people demand more and broader information. The traditional main sources of information (libraries, newspapers, magazines, and books) are far from sufficient to meet this growing demand. Since the birth of the Internet, people have had a more convenient and efficient way to disseminate and access information. With the arrival of the twenty-first century, the Internet has become an important means of obtaining information.
     Later, people found it very difficult to locate the information they needed on the explosively growing Web. This difficulty offset the convenience the network brought by transcending time and space, and it cost users a great deal of time. A technology and tools to solve this problem were urgently needed, and search engines came into being. Search engines have greatly improved the efficiency with which users reach Web information; finding information is no longer like looking for a needle in a haystack.
     As search engines spread, people came to depend more on the network for the information they need, and those needs became more diversified and personalized. The information services offered by traditional Web search engines are often one-size-fits-all and cannot satisfy increasingly demanding users. To give users the distinct information services they need, personalized Web information services have become a new direction of development. Providing personalization requires understanding each user's individual characteristics, and how to learn them has become a new challenge for researchers and Web information service providers. Data mining technology offers a breakthrough for this challenge. Building on the basic information services that Web search engines provide, this thesis studies and explores how to combine them with data mining technology to provide personalized Web information services.
     Chapter 1 is the introduction, which presents the background of personalized Web information services. As communication among people becomes more frequent and widespread, new demands are placed on accessing and retrieving information. The Internet emerged as an important information source, and search engines arose in response to the problems people encountered when retrieving information over it. Search engines improved the efficiency of searching for information on the Web, but people came to demand more of this search, which led to the emergence of personalized Web information services.
     Chapter 2 covers search engines and introduces the main theoretical background. It first describes the origin and development of search engines, then introduces their types: full-text search engines and directory search engines. It next discusses how a search engine works. A search engine system comprises five parts (information discovery and collection, information processing, information retrieval, the user interface, and the database), each of which is discussed in detail. Finally, it addresses the existing problems and development trends of search engines; one very important trend is personalization, which is the focus of the discussion.
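As a rough illustration of how the processing, database, and retrieval parts fit together, the following minimal Python sketch builds an inverted index and answers an all-terms query. It is not taken from the thesis: the function names (`tokenize`, `build_index`, `search`) and the sample pages are hypothetical, and whitespace tokenization is a simplifying assumption.

```python
from collections import defaultdict


def tokenize(text):
    """Information processing: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(pages):
    """Database: map each term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in tokenize(text):
            index[term].add(page_id)
    return index


def search(index, query):
    """Information retrieval: return ids of pages containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


# Hypothetical stand-in for pages gathered by the collection stage.
PAGES = {
    "p1": "data mining finds patterns in data",
    "p2": "a search engine indexes web pages",
    "p3": "web data mining supports personalized search",
}

INDEX = build_index(PAGES)
print(search(INDEX, "data mining"))
```

A production engine would add crawling, ranking, and persistent storage; the sketch only shows why an index makes retrieval a set intersection rather than a scan of every page.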
     Chapter 3 covers data mining. Achieving a personalized Web information service requires mining users' personalized information, so this chapter discusses the relevant data mining theory. It introduces the background of data mining: data mining emerged before search engines, and it was not originally aimed at mining Web information. It then analyzes the characteristics of data mining: the process is dynamic, the results take the form of knowledge or rules, and the results hold with a certain confidence level. It introduces the classification of data mining; by the kind of result mined, it can be divided into generalization knowledge mining, association knowledge mining, classification knowledge mining, clustering knowledge mining, prediction knowledge mining, and deviation knowledge mining. It details the key data mining techniques, including association analysis, classification analysis, cluster analysis, artificial neural networks, decision trees, error (deviation) analysis, and sequential pattern analysis. Finally, it describes the data mining process in five steps (identifying the target, data preparation, data mining, interpreting the results, and applying and maintaining the knowledge) and sets out the main tasks of each step in detail.
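To make the notion of a rule holding with a certain confidence level concrete, here is a minimal Python sketch of the standard support and confidence measures used in association analysis. The session data and function names are hypothetical and serve only as an illustration; the thesis abstract does not say which measures or algorithms it uses.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical sessions: topics each user viewed in one visit.
SESSIONS = [
    {"sports", "news"},
    {"sports", "news", "music"},
    {"sports", "music"},
    {"news"},
]

print(support(SESSIONS, {"sports", "news"}))
print(confidence(SESSIONS, {"sports"}, {"news"}))
```

In this toy data, two of the three sessions that include "sports" also include "news", so the rule "sports implies news" has a confidence of two thirds.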
     Chapter 4 covers the design of the personalized Web information service. It introduces the logical and physical architecture of the system and its division into modules. The system includes seven subsystems in total: the user interface subsystem, the system maintainer interface subsystem, the query optimization subsystem, the system optimization and maintenance subsystem, the search engine subsystem, the personalized service subsystem, and the database subsystem. The chapter then presents the outline design of these seven subsystems, followed by the detailed design.
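One way a personalized service subsystem could sit on top of a search engine subsystem is to re-rank the engine's results using an interest profile mined from a user's history. The Python sketch below illustrates that idea only; the scoring formula, the field names (`url`, `topic`, `score`), and the sample data are all assumptions and do not describe the thesis's actual design.

```python
def build_profile(clicked_topics):
    """Turn a list of clicked topics into normalized interest weights."""
    counts = {}
    for topic in clicked_topics:
        counts[topic] = counts.get(topic, 0) + 1
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()} if total else {}


def rerank(results, profile):
    """Sort results by base score boosted by the user's interest in the topic."""
    def personalized_score(result):
        boost = profile.get(result["topic"], 0.0)
        return result["score"] * (1.0 + boost)
    return sorted(results, key=personalized_score, reverse=True)


# Hypothetical click history and search results.
PROFILE = build_profile(["sports", "sports", "music", "sports"])
RESULTS = [
    {"url": "a.example/finance", "topic": "finance", "score": 0.90},
    {"url": "b.example/sports", "topic": "sports", "score": 0.80},
    {"url": "c.example/music", "topic": "music", "score": 0.70},
]

for result in rerank(RESULTS, PROFILE):
    print(result["url"])
```

Keeping the profile separate from the search step means the same engine output can be reordered differently for each user, which mirrors the separation of subsystems described above.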
     Chapter 5 presents the implementation of parts of the personalized Web information service, mainly the user interface and the data mining component.
     Chapter 6 is the conclusion, which summarizes the work done in the thesis and outlines further work.
     Personalized Web information services are of great practical significance. Resources on the Internet are very rich today, but if Web services or users cannot use this information efficiently, it loses its value. Personalized Web information services can meet user needs in unprecedented depth, summarize a user's interests fully and accurately, and deliver the required information or Web services efficiently and accurately. They therefore improve the social and economic benefits of Web information and are expected to become the next popular Web application.

