Research on a Theoretical Model of Information Visualization for Special-Topic News Text Sets
Abstract
With the rapid development of network technology, the Internet has become an indispensable part of people's study, daily life and entertainment. Once online, people encounter news events everywhere, yet most news is presented as text that is piled up in front of the user in no particular order, so users cannot quickly pick out what they need from a large volume of unorganized information. Building on a study of information visualization and complex network theory, and taking into account the characteristics of news elements and users' information needs, this thesis proposes a theoretical model of information visualization for special-topic news text sets, and then tests it empirically on Sina's special-topic news text set on post-disaster reconstruction after the Wenchuan earthquake in Sichuan. The thesis completes the following work:
     (1) Drawing on a combined analysis of news elements, information visualization, complex networks, text mining and generalized collaboration networks, together with users' information needs, and taking the information visualization model proposed by Stuart K. Card and the time wall model as its basis, it constructs a theoretical model of information visualization for special-topic news text sets and gives a formal representation of the model's elements.
     (2) It defines what is meant by the granularity of news text keywords and sets out the principles for dividing keywords by granularity; it describes the principles and methods for extracting news text keywords; and, treating each keyword in the news text set as a project, it uses generalized collaboration network theory to define the association rules between keywords.
     (3) It extracts the keywords of Sina's "Wenchuan earthquake post-disaster reconstruction" special-topic news text set, processes them with concept hierarchy theory according to the granularity of each class of keyword, builds a keyword database, and finally constructs the association matrix in Excel according to the keyword association rules.
     (4) From the association matrix, it uses Ucinet to visualize the associations within the Wenchuan earthquake post-disaster reconstruction news text set, and analyzes both the overall picture of the news texts and the cross-associations between the classes of keywords.
     (5) It analyzes how the number of texts in the Wenchuan earthquake post-disaster reconstruction news text set changes over time; it proposes the concept of the "attention degree" of a news event; and it analyzes the month-by-month change in the attention degree of each event keyword and the evolution of attention to the main events.
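The third step above builds an association matrix from keyword association rules, but the abstract does not state the rules themselves. The following is a minimal Python sketch under a simple assumption: two keywords are associated whenever they occur in the same news text, and the weight of the association is the number of texts they share. The function name `build_association_matrix` and the sample data are hypothetical and are not taken from the thesis, which reports building its matrix in Excel.

```python
from itertools import combinations


def build_association_matrix(texts_keywords):
    """Return (sorted keyword list, square co-occurrence matrix).

    texts_keywords: iterable of keyword collections, one per news text.
    Cell [i][j] counts the texts in which keyword i and keyword j both
    occur (ASSUMED association rule); the diagonal counts the texts that
    contain the keyword at all.
    """
    keyword_sets = [set(kws) for kws in texts_keywords]
    keywords = sorted(set().union(*keyword_sets)) if keyword_sets else []
    index = {kw: i for i, kw in enumerate(keywords)}
    n = len(keywords)
    matrix = [[0] * n for _ in range(n)]
    for kws in keyword_sets:
        # Diagonal: number of texts mentioning each keyword.
        for kw in kws:
            matrix[index[kw]][index[kw]] += 1
        # Off-diagonal: one count per unordered keyword pair in this text.
        for a, b in combinations(sorted(kws), 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1
            matrix[j][i] += 1
    return keywords, matrix


# Hypothetical toy data: one keyword list per news text.
sample = [
    ["housing", "Wenchuan", "funding"],
    ["housing", "Wenchuan"],
    ["schools", "funding"],
]
keywords, matrix = build_association_matrix(sample)
for kw, row in zip(keywords, matrix):
    print(kw, row)
```

The result is a symmetric, weighted matrix with one row and one column per keyword, which is the general shape a network analysis tool expects as input. A different association rule, for example one that applies a minimum co-occurrence threshold, would change only the counting step inside the loop.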
