Research on Event Trend Analysis and Prediction Based on the Microblogging Platform
Abstract
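Social networking services (SNS) are computer applications that have risen rapidly in recent years and gradually reached every group of users in society. Microblogging is one of the most important of these applications and has grown quickly over the last few years. High user coverage, user-generated content, and timely dissemination of information have made microblogging platforms an important medium for spreading news. The huge user base and massive volume of content give researchers good data for mining information about groups of users. This thesis uses the massive text resources of a microblogging platform to extract a range of feature data, to compute and analyze event trends (a kind of social content that is hard to quantify in traditional research), and, by modeling the trend on in-sample data, to predict the future trend of an event outside the sample range. Through this work, the thesis aims to show that it is feasible to compute non-deterministic social content that is difficult to describe formally.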
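     This thesis studies several key problems in event trend analysis and prediction on microblogging platforms: the definition and computation of group behavior; regression analysis of event trends on sample data and models for predicting future trends; the identification and acquisition of event-related microblog content; the extraction of user features and post text features on the platform; and the formal description of event trends and the extraction of feature indicators. The main research work and contributions are summarized as follows: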
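     1. A social computing method based on group behavior is proposed. Features of sample users are first extracted and classified to obtain the corresponding indicators and computation methods; the feature values of a large number of users are then combined to obtain the overall features of the user group, so that the group as a whole is quantified directly. The results show that computing group features in this way is feasible.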
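     2. An algorithm for event trend analysis and future trend prediction on the microblogging platform is proposed, and its procedure is given in detail. The values of the indicators related to the event trend are first computed from the in-sample data, and a regression model is built on the sample data through regression analysis. The best-fitting model is then analyzed: the values of the regression function within the unit time span before the prediction point are computed, and the future trend at the prediction point is calculated with a fusion model of difference slopes. Experiments on a real corpus show that the method can support manual decision-making, that the absolute difference from the actual data is small, and that it performs well in experiments on relative values such as the proportion of sentiment.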
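     3. A method for extracting event content is proposed. It combines the MACD algorithm (Moving Average Convergence and Divergence) and the LDA algorithm (Latent Dirichlet Allocation), which are used respectively to capture the content of bursty events and to expand the text related to known events. The MACD algorithm computes the change in word frequency per unit time slice in microblog text and uses the convergence and divergence of the short-period and long-period moving averages to identify content whose discussion volume surges in the platform's text stream, thereby extracting events that are likely to become hot topics. The LDA algorithm is used to compute the event-related "bag of words" and the association weight of each related word within the event. Combinations of several of these words supplement the keyword queries, which expands the set of event-related content extracted. Experiments show that this extraction method is clearly effective.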
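     4. A set of formal definitions for content on the microblogging platform and a simple, efficient method for identifying user features are proposed, together with a definition of event features and a method for building event trend indicators. Non-numerical social content such as user groups and event trends is first quantified: concrete values are computed for indicators covering the platform system, the networks the platform involves, platform users, and users' messages, so that non-quantified social trends are described with computable mathematical models. On this basis, and drawing on analyses of individual and group characteristics from sociology, communication studies, and psychology, a rule set is constructed from the feature values of annotated users in the sample data. The rule set then serves as the filtering criterion: key users and spam users on the platform are distinguished according to the relationships among the values of each test user's key features, which supports the computation and analysis of the research objects well.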
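As a rough illustration of the prediction step in item 2 above, the following Python sketch fits a regression model to in-sample values, evaluates the fitted function over the last few unit intervals before the prediction point, and extrapolates from the differences between those values. The abstract does not specify the regression family or the fusion model, so the polynomial fit, the plain average of difference slopes, and the names `predict_next`, `degree`, and `window` are all assumptions made for illustration, not the thesis's actual method.

```python
import numpy as np


def predict_next(values, degree=2, window=3):
    """Predict the value one time unit after the last sample.

    Assumptions (not from the thesis): a polynomial regression model and
    an unweighted mean of the difference slopes as the "fusion" step.
    """
    y = np.asarray(values, dtype=float)
    x = np.arange(len(y), dtype=float)

    # Least-squares polynomial fit on the in-sample data.
    coeffs = np.polyfit(x, y, degree)

    # Fitted values over the last `window` time units before the prediction point.
    fitted = np.polyval(coeffs, x[-window:])

    # Differences between consecutive fitted values; with a unit time step
    # each difference is a slope. They are combined here by a simple mean.
    slope = np.diff(fitted).mean()

    return float(fitted[-1] + slope)


# Hypothetical per-day values of one trend indicator for a single event.
counts = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(predict_next(counts))
```

Because the extrapolation uses only the recent slope of the fitted curve, the forecast stays local to the prediction point; a weighted combination of slopes would be one possible alternative to the plain mean assumed here.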
Social networking services (SNS) have risen rapidly in recent years and have gradually penetrated user groups all over the world. Microblogging is one of their important applications and has developed rapidly in the last few years. High user coverage, user-generated content, and timely dissemination of information make the microblogging platform a major news medium. The huge number of users on the platform and the mass of content provide an effective corpus for mining information about groups of users. This thesis attempts to extract various features from the massive text resources of the microblogging platform, and to calculate and analyze event trends, which belong to the area of social computing and are difficult to quantify in traditional research. We model the trend on the sample data and predict the future trend outside the sample range. The thesis aims to demonstrate the feasibility of computing social content.
     In this thesis, we discuss several key issues in event trend analysis and prediction on the Weibo platform, including the calculation of group behavior; regression analysis of bursty event trends and modeling of future trends; recognition and acquisition of event-related microblogging content; extraction of user characteristics and post text features on the microblogging platform; and the formal definition of event trends. The main research work and results are summarized as follows:
     1. We present a social computing framework based on group behavior. Within this framework, we first define indexes for user features and then build an overall portrait of the features of a massive number of users, so that groups of users can be quantified. Experimental results show that computing group features is feasible.
     2. We present a framework for event trend analysis and prediction and describe the methods in detail. Based on the sample data, we calculate the value of each trend index and build a sample-based regression model. We then calculate the future trend with the fusion model. Results show that the approach is a good aid to manual decision-making and that the absolute difference between predicted and actual data is small. It also gives good results on relative values such as the proportion of sentiment.
     3. We present an event extraction method. It combines the MACD algorithm (Moving Average Convergence and Divergence) and the LDA algorithm (Latent Dirichlet Allocation), which are used to find the content of emergencies and to expand the words related to known events, respectively. With the MACD algorithm, we calculate the change in term frequency per unit time slice in microblog text and use the convergence and divergence of the short-period and long-period moving average lines to recognize burst content. The LDA algorithm is used to calculate the event-related "bag of words" and the weight of each related word in the event. Experimental results show that it is an effective method for extracting key content.
     4. We present a set of formal definitions of related content on the microblogging platform, including the platform, the users' network, user data, and the data items involved in the platform. We also present a simple but effective feature recognition method to classify users. The formal definitions support the calculation and analysis in this study well.
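The MACD-based burst detection in item 3 above can be sketched as follows. This is a minimal Python illustration, assuming exponential moving averages and a simple crossing rule on the gap between the MACD line and its signal line; the function names (`ema`, `macd_bursts`), the span values, the threshold, and the sample counts are hypothetical and are not taken from the thesis.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = []
    for v in values:
        # The first value seeds the average; later values are blended in.
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


def macd_bursts(counts, short_span=3, long_span=6, signal_span=2, threshold=0.0):
    """Return the time-slice indexes where a term's frequency starts to burst.

    counts: frequency of one term in each consecutive time slice.
    """
    short_ema = ema(counts, short_span)
    long_ema = ema(counts, long_span)

    # MACD line: divergence between the short- and long-period averages.
    macd = [s - l for s, l in zip(short_ema, long_ema)]

    # Signal line: a smoothed version of the MACD line.
    signal = ema(macd, signal_span)
    hist = [m - s for m, s in zip(macd, signal)]

    # A burst starts where the gap rises from at or below the threshold to above it.
    return [t for t in range(1, len(hist)) if hist[t - 1] <= threshold < hist[t]]


# Hypothetical per-slice counts of one term in the microblog stream.
counts = [5, 5, 5, 5, 5, 5, 40, 60, 20, 5, 5, 5]
print(macd_bursts(counts, threshold=1.0))
```

With these sample counts the sketch flags time slice 6, where the count jumps from 5 to 40. A small positive threshold is used in the call so that floating-point noise on a flat series is not reported as a burst.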
    [170] P. D. Turney and M. L. Littman. Measuring praise and criticism: Inference of semanticorientation from association, ACM Transactions on Information Systems (TOIS),21(4),315-346.2003. OAI arXiv.org:cs/0309034
    [171] Marco Baroni and Stefano Vegnaduzzo. Identifying Subjective Adjectives through Web-basedMutual Information. In Proceedings of the7th Konferenz zur Verarbeitung NatürlicherSprache (German Conference on Natural Language Processing–KONVENS’04. Vienna, AU,2004.
    [172]新浪.新浪微博每秒信息量峰值较Twitter高出7000条.参见:http://tech.sina.com.cn/i/2012-01-30/15386667286.shtml
    [173]新浪.新浪美股官方.2012年4月3日.参见: http://weibo.com/1640337222/ycUUpeRL3
    [174] Gerald Appel. Technical Analysis Power Tools for Active Investors. Financial Times PrenticeHall. pp.166. ISBN0131479024.1999.
    [175] John Murphy. Technical Analysis of the Financial Markets. Prentice Hall Press. pp.252–255.ISBN0735200661.1999.
    [176] WikiPedia. http://en.wikipedia.org/wiki/MACD
    [177]新浪2011年第四季度及全年财报.新浪科技.2012年02月28日.参见:http://tech.sina.com.cn/i/2012-02-28/05296776965.shtml
    [178]北京市网络媒体协会BAOM,万瑞数据.缔元信微博媒体特性及用户使用状况研究报告.2010年8月.参见: http://www.wrating.com
    [179]谢丽星,周明,孙茂松.基于层次结构的多策略中文微博情感分析和特征抽取.中文信息学报,第26卷第1期.2012年1月.文章编号:1003-0077201201-0073-11
    [180]谢丽星.基于SVM的中文微博情感分析的研究.硕士学位论文.清华大学,2011.
    [181] Facebook. http://www.facebook.com
    [182] Google Inc. http://www.google.com
    [183]人人网2012.北京千橡网景科技发展有限公司.参见: http://www.renren.com
    [184] QQ圈子(体验版).1998-2012腾讯公司.参见: http://exp.qq.com/details.html#pid=276
    [185] Apple Inc. http://www.apple.com
    [186]李德毅.网络时代人工智能研究与发展.智能系统学报,第4卷第1期,2009年2月
    [187]赵妍妍,秦兵,刘挺,文本分析.软件学报,2010年第08期
    [188]赵妍妍,秦兵,车万翔,刘挺,基于句法路径的情感评价单元识别.软件学报,2011年第05期
    [189] Jakob Nielsen. Participation Inequality: Encouraging More Users to Contribute. October9,2006. http://www.useit.com/alertbox/participation_inequality.html
    [190] Laurence Brothers, Jim Hollan, Jakob Nielsen, Scott Stornetta, Steve Abney, George Furnas,and Michael Littman."Supporting informal communication via ephemeral interest groups,"Proceedings of CSCW92, the ACM Conference on Computer-Supported Cooperative Work.Toronto, Ontario, November1-4,1992, pp.84-90.
    [191] William C. Hill, James D. Hollan, Dave Wroblewski, and Tim McCandless."Edit wear andread wear," Proceedings of CHI'92, the SIGCHI Conference on Human Factors in ComputingSystems. Monterey, CA, May3-7,1992, pp.3-9.
    [192] Steve Whittaker, Loren Terveen, Will Hill, and Lynn Cherny."The dynamics of massinteraction," Proceedings of CSCW98, the ACM Conference on Computer-SupportedCooperative Work. Seattle, WA, November14-18,1998, pp.257-264.
