Method of Computing Keywords from Online Reviews
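  • English title: Method of Computing Keywords from Online Reviews
  • Authors: 张世博 (ZHANG Shibo); 魏战红 (WEI Zhanhong)
  • English authors: ZHANG Shibo; WEI Zhanhong; Computer Department, Beijing Institute of Petrochemical Technology
  • Keywords: online review information; topic algorithm; penalty regulation
  • English keywords: online reviews; topic algorithm; penalty regulation
  • Chinese journal name: JSSG
  • English journal name: Computer & Digital Engineering
  • Institution: Computer Department, Beijing Institute of Petrochemical Technology
  • Publication date: 2018-08-20
  • Publisher: 计算机与数字工程 (Computer & Digital Engineering)
  • Year: 2018
  • Issue: v.46; No.346
  • Funding: National Natural Science Foundation of China (No. 61702040); Beijing Municipal Education Commission Science and Technology Program (No. KM201810017005)
  • Language: Chinese
  • Pages: JSSG201808023
  • Page count: 5
  • CN: 08
  • ISSN: 42-1372/TP
  • Classification number: 112-116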
Abstract
Consumers increasingly rely on reviews before buying online, so extracting the valuable information from those reviews quickly has drawn considerable attention. Based on features common to online reviews, this paper designs a topic algorithm that uses sentence-level (syntactic) information and gives a detailed inference of the topic distribution. For computing topic keywords, it proposes a word relevance measure that penalizes a word's frequency by a factor capturing how much the word is shared across topics, so that the selected keywords reflect the content of their own topic more precisely. Compared with the standard topic model (LDA), experimental results show that the proposed sentence-based model represents the topic structure more clearly and helps readers find the key information in large volumes of reviews.
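The penalized keyword selection described in the abstract can be illustrated with a small sketch. The record does not give the paper's actual formula, so the score below is an assumption: a common TF-IDF-style choice that multiplies a word's in-topic probability by the log-ratio of that probability to its geometric mean across all topics. The function names, the smoothing constant `eps`, and the toy topic-word probabilities are all hypothetical.

```python
import math

def penalized_relevance(topic_word_probs, eps=1e-12):
    """Score each word per topic, penalizing words shared across topics.

    topic_word_probs: list of dicts, one per topic, mapping word -> p(word | topic).
    Assumed score (not taken from the paper):
        score_k(w) = p_k(w) * (log p_k(w) - mean_j log p_j(w))
    eps is an assumed smoothing value for words absent from a topic.
    """
    num_topics = len(topic_word_probs)
    vocab = set().union(*topic_word_probs)
    scores = []
    for probs in topic_word_probs:
        topic_scores = {}
        for word in vocab:
            p = probs.get(word, eps)
            # Mean log-probability across topics = log of the geometric mean.
            mean_log = sum(
                math.log(t.get(word, eps)) for t in topic_word_probs
            ) / num_topics
            topic_scores[word] = p * (math.log(p) - mean_log)
        scores.append(topic_scores)
    return scores

def top_keywords(topic_word_probs, n=3):
    """Return the n highest-scoring words for each topic."""
    scores = penalized_relevance(topic_word_probs)
    return [sorted(s, key=s.get, reverse=True)[:n] for s in scores]

# Toy data (hypothetical): "good" is the most frequent word in both topics.
topics = [
    {"good": 0.40, "screen": 0.35, "battery": 0.25},
    {"good": 0.40, "delivery": 0.35, "package": 0.25},
]
keywords = top_keywords(topics, n=2)
print(keywords)
```

In this toy case "good" has the highest raw probability in both topics, but because it is equally likely in each, its score is zero and it drops out of the top two. Topic-specific words such as "screen" and "delivery" rank first instead.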
