A News Recommendation Method Based on VSM and Bisecting K-means Clustering
  • English title: A News Recommendation Method Based on VSM and Bisecting K-means Clustering
  • Authors: YUAN Ren-jin (袁仁进); CHEN Gang (陈刚); LI Feng (李锋); WEI Shuang-jian (魏双建)
  • Affiliation: Institute of Geospatial Information, Information Engineering University
  • Keywords: personalized recommendation; vector space model; Bisecting K-means clustering algorithm; user interest model
  • Journal code: BJYD
  • Journal: Journal of Beijing University of Posts and Telecommunications (北京邮电大学学报)
  • Publication date: 2019-03-21 10:42
  • Publisher: Journal of Beijing University of Posts and Telecommunications
  • Year: 2019
  • Volume: 42
  • Issue: 01
  • Pages: 118-123
  • Page count: 6
  • Article ID: BJYD201901018
  • CN: 11-3570/TN
  • Funding: National Natural Science Foundation of China (41301428)
  • Language: Chinese
Abstract
Massive volumes of news data leave users facing information overload, and personalized recommendation is one way to address it. To improve the personalized experience of reading news, a news recommendation method that combines the vector space model (VSM) with Bisecting K-means clustering is proposed. First, the news texts are vectorized: the vector space model and the TF-IDF algorithm are used to construct news feature vectors. Second, the Bisecting K-means clustering algorithm is applied to cluster the set of news feature vectors. Third, the clustered news set is divided into a training set and a test set, and a user interest model with a three-level "user - news category - news" hierarchy is built from the training set. Finally, cosine similarity is used to compute the news recommendation results, which are compared against the test set. The baselines are user-based collaborative filtering, item-based collaborative filtering, and a recommendation method that combines the vector space model with standard K-means clustering. The experimental results show that the proposed method is feasible and that it improves on the baselines in precision, recall, and F-value.
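The pipeline named in the abstract (TF-IDF vectors, Bisecting K-means, a per-category user profile, cosine ranking, and precision/recall/F evaluation) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the record does not give the paper's formulas, so the unsmoothed IDF, the "split the largest cluster" rule, the centroid-based user profile, the F1 form of the F-value, and every function name and toy document below are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts) for tokenized docs; plain log(N/df) IDF is assumed."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: (c / len(doc)) * math.log(n / df[t]) for t, c in Counter(doc).items()}
            for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def split_in_two(members, vectors, iters=10):
    """2-means on one cluster, seeded with its first member and the member least similar to it."""
    first = members[0]
    second = min(members, key=lambda i: cosine(vectors[first], vectors[i]))
    centers = [vectors[first], vectors[second]]
    halves = [list(members), []]
    for _ in range(iters):
        halves = [[], []]
        for i in members:
            sims = [cosine(vectors[i], c) for c in centers]
            halves[0 if sims[0] >= sims[1] else 1].append(i)
        if not halves[0] or not halves[1]:
            break
        centers = [centroid([vectors[i] for i in h]) for h in halves]
    return halves


def bisecting_kmeans(vectors, k):
    """Repeatedly bisect the largest cluster until there are k clusters (assumed split rule)."""
    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        clusters.sort(key=len)
        largest = clusters.pop()
        left, right = split_in_two(largest, vectors)
        if not left or not right:      # cluster could not be split further
            clusters.append(largest)
            break
        clusters += [left, right]
    return clusters


def recommend(read, clusters, vectors, top_n=3):
    """User -> category -> news: rank unread news by cosine to the user's profile in each category."""
    read = set(read)
    scored = []
    for cluster in clusters:
        liked = [vectors[i] for i in cluster if i in read]
        if not liked:                  # the user shows no interest in this category
            continue
        profile = centroid(liked)
        scored += [(cosine(profile, vectors[i]), i) for i in cluster if i not in read]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))[:top_n]]


def precision_recall_f(recommended, relevant):
    """Precision, recall, and F1 of a recommendation list against held-out items."""
    hits = len(set(recommended) & set(relevant))
    p = hits / len(recommended) if recommended else 0.0
    r = hits / len(relevant) if relevant else 0.0
    return p, r, (2 * p * r / (p + r) if p + r else 0.0)


# Toy corpus (hypothetical): three sports items followed by three finance items.
docs = [
    ["football", "match", "goal", "team"],
    ["football", "team", "goal", "coach"],
    ["league", "striker", "match", "transfer"],
    ["stock", "market", "shares", "bank"],
    ["bank", "market", "rates", "inflation"],
    ["stock", "shares", "inflation", "rates"],
]
vectors = tfidf_vectors(docs)
clusters = bisecting_kmeans(vectors, 2)
picks = recommend([0], clusters, vectors)   # the user has read item 0 only
print(clusters, picks, precision_recall_f(picks, [1]))
```

On this toy corpus the two clusters separate the sports items from the finance items, and a user who has read only item 0 is offered the other sports items, with the one sharing more terms ranked first. A real system would add Chinese word segmentation, stop-word removal, and a held-out test split, none of which are shown here.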