An End-to-End Method of Entity Description Generation with Multi-hop Facts on Knowledge Bases
  • Authors: MENG Qingsong; ZHANG Xiang; HE Shizhu; LIU Kang; ZHAO Jun
  • Affiliations: School of Automation, Harbin University of Science and Technology; State Key Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
  • Keywords: knowledge graph; entity description; data-to-text generation
  • Chinese journal title: MESS
  • English journal title: Journal of Chinese Information Processing
  • Publication date: 2019-05-15
  • Publisher: Journal of Chinese Information Processing (中文信息学报)
  • Year: 2019
  • Issue: v.33
  • Funding: National Natural Science Foundation of China (61533018, 61702512); National Key R&D Program of China (2017YFB1002101)
  • Language: Chinese
  • Page: MESS201905008
  • Page count: 9
  • CN: 05
  • ISSN: 11-2325/N
  • Classification number: 71-79
Abstract
Automatic generation of entity descriptions helps increase the practical value of knowledge graphs, and fluency is one of the key quality indicators for description text. This paper proposes generating entity descriptions from multi-hop facts in a knowledge base, so that the output is closer to the writing style of human-authored descriptions and reads more fluently. Using the encoder-decoder framework, the paper proposes an end-to-end neural network model that encodes multi-hop facts and uses an attention mechanism in the decoder to represent them. Experiments show that, compared with the baseline model, introducing multi-hop facts improves the automatic metrics BLEU-2 and ROUGE-L by about 8.9 and 7.3 percentage points, respectively.
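As a rough illustration of what "multi-hop facts" could look like as model input, the sketch below gathers the triples reachable from an entity within a fixed number of hops. It is not the authors' implementation: the toy triples, the function name `collect_multi_hop_facts`, and the reading of "hop" as path distance from the described entity are all assumptions made for this example.

```python
from collections import defaultdict

# Toy knowledge base of (subject, relation, object) triples.
# These entities and relations are invented for illustration only.
TRIPLES = [
    ("Bach", "occupation", "composer"),
    ("Bach", "place_of_birth", "Eisenach"),
    ("Eisenach", "country", "Germany"),
    ("Germany", "continent", "Europe"),
]


def collect_multi_hop_facts(triples, entity, max_hops=2):
    """Return (hop, subject, relation, object) facts within max_hops of entity.

    Hypothetical helper: a hop-1 fact has the entity itself as subject,
    a hop-2 fact has the object of a hop-1 fact as subject, and so on.
    """
    by_subject = defaultdict(list)
    for s, r, o in triples:
        by_subject[s].append((r, o))

    facts = []
    frontier = [entity]
    visited = {entity}
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for s in frontier:
            # .get avoids creating empty entries for leaf nodes
            for r, o in by_subject.get(s, []):
                facts.append((hop, s, r, o))
                if o not in visited:
                    visited.add(o)
                    next_frontier.append(o)
        frontier = next_frontier
    return facts


facts = collect_multi_hop_facts(TRIPLES, "Bach", max_hops=2)
for fact in facts:
    print(fact)
```

With `max_hops=1` only the facts whose subject is the entity itself are returned. Raising the limit adds facts about neighbouring entities, which is the kind of extra context the abstract credits with making generated descriptions more fluent.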
References
[1]Su Y,Yang S,Sun H,et al.Exploiting relevance feedback in knowledge graph search[C]//Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York,NY,USA:ACM,2015:1135-1144.
    [2]Zhang F,Yuan N J,Lian D,et al.Collaborative knowledge base embedding for recommender systems[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York,NY,USA:ACM,2016:353-362.
    [3]Yih W-t,Chang M-W,He X,et al.Semantic parsing via staged query graph generation:Question answering with knowledge base[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing:Association for Computational Linguistics,2015:1321-1331.
    [4]Vrandečić D,Krötzsch M.Wikidata:A free collaborative knowledgebase[J].Commun ACM,2014,57(10):78-85.
    [5]Voskarides N,Meij E,Tsagkias M,et al.Learning to explain entity relationships in knowledge graphs[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics.Beijing,2015:564-574.
    [6]Althoff T,Dong X L,Murphy K,et al.TimeMachine:Timeline generation for knowledge-base entities[C]//Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2015:19-28.
    [7]Haug T,Ganea O-E,Grnarova P.Neural multi-step reasoning for question answering on semi-structured tables[C]//Proceedings of the Advances in Information Retrieval.Cham:Springer International Publishing,2018:611-617.
    [8]Over P,Dang H,Harman D.DUC in context[J].Information Processing and Management,2007,43(6):1506-1520.
    [9]Goldberg E,Driedger N,Kittredge R I.Using natural-language processing to produce weather forecasts[J].IEEE Expert,1994,9(2):45-53.
    [10]Buchanan B G,Moore J D,Forsythe D E,et al.An intelligent interactive system for delivering individualized information to patients[J].Artificial Intelligence in Medicine,1995,7(2):117-154.
    [11]Iordanskaja L,Kim M,Kittredge R,et al.Generation of extended bilingual statistical reports[C]//Proceedings of the 14th Conference on Computational Linguistics.Stroudsburg,PA,USA:Association for Computational Linguistics,1992:1019-1023.
    [12]Reiter E,Dale R.Building applied natural language generation systems[J].Natural Language Engineering,1997,3(01):57-87.
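    [13]Zhang Xiang.Entity description generation and application based on large-scale knowledge bases[D].Harbin:Master's thesis,Harbin University of Science and Technology,2018.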
    [14]Angeli G,Liang P,Klein D.A simple domain-independent probabilistic approach to generation[C]//Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing.Stroudsburg,PA,USA:Association for Computational Linguistics,2010:502-512.
    [15]Duma D,Klein E.Generating natural language from linked data:Unsupervised template extraction[C]//Proceedings of the 10th International Conference on Computational Semantics(IWCS 2013):Association for Computational Linguistics,2013:83-94.
    [16]Saldanha G,Biran O,McKeown K,et al.An entity-focused approach to generate company descriptions[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics(Vol 2),2016:243-248.
    [17]Konstas I,Lapata M.Concept-to-text generation via discriminative reranking[C]//Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics:Long Papers-Volume 1:Association for Computational Linguistics,2012:369-378.
    [18]Gyawali B,Gardent C.Surface realisation from knowledge-bases[C]//Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers),2014,1:424-434.
    [19]Mei H,Bansal M,Walter M R.What to talk about and how?Selective generation using LSTMs with coarse-to-fine alignment[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies,2016:720-730.
    [20]Lebret R,Grangier D,Auli M.Neural text generation from structured data with application to the biography domain[C]//Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing,2016:1203-1213.
    [21]Gu J,Lu Z,Li H,et al.Incorporating copying mechanism in sequence-to-sequence learning[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers),2016,1:1631-1640.
    [22]Chisholm A,Radford W,Hachey B.Learning to generate one-sentence biographies from Wikidata[C]//Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics:Volume 1,Long Papers:Association for Computational Linguistics,2017:633-642.
    [23]Kingma D P,Ba J.Adam:A method for stochastic optimization[C]//Proceedings of ICLR,2015.
    [24]Papineni K,Roukos S,Ward T,et al.BLEU:A method for automatic evaluation of machine translation[C]//Proceedings of the 40th Annual Meeting on Association for Computational Linguistics.Stroudsburg,PA,USA:Association for Computational Linguistics,2002:311-318.
    [25]Lin C Y.ROUGE:A package for automatic evaluation of summaries[C]//Proceedings of Text Summarization Branches Out,2004.
    [26]Novikova J,Dušek O,Curry A C,et al.Why we need new evaluation metrics for NLG[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing,2017:2241-2252.
    (1)http://linkeddata.org/
    (2)https://www.wikidata.org/wiki/Q1339
