Research on Models and Methods for Constructing a Sci-Tech Big Data Knowledge Graph
  • English title: Building Knowledge Graph with Sci-Tech Big Data
  • Authors: Wang Ying (王颖); Qian Li (钱力); Xie Jing (谢靖); Chang Zhijun (常志军); Kong Beibei (孔贝贝)
  • Affiliations: National Science Library, Chinese Academy of Sciences (中国科学院文献情报中心); Department of Library, Information and Archives Management, University of Chinese Academy of Sciences (中国科学院大学图书情报与档案管理系)
  • Keywords: Sci-Tech Big Data; Knowledge Graph; Ontology; Knowledge Extraction
  • Journal: Data Analysis and Knowledge Discovery (数据分析与知识发现)
  • Publication date: 2019-01-25
  • Publisher: Data Analysis and Knowledge Discovery
  • Year: 2019
  • Issue: 01
  • Funding: One of the research outcomes of the National Social Science Fund of China Youth Project "Research on Deep Mining Methods for Academic Resources Based on Linked Data" (Grant No. 15CTQ006)
  • Language: Chinese
  • Pages: 19-30
  • Page count: 12
  • CN: 10-1478/G2
  • ISSN: 2096-3467
  • Classification code: G353.1
Abstract
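[Objective] To study models and methods for extracting structured knowledge from sci-tech big data and building an academic knowledge network, in support of developing intelligent knowledge service products and improving precise knowledge discovery. [Methods] We propose a construction model and technical architecture for a sci-tech big data knowledge graph. Building on the aggregation and fusion of sci-tech big data knowledge resources, and supported by the distributed storage and high-performance computing of a big data platform, we design and implement in detail the following construction techniques: knowledge extraction for research entities, entity alignment and relation discovery, knowledge fusion and semantic enrichment, semantic storage, and quality management. [Results] We built a sci-tech big data knowledge graph with 300 million entities and 1.1 billion relations, which effectively supports the services of the sci-tech big data knowledge discovery platform and the "慧科研" smart portable assistant. [Limitations] Owing to the scale and complexity of the data, quality management of the knowledge graph still requires considerable manual effort, and the accuracy of entity alignment needs to be improved. [Conclusions] The proposed construction scheme is suitable for the knowledge management and deep processing of sci-tech big data and helps make effective use of scientific and technical knowledge.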
        [Objective] This paper tries to extract information from Sci-Tech big data and build an academic knowledge network, aiming to develop smart knowledge services. [Methods] We proposed an ontology schema and a framework to construct a knowledge graph based on the distributed storage and high-performance computing of a big data platform. The proposed model helped us extract and align research entities for relationship discovery. We also adopted knowledge merging and enrichment, semantic storage, and quality management techniques. [Results] We created a knowledge graph of more than 300 million entities and 1.1 billion relations. It supports a knowledge discovery platform and a smart personal research assistant app for scientific big data. [Limitations] More research is needed to improve the quality management of the knowledge graph, as well as the precision of entity alignment. [Conclusions] The proposed method improves the knowledge management of science and technology big data.
