Analysis of Automatic Indexing Techniques for Geological Report Texts
Abstract
With the progress of geological work in recent years, China has accumulated a large volume of geological data, including a large number of geological report texts. Because these report texts are redundant in content and very long, it has become much harder to obtain their index terms quickly and accurately. Taking geological exploration reports on solid mineral resources as an example, this paper analyzes and summarizes the vocabulary, sentence patterns, and structure characteristic of such reports. Based on these characteristics, it analyzes and selects automatic indexing methods suited to geological report texts and proposes a preliminary design scheme for automatically indexing them.
