A Pipeline Approach to Free-Description Question Answering in Chinese Gaokao Reading Comprehension
  • English title: A Pipeline Approach to Free-Description Question Answering in Chinese Gaokao Reading Comprehension
  • Authors: TAN Hongye; ZHAO Honghong; LI Ru; LIU Bei
  • English keywords: Reading comprehension (RC); Free-description question; Answer sentence extraction (ASE); Answer sentence fusion
  • Journal title (Chinese): EDZX
  • Journal title (English): Chinese Journal of Electronics (电子学报, English edition)
  • Affiliations: School of Computer and Information Technology of Shanxi University; Key Laboratory of Ministry of Education for Computation Intelligence and Chinese Information Processing of Shanxi University
  • Publication date: 2019-01-15
  • Publisher: Chinese Journal of Electronics
  • Year: 2019
  • Issue: v.28
  • Funding: Supported by the National Natural Science Foundation of China (No.61673248 and No.61772324); the Project of Postgraduate Joint Cultivation Base of Shanxi Province (No.2018JD02)
  • Language: English
  • Pages: EDZX201901016
  • Page count: 7
  • CN: 01
  • ISSN: 10-1284/TN
  • Classification number: 117-123
Abstract
This study attempted to answer complicated free-description questions in Chinese Gaokao reading comprehension (RC) tasks. We found that quite a few questions can be answered by extracting sentences from the document and combining them, so we used a pipeline approach with two components: Answer sentence extraction (ASE) and Answer sentence fusion (ASF). Semantic vector similarity and topical distribution similarity were explored for ASE. An integer linear programming strategy was used for ASF, combining dependencies with the language model on the basis of word importance. As a first step towards this new challenge, we obtained some encouraging results on actual exam questions from the RC tasks of the Chinese subject in the Beijing Gaokao, which gave us insights into the techniques needed to solve real-world complex questions.
References
    [1] Matthew Richardson, Christopher J.C. Burges, et al., “MCTest: A challenge dataset for the open domain machine comprehension of text”, Proc. of EMNLP 2013, pp.193-203, 2013.
    [2] Karl Moritz Hermann, Tomáš Kočiský, et al., “Teaching machines to read and comprehend”, Proc. of NIPS 2015, 2015.
    [3] Akira Fujita, Akihiro Kameda, et al., “Overview of Todai Robot Project and evaluation framework of its NLP-based problem solving”, Proc. of LREC 2014, pp.2590-2597, 2014.
    [4] G. Cheng, W.X. Zhu, et al., “Taking up the Gaokao challenge: An information retrieval approach”, Proc. of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), pp.2479-2485, 2016.
    [5] D.M. Endres and J.E. Schindelin, “A new metric for probability distributions”, IEEE Transactions on Information Theory, Vol.49, No.7, pp.1858-1860, 2003.
    [6] M.H. Zhang, H.L. Wang and G.D. Zhou, “An automatic summarization approach based on LDA topic feature”, Computer Applications and Software, pp.20-22, 2011.
    [7] Katja Filippova and Michael Strube, “Sentence fusion via dependency graph compression”, Proc. of the 2008 Conference on Empirical Methods in Natural Language Processing, pp.177-185, 2008.
    [8] Y. Zhou, D.J. Zhu, et al., “Sentence similarity based on HowNet”, Bulletin of Advanced Technology Research, Vol.4, No.8, pp.32-37, 2010. (in Chinese)
    [9] D.M. Blei, J.D. Lafferty, et al., “A correlated topic model of Science”, Annals of Applied Statistics, pp.17-35, 2007.
    [10] X.H. Yan, et al., “A biterm topic model for short texts”, Proc. of WWW 2013, pp.1445-1456, 2013.
    [11] Katja Filippova, Enrique Alfonseca, Carlos A. Colmenares, et al., “Sentence compression by deletion with LSTMs”, Proc. of the 2015 Conference on Empirical Methods in Natural Language Processing, pp.360-368, 2015.
    [12] Regina Barzilay and Kathleen R. McKeown, “Sentence fusion for multidocument news summarization”, Computational Linguistics, Vol.31, No.3, pp.297-328, 2005.
    [13] L.D. Bing, P.J. Li, Y. Liao and W. Lam, “Abstractive multidocument summarization via phrase selection and merging”, Proc. of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp.1587-1597, 2015.
    [14] S.C. Yang, X.Y. Dai and J.J. Chen, “Advances in question classification for open-domain question answering”, Acta Electronica Sinica, Vol.43, No.8, pp.1627-1636, 2015. (in Chinese)
    [15] H. Wang, M. Bansal, K. Gimpel, et al., “Machine comprehension with syntax, frames, and semantics”, Proc. of ACL 2015, pp.700-706, 2015.
    [16] Mrinmaya Sachan, Avinava Dubey, Eric P. Xing, et al., “Learning answer-entailing structures for machine comprehension”, Proc. of ACL 2015, pp.239-249, 2015.
    [17] Karthik Narasimhan and Regina Barzilay, “Machine comprehension with discourse relations”, Proc. of ACL 2015, pp.1253-1262, 2015.
    [18] W.P. Yin, S. Ebert and H. Schütze, “Attention-based convolutional neural network for machine comprehension”, Proc. of NAACL 2016, pp.15-21, 2016.
    [19] A. Trischler, Z. Ye, X.D. Yuan, “Natural language comprehension with the EpiReader”, Proc. of EMNLP 2016, pp.128-137, 2016.
    [20] Y.M. Cui, Z.P. Chen, et al., “Attention-over-attention neural networks for reading comprehension”, ACL 2017, arXiv:1606.01603, 2016.
    [21] Álvaro Rodrigo, et al., “Overview of CLEF QA entrance exams task 2015”, Working Notes of CLEF 2015, 2015.
    [22] Dominique Laurent, Baptiste Chardon, Sophie Nègre, et al., “Reading comprehension at Entrance Exams”, Working Notes of CLEF 2015, 2015.
    [23] Hideyuki Shibuki, Kotaro Sakamoto, Madoka Ishioroshi, et al., “Overview of the NTCIR-12 QA Lab-2 task”, Proc. of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016.
    [24] Takuma Takada, Takuya Imagawa, et al., “SML question-answering system for world history essay and multiple-choice exams at NTCIR-12 QA Lab-2”, Proc. of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016.
    [25] Kotaro Sakamoto, Madoka Ishioroshi, et al., “Forst: Question answering system for second-stage examinations at NTCIR-12 QA Lab-2 Task”, Proc. of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016.
    (2) The Chinese (native language) subject consists of 4 tasks: Essays, Classical Chinese, RC of science and technology documents, and RC of literature documents.
    (3) The parse tree is obtained with the Stanford Parser: http://nlp.stanford.edu/software/lex-parser.shtml
    (4)https://sourceforge.net/projects/lpsolve/
    (5)http://www.ltp-cloud.com/
    (6)http://Code.google.com/p/word2vec/
    (7)http://www.keenage.com/
    (8) The types of free-description questions in the RC tasks of the Chinese subject in the Beijing Gaokao are: interpretation of complex sentences, rewriting or summarization of specific details in the background document, comprehension of the main idea, inferences about the author’s intention and sentiment, language appreciation, and so on.
