SumPubMed: Summarization Dataset of PubMed Scientific Articles

2021

https://doi.org/10.18653/V1/2021.ACL-SRW.30

Abstract

Most earlier work on text summarization has been carried out on news article datasets, in which the summary content is naturally concentrated at the beginning of the text. A model can therefore exploit this positional correlation to generate summaries instead of truly learning to summarize. To address this issue, we constructed a new dataset, SUMPUBMED, using scientific articles from the PubMed archive. We conducted a human analysis of summary coverage, redundancy, readability, coherence, and informativeness on SUMPUBMED. SUMPUBMED is challenging because (a) the summary content is distributed throughout the text rather than localized at the top, and (b) it contains rare domain-specific scientific terms. We observe that seq2seq models that adequately summarize news articles struggle on SUMPUBMED. SUMPUBMED thus opens new avenues both for the improvement of summarization models and for the development of new evaluation metrics.
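
To make the localization problem concrete, the following sketch estimates where each reference-summary sentence's content sits in the source document. It is not from the paper; the naive sentence splitter and token-overlap similarity are assumptions chosen for illustration. On news-style data the resulting relative positions cluster near 0.0; on SUMPUBMED-style data they spread across the whole document.

```python
# Sketch: estimate where summary content lives in a source document.
# Naive sentence splitting and Jaccard token overlap; illustration only.
import re

def sentences(text):
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def token_overlap(a, b):
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / (len(ta | tb) or 1)

def summary_positions(document, summary):
    """Relative position in [0, 1] of the source sentence that best
    matches each summary sentence."""
    doc_sents = sentences(document)
    if not doc_sents:
        return []
    positions = []
    for summ_sent in sentences(summary):
        best = max(range(len(doc_sents)),
                   key=lambda i: token_overlap(summ_sent, doc_sents[i]))
        positions.append(best / max(len(doc_sents) - 1, 1))
    return positions
```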

FAQs

What are the main characteristics of the SUMPUBMED dataset?

SUMPUBMED consists of 33,772 biomedical articles, averaging 4,000 words each, drawn from diverse medical literature. Its reference summaries are not localized at the top of the document, which distinguishes it from typical short news-article datasets.
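
As a starting point for working with such document/summary pairs, a minimal loading sketch is shown below. The directory names, the one-file-per-document layout, and the load_pairs helper are hypothetical, chosen for illustration rather than taken from the dataset's documented release format.

```python
# Sketch: yield (article, summary) text pairs from a local copy of the
# dataset. The "text"/"abstract" directory names and matching file names
# are assumptions, not the documented layout; adjust to the actual release.
from pathlib import Path

def load_pairs(root, text_dir="text", summary_dir="abstract"):
    root = Path(root)
    for text_file in sorted((root / text_dir).glob("*.txt")):
        summary_file = root / summary_dir / text_file.name
        if summary_file.exists():
            yield text_file.read_text(), summary_file.read_text()
```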

How does SUMPUBMED evaluate summary quality compared to previous datasets?

The study finds that ROUGE scores correlate poorly with human assessments on SUMPUBMED, in contrast to previous datasets. This indicates a need for new metrics tailored to scientific summarization.
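
A minimal sketch of such a correlation check, using the rouge-score package and a Pearson correlation; the texts and human ratings below are placeholders to be replaced with real system outputs and annotator scores.

```python
# Sketch: correlate ROUGE-L F1 with human ratings over a set of summaries.
# Requires `pip install rouge-score scipy`; all data below are placeholders.
from rouge_score import rouge_scorer
from scipy.stats import pearsonr

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def rouge_l_f1(reference, candidate):
    return scorer.score(reference, candidate)["rougeL"].fmeasure

references = ["placeholder gold summary one", "gold summary two", "gold three"]
candidates = ["placeholder system output one", "system two output", "output three"]
human_ratings = [2.5, 4.0, 3.0]  # e.g., mean informativeness on a 1-5 scale

rouge_f1 = [rouge_l_f1(ref, cand) for ref, cand in zip(references, candidates)]
r, p = pearsonr(rouge_f1, human_ratings)
print(f"Pearson r = {r:.3f} (p = {p:.3f})")
```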

What preprocessing techniques were implemented in the creation of SUMPUBMED?

The dataset underwent extensive preprocessing: non-textual elements such as figures and citations were removed, yielding succinct but informative text. This level of preprocessing is emphasized as a key differentiator from other datasets.
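
The actual pipeline is more involved than shown here, but a sketch in this spirit, with illustrative (assumed) regular expressions for numeric citation markers and figure/table references, looks like:

```python
# Sketch of this style of cleanup: strip citation markers and figure/table
# references from raw article text. The regexes are illustrative assumptions,
# not the paper's exact preprocessing rules.
import re

CITATION = re.compile(r"\[\s*\d+(?:\s*[,-]\s*\d+)*\s*\]")  # [3], [1,2], [4-6]
FIG_TABLE = re.compile(r"\((?:see\s+)?(?:fig(?:ure)?|table)\.?\s*\w+\)",
                       re.IGNORECASE)                      # (Figure 2), (see table 1)

def clean_article(text):
    text = CITATION.sub("", text)
    text = FIG_TABLE.sub("", text)
    return re.sub(r"\s{2,}", " ", text).strip()
```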

Which summarization models were evaluated on the SUMPUBMED dataset?

The research assesses multiple models, including extractive, abstractive (seq2seq with attention), and hybrid methods. Each method's performance was evaluated with ROUGE metrics and human quality assessments.
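
For a feel of the extractive end of that spectrum, here is a minimal frequency-based extractive baseline; it is a generic sketch for illustration, not one of the paper's evaluated systems.

```python
# Sketch: score sentences by average word frequency and return the top-n
# in document order. A generic extractive baseline, illustration only.
import re
from collections import Counter

def extractive_summary(text, n_sentences=3):
    sents = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sent):
        tokens = re.findall(r"[a-z']+", sent.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    top = sorted(sorted(range(len(sents)), key=lambda i: score(sents[i]),
                        reverse=True)[:n_sentences])
    return " ".join(sents[i] for i in top)
```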

What do the findings suggest about the effectiveness of hybrid summarization approaches?

Results indicate that hybrid approaches, which combine extractive and abstractive techniques, significantly reduce redundancy and improve summary coherence. In particular, adding a coverage mechanism enhanced summary quality on complex biomedical texts.
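
The coverage idea follows the pointer-generator formulation of See et al. (2017): a coverage vector accumulates past attention, and a penalty discourages re-attending to source tokens that have already been covered, which curbs repetition. A NumPy sketch of the loss term (illustration only; in practice this is computed inside the decoder during training):

```python
# Sketch: coverage loss in the style of pointer-generator networks.
# attentions has shape (decoder_steps, source_len); each row is a
# normalized attention distribution over source tokens.
import numpy as np

def coverage_loss(attentions):
    coverage = np.zeros(attentions.shape[1])  # attention mass seen so far
    total = 0.0
    for attn in attentions:
        total += np.minimum(attn, coverage).sum()  # penalize re-attention
        coverage += attn                           # accumulate coverage
    return total

# Toy check: step 2 re-attends to token 0 and is penalized.
attn = np.array([[0.7, 0.1, 0.1, 0.1],
                 [0.6, 0.2, 0.1, 0.1],
                 [0.1, 0.1, 0.1, 0.7]])
print(coverage_loss(attn))  # > 0 because attention overlaps across steps
```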
