Summarizing Information Graphics Textually

Kathleen McCoy

doi:10.1162/COLI_A_00091

2012, Computational Linguistics

48 pages

Abstract

Information graphics (such as bar charts and line graphs) play a vital role in many multimodal documents. The majority of information graphics that appear in popular media are intended to convey a message and the graphic designer uses deliberate communicative signals, such as highlighting certain aspects of the graphic, in order to bring that message out. The graphic, whose communicative goal (intended message) is often not captured by the document's accompanying text, contributes to the overall purpose of the document and cannot be ignored. This article presents our approach to providing the high-level content of a non-scientific information graphic via a brief textual summary which includes the intended message and the salient features of the graphic. This work brings together insights obtained from empirical studies in order to determine what should be contained in the summaries of this form of non-linguistic input data, and how the information required for realizing the sele...

