Papers by Giuseppe Carenini

Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing
Our analysis of large summarization datasets indicates that redundancy is a very serious problem when summarizing long documents. Yet, redundancy reduction has not been thoroughly investigated in neural summarization. In this work, we systematically explore and compare different ways to deal with redundancy when summarizing long documents. Specifically, we organize the existing methods into categories based on when and how the redundancy is considered. Then, in the context of these categories, we propose three additional methods balancing non-redundancy and importance in a general and flexible way. In a series of experiments, we show that our proposed methods achieve the state-of-the-art with respect to ROUGE scores on two scientific paper datasets, Pubmed and arXiv, while reducing redundancy significantly.
Summarizing Text Conversations
Springer eBooks, 2011

Machine Learning for Healthcare Conference, Oct 28, 2019
Early prediction of neurodegenerative disorders such as Alzheimer's disease (AD) and related dementias is important in developing early medical and social supports, and may identify ideal stages for testing novel therapeutics aimed at preventing disease progression. Currently, a diagnosis is based on clinical expertise and cognitive screening tests, which have limited accuracy in earlier stages of disease, or invasive and resource-intensive testing, such as lumbar puncture or specialized neuroimaging. Changes in speech and language patterns can occur in dementia in its earliest stages and may worsen as the disease progresses. This has led to recent attempts to create automatic methods that predict dementia through language analysis. In addition to features extracted from language samples, previous works have improved the prediction accuracy by introducing some task-specific features. But task-specific features prevent the model from generalizing to other tests. In this paper, we apply a neural model (Hierarchical Attention Networks) to the dementia prediction task. Remarkably, the model requires no task-specific features and achieves state-of-the-art classification results on a widely used dementia dataset of spoken language. We also perform a detailed analysis to interpret how a prediction is made. Interestingly, the same neural model does not work well on a corpus of written text, suggesting that dementia prediction from language may require different methods depending on the genre of the source language.

International Conference on Computational Linguistics, Dec 1, 2016
Discourse parsing is a popular technique widely used in text understanding, sentiment analysis and other NLP tasks. However, for most discourse parsers, the performance varies significantly across different discourse relations. In this paper, we first validate the underfitting hypothesis, i.e., the less frequent a relation is in the training data, the poorer the performance on that relation. We then explore how to increase the number of positive training instances, without resorting to manually creating additional labeled data. We propose a training data enrichment framework that relies on co-training of two different discourse parsers on unlabeled documents. Importantly, we show that co-training alone is not sufficient. The framework requires a filtering step to ensure that only "good quality" unlabeled documents can be used for enrichment and re-training. We propose and evaluate two ways to perform the filtering. The first is to use an agreement score between the two parsers. The second is to use only the confidence score of the faster parser. Our empirical results show that agreement score can help to boost the performance on infrequent relations, and that the confidence score is a viable approximation of the agreement score for infrequent relations.

Information Visualization, Feb 23, 2018
In the last decade, there has been an exponential growth of asynchronous online conversations (e.g. blogs), thanks to the rise of social media. Analyzing and gaining insights from such discussions can be quite challenging for a user, especially when the user deals with hundreds of comments that are scattered around multiple different conversations. A promising solution to this problem is to automatically mine the major topics from conversations and organize them into a hierarchical structure. However, the resultant topic hierarchy can be noisy and/or it may not match the user's current information needs. To address this problem, we introduce a novel human-in-the-loop approach that allows the user to revise the topic hierarchy based on her feedback. We incorporate this approach within a visual text analytics system that helps users in analyzing and getting insights from conversations by exploring and revising the topic hierarchy. We evaluated the resulting system with real users in a lab-based study. The results from the user study, when compared to its counterpart that does not support interactive revisions of a hierarchical topic model, provide empirical evidence of the potential utility of our system in terms of both performance and subjective measures. Finally, we summarize generalizable lessons for introducing human-in-the-loop computation within a visual text analytics system.

Impact of Individual Differences on User Experience with a Real-World Visualization Interface for Public Engagement
There is increasing evidence that the effectiveness of Information Visualization (Infovis) is affected by the user needs and abilities. For instance, cognitive abilities (e.g., perceptual speed, working memory) [e.g., 1-4] have been shown to impact users' performance and satisfaction with a given visualization. These findings suggest that it can be valuable to develop visualization systems that can provide personalized support targeting specific user characteristics. Furthermore, recent research [e.g., 3,5] has shown that eye tracking data can be leveraged to identify the elements of a visualization for which specific user differences hinder user experience or performance, thus providing concrete information on which specific personalized support could be helpful for different users (e.g., users with low perceptual speed may benefit from help in processing legends [1]). Though these findings are encouraging toward the design of user-adaptive or customized visualizations, they are generally related to either fictional tasks or research prototypes. So, it is unclear if existing results on the value of user-adaptive visualizations can transfer to real-world settings.

arXiv (Cornell University), Dec 17, 2020
Discourse information, as postulated by popular discourse theories, such as RST and PDTB, has been shown to improve an increasing number of downstream NLP tasks, showing positive effects and synergies of discourse with important real-world applications. While methods for incorporating discourse become more and more sophisticated, the growing need for robust and general discourse structures has not been sufficiently met by current discourse parsers, usually trained on small scale datasets in a strictly limited number of domains. This makes the prediction for arbitrary tasks noisy and unreliable. The overall resulting lack of high-quality, high-quantity discourse trees poses a severe limitation to further progress. In order to alleviate this shortcoming, we propose a new strategy to generate tree structures in a task-agnostic, unsupervised fashion by extending a latent tree induction framework with an auto-encoding objective. The proposed approach can be applied to any tree-structured objective, such as syntactic parsing, discourse parsing and others. However, due to the especially difficult annotation process to generate discourse trees, we initially develop a method to generate larger and more diverse discourse treebanks. In this paper, we infer general tree structures of natural text in multiple domains, showing promising results on a diverse set of tasks.
arXiv (Cornell University), Oct 17, 2022
arXiv (Cornell University), Oct 17, 2022
Conclusions / Final Thoughts
Springer eBooks, 2011
Background: Corpora and Evaluation Methods
Springer eBooks, 2011
Transformers are the dominant architecture in NLP, but their training and fine-tuning is still very challenging. In this paper, we present the design and implementation of a visual analytic framework for assisting researchers in this process, by providing them with valuable insights about the model's intrinsic properties and behaviours. Our framework offers an intuitive overview that allows the user to explore different facets of the model (e.g., hidden states, attention) through interactive visualization, and provides a suite of built-in algorithms that compute the importance of model components and different parts of the input sequence. Case studies and feedback from a user focus group indicate that the framework is useful, and suggest several improvements.
Opinion Summarization and Visualization
Elsevier eBooks, 2017
Given a very large amount of social media text, being able to understand the variety of opinions contained in the text often depends on the generation of summaries and visualizations of the dataset so as to make it more manageable. This chapter first surveys approaches used for both extractive and abstractive summarization of opinion-filled social media text, including discussion of summarization evaluation. We then survey approaches for presenting these opinion summaries to users in the form of visualizations, including interactive visualizations. Taken together, these summarization and visualization techniques can allow users to get concise overviews of the opinions expressed in the data, enabling them to draw sound conclusions and make more informed decisions.
International Joint Conference on Artificial Intelligence, Jul 9, 2016
Confusion has been found to hinder user experience with visualizations. If confusion could be predicted and resolved in real time, user experience and satisfaction would greatly improve. In this paper, we focus on predicting occurrences of confusion during the interaction with a visualization using eye tracking and mouse data. The data was collected during a user study with ValueChart, an interactive visualization to support preferential choices. We report very promising results based on Random Forest classifiers.

Social media is a rich source where we can learn about people's reactions to social issues. As COVID-19 has significantly impacted people's lives, it is essential to capture how people react to public health interventions and understand their concerns. In this paper, we aim to investigate people's reactions and concerns about COVID-19 in North America, especially focusing on Canada. We analyze COVID-19 related tweets using topic modeling and aspect-based sentiment analysis, and interpret the results with public health experts. We compare the timeline of topics discussed with the timing of the implementation of public health interventions for COVID-19. We also examine people's sentiment about COVID-19 related issues. We discuss how the results can be helpful for public health agencies when designing a policy for new interventions. Our work shows how Natural Language Processing (NLP) techniques could be applied to public health questions with domain expert involvement.
Impact of Individual Differences on User Experience with a Visualization Interface for Public Engagement
Although there is abundant evidence that individual differences such as cognitive abilities impact visualization effectiveness, this influence has mostly been shown for fictional tasks/scenarios. This paper extends previous findings by investigating the impact of individual differences on user experience with a real-world information visualization tool designed to support preferential choices in public engagement. We show that several cognitive abilities do have an influence on user experience in this task, and show that this influence can be explained by eye tracking. We discuss how these results are promising towards the design of visualizations for preferential choice in public engagement that can adapt to the user's needs, abilities and expertise.

User Modeling and User-adapted Interaction, Aug 30, 2019
Previous research has shown that various user characteristics (e.g., cognitive abilities, personality traits, and learning abilities) can influence user experience during information visualization tasks. These findings have prompted researchers to investigate user-adaptive information visualizations that can help users by providing personalized support based on their specific needs. Whereas existing work has been mostly limited to tasks involving just visualizations, the aim of our research is to broaden this work to include scenarios where users process textual documents with embedded visualizations, i.e., Magazine Style Narrative Visualizations, or MSNVs for short. In this paper, we analyze eye tracking data collected from a user study with MSNVs to uncover processing behaviors that are negatively impacting user experience (i.e., time on task) for users with low abilities in these user characteristics. Our analysis leverages Linear Mixed-Effects Models to evaluate the relationships among user characteristics, gaze processing behaviors, and task performance. Our results identify several MSNV processing behaviors within the visualization that contribute to poor task performance for users with low reading proficiency. For instance, we identify that users with low reading proficiency transition significantly more often compared to their counterparts between relevant and non-relevant bars, and transition more often from bars to the labels. We present our findings as a step toward designing user-adaptive support mechanisms to alleviate these difficulties with MSNVs, and provide suggestions on how our results can be leveraged for creating a set of meaningful interventions for future evaluation (e.g., dynamically highlighting relevant bars and labels in the visualization to help users with low reading proficiency locate them more effectively).
In this paper, we propose a novel neural single-document extractive summarization model for long documents, incorporating both the global context of the whole document and the local context within the current topic. We evaluate the model on two datasets of scientific papers, Pubmed and arXiv, where it outperforms previous work, both extractive and abstractive models, on ROUGE-1, ROUGE-2 and METEOR scores. We also show that, consistently with our goal, the benefits of our method become stronger as we apply it to longer documents. Rather surprisingly, an ablation study indicates that the benefits of our model seem to come exclusively from modeling the local context, even for the longest documents.

arXiv (Cornell University), Oct 30, 2019
Discourse parsing could not yet take full advantage of the neural NLP revolution, mostly due to the lack of annotated datasets. We propose a novel approach that uses distant supervision on an auxiliary task (sentiment classification), to generate abundant data for RST-style discourse structure prediction. Our approach combines a neural variant of multiple-instance learning, using document-level supervision, with an optimal CKY-style tree generation algorithm. In a series of experiments, we train a discourse parser (for only structure prediction) on our automatically generated dataset and compare it with parsers trained on human-annotated corpora (news domain RST-DT and Instructional domain). Results indicate that while our parser does not yet match the performance of a parser trained and tested on the same dataset (intra-domain), it does perform remarkably well on the much more difficult and arguably more useful task of inter-domain discourse structure prediction, where the parser is trained on one domain and tested/applied on another one.
arXiv (Cornell University), Nov 6, 2020
RST-based discourse parsing is an important NLP task with numerous downstream applications, such as summarization, machine translation and opinion mining. In this paper, we demonstrate a simple, yet highly accurate discourse parser, incorporating recent contextual language models. Our parser establishes the new state-of-the-art (SOTA) performance for predicting structure and nuclearity on two key RST datasets, RST-DT and Instr-DT. We further demonstrate that pretraining our parser on the recently available large-scale "silver-standard" discourse treebank MEGA-DT provides even larger performance benefits, suggesting a novel and promising research direction in the field of discourse analysis.