Papers by Ameneh Shamekhi
A Virtual Self-care Coach for Individuals with Spinal Cord Injury
Most persons with spinal cord injury (SCI) require training and support for self-care management to help prevent the development of serious secondary conditions after hospital discharge. We have designed a virtual coach system in which an animated character engages users in simulated face-to-face conversation to provide health education and motivate healthy behavior. We conducted an exploratory study with nine participants with SCI to examine the acceptance of and attitudes toward our system. Results show that participants are highly receptive to the virtual coach technology and recognize it as an effective medium for promoting self-care.
Smart Health, Mar 1, 2021

Lecture Notes in Computer Science, 2015
A virtual agent that guides users through mindfulness meditation sessions is described. The agent uses input from a respiration sensor both to respond to the user's breathing rate and to use deep breaths as a continuation and acknowledgment signal. A pilot evaluation study comparing the agent to a self-help video indicates that users are very receptive to the virtual meditation coach, and that it is more effective at reducing anxiety and increasing mindfulness and flow state than the video.

Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems
Speaker diarization is a key component of systems that support multiparty interactions of co-located users, such as meeting facilitation robots. The goal is to identify who spoke what, often to provide feedback, moderate participation, and personalize the robot's responses. Current systems use a combination of acoustic features (e.g., pitch differences) and visual features (e.g., gaze) to perform diarization, but they require additional sensors or signal processing overhead. Alternatively, automatic speech recognition (ASR) is already a necessary step in the diarization pipeline, and using the transcribed text to directly identify speaker labels in the conversation can eliminate such challenges. With that motivation, we leverage large language models (LLMs) to identify speaker labels.
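As a rough sketch of the transcript-only diarization idea above (and not the paper's actual prompting strategy), the following Python snippet formats an ASR transcript into a prompt that asks an LLM to assign a speaker label to each utterance, then parses the labels back out. The prompt wording, label scheme, and the stand-in for the model call are assumptions for illustration.

```python
# Illustrative sketch only; the real system's prompt design and LLM are not shown here.

def build_diarization_prompt(utterances, num_speakers=2):
    """Format an unlabeled ASR transcript into a speaker-labeling request."""
    numbered = [f"{i + 1}. {text}" for i, text in enumerate(utterances)]
    return (
        f"The following meeting transcript has {num_speakers} speakers. "
        "Assign a speaker ID (S1, S2, ...) to each numbered utterance, "
        "one label per line, in order.\n\n" + "\n".join(numbered)
    )

def parse_labels(llm_response, num_utterances):
    """Map the model's line-by-line answer back onto the utterances."""
    labels = [line.strip() for line in llm_response.splitlines() if line.strip()]
    return labels[:num_utterances]

if __name__ == "__main__":
    transcript = [
        "I think we should go with the first hiring option.",
        "Agreed, but how does that fit the budget?",
    ]
    prompt = build_diarization_prompt(transcript, num_speakers=2)
    llm_response = "S1\nS2"  # stand-in for an actual LLM call on `prompt`
    print(list(zip(parse_labels(llm_response, len(transcript)), transcript)))
```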

A Multimodal Robot-Driven Meeting Facilitation System for Group Decision-Making Sessions
2019 International Conference on Multimodal Interaction
Group meetings are ubiquitous, with millions held across the world every day. However, meeting quality, group performance, and outcomes are challenged by a variety of dysfunctional behaviors, unproductive social dynamics, and lack of experience in conducting efficient and productive meetings. Previous studies have shown that meeting facilitators can help groups reach their goals more effectively, but many groups do not have access to human facilitators due to a lack of resources or other barriers. In this paper, we describe the development of a multimodal robotic meeting facilitator that can improve the quality of small-group decision-making meetings. The automated facilitation system uses multimodal sensor inputs (user gaze, speech, prosody, and proxemics), as well as inputs from a tablet application, to intelligently enforce meeting structure, promote time management, balance group participation, and support group decision-making processes. Results of a between-subjects study of 20 user groups (N=40) showed that the robot facilitator is accepted by group members, is effective in enforcing meeting structure, and is perceived as helpful in balancing group participation. We also report design implications derived from the findings of our study.
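To make one of the facilitation behaviors above concrete, here is a minimal sketch of a participation-balancing rule driven by per-speaker talk time; the class name, threshold, and intervention are illustrative assumptions, not the robot's actual policy.

```python
# Hypothetical participation-balancing rule; thresholds and names are assumptions.
from collections import defaultdict

class ParticipationBalancer:
    def __init__(self, imbalance_ratio=2.0):
        self.speaking_time = defaultdict(float)  # seconds spoken per participant
        self.imbalance_ratio = imbalance_ratio

    def update(self, speaker_id, seconds):
        self.speaking_time[speaker_id] += seconds

    def who_to_invite(self):
        """Return the quietest participant if the most talkative member has
        spoken more than `imbalance_ratio` times as long; otherwise None."""
        if len(self.speaking_time) < 2:
            return None
        quiet = min(self.speaking_time, key=self.speaking_time.get)
        loud = max(self.speaking_time, key=self.speaking_time.get)
        if self.speaking_time[loud] > self.imbalance_ratio * max(self.speaking_time[quiet], 1.0):
            return quiet  # e.g., the robot could invite this person to weigh in
        return None

balancer = ParticipationBalancer()
balancer.update("A", 95.0)
balancer.update("B", 20.0)
print(balancer.who_to_invite())  # -> "B"
```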

Multimodal Assessment of Oral Presentations using HMMs
Proceedings of the 2020 International Conference on Multimodal Interaction, 2020
Audience perceptions of public speakers' performance change over time. Some speakers start strong but quickly transition to mundane delivery, while others may have a few impactful and engaging portions of their talk preceded and followed by more pedestrian delivery. In this work, we model the time-varying qualities of a presentation as perceived by the audience and use these models both to provide diagnostic information to presenters and to improve the quality of automated performance assessments. In particular, we use HMMs to model various dimensions of perceived quality and how they change over time, and we use the resulting sequence of quality states to improve feedback and predictions. We evaluate this approach on a corpus of 74 presentations given in a controlled environment. Multimodal features spanning acoustic qualities, speech disfluencies, and nonverbal behavior were derived both automatically and manually using crowdsourcing. Ground truth on audience perceptions was obtained using judge ratings of both overall presentations (aggregate) and portions of presentations segmented by topic. We distilled overall presentation quality into states representing the presenter's gaze, audio, gesture, audience-interaction, and proxemic behaviors. We demonstrate that this HMM-based state representation of presentations improves performance assessment.
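The sketch below shows the general shape of such HMM-based modeling using the hmmlearn library and synthetic per-segment features; the feature columns, number of hidden states, and data are assumptions for illustration, not the paper's corpus or feature set.

```python
# Toy illustration of modeling per-segment presentation quality with an HMM.
import numpy as np
from hmmlearn.hmm import GaussianHMM

# Synthetic data: 3 presentations, each a sequence of per-topic-segment feature
# vectors (columns might stand for pitch variation, disfluency rate, gesture rate).
rng = np.random.default_rng(0)
sequences = [rng.normal(size=(n, 3)) for n in (20, 16, 14)]
X = np.vstack(sequences)
lengths = [len(s) for s in sequences]

# Fit a small HMM whose hidden states play the role of latent "quality states".
hmm = GaussianHMM(n_components=3, covariance_type="diag", n_iter=50, random_state=0)
hmm.fit(X, lengths)

# Decode the state trajectory of the first presentation; such trajectories can
# then feed diagnostic feedback or a downstream overall-quality predictor.
print(hmm.predict(sequences[0]))
```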

Journal on Multimodal User Interfaces, 2020
The quality of scientific oral presentations is often poor, owing to a number of factors, including public speaking anxiety. We present DynamicDuo, a system that uses an automated, life-sized, animated agent to help inexperienced scientists deliver their presentations in front of an audience. The design of the system was informed by an analysis of TED talks given by pairs of human presenters to identify the most common dual-presentation formats and transition behaviors used. We explore the usability and acceptability of DynamicDuo in both controlled laboratory studies and real-world environments, and its ability to decrease public speaking anxiety and improve presentation quality. In a within-subjects study (N=12) comparing co-presenting with DynamicDuo against solo-presenting with conventional presentation software, we demonstrate that our system led to significant improvements in public speaking anxiety and speaking confidence for non-native English speakers. Judges who viewed videotapes of these presentations rated those with DynamicDuo significantly higher on speech quality and overall presentation quality for all presenters. We also explore the affordances of the virtual co-presenter through empirical evaluation of novel roles the agent can play in scientific presentations and novel ways it can interact with the speaker in front of the audience.

Agents that are tailored to appear and behave as members of a particular culture are more acceptable and persuasive to members of that culture than agents that are not tailored. We report a study that systematically unpacks two tailoring components, appearance and argumentation, for virtual exercise coaches designed for Indian and American cultures. Indian participants who interacted with an agent whose argumentation was tailored to their culture were significantly more satisfied with the agent, irrespective of the agent's appearance.

Several conversational agents have now been developed for automated health education and health behavior change counseling, in areas as diverse as exercise promotion [1], substance abuse counseling [2], and chronic disease self-care management [3]. For practical reasons, most have been deployed as animated characters rather than robots, and for safety reasons most have used fully constrained user input so that health advice can be validated [4]. We are currently exploring alternative input and output modalities for health counseling agents to determine whether they have significantly different impacts on user perceptions and health outcomes, assuming the other limitations described above can be addressed. Related results in other areas suggest that robotic embodiments can lead to a greater sense of presence and engagement compared to animated agents [5, 6]. Our own pilot studies also revealed a user preference for unconstrained speech input. In this paper, we discuss preliminary res...

Health Counseling by Robots: Modalities for Breastfeeding Promotion
2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2019
Conversational humanoid robots are increasingly being used for health education and counseling. Prior research provides mixed indications regarding the best modalities to use for these systems, including user inputs spanning completely constrained multiple-choice options vs. unconstrained speech, and embodiments of humanoid robots vs. virtual agents, especially for potentially sensitive health topics such as breastfeeding. We report results from an experiment comparing five different interface modalities, finding that all result in significant increases in user knowledge and intent to adhere to recommendations, with few differences among them. Users are equally satisfied with constrained (multiple-choice) touch-screen input and unconstrained speech input, but are relatively unsatisfied with constrained speech input. Women find conversational robots to be an effective, safe, and non-judgmental medium for obtaining information about breastfeeding.

Proceedings of the 18th International Conference on Intelligent Virtual Agents, 2018
Longitudinal agent-based interventions only work if people continue using them on a regular basis; thus, identifying users who are at risk of disengaging from these applications is important for retention and efficacy. We develop machine learning models that predict long-term user engagement in three longitudinal virtual agent-based health interventions. We achieve accuracies of 74% to 90% in predicting user dropout in a given prediction period of the intervention based on the user's past interactions with the agent. Our models contain features related to session frequency and duration, health behavior, and user-agent dialogue content. We find that the features most predictive of dropout include the number of user utterances, the percentage of user utterances that are questions, and the percentage of user health behavior goals met during the observation period. Ramifications for the design of virtual agents for longitudinal applications are discussed.
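As a sketch of the kind of engagement model described above, the snippet below trains an off-the-shelf classifier on per-user observation-period features of the sort named in the abstract (session counts and durations, utterance counts, question rate, goals met). The synthetic data, labels, and choice of a random forest are assumptions for illustration, not the paper's dataset or model.

```python
# Toy dropout-prediction pipeline; data and model choice are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical per-user features for one observation period:
# [num_sessions, mean_session_minutes, num_user_utterances,
#  pct_utterances_that_are_questions, pct_health_goals_met]
rng = np.random.default_rng(1)
X = rng.random((200, 5))
y = (X[:, 2] + X[:, 4] < 0.8).astype(int)  # synthetic "dropped out" label

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # mean accuracy on the toy data
```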

Breathe Deep
Proceedings of the 12th EAI International Conference on Pervasive Computing Technologies for Healthcare, 2018
Mindfulness meditation has been demonstrated to be an effective approach for alleviating symptoms related to a variety of chronic health conditions, including pain, anxiety, and depression. Meditation takes practice and requires training, especially for novices, to learn mindfulness and emotion regulation. While face-to-face instruction can provide the best long-term results, many people cannot afford or schedule attendance at meditation classes. We present an automated conversational agent that acts as a virtual meditation coach and is interactive and adaptive to a user's breathing behavior, based on input from a respiration sensor during a meditation session. We designed and validated three interaction techniques based on the user's breathing. Results from two experimental studies demonstrate that users are highly receptive to the virtual coach technology and appreciated the interactivity afforded by the respiration sensor. Participants also felt more relaxed when the meditation coach adapted its instructions to their breathing.
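One way to realize a breath-driven interaction of the kind described above is a simple amplitude heuristic: treat a breath that is much deeper than the user's recent baseline as an acknowledgment signal to advance the session. The windowing and threshold values below are assumptions, not the system's actual signal processing.

```python
# Hypothetical deep-breath detector over a respiration-sensor trace.
import numpy as np

def detect_deep_breath(respiration, baseline_window=50, depth_factor=1.5):
    """Return True if the latest excursion exceeds the typical peak-to-trough
    range of the baseline window by `depth_factor`."""
    signal = np.asarray(respiration, dtype=float)
    typical_range = np.ptp(signal[:baseline_window])  # baseline breathing depth
    recent_range = np.ptp(signal[baseline_window:])
    return recent_range > depth_factor * typical_range

# Toy trace: shallow baseline breathing followed by one deep breath.
t = np.linspace(0, 10, 100)
trace = 0.3 * np.sin(2 * np.pi * 0.25 * t)               # shallow breaths
trace[85:95] += 1.0 * np.sin(np.linspace(0, np.pi, 10))  # one deep breath
if detect_deep_breath(trace, baseline_window=50):
    print("Deep breath detected: advance to the next meditation step.")
```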
Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents, 2019
Our research explores the development of new interaction formats for oral presentations that leverage a life-sized virtual agent that co-delivers a scientific talk with a human presenter. We developed a taxonomy of 36 novel interaction formats as well as 37 roles the agent can take on in co-presentations. We evaluated the impact of these formats and roles by selecting 10 from the taxonomy and recording brief presentations on the same topic using the different formats. Judges ranked dynamic agent roles higher on engagement and rated non-standard interaction formats no lower on appropriateness, compared to standard turn-taking co-presentations.

Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018
We are interested in increasing the ability of groups to collaborate efficiently by leveraging new advances in AI and Conversational Agent (CA) technology. Given the longstanding debate on the necessity of embodiment for CAs, bringing them to groups requires answering the questions of whether and how providing a CA with a face affects its interaction with the humans in a group. We explored these questions by comparing group decision-making sessions facilitated by an embodied agent versus a voice-only agent. Results of an experiment with 20 user groups revealed that while embodiment improved various aspects of the group's social perception of the agent (e.g., rapport, trust, intelligence, and power), its impact on the group decision process and outcome was nuanced. Drawing on both quantitative and qualitative findings, we discuss the pros and cons of embodiment, argue that the value of having a face depends on the types of assistance the agent provides, and lay out directions for future research.
Smart Health, 2020