Papers by Danielle McNamara

International Journal of Artificial Intelligence in Education, Dec 9, 2016
Source-based essays are evaluated both on the quality of the writing and the content (i.e., appropriate interpretation and use of source material). Hence, composing a high-quality source-based essay (an essay written based on source material) relies on skills related to both reading (the sources) and writing (the essay). As such, source-based writing must involve both language comprehension and production processes. The purpose of the current study is to examine the impact of reading, writing, and blended (i.e., reading and writing) strategy training on students' performance on a content-specific source-based essay writing task. In contrast to general source-based writing tasks, content-specific source-based writing tasks are tasks wherein writers are provided the source material on which to base their essays. Undergraduate students (n = 175) were provided with strategy instruction and practice in the context of two intelligent tutoring systems, Writing Pal and Interactive Strategy Training for Active Reading and Thinking (iSTART). Results indicated that participants in the blended strategy training condition produced higher quality source-based essays than participants in the reading comprehension-only, writing-only, or control conditions, with no differences observed among the latter three conditions. Further, the benefits of this blended strategy instruction remained significant regardless of prior reading and writing skills, or time on task.

Grantee Submission, 2016
This study investigates how and whether information about students' writing can be recovered from basic behavioral data extracted during their sessions in an intelligent tutoring system for writing. We calculate basic and time-sensitive keystroke indices based on log files of keys pressed during students' writing sessions. A corpus of prompt-based essays was collected from 126 undergraduates along with keystrokes logged during the session. Holistic scores and linguistic properties of these essays were then automatically calculated using natural language processing tools. Results indicated that keystroke indices accounted for 76% of the variance in essay quality and up to 38% of the variance in the linguistic characteristics. Overall, these results suggest that keystroke analyses can help to recover crucial information about writing, which may ultimately help to improve student models in computer-based learning environments.
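The kind of basic, time-sensitive keystroke indices described above can be sketched as follows. This is a minimal illustration assuming a simple log of (timestamp, key) pairs; the index names and the pause threshold are placeholders, not the study's actual feature set.

```python
from statistics import mean

def keystroke_indices(log, pause_threshold=2.0):
    """Compute basic timing indices from a keystroke log.

    `log` is a list of (timestamp_seconds, key) tuples in press order.
    The indices and threshold here are illustrative, not the paper's
    exact feature definitions.
    """
    times = [t for t, _ in log]
    # Inter-key intervals: time elapsed between consecutive key presses.
    ikis = [b - a for a, b in zip(times, times[1:])]
    # A "pause" is any interval at or above the threshold.
    pauses = [iki for iki in ikis if iki >= pause_threshold]
    return {
        "n_keystrokes": len(log),
        "mean_iki": mean(ikis) if ikis else 0.0,
        "n_pauses": len(pauses),
        "pause_time": sum(pauses),
        "total_time": times[-1] - times[0] if times else 0.0,
    }

# Toy session: a long pause before the word boundary.
log = [(0.0, "T"), (0.2, "h"), (0.4, "e"), (3.0, " "), (3.2, "c")]
print(keystroke_indices(log))
```

Indices like these can then serve as predictors of essay quality or linguistic characteristics in a regression model.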

Reading in a foreign language, Apr 1, 2016
This study uses a moving windows self-paced reading task to assess text comprehension of beginning and intermediate-level simplified texts and authentic texts by L2 learners engaged in a text-retelling task. Linear mixed effects (LME) models revealed statistically significant main effects for reading proficiency and text level on the number of text-based propositions recalled: More proficient readers recalled more propositions. However, text level was a stronger predictor of propositional recall than reading proficiency. LME models also revealed main effects for language proficiency and text level on the number of extra-textual propositions produced. Text level, however, emerged as a stronger predictor than language proficiency. Post-hoc analyses indicated that there were more irrelevant elaborations for authentic texts, and that intermediate and authentic texts led to a greater number of relevant elaborations compared to beginning texts.

Educational Data Mining, Dec 25, 2016
This study investigates a novel approach to automatically assessing essay quality that combines natural language processing approaches that assess text features with approaches that assess individual differences in writers such as demographic information, standardized test scores, and survey results. The results demonstrate that combining text features and individual differences increases the accuracy of automatically assigned essay scores over using either individual differences or text features alone. The findings presented here have important implications for writing educators because they reveal that essay scoring methods can benefit from the incorporation of features taken not only from the essay itself (e.g., features related to lexical and syntactic complexity), but also from the writer (e.g., vocabulary knowledge and writing attitudes). The findings have implications for educational data mining researchers because they demonstrate new natural language processing approaches that afford the automatic assessment of performance outcomes.

Grantee Submission, Mar 1, 2015
This study builds upon previous work aimed at developing a student model of reading comprehension ability within the intelligent tutoring system, iSTART. Currently, the system evaluates students' self-explanation performance using a local, sentence-level algorithm and does not adapt content based on reading ability. The current study leverages natural language processing tools to build models of students' comprehension ability from the linguistic properties of their self-explanations. Students (n = 126) interacted with iSTART across eight training sessions where they self-explained target sentences from complex science texts. Coh-Metrix was then used to calculate the linguistic properties of their aggregated self-explanations. The results of this study indicated that the linguistic indices were predictive of students' reading comprehension ability, over and above the current system algorithms. These results suggest that natural language processing techniques can inform stealth assessments and ultimately improve student models within intelligent tutoring systems.
This study investigates a new approach to automatically assessing essay quality that combines traditional approaches based on assessing textual features with new approaches that measure student attributes such as demographic information, standardized test scores, and survey results. The results demonstrate that combining both text features and student attributes leads to essay scoring models that are on par with state-of-the-art scoring models. Such findings expand our knowledge of textual and nontextual features that are predictive of writing success.

Reading and Writing, Dec 6, 2018
The assessment of argumentative writing generally includes analyses of the specific linguistic and rhetorical features contained in the individual essays produced by students. However, researchers have recently proposed that an individual's ability to flexibly adapt the linguistic properties of their writing may more accurately capture their proficiency. Yet the features of the task, learner, and educational context that influence this flexibility remain largely unknown. The current study extends this research by examining relations between linguistic flexibility, reading comprehension ability, and feedback in the context of an automated writing evaluation system. Students (n = 131) wrote and revised six argumentative essays in an automated writing evaluation system and were provided both summative and formative feedback on their writing. Additionally, half of the students had access to a spelling and grammar checker that provided lower-level feedback during the writing period. The results provide evidence for the supposition that skilled writers demonstrate linguistic flexibility across the argumentative essays that they produce. However, analyses also indicate that lower-level feedback (i.e., spelling and grammar feedback) has little to no impact on the properties of students' essays or on their variability across prompts or drafts. Overall, the current study provides important insights into the role of flexibility in argumentative writing skill and develops a strong foundation on which to conduct future research and educational interventions.
Springer eBooks, 2015
We investigated linguistic factors that relate to misalignment between students' and teachers' ratings of essay quality. Students (n = 126) wrote essays and rated the quality of their work. Teachers then provided their own ratings of the essays. Results revealed that students who were less accurate in their self-assessments produced essays that were more causal, contained less meaningful words, and had less argument overlap between sentences.

HAL (Le Centre pour la Communication Scientifique Directe), 2018
A critical task for tutors is to provide learners with suitable reading materials in terms of difficulty. The challenge of this endeavor is increased by students' individual variability and the multiple levels in which complexity can vary, thus arguing for the necessity of automated systems to support teachers. This chapter describes ReaderBench, an open-source multi-dimensional and multi-lingual system that uses advanced Natural Language Processing techniques to assess textual complexity at multiple levels including surface-based, syntax, semantics, and discourse structure. In contrast to other existing approaches, ReaderBench is centered on cohesion and makes extensive use of two complementary models, i.e., Cohesion Network Analysis and the polyphonic model inspired by dialogism. The first model provides an in-depth view of discourse in terms of cohesive links, whereas the second one highlights interactions between points of view spanning the discourse. In order to argue for its wide applicability and extensibility, two studies are introduced. The first study investigates the degree to which ReaderBench textual complexity indices differentiate between high and low cohesion texts. The ReaderBench indices led to a higher classification accuracy than those included in prior studies using Coh-Metrix and TAACO. In the second study, ReaderBench indices are used to predict the difficulty of a set of various texts. Although the high number of predictive indices (more than 50) accounted for less variance than previous studies, they make valuable contributions to our understanding of text due to their wide coverage.

Journal of writing assessment, 2015
This study investigates the relative efficacy of using linguistic micro-features, the aggregation of such features, and a combination of micro-features and aggregated features in developing automatic essay scoring (AES) models. Although the use of aggregated features is widespread in AES systems (e.g., e-rater; Intellimetric), very little published data exists that demonstrates the superiority of using such a method over the use of linguistic micro-features or a combination of both micro-features and aggregated features. The results of this study indicate that AES models comprised of micro-features and a combination of micro-features and aggregated features outperform AES models comprised of aggregated features alone. The results also indicate that AES models based on micro-features and a combination of micro-features and aggregated features provide a greater variety of features with which to provide formative feedback to writers. These results have implications for the development of AES systems and for providing automatic feedback to writers within these systems. Automated essay scoring (AES) is the use of computers to predict human ratings of essay quality. AES can be helpful in both classroom settings and in high-stakes testing by increasing reliability and decreasing the time and costs normally associated with essay evaluation.
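The distinction between micro-features and aggregated features can be illustrated with a toy sketch: several normalized micro-feature values are collapsed into a single aggregated dimension by averaging. The feature names and values are invented for illustration; real AES systems use many more indices and more sophisticated aggregation schemes.

```python
def aggregate(micro):
    """Collapse a named group of micro-features into one aggregated value.

    A simple mean is used here purely for illustration; operational AES
    systems weight and combine micro-features in more complex ways.
    """
    return sum(micro.values()) / len(micro)

# Hypothetical normalized lexical micro-features for one essay.
lexical_micro = {"word_length": 0.42, "word_frequency": 0.31, "hypernymy": 0.55}
lexical_aggregated = aggregate(lexical_micro)
print(round(lexical_aggregated, 2))
```

A scoring model built on the micro-features retains three separate signals for feedback (e.g., "use less frequent words"), whereas the aggregated model sees only the single lexical dimension.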

Grantee Submission, Mar 1, 2017
In this study, we investigated the degree to which the cognitive processes in which students engage during reading comprehension could be examined through dynamical analyses of their natural language responses to texts. High school students (n = 142) generated typed self-explanations while reading a science text. They then completed a comprehension test that measured their comprehension at both surface and deep levels. The recurrent patterns of the words in students' self-explanations were first visualized in recurrence plots. These visualizations allowed us to qualitatively analyze the different self-explanation processes of skilled and less skilled readers. These recurrence plots then allowed us to calculate recurrence indices, which represented the properties of these temporal word patterns. Results of correlation and regression analyses revealed that these recurrence indices were significantly related to the students' comprehension scores at both surface and deep levels. Additionally, when combined with summative metrics of word use, these indices were able to account for 32% of the variance in students' overall text comprehension scores. Overall, our results suggest that recurrence quantification analysis can be utilized to guide both qualitative and quantitative assessments of students' comprehension.
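Word-level recurrence of the kind visualized in recurrence plots can be sketched minimally: a matrix marks every pair of positions where the same word recurs, and a simple recurrence rate summarizes the pattern. This is an illustrative reduction, not the study's actual recurrence quantification indices.

```python
def word_recurrence(text):
    """Build a word-level recurrence matrix and a recurrence rate.

    Cell (i, j) is 1 when words i and j are identical (ignoring case).
    Recurrence rate = proportion of off-diagonal cells that recur.
    Illustrative only; RQA derives richer indices (e.g., determinism,
    laminarity) from such plots.
    """
    words = text.lower().split()
    n = len(words)
    matrix = [[int(words[i] == words[j]) for j in range(n)] for i in range(n)]
    off_diag = n * n - n
    recurrent = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    rate = recurrent / off_diag if off_diag else 0.0
    return matrix, rate

# Toy self-explanation: "the" and "cell" each recur once.
_, rate = word_recurrence("the cell divides and the cell grows")
print(round(rate, 3))
```

Plotting the matrix (recurrent cells as dots) reproduces the recurrence-plot visualization; diagonal line structures in such plots indicate repeated word sequences.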
Grantee Submission, 2016
The opinions, findings, and conclusions or recommendations expressed are those of the authors and do not necessarily represent views of the IES or the ONR. We also thank Laura Allen for her helpful comments. Completed 2017. Jacovina, E. J., & McNamara, D. S. (in press). Intelligent tutoring systems for literacy: Existing technologies and continuing challenges. In R. Atkinson (Ed.), Intelligent tutoring systems: Structure, applications and challenges. Hauppauge, NY: Nova Science Publishers Inc. To be published with acknowledgment of federal support.

Springer eBooks, 2013
We present an evaluation of the Writing Pal (W-Pal) intelligent tutoring system (ITS) and the W-Pal automated writing evaluation (AWE) system through the use of computational indices related to text cohesion. Sixty-four students participated in this study. Each student was assigned to either the W-Pal ITS condition or the W-Pal AWE condition. The W-Pal ITS includes strategy instruction, game-based practice, and essay-based practice with automated feedback. In the ITS condition, students received strategy training and wrote and revised one essay in each of the 8 training sessions. In the AWE condition, students only interacted with the essay writing and feedback tools. These students wrote and revised two essays in each of the 8 sessions. Indices of local and global cohesion reported by the computational tools Coh-Metrix and the Writing Assessment Tool (WAT) were used to investigate pretest and posttest writing gains. For both the ITS and the AWE systems, training led to the increased use of global cohesion features in essay writing. This study demonstrates that automated indices of text cohesion can be used to evaluate the effects of ITSs and AWE systems and further demonstrates how text cohesion develops as a result of instruction, writing, and automated feedback.

Routledge eBooks, May 25, 2023
Literacy can be broadly defined as 'the ability to understand, evaluate, use, and engage with written texts to participate in society, to achieve one's goals, and to develop one's knowledge and potential' (OECD, 2013). Literacy skills are essential for students to succeed in educational contexts and in nearly all aspects of everyday life. However, improving these skills continues to be a challenging task in the United States. According to the 2019 National Assessment of Educational Progress, 27% of 8th grade students perform below the basic level of reading comprehension, and 66% do not reach proficient levels. Assessments of 12th graders indicate a similar pattern, with 30% and 63% of students not reaching basic and proficiency levels, respectively. Literacy assessments have been used extensively to help improve students' skills by providing targeted information about where students may struggle the most. In particular, accurate, valid, and reliable assessments of literacy skills are critical in order to provide students with opportunities for individualized instruction and practice, as well as timely feedback. However, traditional literacy assessments typically occur after students have finished reading a text or writing an essay, thus rendering the delivery of timely feedback nearly impossible. In contrast to traditional approaches to assessment, stealth assessment offers an innovative method to assess students' literacy skills during learning. This type of assessment is a type of user modeling wherein the assessment is seamlessly woven into game-based learning environments to assess students unobtrusively (Shute & Ventura, 2013). Specifically, the evaluation of students' abilities occurs during the learning activity, rather than at summative or 'checkpoint' assessments.
In addition, stealth assessments are not presented as 'quizzes' or 'tests', but instead are based on students' behaviors and performance during the tasks themselves. As such, stealth assessments can assess students' literacy dynamically and provide timely feedback throughout the learning process. In this chapter, we describe and analyze the feasibility of implementing stealth assessment in a game-based learning environment to evaluate students' literacy skills. We first describe literacy assessments and how natural language processing (NLP) techniques have been used to assess literacy. We then provide an overview of stealth assessment and its application in digital environments to assess students' knowledge and skills. We next summarize the Interactive Strategy Training for Active Reading and Thinking (iSTART), a game-based intelligent tutoring system (ITS) designed to help students improve their reading comprehension. We describe how NLP methods embedded in iSTART are used to assess students' skills and guide the adaptation of the system (e.g., providing individualized feedback and customizing learning paths). Finally, we report two preliminary analyses demonstrating how NLP can be used to develop stealth assessments of students' literacy skills in iSTART to guide the macro-level adaptivity of the system.

Proceedings of the Ninth ACM Conference on Learning @ Scale
This study presents the results of a computational discourse analysis of discussion threads within an online Math tutoring platform. This work is theoretically motivated by prior work that established the importance of linguistic and semantic features of discourse in mathematics education. The end goal of this study is to understand the characteristics of the language that is produced and used within a discussion board for math. The discussion board corpus comprises posts from 4,720 students, teachers, and study experts who interacted within an online teaching and learning tutoring platform for math. Linguistic profiles of the discussion board discourse were estimated using Principal Component Analysis (PCA) based on Coh-Metrix linguistic features related to cohesion, language sophistication, and lexical characteristics. The PCA yielded seven Math Discourse Linguistic Components, which collectively explained 49% of the variance in the dataset. Theoretical and conceptual validation of the components revealed that the linguistic features align with the communication goals and the nature of mathematics. The linguistic profiles that characterized the discussion board discourse included referential cohesion, information density, instructional language, lexical variation, compare and contrast devices, explicit relations devices, and syntactic complexity. The dominance of cohesive cues within the linguistic profiles demonstrates the communication goals within Math discourse, such as elaboration, providing instruction, comparing and contrasting, establishing explicit relations, and presenting information. As such, these components characterize the Math Discussion Board discourse in terms of variations in cohesive and task-oriented cues within communication among students.
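A PCA of the sort used to derive such linguistic components can be sketched with standardized features and a singular value decomposition. The feature matrix below is simulated; real input would be Coh-Metrix indices computed over discussion posts, and component interpretation would follow from the loadings.

```python
import numpy as np

def pca_components(X, k):
    """PCA via SVD on standardized (z-scored) features.

    X: rows = documents (e.g., discussion posts), cols = linguistic indices.
    Returns the top-k component loadings and the proportion of variance
    each component explains. Purely a sketch of the technique.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize each index
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)  # SVD of z-score matrix
    var_explained = S**2 / np.sum(S**2)               # variance per component
    return Vt[:k], var_explained[:k]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))      # simulated: 200 posts, 6 linguistic indices
loadings, var = pca_components(X, k=3)
print(loadings.shape, float(var.sum()))
```

In practice, components are retained until the cumulative variance explained is acceptable (49% across seven components in the study), and each is labeled by inspecting which indices load on it most strongly.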
Journal of Engineering and Computer Innovations, Jan 30, 2010
Writing-Pal is an intelligent tutoring system designed to offer high school students writing strategy instruction and guided practice to improve their essay-writing skills. Students are taught to use writing strategies via interactive lessons, games, and essay-writing practice. This paper presents an overview of Writing-Pal's foundations and design, which are based on key pedagogical, educational-technology, and design principles. These considerations are important for the efficacy of the system, as well as its stability and portability in diverse settings, such as the laboratory, classroom, or students' homes. We expect this paper to be of interest to educational developers, as well as other developers who may face similar goals and challenges.
Grantee Submission, 2016
Revising is an essential writing process, yet automated writing evaluation systems tend to give feedback on discrete essay drafts rather than changes across drafts. We explore the feasibility of automated revision detection and its potential to guide feedback. Relationships between revising behaviors and linguistic features of students' essays are discussed.

This study leverages natural language processing to assess dimensions of language and discourse in students' discussion board posts and comments within an online learning platform, Math Nation. This study focuses on 1,035 students whose aggregated posts included more than 100 words. Students' wall post discourse was assessed using two linguistic tools, Coh-Metrix and SEANCE, which report linguistic indices related to language sophistication, cohesion, and sentiment. A linear model including prior math scores (i.e., Mathematics Florida Standards Assessments), grade level, semantic overlap (i.e., LSA givenness), incidence of pronouns, and noun hypernymy accounted for 64.48% of the variance in the Algebra I end-of-course scores (RMSE=13.73). Students with stronger course outcomes used more sophisticated language, across a wider range of topics, and with less personalized language. Overall, this study confirms the contributions of language and communication skills over and above prior...
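A linear model of the kind reported (predicting course scores from prior scores and linguistic indices, with variance explained) can be sketched with ordinary least squares. The predictors and outcomes below are simulated stand-ins, not Math Nation data, and the coefficient structure is invented for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns coefficients and R^2.

    Predictor columns stand in for features like prior math scores, grade
    level, and linguistic indices; real analyses would also report RMSE
    and check model assumptions.
    """
    A = np.column_stack([np.ones(len(X)), X])        # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)     # least-squares solution
    resid = y - A @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return beta, r2

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 3))                        # 150 students, 3 predictors
y = 50 + X @ np.array([4.0, -2.0, 1.5]) + rng.normal(scale=5.0, size=150)
beta, r2 = fit_linear(X, y)
print(beta.shape, round(float(r2), 2))
```

R^2 here plays the role of the 64.48% variance-explained figure in the abstract: the proportion of outcome variance captured by the fitted predictors.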