Automated Writing Evaluators (AWEs) in L2 Writing

Later today, a guest post that I wrote for ELT Research Bites will go online. In that post, I discuss how the Global Masters in Social Work (gMSW) program at NYU Shanghai decided to make a premium version of Grammarly available to our students free of charge, and I summarize a recent study that looked at student attitudes toward automated writing evaluators (AWEs) like Grammarly. In today’s post on Applied Linguistics (Re)Coded, I’m going to reflect a little more on AWEs and their (possible) role in L2 writing.

First, allow me to say that I was entirely in support of the gMSW program’s decision to roll out Grammarly to its students, L1 and L2 English users alike. I should also note that I am an adjunct lecturer at the Silver School of Social Work in the gMSW program. However, I was also vocal in stating that we had to train students properly to integrate the AWE into their writing practice in a critical way. Far too often, students view AWEs as mere checking tools (see Cavaleri & Dianati, 2016; Reis & Huijser, 2016). Students have also been known to click, click, click through the suggested changes, accepting them with little thought as to how the recommended alterations might affect meaning and style in their texts (ibid.). Both of these practices drastically reduce the potential efficacy of AWEs, first as learning tools and second as writing improvement tools. So, there are a couple of things to keep in mind when scaffolding students’ use of AWEs.

For L1 and L2 writers alike, we need to highlight the fact that many advanced AWEs, such as Grammarly, PaperRater, or After the Deadline, aren’t just checking tools; they are also great learning tools. One of the reasons that I’m such a fan of Grammarly is that I can hover over a suggestion to learn more about why the AWE flagged it as a potential error. I can then see examples of other sentences with similar mistakes and reflect on what I’ve written to decide whether or not it’s actually an error that needs to be addressed. Also, every week Grammarly sends me an email that lists my top five errors (e.g., faulty parallelism, improper article use), which I can use to identify patterns of error in my writing. After identifying a pattern, I can actively hunt for it in future writing. Both of these features are great tools for learning and not just for checking.

The second issue, however, is a much harder one to deal with. Novice writers tend to see AWEs as a silver bullet to fix their writing. I tell my students the same thing I tell early-service educators and researchers: “There’s no such thing as a silver bullet, so quit wasting your time.” The only way to improve our writing, our practice, or our research is to invest the time needed to nurture it. This means critically reflecting on what we’re doing, the tools we’re using, and the successes and failures we’re encountering. Students must be taught to use AWEs critically. AWEs are, despite how advanced they’ve become, pretty stupid things that can only check for what they’ve been programmed to monitor. And, even then, they’re only as good as the code that makes them run and the databases and corpora that serve as their foundations. So, you must look into every suggestion that they make before deciding how to revise. In some cases, an AWE will make recommendations that actually change the intended meaning of a sentence or that introduce new errors into the text. Likewise, there are some things that the checker simply will not catch. So, it’ll never replace an additional set of human eyes.

I’ve been using the premium version of Grammarly for about a year, and I will likely fork over the $140 for another year. Since I started using it, I’ve become more confident in my writing, and I’ve added an additional (and necessary) level of revision to my work. I actually got a compliment on my writing and style from an editor for an article that I submitted to their journal (an article that was also my first “accept with minor revisions”). But that doesn’t mean that errors don’t still creep into the writing. Clicking through the AWE doesn’t solve all the problems. It helps, for sure. But you must always go through multiple rounds of revision and review. And you can never replace a second set of eyes.

This post has been edited with the help of Grammarly, and I’m sure you still might find a floating error here or there that I missed because I was in a hurry to run downstairs for another cup of Starbucks.

source material

Cavaleri, M., & Dianati, S. (2016). You want me to check your grammar again? The usefulness of an online grammar checker as perceived by students. Journal of Academic Language & Learning, 10(1), 223-236.

Reis, C., & Huijser, H. (2016). Correcting tool or learning tool? Student perceptions of an online essay writing support tool at Xi’an Jiaotong-Liverpool University. Show Me the Learning. Adelaide, AU: ASCILITE.

Personal PD (cont.): Mobile Learning in the Japanese Context

Obari, H., Goda, Y., Shimoyama, Y. & Kimura, M. (2010). Mobile technologies and language learning in Japan: Learn anywhere, anytime. In S. Levy, F. Blin, C.B. Siskin, and O. Takeuchi (eds.). WorldCALL: International Perspectives on Language Learning (pp. 38-54). New York: Routledge.


In this chapter, the authors presented the material conditions for mobile Computer-assisted Language Learning (mCALL) in the Japanese context and explored the efficacy of mCALL implementations for English language learning. They showed that a majority of college-aged students at Japanese universities owned a mobile device (~94%) and that, of these students, over 60% preferred to use their mobile devices for language learning activities, ranging from lexicon expansion to listening comprehension. Based on some small-scale exploratory studies, the authors found that all students except liberal arts students showed marked gains in linguistic skills after experiencing mCALL interventions.


The biggest takeaway from this piece was that CALL interventions should meet students where they are. Increasingly, in Asian contexts at least, this is on mobile devices while on the metro, while queued up to buy food, or even while watching TV. The prevalence of mobile devices in our students’ lives has substantial implications for how we design and implement our CALL tasks, as not every website will be immediately mobile friendly, nor will every file type be equally accessible on mobile devices. This point is critical when we consider the ubiquity of the high-powered smart devices that our students use. That said, the chapter had many flaws that made it difficult to follow and that decreased its potential impact. Most critically, it tried to do too much. The authors summarized five different studies that they carried out but didn’t go into enough detail for the summaries to be useful. They also presented some charts and figures that were never fully explained, which made it difficult to see their relevance to the argument that the authors were trying to advance.

Personal PD (cont.): Blended Learning in Higher Ed Language Education

Now that I’m back in Shanghai, it’s time to get back to work on my TESOL Advanced Practitioner’s certificate, AKA this year’s PD project. As promised, I’m going to continue “live” blogging my way through by sharing the outputs of each stage of the certificate program. First, I’m compiling an annotated bibliography of readings in the area in which I wish to grow: in this case, Computer-assisted Language Learning, or CALL. Below, you’ll find the next entry. If you want to see other entries in this series, use the search box to find titles that contain “Personal PD.” Enjoy!

Ticheler, N., & Sachdev, I. (2010). Blended learning, empowerment, and world languages in higher education: The Flexi-Pack Project for “languages of the wider world.” In S. Levy, F. Blin, C.B. Siskin, and O. Takeuchi (eds.). WorldCALL: International Perspectives on Language Learning (pp. 163-171). New York: Routledge.


In this chapter, the authors explored the creation and adoption of flexi-packs, digitally delivered language learning support lessons that depended on multimodal resource delivery, and their impact on student and teacher motivation. They argued that flexi-packs allowed educators to create a blended learning environment by deploying tools that met students in their time of need and gave students control over how they used the support resources. It should be noted that the flexi-pack content and structure were always tied to specific pedagogical imperatives in the live classes for which they were created. The authors went on to outline what should go into an effective flexi-pack module: clear learning objectives, multimodal and authentic resources, review material, and so on. They concluded by describing how flexi-packs influenced students and teachers, reporting positive results from both populations and marked positive impacts on learner motivation.


This chapter introduces the notion of flexi-packs as a way to give students greater control over their learning. The authors argue that doing so will increase students’ motivation to engage with language learning. However, I fear that they missed the mark here. It’s not just about control over learning. What needs to happen with all CALL interventions is that students are guided toward taking increased reflective agency over their learning. That means structuring lessons and materials in such a way that students come to understand how their eco-social environments prime them to make individual choices and how this may impact learning. Students need help to become critically aware of the tools that they use to support their language learning/acquisition endeavors and of how these tools will feed back into their linguistic performances in their eco-social worlds.

While I appreciate that the authors firmly grounded the creation of flexi-packs in the pedagogical imperatives of the classes that they support, I can’t help but notice the overly optimistic tone that they take toward their subject. I’m all for technological supports for language learning (I use many myself), but there is little talk about how flexi-packs can move beyond the institution to support language learners worldwide, which would allow universities to provide a free service to their increasingly global constituents. Nor does this chapter adequately discuss the shortcomings of flexi-packs. For example, from my work on the Purdue Online Writing Lab, I know that students make use of these resources and feel that they have learned something, but time and again they misapply what they learned in ways that produce errors. This issue is one that needs to be considered more fully in CALL research.

Personal PD (cont.): Design and Computer-assisted Language Learning (CALL)

Levy, S., & Stockwell, G. (2006). Chapter 2 Design. In S. Levy and G. Stockwell. CALL Dimensions: Options and Issues in Computer Assisted Language Learning (pp. 10-39). New York: Routledge.


In this chapter, Levy and Stockwell (2006) discussed the multiple layers that make up design considerations in computer-assisted language learning (CALL) environments. They began by discussing the role of task factors in CALL design, providing different models of task effects that may impact design-related decisions. They then examined how curriculum and syllabus planning may influence CALL design, depending on where in the curriculum/course planning process CALL is introduced. While CALL may be easier to incorporate early in the process, adding CALL interventions to an established course may allow the instructor to find novel solutions to pre-existing inefficiencies. Finally, the authors discussed the three perspectives on CALL design problems that must be kept in mind: the teacher, the learner, and the institution. They maintained that each of these stakeholders introduced unique but equally important considerations into the equation, whether increased prep time, ease of accessing materials, or fiscal constraints.


Levy and Stockwell (2006) provide an acceptable overview of CALL design issues. The chapter’s multifaceted approach highlights the complexity of developing effective CALL interventions. This doesn’t mean that developing them is not worth doing, just that it mustn’t be done willy-nilly or for the sake of having CALL on your CV or in your class. Rather, an intervention must be purposefully considered from the perspective of the educator, the learner, and the institution before being implemented. And, as has shown up time and again, CALL interventions must be directly tied to the learning needs of the students. While this chapter is at times hard to follow because of how complex the issue is, it does provide a good introduction to the topic for novices. That said, I can easily see the need for a book-length treatment of design in language teaching.

Personal PD (cont.): Qualitative and Quantitative Methods for CALL Research

Leaky, J. (2011c). Chapter 5: A model for evaluating CALL part 2: Quantitative and qualitative measures. In J. Leaky. Evaluating Computer-assisted Language Learning: An integrated approach to Effectiveness Research in CALL (pp. 115-132). New York: Peter Lang.


In this chapter, Leaky (2011c) continued outlining his evaluation paradigm by examining how qualitative (judgmental) and quantitative (evaluative) approaches can be used to evaluate the efficacy of computer-assisted language learning (CALL) interventions. He provided nine principles to guide CALL research, which included controlling for confounding variables, doing a literature review, using random sampling, and providing a transparent methodology section (among others). He then provided a research design checklist to guide instrument design. This multipoint checklist covered matters of sampling (N size); integration of program, pedagogy, and platform; the conditions of the study (e.g., activity type, variables); and the quantitative instruments to be used. Qualitative methods and measures were absent from this methodological checklist.


In this chapter, Leaky continues his combative approach to building his own framework. This attack on the work that has come before not only sets a disappointing tone but also introduces weakness into his arguments. He continues to conflate qualitative research with mere subjective storytelling by holding qualitative research to the same validity/reliability measures as quantitative research and presenting a 2D view of qualitative research. His tone throughout the chapter continues to alienate the reader, at least this reader. And, again, the most useful elements of the section lie in poorly supported graphical elements, namely his checklist for designing CALL efficacy studies. So, if you want to skip the entitled dross, move to pages 125-126 and update the list to be actually mixed-method by considering how robust qualitative research can enrich the quantitative data, something Leaky (2011c) continues to fail to do.