Incorporating Learning Characteristics into Automatic Essay Scoring Models: What Individual Differences and Linguistic Features Tell Us about Writing Quality
This study investigates a novel approach to automatically assessing essay quality that combines natural language processing measures of text features with measures of individual differences among writers, such as demographic information, standardized test scores, and survey results. The results demonstrate that combining text features and individual differences yields more accurate automatically assigned essay scores than using either source alone. These findings have important implications for writing educators because they reveal that essay scoring methods can benefit from features taken not only from the essay itself (e.g., lexical and syntactic complexity) but also from the writer (e.g., vocabulary knowledge and writing attitudes). They also have implications for educational data mining researchers because they demonstrate new natural language processing approaches that afford the automatic assessment of performance outcomes.
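The comparison described in the abstract (text features alone, individual differences alone, and the two combined) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the feature names and weights are hypothetical, and ordinary least squares stands in for whatever scoring model the study actually used. Because in-sample R² can never decrease when predictors are added to a least-squares model, the sketch shows only the mechanics of combining feature sets; a real evaluation would compare the models on held-out essays.

```python
import random

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (intercept added)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented system [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * t for x, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def r_squared(rows, y, coef):
    """Proportion of score variance explained by the fitted model."""
    preds = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

# Synthetic data (assumption): two hypothetical text features and two
# hypothetical individual-difference measures per essay.
rng = random.Random(0)
text, indiv, scores = [], [], []
for _ in range(200):
    t = [rng.gauss(0, 1), rng.gauss(0, 1)]   # e.g., lexical, syntactic complexity
    w = [rng.gauss(0, 1), rng.gauss(0, 1)]   # e.g., vocabulary, writing attitude
    s = 3.0 + 0.6 * t[0] + 0.4 * t[1] + 0.5 * w[0] + 0.3 * w[1] + rng.gauss(0, 0.5)
    text.append(t)
    indiv.append(w)
    scores.append(s)

combined = [t + w for t, w in zip(text, indiv)]
feature_sets = {"text": text, "individual": indiv, "combined": combined}
r2 = {name: r_squared(rows, scores, fit_ols(rows, scores))
      for name, rows in feature_sets.items()}

for name, value in r2.items():
    print(f"{name:>10}: R^2 = {value:.3f}")
```

In this sketch the combined model explains more score variance than either feature set alone only because the synthetic scores were generated from both kinds of features; the size of any such gain on real essays is an empirical question that the study itself addresses.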
Keywords: automated essay scoring, natural language processing, individual differences, intelligent tutoring systems, writing quality