Annotating argumentative structure in English-as-a-Foreign-Language learner essays
Abstract Argument mining (AM) aims to explain how individual argumentative discourse units (e.g. sentences or clauses) relate to each other and what roles they play in the overall argumentation. The automatic recognition of argumentative structure is attractive as it benefits various downstream tasks, such as text assessment, text generation, text improvement, and summarization. Existing studies focused on analyzing well-written texts provided by proficient authors. However, most English speakers in the world are non-native, and their texts are often poorly structured, particularly if they are still in the learning phase. Yet, there is no specific prior study on argumentative structure in non-native texts. In this article, we present the first corpus containing argumentative structure annotation for English-as-a-foreign-language (EFL) essays, together with a specially designed annotation scheme. The annotated corpus resulting from this work is called “ICNALE-AS” and contains 434 essays written by EFL learners from various Asian countries. The corpus presented here is particularly useful for the education domain. On the basis of the analysis of argumentation-related problems in EFL essays, educators can formulate ways to improve them so that they more closely resemble native-level productions. Our argument annotation scheme is demonstrably stable, achieving good inter-annotator agreement and near-perfect intra-annotator agreement. We also propose a set of novel document-level agreement metrics that are able to quantify structural agreement from various argumentation aspects, thus providing a more holistic analysis of the quality of the argumentative structure annotation. The metrics are evaluated in a crowd-sourced meta-evaluation experiment, achieving moderate to good correlation with human judgments.