Corpus linguistics and the study of literature
The present paper introduces corpus-based analytical techniques and surveys some of the specific ways in which corpus analysis has been applied to the study of literature. In recent years, these research efforts have mostly been carried out under the umbrella of ‘corpus stylistics’. Most of these studies focus on the distribution of words (analyzing keywords, extended lexical phrases, or collocations) to identify textual features that are especially characteristic of an author or a particular text. Corpus-based grammatical and pragmatic analyses of literary language are also briefly considered. Then, in the concluding part of the paper, I survey earlier computational and statistical research on authorship attribution and literary style. While that research tradition is in some ways the precursor to more recent work in corpus stylistics, it also complements that work through its application of sophisticated statistical and computational methods.