Corpus-informed test development: Making it about more than word frequency

Article appearing in Shiken 18.1 (August 2014) pp. 3-9.

Authors: Jonathan W. Trace¹ & Gerriet Janssen²
1. University of Hawai'i at Manoa
2. Universidad de los Andes-Colombia

Given the rising popularity and usefulness of corpora in applied linguistics, there is a growing need to identify practical applications of corpus tools that go beyond word frequency. Second language assessment is one area where corpora seem ideally suited to this purpose. This study examines corpus-informed test items on an academic English vocabulary test (N = 203). Two formats of the test (c-test and multiple-choice) are analyzed to explore possible relationships between item difficulty and contextual information. First, Rasch measurement is used to estimate the difficulty of a set of items common to both formats. These difficulty estimates are then compared with a series of mutual information scores based on collocations and multi-word constructions involving the target items. The goal is to examine possible relationships between context and item difficulty and, more importantly, to provide teachers and test designers with one way to use corpus linguistics to create more effective language assessment tools.

Keywords: corpus linguistics, language testing, formulaic language, vocabulary
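
For readers unfamiliar with the mutual information scores mentioned in the abstract, the following is a minimal sketch of how such a score is often computed for a word pair in corpus linguistics: the base-2 logarithm of the observed co-occurrence count divided by the count expected if the two words were independent. The abstract does not state which formula, window size, or corpus the authors used, so the function name, its parameters, and the counts below are assumptions made purely for illustration.

```python
import math


def mutual_information(pair_count, word1_count, word2_count, corpus_size):
    """Return a pointwise mutual information score (log base 2) for a word pair.

    This is one common corpus-linguistics formulation, assumed here for
    illustration; it is not necessarily the one used in the article.
    """
    if min(pair_count, word1_count, word2_count, corpus_size) <= 0:
        raise ValueError("all counts must be positive")
    # Co-occurrences expected if the two words were distributed independently.
    expected = word1_count * word2_count / corpus_size
    # Observed / expected > 1 yields a positive score (the words attract).
    return math.log2(pair_count / expected)


# Invented counts, not taken from the study.
score = mutual_information(
    pair_count=30, word1_count=1000, word2_count=600, corpus_size=1_000_000
)
print(f"MI score: {score:.2f}")
```

A higher score suggests that the target word and its neighbor occur together more often than chance would predict, which is the kind of contextual information the study relates to item difficulty.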
