Article appearing in Shiken 16.2 (Nov 2012) pp. 8-14.
Authors: Jeffrey Stewart1, Aaron Gibson2 & Luke Fryer3
1. Kyushu Sangyo University, Cardiff University
2. Kyushu Sangyo University
3. Kyushu Sangyo University
Abstract:
Unlike classical test theory (CTT), where estimates of reliability are assumed to apply to all members of a population, item response theory (IRT) provides a theoretical framework under which reliability can vary by test score. However, different IRT models can result in very different interpretations of reliability, as models that account for item quality (slopes) and the probability of a correct guess significantly alter estimates. This is illustrated by fitting a TOEIC Bridge practice test to 1-parameter (Rasch) and 3-parameter logistic models and comparing the results. Under the Bayesian Information Criterion (BIC), the 3-parameter model provided superior fit. The implications of this are discussed.
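To make the model contrast in the abstract concrete, the following is a minimal sketch (not code from the article) of the 3-parameter logistic item response function, P(θ) = c + (1 − c) / (1 + exp(−a(θ − b))), where a is the item slope (discrimination), b the difficulty, and c the lower asymptote (guessing parameter). The Rasch (1-parameter) model is the special case with a fixed at 1 and c at 0.

```python
import math

def irt_prob(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability
    a: item slope (discrimination); fixed at 1.0 under the Rasch model
    b: item difficulty
    c: pseudo-guessing parameter; fixed at 0.0 under the Rasch model
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Rasch case: an examinee at the item's difficulty has a 0.5 probability
print(irt_prob(theta=0.0, b=0.0))          # 0.5

# 3PL case: a nonzero guessing parameter raises the floor of the curve,
# so the same examinee's probability is 0.25 + 0.75 * 0.5 = 0.625
print(irt_prob(theta=0.0, b=0.0, c=0.25))  # 0.625
```

Because c raises the lower asymptote above zero, low-ability examinees retain a nontrivial chance of answering correctly, which is one reason the two models can yield different reliability estimates for the same test.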