Evaluating the construct validity of an EFL test for PhD candidates:
". . . the greater a test's social impact is, the higher its need for validity."
Construct validation is the process of gathering evidence to support the contention that a given test indeed measures the psychological construct the makers intend it to measure. The goal is to determine the meaning of scores from the test, to assure that the scores mean what we expect them to mean (1991, p. 108).
[ p. 2 ]
One way of assessing the construct validity of a test is to correlate its different test components (Alderson, Clapham & Wall, 2000, pp. 183-184). Using quantitative methods, this paper focuses on correlated item analyses of the 1986 and 2005 versions of the CASEEEDC to see how the test has changed.
[ p. 3 ]
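As a rough illustration of what correlating test components involves, the sketch below computes a Pearson correlation between two sets of subtest scores, along with the t statistic commonly used to test whether that correlation differs from zero. The score lists are hypothetical and are not data from this study; the function names are likewise illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical subtest scores for five test takers (not from the study).
structure = [12, 15, 9, 20, 17]
cloze = [8, 11, 7, 15, 10]

r = pearson_r(structure, cloze)
print(f"r = {r:.3f}, t = {t_statistic(r, len(structure)):.2f}")
```

The two-tailed significance values reported alongside each coefficient would then come from comparing t against a t distribution with n - 2 degrees of freedom, which a statistics package normally handles.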
Section (1986 test) | Sub-Section | Item types | Points | Item number (k) | Time |
I. Structure | A | MCQ | 13 | 13 | 20 minutes |
I. Structure | B | MCQ | 12 | 12 | |
II. Vocabulary | A | MCQ | 10 | 10 | 25 minutes |
II. Vocabulary | B | MCQ | 8 | 8 | |
II. Vocabulary | C | blank filling | 7 | 7 | |
III. Cloze | | blank filling | 20 | 20 | 20 minutes |
IV. Reading | | MCQ | 30 | 30 | 50 minutes |
V. Writing | | extended essay | 20 | 1 | 35 minutes |
TOTAL: | | | 120 | 101 | 150 minutes |
Section (2005 test) | Sub-Section | Item types | Points | Item number (k) | Time |
I. Listening | | MCQ | 20 | 20 | 20 minutes |
II. Vocabulary | | MCQ | 10 | 20 | 15 minutes |
III. Cloze | | MCQ | 15 | 15 | 15 minutes |
IV. Reading | | MCQ | 30 | 30 | 60 minutes |
V. Translation | | Sentence level translations | 10 | 5 | 30 minutes |
VI. Writing | | extended essay | 15 | 1 | 40 minutes |
TOTAL: | | | 100 | 91 | 180 minutes |
[ p. 4 ]
Statistic | 1986 | 2005 |
Overall correct answer rate | 49% | 69.6% |
High score | 78.5% | 85.5% |
Low score | 44.5% | 48.0% |
Mode | 55.5 | 75.0 |
Median | 59.0 | 70.5 |
Mean | 58.8 | 69.6 |
Standard deviation | 10.58 | 7.40 |
Range | 34 points | 37.5 points |
Variance | 111.9 | 54.8 |
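The descriptive statistics above are internally consistent: the variance is the square of the standard deviation (10.58² ≈ 111.9 and 7.40² ≈ 54.8), and the range is the high score minus the low score (78.5 − 44.5 = 34 and 85.5 − 48.0 = 37.5). A minimal sketch of how such a summary might be computed follows. The scores are hypothetical, and the use of the sample (n − 1) formulas is an assumption, since the source does not say which formula was used.

```python
import statistics

def describe(scores):
    """Summary statistics of the kind reported for a set of total scores."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "range": max(scores) - min(scores),
        # Sample (n - 1) formulas: an assumption, not confirmed by the source.
        "sd": statistics.stdev(scores),
        "variance": statistics.variance(scores),
    }

# Hypothetical total scores (not from the study).
scores = [50, 60, 60, 70]
summary = describe(scores)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```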
[ p. 5 ]
Section (1986 test) | Items with acceptable discrimination |
I. Structure (25 items total) | Q11, Q12, Q25 (3 items acceptable) |
II. Vocabulary (25 items total) | Q43, Q46 (2 items acceptable) |
III. Cloze (20 items total) | Q58, Q60, Q63, Q65, Q66, Q68, Q69 (7 items acceptable) |
IV. Reading (30 items total) | Q93, Q94, Q96, Q98 (4 items acceptable) |
V. Writing (a single extended item) | (that item did not appear to discriminate well) |
Section (2005 test) | Items with acceptable discrimination |
I. Listening (20 items total) | (no acceptable items) |
II. Vocabulary (20 items total) | Q35, Q39 (2 acceptable items) |
III. Cloze (15 items total) | Q41 (1 acceptable item) |
IV. Reading (30 items total) | (no acceptable items) |
V. Translation (5 items total) | (no acceptable items) |
VI. Writing (a single extended item) | (no acceptable items) |
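The listings above sort items by how well they discriminate, but the criterion used is not shown here. One common way to quantify discrimination is the upper-lower index: the proportion of high scorers who answered an item correctly minus the proportion of low scorers who did. The sketch below assumes that approach, with the top and bottom thirds as groups. The function name, the group fraction, and the data are all hypothetical and should not be read as the method used in this study.

```python
def discrimination_index(item_correct, total_scores, fraction=1 / 3):
    """Upper-lower discrimination index for one dichotomously scored item.

    item_correct: 1 (correct) or 0 (incorrect) for each test taker.
    total_scores: total test score for each test taker, in the same order.
    fraction: share of test takers in each comparison group (an assumption).
    """
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    # Rank test takers from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    low_group = order[:group_size]
    high_group = order[-group_size:]
    p_high = sum(item_correct[i] for i in high_group) / group_size
    p_low = sum(item_correct[i] for i in low_group) / group_size
    return p_high - p_low

# Hypothetical responses from six test takers (not from the study).
totals = [90, 85, 80, 50, 45, 40]
item_a = [1, 1, 1, 0, 0, 0]  # only the stronger test takers answer correctly
item_b = [1, 0, 1, 1, 0, 1]  # correctness is unrelated to total score

print(discrimination_index(item_a, totals))
print(discrimination_index(item_b, totals))
```

An index near 1 means the item separates strong from weak test takers, while a value near zero or below means it does not. Any cutoff for calling an item "acceptable" would be a further assumption.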
Listwise correlations (n = 66) |
Subtest | Statistic | Total score | I. Structure | II. Vocabulary | III. Cloze | IV. Reading | V. Writing |
Total score | Pearson Correlation | 1 | .387** | .326** | .522** | .337** | .339** |
Total score | Sig. (2-tailed) | | .001 | .008 | .000 | .006 | .005 |
I. Structure | Pearson Correlation | .387** | 1 | .288* | .293* | .080 | .217 |
I. Structure | Sig. (2-tailed) | .001 | | .019 | .017 | .522 | .080 |
II. Vocabulary | Pearson Correlation | .326** | .288* | 1 | -.039 | .176 | .186 |
II. Vocabulary | Sig. (2-tailed) | .008 | .019 | | .755 | .157 | .136 |
III. Cloze | Pearson Correlation | .522** | .293* | -.039 | 1 | .088 | .306* |
III. Cloze | Sig. (2-tailed) | .000 | .017 | .755 | | .481 | .012 |
IV. Reading | Pearson Correlation | .337** | .080 | .176 | .088 | 1 | .093 |
IV. Reading | Sig. (2-tailed) | .006 | .522 | .157 | .481 | | .455 |
V. Writing | Pearson Correlation | .339** | .217 | .186 | .306* | .093 | 1 |
V. Writing | Sig. (2-tailed) | .005 | .080 | .136 | .012 | .455 | |
[ p. 6 ]
Table 7. Correlation coefficients of the total score of the 2005 test with each subtest and the various subtests with each other
Listwise correlations (n = 66) |
Subtest | Statistic | Total score | I. Listening | II. Vocabulary | III. Cloze | IV. Reading | V. Translation | VI. Writing |
Total score | Pearson Correlation | 1 | .487** | .535** | .464** | .627** | .542** | .548** |
Total score | Sig. (2-tailed) | | .000 | .000 | .000 | .000 | .000 | .000 |
I. Listening | Pearson Correlation | .487** | 1 | -.073 | .143 | -.026 | .071 | .430** |
I. Listening | Sig. (2-tailed) | .000 | | .561 | .253 | .839 | .570 | .000 |
II. Vocabulary | Pearson Correlation | .535** | -.073 | 1 | .027 | .285* | .341** | .104 |
II. Vocabulary | Sig. (2-tailed) | .000 | .561 | | .831 | .021 | .005 | .405 |
III. Cloze | Pearson Correlation | .464** | .143 | .027 | 1 | .039 | .093 | .293* |
III. Cloze | Sig. (2-tailed) | .000 | .253 | .831 | | .756 | .456 | .017 |
IV. Reading | Pearson Correlation | .627** | -.026 | .285* | .039 | 1 | .281* | .188 |
IV. Reading | Sig. (2-tailed) | .000 | .839 | .021 | .756 | | .022 | .131 |
V. Translation | Pearson Correlation | .542** | .071 | .341** | .093 | .281* | 1 | .097 |
V. Translation | Sig. (2-tailed) | .000 | .570 | .005 | .456 | .022 | | .439 |
VI. Writing | Pearson Correlation | .548** | .430** | .104 | .293* | .188 | .097 | 1 |
VI. Writing | Sig. (2-tailed) | .000 | .000 | .405 | .017 | .131 | .439 | |
[ p. 7 ]
1986 measure | 2005 measure | Pearson Correlation | Sig. (2-tailed) |
1986 Total score | 2005 Total score | .316** | .010 |
1986 Vocabulary Section | 2005 Vocabulary Section | .167 | .180 |
1986 Cloze Section | 2005 Cloze Section | .146 | .243 |
1986 Reading Section | 2005 Reading Section | .059 | .638 |
1986 Writing Section | 2005 Writing Section | .357** | .003 |
NOTE: 1 = "very easy" and 5 = "very difficult" for Qs 1-4; 1 = "strongly disagree" and 5 = "strongly agree" for Qs 5-9.
Survey Item | Number of responses | Mean | Std. Deviation |
Q1 How difficult was the Structure Section of this test? | 62 | 3.31 | .841 |
Q2 How difficult was the Vocabulary Section of this test? | 62 | 3.71 | .876 |
Q3 How difficult was the Cloze Section of this test? | 62 | 3.52 | .911 |
Q4 How difficult was the Reading Section of this test? | 62 | 3.66 | 1.007 |
Q5 "The Structure Section reflects my English proficiency." | 61 | 3.43 | .991 |
Q6 "The Vocabulary Section reflects my English proficiency." | 61 | 3.28 | .985 |
Q7 "The Cloze Section reflects my English proficiency." | 61 | 3.54 | .886 |
Q8 "The Reading Section reflects my English proficiency." | 61 | 3.62 | 1.051 |
Q9 "The Writing Section reflects my English proficiency." | 61 | 3.70 | 1.025 |
[ p. 8 ]
Table 10. Survey responses for the 2005 CASEEEDC test
NOTE: 1 = "very easy" and 5 = "very difficult" for Qs 1-5; 1 = "strongly disagree" and 5 = "strongly agree" for Qs 6-11.
Survey Item | Number of responses | Mean | Std. Deviation |
Q1 How difficult was the Listening Section of this test? | 66 | 2.77 | .819 |
Q2 How difficult was the Vocabulary Section of this test? | 66 | 3.62 | .873 |
Q3 How difficult was the Cloze Section of this test? | 66 | 3.38 | .651 |
Q4 How difficult was the Reading Section of this test? | 66 | 3.65 | .774 |
Q5 "The Structure Section reflects my English proficiency." | 65 | 3.31 | .828 |
Q6 "The Listening Section reflects my English proficiency." | 65 | 3.73 | .833 |
Q7 "The Vocabulary Section reflects my English proficiency." | 66 | 3.39 | .926 |
Q8 "The Cloze Section reflects my English proficiency." | 66 | 3.68 | .844 |
Q9 "The Reading Section reflects my English proficiency." | 66 | 3.71 | .827 |
Q10 "The Translation Section reflects my English proficiency." | 63 | 3.62 | .831 |
Q11 "The Writing Section reflects my English proficiency." | 65 | 3.63 | .875 |
"The fact that the Listening section of the 2005 test correlated negatively with the Reading and Vocabulary sections should raise the eyebrows of any researcher."
[ p. 9 ]
Acknowledgements
We would like to thank Professor Li Xiaodi for providing the 1986 version of the CASEEEDC. We also thank the two classes of M.S. students at the Chinese Academy of Sciences who took both versions of the CASEEEDC and completed the surveys that informed this study.
[ p. 10 ]
References
[ p. 11 ]