JALT Testing & Evaluation SIG Newsletter Vol. 1 No. 2 Sep. 1997, (p. 2 - 13)

An Overview of ACTFL Proficiency Interviews (Part 2)



The ACTFL OPI uses a single global rating of oral proficiency that has no basis in current linguistic theory or empirical research in language testing. To validate a test like the OPI, we must begin with definitions of language ability and testing procedures that give direction for test design and hypotheses about test performance. A high rating of grammatical competence based on an OPI is a highly reliable indicator of a very narrow ability: a subject's skill in using grammatical structures accurately in the contexts and under the conditions included in the testing procedure. Thus the ACTFL OPI, as currently designed, is not a valid measure of communicative language ability.
The OPI claims to be based on procedures developed over years of practice and governmental experience, but is this experience evidence of validity? Governmental experience with the OPI only provides (1) evidence that OPI findings can predict something about an individual in a government position, and (2) circumstantial evidence that ratings measure some ability of interest to the government's language schools.
The question is whether this type of predictability is relevant to the academic language teaching community and to second language acquisition researchers.
To its credit, the proficiency movement has infused new interest into the field of foreign language teaching. However, because of the litigious society in which we live, ACTFL must be prepared to show that a given rating is a valid indicator of proficiency in speaking and valid for the purpose of making decisions about minimum competency or employability.

Construct Validity

In 1990, Dandonoli (ACTFL) and Henning (Educational Testing Service) reported the results of research conducted by ACTFL on the construct validity of the ACTFL Proficiency Guidelines and the OPI. This was one of the few studies to examine the validity of the ACTFL OPI using statistical procedures. A multitrait-multimethod validation study was the basis of the research design and analysis, which included speaking, writing, listening, and reading tests in English as a Second Language and in French. The authors acknowledged that the validation study was undertaken in answer to criticisms by several authors, including Bachman and Savignon, and Lantolf and Frawley. For purposes of the data analyses, proficiency ratings (an ordinal scale) were converted to numerical scores (an interval scale) as follows:



       Proficiency Rating    Numerical Equivalent
         Novice Low                       0.1      
         Novice Mid                       0.3      
         Novice High                      0.8      
         Intermediate Low                 1.1      
         Intermediate Mid                 1.3      
         Intermediate High                1.8      
         Advanced                         2.3      
         Advanced High                    2.8      
         Superior                         3.3

[ p. 9 ]

No rationale was given either for converting the ordinal scale to an interval scale or for the fractional distances between the intervals. There were fewer proficiency levels in the table for listening and reading than for speaking and writing because passages were purposely selected at only five levels, the 'major borders' in listening and reading. In French, the correlation matrix revealed that speaking, writing, and reading showed discriminant validity in all required comparisons, but listening met only three of the required twelve comparisons. In ESL, speaking and reading showed convergent/discriminant construct validity in all thirteen required comparisons, whereas writing exhibited such validity in only eleven of thirteen comparisons and listening in ten of thirteen; in French, listening met only four of the required thirteen comparisons. The calibration figures indicated generally adequate progression in the right direction on the latent ability and difficulty continua associated with the descriptors in the Guidelines, but in English there were three exceptions: (1) ratings of Novice High speaking tended to be higher on the scale than ratings of Intermediate Low speaking; (2) ratings of Novice High writing tended to be higher than ratings of Intermediate Low writing; and (3) items related to Intermediate listening passages tended to have lower mean difficulty than those related to Novice listening passages. In French there were two exceptions: (1) Novice High writing tended to be higher than Intermediate Low writing, and (2) items related to Intermediate reading passages tended to have higher mean difficulty than those related to Advanced reading passages.
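To make the conversion and the counting of 'required comparisons' concrete, the following minimal Python sketch restates the numerical equivalents from the table above and illustrates one simplified Campbell-Fiske style comparison of the kind a multitrait-multimethod analysis depends on. The sample ratings, coefficient values, and function names are invented for illustration and are not taken from Dandonoli and Henning's data.

# Numerical equivalents as reported in the table above.
RATING_TO_SCORE = {
    "Novice Low": 0.1, "Novice Mid": 0.3, "Novice High": 0.8,
    "Intermediate Low": 1.1, "Intermediate Mid": 1.3, "Intermediate High": 1.8,
    "Advanced": 2.3, "Advanced High": 2.8, "Superior": 3.3,
}

def to_scores(ratings):
    """Convert ordinal OPI ratings to the numerical equivalents above."""
    return [RATING_TO_SCORE[r] for r in ratings]

def comparisons_met(validity_coefficient, rival_correlations):
    """Count how many required Campbell-Fiske comparisons are satisfied.

    A validity coefficient (same trait measured by different methods) is
    required to exceed each rival correlation it is compared against.
    """
    return sum(validity_coefficient > r for r in rival_correlations)

# Hypothetical ratings for five examinees.
ratings = ["Novice High", "Intermediate Low", "Intermediate Mid",
           "Advanced", "Superior"]
print(to_scores(ratings))                      # [0.8, 1.1, 1.3, 2.3, 3.3]

# Hypothetical coefficients: a speaking validity coefficient versus four
# rival correlations drawn from the same multitrait-multimethod matrix.
met = comparisons_met(0.82, [0.65, 0.71, 0.90, 0.60])
print(f"{met} of 4 required comparisons met")  # 3 of 4 required comparisons met

Note that treating the converted values as interval-scale scores is itself the assumption the critics question.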

Conclusion

Very few statistically based research studies are available on the validity, reliability, and test-method effect of the ACTFL Oral Proficiency Interview. Many experts in the field urge a revision of the guidelines and a redefinition of the proficiency scales (Bachman & Savignon, 1986). Of course, until more empirical data are collected and a sound theoretical framework of what constitutes oral proficiency is constructed, it is premature to make sweeping generalizations about the effectiveness of this testing procedure. We must be particularly cautious about basing our decisions, whether administrative or academic, solely on OPI results.

[ p. 10 ]

This said, the author would like to look at some positive aspects of the ACTFL OPI. It is a direct, integrative test of speaking. In spite of scholarly criticism, no suggestion has been made to do away with the OPI, possibly because no better alternative presently exists. Most authorities in the field of language testing recognize the limitations of the test format and advocate various changes, yet a testing procedure that would respond to all of their specifications has not yet proved feasible. Raffaldini (1988), for example, is critical of the OPI because it does not adequately assess all components of communicative competence. Her point is well taken; however, is it possible to construct a test which would? Tests, by their very nature, can provide only a limited range of interactive contexts. Designing an evaluative mechanism which would assess all competencies (discourse, grammatical, sociolinguistic, and strategic) in all settings (from very formal to very informal) and on all contents (factual, hypothetical, and abstract) is a virtual impossibility. In defense of the OPI, it is perhaps necessary to look at the high positive correlation between OPI ratings and subsequent measures of success on job assignments. It is also instructive to examine the correlation between OPI results and scores on more traditional tests such as the College Board Achievement Test (Huebner & Jensen, 1982). Lastly, the washback effect on classroom teaching has been positive, as practitioners place more emphasis on speaking and encourage student oral production in class.
The ACTFL OPI is a test of speaking ability within severely restricted parameters. It is, therefore, necessary to treat it as such rather than as an ultimate test of oral proficiency. Until more research into its validity, reliability, and theoretical underpinnings can warrant a wider interpretation, its scores should be interpreted very cautiously.

References

American Council on the Teaching of Foreign Languages. (1986). ACTFL Proficiency Guidelines. Hastings-on-Hudson, NY: Author.

American Council on the Teaching of Foreign Languages. (1988). Oral Proficiency Interview Tester Training Manual. Hastings-on-Hudson, NY: Author.

Bachman, L. F. (1988). Problems in examining the validity of the ACTFL Oral Proficiency Interview. Studies in Second Language Acquisition, 10(2), 149-161.

Bachman, L. F. & Savignon, S. J. (1986). The evaluation of communicative language proficiency: A critique of the ACTFL oral interview. Modern Language Journal, 70(4), 380-389.

Byrnes, H. (1989). Evidence for discourse competence in the oral proficiency interview. Applied Language Learning, 1(1), 1-13.

Clark, J. & Clifford, R. (1988). The FSI/ILR/ACTFL proficiency scales and testing techniques. Studies in Second Language Acquisition, 10(2), 129-147.

Dandonoli, P. & Henning, G. (1990). An investigation of the construct validity of the ACTFL proficiency guidelines and oral interview procedure. Foreign Language Annals, 23(1), 11-21.

Lantolf, J. P. & Frawley, W. (1988). Proficiency: Understanding the construct. Studies in Second Language Acquisition, 10(2), 181-195.

[ p. 11 ]


Lowe, P., Jr. (1986). Proficiency: Panacea, framework, process? A reply to Kramsch, Schulz, and particularly to Bachman and Savignon. Modern Language Journal, 70(4), 391-397.

Lowe, P., Jr. (1987). The unassimilated history. In P. Lowe & C. Stansfield (Eds.), Second Language Proficiency Assessment. Old Tappan, NJ: Prentice Hall Regents.

Meredith, R. A. (1990). The oral proficiency interview in real life: Sharpening the scale. Modern Language Journal, 74(3), 288-295.

Raffaldini, T. (1988). The use of situation tests as measures of communicative ability. Studies in Second Language Acquisition, 10(2), 197-211.

Savignon, S. J. (1985). Evaluation of communicative competence: The ACTFL provisional proficiency guidelines. Modern Language Journal, 69(2), 129-134.

Sieloff Magnan, S. (1987). Rater reliability of the ACTFL Oral Proficiency Interview. The Canadian Modern Language Review, 43(2), 525-537.

van Lier, L. (1989). Reeling, writhing, drawling, stretching, and fainting in coils: Oral proficiency interviews as conversation. TESOL Quarterly, 23(3), 489-507.

Appendix: Oral Proficiency Interview Facets

FACETS OF THE TEST RUBRIC
Time allocation: 15 to 35 minutes, depending on the level
FACETS OF THE INPUT FORMAT
Channel of presentation: aural
Mode of presentation: receptive
Form of presentation: language (aural)
Vehicle of presentation: live
Language of presentation: L2
Identification of problem: N/A
Degree of speededness: no speed
NATURE OF LANGUAGE
Length: varied, usually relatively short
Propositional content:
    Vocabulary: familiar
    Degree of contextualization: context embedded
    Distribution of new information: N/A
    Type of information: concrete <-> abstract
    Topic: general, nominated by the testee (varied)
    Genre: conversation
Organizational characteristics:
    Grammar: formal to semi-formal
    Cohesion: yes
    Rhetorical organization: yes
Pragmatic organization:
    Illocutionary force: functional
Sociolinguistic characteristics: usually standard variety of the TL, semi-formal register

[ p. 12 ]

FACETS OF THE EXPECTED RESPONSE FORMAT
Channel: aural
Mode: productive
Type of response: unrestricted
Form of response: language (oral)
Language of response: L2
NATURE OF LANGUAGE
Length: varied, mostly relatively short sentences
Propositional content:
    Vocabulary: familiar/general
    Degree of contextualization: mostly embedded
    Distribution of new information: N/A
    Type of information: concrete & abstract, depending on interviewee
    Topic: varied
    Genre: conversation
Organizational characteristics:
    Grammar: yes (from formal to informal registers)
    Cohesion: yes
    Rhetorical organization: yes
Pragmatic characteristics:
    Illocutionary force: functional
Sociolinguistic characteristics: standard variety of the TL
RESTRICTIONS ON RESPONSE
Channel: restricted
Format: unrestricted
Organizational characteristics: restricted
Propositional and illocutionary characteristics: fairly unrestricted
Time or length of response: fairly unrestricted
Relations between input & response: reciprocal



[ p. 13 ]