Assessing integrated language skills

The COVID pandemic and recent advances in AI have accelerated the pace of innovation in language assessment. The most visible innovation was the shift from paper-and-pencil, test-center delivery to online delivery. Muhammad and Ockey (2021) summarized this innovation “into three distinct groups: synchronous at-home test delivery, synchronous outdoor face-to-face test delivery, and asynchronous at-home test delivery” (p. 51). But while these administrative innovations suited the COVID era, the most significant innovation lies elsewhere: in the central area of language assessment, where the conceptualization of the construct has shifted from independent language skills to integrated language skills. This shift is reflected in the transition from the original CEFR to the CEFR Companion Volume (Council of Europe, 2020; CEFR CV from now on). 

The CEFR CV implies a definition of language proficiency that has implications for assessment design. Read and Chapelle (2001) presented three definitions of language proficiency that have been used in designing assessments: (1) a trait definition, which focuses on language knowledge and skills without relating them to context; (2) a behaviorist definition, which specifies the context but not the language skills; and (3) an interactionalist definition, which specifies both the language skills and the context. In terms of assessment design, a trait-based approach typically yields discrete tasks sampled at random, with no test-taker context specified; a behaviorist approach yields similar tasks but with the test-taker context specified; and an interactionalist approach yields a design that assesses the underlying language abilities within a particular test-taker context. It is the interactionalist definition that supports the view of language skill integration.

Independent and Integrated language skills

While assessments of independent language skills have a long history, assessments of integrated language skills are relatively new; the internet-based TOEFL was the first major test to include integrated-skills tasks. Cumming (2013) pointed out the importance of integrated language skills and source-based writing: “it is rare to write extended texts without reference to some source reading or to some audio or visual material – or to both – just as it is unusual to speak a language without interacting with some other speakers and engaging in ideas. In academic and many workplace contexts, the fundamental purpose of extended writing or speaking is usually to display one’s knowledge appropriately with reference to relevant source information…” (p. 216). He further argued that “integrated skills assessment presupposes an interactionist theory of human communication in which knowledge is constructed through the interpretation and expression of relevant ideas through multiple media. These abilities are fundamental to being able to use language effectively for extensive writing and speaking in academic and professional contexts; so, they need to be the guiding principles in language assessments made for these purposes” (2014, p. 225).

Examples of this approach in language assessments include combined-skills tasks such as listening-speaking (LS), reading-writing (RW), or listening-reading-writing (LRW) tasks. In a multi-skill task such as an LRW task, test takers respond in stages: first, they listen to an audio recording or watch a video clip, then they read a text, and finally they write a paragraph or two based on the listening and reading materials. The use of such multiple skills requires test takers to integrate information from multiple texts and multiple modes in their responses. This interactionalist approach expands the definition of second or foreign language proficiency from independent skills to integrated skills, along with the inclusion of context.

Table 1 (adapted from Read and Chapelle, 2001) presents the three construct definition approaches and their corresponding skills and components. As illustrated, independent language skills (listening, speaking, reading, and writing) are typical of trait-based and behaviorist construct-based approaches, while both independent language skills and integrated language skills (in various combinations of listening, reading, writing, and speaking) are examples of an interactionalist construct-based approach. 

Table 1: Approaches, skills, components, and independent and integrated skills

[Table 1 image not reproduced]

Both independent and integrated language skills have been mapped onto language proficiency standards for many decades, particularly in high-stakes standardized assessments. In the U.S., the Interagency Language Roundtable (ILR) scale and the American Council on the Teaching of Foreign Languages (ACTFL) Guidelines have long been in use. But it is the Council of Europe’s Common European Framework of Reference for Languages that has become popular around the world for planning and designing curricula for teaching and learning, and for the design and development of language assessments. Many institutions have mapped their tasks to the CEFR or have adapted it to fit their national language assessments or standards.

The CEFR (Council of Europe, 2001) was designed to harmonize teaching, learning, and assessment across the Council of Europe’s multilingual context. The reference manual was published in 2001 and has been made available in 40 languages. The framework describes six levels of proficiency with associated descriptors for each level: A1 (Beginner), A2 (Elementary), B1 (Intermediate), B2 (Upper Intermediate), C1 (Advanced), and C2 (Proficient). It quickly helped make language proficiency mutually recognizable across Europe’s different language teaching and learning systems and eased the European mobility of people and ideas. 

But while the CEFR provided a common framework for these purposes in Europe, it also promoted the homogenization and institutionalization of language teaching, learning, and assessment systems. Fulcher (2004) sounded a warning: “for teachers, the main danger is that they are beginning to believe that the scales in the CEFR represent an acquisitional hierarchy, rather than a common perception” (p. 75). Other criticisms of the CEFR included its weak methodological framework, its lack of support from SLA theory, the vagueness of its proficiency descriptors, and its use of the native speaker as the highest level of proficiency. As a result, the initial value of the CEFR has somewhat eroded. Despite these criticisms, however, the CEFR has been used widely in Europe, and for countries outside Europe it has become a model to adopt, adapt, or modify. Some countries have developed their own versions of the CEFR (for example, CEFR-J in Japan, CEFR-V in Vietnam, CSE in China, and CEFR-Th in Thailand) to suit their own educational contexts (see Negishi et al., 2013, on the CEFR-J project).

In response to these criticisms, the Council of Europe undertook a project to study and extend the CEFR. The resulting CEFR Companion Volume (CEFR CV, 2018) covered new areas and addressed many of the criticisms of the original framework: it clarified the role of context, improved the consistency of scales across proficiency levels, introduced the Pre-A1 level, and removed the notional “native speaker” as the highest level of proficiency. But its main contribution is the reorientation of language proficiency in terms of four macro functions: Reception, Production, Interaction, and Mediation. This is a notable shift from the independent skills of Listening, Speaking, Reading, and Writing (LSRW), which have dominated the field for 60 years, even though language learners and language assessors have long known that it is unrealistic to perform with strictly independent language skills (for example, only listening or only speaking). This reorientation will hopefully bring about a shift in how new language assessments are conceptualized and operationalized: in terms of bundles of skills, for example, listening and speaking; reading and writing; or listening, reading, and writing.

The four macro functions are as follows. Reception covers receptive activities such as listening comprehension and reading comprehension. Production covers productive activities such as spoken production and written production. Interaction covers interactive activities such as spoken interaction, written interaction, and online interaction. Mediation covers mediation activities such as mediating a text, mediating concepts, and mediating communication. Practically, mediation could be included when a test provides translanguaging and translation activities.

Table 2 (from the CEFR CV) presents scales, subscales, and possible tasks related to Reception (in listening and reading), Production (in speaking and writing), and Interaction and Mediation (in various combinations of skills).

Table 2: Illustrative descriptor scales, sub-scales, and possible task types

[Table 2 image not reproduced]

Scenario-based assessment

One way of organizing an assessment that involves Reception, Production, Interaction, and Mediation is a Scenario-Based Assessment (SBA) approach. In such an approach, tasks are housed within scenarios, or hypothetical situations, built around characters and a logically connected series of events in which the test taker plays a key character. As Sabatini, Bennett, and Deane (2011) stated, these “task sets are composed of a series of related tasks that unfold within an appropriate social context” (p. 10). In an assessment, the features of a scenario-based approach could be (a) individual goal-orientation with purposeful multi-stage activities, (b) integration and synthesis of language skills, (c) motivating and engaging tasks not directly focused on language skills, and (d) language use for problem solving.

The SBA approach could be favored over the traditional way of organizing tasks, in which a series of six to ten unconnected topics for listening, speaking, reading, and writing is presented with no goal or connecting theme. Purpura (personal communication) has termed this traditional approach “Read, Respond, and Forget.” In contrast, the SBA approach offers a themed sequence of tasks with a clear goal for the test taker, thus engaging the test taker in working towards task completion. Following Purpura and Banerjee (2022), the test development process could place the Elicitation dimension at the center of the scenarios and draw on the Contextual and Proficiency dimensions, along with the Socio-cognitive, Affective, Socio-interactional, and Instructional dimensions as needed. Table 3 shows an example flow of a single SBA with four tasks on a single theme. Imagine a scenario in which test takers listen and read about Gamelan (a Javanese/Balinese musical form) and then write and speak about it, with the overall goal of persuading classmates to go to a Gamelan concert.

Table 3: Example flow of an SBA with a single topic or theme

[Table 3 image not reproduced]
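To make the structure of such a themed task flow concrete, the Gamelan scenario could be sketched as a simple data structure: an ordered sequence of skill-specific tasks linked by one theme and one communicative goal. This is a minimal illustrative sketch; the class names, fields, and prompts are assumptions for exposition, not part of any real assessment system.

```python
# Hypothetical sketch of a scenario-based assessment (SBA) task flow.
# All names and prompts are illustrative, not from an actual test.
from dataclasses import dataclass, field


@dataclass
class Task:
    skill: str    # "listening", "reading", "writing", or "speaking"
    prompt: str   # what the test taker is asked to do at this stage


@dataclass
class Scenario:
    theme: str                                  # the single connecting theme
    goal: str                                   # the goal that links all tasks
    tasks: list = field(default_factory=list)   # ordered multi-stage sequence


gamelan = Scenario(
    theme="Gamelan (a Javanese/Balinese musical form)",
    goal="Persuade classmates to go to a Gamelan concert",
)
gamelan.tasks = [
    Task("listening", "Watch a short video clip introducing Gamelan"),
    Task("reading", "Read a text about Gamelan instruments and history"),
    Task("writing", "Write a persuasive note to classmates about the concert"),
    Task("speaking", "Record a spoken invitation to the Gamelan concert"),
]

# The ordered skills form an integrated L-R-W-S sequence on a single theme,
# in contrast to a traditional set of unconnected single-skill topics.
print([t.skill for t in gamelan.tasks])
```

The point of the structure is that every task inherits the same theme and goal, which is what distinguishes an SBA sequence from a traditional list of unrelated prompts.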

Conclusion

In this article, I briefly discussed how language proficiency is defined implicitly through assessment design, using the three construct-based approaches from Read and Chapelle (2001). Both independent language skills (listening, speaking, reading, and writing) and integrated language skills (in various combinations of those skills) are examples of what could be included in the design of tasks. Next, I listed the scales, subscales, and possible tasks related to Reception (in listening and reading), Production (in speaking and writing), and Interaction and Mediation (in various combinations of skills). I concluded with an illustration of how a scenario-based approach could be used to integrate language skills. Thus, by combining the integration of language skills, the CEFR CV approach, and the scenario-based approach, assessments could have tasks that are truly interactionalist in terms of their constructs.

References

Council of Europe. (2001). Common European Framework of Reference for Languages: Learning, teaching, assessment. Strasbourg: Council of Europe.

Council of Europe. (2018). Common European Framework of Reference for Languages: Learning, teaching, assessment. Companion volume with new descriptors. Strasbourg: Council of Europe.

Council of Europe. (2020). Common European Framework of Reference for Languages: Learning, teaching, assessment. Companion volume. Strasbourg: Council of Europe.

Cumming, A. (2013). Assessing integrated writing tasks for academic purposes: Promises and perils. Language Assessment Quarterly, 10, 1-18.

Cumming, A. (2014). Assessing integrated skills. In A. J. Kunnan (Ed.) The Companion to Language Assessment (pp. 216-229). Wiley.

Fulcher, G. (2004). Deluded by artifices? The Common European Framework and harmonization. Language Assessment Quarterly, 1, 253-266.

Kunnan, A. J. (2018). Evaluating language assessments. Routledge.

Muhammad, A. & Ockey, G. (2021). Upholding language assessment quality during the COVID pandemic: Some final thoughts and questions. Language Assessment Quarterly, 18, 51-55.

Negishi, M., Takada, T., & Tono, Y. (2013). A progress report on the development of the CEFR-J. In E. Galaczi & C. Weir (Eds.), Exploring language frameworks: Proceedings of the ALTE Krakow Conference (pp. 135–163). Studies in Language Testing 36. Cambridge University Press.

Read, J. & Chapelle, C. (2001). A framework for second language vocabulary assessment. Language Testing, 18, 1-32.
