Assessing Learners' Writing Skills According to CEFR Scales
DISCUSSION
We first discuss the design factors task difficulty, difficulty of rating criteria, rater severity, and student proficiency in relation to our first research question. We also address the question of how far the a priori classifications match the empirical task difficulties. Based on this, we answer the question of whether our analyses suggest empirical cut-scores in line with the targeted CEFR proficiency levels.
Task Difficulty
As anticipated, the relative pass rates decrease with the level of the tasks in both the HSA and the MSA sample. In other words, it is relatively easier to get a pass on a lower-level task than on a higher-level task. Rasch analyses confirmed the assumed differences between task difficulties reflecting the task design. Descriptive statistics and Rasch analyses showed that the intended a priori task difficulty classification in terms of CEFR levels corresponds to a high degree with the empirical task difficulty estimates, as the tasks cluster accordingly along the scale. Apart from one task in each sample, which was preclassified as A1 but is empirically estimated closer to other preclassified A2 tasks and should thus be reclassified, all remaining tasks appear in the anticipated order of difficulty, which suggests that these tasks function as intended. These findings were corroborated by the relative variance components from the g-theory analyses as well. Thus, there is some empirical evidence that a core set of tasks functioned as intended. At the same time, however, the descriptive analyses and the Rasch-model analyses suggested that the task administration should be redesigned to match the students' proficiency levels better, as tasks at the highest or lowest level—specifically, tasks targeting Level B2 in the HSA sample and Levels A1 and C1 in the MSA sample—did not discriminate well in this study.
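To illustrate why decreasing pass rates place higher-level tasks higher on the Rasch difficulty scale, the following Python sketch converts hypothetical pass/fail data into logit difficulties. The response matrix is invented, and the proportion-based estimator is a deliberate simplification of a real Rasch calibration, which estimates person and item parameters jointly; it is not the procedure used in the study.

```python
import math

# Hypothetical pass/fail matrix: rows = students, columns = tasks
# ordered by intended CEFR level (e.g. A1 ... B2). 1 = pass, 0 = fail.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
]

def logit_difficulty(column):
    """Crude item-difficulty estimate: the logit of the failure rate.

    A proper Rasch calibration (e.g. conditional maximum likelihood)
    estimates all parameters jointly; this proportion-based logit only
    illustrates how lower pass rates translate into higher difficulty.
    """
    n = len(column)
    p_pass = sum(column) / n
    # Clamp to avoid infinite logits for all-pass / all-fail tasks.
    p_pass = min(max(p_pass, 1 / (2 * n)), 1 - 1 / (2 * n))
    return math.log((1 - p_pass) / p_pass)

difficulties = [logit_difficulty([row[j] for row in responses])
                for j in range(len(responses[0]))]

# If the tasks function as intended, difficulties increase with the
# targeted CEFR level, i.e. the tasks cluster in the anticipated order.
assert difficulties == sorted(difficulties)
```

Under this toy estimator, a task that should be reclassified would show up as a logit value sitting among the estimates of tasks preclassified at a different level.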
Rating Criteria Difficulty
The second design factor we discuss concerns the rating criteria. G-theory analyses showed a negligible influence of the rating criteria on overall rating variation. This is, in part, supported by the Rasch analyses. With regard to the relative difficulty of the five criteria, the analytic criterion task fulfillment proved to be the easiest. This was anticipated, as the task instructions clearly state the expected content, addressee, and communicative purpose. The thresholds of the remaining criteria are located close to each other on the Rasch scale, indicating similar difficulties. This means that the global rating would probably be sufficient for operational calibration purposes. It also confirms that treating the ratings jointly as one design facet, rather than performing a multidimensional multifaceted Rasch analysis with highly correlated and, therefore, essentially redundant latent variable dimensions, was sufficient for these data. However, because the global rating is based on detailed analyses of student responses, we cannot conclude that one holistic, impressionistic rating instead of the analytic approach would have led to the same effects.
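The claim that the rating criteria contribute little to overall rating variation can be made concrete with a minimal variance-component decomposition. The sketch below, using invented ratings, estimates G-theory variance components for a fully crossed students-by-criteria design with one observation per cell via the standard expected-mean-squares equations; it is an illustrative simplification, not the G-theory software or design used in the study.

```python
# Hypothetical ratings: rows = students, columns = the five rating
# criteria. Students differ a lot; criteria hardly differ at all.
ratings = [
    [5, 5, 4, 5, 5],
    [3, 3, 3, 3, 2],
    [1, 2, 1, 1, 1],
    [4, 4, 4, 5, 4],
]

def variance_components(matrix):
    """Estimate person, criterion, and residual variance components
    for a crossed p x c design (one observation per cell) from the
    expected mean squares of a two-way ANOVA without replication."""
    n_p, n_c = len(matrix), len(matrix[0])
    grand = sum(sum(row) for row in matrix) / (n_p * n_c)
    row_means = [sum(row) / n_c for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n_p)) / n_p
                 for j in range(n_c)]
    ms_p = n_c * sum((m - grand) ** 2 for m in row_means) / (n_p - 1)
    ms_c = n_p * sum((m - grand) ** 2 for m in col_means) / (n_c - 1)
    ss_res = sum((matrix[i][j] - row_means[i] - col_means[j] + grand) ** 2
                 for i in range(n_p) for j in range(n_c))
    ms_res = ss_res / ((n_p - 1) * (n_c - 1))
    # Negative estimates are truncated to zero, as is conventional.
    var_p = max((ms_p - ms_res) / n_c, 0.0)
    var_c = max((ms_c - ms_res) / n_p, 0.0)
    return var_p, var_c, ms_res

var_person, var_criterion, var_residual = variance_components(ratings)
# With data like these, the criterion component is small relative to
# the person component, mirroring a negligible criteria facet.
```

In such a pattern, most rating variance is attributable to differences between students rather than between criteria, which is what licenses treating the five ratings as one design facet.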