Reading skills
1. Multiple-choice items
2. Short-answer tests
3. Cloze test
4. Gap-filling tests
5. Matching
6. Rearrangement
7. False/true statements
8. Completion
Listening skills
1. Multiple-choice items
2. False/true statements
3. Gap-filling tests
4. Dictations
5. Listening recall
6. Rearrangement
7. Matching
Writing skills
1. Dictations
2. Compositions
3. Reproductions
4. Writing stories
5. Writing diaries
6. Filling in forms
7. Word formation
8. Sentence transformation
Speaking skills
1. Retelling stories
2. Describing pictures
3. Describing people
4. Spotting the differences
5. Interview
Once created, tests must be evaluated against explicit criteria.
The qualities of a good test include, among others, reliability,
validity, consistency and practicality. Methodologists (Alderson,
Clapham & Wall, 1996:286; Bachman & Palmer, 1997:19-42) explain
these criteria as follows:
- Reliability is the stability of the measurement results
produced by a test. Testing productive skills such as speaking and
creative writing is less reliable than testing listening and reading.
E.g., there is always more room for subjectivity in assessing an
essay than a dictation. “Reliability” is the opposite of “randomness”
in the marks given by teachers or examiners.
- Consistency is agreement between the parts of a test. All the
tasks in a consistent test have the same level of difficulty for the
learners. Some tests are more difficult to make consistent than
others, e.g. a dictation will contain words that differ in
spelling difficulty.
- Construct validity pertains to whether the test measures what
it claims to measure. If a test claims to measure a “construct” such
as “oral skill”, then a valid test should measure exactly that “oral
skill” and not other “constructs” such as “knowledge of grammar”.
- Concurrent validity is the agreement of the test scores with
other measures of the learner’s language performance, e.g. the
teacher’s assessment.
- Practicality is the degree to which a test can be used as a
convenient tool for measuring language performance. If a test needs
much preparation time, or takes up too much lesson time, it
will be perceived as “impractical”.
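
Of the criteria above, concurrent validity lends itself most directly to a numerical check. One common way to quantify how far test scores agree with another measure is a correlation coefficient. The sketch below is only an illustration under stated assumptions: the source does not prescribe Pearson's correlation, the function name `pearson_r` is hypothetical, and the scores and teacher ratings are invented for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical data for five learners (assumed, not taken from the source):
test_scores = [12, 15, 9, 18, 14]   # marks on the test being evaluated
teacher_ratings = [3, 4, 2, 5, 4]   # the teacher's own 1-5 ratings

r = pearson_r(test_scores, teacher_ratings)
print(f"Agreement between test and teacher ratings: r = {r:.2f}")
```

A value of r close to 1 would suggest that the test ranks learners in much the same way as the teacher does. A value near 0 would suggest that the two measures disagree, which calls the concurrent validity of the test into question.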