How to Set Up Objectives in Testing


Error, Feedback and Assessment

 

1. Assessment

 

1. Assessment
    a) Why assess?
    b) Assessment Impact
    c) Types of test
    d) Writing test items
    e) Using self and peer assessment

 

a) Why assess?

 

Think about why, when and how learners are assessed, and who benefits from the results. Then compare your notes with the ideas below:

 

 

 

Assessment takes many forms and occurs throughout the learning process. It can benefit many different people – the learners themselves, the teaching staff, and also other external stakeholders such as parents who wish to know about their child’s progress, or further education institutes or prospective employers who need to know that a candidate’s language competence is adequate for the tasks s/he will later be asked to perform. The evaluation may be formal (a test, an exam (which may be internal or government-recognised), or a task for which the learner knows s/he will receive a final grade) – or informal (such as on-going observation of the learner’s performance in class and homework tasks). In this section we’ll be looking mainly at formal assessment; in the next (on Error and Feedback) we’ll look at informal assessment.

 

Some common reasons for assessment are:

 

a)  to divide learners into similar ability groups before the course starts – i.e. to make sure that you don’t have elementary and intermediate or advanced learners in the same class (a placement test). This helps the teacher gear the course to the level and needs of all the students in the group, making sure that the content is neither too hard nor too easy for individual learners. If you write a placement test, you will need to include some items that could successfully be answered by learners at each level – for example, if your courses are taught at six different levels, the first ten items might be based on the contents of the first level course, the second ten on the contents of the second level course and so on. By looking at which items the learner got right and which wrong, you can assess which course s/he now needs to take. For example, if s/he got eighteen of the first twenty correct, but only three of items 21-30 and nothing afterwards, then s/he probably needs to go into the third level course, as s/he seems to have assimilated most of the material of levels 1 and 2 but not of level 3.

 

b) To find out, at the beginning of the course, what the learners do and don’t know already (a diagnostic test or task). This helps you decide a) what knowledge you can take for granted because everyone knows it, b) what you need to revise before moving on because some of the learners seem to know it but others are still confused, and c) what is completely new for everyone and will need to be taught from scratch. If the learners have studied together previously, this may not be necessary – the achievement test (see below) from their last course will give you the information you need. But if they’re coming from different institutions where they followed different courses, they may know slightly different things, even if at roughly the same level. In this case, your diagnostic test will need to include the items which you expect them to know already at this level (to check that they really do) plus items from the course that you intend to teach (to confirm that they don’t).

 

c)  During the course, to find out whether the learners have assimilated the material you’ve covered up to that point (a progress test or task). This helps you evaluate how effective your teaching and their learning have been. If the results are poor, you may decide to change your approach or to help the learners more with their learning strategies. In any case, you know that you need to revise and consolidate the material before you move on to more difficult areas. If you set a progress test, it should cover only the items which you have taught, and which you expect the learners to be able to answer correctly.

 

d)  At the end of the course, to find out how well the learners have assimilated all the material covered by the course (an achievement test). This may sometimes affect their ability to move to the next level, or may be certificated. Generally, if there are several classes at the same level in the institution, they will all take a standardised achievement test at the end of the course, even if different classes have been taught by different teachers. It should therefore ideally be set by a team of teachers working together, in order to ensure standardisation, but if you are responsible for writing it you will need to liaise with the other teachers to ensure both that the content of the test has been covered by all the classes and that the format of the test items (eg multiple choice, essay writing) is familiar to all the learners. Learners will do better on test types that they have practised during the course, and including a test type which one or more classes have not practised will make the test unfair.

 

e)  To find out the overall level of competence of the learners (a proficiency test). Proficiency tests are not linked to any specific syllabus, and are generally set by an external examining body. They can be used to provide evidence to future employers, universities etc of the learner’s competence in the language.

 

 

b) Assessment Impact

 

Think about it ...

 

Think about your previous experiences of learning a language. How were you assessed? Did you find the assessment motivating or demotivating? Useful or not? Did it have any impact on your future learning? Or on your life? Why? If the effects were in any way negative, how could the assessment process have been changed to make it a more positive experience?

 

 

Assessment doesn’t happen in a vacuum. The frequency of assessment, the types of assessment task, and the results of the assessment, all have an impact which may be either positive or negative. The impact may affect...

 

 

How a learner reacts to a test will depend on a number of factors:

 

- Do they consider the test to have been fair or unfair? If unfair (because eg it was too difficult or included content they had barely covered), it may affect their motivation towards the course.

- Was the result positive? If not, and particularly if it’s one of a string of low marks, it can affect their self-esteem and belief in their ability to learn the language. It may also put young learners under pressure from their parents – and all this may again affect their motivation towards the course. Notice though that a “positive” result doesn’t necessarily mean a high grade. A learner who receives a low grade may still feel positive about the test, and about the learning experience, if s/he can see that s/he is making progress, and has improved since the last assessment. This is largely dependent on the type of feedback which s/he receives, which we’ll look at in the next section.

 

Taking into account the effect that the test will have on the students is particularly important when working with younger learners. Receiving a “bad mark” at school can easily demotivate a child and convince them that they are “no good at” and “don’t like” the subject. If formal testing is used, it is therefore essential that the test is “doable” and that it is made as enjoyable as possible. For example, knowledge of vocabulary and/or pronunciation can be assessed by asking the learners to play a game of Memory, using pairs of flashcards placed face down on a table. In turn they each turn one card over, say the name of the object and then try to remember where the other identical card is. While they play, the teacher can circulate, listening to and assessing their knowledge of the lexis and its pronunciation.

 

 

Imagine you know that your learners face a final exam which is composed exclusively of multiple choice tests of grammar. This means that to pass the test the learners have to understand and recognise the grammar in individual sentences, but don’t necessarily have to be able to use it when speaking or writing. But the test is important – it will determine whether they can continue with their education or not. So, in this situation, how much time do you spend on improving your students’ listening skills? How much time on improving their fluency? And how much time on improving their ability to answer multiple choice questions on grammar? Yet is this really the best way to improve their English?

 

Any pre-determined test will inevitably have an effect (called backwash) on the course that precedes it. So when deciding on the format of tests which you have responsibility for, you need to think about what the format tests, and whether this is really what you want to spend your time teaching your students. When the test is not your responsibility (such as an official or government test), you need to think carefully about the balance of activities in the classroom - how much time must you spend “teaching for the test” and how much time can you devote to the other things you feel to be important.

 

 

 

c) Types of test

 

Think about it...

 

All of the following tasks could be used as tests. What type of test are they and what exactly are they or could they be testing? What would you need to take into account when marking them?

 

a.   Put the verbs in brackets in the correct form.

Last Wednesday I .................... (go) to see my aunt who ................ (live) in a village 30 kms away.

 

b.   Choose the most polite, positive response.

Could I use your phone?

a) Yes, you could   b) Sure, go ahead   c) Oh, all right   d) Yes, you could use it.

 

c.   Write the names of the objects a-g in the picture.

 

d.   Read the texts (1-4) and choose the best title (a-e) for each. There is one extra title which you do not need.

 

f.   Write about 150 words discussing “The Advantages and Disadvantages of Living in the City”.

 

g.   Discuss with your partner: what did you do at the weekend?

 

 

 

 

(a) A gap-fill focusing on the learner’s ability to use verb forms accurately.

(b) A multiple choice test focusing on appropriacy. There is nothing wrong grammatically with any of the answers, but only (b) is appropriate for a polite, positive response.

(c) A labelling task which could focus on knowledge of vocabulary and spelling. Here you would have to decide how to award marks for the two aspects: must the spelling be correct to get a mark? Is knowing the word enough and spelling unimportant? Is there one mark for the word and another for correct spelling?

(d) A matching task testing gist comprehension in reading – the titles will summarise the main idea of the text.

(f) A composition to assess writing skills, including: organising information into paragraphs; developing a logical argument; structuring a paragraph; sequencing and linking information; range and accuracy of grammar; range and accuracy of vocabulary. Notice that here there is a lot more involved than just writing sentences with accurate grammar, and that all of these areas need to be taught before they can be tested. For example, the paragraph structuring conventions of English (eg starting with a topic sentence) are not necessarily used in other languages, and cannot just be assumed to be known.

 

(g) A speaking task which could be used to assess: oral fluency (ability to get your meaning across regardless of errors); ability to interact (eg asking follow-up questions, or making comments to show you’re listening and interested, such as Really? or Oh that’s great/What a pity); range of grammar and vocabulary; accuracy of grammar (eg past verb forms) and vocabulary; sequencing a narrative; accuracy of pronunciation.

 

Notice that when you marked these tests, (f) and (g) would have to be marked very differently to the first five. Items a-e are objective tests – there is one right answer and one only, and different markers marking the same test would always come up with the same score. But (f) and (g) are subjective tests: different markers with different priorities could easily come up with different grades (if, for instance, marker A thinks that logical organisation of information is essential, while marker B is only interested in grammatical accuracy). Even if the marker was the same, just the mood s/he was in at the time, or an irrelevant aspect like his/her reaction to the learner’s handwriting, might influence the result. To avoid this, the criteria for marking have to be decided in advance, and a set number of marks awarded for each criterion. Here’s what a criteria-based mark scheme might look like for the criterion “Range and accuracy of grammar” in (f) above (which assumes a good intermediate level class), for which 4 marks are available:

 

 

0 marks – Even very simple structures are used with a high degree of inaccuracy.

1 mark – Simple structures used, with some error but often accurately. More complex structures, if used, are never accurate.

2 marks – Only simple structures used consistently accurately. More complex structures, if used, always inaccurate.

3 marks – Simple structures used consistently accurately. More complex structures are attempted and are sometimes accurate, sometimes not.

4 marks – A wide range of structures of varying complexity are used, with a high degree of accuracy.

 

 

Task

 

Work with another teacher. Choose a piece of written work produced by one of your learners. Decide together the marking criteria that you will use on the text and then mark it independently. Compare the results. Do you come up with the same mark? If not, discuss how you allotted the marks and why you differed.

 

 

d) Writing test items

 

Think about it...

 

All of the following test items have weaknesses. What are they?

 

1.  Choose the correct alternative: Yesterday I go/went to the market.

 

2.  Choose the correct alternative: The sky was grey and ___

     a) cloud   b) clouded   c) cloudy  d) clowdy  e) clouds

    

3.  Put the verb in the correct form: She ..... (go) to the market every day.

 

4.  Read the text and decide if the statement is true or false.

Text:            It wasn’t unexpected.

     Statement:  She didn’t know it was going to happen.  True/False?

 

 

Writing test items is tricky, and any test you write should ideally be tried out by another teacher first to see if there are any bugs which you didn’t spot. These examples show just some of the most common problems.

 

(1) is a multiple choice item – but with only two choices. This means that the possibility of getting the right answer by chance is 50% – much too high to make the test results reliable. Ideally a multiple choice item should have at least five choices, which cuts the chance of a lucky guess to 20% – but often there just aren’t five likely options, so the test writer is reduced to putting in a couple which are unlikely to fool anyone. This happens in (2). Here a clever learner might easily work out that clowdy was the only misspelt option – and therefore unlikely to be correct. Even more importantly, the options in (2) contain two possible right answers: although the writer probably wanted cloudy, clouded is also possible.

 

This problem comes up again in (3). The writer probably expects goes, but in different contexts many forms are possible: went, is going... See how many you can think of.

 

Did you find yourself doing mental gymnastics in (4)?  All the negatives, plus the implied positive/negative contrast in the concept of True/False make it difficult to work out. The learner may well have understood the text but just get mixed up when giving the answer.

 

 

Task

Write a short test for one of your classes, based on material you have covered recently and designed to take about fifteen minutes. Try it out with the learners and then consider the following questions:

a)    Could they do it without rushing in fifteen minutes? Or was it too long or too short?

b)   What were the results like? The high ability students should have scored highly – if not, the test was too difficult and the less able students would certainly be demotivated by it.

c)    Were there any questions which they got wrong because they didn’t understand what they had to do? How could your instructions be rephrased to make the question clearer?

 

 

 

e) Using self and peer assessment

 

So far we have been assuming that it is the teacher who marks the test, but handing this job over to the learners can help develop their awareness of their own strengths and weaknesses – and in large classes allows the teacher to set more work without increasing the marking load to an unrealistic level. The learners can’t be expected to do this unaided, of course, but can work from model answers and prompts. Here are some ideas:

 

- objective tests (those with specific right answers) can be marked from answer sheets. Working in pairs or groups, the learners have to correct their answers and then, for each error, discuss why it was wrong and why the correct answer was correct.

- subjective tests (those like compositions which don’t have a single answer) can be marked by comparing the learners’ work with a model written by the teacher. They answer prompt questions to analyse the model, and then compare it with their own work. For example, if learners in an elementary class have been asked to write a composition on their family, some of the prompt questions might be:

 

- How many paragraphs are there? Can you identify the topic of each paragraph?

- How many sentences do the paragraphs contain?

- Does the first sentence introduce the topic of the paragraph?

- Does the composition give some interesting information about the writer’s family?

- Are the sentences all very short, or do some of them join two ideas using and or but?

- Is there a capital letter at the beginning of every sentence and a full stop at the end?

- Is the spelling of these words correct: father, mother, brother, sister, aunt, uncle, cousin, grandfather, grandmother?

- Is there an s on the end of all the third person singular verbs – eg He lives in Bamako?

 

... and so on. The prompt questions can focus on anything which you know is a problem area for your learners - exactly what the problems are will depend on their first language and previous experience of writing. Once the tasks have been peer or self assessed in this way, the writer can be given the chance to improve the text before it is handed in to the teacher, or the group can decide on the recommended grade (using the sort of criteria mentioned above), which the teacher can accept or modify as s/he wishes.

 

Peer and self-assessment is a powerful tool for learning – but may not always match the learners’ (or the school’s) idea of the roles of teacher and student. In cultures where the teacher is expected to be an authority, learners may react badly to being asked to take on an evaluating role. However, as suggested above, the technique can still be used as a way of getting learners to critique and improve their own work before final evaluation by the teacher.