Characteristics of a good test
In order to judge the effectiveness of any test, it is sensible to lay down criteria against which the test can be measured, as follows:
Validity: a test is valid if it tests what it is supposed to test. Thus it is not valid, for example, to test writing ability with an essay question that demands specialist knowledge of history or biology — unless it is known that all students share this knowledge before they do the test.
A particular kind of ‘validity’ that concerns most test designers is face validity. This means that the test should look, on the ‘face’ of it, as if it is valid. A test which consisted of only three multiple choice items would not convince students of its face validity however reliable or practical teachers thought it to be.
Reliability: a good test should give consistent results. For example, if the same group of students took the same test twice within two days — without reflecting on the first test before they sat it again — they should get the same results on each occasion. If two groups who were demonstrably alike took the test, the marking range would be the same.
In practice, ‘reliability’ is enhanced by making the test instructions absolutely clear, restricting the scope for variety in the answers. Reliability also depends on the people who mark the tests. Clearly a test is unreliable if the result depends to any large extent on who is marking it. Much thought has gone into making the scoring of tests as reliable as possible.
(Jeremy Harmer. The practice of English language teaching. 2007. Adapted)
Another important criterion for tests is their degree of reliability, described as the consistency of the results obtained if the test is administered again. According to the text, a necessary requirement for ensuring reliability is: