Tailored online NAPLAN better for monitoring high and low achievers

Using different questions for low and high achievers is actually beneficial for judging how students at each end of the spectrum are going. Alan Porritt/AAP

Australia’s national literacy and numeracy tests (NAPLAN) will be available online from 2017. While there is little difference in test scores between typical paper-and-pencil tests and computerised versions of the same test, some are concerned about the proposal to use “computerised adaptive testing”.

Computerised adaptive tests start with an item that is known to be of medium difficulty for the population to which a student belongs. If he/she solves it correctly, the next item will be harder. If he/she answers the item incorrectly, the following item will be easier.

Thus, the difficulty level of each subsequent item is adjusted depending on the answers provided to previous items. Individual students will no longer get the traditional, “one-size-fits-all” tests, because the set of chosen items is adjusted to fit each student’s ability level. NAPLAN developers refer to this approach as tailored testing.
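The step-up, step-down rule described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the NAPLAN procedure: the five-point difficulty scale, the function name run_adaptive_test and the answers_correctly callback are all hypothetical.

```python
def run_adaptive_test(answers_correctly, num_items=5, levels=5):
    """Return the difficulty levels administered to one student.

    answers_correctly: hypothetical callback that takes a difficulty level
    (1 = easiest, `levels` = hardest) and returns True if the student
    answers an item of that difficulty correctly.
    """
    level = (levels + 1) // 2  # start at medium difficulty (3 on a 1..5 scale)
    administered = []
    for _ in range(num_items):
        administered.append(level)
        if answers_correctly(level):
            level = min(levels, level + 1)  # correct answer: next item is harder
        else:
            level = max(1, level - 1)       # incorrect answer: next item is easier
    return administered


# A hypothetical student who can solve items up to difficulty 4.
strong_student = run_adaptive_test(lambda difficulty: difficulty <= 4)
# A hypothetical student who can only solve the easiest items.
weak_student = run_adaptive_test(lambda difficulty: difficulty <= 1)
```

In this sketch the stronger student is given items at levels 3, 4, 5, 4, 5 and the weaker student at levels 3, 2, 1, 2, 1, so each quickly settles on items close to his or her own level.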

The actual computerised NAPLAN tests will not select individual items one at a time. Instead, sets of items (called “testlets”) varying in difficulty level will be used.

One-size-fits-all vs tailored testing

Tailored testing has a long history. The idea was first put forward in the late 1960s, widely researched in the 1980s, and implemented in large-scale testing in the mid-1990s. The main selling point is the use of a smaller number of items and, consequently, a shorter time needed for testing.

The general consensus in the research community is that tailored testing provides as valid an assessment of students’ abilities as traditional tests do. A suit tailored specifically for you may look similar to one from a department store, but you can feel the difference because it fits you better. Similarly, the “one-size-fits-all” and tailored tests will produce similar scores but a tailored test will fit each individual student better. The items are more carefully selected.

An additional advantage is that test scores of students at the ends of the ability spectrum (either low or high) are likely to be more precisely measured. By contrast, the typical tests consist mostly of items of medium difficulty with a few easy and difficult items added.
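One way to see why precision improves at the extremes is a rough sketch using the Rasch model, a standard model in test theory. The article does not say which model NAPLAN uses, so the model, the item difficulties and the function names below are all assumptions made for illustration. Under this model an item tells us most about a student when its difficulty is close to the student's ability.

```python
import math


def item_information(ability, difficulty):
    """Rasch-model information of one item: p * (1 - p), largest when p = 0.5."""
    p = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
    return p * (1.0 - p)


def total_information(ability, difficulties):
    """Information of a whole test; the standard error is 1 / sqrt(information)."""
    return sum(item_information(ability, b) for b in difficulties)


# Hypothetical fixed test: mostly medium items plus one easy and one hard item.
fixed_test = [-2.0, -0.5, -0.5, 0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 2.0]


def tailored_test(ability, n=10):
    """Idealised tailored test: every item matches the student's ability."""
    return [ability] * n


for ability in (0.0, 2.5):  # an average student and a high achiever
    fixed = total_information(ability, fixed_test)
    tailored = total_information(ability, tailored_test(ability))
    print(f"ability {ability:+.1f}: fixed {fixed:.2f}, tailored {tailored:.2f}")
```

With these made-up numbers the two tests carry similar information for the average student, while for the high achiever the fixed test carries far less than the tailored one, which is the pattern the paragraph above describes.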

The questions will get easier or harder as the student progresses through the test. AAP/Alan Porritt

In addition, the long reporting time – which is one of the most heated issues in NAPLAN testing – may be resolved. In tailored tests a student’s score will be calculated and produced online at the end of a testing session.

Some tailored myths

Proponents of tailored testing sometimes make statements that cannot be supported by the evidence. One of these is the claim that since test items will be of appropriate difficulty for a student, he/she will experience less anxiety. This claim has no empirical support.

The main causes of student anxiety over NAPLAN are likely to be externally generated and have little to do with student experience during the testing session itself. Perhaps it can be said that students will be better challenged and less frustrated during the testing.

Another myth is that tailored testing will provide teachers, parents and other stakeholders with “greater insight” into students’ abilities. In general, tailored testing does not provide any new or additional information or “descriptions” about individual students beyond what a traditional test can produce.

Further work needed

Several issues may need to be addressed prior to the launch of NAPLAN online. On the technical side, there are still some remaining questions about the effects of using different devices (computers, tablets and smartphones) for test administration.

Another issue is the assessment of students with different kinds of disabilities. Computer administration provides for the possibility of developing test items that are different from those employed in paper-and-pencil tests. Sounds, moving pictures, sequential presentation of the elements of the tasks and many other options become available.

Last but not least is the possibility of having open-ended items that include short or even longer written material, which can be scored by machine. However, computers may not be able to understand jokes and other persuasive writing techniques.

Overall, attempts to introduce computerised large-scale assessments have been successful in other parts of the world. Australia will be one of the first countries in the world to employ them at the national level.