Weighing those words

Computers are becoming "literate". Using recently developed software, home computers will now be able to assess your writing skills. However, getting rid of critics might just take a little more time. Teachers stand to benefit the most: grading student essays need no longer be tedious, say the developers of the software.

The software can analyse word usage and can grade essays just as well as a human examiner. "I'm not ready to claim that it's foolproof, but I'm ready to claim that it's approximately as foolproof as a human," says Thomas Landauer, a cognitive scientist at the University of Colorado, USA, who led the team that developed this software.

Landauer recently unveiled the software at a meeting of the American Educational Research Association in San Diego, USA. Using this software, the computer first "learns" about the subject of the essay it is to mark by scanning relevant passages from course textbooks. It hunts for statistical patterns of words occurring together. Thereafter, it calculates the extent to which different written passages share these patterns.
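The idea of learning which words occur together, and then comparing passages by those patterns, can be illustrated with a small sketch. This is not the developers' program, whose internals the article does not describe; it is a deliberately simplified stand-in that counts which words share a passage and compares passages by the cosine of their co-occurrence profiles. The function names and the toy "textbook" passages are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lower-case the text and strip simple punctuation from each word."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w]

def learn_cooccurrence(passages):
    """For each word, count the other words that share a passage with it."""
    cooc = defaultdict(Counter)
    for passage in passages:
        words = set(tokenize(passage))
        for word in words:
            for other in words:
                if other != word:
                    cooc[word][other] += 1
    return cooc

def passage_vector(text, cooc):
    """Represent a passage as the sum of its words' co-occurrence counts."""
    vec = Counter()
    for word in tokenize(text):
        vec.update(cooc.get(word, {}))
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy training text (invented); a real system would scan whole textbooks.
textbook = [
    "The doctor examined the patient's heart.",
    "The physician examined the patient's heart.",
    "The engine drives the car wheels.",
]
cooc = learn_cooccurrence(textbook)
doctor = passage_vector("doctor", cooc)
physician = passage_vector("physician", cooc)
engine = passage_vector("engine", cooc)
print(cosine(doctor, physician), cosine(doctor, engine))
```

In this toy setting, "doctor" and "physician" never appear in the same passage, yet they score as more similar to each other than either does to "engine", because they keep the same company in the training text.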

The software has been tested on student essays so far, and if Landauer is to be believed, its performance has been nothing short of an amazing "A plus". To grade an essay, the computer compares it with a set of sample essays of varying quality that have been marked manually. "We assign a grade to it based on a weighted average of how similar it is to these sample essays. If it's very similar to an essay that got 90 per cent marks, then it will probably get close to a 90," explains Peter Foltz, a cognitive psychologist at New Mexico State University in Las Cruces, USA, who helped develop the technique.
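Foltz's description of a weighted average can be made concrete with a short sketch. The article does not say exactly how the weights are chosen, so using the similarity score itself as the weight is an assumption here, as are the function name and the sample numbers.

```python
def predict_grade(samples):
    """Similarity-weighted average of manually assigned grades.

    `samples` is a list of (similarity, grade) pairs, one per sample essay.
    Weighting directly by similarity is an assumption for illustration;
    negative similarities are treated as zero weight.
    """
    weights = [max(sim, 0.0) for sim, _ in samples]
    total = sum(weights)
    if total == 0:
        return None  # nothing comparable: no grade can be inferred
    return sum(w * grade for w, (_, grade) in zip(weights, samples)) / total

# Invented example: the new essay closely resembles the sample marked 90.
samples = [(0.9, 90), (0.3, 60), (0.1, 40)]
print(round(predict_grade(samples), 1))
```

The more an essay resembles a highly marked sample, the more that sample's grade pulls the result towards itself, which matches the behaviour Foltz describes.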

This approach is more subtle than programmes that merely look for matching words, the way most Internet search engines operate. Provided the programme has first scanned a sufficiently wide range of texts, it can recognise, for instance, that an essay about doctors is similar to one that talks of physicians. This might seem simple enough for us, but for a machine which runs on a few microchips this is an achievement to be proud of. In an experiment with essays written by 94 university undergraduates on the anatomy and function of the human heart, the grades assigned by two trained essay readers showed a correlation of 0.77, on a scale from -1 for total disagreement to 1 for exact agreement. The software's grades showed a correlation of 0.68 with one reader and 0.77 with the other. In other words, it gave the two essay readers some really stiff competition.
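For readers curious about how such an agreement figure is computed, here is a minimal sketch. The article does not say which correlation measure was used; the Pearson coefficient shown here is an assumption, and the marks are invented for illustration rather than taken from the experiment.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of marks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all marks are equal")
    return cov / math.sqrt(var_x * var_y)

# Invented marks for five essays, one list per grader.
reader_marks = [55, 62, 70, 81, 90]
software_marks = [58, 60, 75, 78, 88]
print(round(pearson(reader_marks, software_marks), 2))
```

A value near 1 means the two graders rank and space the essays in much the same way; a value near -1 means they systematically disagree.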

But will it be easy to "fool" this software? Landauer confirms that it contains several checks to foil cheating attempts, most of which he will not talk about. But he notes that students cannot gain high marks simply by throwing in tough-sounding technical terms that even they do not understand. "You can't get the right combination of words just by listing them," he says.

Essays that bear no close resemblance to the samples are set aside by the software. These unusual essays - which are likely to be either brilliantly original or "stunningly stupid", according to Landauer - can then be graded manually.
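Setting aside unusual essays could be as simple as a cut-off on similarity, as in the sketch below. The article gives no details of how the software decides this, so the rule, the function name and the threshold of 0.5 are all hypothetical.

```python
def needs_manual_grading(similarities, threshold=0.5):
    """Flag an essay whose best match among the samples is too weak.

    `similarities` holds the essay's similarity to each sample essay.
    The threshold is a hypothetical value chosen for illustration.
    """
    return not similarities or max(similarities) < threshold

print(needs_manual_grading([0.82, 0.40, 0.15]))  # close to one sample
print(needs_manual_grading([0.21, 0.18, 0.05]))  # resembles no sample
```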

Landauer admits that his programme cannot judge the literary quality of the essays, so it will never be able to properly assess students whose understanding of their subject is accompanied by an elegant use of language.

America's largest supplier of standardised tests - including the admission tests used by most US universities - is also developing software along similar lines. "When we score essays we have up to three readers doing the evaluation," says Lawrence Frase, head of cognitive and instructional science at the Educational Testing Service at Princeton, New Jersey. "If our computer system works as well as a human, why not have the computer replace one human?"