Turing Test ‘Pass’ Doesn’t Convince All


A Russian computer program is said to have become the first to pass the “Turing Test” for artificial intelligence. Whether that’s the case is a question of interpretation.

The Turing test is named after computer pioneer Alan Turing, whose 1950 paper “Computing Machinery and Intelligence” examined whether computers can think. He said the terms of such a question were difficult to define, but that he believed a computer could, in a test, successfully imitate a human in answering questions.

While several prizes have been offered over the years for a computer that can beat the test, the precise rules and threshold to pass aren’t universally agreed. Turing’s paper discussed the operation of a test only in general terms and didn’t set a formal time limit or “pass mark.”

However, a common interpretation is that a computer program must convince at least 30 percent of a panel of human judges that it is itself human during a text conversation of at least five minutes. This is based on a prediction in Turing’s paper that, within about 50 years, a computer with around 10^9 bits of storage (roughly 125MB) could play the game well enough that an average interrogator would have no more than a 70 percent chance of making the right identification after five minutes of questioning. That was a forecast, though, rather than a stated rule for passing.

The 30 percent mark has reportedly been achieved by the program “Eugene Goostman,” which claims to be a 13-year-old boy and is named after one of its developers, Eugene Demchenko. In a test of five programs at Reading University in the UK, held to mark the 60th anniversary of Turing’s death, Eugene fooled 33 percent of judges. Appropriately, one judge was actor Robert Llewellyn, aka Kryten from Red Dwarf.

The university has not yet published full details of the test, nor any transcripts of the chats. There are also a couple of significant limitations to Eugene — or, depending on your point of view, creative features.

Firstly, Eugene appears to have been billed as a Ukrainian boy conversing in English, giving it more scope for language mistakes. Secondly, the developers intentionally “built” a young teenager rather than an adult to make it more credible that Eugene would not only have limitations to his knowledge, but also lack self-awareness about those limitations.

There’s also some question about whether the judges in such a test should be ordinary members of the public, experts in the field (who might use more probing and appropriate questions to spot a fake) or, as appears to be the case here, somewhere in between.

Given the controversy, perhaps it’s time for the Turing Test test, which can only be passed by successfully convincing 30 percent of a panel of computer scientists that your test really was a Turing Test.