A computer program at a US university has looked at three million images in an attempt to learn more about the world. It’s not the facts themselves that are important, but rather what the project can teach us about the possibilities and limits of computer learning.
The program is called Never Ending Image Learner (NEIL), a name that reflects the plan to leave it running indefinitely. It tackles the problem that, for all their speed, computers lack the human ability to recognize images quickly, such as the way we can recognize a friend in the street even if they’ve got new clothes, changed their hairstyle or picked up a tan.
The NEIL project, which runs at Carnegie Mellon University, is funded by Google and the Department of Defense. It’s less about recognizing individual images than about using multiple images to spot relationships between the components of images and, in turn, to infer facts.
For example, having seen pictures labeled as being of a famous leaning tower, and having seen similar pictures labeled as being in a particular Italian city, NEIL has successfully grasped that the Leaning Tower is indeed in Pisa. It’s also figured out that scenes described as “urban” are usually in a city.
It’s also been able to work on categorization, for example deducing that a wheel is a part of a car, while an Airbus 330 is a type of airplane.
The program works by looking at pictures that have similar labels, then looking for common visual information. For example, it’s isolated which part of a picture is a microphone. In turn, it has correctly identified that you’ll often find a part resembling a microphone in a headset.
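The idea of grouping pictures by label and then looking for shared visual information can be illustrated with a toy sketch. This is not NEIL's actual code: the `images` data, the part names and the `min_share` threshold are all made up for illustration, and it assumes some earlier step has already reduced each picture to a set of detected parts.

```python
from collections import Counter

# Hypothetical toy data: each picture has a label and the set of visual
# parts an (assumed) detector found in it.
images = [
    {"label": "headset", "parts": {"microphone", "earcup"}},
    {"label": "headset", "parts": {"microphone", "earcup", "cable"}},
    {"label": "headset", "parts": {"earcup"}},
    {"label": "car", "parts": {"wheel", "windshield"}},
    {"label": "car", "parts": {"wheel"}},
]

def part_share_by_label(images):
    """For each label, return the share of its pictures that contain each part."""
    totals = Counter()
    hits = {}
    for img in images:
        totals[img["label"]] += 1
        hits.setdefault(img["label"], Counter()).update(img["parts"])
    return {
        label: {part: count / totals[label] for part, count in parts.items()}
        for label, parts in hits.items()
    }

def common_parts(images, min_share=0.6):
    """Keep only the parts that show up in at least min_share of a label's pictures."""
    shares = part_share_by_label(images)
    return {
        label: sorted(part for part, share in parts.items() if share >= min_share)
        for label, parts in shares.items()
    }

print(common_parts(images))
```

In this toy data the microphone appears in two of the three headset pictures, so it clears the threshold and is kept as a likely part of a headset, while the cable, seen only once, is dropped.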
Some of the relationships are logical enough, while others are more questionable, though it’s possible they simply reflect connections humans have never thought about. For example, NEIL associates the attribute of being chubby with call centers, the city of Medina, opticians and people on a witness stand.
Some of the limitations are easy to understand. NEIL is aware that pink can be a color, but primarily associates it with being a singer. That’s likely because the pictures come from sources such as Google Image Search, which assumes that’s the link most searchers are making and thus puts shots of the singer higher in the rankings. The inherent limitation here is that the more fundamental association (pink is a color) is downgraded because it’s so fundamental that people don’t search for it as much.
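A small sketch shows how ranking alone could produce this skew. The result list below is invented, and the assumption that a learner only sees the top few results is ours rather than a documented detail of NEIL.

```python
from collections import Counter

# Hypothetical search results for the query "pink", in ranked order.
# Each entry records which sense of the word the picture shows.
ranked_results = [
    "singer", "singer", "singer", "color", "singer",
    "color", "color", "color", "color", "color",
]

def dominant_sense(results, top_k):
    """Return the most common sense among the top_k ranked results."""
    counts = Counter(results[:top_k])
    return counts.most_common(1)[0][0]

print(dominant_sense(ranked_results, top_k=5))
print(dominant_sense(ranked_results, top_k=10))
```

Here the color sense makes up most of the full list, yet a learner that only samples the top five results would conclude that pink is mainly a singer.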