A team of seven former rivals has won a three-year contest to improve Netflix’s recommendation system. They pipped another team to the million-dollar prize by just twenty minutes.
The contest involved the system the movie rental firm uses to predict which films a particular customer might like, based on how they’ve rated previous titles. As you’d imagine, there are simple ways to do this: a Rocky fan would likely enjoy Rocky II, while someone who gave a one-star rating (the lowest on Netflix’s five-star scale) to Scarface, The Godfather, and Casino probably isn’t going to go crazy about Goodfellas.
The existing Netflix system, Cinematch, extended this simple approach by analyzing customer data to find more complex patterns: people who hated Rocky probably wouldn’t like Rocky IV, but fans of the original could be split in their attitude to it, so you’d need to look at their attitudes to cheesy 80s action flicks as well as sports movies.
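To make the idea concrete, here is a minimal sketch of one common way to predict a rating from other customers’ ratings: weight each customer’s opinion by how closely their past ratings match yours. This is not Cinematch’s actual algorithm, which the article does not describe. The customer names, the ratings and the `similarity` and `predict` functions are all invented for illustration.

```python
# Hypothetical toy data (1-5 stars): customer -> {title: stars}.
ratings = {
    "ann": {"Rocky": 5, "Rocky II": 5, "Rocky IV": 4, "Commando": 5},
    "bob": {"Rocky": 5, "Rocky II": 4, "Rocky IV": 2, "Commando": 1},
    "cat": {"Rocky": 1, "Rocky II": 1, "Rocky IV": 1, "Commando": 2},
    "dan": {"Rocky": 5, "Rocky II": 5, "Commando": 5},  # has not rated Rocky IV
}


def similarity(a, b):
    """Return a weight in (0, 1]: 1 means identical ratings on shared titles.

    Uses 1 / (1 + mean absolute difference), an illustrative choice only.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_diff = sum(abs(a[t] - b[t]) for t in shared) / len(shared)
    return 1.0 / (1.0 + mean_diff)


def predict(customer, title):
    """Similarity-weighted average of other customers' ratings for a title."""
    mine = ratings[customer]
    weighted_sum = 0.0
    weight_total = 0.0
    for other, theirs in ratings.items():
        if other == customer or title not in theirs:
            continue
        weight = similarity(mine, theirs)
        weighted_sum += weight * theirs[title]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None


print(round(predict("dan", "Rocky IV"), 2))  # 3.12
```

In this toy data, Dan loves both the Rocky films and a cheesy 80s action flick, just as Ann does, so her four-star rating counts for most. The prediction comes out at about 3.1 stars, against a plain average of about 2.3 across everyone who rated Rocky IV.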
With the average customer rating 200 movies, Cinematch worked very well, but not perfectly. It was particularly thrown by offbeat titles, most famously Napoleon Dynamite, where there didn’t seem to be any way of predicting how much a viewer would like it from their previous responses.
Netflix then launched the prize contest with the target of beating Cinematch’s accuracy by 10 per cent. It has now finally been won by a team that includes computer engineers from Austria, Canada, Israel and the United States, whose members had previously been competing for the prize separately.
The team, BellKor’s Pragmatic Chaos, broke the 10 per cent mark in late June, at which point rivals had 30 days to beat their score. Another team matched their final score but submitted it 20 minutes later, so with both sides on a 10.06 per cent improvement, BellKor took the million-dollar prize.
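For readers wondering how a “10.06 per cent improvement” is worked out: the contest scored entries by root mean squared error (RMSE) between predicted and actual star ratings, and the improvement is the percentage drop in RMSE against Cinematch. The sketch below shows that arithmetic. The function names and toy ratings are invented for illustration, and the RMSE figures of 0.9525 and 0.8567 are the commonly reported test-set scores for Cinematch and the winning entry, not numbers taken from this article.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared_errors) / len(squared_errors))


def improvement_pct(baseline_rmse, new_rmse):
    """Percentage reduction in RMSE relative to the baseline system."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Hypothetical toy data: true ratings and two systems' predictions.
actual = [4, 1, 5, 3]
baseline = [3.0, 2.0, 4.0, 3.0]
improved = [3.5, 1.5, 4.5, 3.0]

toy_gain = improvement_pct(rmse(baseline, actual), rmse(improved, actual))
print(round(toy_gain, 2))  # 50.0

# Commonly reported contest figures (an assumption, see the text above).
print(round(improvement_pct(0.9525, 0.8567), 2))  # 10.06
```

Because RMSE squares each error before averaging, a few badly wrong predictions hurt the score more than many slightly wrong ones, which is one reason hard-to-predict titles mattered so much.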
The second-place team, The Ensemble, which includes staff from data analytics firm Opera Solutions, says it’s not too upset about missing the prize, as it estimates the value of its own findings at $10 million.
And Netflix certainly doesn’t mind having to pay out the prize, likening the input of more than 100 contestants to getting PhD expertise for a dollar an hour. It has already built some of the contestants’ suggestions into its system and says customer retention rates have improved.
The firm is now running a second contest, set to last eighteen months. Researchers will be challenged to produce a recommendation system that takes account not only of previous movie ratings but also of age, gender and zip code.
(Picture, courtesy of Chris Hefele, shows extract from a visualization of Ensemble’s database of movie relationships.)