In my opinion, that’s the wrong measure of success. Netflix selected for algorithms that predicted well across all data, penalizing large misses extra. But that’s not what makes a recommendation algorithm good.
The best algorithm, I think, should observe my tastes and recommend just one product that I’ve never heard of (or at least never tried), that I absolutely love. It’s OK if I like a movie and you show me another one by the same director — but I could have done that myself. The best algorithm would say:
You like Cowboy Bebop + Out Of Africa + Winged Migration so you will like = Seven Samurai.
Cowboy Bebop indicates that I like Asian sh*t; Out Of Africa is an old classic; Winged Migration doesn’t have a lot of talking. Put them together and you get an Asian classic without a lot of talking.
That’s just an example of a recommendation that would fit my criteria of goodness.
In other words,
- only the "most recommended" movie matters
- it should blow me away
- it should be surprising.
RMSE fails the first criterion: it weights accuracy on the top recommendation exactly as much as accuracy on every other prediction, when the top recommendation is the only one that matters.
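To make that concrete, here is a toy sketch. The ratings are invented for illustration, and the function names `rmse` and `top_pick_rating` are hypothetical. None of this is Netflix's data or scoring code. It compares a cautious model that hugs the average against a bold one that nails the single best movie but misses elsewhere.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error over every prediction, all weighted equally."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def top_pick_rating(predicted, actual):
    """True rating of the one movie the model would recommend first."""
    best = max(range(len(predicted)), key=lambda i: predicted[i])
    return actual[best]

# Made-up true ratings for five unseen movies (one user, 1-5 stars).
actual = [5.0, 3.0, 3.0, 3.0, 2.0]

# Hypothetical model outputs, invented for this example.
cautious = [3.6, 3.7, 3.0, 3.0, 2.0]  # close everywhere, wrong about the best movie
bold = [5.0, 1.5, 4.5, 1.5, 3.5]      # right about the best movie, sloppy elsewhere

for name, preds in [("cautious", cautious), ("bold", bold)]:
    print(name, round(rmse(preds, actual), 3), top_pick_rating(preds, actual))
```

With these numbers the cautious model wins on RMSE, yet its top recommendation is a three-star movie, while the bold model's top recommendation is the five-star one. A contest scored on RMSE alone would pick the cautious model.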
As a result, today’s recommendation engines are conservative in the wrong ways and basically hack together machine learning fads.