Thursday, September 27, 2007

Which movies did 305344 fail to rate?

Originally posted to a previous version of this blog 27 April 2007.

Customer 305344 seems to rate every movie, whether that customer is a person or a program. I expected the 117 movies this customer did not rate to have few raters and an earliest rating date close to the cutoff date for the data. That would be consistent with a rating program of some sort that scores the entire database periodically. This did not prove to be the case. The list of movies customer 305344 failed to rate includes Doctor Zhivago, Citizen Kane and A Charlie Brown Christmas.

Unlike most of the recent questions, this one cannot be looked up in the rater signature or the movie signature because this information has been summarized away. Instead I used a query on the original training data that has all the rating transactions. Later, I looked up the earliest rating date for each movie not rated by the alpha movie geek to test my hypothesis that they would be movies only recently made available for rating.

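-- Movies with no rating from customer 305344.
-- sum(custid = 305344) counts that customer's ratings per movie; it relies on
-- the comparison evaluating to 1 or 0, as it does in MySQL.
select t.movid
from (select r.movid as movid, sum(r.custid = 305344) as geek
      from netflix.train r
      group by r.movid) t
where t.geek = 0;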
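
The follow-up lookup of rater counts and earliest rating dates could be folded into the same pass over the training data. What follows is only a sketch, not the query I actually ran: the column name ratingdate is an assumption (the rating date column is not named anywhere above), and count(*) equals the number of raters only if each customer rates a movie at most once.

```sql
-- Sketch: for each movie customer 305344 never rated, report how many
-- ratings it has and when it was first rated.
-- ASSUMPTION: ratingdate is a hypothetical name for the rating date column.
select r.movid,
       count(*)          as raters,
       min(r.ratingdate) as first_rated
from netflix.train r
group by r.movid
-- Keep only movies with zero rows from customer 305344
-- (relies on a comparison evaluating to 1 or 0, as in MySQL).
having sum(r.custid = 305344) = 0
order by raters desc;
```

Sorting by raters puts the most rated of the missing movies first, which is the order used in the list below.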

The most rated movies not rated by the alpha rater geek

Movie                        Ratings  Earliest rating
Mystic River                 143,682  2003-09-20
The Notebook                 115,990  2004-05-19
The Aviator                  108,354  2004-11-30
Million Dollar Baby          102,861  2004-11-16
Hotel Rwanda                  92,345  2004-12-09
The Hunt for Red October      83,249  1999-12-17
12 Monkeys                    76,475  1999-12-30
Citizen Kane                  61,758  2001-03-17
The Saint                     28,448  2000-01-05
Doctor Zhivago                17,785  2000-01-12
The Grapes of Wrath           16,392  2001-03-18
The Pledge                    10,969  2001-01-21
A Charlie Brown Christmas      7,546  2000-08-03
The Tailor of Panama           7,421  2001-03-28
The Best Years of Our Lives    7,031  2000-01-06
