Topic: Netflix contest: $1M to developer of best movie recommendation engine  (Read 32474 times)

KenR

  • Super
  • Blogger
  • Joined in 2006
  • ***
  • Posts: 826
Netflix is holding a $1 million contest, with the prize going to the coder who can best improve their movie recommendation engine.



from http://oreilly.com/
Kenneth P. Reeder, Ph.D.
Clinical Psychologist
Jacksonville, North Carolina  28546
« Last Edit: October 02, 2006, 01:21 PM by KenR »

Gerome

  • Charter Honorary Member
  • Joined in 2006
  • ***
  • Posts: 154
Hi,

Bah... I'd prefer 1 million Euros :)
Yours,
(¯`·._.·[Gerome GUILLEMIN]·._.·´¯)
http://www.fbsl.net [FBSL Author]
http://gedd123.free.fr/FBSLv3.zip [FBSL Help file]
(¯`·._.·[If you need help... just ask]·._.·´¯)

KenR

Ok Gerome,
You want to try to convince us that is NOT complaining?
Ken

brotherS

  • Master of Good Ideas
  • Honorary Member
  • Joined in 2005
  • **
  • Posts: 2,260
Quote from: KenR
Netflix is holding a $1 million contest, with the prize going to the coder who can best improve their movie recommendation engine.

What a great idea! I've read articles in the past suggesting that Netflix is innovative (I'm neither a customer nor affiliated with them), and this proves it. By running the contest until "at least October 2, 2011" they show they really plan to spend the million to gain a 10% performance improvement. I also like the idea of their "Progress Prizes: $50,000 (USD) Cash each award", handed out once a year to whoever improves the performance the most without reaching the 10% hurdle.

@Gerome: we'd still accept you if you'd show less negativity, really!

Gerome

Hi,

Quote from: brotherS
@Gerome: we'd still accept you if you'd show less negativity, really!

What negativity?
Because 1 million euros is > 1 million USD?
ROFL
Yours,

allen

  • Charter Member
  • Joined in 2006
  • ***
  • Posts: 1,206
Quote from: Gerome
Hi,

Quote from: brotherS
@Gerome: we'd still accept you if you'd show less negativity, really!

What negativity?
Because 1 million euros is > 1 million USD?
ROFL

But perhaps the point, Gerome, is that 1 million USD > nothing
 -- and as an American company, USD would make sense as the prize, no?

Gerome

Hello,

Quote from: allen
But perhaps the point, Gerome, is that 1 million USD > nothing
 -- and as an American company, USD would make sense as the prize, no?

OK Allen et al., I see my jokes are just being badly interpreted... I'm sorry for you... :)
BTW, this offer seems nice.
Yours,

brotherS

Quote from: Gerome
I'm sorry for you...

And here you go again...

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,896
interesting blog criticizing the approach:
http://www.feedblog....netflix_ranking.html

Quote:
If you assume they can get 4x compression out of this dataset you're still looking at 50G which you'll probably have to query at runtime. This probably means a memory based cluster and to do that you'll need at least $15-20k at a minimum to get started.

What Netflix should do is allow teams to pitch them their proposals and fund their research with a winning prize of $1M.


JavaJones

  • Review 2.0 Designer
  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 2,739
I'm not a coder but it seems to me you wouldn't necessarily need to use the whole data set to prototype and see if your results net reasonable gains. I presume they are supporting this to a reasonable degree by providing a good amount of info about their current system (if not source code). If that's the case it should be reasonable to extrapolate results for your test data set size/range and work against that. If you can show significant potential in a new approach I would think they will work with you to scale it up and see if it's really practical. It only makes sense from their end to do so.

- Oshyan

mouser

Alex (www.3form.org) has convinced me to join with him in working on this a little since both of us have backgrounds in machine learning.

I have my doubts about the possibilities of winning this but it seems an interesting challenge and worth a little bit of time playing with it.  I'm happy to talk with anyone in the donationcoder irc channel (#donationcoder on efnet, or hit the chat button above) who thinks they might also want to try entering.

urlwolf

  • Charter Member
  • Joined in 2006
  • ***
  • Posts: 1,837
all,

Do you know of any webservice where users log which movies they have actually seen? Something like last.fm for music, but using movies instead...

Actually, same thing for books read would be great too..
@mouser: I sent you an email, cannot reach the #IRC channel for some reason...

CWuestefeld

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 1,009
I made a stab at this late last year with a friend. The amount of data isn't a big deal for the storage capacity of a modern PC. More difficult is being able to process it in a reasonable time.

We started with a bit of success. Our very first attempt was enough to beat Netflix's baseline and yield a score that qualified for the leaderboard (at the time). Doing this was surprisingly easy. Nothing but a good working knowledge of statistics was needed to beat their baseline. Simply normalizing the scores of all the users according to their standard deviations was enough to do this.
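For anyone wondering what "normalizing by standard deviation" might look like in practice, here is a minimal sketch. It is only a guess at the general idea, not the code described above: the function names, the tiny sample data, and the choice to average z-scores per movie are all assumptions made for illustration, as is the 1 to 5 star range used for clamping.

```python
from collections import defaultdict
from statistics import mean, pstdev


def fit_user_stats(ratings):
    """Return {user: (mean, std)} from (user, movie, score) triples."""
    by_user = defaultdict(list)
    for user, _movie, score in ratings:
        by_user[user].append(score)
    stats = {}
    for user, scores in by_user.items():
        sd = pstdev(scores)
        # A user who always gives the same score has std 0; fall back to 1.0
        # so the division in fit_movie_z stays defined.
        stats[user] = (mean(scores), sd if sd > 0 else 1.0)
    return stats


def fit_movie_z(ratings, stats):
    """Average each movie's ratings after converting them to per-user z-scores."""
    by_movie = defaultdict(list)
    for user, movie, score in ratings:
        mu, sd = stats[user]
        by_movie[movie].append((score - mu) / sd)
    return {movie: mean(zs) for movie, zs in by_movie.items()}


def predict(user, movie, stats, movie_z):
    """Map the movie's average z-score back onto this user's own scale."""
    mu, sd = stats[user]
    z = movie_z.get(movie, 0.0)  # unseen movie: fall back to the user's mean
    return min(5.0, max(1.0, mu + sd * z))  # clamp to an assumed 1-5 star range


# Hypothetical sample data, not taken from the contest set.
ratings = [
    ("ann", "A", 5), ("ann", "B", 3),
    ("bob", "A", 4), ("bob", "B", 2),
    ("cat", "B", 1), ("cat", "C", 3),
]
stats = fit_user_stats(ratings)
movie_z = fit_movie_z(ratings, stats)
print(predict("cat", "A", stats, movie_z))
```

The point of the normalization is that a harsh rater's 3 and a generous rater's 5 can mean the same thing; working in z-scores puts both on a common scale before averaging, and the last step converts back to each user's own habits.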

Our project fizzled out after just a couple of weeks. The problem was that neither of us has an understanding of the collaborative filtering techniques that appear to be necessary to do really well. And experimentation was difficult. The size of the data set is such that we could only really make one run a day (we were using MS SQL Server, if you're wondering).

Also, I was a little frustrated because I think that their criteria measure the wrong thing. At first it was an intellectual challenge anyway, but that wore out. The challenge is to predict the score that a customer would assign for each of a set of movies. The thing is, this isn't really an interesting question to solve. I don't care very much if Netflix thinks that I'd give this movie a 3.1 or 3.5. What's really interesting is only selecting a list of movies at the top end of the scale, so Netflix can give me a list of recommendations when I ask "what good movies have you got for me today?".
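To make the "wrong thing" complaint concrete: the contest scored entries by root mean squared error (RMSE) over predicted ratings, which is not the same as getting the top of the list right. The sketch below uses made-up numbers and hypothetical helper names (`rmse`, `top_n`) to show a model with a lower RMSE that nevertheless recommends the wrong movie first.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error over the movies present in `actual`."""
    squared = [(predicted[m] - actual[m]) ** 2 for m in actual]
    return math.sqrt(sum(squared) / len(squared))


def top_n(scores, n):
    """Return the n movie ids with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Invented ratings for one customer and two invented models.
actual = {"A": 5, "B": 4, "C": 3, "D": 2}
model_x = {"A": 4.2, "B": 4.1, "C": 3.0, "D": 2.0}  # right order, larger error
model_z = {"A": 4.4, "B": 4.5, "C": 3.0, "D": 2.0}  # smaller error, wrong order

for name, model in (("x", model_x), ("z", model_z)):
    print(name, round(rmse(model, actual), 3), top_n(model, 1))
```

Here model_z wins on RMSE but puts movie B ahead of the customer's actual favorite, A, so a metric aimed at the recommendation list itself would rank the two models the other way around.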

mouser

great to hear you worked on it!!

my main source of skepticism is a suspicion/fear that there is some inherent noise in the data, and that winning the contest may in fact require being better than the noise would allow without pure luck.  in other words, a person's rating for any given movie may vary a little bit depending on their mood at the time, or based on some truly unpredictable factors.  for example, imagine i ask you to predict the votes of 100 people and i happen to know that 95 of them always vote one way, but the other 5 flip a coin to decide which way to vote.  now you can't hope to get a perfect score because of the element of true randomness (coin flip).  At some point, trying to get better than a certain score on the netflix challenge data is going to look like that -- though whether that happens well before the million dollar prize accuracy i don't know, but that's my fear.
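The coin-flip example can be checked with a few lines of code. This is only a sketch of the thought experiment above, with hypothetical function names; it says nothing about how much noise the real contest data contains.

```python
import random


def best_expected_accuracy(n_predictable, n_random):
    """Best achievable expected fraction of correct guesses.

    Predictable voters can always be guessed right; each coin-flipper is
    guessed right half the time no matter what strategy is used.
    """
    total = n_predictable + n_random
    return (n_predictable + 0.5 * n_random) / total


def simulate(n_predictable=95, n_random=5, trials=10_000, seed=0):
    """Monte Carlo version: always guess 'heads' for the coin-flippers."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        correct += n_predictable  # these guesses are always right
        correct += sum(rng.random() < 0.5 for _ in range(n_random))
    return correct / (trials * (n_predictable + n_random))


print(best_expected_accuracy(95, 5))
print(simulate())
```

The analytic ceiling is (95 + 0.5 * 5) / 100 = 97.5% on average, and the simulation should land very close to that. Any single run that scores higher did so by luck, which is exactly the worry if the prize threshold sits above the equivalent ceiling for the ratings data.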

mouser

the forum makes for great reading: http://www.netflixprize.com/community/

mouser

Re: Netflix contest: $1M to developer of best movie recommendation engine
« Reply #15 on: September 12, 2007, 07:58 PM »
about 20 days until first round closes..
i'm playing around with some code and hope to have it done in time to submit.

tinjaw

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 1,927
Re: Netflix contest: $1M to developer of best movie recommendation engine
« Reply #16 on: September 12, 2007, 08:01 PM »
Go mouser. Give us an update on your scores. I remember you talking about an ongoing scoring mechanism during the contest.

mouser

After 2 years it looks like there is going to be a grand winner.. amazing.

http://tech.slashdot...y-Have-Been-Achieved

The nice thing will be reading about the final algorithms.. we've actually known most of the details of it since the authors are academics and published papers on their algorithms.. But it will be interesting to hear how they squeezed out the last drops of performance.

mouser

The contest is now closed and the winner chosen, and WOW it seems like a surprising ending to a surprising contest..

By A Nose: Netflix Prize Leaders One-Upped With One Day Remaining:


Homepage of the winning group:

wow. just wow.
« Last Edit: July 27, 2009, 01:27 AM by mouser »

TheQwerty

  • Supporting Member
  • Joined in 2007
  • **
  • Posts: 84
Note that no winner has been chosen yet.
Netflix is going to evaluate the entries and also take performance into consideration before they select one or re-open the contest.

mouser

my understanding is that they were only supposed to verify that the code works, not decide based on efficiency, etc. and that they could not "re-open" the contest unless there was some violation of the rules.

do you have any more info about this and links?

mouser

I've read some of the previous papers published by both teams that broke the 10% barrier and it will be fascinating to read the final papers.

TheQwerty

Quote from: mouser
my understanding is that they were only supposed to verify that the code works, not decide based on efficiency, etc. and that they could not "re-open" the contest unless there was some violation of the rules.

do you have any more info about this and links?

Well for links, just look at what you posted, none of them have an official statement that Ensemble won. :P  And the official Netflix blog says that they'll begin verifying: http://blog.netflix....n-closes-but-no.html

Quote:
Qualified entries will be evaluated as described in the Rules. We look forward to awarding the Grand Prize, which we expect to announce in a few weeks. However if a Grand Prize cannot be awarded because no submission can be verified by the judges, the Contest will reopen. We will make an announcement on the Forum after the Contest judges reach a decision.

So maybe not taking performance into consideration, but there's also ...

Quote:
Thanks. In fact, this is a very happy day for us - our team is top contender for winning the Grand Prize, as we have a better Test score than The Ensemble. (Probably this is the first post revealing this in the forum :) )

Which led me to believe that the leaderboards are not calculated on the full/exact data set that the winning algorithm will be tested on.
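A toy illustration of how that could happen, assuming the public leaderboard is scored on one held-out subset (called "quiz" here) and the prize on a different one (called "test"). The numbers, team labels, and the `rmse` helper are all invented for the example.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Six hidden ratings: the first three play the role of the leaderboard
# ("quiz") subset, the last three the prize-deciding ("test") subset.
actual = [4, 3, 5, 2, 4, 1]
team_a = [4.1, 3.1, 4.9, 2.5, 3.5, 1.5]
team_b = [4.3, 2.7, 4.7, 2.2, 3.8, 1.2]

quiz, test = slice(0, 3), slice(3, 6)
for name, preds in (("team_a", team_a), ("team_b", team_b)):
    print(name,
          "quiz:", round(rmse(preds[quiz], actual[quiz]), 3),
          "test:", round(rmse(preds[test], actual[test]), 3))
```

In this made-up case team_a leads on the quiz subset while team_b is better on the test subset, so the visible leaderboard order and the final result disagree. With real scores only a hair apart, a flip like this would not take much.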

Don't really know.. but I wouldn't get ahead of yourself congratulating The Ensemble until Netflix has made an announcement. Certainly not with results as close as these.

mouser

Quote from: TheQwerty
Which led me to believe that the leaderboards are not calculated on the full/exact data set that the winning algorithm will be tested on.

ah that's an interesting point.. i'm trying to remember now what the rules said about that.
it does seem like with scores this close, it's a bit silly to say that one of these 2 teams won, and the other "lost."

« Last Edit: July 27, 2009, 04:58 PM by mouser »