


Max Lin

Can the user-seen pair setup, which does not suffer from the missing data problem, be treated as a special case of the user-rating pair, with seen movies given a rating of 1 and unseen movies a rating of 0?


Max: In terms of running the algorithm, you can do as you said. What you'll notice is that the matrix is now complete as opposed to very sparse in the case of ratings.
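To make the complete-versus-sparse contrast concrete, here is a minimal sketch in Python. The toy ratings and the helper names `to_seen_matrix` and `missing_fraction` are hypothetical, and the sketch assumes that a user has seen a movie exactly when a rating exists for it, which is an assumption of this example and not something stated in the thread.

```python
# Hypothetical toy data: rows are users, columns are movies.
# None marks a movie the user has not rated (a missing entry).
ratings = [
    [5, None, 3, None],
    [None, 4, None, None],
    [1, None, None, 2],
]


def to_seen_matrix(ratings):
    """Recode ratings as seen/unseen: 1 where a rating exists, 0 otherwise.

    Assumption for illustration only: "rated" is used as a stand-in for "seen".
    """
    return [[0 if r is None else 1 for r in row] for row in ratings]


def missing_fraction(matrix):
    """Return the share of entries that are None (unobserved)."""
    cells = [value for row in matrix for value in row]
    return sum(value is None for value in cells) / len(cells)


seen = to_seen_matrix(ratings)

print(seen)
print("missing in ratings:", missing_fraction(ratings))
print("missing in seen:", missing_fraction(seen))
```

In this toy case most of the ratings matrix is unobserved, while every cell of the seen matrix holds a 0 or a 1, so an algorithm run on it never meets a missing value.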


But doesn't the user-seen pair also suffer from not distinguishing between a user who was not aware the movie existed and a user who chose not to watch it?


Chris: Awareness is a different concept. The proposed model is based on actual watching; users who are aware of a movie but don't watch it are grouped with those who are not aware of it. You can of course build a more complicated model if you so desire.

Kaiser Fung. Business analytics and data visualization expert. Author and Speaker.