Comparing taste in films using pairwise vector comparisons

37 points | johnb | 14 years ago | blog.goodfil.ms

16 comments

[+] baddox|14 years ago|reply
I'd be interested in reading a more general blog article about their theory behind using "quality" and "rewatchability" as their key user ratings. It sounds reasonable at first, but when I think more deeply about it, I wonder what "quality" is supposed to be interpreted as. Is it "how much I enjoyed the first viewing of the film," something more specific like "how skillful was the camera work" or "how good was the acting," or something more meta like "how good I think critics or movie buffs would think the film is?"

I've gone through stages of armchair film criticism, so I've thought about personal ratings a lot. I even drafted a web app to track my viewings and watchlist, and the rating idea I've liked the most is a stupidly simple boolean rating. You could call it almost anything: "Like/Dislike," "Good/Bad," "Enjoyed/Didn't Enjoy," or even something a bit different like "I'm glad I watched it/I wish I hadn't watched it."

[+] geelen|14 years ago|reply
Check out this post on the topic: http://blog.goodfil.ms/blog/2011/10/07/a-better-way-to-rate-...

'Quality' is intended to be a more objective score of the craft of the film. Quality of writing, directing, acting; originality of the idea; how influential it is.

'Rewatchability' is where your enjoyment gets factored in. We think it's important to consider 'watching it again' rather than enjoyment the first time because it separates out films better. For example, the film Avatar is quite enjoyable the first time round, but IMO not particularly worth rewatching.

Btw, I'm the author of both posts :)

[+] Locke1689|14 years ago|reply
How is this better than a normalized cosine similarity? The vector can be arbitrary; in this case it would hold each user's normalized quality and rewatchability scores.

Cosine similarity would also let you express pairwise similarity as a single normalized value, instead of a 9-way comparison.
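
For reference, a minimal sketch of the cosine similarity suggested above. It assumes each user's ratings are flattened into one list of (quality, rewatchability) numbers over the same films in the same order; the function name and the sample numbers are hypothetical and not taken from the goodfil.ms code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: treat an all-zero vector as "no similarity".
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical data: quality and rewatchability for three films,
# flattened as [q1, r1, q2, r2, q3, r3].
alice = [4, 2, 5, 5, 3, 1]
bob = [4, 3, 5, 4, 2, 1]
similarity = cosine_similarity(alice, bob)
```

The result is a single number per pair of users, which is the property the comment points out.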

[+] ileitch|14 years ago|reply
Interesting read. Vector Victor sounds like a linear correlation algorithm. I wonder what coefficient they're using under the hood...

[+] geelen|14 years ago|reply
It's not actually linear correlation, since we effectively normalise the pairwise scores to [-1,0,1],[-1,0,1] (nine possible combos). We're exploring blending in a few other signals along the way, but we wanted to see how far we could get by discretising the pairwise comparisons in this way.
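
One way to read the discretisation described here, as a rough sketch: for every pair of films a user has rated, keep only the sign of the difference in quality and the sign of the difference in rewatchability. That reading, the function names, and the sample ratings are assumptions for illustration, not the actual goodfil.ms implementation.

```python
from itertools import combinations

def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def pairwise_vectors(ratings):
    """Map each film pair to a (quality, rewatchability) sign vector.

    ratings: dict of film title -> (quality, rewatchability).
    Each value in the result is one of the nine combos in {-1, 0, 1}^2.
    """
    vectors = {}
    # Sort titles so both users produce pairs in the same orientation.
    for a, b in combinations(sorted(ratings), 2):
        qa, ra = ratings[a]
        qb, rb = ratings[b]
        vectors[(a, b)] = (sign(qa - qb), sign(ra - rb))
    return vectors

# Hypothetical ratings for one user.
user = {"Avatar": (3, 2), "Alien": (5, 5), "Heat": (5, 3)}
vectors = pairwise_vectors(user)
```

Under this reading, two users "match" on a pair of films when they produce the same sign vector for it.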

Once we've collapsed all pairs down to a Vector Victor, we treat matching Vector Victors as a thumbs up and non-matching ones as a thumbs down, take the square root of both counts, then take the lower bound of the Wilson interval as our ranking function.
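
A sketch of that ranking step, using the standard Wilson score interval for a binomial proportion. The 95% z value of 1.96, the function names, and the choice to return 0 when there is no data are assumptions; the comment does not say which confidence level is used.

```python
import math

def wilson_lower_bound(ups, downs, z=1.96):
    """Lower bound of the Wilson score interval for ups / (ups + downs)."""
    n = ups + downs
    if n == 0:
        return 0.0  # Assumption: no evidence ranks lowest.
    p = ups / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom

def taste_score(matches, mismatches, z=1.96):
    """Square-root both counts, then take the Wilson lower bound."""
    return wilson_lower_bound(math.sqrt(matches), math.sqrt(mismatches), z)

# Hypothetical counts of matching and non-matching pair vectors.
score = taste_score(matches=90, mismatches=10)
```

The square root dampens large counts, so a user pair with thousands of shared films does not swamp one with a few dozen, while the lower bound still penalises pairs with very little overlap.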

More questions? Shoot!

[+] chrisberkhout|14 years ago|reply
It's cool they're taking a new approach. I wonder if there aren't scientific papers on this stuff.

[+] johnb|14 years ago|reply
I wouldn't be surprised if there were. If anyone knows of any, please share here.

Glen is bringing a lot of what he learned doing transport modeling and advertising analysis to the project - but we don't have any reference material specific to what we're doing now.