walkerkq's comments

walkerkq | 10 years ago | on: Text Mining South Park

Hi, I'm the author. I appreciate the time you've taken to read my work and provide constructive criticism. Here's my full write-up (on GitHub, so it should continue to work): https://github.com/walkerkq/textmining_southpark/blob/master...

I was working under the assumption that we do not know ALL the words since the show's been renewed through 2019. This covers the first 18 seasons.

Additionally, simply counting up each character's most frequent words produced results with very little semantic meaning - things like "just" and "dont" - which can be seen in this (really boring) wordcloud: https://github.com/walkerkq/textmining_southpark/blob/master...

Looking at the log likelihood of each word for each speaker produced results that were much more intuitive and carried more meaning. As ppod said below: I think the idea is that what we are really trying to measure is something unobservable, like the underlying nature of the character or the writers' tendencies to give characters certain ways of speaking.
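
For anyone who wants to see the difference between raw counts and log likelihood, here is a minimal sketch in Python. It uses one common formulation (the two-term G² statistic that compares a speaker's word count against everyone else's), which is an assumption for illustration and not necessarily the exact computation in the write-up. The function names and the toy token lists are hypothetical and are not taken from the actual transcripts.

```python
import math
from collections import Counter


def signed_g2(a, b, c, d):
    """Signed log-likelihood (G^2) score for one word.

    a: count of the word in the speaker's lines
    b: count of the word in all other speakers' lines
    c: total words spoken by the speaker
    d: total words spoken by everyone else
    Positive means the speaker overuses the word; negative means underuse.
    """
    # Expected counts if the word were used at the same rate by both groups.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    g2 *= 2
    # Compare rates a/c and b/d by cross-multiplying to avoid dividing by zero.
    return g2 if a * d >= b * c else -g2


def characteristic_words(tokens_by_speaker, speaker, top_n=5):
    """Rank a speaker's words by signed G^2 against all other speakers."""
    own = Counter(tokens_by_speaker[speaker])
    rest = Counter()
    for name, tokens in tokens_by_speaker.items():
        if name != speaker:
            rest.update(tokens)
    c = sum(own.values())
    d = sum(rest.values())
    scores = {w: signed_g2(own[w], rest[w], c, d) for w in own}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical toy data, made up for this example.
toy = {
    "A": "just dont respect my authority just respect".split(),
    "B": "just dont know oh my god just".split(),
}

print(Counter(toy["A"]).most_common(1)[0][0])  # a filler word ties for the top raw count
print(characteristic_words(toy, "A", top_n=2))  # ['respect', 'authority']
```

In the toy data "just" is used at the same rate by both speakers, so its score is zero even though it is among the most frequent words, while "respect" rises to the top because only speaker A says it.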
