Berkeley, sadly, is perhaps too large and diverse for an overall characterization.
MIT and Stanford like Hip-Hop Music. I'm pretty sure MIT prefers Biggie and Stanford prefers Tupac (note: this has nothing to do with east/west coast, but rather pure skill vs. charisma)
Interesting, but wrong use of conditional probabilities. All OP is saying is that the frequency of people from school x following topic y is p.
Surprisingly diverse interests, considering the huge bias in the dataset (public Quora profiles). Well worth reading, though I was left wondering about the inverse probabilities.
dm8 | 13 years ago
So true. As a Cal alum, I think he is spot on here. :-)
I pulled about 400 followers from each school, and added a couple filters, to try to ensure that followers were actual attendees of the schools rather than general people simply interested in them
How did he ensure programmatically that followers were actual attendees of the schools? It would be really hard to find out this type of information, and it could be considered borderline creepy in some cases.
EDIT: He also works as a data scientist at Twitter, so I'm sure he has access to a lot more internal data, rather than some sort of mashup between the Twitter, FB, and LinkedIn APIs.
aggie | 13 years ago
He mentions in the comments "I basically just checked that they didn't follow any other schools (from a small list). It's certainly not the greatest filter, but it did seem to work for a small number of people I hand-checked."
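For illustration, a filter like the one described in that quote might look roughly like the sketch below. This is not the author's code: the data layout (a dict mapping each user to the set of topics they follow), the `SCHOOL_TOPICS` list, and the function name `likely_attendees` are all assumptions made up here.

```python
# Hypothetical list of school topics; the author's actual "small list" is not given.
SCHOOL_TOPICS = {"MIT", "Stanford", "Harvard", "UC Berkeley"}


def likely_attendees(follows, school, school_topics=SCHOOL_TOPICS):
    """Return users who follow `school` and no other topic in `school_topics`.

    `follows` maps a user name to the set of topics that user follows.
    """
    others = set(school_topics) - {school}
    return sorted(
        user
        for user, topics in follows.items()
        if school in topics and not (topics & others)
    )


# Made-up example data.
follows = {
    "alice": {"MIT", "Hip-Hop Music"},
    "bob": {"MIT", "Stanford"},      # follows two schools, so filtered out
    "carol": {"Stanford", "Food"},
}
print(likely_attendees(follows, "MIT"))
```

As the quote itself admits, this is a weak filter: it only drops people who follow several schools, and says nothing about whether a remaining follower ever attended.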
oraj | 13 years ago
ScottBurson | 13 years ago
ballooney | 13 years ago
The article does not point out that a sample of Quora users might not be (I would say probably is not) representative of the students of these places as a whole. Quora attracts a certain kind of person; I've never seen it mentioned anywhere outside of techy startup / valley circles. Maybe that's implied by virtue of it being in the HN ecosystem, but it should still be explicitly stated as a flaw in the method. Lies, damned lies and statistics etc.
rjtavares | 13 years ago
carlob | 13 years ago
The dataset is just not the right one for saying that P(x|y) = p because, for example, all the people who follow food in NYC and go to NYU were not taken into account here.
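To make the distinction concrete, here is a small sketch using Bayes' rule, with x as the school and y as the topic. All numbers are made up for illustration, and `bayes_inverse` is a hypothetical helper name. A per-school sample can estimate P(y|x), but getting P(x|y) also needs the base rates P(x) and P(y), which that sample does not provide.

```python
def bayes_inverse(p_y_given_x, p_x, p_y):
    """Bayes' rule: P(x|y) = P(y|x) * P(x) / P(y)."""
    return p_y_given_x * p_x / p_y


# Made-up numbers: 100 of 400 sampled students from school x follow topic y.
p_topic_given_school = 100 / 400

# The same P(y|x) combined with different (also made-up) base rates.
small_school_popular_topic = bayes_inverse(p_topic_given_school, p_x=0.01, p_y=0.5)
large_school_niche_topic = bayes_inverse(p_topic_given_school, p_x=0.2, p_y=0.1)

print(p_topic_given_school)
print(small_school_popular_topic)
print(large_school_niche_topic)
```

With these invented base rates, the same P(y|x) of 0.25 gives a P(x|y) of about 0.005 in one case and about 0.5 in the other, so the per-school frequency alone does not pin down the inverse probability.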
sesqu | 13 years ago
sadga | 13 years ago
Obviously, the author chose a more wieldy title, but a far less correct one.
romain_g | 13 years ago
probably | 13 years ago
But good eye on that. ;)
bitwize | 13 years ago
wilfra | 13 years ago
That isn't sad at all. It's great. As a UC grad, the diversity of the student body was one of the absolute best things about my college experience. Do I wish I went to Harvard or Stanford? Sure, but not for their homogeneous student bodies.
Otherwise, pretty interesting read.
echen | 13 years ago
cubicle | 13 years ago
[deleted]
baritalia | 13 years ago
nuttendorfer | 13 years ago
yen223 | 13 years ago