> People with these names are more likely than others to have these professions.
Shouldn't it say: "People with these names happen to be in those professions more often than others"?
Anyway, there are a couple of fun ones in there, but I'll let you figure those out yourself. Unfortunately, neither my name nor my profession is covered; I'm not quite sure what to make of that. :/
---
They use the same language in their blog post: "Arnolds therefore appear to have a much higher tendency to be accountants than Shanes." That's just wrong, no? By wrong I just mean intentionally misleading. I'm sure that has nothing to do with the fact that they sell an app that helps you find names for your babies though.
Take the above quote, which compares "1.9% of Arnolds are accountants" to the "0.55% of Shanes [are accountants]". They're implying that the probability of being an accountant (J), given that one's name (N) is Arnold, is above the expected probability of being an accountant in general. So they're looking for a high P[J|N]/P[J].
Now compare with what we were expecting to see. We assumed the chart showed, for a given job, names which had a higher incidence than normal. i.e., we're looking for a high P[N|J]/P[N].
Guess what. P[J|N]/P[J]=P[N|J]/P[N] by Bayes' Theorem [1]. These are EXACTLY the same metric! So their technique, and the chart, is correct. (And my original post, below, was wrong.)
(Not saying anything about causation here, and I don't think they were either.)
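For anyone who wants to check the Bayes identity numerically, here is a minimal sketch. The counts are made up purely for illustration (they are not the blog's data), and the variable names are my own; it only shows that both ratios come out the same for any consistent set of counts.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration only (not from the blog post).
total_people = 10_000
named_arnold = 200         # people with the name N
accountants = 500          # people with the job J
arnold_accountants = 40    # people with both the name and the job

p_n = Fraction(named_arnold, total_people)                  # P[N]
p_j = Fraction(accountants, total_people)                   # P[J]
p_j_given_n = Fraction(arnold_accountants, named_arnold)    # P[J|N]
p_n_given_j = Fraction(arnold_accountants, accountants)     # P[N|J]

# "How much more likely is an Arnold to be an accountant than average?"
lift_by_name = p_j_given_n / p_j
# "How much more common is Arnold among accountants than in general?"
lift_by_job = p_n_given_j / p_n

print(lift_by_name, lift_by_job)
```

Both ratios reduce to P[N and J] / (P[N] * P[J]), which is why it does not matter whether the chart was built "by name" or "by job".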
Yes, that is completely backward. 99% of Arnolds could go into farming, yet still the 1% who go into accounting dominate that field, and hence show up on this chart.
EDIT: but you missed the first half of that quote: "In our sample of two and a half million people, a whopping 1.9% of Arnolds are accountants. Contrast that with just 0.55% of Shanes." So I think the quote is correct. Makes me wonder if their chart is backward. (i.e., they put "Arnold" under "Accountant" because Arnolds are likely to be accountants, not because accountants are likely to be Arnolds, as the grouping implies).
It doesn't appear incorrect to me. The blurb on the graph says "For example, a higher percentage of Elwoods are farmers than of most any other name."
As I read it, it means that, say, 1% of Elwoods become farmers, while only .1% of Steves are farmers. That is, if your name is Elwood, you are much more likely to be a farmer than if your name is anything else.
I've seen this effect mentioned in psychology books before, and it seems, at least from a clinical perspective, that people do tend to choose their professions based on their names.
I remember when I joined CMU my freshman year, the big thing was that the previous year more guys named "Dave" had graduated from Computer Science than women. This kind of reminds me of that. This was circa 2001.
A comedy channel in the UK decided to rename itself a few years back. They bandied names around, then someone said, "How about Dave? Everyone knows Dave, Dave's your mate." Dave it was. As it turns out, David was in the top 3 names from 1954 to 1994, so the reality is that if you're a Boomer, Gen X, or Gen Y, you really do know a Dave. (Stats table here: http://www.ons.gov.uk/ons/publications/re-reference-tables.h...)
I don't get why this is presented as a 'chart'. Do the layout and colours mean anything? Or is it just a set of lists laid out in circles for no reason?
As far as this dude, with his aging eyeballs, is concerned, the colour's meaning, if any, was irrelevant.
I mean, the information was interesting enough to me to try to read it, but I frankly cannot parse what looks like #FFFFFF text on a #FEFFFE background.
For the love of god, if you want to present data like this, ensure at least a little contrast exists between the text and background.
Gah, I hate stats that a) don't say where the information comes from b) don't say which part of the world they're talking about (US sites are particularly prone to this). I'm assuming this is from US census data, but it's not particularly obvious. There's a rest of the world, you know....
[EDIT: looking at the app, maybe it is worldwide? Still, referring to things like "Republican" doesn't inspire confidence that it's international]
I visited a fire station in the country in my province a few years ago and on the wall were names of the volunteers.
Nearly every one was John Gallant; the surname is very common and it's a small community.
The funny thing is that the surname is so common that people have nicknames such as John 'Rabbit' Gallant, but that name becomes so well known that his son will be called Rabbit Jr.
So the plaque is full of a wild mix of actual surnames, given names and of course nicknames, but also the juniors of the nicknamed people.
Add to that, one family has seven daughters all named Mary.
It's interesting to see the difference in the type of names between Football Coach (Dan, Bill, Mike, Jim, Rich, Steve) and Electrical Engineer (Bernard, Eugene, Edwin, Charles, Alfred, Harvey). Short (one vowel each), common, English versus longer (two vowels each) French/"Posh" English names. Correlation between the names and background/education level seems likely?
Songwriter is interesting too: 4 out of 5 end with a variation of "y".
It's also possible that football coaches have a tendency to commonize their names - whether to fit in and make their players more relaxed, or maybe they were just raised in solid blue-collar American families.
Their birth certificates could very well read Daniel, William, Michael, James, Richard, Steven.
Similarly, the EEs probably have a tendency to be more formal on their resumes or business cards.
Their family and buddies probably call them Bernie, Gene, Eddie, Chuck, Freddy, Harv.
As another commentator points out (https://news.ycombinator.com/item?id=8881131), the actual results appear to be a heavily human-edited "6 of the top 37" list, in which case most of the interesting patterns would reflect the biases of the people who compiled the chart.
Chances are that there are largely unsurprising gender/age/class/race biases in the full dataset too, but these have been selected for effect.
(I bet most of the rest of the disproportionately common football coach names are pretty regular names for men born between 30 and 55 years ago, with some of them possibly even having additional syllables, and similarly I wouldn't be surprised to find that Jim and Bill, for example, also featured high in the list of people disproportionately likely to be electrical engineers, possibly even above Eugene.)
That is interesting, and I am sure there is a good point there, but meteorologist is also a profession requiring a high education level. Yet the meteorologist list (Bill, Joe, Jim, Jeff, Mike, Scott) is far more similar to the football coach list, to the point where 3 of the names are the same.
I wonder, however, whether Edwin (or Eugene) would go by a short form such as Ed if he were a football coach. Perhaps Dan would turn into Daniel and Mike into Michael as electrical engineers.
This seems fairly easy to explain: Many given names are passed down through family lines, and many families pass on knowledge, habits, and even professions from generation to generation.
There are lots of common surnames that demonstrate this - Taylor, Cooper, Cobbler, Smith, and so on. Why not given names?
Yep, my cousin was named George, as was my late uncle, the idea being that he would carry on the family trade as a turf accountant (bookmaker) in Birmingham.
Note that this was pre-legalisation, so watching Peaky Blinders (the UK version of Boardwalk Empire) on the BBC was interesting, shall we say.
Some time ago, I saw a paper in which some scientists found that people often tend to live in places and hold jobs that are somehow connected with their names, e.g. there are more Denises who work as dentists, or Louises in Louisiana. I have also noticed that in my country (Poland) there are noticeably more people working in IT whose first names or surnames begin with K (the Polish word for computer is "komputer"); I'm one such case.
> I also noticed that in my country (Poland) there's quite more people working in IT with names or surnames which begins on K (Polish word for computer is "komputer")
You don't say... :) Perhaps because names starting with K are simply very common in Poland? ;) About 1 in 6 of the 100 most popular last names starts with K.
And quite more IT employees than whom exactly? Presidents of the country? Since 1989 Poland had: Jaruzelski, Walesa, Kwasniewski, Kaczynski and Komorowski... Among prime ministers (since the 90s) about 1 in 5 had either first name or last name that begins with K :)
> Here's a chart with 6 of the names that are the most disproportionately common in 37 professions.
It doesn't say they're the top 6, just that they're "6 of" -- and having worked with a lot of similar data sets in the past, the results here feel a little overly edited (i.e. exaggerated, stereotyped) to me. I'd be happy to be proven wrong, though.
There they list the actual top 5 for some professions. Having read that, I am inclined to agree with you here.
The top 6 for car salesmen in that graph has literally only one name (Clay) that is in the actual top 5 and even that was only 5th. The top 4 names (Emmett, Luther, Emanuel, Morton) all got replaced with stereotypical white working class guy names.
The top 6 for surgeon in the graph has no female names yet the actual most disproportionately common name for surgeons is 'Vivienne'.
Not sure I trust the results. For guitarists they list: Mick, Richie, Trey, Sonny, Buddy, Eddie. That correlates strongly with famous musician names (Mick Jagger, Lionel Richie, Trey Anastasio, Sonny Rollins, Buddy Guy, Eddie Van Halen). Maybe kids are named after these legends and are pushed toward music, but maybe their software just counted a lot of duplicate mentions?
In some of those cases, it could also partly be people adapting their names after those legends. For example, a budding guitarist called Michael who loves Mick Jagger choosing to go by the name of Mick. As Michael is a far more common name than Mick, it would only take a small proportion of 'Michael's to do that and suddenly 'Mick' makes the guitarist list.
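To put rough numbers on that, here is a small sketch. Every count in it is hypothetical and chosen only to show the size of the effect; nothing here comes from the chart or the blog post.

```python
# All counts are hypothetical, picked only to illustrate the mechanism.
michaels = 100_000   # people named Michael
micks = 1_000        # people named Mick

# Assume both names start with the same share of guitarists: 1%.
michael_guitarists = michaels // 100
mick_guitarists = micks // 100

# Assume 5% of the guitarists named Michael start going by Mick.
switchers = michael_guitarists * 5 // 100

mick_share_before = mick_guitarists / micks
michael_share_before = michael_guitarists / michaels

mick_share_after = (mick_guitarists + switchers) / (micks + switchers)
michael_share_after = (michael_guitarists - switchers) / (michaels - switchers)

print(f"Mick:    {mick_share_before:.2%} -> {mick_share_after:.2%}")
print(f"Michael: {michael_share_before:.2%} -> {michael_share_after:.2%}")
```

Under these assumptions the guitarist share among Micks goes up several-fold, while the share among Michaels barely moves, because the rare name has so few people to dilute the switchers.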
Indeed, the time period in which someone was born seems to have a significant effect on the likelihood of certain names. Here's one study that documents that effect in the US:
"Worthless" in what sense? If we want to know something in particular, maybe, but perhaps knowing that people who have a certain job tend to be older is of interest? Some jobs seem to have entirely female names in the chart; is the fact that women are apparently significantly overrepresented in those fields also "worthless?"
The correlation is obviously indirect, since names are correlated with age, social class, region and other demographics, and these are in turn correlated with career choices.
Okay, so normally I hate people that bring up gender, but there were some interesting correlations between professions and gender here, assuming there aren't many guys named Sue. For example, meteorologist: 100% male. When I think meteorologist I think newscast in front of a green screen, so this is perhaps a case of my terrible misconceptions showing.
Also, WHY are things separated by color and opacity? If we're going by opacity, apparently one of the most populous and important professions is...race car driver. Really? That's one of the few professions on that huge infographic that is at maximum opacity?
Looks like people alter their name to fit the stereotype of their profession. Lots of drummers named Billy, but I bet it says William on their birth certificate. Same for coaches named Rich, not Richard.
Apparently they don't state what "disproportionately" is supposed to mean. If that is indeed the case I doubt much valid insight can be derived from this chart.
"In our sample of two and a half million people, a whopping 1.9% of Arnolds are accountants. Contrast that with just 0.55% of Shanes. Arnolds therefore appear to have a much higher tendency to be accountants than Shanes."
You see where this is going. If you correlate the data with the popularity of names in general, you'll find that Arnold is a much more popular name than Shane ...
Interesting choice of professions. Race car drivers and musicians seem to have similar names. Other than that I have no idea what to do with this information.
[+] [-] teamhappy|11 years ago|reply
[+] [-] colanderman|11 years ago|reply
[1] http://en.wikipedia.org/wiki/Bayes%27_theorem
-----------
[+] [-] adevine|11 years ago|reply
[+] [-] Lambdanaut|11 years ago|reply
I'm sorry I can't provide sources.
[+] [-] anonu|11 years ago|reply
[+] [-] vickytnz|11 years ago|reply
[+] [-] callum85|11 years ago|reply
[+] [-] munificent|11 years ago|reply
These days, you pretty much have to put a pretty picture in an article if you want traffic from social sites.
[+] [-] GoodIntentions|11 years ago|reply
[+] [-] kenj0418|11 years ago|reply
[+] [-] vickytnz|11 years ago|reply
[+] [-] Houshalter|11 years ago|reply
[+] [-] marche101|11 years ago|reply
Direct link to the blog that goes in to a few details of how they work out the names.
[+] [-] dghughes|11 years ago|reply
[+] [-] theorique|11 years ago|reply
[+] [-] tarpherder|11 years ago|reply
[+] [-] swamp40|11 years ago|reply
[+] [-] notahacker|11 years ago|reply
[+] [-] toupeetape|11 years ago|reply
[+] [-] MichaelTieso|11 years ago|reply
[+] [-] Kurtz79|11 years ago|reply
[+] [-] mrcarrot|11 years ago|reply
edit: (or yeah, what MichaelTieso said :) )
[+] [-] longlivegnu|11 years ago|reply
[+] [-] jMyles|11 years ago|reply
[+] [-] walshemj|11 years ago|reply
[+] [-] kornakiewicz|11 years ago|reply
[+] [-] kornakiewicz|11 years ago|reply
http://andrewgelman.com/2005/08/05/dennis_the_denv/ - short summary.
[+] [-] V-2|11 years ago|reply
[+] [-] crazygringo|11 years ago|reply
[+] [-] toupeetape|11 years ago|reply
[+] [-] logn|11 years ago|reply
[+] [-] toupeetape|11 years ago|reply
[+] [-] jiggy2011|11 years ago|reply
[+] [-] duaneb|11 years ago|reply
Not that I don't love both.
[+] [-] unknown|11 years ago|reply
[deleted]
[+] [-] Camillo|11 years ago|reply
[+] [-] stygiansonic|11 years ago|reply
http://fivethirtyeight.com/features/how-to-tell-someones-age...
[+] [-] emodendroket|11 years ago|reply
[+] [-] V-2|11 years ago|reply
Once the stats were adjusted for the above, I suppose we'd end up with not much more than noise and the occasional spurious correlation: http://twentytwowords.com/funny-graphs-show-correlation-betw...
[+] [-] theorique|11 years ago|reply
Seems sort of ... logical.
[+] [-] ryanmim|11 years ago|reply
[+] [-] mojuba|11 years ago|reply
[+] [-] normloman|11 years ago|reply
[+] [-] proveanegative|11 years ago|reply
[+] [-] teamhappy|11 years ago|reply
http://www.verdantlabs.com/blog/2014/12/30/names-by-professi...
[+] [-] Shengbo|11 years ago|reply
[+] [-] breitling|11 years ago|reply
Also, many cultures obsess over giving the child a good name with a good meaning as they think it determines their future.