I totally agree that some of the data is suspicious. I spent some time cleaning the data and removed any data points that looked too unrealistic. That particular one was actually discussed in the original thread (https://news.ycombinator.com/item?id=8573423), and I chose to leave it in because it's definitely feasible.
There are lots of unavoidable issues with self-reported data sets, which is partially why I added the list on the right to increase transparency and let people make their own decisions about confidence in the data.
wattenberger|11 years ago