I did something pretty similar over Christmas, though I used named entity recognition to extract book titles rather than looking for Amazon links, and (so far) also limited it to specific "Ask HN" threads about books. You can find it here: http://www.hnreads.com/. It is interesting to see how little overlap there is between the two, though that may be due to my using far fewer (and also newer) threads!
Surprised to see Permutation City in that list. Given that the book was written in 1994, Greg Egan displays admirable prescience about how computing would develop. Honestly, you would think it was written in the last 5 years or so. His vision of cloud computing is absolutely outstanding. It blew me away when I checked when the book was written after reading the first few chapters.
I'd read Schild's Ladder prior to reading Permutation City, which is also a good read. It does seem to get bogged down in the technical and descriptive side of things at times; however, it's a fantastic idea for a story. The main premise of the book would make a great movie.
Whilst I'm on the subject of good "Hard sci-fi" novels, Tau Zero is also worth reading.
One thing that struck me about your site (apart from being a good list, well done!) is how blazingly fast (close to HN, which I find funny) the page loads. Could you fill us mere mortals in on how the fuck you got it so fast?
This is really cool. I started reading HPMoR after finding it.
I'm curious: after performing NER on the corpus, how did you filter to find books only? Did you just search on Amazon's catalogue for exact matches, or was more tweaking required?
On the system browser on my phone (a Kyocera Rise running Android 4.0.4) all I see on that page is the header and footer, no content. :-( I get the same result if I hit it with Firefox with Javascript disabled, but Javascript is very much enabled on my phone browser.
Thanks for sharing! This is the list I actually expected the OP's to be. Probably because I'm also more likely to view the Ask HN threads about books.
At 4 GB, I'd just as soon query this locally, but this looks like a fun exercise.
I notice that there were 10,729 distinct ASINs out of 15,583 Amazon links in 8,399,417 comments. Since I don't generally (ever?) post Amazon links, I'd be interested in expanding on this in two ways.
First, I'd reduce/eliminate the weight of repeated links to the same book by the same commenter.
Second, I'd search for references to the linked books that aren't Amazon links. Someone links to Code Complete? Add it to the list. In a second pass, increment its count every time you see "Code Complete," whether it's in a link or not.
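A minimal sketch of those two passes, assuming the comments are available as (author, text) pairs. That input shape, the function names, the ASIN regex, and the placeholder ASIN `0000000001` are all illustrative assumptions, not the schema or queries used in the original post.

```python
import re
from collections import Counter

# Assumed pattern for Amazon product URLs (/dp/ASIN, /Title/dp/ASIN, /gp/product/ASIN).
# Real links come in more shapes than this; treat it as a starting point.
ASIN_RE = re.compile(r"amazon\.[a-z.]+/(?:[^/\s]+/)?(?:dp|gp/product)/([A-Z0-9]{10})")

def count_links_once_per_user(comments):
    """First pass: count each (author, ASIN) pair at most once."""
    seen = set()
    for author, text in comments:
        for asin in ASIN_RE.findall(text):
            seen.add((author, asin))
    return Counter(asin for _, asin in seen)

def count_title_mentions(comments, titles):
    """Second pass: count comments mentioning a known title, linked or not."""
    counts = Counter()
    for _, text in comments:
        lowered = text.lower()
        for title in titles:
            if title.lower() in lowered:
                counts[title] += 1
    return counts

# Hypothetical sample data; "0000000001" is a placeholder, not a real ASIN.
comments = [
    ("alice", "See http://www.amazon.com/dp/0000000001 and http://www.amazon.com/dp/0000000001"),
    ("alice", "Again: http://www.amazon.com/Some-Title/dp/0000000001"),
    ("bob", "Code Complete is great: http://www.amazon.co.uk/gp/product/0000000001"),
    ("carol", "I reread Code Complete every few years."),
]
link_counts = count_links_once_per_user(comments)
mention_counts = count_title_mentions(comments, ["Code Complete"])
```

Plain substring matching is the simplest possible second pass; it counts every mention regardless of what the comment says about the book.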
Discounting multiple links by the same user is a good idea. Your second suggestion brings some rather complex problems. For example, if a comment goes like "Code Complete is the worst book I ever read", it is certainly not an endorsement, while linking to a book in most cases is. Also, a sentence like "programming perl is fun" does not necessarily refer to the book.
So this would require some form of sentiment analysis and also require book titles to be uniquely identifiable.
> At 4 GB, I'd just as soon query this locally, but this looks like a fun exercise.
This requires scraping all the Hacker News data manually. I have a tool for that (https://github.com/minimaxir/get-all-hacker-news-submissions...), which I mentioned in the post you linked, but it still takes a significant amount of time to get and process the data, which is why the BigQuery dataset has a significant advantage.
The absence of SICP, I imagine, is because when people refer to the SICP, they usually just link to the open link to the book: https://mitpress.mit.edu/sicp/ .
Yes, that is probably the case. Quoting from the post:
"Amazon is often the goto website for referring books, but many books have dedicated homepages as well as pages on their publisher's website. Moreover, many freely available books are referred frequently in comments, but are not considered in this ranking."
The approach used here has limitations. I hoped to make that clear by pointing them out and choosing titles and headlines accordingly.
Having owned and read through "Introduction to Algorithms" for years I agree that it is a good book. However, recently I have been feeling like it is recommended way too often without thought.
It is not the best when it comes to explaining things in an intuitive manner. It is a great reference book with lots of algorithms and proofs.
In recent years I have been drawn more towards Levitin's "Introduction to the Design and Analysis of Algorithms".
Anyone else have similar feelings about "Introduction to Algorithms"?
How come "Darwin's Theorem" appears so often? It's quite unknown, with one review on Goodreads and 4 reviews on Amazon.
Is this a result of the author spamming his own work?
Edit: Looks like it, short skimming of "darwin's theorem site:news.ycombinator.com" shows that all links are from user tjradcliffe, who is the author. A case for manual curation of data.
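A check like this could be partly automated before manual curation. The sketch below assumes the links are available as (user, book) pairs and flags books whose links come overwhelmingly from one account; the function name, the thresholds, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def single_source_books(links, min_links=3, max_share=0.8):
    """Return {book: (top_user, share)} for books dominated by one linker.

    min_links and max_share are arbitrary illustrative thresholds.
    """
    per_book = defaultdict(Counter)
    for user, book in links:
        per_book[book][user] += 1

    flagged = {}
    for book, users in per_book.items():
        total = sum(users.values())
        top_user, top_count = users.most_common(1)[0]
        share = top_count / total
        if total >= min_links and share >= max_share:
            flagged[book] = (top_user, share)
    return flagged

# Hypothetical data: one account posts every link to "Book A".
links = [("author_account", "Book A")] * 5 + [
    ("u1", "Book B"), ("u2", "Book B"), ("u3", "Book B"),
]
flagged = single_source_books(links)
```

A flagged title is only a candidate for review: an enthusiastic reader can dominate the links to a book just as easily as its author can.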
Out of 8 million data points the top book got around 50 references. I wonder how much significance should be attached to that; it looks to me to be down in the noise level.
Related: There are a ton of sites set up like this. Hopefully somebody will post a list. Lotta work by HN folks on various ways of slicing and dicing the data.
I wrote this curated site from HN several years ago. Got tired of people continuously asking for book recommendations. http://www.hn-books.com/
Couple points of note. This is 1) an example of a static site, 2) terrible UI, 3) contains live searches to comments on each book from all the major hacking sites, and 4) able to record a list of books that you can then share as a link, like so (which was my reason for making the site)
He made a snarky comment about Andrew Breitbart's death, so conservatives gave him a slew of one star reviews. If you filter by "verified purchases", it's not as polarized.
I've considered building the same myself. It would be lovely if you tracked the various HN reader client apps. A few that come to mind are: Hacker News Enhancement Suite for Chrome [1], Hacker Menu for OS X [2], and Premii's HN web app [3].
I remember Jeff Atwood's 4k monitor review post [1]. Someone had calculated that he made thousands of dollars pimping that thing.
I have no issue with people doing this, as long as their posts are not solely motivated by wanting an excuse to post their affiliate link. I guess the more popular you get, the more likely that is to happen.
I believe people would just write the name of the really popular books like TAOCP, Hackers, Founders at Work, etc. rather than linking to them.
The list:
"The Rent Is Too Damn High: What To Do About It, And Why It Matters More Than You Think" by Matthew Yglesias
Publisher: Simon & Schuster
"The Four Steps to the Epiphany: Successful Strategies for Products that Win" by Steven Gary Blank
Publisher: Cafepress.com
"Introduction to Algorithms, 3rd Edition" by Thomas H. Cormen
Publisher: The MIT Press
"Influence: The Psychology of Persuasion, Revised Edition" by Robert B. Cialdini
Publisher: Harper Business
"Peopleware: Productive Projects and Teams (Second Edition)" by Tom DeMarco
Publisher: Dorset House Publishing Company, Incorporated
"Code: The Hidden Language of Computer Hardware and Software" by Charles Petzold
Publisher: Microsoft Press
"Working Effectively with Legacy Code" by Michael Feathers
Publisher: Prentice Hall
"Three Felonies A Day: How the Feds Target the Innocent" by Harvey Silverglate
Publisher: Encounter Books
"JavaScript: The Good Parts" by Douglas Crockford
Publisher: O'Reilly Media
"The Little Schemer - 4th Edition" by Daniel P. Friedman
Publisher: The MIT Press
"The E-Myth Revisited: Why Most Small Businesses Don't Work and What to Do About It" by Michael E. Gerber
Publisher: HarperCollins
"Feeling Good: The New Mood Therapy" by David D. Burns
Publisher: Harper
"Programming Collective Intelligence: Building Smart Web 2.0 Applications" by Toby Segaran
Publisher: O'Reilly Media
"The Non-Designer's Design Book (3rd Edition)" by Robin Williams
Publisher: Peachpit Press
"The C Programming Language" by Brian W. Kernighan
Publisher: Prentice Hall
"The Design of Everyday Things" by Donald A. Norman
Publisher: Basic Books
"Cracking the Coding Interview: 150 Programming Questions and Solutions" by Gayle Laakmann McDowell
Publisher: CareerCup
"What Intelligence Tests Miss: The Psychology of Rational Thought" by Keith E. Stanovich
Publisher: Yale University Press
"On Writing Well, 30th Anniversary Edition: The Classic Guide to Writing Nonfiction" by William Zinsser
Publisher: Harper Perennial
"Darwin's Theorem" by TJ Radcliffe
Publisher: Siduri Press
"Knowing and Teaching Elementary Mathematics: Teachers' Understanding of Fundamental Mathematics in China and the United States (Studies in Mathematical Thinking and Learning Series)" by Liping Ma
Publisher: Routledge
"Don't Make Me Think: A Common Sense Approach to Web Usability, 2nd Edition" by Steve Krug
Publisher: New Riders
"Expert C Programming: Deep C Secrets" by Peter van der Linden
Publisher: Prentice Hall
"Clean Code: A Handbook of Agile Software Craftsmanship" by Robert C. Martin
Publisher: Prentice Hall
"The Elements of Computing Systems: Building a Modern Computer from First Principles" by Noam Nisan
Publisher: The MIT Press
"Code Complete: A Practical Handbook of Software Construction, Second Edition" by Steve McConnell
Publisher: Microsoft Press
"The Box: How the Shipping Container Made the World Smaller and the World Economy Bigger" by Marc Levinson
Publisher: Princeton University Press
"Software Estimation: Demystifying the Black Art (Developer Best Practices)" by Steve McConnell
Publisher: Microsoft Press
"Refactoring: Improving the Design of Existing Code" by Martin Fowler
Publisher: Addison-Wesley Professional
"Design for Hackers: Reverse Engineering Beauty" by David Kadavy
Publisher: Wiley
Thanks for posting the list. The chart in the article makes it impossible to tell what the books are without hovering over each one to see the captions.
Hard to read on mobile. Couldn't get past the first few. It is annoying to have to click a tiny thumbnail to read a bad, extracted synopsis from Amazon.
Interesting to see Influence so high, but Predictably Irrational not listed at all. I've heard Influence is a really great book, but from a quick skim it seems like Predictably Irrational covers the subject matter at least as well if not better. I'd be happy to hear the opinion of someone who has actually read both.
I've read both and Influence is far more useful if you're trying to, well, influence someone. The art of influencing is complex and involves more than just a few behavioral economics insights. Influence is a total framework for understanding the psychology and emotions of selling.
I was surprised not to see Dale Carnegie's book (How to win friends and...) either, but I suppose it's rather dated and not as scientific. Carnegie's book had some of the greatest impacts on my personal and professional life.
Influence was a great book, but it is a bit outdated (in my opinion). Predictably Irrational and his other books were much more relevant. Thinking, Fast and Slow was the best one of them all.
[+] [-] _lpa_|10 years ago|reply
[+] [-] bitcointicker|10 years ago|reply
Edit - I'll also throw this in: http://www.amazon.co.uk/gp/product/0814703259
Magic :-)
[+] [-] temo4ka|10 years ago|reply
[+] [-] erispoe|10 years ago|reply
[+] [-] fratlas|10 years ago|reply
[+] [-] rahimnathwani|10 years ago|reply
[+] [-] jccalhoun|10 years ago|reply
[+] [-] wtracy|10 years ago|reply
[+] [-] gkst|10 years ago|reply
[+] [-] Pietertje|10 years ago|reply
[+] [-] ALee|10 years ago|reply
[+] [-] edpichler|10 years ago|reply
[+] [-] SloopJon|10 years ago|reply
https://news.ycombinator.com/item?id=10440502
[+] [-] gkst|10 years ago|reply
[+] [-] minimaxir|10 years ago|reply
[+] [-] niuzeta|10 years ago|reply
[+] [-] gkst|10 years ago|reply
[+] [-] meadori|10 years ago|reply
[+] [-] dankohn1|10 years ago|reply
https://twitter.com/mattyglesias/status/689169613779808257 "The only book ranking that matters"
[+] [-] a_bonobo|10 years ago|reply
[+] [-] tagawa|10 years ago|reply
[+] [-] mattip|10 years ago|reply
[+] [-] jacko0|10 years ago|reply
[+] [-] DanielBMarkham|10 years ago|reply
I wrote this curated site from HN several years ago. Got tired of people continuously asking for book recommendations. http://www.hn-books.com/
Couple points of note. This is 1) an example of a static site, 2) terrible UI, 3) contains live searches to comments on each book from all the major hacking sites, and 4) able to record a list of books that you can then share as a link, like so (which was my reason for making the site)
"My favorite programming books? Here they are: http://www.hn-books.com#B0=138&B1=15&B2=118&B3=20&B4=16&B5=1... "
I started writing reviews each month on the books, but because they were all awesome books, I got tired of so many superlatives!
Thanks for the site.
[+] [-] willyyr|10 years ago|reply
[+] [-] greesil|10 years ago|reply
http://www.amazon.com/The-Rent-Too-Damn-High-ebook/product-r...
It's the most polarized I've ever seen in my life.
[+] [-] vellum|10 years ago|reply
[+] [-] brink|10 years ago|reply
[+] [-] tern|10 years ago|reply
[+] [-] tedmiston|10 years ago|reply
I've considered building the same myself. It would be lovely if you tracked the various HN reader client apps. A few that come to mind are: Hacker News Enhancement Suite for Chrome [1], Hacker Menu for OS X [2], and Premii's HN web app [3].
1: https://chrome.google.com/webstore/detail/hacker-news-enhanc... 2: https://hackermenu.io/ 3: https://hn.premii.com/
[+] [-] nextos|10 years ago|reply
E.g:
- SICP: Structure and Interpretation of Computer Programs
- CTM: Concepts, Techniques, and Models of Computer Programming
- TAOP: The Art of Prolog
[+] [-] spinchange|10 years ago|reply
[+] [-] anc84|10 years ago|reply
[+] [-] ryangittins|10 years ago|reply
I just never understand people's hatred for affiliate links in good pieces of content.
[+] [-] artursapek|10 years ago|reply
[1] http://blog.codinghorror.com/our-brave-new-world-of-4k-displ...
[+] [-] gkst|10 years ago|reply
[+] [-] myth_buster|10 years ago|reply
[+] [-] jraines|10 years ago|reply
[+] [-] jlarocco|10 years ago|reply
[+] [-] busterarm|10 years ago|reply
SICP gets mentioned a lot too.
[+] [-] nefitty|10 years ago|reply
[+] [-] corysama|10 years ago|reply
[+] [-] wbeckler|10 years ago|reply
[+] [-] agentgt|10 years ago|reply
[+] [-] misiti3780|10 years ago|reply
[+] [-] noobie|10 years ago|reply
[+] [-] fhoffa|10 years ago|reply
On https://reddit.com/r/bigquery, /u/omicron_n2 left queries to repeat the experiment on HN and on reddit comments too:
- https://reddit.com/r/bigquery/comments/41py1v/top_30_books_o...
And a presentation by /u/Pentium10 on the same topic, using the books that redditors read:
- http://www.slideshare.net/martonkodok/complex-realtime-event...