Pretty long post for "I know that I know nothing".
However, there is gold in the comment section: Kristen McLean actually throws some numbers at us [1]. "66% of those books from the top 10 publishers sold less than 1,000 copies over 52 weeks". Well, uh, that's what I thought. Interesting nonetheless.
"I know that I know nothing" is a very important realization, though, and if a lot of words help someone else realize that they know nothing, too, it's probably worth it.
Actually I think the post highlights that even the statistic you quoted means less than one thinks it means. For example, some of the books included may have been on the market for only a few weeks. Some titles may be niche books with the expected lower sales volume factored into the price. BookScan only covers a certain percentage of print sales in the US, so the total sales could be more than double that.
I used to work for a big name scientific publisher, and we published a lot of super-niche research monographs that didn't sell many copies.
What a lot of people don't understand is that our bread and butter came from university libraries. We published a lot of books not because they sold individually, but because libraries wanted them. They like collections!
"The long and short of it is publishing is very much a gambler's game, and I think that has been clear from the testimony in the DOJ case. It is true that most people in publishing up to and including the CEOs cannot tell you for sure what books are going to make their year."
My first self-published book sold maybe half a dozen copies, and there even was one refund. The reason is simple. It is a bad book.
The next one sold none. Which is expected since I made it openly available from my site. It hit about half a million downloads in the first year. Alas, most of them were from one IP somewhere in Czechia so I guess someone's spider got stuck in the web. But real people downloaded it too, about 12 000 copies. I even got a positive review from John D. Cook of which I'm extremely proud https://www.johndcook.com/blog/2020/04/24/programming-langua...
I'm now working on a Geometry for Programmers book for Manning https://www.manning.com/books/geometry-for-programmers. This would be my first collaboration with a publisher and so far it looks like it will become a moderate success. The early access sales are good, the reviews are encouraging. Even so, I would expect first-year sales to be in the thousands, maybe even the low tens of thousands if we're lucky, but no more. Definitely not enough to live off.
So while the post doesn't reveal any specific statistics, it does agree with my experience. Most books might indeed sell in very low numbers but they are not really published and maybe not even really books. If you're working with a well established publisher, you will get some moderate sales numbers guaranteed, and even a small chance to go big.
Interesting. Does your book cover geohashing? I've been curious how points and polygons get converted to geohashes and how searches using these work. I currently use PostGIS as a black box, but I haven't been able to find any books or articles that really explain in depth how they work.
My car broke down, so I had to take a long intercity bus trip for the first time in 15 years. After a short stop at a station, the driver started the bus again but forgot to turn the interior lights back on, so I couldn't read. I waited a few minutes, but he didn't turn them on, so I had to get up and ask him in the cabin. Coming back, I realized mine was the one and only reading light on: none of the other 100 passengers had theirs on, so they couldn't possibly be reading, unless electronically. 15 years earlier, as I remember it, there would be at least 3 lights on most of the time, 5 at most, but being the only one made me feel lonely.
The overwhelming majority of e-readers nowadays have a backlight, right? And reading "electronically" is still reading. But I guess I see your point: reading "traditionally" isn't something that many people do. Just like sitting to listen to an album from start to finish, which is something I did a lot when I was a teenager.
Audiobooks are the new travel book. I'm old school and like reading my books, but everyone else I know consumes them as audiobooks the majority of the time.
The whole "2% makes 95% of all revenues" thing doesn't just apply to books... it's certainly true for the video game industry and, I suspect, for most industries that sell online.
On video game sales, I find it interesting how successful games can be while being 90% the same as an existing established game.
I’m not coming at this from some kind of ethics/copyright perspective, I think it’s totally fine to rehash the same idea. I’m just surprised they sell so well when I look and think “why would I buy this, I basically already have it”. The particular example I’m thinking of is a series called Overcooked, which is a co-op time management game, and right now one of the hot sellers on Steam is called Plate Up, which seems to be basically the same thing at a similar level of polish.
The real learning I guess is to not be discouraged by existing products, because they don’t seem to prevent you from succeeding even when you can’t quite define why yours is better than the rest.
Yeah power law distributions are everywhere nowadays. Nassim Taleb talks about this a lot and uses book sales as a reference, which is cheeky since he’s doing it in his own best seller books.
Anyone writing for publishers outside the top 10 (tbh, big 5) and KDP, isn't writing to make money - at least initially. They are writing because they like writing books. The vast majority of authors really just want people to read the book they often spent an enormous amount of time creating.
My father wrote technical books through the 80's and 90's. I remember him being very happy about a book that sold in the low tens of thousands of copies (I'm guessing 20k or so). I know some of them flopped. Even as a child I knew that it did not make sense financially... stay up til 4am writing for a solid year or more (he had a regular job as well), so that you can get maybe 5-10% royalty on something that likely will sell only a few thousand copies on average. The ephemeral nature of computer books makes it even worse.
Having written a bunch of books on Flash and PHP back in the early 2000s, my experience is very much aligned with this.
For a relatively young guy the income was welcome, and after the first book did well I got a pretty decent advance for the subsequent books. They all sold enough to pay back that advance and so I got quarterly royalties on top too, but given the amount of toil it required - and quick turnaround times with new Flash releases, meaning many late nights - I would have been better compensated doing pretty much anything else.
That said, money isn't the only, or even necessarily the most important or fulfilling, reward on offer. I got the writing gig because I was spending a lot of my free time on various Flash-related forums answering questions and helping people with their projects. I did that for the sheer joy of helping others along in their learning journey, and saw writing books as a massive extension of that.
It very much mirrors my experience of the public school system in the UK: teachers are chronically overworked and underpaid, but do it anyway because it's something of a calling. In fact, that probably applies to a bunch of other public sector roles too, not least the NHS primary care roles.
It's like any hobby. Sometimes people are able to take their hobby full time after a massive success. Most of the time it's a hobby that provides satisfaction.
I forget if there’s a term for this, but I once read (and subsequently discovered) that a large portion of disagreements are simply because people are working with different definitions for things.
This seems like an example of that: depending on how you define “book” these claims are accurate or not.
But the point is that most people will take the literal definition of "book". They wouldn't differentiate between hardcover, paperback, ebooks, audiobooks. If an "expert" said "90% of books sell less than 99 copies", without any context, most people will assume that the book sold less than 99 copies across all its manifestations and in a lifetime. The article sheds light on the fact that there's more to that statement than what it seems to mean at first glance.
Yes, I've also struggled to pause many disagreements just to clarify terms, ironically because the other person didn't get what I was doing since there is no common word for this sort of disagreement.
This and context. It is weird finding oneself in an extended argument only to find out you were talking about two entirely different contexts. Saturday Night Live's "Emily Litella" skits on the weekend news[1] used this situation to good effect.
I really don't like "fact-checking" articles like this which don't contain many useful facts, only pedantry. The first comment (by Kristen McLean from NPD BookScan) is much more interesting than the article itself:
>>>0.4% or 163 books sold 100,000 copies or more
>>>0.7% or 320 books sold between 50,000-99,999 copies
>>>2.2% or 1,015 books sold between 20,000-49,999 copies
>>>3.4% or 1,572 books sold between 10,000-19,999 copies
>>>5.5% or 2,518 books sold between 5,000-9,999 copies
>>>21.6% or 9,863 books sold between 1,000-4,999 copies
>>>51.4% or 23,419 sold between 12-999 copies
>>>14.7% or 6,701 books sold under 12 copies
So, ~66.1% or 2/3 of books in their dataset sell under a thousand copies.
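For anyone who wants to check the arithmetic, here is a minimal Python sketch that re-adds the bracket counts quoted above. The counts are copied from the comment; the variable names are my own and purely illustrative, and nothing here comes from BookScan beyond those eight numbers.

```python
# Counts per sales bracket, copied from the quoted comment above.
# The labels and variable names are illustrative, not BookScan's.
brackets = [
    ("100,000 or more", 163),
    ("50,000-99,999", 320),
    ("20,000-49,999", 1015),
    ("10,000-19,999", 1572),
    ("5,000-9,999", 2518),
    ("1,000-4,999", 9863),
    ("12-999", 23419),
    ("under 12", 6701),
]

# Total number of ISBNs in the quoted dataset.
total = sum(count for _, count in brackets)

# The two lowest brackets together are everything under 1,000 copies.
under_1000 = sum(
    count for label, count in brackets if label in ("12-999", "under 12")
)
share_under_1000 = 100 * under_1000 / total

# Print each bracket's share of the total, then the combined figure.
for label, count in brackets:
    print(f"{label:>16}: {count:>6} ({100 * count / total:.1f}%)")
print(f"under 1,000 copies: {under_1000} of {total} ({share_under_1000:.1f}%)")
```

The per-bracket shares it prints should line up with the percentages in the quote, which suggests the figures are internally consistent. It says nothing about what the dataset leaves out.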
The pedantry was intended to point out that there is plenty of room for publishers to mislead when they don't detail how the data is collected. When you don't have access to the data, it is usually the best one can do.
Even the comment by Kristen McLean has limits, though they are much more forthcoming about what the data includes. I think they summed it up best when they said publishing is a gambler's game. That said, whether the outcome is good or bad for a gambler depends upon how much they invested and the return across all of those bets. Their data does not venture into financial aspects. At best, it gives us an idea of the minimum number of units sold in a particular subset of the market.
There were many useful facts that I hadn't considered, to be honest, especially regarding what was counted (probably print sales tracked by BookScan) and how (unique ISBNs).
Also, whenever some party brings up stats to make some point, I think it's fair to examine their figures, methodology and so on and just be a bit pedantic about it. And this was a claim made in a major antitrust case. The facts, and the general pedantry, show some likely serious issues and raise some important questions about the figures that a party in a major trial presented, and which were then - mostly uncritically - echoed all over the place.
Regarding Kristen McLean's comment:
The "in their dataset" is a very important detail that shouldn't be overlooked. The dataset for this is the sales figures for books published by the top 10 publishers. They get the numbers from partnered retailers (some 75% of retailers according to the article), and it only covers print copies, not ebooks, not audio books. It does not cover direct sales to larger organizational buyers, like library systems, either.
And it's grouped by unique ISBN, not by title. As the article points out, most books come in many editions (hardcover, paperback, etc), each of which has their own ISBN. The article author illustrates this by telling about his own book, which is one book with 4 different unique ISBNs (though one was for an ebook, and one for the audio book, both of which wouldn't be covered by this dataset anyway).
The author's first point was that anything with a unique ISBN is treated as a different book.
> From a sales tracking perspective, books are published in multiple formats, each with different ISBNs. I wrote one novel, but from a title count POV I actually published 4 books: hardcover, paperback, ebook, and audiobook. Other books have even more formats (mass market version, movie tie-in editions, etc.) and because they all have different ISBNs, they all have different sales figures.
I would like to see stats that collapse across different versions of (largely) the same text (including new versions or editions of text books, and re-releases that include some special commentary, etc.)
I think those data points work together with the "let's think this through for a bit" approach of the article. Particularly, note that the stats that you quoted in your comment are for a 52-week period and include everything that could be considered a "book." I think it's quite logical that 2007's Farmer's Almanac would sell less than 12 copies in the last year, or some vanity-published, typo-laden sci-fi novel of the type my family would sometimes get for me for Christmas (because if it's sold by the Amazon dot Com then it can't be that bad, right?).
I dunno, seems that the “spirit” of these comments is correct. I’ve self-published half a dozen small books that have only low double digit sales. The comment that goes through the data is also really interesting.
Good article and then a data-rich comment from Kristen McLean. This article makes me feel better as an author. 30 years ago, the first 2 books I wrote for the scientific publisher Springer-Verlag didn’t sell many copies, although I have received a lot of nice feedback about the first book over the years.
After writing 10 books for traditional publishers, all fairly niche technical books, I switched to self publishing and my readership has gone up dramatically (I think). When my publisher returned book rights to me, I released the second edition of my Java AI book under a Creative Commons license. For the 5 years that I tracked downloads from my web site (https://markwatson.com), I averaged 300 downloads a day, and the book PDF was also downloadable from many other sites. I imagine that most people downloading a free PDF only read a small part of my open content books, but I have no data on that.
Currently I distribute on the leanpub.com platform (which I recommend!!) and I get about 50 free downloads for each time a person decides to voluntarily pay for one of my open content books. The exception is my Common Lisp book for which about 1 in 20 people choose to pay for it.
Writing is a lot of fun and it has opened the door to meeting and sometimes becoming friends with some amazing people.
I would expect that most people who try to write a book and sell it will sell exactly zero copies. They will successfully give a few copies to their friends, and Mom, and that's about it.
This started well but it seems like a crap take at tweetsplaining
> When people reference book sales, they’re typically talking using NPD Group’s BookScan numbers
The guy is an author and this is the meaning of sales he pulls first? Really?
Number of sales is what your publisher will tell you. Because they owe you money for every sale done. Because a sale is how many times people clicked the 'buy' button (be it digital, Amazon, or went into a bookstore).
"BookScan numbers" LOL
> there are dramatic differences between 1) lifetime sales, 2) sales in the first 12 months after publication, and 3) sales in any random calendar year.
By the context it's obvious what they mean FFS. This guy can't interpret a tweet and came up with this clickbait crap?
The thing about that dozen copies quote that everyone seems to gloss over is “trade titles”; every time I've seen that phrase anywhere else in a publishing context it's been distinguishing, in paperbacks, from mass market titles.
Someone may write a technical book for a niche community. The book may help others, build a portfolio etc. It doesn't need to earn lots of sales to be satisfying or useful.
I've published three books with a publisher, and the one I'm most proud of and which got the best reviews (19 ratings on Amazon, all 5 stars) is the most niche, and got the lowest sales numbers.
Another misleading statement is that most books sell a thousand or fewer copies. This makes no distinction between fiction, which is generally intended for a large, broad audience, and non-fiction, which is more specific and accounts for more of the books published. Technical and reference books are expected to sell far fewer copies than teen fiction.
[+] [-] lovingCranberry|3 years ago|reply
[1] https://countercraft.substack.com/p/no-most-books-dont-sell-...
[+] [-] thomascgalvin|3 years ago|reply
[+] [-] dalai|3 years ago|reply
[+] [-] lake_vincent|3 years ago|reply
[+] [-] lukeschwartz|3 years ago|reply
And this is the unsurprising TLDR:
"The long and short of it is publishing is very much a gambler's game, and I think that has been clear from the testimony in the DOJ case. It is true that most people in publishing up to and including the CEOs cannot tell you for sure what books are going to make their year."
[+] [-] okaleniuk|3 years ago|reply
[+] [-] cultofmetatron|3 years ago|reply
[+] [-] gverrilla|3 years ago|reply
[+] [-] obruchez|3 years ago|reply
[+] [-] matwood|3 years ago|reply
[+] [-] spoonjim|3 years ago|reply
[+] [-] keyle|3 years ago|reply
[+] [-] Gigachad|3 years ago|reply
[+] [-] nemo44x|3 years ago|reply
[+] [-] matwood|3 years ago|reply
[+] [-] spixy|3 years ago|reply
[+] [-] wheresmycraisin|3 years ago|reply
[+] [-] spjwebster|3 years ago|reply
[+] [-] _tom_|3 years ago|reply
[+] [-] cfeduke|3 years ago|reply
[+] [-] Waterluvian|3 years ago|reply
[+] [-] psnehanshu|3 years ago|reply
[+] [-] mihaic|3 years ago|reply
If anyone has a good one, please chime in.
[+] [-] ChuckMcM|3 years ago|reply
[1] https://www.youtube.com/watch?v=fZLeaSWY37I
[+] [-] idiolects|3 years ago|reply
[+] [-] matwood|3 years ago|reply
[+] [-] phantom_of_cato|3 years ago|reply
[+] [-] II2II|3 years ago|reply
[+] [-] rndgermandude|3 years ago|reply
[+] [-] subroutine|3 years ago|reply
[+] [-] Cyberdog|3 years ago|reply
[+] [-] yieldcrv|3 years ago|reply
I think that 66% selling under 1,000 is just as damning as the "less than 12 copies" thing making the rounds from the DOJ interpretation
Books ain't it!
[+] [-] codazoda|3 years ago|reply
[+] [-] lifeisstillgood|3 years ago|reply
[+] [-] Andrew_nenakhov|3 years ago|reply
[+] [-] mark_l_watson|3 years ago|reply
[+] [-] helsinkiandrew|3 years ago|reply
Sadly, this is why books by celebrities or with a TV or movie tie in are the first to be published and have the most marketing spend.
[+] [-] kazinator|3 years ago|reply
[+] [-] raverbashing|3 years ago|reply
[+] [-] dragonwriter|3 years ago|reply
[+] [-] ZiiS|3 years ago|reply
[+] [-] memling|3 years ago|reply
There's a good book on this called "Damned Lies and Statistics" that I recommend to people.[1,2]
[1]: https://archive.org/details/damnedliesstatis00best
[2]: https://www.ucpress.edu/book/9780520274709/damned-lies-and-s...
[+] [-] erwincoumans|3 years ago|reply
[+] [-] perlgeek|3 years ago|reply
I've published three books with a publisher, and the one I'm most proud of and which got the best reviews (19 ratings on Amazon, all 5 stars) is the most niche, and got the lowest sales numbers.
https://smile.amazon.com/Parsing-Perl-Regexes-Grammars-Recur... if anybody is interested.
[+] [-] jspash|3 years ago|reply
Why was the leading "No" removed?
[+] [-] paulpauper|3 years ago|reply
[+] [-] marcodiego|3 years ago|reply