The example in the OP ("fuck") was so common until the early 1800s because of the long s (ſ), a letterform printers used in place of s that looks very much like an f and gets read as one when old books are scanned. In other words, the word "suck" was being printed in a form that reads as "fuck", which is why it shows up so often before then.
If we assume all pre-1800ish mentions of 'fuck' are definitely meant to be 'suck', it still features much more prominently in the corpus beforehand than after.
Any ideas why that might be? E.g. certain types of text that were more common before that era, or other (less, er, 'suck'y) types of text that came after, 'diluting' the corpus?
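To make the long-s explanation concrete, here is a minimal sketch of how one might list the words a scanned "f" spelling could originally have been. The helper names and the tiny word list are hypothetical, made up for illustration; nothing here reflects how Google actually processes its scans.

```python
from itertools import product

def long_s_candidates(word):
    """Return every spelling obtained by reading a non-final 'f' as 's'."""
    options = []
    for i, ch in enumerate(word):
        # The long s was not normally used at the end of a word,
        # so a final "f" is left alone.
        if ch == "f" and i < len(word) - 1:
            options.append(("f", "s"))
        else:
            options.append((ch,))
    # Build all combinations, then drop the unchanged input spelling.
    return {"".join(chars) for chars in product(*options)} - {word}

# Toy word list standing in for a real dictionary (an assumption).
KNOWN_WORDS = {"suck", "such", "same", "soft"}

def plausible_originals(word):
    """Keep only the candidate spellings that are known words."""
    return sorted(long_s_candidates(word) & KNOWN_WORDS)
```

Calling `plausible_originals("fuck")` against this toy list returns `["suck"]`. A real check would need a period-appropriate dictionary, and it still could not tell which reading a given book intended.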
I have no idea, but my guess is that they don't know the dates for some books and the system automatically classifies the publication date as "1900" or "1901." If you search the word "quark," you also get a bump at around 1900 even though the word wasn't coined until Joyce's Finnegans Wake in 1939.
I find it kind of interesting that a lot of words peak around the middle of the 19th century and have been in decline ever since. I'm guessing this has something to do with the increasing number of books published, but it is still hard for me to imagine that "the" is less commonly used now than a hundred years ago. The pattern holds for a lot of common words...
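One thing that may help with the question above: as far as I understand it, the graphs show a word's share of all words printed in a given year, not a raw count. A rough sketch of that normalization, with invented numbers and a hypothetical function name:

```python
def relative_frequency(word_counts, total_counts):
    """Map year -> share of all words in that year taken by one word."""
    return {
        year: word_counts[year] / total_counts[year]
        for year in word_counts
        if total_counts.get(year)  # skip years with no (or zero) total
    }

# Invented numbers for illustration only, not real corpus data.
word_counts = {1850: 60_000, 1950: 500_000}
total_counts = {1850: 1_000_000, 1950: 10_000_000}

shares = relative_frequency(word_counts, total_counts)
```

In this made-up example the word is printed far more often in 1950 than in 1850, yet its share drops because the total grew faster. So more books alone would not push a share down; the mix of what gets printed has to change too.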
EliAndrewC | 15 years ago
orls | 15 years ago
tseabrooks | 15 years ago
jcr | 15 years ago
Groxx | 15 years ago
Potentially even more awesome is that they have the entire dataset available for download o_O
edit: case sensitivity is more fun than insensitivity: http://ngrams.googlelabs.com/graph?content=Star+Trek%2Cstar+... vs http://ngrams.googlelabs.com/graph?content=star+trek%2CStar+...
edit2: there are a whole bunch of geek-term bumps around and just after 1900. Anyone know why? E.g.: http://ngrams.googlelabs.com/graph?content=Star+Wars&yea...
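On the downloadable dataset and the case-sensitivity point: a small sketch of summing counts per year with and without case folding. The tab-separated layout assumed here (ngram, year, match count, then further columns) is an assumption to verify against the dataset's own documentation, and the sample rows are made up.

```python
from collections import defaultdict

def counts_by_year(lines, fold_case=False):
    """Sum match counts per (ngram, year), optionally ignoring case."""
    totals = defaultdict(int)
    for line in lines:
        # Assumed layout: ngram <TAB> year <TAB> match_count [<TAB> more columns]
        ngram, year, match_count = line.rstrip("\n").split("\t")[:3]
        key = ngram.lower() if fold_case else ngram
        totals[(key, int(year))] += int(match_count)
    return dict(totals)

# Made-up sample rows, not taken from the real dataset.
sample = [
    "Star Trek\t1990\t120\t80\t40",
    "star trek\t1990\t7\t7\t5",
    "Star Trek\t1991\t150\t90\t45",
]

exact = counts_by_year(sample)
folded = counts_by_year(sample, fold_case=True)
```

With these rows, `exact` keeps "Star Trek" and "star trek" apart, while `folded` merges them under one lowercase key.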
splat | 15 years ago
PetrolMan | 15 years ago
sylvinus | 15 years ago
http://ngrams.googlelabs.com/graph?content=google&year_s...
edge17 | 15 years ago
thekevan | 15 years ago
http://ngrams.googlelabs.com/graph?content=smartphone&ye...
(Actually, "internet" also has a similar spike. I suspect some books are mislabeled in their dates.)
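If mislabeled dates are the cause, the spikes should stand out as isolated years far above their neighbours. A minimal sketch of flagging them, where the function name, window, threshold, and series are all made up for illustration:

```python
from statistics import median

def suspicious_years(freq_by_year, window=5, factor=10.0):
    """Return years whose value is far above the median of nearby years."""
    flagged = []
    for year, value in sorted(freq_by_year.items()):
        neighbours = [
            v for y, v in freq_by_year.items()
            if y != year and abs(y - year) <= window
        ]
        if not neighbours:
            continue
        baseline = median(neighbours)
        # Any hits at all against a zero baseline also count as suspicious.
        if value > 0 and value > factor * baseline:
            flagged.append(year)
    return flagged

# Invented series for a term that should not appear this early.
series = {1895: 0, 1896: 0, 1897: 0, 1898: 0, 1899: 0, 1900: 12,
          1901: 0, 1902: 0, 1903: 0, 1904: 0, 1905: 0}
```

Here `suspicious_years(series)` flags only 1900. The same rule could also flag the first year of a sharp but genuine rise, so a hit is a prompt to look at the underlying books, not proof of a bad date.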
nrkn | 15 years ago
http://www.google.com/search?q=%22internet%22&tbs=bks:1,...
jalmos | 15 years ago
http://ngrams.googlelabs.com/graph?content=nigger&year_s...
iunk | 15 years ago
Groxx | 15 years ago
Perhaps weirder, "Woot": http://ngrams.googlelabs.com/graph?content=Woot&year_sta...
ryan42 | 15 years ago
http://ngrams.googlelabs.com/graph?content=liberty&year_...
http://ngrams.googlelabs.com/graph?content=l33t&year_sta...
http://ngrams.googlelabs.com/graph?content=hacker&year_s...
samuel1604 | 15 years ago
dlsspy | 15 years ago