They completely missed, with 1800+ citations, the winner of the “Theory of Cryptography Conference (TCC) 2016 Test of Time award”: “Calibrating Noise to Sensitivity in Private Data Analysis” by Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Oh, it also just won the 2017 Gödel Prize; it really ought to be at the top of both the “Theoretical Computer Science” and “Computer Security and Cryptography” lists.
Worse still, with ~3000 citations, Dwork’s “Differential Privacy” (ICALP (2) 2006: 1-12), should rank even higher in the Theoretical Computer Science list. But Google Scholar has completely lost track of that foundational paper; it’s got it all confused with a completely different paper, Dwork’s 2008 “Differential Privacy: A Survey of Results”. Note that this also means that anybody searching for the general topic “differential privacy” on Google Scholar will not get to see the most-cited paper about it! https://www.microsoft.com/en-us/research/wp-content/uploads/...
Disclaimer: Dwork and I have been seen together for 24 years.
From the article: "This release of classic papers consists of articles that were published in 2006..". Your second one could be there (I haven't looked for it), but you mention some problems with how Scholar tracks that paper; maybe that's why it's missing.
This has left me scratching my head: why just 2006? Taking a single year of publications and labeling them "Classic Papers" is pretty misleading, since the term suggests a wide gamut of publications over a much longer period. It should just be called "Top papers from 2006". Unless this expands to cover at least a decade, it shouldn't be labeled as such.
This almost sounds like collecting my most liked pics from 2006 on Facebook and creating an album "Best moments of my life".
I was expecting "classic" to mean papers like The Part-Time Parliament, A Mathematical Theory of Communication, The UNIX Time-Sharing System, etc. Certainly was in for a surprise...
They certainly do have data prior to 2006, based on Google Scholar results. It seems like an odd choice, but it's explicitly stated that these articles were chosen because they're roughly 10 years old.
I do find some of their choices a bit odd, though. Surely they can come up with better examples? The BigTable paper (OSDI '06) out of Google itself has far more citations (~4x, per Google Scholar citation counts) than the highest-ranked DB paper, and I'd say it's much higher impact than any of them, being one of the early papers of the NoSQL movement. I'd understand if the algorithm in play were more nuanced, but the introductory page explicitly states that these are the most-cited papers of 2006, which doesn't seem to be the case.
Obligatory disclaimer: despite my current employment status, these views don't represent Google's.
This should be as simple as running a query against, e.g., Scholar's data: select an area/field and sort by most cited, while ignoring citations that occur within x years of publication. One could also expand the citation relation transitively (like PageRank, but without cycles).
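A rough sketch of what I mean, on made-up toy data (the paper IDs, years, and the lag cutoff are all assumptions for illustration, not real Scholar data):

```python
from collections import defaultdict

# Toy data, purely illustrative: publication years and (citing, cited, year) edges.
pub_year = {"A": 2006, "B": 2006, "C": 2010, "D": 2015}
cites = [("C", "A", 2010), ("D", "A", 2015), ("D", "B", 2015), ("C", "B", 2007)]

def lagged_counts(pub_year, cites, lag=5):
    """Most-cited ranking, ignoring citations made within `lag` years of publication."""
    counts = defaultdict(int)
    for citing, cited, year in cites:
        if year - pub_year[cited] >= lag:
            counts[cited] += 1
    return sorted(counts.items(), key=lambda kv: -kv[1])

def transitive_score(papers, cites):
    """Expand the citation relation transitively: each citer contributes
    1 plus its own score (assumes the citation graph is acyclic)."""
    cited_by = defaultdict(set)
    for citing, cited, _ in cites:
        cited_by[cited].add(citing)
    memo = {}
    def score(p):
        if p not in memo:
            memo[p] = sum(1 + score(c) for c in cited_by[p])
        return memo[p]
    return {p: score(p) for p in papers}
```

The lag filter rewards staying power rather than a burst of early cites, which is roughly what a "classic" ranking should measure.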
As one might guess, there is a lot wrong with this list even within their stated goals. My examples are drawn from mathematics, since that's what I know. They appear to use the journal to classify category, which doesn't work very well since many of the best results are published in general journals. Additionally, since citation counts vary so widely between sub-fields, there is a strong pull towards selecting misclassified work from higher-citation fields. For example, the paper "High-dimensional centrally symmetric polytopes with neighborliness proportional to dimension" is listed in geometry but belongs elsewhere, and there are no probability papers in the category "Probability and Statistics with Applications". Also, the "Pure & Applied" category is meaningless; that list seems to be the most-cited papers from five arbitrary journals. I guess it's a reminder that these problems are hard to automate, and that your work doesn't have to be perfect to share.
Cognitive Science suffers from the same problem of misclassifications from higher-citation fields (neuroscience).
Agreed that projects don't have to be perfect, but they do have to have some functionality to ship... Given the problems, I don't see how I could use this to construct a course reading list or to improve my understanding of my academic field.
Also, were you able to find any papers in number theory? That's a huge gap, as it's one of mathematics' primary subfields. Analysis seems to be represented, as does topology (via "geometry").
Out of curiosity, does anyone have any examples of scientific books (or papers) that are the exact opposite: influential or famous at the time, but completely and utterly destroyed by the test of time? Ones that seem silly to us now in how thoroughly wrong every single one of their conclusions turned out to be.
I'm thinking about research versions of Lord Kelvin's famous edict, "Heavier-than-air flying machines are impossible", or the patent person (examiner? head of patent office?) who in the nineteenth century said that everything that can be invented has been invented.
Not a field, but a person who everyone thought was Nobel prize bound and it turned out to be all BS. You may think that it's just one person, but the amount of research dollars that got allocated to try and prove or disprove all of this work would be staggering. https://en.wikipedia.org/wiki/Schön_scandal
Sure, fields of research go obsolete all the time. E.g., much of the computer vision work from 2006 is basically dead now. If you go further back, a lot of early AI research was exciting at the time but is entirely forgotten now.
Methodology is not described and the resulting collections are of notably poor quality. Given Google's privileged position in knowledge production I wish they would be far more careful in cases like this.
For everyone disappointed to see papers only from 2006, here is a consolation prize. Creating a Computer Science Canon: a Course of “Classic” Readings in Computer Science: http://l3d.cs.colorado.edu/~ctg/pubs/sigcsecanon.pdf (CS only, date range = [1806:2006])
This is also very interesting: the AAAI Classic Paper Award.
The AAAI Classic Paper award honors the author(s) of paper(s) deemed most influential, chosen from a specific conference year. Each year, the time period considered will advance by one year.
Papers will be judged on the basis of impact, for example:
Started a new research (sub)area
Led to important applications
Answered a long-standing question/issue or clarified what had been murky
Made a major advance that figures in the history of the subarea
Has been picked up as important and used by other areas within (or outside of) AI
Has been very heavily cited
In the Middle Eastern and Islamic Studies section, five of the ten cited papers are about Turkey. Another is about representation of Islam in the Australian media.
This... doesn't seem like a very representative selection of 'timeless' papers.
The security examples were weak. Far more influential were the Ware or Anderson reports, MULTICS security evaluation, anything describing Orange Book-style systematic assurance of whole systems, at least one on capability-security or by Butler Lampson (did access control too), something on monitoring/logging, something on static analysis, CompCert or Coq, and so on.
These were things that had a major impact on the problems they focused on, and that many other papers doing something similar built on or constantly referenced. I'm skeptical of citations in general, since those who chase them usually produce a high number of quotable papers in whatever fad is popular instead of hard, deep, critical work. Those I listed are the latter, with who knows what citation counts. The collection is probably still nice for finding neat ideas or just learning in general.
The point of the exercise is to find papers that are widely considered valuable, especially to other researchers. To do this, they're using citation counts.
There's obviously a number of problems with citations, including self-cites, negative citations ("Alice & Bob '06 shook the community when they found things, but our better, larger study finds no evidence of any effect"), and such. But it makes sense for a company built upon citation rank indexing to rely on such methods =)
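As a toy illustration of one of those problems, self-cites could be discounted if you know each paper's author set (the papers and names below are invented for illustration, not anyone's real method):

```python
# Toy sketch: discount self-citations, assuming we know each paper's author set.
authors = {
    "P1": {"alice"},
    "P2": {"alice", "bob"},
    "P3": {"carol"},
}
cites = [("P2", "P1"), ("P3", "P1")]  # (citing paper, cited paper)

def counts_without_self_cites(authors, cites):
    """Citation counts, skipping any edge where citer and cited share an author."""
    counts = {p: 0 for p in authors}
    for citing, cited in cites:
        if authors[citing] & authors[cited]:
            continue  # shared author: treat as a self-citation, don't count it
        counts[cited] += 1
    return counts
```

Even this crude filter changes rankings noticeably in small fields, which is part of why raw counts are a rough proxy for "widely considered valuable".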
Do they not have data before 2006?
As they said in the post, they're measuring cites 10 years after. It's 2017. I imagine 2006 is their "inaugural year."
I naively thought this was a simple thing and that someone already had a "collection of best articles". It's looking more like "this is a hard problem".
For more papers, there is a nice list here, not limited to 2006: http://jeffhuang.com/best_paper_awards.html
There are a bunch more places to get papers listed here too: https://github.com/papers-we-love/papers-we-love#other-good-...
https://www.google.com/amp/s/selfcitation.wordpress.com/2011...
https://aaai.org/Awards/classic.php
https://en.wikipedia.org/wiki/The_Source#The_Source.27s_Five...