More than twenty years ago, I had fun tracing a similar phenomenon: English “proverbs” that appeared in English dictionaries and textbooks published in Japan but that did not seem to have any actual currency in English. It became clear that they had been copied from dictionary to dictionary for decades before large-scale corpora and search engines made it possible to check actual usage.
A friend made an edit (though you could call it vandalism) to Wikipedia 20 years back. He linked from several pages to a non-existent apportionment method, and created an article with a fairer version of d'Hondt for elections, quite ingenious and probably more fair than the popular alternatives in most cases. He named it after himself (he has an unusual last name and capitalised on that).
It didn't take long for the page to be dropped for being original research, and he didn't put it anywhere else.
To this day, you can still find pages and people referencing the method.
Edit: a quick check and Grok and ChatGPT have scraped it, Gemini hallucinates something unrelated.
Of course much of the web is composed of copying, and of course copies of Wikipedia are copied--that's hardly relevant. But science journals are another matter. From the article: "shouldn't the peer reviewers and proofreaders at a top journal catch this error?"
I'm inclined to give them a pass. It's easy enough to figure out that it should be germanium and not gadolinium, and dyslexia already exists among scientists. Context provides enough information to correct the record.
I didn't catch the error the first time around because I autocorrected to Ge--there are only so many anions that can make that formula work and staring at these formulas all day long can make you go cross eyed anyway.
What I think is more dangerous to understanding is skipping formulas in favor of initials! BFO instead of BiFeO3, or BT instead of Bi2Te3, SRO for SrRuO3, LSFO for La0.3Sr0.7FeO3 abbreviations that I think obscure too much detail. You can more easily wander into talking about different things with the same terms. Such abbreviations are already endemic in condensed matter physics.
> I'm inclined to give them a pass. It's easy enough to figure out that it should be germanium and not gadolinium, and dyslexia already exists among scientists.
People make mistakes and you probably mean well but this is also the sort of pass given that makes scientific research and reporting terrible.
If it's "easy enough to figure out" then it's even more important to get it right -- why should we trust someone who can't even get the "easy" things right?
> ... and dyslexia already exists among scientists.
The article is pointing out a problem that appears to be fairly common; is that really a suitable explanation? Even if it is a suitable explanation, is that a reason for lowering standards, which you can then apply to explain away every mistake?
Keep in mind that proper publications should usually have been reviewed by at least 3 people including the authors (typically more) by the time everyone else gets to read it. So that kind of mistake isn't really acceptable.
> What I think is more dangerous to understanding is skipping formulas in favor of initials! BFO instead of BiFeO3, or BT instead of Bi2Te3, SRO for SrRuO3, LSFO for La0.3Sr0.7FeO3 abbreviations that I think obscure too much detail. You can more easily wander into talking about different things with the same terms. Such abbreviations are already endemic in condensed matter physics.
If you have been trained in scientific writing, you would always introduce an abbreviation. For example, "BiFeO3 (BFO)" and "SrRuO3 (SRO)". It's also common to include a list of abbreviations in some forms of scientific writing.
> I'm inclined to give them a pass. It's easy enough to figure out that it should be germanium and not gadolinium, and dyslexia already exists among scientists.
I’m not. If someone said Pi was 9.14 I think no one would give it a pass. It’s not like a misspelling. It’s an invalid element which is the chemistry equivalent of an absurdly wrong number in maths.
The typo is not the problem; it's that the typo is evidence of academic dishonesty.
When you make a citation, it means you cracked open the original work, understood what it says and located a relevant passage to reference in your work.
The authors are propagating the same typo because they are not copying the original correct text; they are just copying ready-made citations of that text which they plant into their papers to manufacture the impression that they are surveying other work in their area and taking it into account when doing their work.
They survey one or two works, and then just steal their citations to make it look like they also surveyed 19 other works.
Problem is, the citations in those works are already copies of borrowed citations from some other paper, which copied some of them from another paper, and that was the honest one that made a typo in a genuine, organically grown citation.
Researchers are blindly copying and pasting lists of citations into papers, because they did original work in a vacuum; i.e. without taking the time to study anyone else's work in the same area to understand where the field is at. Since papers without citations, or with too few citations, are giant red flags for publication, they need to generate something to mask the problem.
Laurent Bossavit wrote a whole book about similar cases that occurred in the IT world, “The Leprechauns of Software Engineering: How folklore turns into fact and what to do about it”
Gr is the science journal version of Van Halen's brown M&M rider -- it's how you can tell the reviewers and the authors had no idea what they were doing and just copy pasted junk around.
I think established authors should try to sprinkle obvious mistakes like that on purpose once in a while in the literature and then see how much it spreads.
The Van Halen one is true. They had a crazy tour set up for the time and had very intense electricity requirements where if something wasn't properly set up it could literally kill someone. Any musician who has played a shitty venue has been zapped by a mic. The brown M&Ms were a canary in a coal mine to see if requirements were being followed. You can go on Snopes and literally see a concert rider from them.
If you ask ChatGPT about Cr2Gr2Te6 then it will correct you. The author's worry might be unfounded.
Though since he didn't date his article, it's unclear how long it has been out there, and so also unclear whether it made its way into training data. Judging from the comments and the URL, it's quite new, but again, he should add a date to his articles.
As any practicing scientist knows, even good research papers may be littered with blatant but unimportant errors. There is unfortunately no good reason or system to "correct the record", and it is not clear to me if such a thing is a good use of human resources. Nonetheless, I think correcting the record is always appreciated!
Getting a compound incorrect is not an "unimportant" error (for example the difference between sodium nitrate & sodium nitrite is small but critical) and seeing "small but blatant" errors actively propagated is the entire reason why the record should be corrected. The only upside of these little artifacts like "vegetative electron microscopy" [0] is that it's a leading indicator that the entire paper and team deserve more scrutiny, as do any of those who cite it.
That is a possible, but charitable explanation. I would like to hold your opinion, but don't know if I can. It must compete with less-charitable ones.
Folks need a linter, a compound checker, as part of their writing workflow. This seems like a simple idea that would make these errors show up with a squiggly red line under them.
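For what it's worth, the core of such a checker is easy to prototype. Here is a minimal sketch in Python, under some loud assumptions: `ELEMENTS`, `SYMBOL` and `unknown_symbols` are hypothetical names made up for illustration (not from any existing tool), and the input is assumed to be a plain-text formula like the ones quoted in this thread. It only flags symbols that are not in the periodic table, so it would catch "Gr" in Cr2Gr2Te6, but it cannot notice a typo that happens to land on another real element.

```python
import re

# All 118 element symbols, used as the "dictionary" for the check.
ELEMENTS = set("""
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar
K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr
Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe
Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu
Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn
Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr
Rf Db Sg Bh Hs Mt Ds Rg Cn Nh Fl Mc Lv Ts Og
""".split())

# A candidate symbol: one uppercase letter, optionally followed by one lowercase letter.
SYMBOL = re.compile(r"[A-Z][a-z]?")


def unknown_symbols(formula: str) -> list[str]:
    """Return the symbol-like tokens in `formula` that are not real elements."""
    return [tok for tok in SYMBOL.findall(formula) if tok not in ELEMENTS]


if __name__ == "__main__":
    for formula in ("Cr2Ge2Te6", "Cr2Gr2Te6", "La0.3Sr0.7FeO3"):
        bad = unknown_symbols(formula)
        print(formula, "->", "ok" if not bad else f"unknown symbol(s): {bad}")
```

A real linter would also have to find formulas inside running prose and deal with abbreviations like BFO, which this sketch would split into B, F and O and accept as valid.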
I’m beginning to think my reluctance to shamelessly copy has held me back in life. It’s clearly more widespread than I naively assumed (and I say that without casting judgment).
Ok, but if they used the right reference it'd be the wrong reference. Just like when a code base contains typos. You know it's a typo, but if you try to fix it, you never really know how it's referenced externally to your code base.
[+] [-] tkgally|7 months ago|reply
“Every man has his humo(u)r.”
https://www.gally.net/leavings/00/0001.html
“Losers are always in the wrong.”
https://www.gally.net/leavings/00/0098.html
In their heyday, dozens of English-Japanese dictionaries were published in Japan:
https://www.gally.net/leavings/00/0005.html
Producing an original dictionary from scratch would have been expensive and time consuming, so most publishers borrowed liberally from each other.
[+] [-] floren|7 months ago|reply
Craunch the marmoset!
[+] [-] Izkata|7 months ago|reply
I think this is the likely origin: https://en.wikipedia.org/wiki/Every_Man_in_His_Humour
But I think I've also heard the "has" phrasing as a pun on https://en.wikipedia.org/wiki/Humorism
[+] [-] javawizard|7 months ago|reply
[+] [-] thrashwerk|7 months ago|reply
[+] [-] johnea|7 months ago|reply
I recently corrected an error in this wikipedia article:
https://en.wikipedia.org/wiki/Cape_Shionomisaki
Which stated: "Geologically, the cape is a flat uplifted seafood plateau"
My comment for the change: I'm not an oceanographer, but I'm pretty sure it's not a "seafood plateau". Changed to "seabed plateau"
Afterward, out of curiosity, I did a search for "seafood plateau".
I was shocked at the number of sites that exactly copied that error along with the rest of the page. Most of these sites were clones of wikipedia with the inclusion of ads.
It didn't seem that these sites were LLM generated (they were exact copies), but this seems to be the case for many scientific paper submissions now.
Where it all goes from here is extremely unclear, but it does seem that a disruption to many fields which are dependent on written material is in progress...
[+] [-] hidroto|7 months ago|reply
[+] [-] fer|7 months ago|reply
[+] [-] Animats|7 months ago|reply
[+] [-] jibal|7 months ago|reply
[+] [-] muhdeeb|7 months ago|reply
[+] [-] h4ny|7 months ago|reply
[+] [-] pseudochemist|7 months ago|reply
[+] [-] Waterluvian|7 months ago|reply
[+] [-] kazinator|7 months ago|reply
[+] [-] kazinator|7 months ago|reply
[+] [-] arialdomartini|7 months ago|reply
[+] [-] luma|7 months ago|reply
kens is a national treasure.
[+] [-] pimlottc|7 months ago|reply
[+] [-] rdtsc|7 months ago|reply
[+] [-] zh3|7 months ago|reply
[+] [-] readthenotes1|7 months ago|reply
[+] [-] rozab|7 months ago|reply
https://en.wikipedia.org/wiki/Sokal_affair
https://en.wikipedia.org/wiki/Grievance_studies_affair
[+] [-] thinkingtoilet|7 months ago|reply
[+] [-] ddingus|7 months ago|reply
They are copying data and placing it into documents.
Obviously, these are not the same thing.
[+] [-] nullc|7 months ago|reply
[+] [-] teiferer|7 months ago|reply
[+] [-] dawnofdusk|7 months ago|reply
[+] [-] jessfyi|7 months ago|reply
[0] https://www.sciencealert.com/a-strange-phrase-keeps-turning-...
[+] [-] the__alchemist|7 months ago|reply
[+] [-] zh3|7 months ago|reply
[0] https://en.wikipedia.org/wiki/De_minimis
[+] [-] thewanderer1983|7 months ago|reply
[+] [-] jibal|7 months ago|reply
[+] [-] ElijahLynn|7 months ago|reply
[+] [-] st3fan|7 months ago|reply
[+] [-] halo|7 months ago|reply
[+] [-] olddustytrail|7 months ago|reply
How many papers have the correct formula?
[+] [-] Martin_Silenus|7 months ago|reply
[+] [-] adornKey|7 months ago|reply
In Quantum Mechanics the professors of my University consistently confused the terms Tensor-Product and Direct-Product. They all taught in lecture that the Tensor-Product was called "Direct-Product". In Mathematics this is just wrong. The definitions of what is what have been clear for about 100 years...
I called them out on that. The end result was that the professor offered a bet in front of the audience that he was right. The thing was simple - you just have to look up the definitions in any mathematical book. But nobody did this... Next lecture the professor declared himself the winner of the bet. The audience collected money. And at the next big student event they presented him with a bottle of some nice alcohol as a prize for his win. (They stopped the music for the party and made a big event about handing him the bottle)...
I learned that in University people aren't even able to look up a mathematical definition in a book... Nobody cares, especially those students that like to organise things don't - they have surely meanwhile made careers as big heads in University councils...
The explanation for the confusion, I think, is that in the beginning of QM the terms in mathematics were not so well defined yet. In 1910 Physics people most likely copied some wrong terminology - and some of it most likely can still be found in footnotes somewhere in physics - or in some oral tradition of local groups of professors.
[+] [-] furyg3|7 months ago|reply
[+] [-] ungreased0675|7 months ago|reply
[+] [-] michaelg7x|7 months ago|reply
[+] [-] cyanydeez|7 months ago|reply
[+] [-] jibal|7 months ago|reply