j2kun|1 month ago
The paper was https://openreview.net/forum?id=0ZnXGzLcOg and the problem flagged was "Two authors are omitted and one (Kyle Richardson) is added. This paper was published at ICLR 2024." I.e., for one cited paper, the author list was off and the venue was wrong. The citation appeared in the background section and was not fundamental to the paper's validity. So the citation was not fabricated, but it was incorrectly attributed (perhaps via an AI autocomplete).
I think there are some egregious papers in their dataset, and this error does make me pause to wonder how much of the rest of the paper used AI assistance. That said, the "single error" papers in the dataset seem similar to the one I checked: relatively harmless and minor errors (which would be immediately caught by a DOI checker), and so I have to assume some of these were included in the dataset mainly to amplify the author's product pitch. It succeeded.
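For the curious, here's a minimal sketch of the kind of DOI check I mean, against the public Crossref API (the JSON field names match Crossref's documented schema; the function names and comparison logic are just illustrative):

```python
import requests

def crossref_authors(doi: str) -> set[str]:
    """Fetch the canonical author family names Crossref lists for a DOI."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    resp.raise_for_status()
    authors = resp.json()["message"].get("author", [])
    return {a["family"].lower() for a in authors if "family" in a}

def flag_bogus_surnames(doi: str, cited_surnames: list[str]) -> list[str]:
    """Return surnames credited in the bib entry that Crossref doesn't list."""
    canonical = crossref_authors(doi)
    return [s for s in cited_surnames if s.lower() not in canonical]
```

An added author like the Richardson one above would come back flagged immediately; catching omitted authors is just the reverse set difference.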
i_am_proteus|1 month ago
And this is what's operative here. The error spotted, and indeed the entire class of error it belongs to, is easily checked and verified by a non-domain expert. These are the errors we can confirm readily, the ones with an obvious and unmistakable signature of hallucination.
If these are the only errors, we are not troubled. However: we do not know that these are the only errors; they are merely a signature that the paper was submitted without being thoroughly checked for hallucinations. They are a signature that some LLM was used to generate parts of the paper, and that the responsible authors used that LLM without care.
Checking the rest of the paper requires domain expertise, perhaps requires an attempt at reproducing the authors' results. That the rest of the paper is now in doubt, and that this problem is so widespread, threatens the validity of the fundamental activity these papers represent: research.
neilv|1 month ago
I am troubled by people using an LLM at all to write academic research papers.
It's a shoddy, irresponsible way to work. And also plagiarism, when you claim authorship of it.
I'd see a failure of the 'author' to catch hallucinations as more like a failure to hide evidence of misconduct.
If academic venues are saying that using an LLM to write your papers is OK ("so long as you look it over for hallucinations"?), then those academic venues deserve every bit of operational pain and damaged reputation that will result.
fn-mote|1 month ago
I am unconvinced that the particular error mentioned above is a hallucination, and even less convinced that it is a sign of some kind of rampant use of AI.
I hope to find better examples later in the comment section.
jasonfarnon|1 month ago
Also, everyone I know has been relying on Google Scholar for 10+ years. Is that AI-ish? There are definitely errors on there. If you would extrapolate from citation issues to the content in the age of LLMs, were you doing so then as well?
It's the age-old debate about spelling/grammar issues in technical work. In my experience it rarely gets to the point that these errors (e.g., from non-native speakers) affect my interpretation. Others claim to infer shoddy content from them.
andy12_|1 month ago
Given how stupidly tedious and error-prone citations are, I have no trouble believing that the citation error could be the only major problem with the paper, and that it's not a sign of low quality by itself. It would be another matter entirely if we were talking about something actually important to the ideas presented in the paper, but it isn't.
anishrverma|1 month ago
What I find more interesting is how easy these errors are to introduce and how unlikely they are to be caught. As you point out, a DOI checker would immediately flag this. But citation verification isn’t a first-class part of the submission or review workflow today.
We’re still treating citations as narrative text rather than verifiable objects. That implicit trust model worked when volumes were lower, but it doesn’t seem to scale anymore.
There’s a project I’m working on at Duke University, where we are building a system that tries to address exactly this gap by making references and review labor explicit and machine-verifiable at the infrastructure level. There’s a short explainer that lays out what we mean, if more context is useful: https://liberata.info/
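To make the "verifiable objects" point concrete, here is a toy sketch of the shape of the idea. This is not our actual design; the names and threshold are made up. The point is a reference carrying enough structure to be checked mechanically against a registry like Crossref:

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

import requests

@dataclass
class Reference:
    """A citation as a checkable object, not a blob of narrative text."""
    doi: str
    claimed_title: str

    def verify(self, threshold: float = 0.9) -> bool:
        """Compare the claimed title against the Crossref record for the DOI."""
        resp = requests.get(f"https://api.crossref.org/works/{self.doi}", timeout=10)
        resp.raise_for_status()
        titles = resp.json()["message"].get("title", [])
        if not titles:
            return False  # no canonical title on record; flag for human review
        similarity = SequenceMatcher(
            None, self.claimed_title.lower(), titles[0].lower()
        ).ratio()
        return similarity >= threshold  # below threshold -> send to a human
```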
arjvik|1 month ago
I wouldn't trust today's GPT-5-with-web-search to turn a bullet-point list of papers into proper citations without checking myself, but maybe I will trust GPT-X-plus-agent to do this.
worik|1 month ago
...and including the erroneous entry is squarely the author's fault.
Papers should be carefully crafted, not churned out.
I guess that makes me sweetly naive.
davidguetta|1 month ago
There was dumb stuff like this before the GPT era; it's far from convincing.
ls612|1 month ago
Also, in my field (economics), by far the biggest source of finding old papers invalid (or less valid; most papers state multiple results) is good old-fashioned coding bugs. I'd like to see the software engineers on this site say with a straight face that writing bugs should lead to jail time.
nativeit|1 month ago
I don’t think the point being made is “errors didn’t happen pre-GPT”, rather that the task of detecting errors has become increasingly difficult because of the associated effects of GPT.
fmbb|1 month ago
Well, the title says “hallucinations”, not “fabrications”. What you describe sounds exactly like what AI builders call hallucinations.
lou1306|1 month ago
They are not harmless. These hallucinated references are ingested by Google Scholar, Scopus, etc., and with enough time they will poison those wells. It is also plain academic malpractice, no matter how "minor" the reference is.
ainch|1 month ago
Not to say that you could ever feasibly detect all AI-generated text, but if it's possible for people to develop a sense for the tropes of LLM content, then there's no reason you couldn't detect it algorithmically.
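Even something deliberately naive like the sketch below would be a starting point. The phrase list and the per-1,000-words normalization are invented for illustration, and a real detector would need far more than a trope list, but the point is that a learnable stylometric signal plausibly exists:

```python
# Crude stylometric scorer: count stock LLM phrasings per 1,000 words.
# The trope list is illustrative, not a vetted feature set.
TROPES = [
    "delve into",
    "it is important to note",
    "in the realm of",
    "plays a crucial role",
    "a testament to",
]

def trope_score(text: str) -> float:
    """Trope hits per 1,000 words; higher suggests LLM-flavored prose."""
    lowered = text.lower()
    words = lowered.split()
    hits = sum(lowered.count(phrase) for phrase in TROPES)
    return 1000 * hits / max(len(words), 1)
```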
StopDisinfo910|1 month ago
If the mistake is one error of author and venue in a citation, I find it fairly disingenuous to call that a hallucination. At least, it doesn't meet the threshold for me.
I have seen this kind of mistake made long before LLMs were even a thing. We used to call them just that: mistakes.
j2kun|1 month ago
> I don't share your view that hallucinated citations are less damaging in background section.
Who exactly is damaged in this particular instance?