top | item 46724294

(no title)

>this error does make me pause to wonder how much of the rest of the paper used AI assistance

And this is what's operative here. The error spotted, the entire class of error spotted, is easily checked/verified by a non-domain expert. These are the errors we can confirm readily, with obvious and unmistakable signature of hallucination.

If these are the only errors, we are not troubled. However: we do not know if these are the only errors, they are merely a signature that the paper was submitted without being thoroughly checked for hallucinations. They are a signature that some LLM was used to generate parts of the paper and the responsible authors used this LLM without care.

Checking the rest of the paper requires domain expertise, perhaps requires an attempt at reproducing the authors' results. That the rest of the paper is now in doubt, and that this problem is so widespread, threatens the validity of the fundamental activity these papers represent: research.

discuss

neilv|1 month ago

> If these are the only errors, we are not troubled. However: we do not know if these are the only errors, they are merely a signature that the paper was submitted without being thoroughly checked for hallucinations. They are a signature that some LLM was used to generate parts of the paper and the responsible authors used this LLM without care.

I am troubled by people using an LLM at all to write academic research papers.

It's a shoddy, irresponsible way to work. And also plagiarism, when you claim authorship of it.

I'd see a failure of the 'author' to catch hallucinations, to be more like a failure to hide evidence of misconduct.

If academic venues are saying that using an LLM to write your papers is OK ("so long as you look it over for hallucinations"?), then those academic venues deserve every bit of operational pain and damaged reputation that will result.

derefr|1 month ago

I would argue that an LLM is a perfectly sensible tool for structure-preserving machine translation from another language to English. (Where by "another language", you could also also substitute "very poor/non-fluent English." Though IMHO that's a bit silly, even though it's possible; there's little sense in writing in a language you only half know, when you'd get a less-lossy result from just writing in your native tongue, and then having it translate from that.)

Google Translate et al were never good enough at this task to actually allow people to use the results for anything professional. Previous tools were limited to getting a rough gloss of what words in another language mean.

But LLMs can be used in this way, and are being used in this way; and this is increasingly allowing non-English-fluent academics to publish papers in English-language journals (thus engaging with the English-language academic community), where previously those academics they may have felt "stuck" publishing in what few journals exist for their discipline in their own language.

Would you call the use of LLMs for translation "shoddy" or "irresponsible"? To me, it'd be no more and no less "shoddy" or "irresponsible" than it would be to hire a freelance human translator to translate the paper for you. (In fact, the human translator might be a worse idea, as LLMs are more likely to understand how to translate the specific academic jargon of your discipline than a randomly-selected human translator would be.)

bjourne|1 month ago

There are legitimate, non-cheating ways to use LLMs for writing. I often use the wrong verb forms ("They synthesizes the ..."), write "though" when it should be "although", and forget to comma-separate clauses. LLMs are perfect for that. Generating text from scratch, however, is wrong.

piyh|1 month ago

>I am troubled by people using an LLM at all to write academic research papers.

I'm an outsider to the academic system. I have cool projects that I feel push some niche application to SOTA in my tiny little domain, which is publishable based on many of the papers I've read.

If I can build a system that does a thing, I can benchmark and prove it's better than previous papers, my main blocker is getting all my work and information into the "Arxiv PDF" format and tone. Seems like a good use of LLMs to me.

thomasahle|1 month ago

> And also plagiarism, when you claim authorship of it.

I don't actually mind putting Claude as a co-author on my github commits.

But for papers there are usually so many tools involved. It would be crowded to include each of Claude, Gemini, Codex, Mathematica, Grammarly, Translate etc. as co-authors, even though I used all of them for some parts.

Maybe just having a "tools used" section could work?

mapontosevenths|1 month ago

> It's a shoddy, irresponsible way to work. And also plagiarism, when you claim authorship of it.

It reminds me of kids these days and their fancy calculators! Those new fangled doohickeys just aren't reliable, and the kids never realize that they won't always have a calculator on them! Everyone should just do it the good old fashioned way with slide rules!

Or these darn kids and their unreliable sources like Wikipedia! Everyone knows that you need a nice solid reliable source that's made out of dead trees and fact checked but up to 3 paid professionals!

aydyn|1 month ago

>also plagiarism

To me, this is a reminder of how much of a specific minority this forum is.

Nobody I know in real life, personally or at work, has expressed this belief.

I have literally only ever encountered this anti-AI extremism (extremism in the non-pejorative sense) in places like reddit and here.

Clearly, the authors in NeurIPS don't agree that using an LLM to help write is "plagiarism", and I would trust their opinions far more than some random redditor.

fn-mote|1 month ago

This seems like finding spelling errors and using them to cast the entire paper into doubt.

I am unconvinced that the particular error mentioned above is a hallucination, and even less convinced that it is a sign of some kind of rampant use of AI.

I hope to find better examples later in the comment section.

j2kun|1 month ago

I actually believe it was an AI hallucination, but I agree with you that it seems the problem is far more concentrated to a few select papers (e.g., one paper made up more than 10% of the detected errors).

gold23|1 month ago

Why don't you look at the actual article? There are several more egregious examples, e.g., the authors being cited as "John Smith and Jane Doe"

recursive|1 month ago

What's the big deal with one dead canary? This coal mine's productivity is at record highs!

hojinkoh|1 month ago

> This seems like finding spelling errors and using them to cast the entire paper into doubt.

Well, to be fair, I did encounter this from actual human peer reviewers before the whole LLM thing. People do that.

jvanderbot|1 month ago

The problem is, 10 years ago when I was still publishing even I would let an incorrect citation go through b/c of an old bibtex file or some such.

0xWTF|1 month ago

Yeah, errors of omission are so common that "Errors and Omissions" is a category of professional liability insurance.

ls612|1 month ago

Google scholar and the vagaries of copy/paste errors has mangled bibitex ever since it became a thing, a single citation with these sorts of errors may not even be AI, just “normal” mistakes.

jasonfarnon|1 month ago

agree, I dont find this evidence of AI. It often happened that authors change, there are multiple venues, or I'm using an old version of the paper. We also need to see the denominator. If this google paper had this one bad citation out of 20 versus out of 60.

Also everyone I know has been relying on google scholar for 10+ years. Is that AI-ish? There are definitely errors on there. If you would extrapolate from citation issues to the content in the age of LLMs, were you doing so then as well?

It's the age-old debate about spelling/grammar issues in technical work. In my experience it rarely gets to the point that these errors eg from non-native speakers affect my interpretation. Others claim to infer shoddy content.

andy12_|1 month ago

> However: we do not know if these are the only errors, they are merely a signature that the paper was submitted without being thoroughly checked for hallucinations

Given how stupidly tedious and error-prone citations are, I have no trouble believing that the citation error could be the only major problem with the paper, and that it's not a sign of low quality by itself. It would be another matter entirely if we were talking about something actually important to the ideas presented in the paper, but it isn't.