There is a general problem with rewarding people for the volume of stuff they create, rather than the quality.
If you incentivize researchers to publish papers, individuals will find ways to game the system, meeting the minimum quality bar, while taking the least effort to create the most papers and thereby receive the greatest reward.
Similarly, if you reward content creators based on views, you will get view maximization behaviors. If you reward ad placement based on impressions, you will see gaming for impressions.
Bad metrics or bad rewards cause bad behavior.
We see this over and over because the reward issuers are designing systems to optimize for their upstream metrics.
Put differently, the online world is optimized for algorithms, not humans.
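The incentive problem above can be sketched as a toy model: researchers split a fixed effort budget across papers, and the reward counts papers that clear a minimum quality bar. All numbers here are illustrative assumptions, not data about any real field.

```python
BUDGET = 100          # total effort a researcher has available
QUALITY_BAR = 10      # minimum effort for a paper to pass review

def reward(effort_per_paper):
    """Papers published under a 'count the papers' incentive."""
    if effort_per_paper < QUALITY_BAR:
        return 0                       # below the bar: rejected
    return BUDGET // effort_per_paper  # papers that clear the bar

# Sweeping all strategies shows the optimum sits exactly at the bar:
best = max(range(1, BUDGET + 1), key=reward)
print(best, reward(best))  # 10 papers, each at minimum acceptable quality
```

Any effort above the bar only reduces the payoff, which is the gaming behavior described above.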
> There is a general problem with rewarding people for the volume of stuff they create, rather than the quality.
> If you incentivize researchers to publish papers, individuals will find ways to game the system,
I heard someone say something similar about the “homeless industrial complex” on a podcast recently. I think it was San Francisco that pays NGOs funds for homeless aid based on how many homeless people they serve. So the incentive is to keep as many homeless around as possible, for as long as possible.
> rewarding people for the volume ... rather than the quality.
I suspect this is a major part of the appeal of LLMs themselves. They produce lines very fast, so it appears as if work is being done fast. But that's very hard to judge, because line count is essentially zero signal about code quality, and so is commit count. It's already a bit insane that we use lines and commits as measures in the first place; they're trivial to hack. You even end up rewarding that annoying dude who keeps changing the file so the diff is the entire file and not the 3 lines they actually edited...
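To make the line-count point concrete, here is a small illustrative sketch (the file contents are made up): the same 3-line edit measured honestly, versus the whole-file-rewrite trick, the way a naive diff-based metric would count them.

```python
import difflib

# Hypothetical file: 100 lines, of which only three actually change.
before = [f"line {i}\n" for i in range(100)]
after = list(before)
for i in (10, 50, 90):
    after[i] = f"line {i} (edited)\n"

def changed_lines(a, b):
    """Count +/- lines in a unified diff, excluding file headers."""
    return sum(1 for d in difflib.unified_diff(a, b, n=0)
               if d.startswith(("+", "-"))
               and not d.startswith(("+++", "---")))

# An honest diff touches 3 lines (3 removed + 3 added)...
honest = changed_lines(before, after)

# ...but also rewriting every line (e.g. sneaking in trailing spaces)
# makes the whole file show up as changed, inflating the metric.
rewritten = [line.rstrip("\n") + "  \n" for line in after]
inflated = changed_lines(before, rewritten)

print(honest, inflated)  # 6 vs 200: same semantic change, ~33x the "work"
```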
I've been thinking we're living in "Goodhart's Hell". Where metric hacking has become the intent. That we've decided metrics are all that matter and are perfectly aligned with our goals.
But hey, who am I to critique. I'm just a math nerd. I don't run a multi trillion dollar business that lays off tons of workers because the current ones are so productive due to AI that they created one of the largest outages in history of their platform (and you don't even know which of the two I'm referencing!). Maybe when I run a multi trillion dollar business I'll have the right to an opinion about data.
I think many with this opinion actually misunderstand. Slop will not save your scientific career. Really, it is not about papers but about securing grant funding by writing compelling proposals, and delivering on the research outlined in those proposals.
So what they no longer accept is preprints (or rejects…). It's of course a pretty big deal, given that arXiv is all about preprints. And an accepted journal paper presumably cannot be submitted to arXiv anyway, unless it's an open-access journal.
This applies to position papers (opinion pieces) and review papers (summaries of the state of the art, often laden with opinions on categories and future directions). LLMs are happy to generate both, because they require zero technical contributions: no working code, no validated results, etc.
> Technically, no! If you take a look at arXiv’s policies for specific content types you’ll notice that review articles and position papers are not (and have never been) listed as part of the accepted content types.
People have started to use arXiv as a resume-driven blog with white-paper decorations, and others have started citing these in research papers. Maybe this is a good change.
So we need to create a new website that actually accepts preprints, like arXiv's original goal from 30 years ago.
I think every project more or less deviates from its original goal given enough time. There are few exceptions in CS, like GNU coreutils: cd, ls, pwd, ... they do one thing and do it well, and will very likely keep doing so for another 50 years.
Maybe it's time for a reputation system. E.g. every author publishes a public PGP key along with their work. Not sure about the details but this is about CS, so I'm sure they will figure something out.
I had been kinda hoping for a web-of-trust system to replace peer review. Anyone can endorse an article. You can decide which endorsers you trust, and do some network math to find what you think is worth reading. With hashes and signatures and all that rot.
Not as gate-keepy as journals and not as anarchic as purely open publishing. Should be cheap, too.
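A minimal sketch of how such a web of trust might score endorsers, assuming simple decay-based propagation. The names, graph, and decay factor are all invented for illustration; real systems (PGP's web of trust, PageRank-style ranking) are far more involved.

```python
# Endorsement graph: who vouches for whom (all names hypothetical).
endorsements = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": [],
    "spammer": ["spammer2"],   # a mutual-endorsement clique
    "spammer2": ["spammer"],
}

def trust_scores(seeds, decay=0.5, rounds=3):
    """Propagate trust outward from the reader's hand-picked seeds."""
    score = {name: 0.0 for name in endorsements}
    for s in seeds:
        score[s] = 1.0
    for _ in range(rounds):
        nxt = dict(score)
        for endorser, endorsed in endorsements.items():
            for e in endorsed:
                # trust flows along endorsement edges, attenuated by decay
                nxt[e] = max(nxt[e], score[endorser] * decay)
        score = nxt
    return score

scores = trust_scores(seeds=["alice"])
# carol earns trust via alice; the spammer clique endorses itself
# furiously but gets zero, because no trusted seed points into it.
print(scores)
```

The nice property is the one noted above: it is cheap, and each reader picks their own seeds rather than relying on a central gatekeeper.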
Maybe arXiv could keep the free preprints but offer a service on top. Humans, experts in the field, would review submissions; arXiv would curate and publish the high-quality ones, and offer access to these via a subscription or a per-paper fee...
It's clearly not sustainable to have the main website hosting CS articles operate without any reviews or restrictions (except for the initial invite system).
There were 26k submissions in October: https://arxiv.org/stats/monthly_submissions
Asking for a small amount of money would probably help.
The issue with requiring peer-reviewed journals or conferences is the severe lag: it takes a long time, and part of the advantage of arXiv was that you could have the paper instantly as a preprint.
Also, these conferences and journals are themselves receiving enormous quantities of submissions (29,000 for AAAI), so we are just pushing the problem around.
A small payment is probably better than what they are doing. But we must eventually solve the LLM issue, probably by punishing the people who use them instead of the entire public.
I'll add the amount should be enough to cover at least a cursory review. A full review would be better. I just don't want to price out small players.
The papers could also be categorized as unreviewed, quick check, fully reviewed, or fully reproduced. They could pay for this to be done or verified. Then, we have a reputational problem to deal with on the reviewer side.
I like this idea. A small contribution would be a good filter. Looking at the stats, it's quite crazy. Didn't know we could access this data. Thanks for sharing.
> Before being considered for submission to arXiv’s CS category, review articles and position papers must now be accepted at a journal or a conference and complete successful peer review.
Edit: original title was "arXiv No Longer Accepts Computer Science Position or Review Papers Due to LLMs"
Isn't arXiv where you upload things before they have gone through the entire process? Isn't that the entire value, aside from some publisher cartel busting?
Agree. Additionally, original title, "arXiv No Longer Accepts Computer Science Position or Review Papers Due to LLMs" is ambiguous. “Due to LLMs” is being interpreted as articles written by LLMs, which is not accurate.
I don’t know about this. From a pure entertainment standpoint, we may be denying ourselves a world of hilarity. LLMs + “You know Peter, I’m something of a researcher myself” delusions. I’d pay for this so long as people are very serious about the delusion.
I would like to understand what people get, or think they get, out of putting a completely AI-generated survey paper on arXiv.
Even if AI writes the paper for you, it's still kind of a pain in the ass to go through the submission process, get the LaTeX to compile on their servers, etc., there is a small cost to you. Why do this?
Gaming the h-index has been a thing for a long time in circles where people take note of such things. There are academics who attach their name to every paper that goes through their department (even if they contributed nothing), there are those who employ a mountain of grad students to speed run publishing junk papers... and now with LLMs, one can do it even faster!
Published papers are part of the EB-1 visa rubric, so there is huge value in getting your content into these indexes:
"One specific criterion is the ‘authorship of scholarly articles in professional or major trade publications or other major media’. The quality and reputation of the publication outlet (e.g., impact factor of a journal, editorial review process) are important factors in the evaluation”
Great move by arXiv—clear standards for reviews and position papers are crucial in fast-moving areas like multi-agent systems and agentic LLMs. Requiring machine-readable metadata (type=review/position, inclusion criteria, benchmark coverage, code/data links) and consistent cross-listing (cs.AI/cs.MA) would help readers and tools filter claims, especially in distributed/parallel agentic AI where evaluation is fragile. A standardized “Survey”/“Position” tag plus a brief reproducibility checklist would set expectations without stifling early ideas.
I'm not sure this is the right way to handle it (I don't know what is), but arXiv.org has suffered from poor-quality self-promotion papers in CS for a long time now. Years before LLMs.
How precisely does it "suffer" though? It's basically a way to disseminate results but carries no journalistic prestige in itself. It's a fun place to look now and then for new results, but just reading the "front page" of a category has always been a Caveat Emptor situation.
The review paper is dead... so this is a good development. You can generate these things in a couple of iterations with AI and minor edits. Preprint servers could be dealing with thousands of review/position papers over short periods of time, and then this wastes precious screening work hours.
It is a bit different in other fields, where interpretations or know-how might be communicated in a review paper format that is otherwise not possible. For example, in biology, relating to a new phenomenon or function.
What are review papers for anyway? I think they are either for
1) new grad students to end up with something nice to publish after reviewing the literature or,
2) older professors to write a big overview of everything that happened in their field as sort of a “bible” that can get you up to speed
The former is useful as a social construct; I mean, hey, new grad students, don’t skimp on your literature review. Finding out a couple years in that folks had already done something sorta similar to my work was absolutely gut-wrenching.
For the latter, I don’t think LLMs are quite ready to replace the personal experiences of a late-career professor, right?
A good review paper is infinitely better than an LLM managing to find a few papers and making a summary.
A knowledgeable researcher knows which papers are outdated and can make a trustworthy review paper; an LLM can't easily do that yet.
Review papers are summaries of recent updates in the field that deserve fellow researchers' attention. Such work should be done annually, or at most quarterly in my opinion, to include only time-tested results.
If hundreds of review papers are published every month, I am afraid that their quality in terms of paper selection and innovative interpretation/direction will not be much higher than the content generated by LLM, even if written word-to-word by a real scientist.
LLMs are good at plainly summarizing from the public knowledge base. Scientists should invest their time in contributing new knowledge to public base instead of doing the summarization.
I have a hunch that most of the slop is not just in CS but specifically about AI. For some reason, a lot of people's first idea when they encounter an LLM is "let's have this LLM write an opinion piece about LLMs", as if they want to test its self-awareness or hack it by self-recursion. And then they get a medley of the training data, which, if they are lucky, contains some technical explanations sprinkled in.
That said, AI-generated papers have already been spotted in other disciplines besides cs, and some of them are really obvious (arXiv:2508.11634v1 starts with a review of a non-existing paper). I really hope arXiv won't react by narrowing its scope to "novel research only"; in fact there is already AI slop in that category and it is harder to spot for a moderator.
("Peer-reviewed papers only" is mostly equivalent to "go away". Authors post on the arXiv in order to get early feedback, not just to have their paper openly accessible. And most journals at least formally discourage authors from posting their papers on the arXiv.)
If you read through the papers, you'll realize the actual problem is blatant abuse and reputation hacking.
So many "research papers" by "AI companies" that are blog posts or marketing dressed up as research. They contribute nothing and exist so the dudes running the company can point to all their "published research".
In my experience, arXiv is not a preprint platform. It's a strange gatekeeper of science and should be avoided altogether. They have their favorites which they deem as "high quality" and everything else gets rejected. I am eagerly awaiting for people to dismiss arXiv altogether.
Why not just reject papers authored by LLMs and ban accounts that are caught? arXiv’s management has become really questionable lately, it’s like they’re trying to become a prestigious journal and are becoming the problem they were trying to solve in the first place
What matters is the quality. Requiring reviews and opinions to be peer-reviewed seems a lot less superficial than rejecting LLM-assisted papers (which can be valid). This seems like a reasonable filter for papers with no first-party contributions. I'm sure they ran actual numbers as well.
It’s articles (not papers) _about_ LLMs that are the problem, not papers written _by_ LLMs (although I imagine they are not mutually exclusive). Title is ambiguous.
Verification via LLM tends to break under quite small optimization pressure. For example, I did RL to improve <insert aspect> against one of the SOTA models from one generation ago, and the (quite weak) learner model found out that it could emit a few nonsense words to get the max score.
That's without even being able to backprop through the annotator, and also with me actively trying to avoid reward hacking. If arxiv used an open model for review, it would be trivial for people to insert a few grammatical mistakes which cause them to receive max points.
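The reward-hacking failure described above can be shown with a toy: a fixed, public "judge" function standing in for an open review model. The scoring rules here are invented purely for illustration; the point is that any fixed, publicly known scorer is trivially satisfiable without real content.

```python
import random

# Toy "reviewer": naively rewards length and penalizes a few known
# "slop" words. Everything here is an illustrative assumption, not
# how any real review model works.
SLOP_WORDS = {"delve", "tapestry", "furthermore"}

def judge(text):
    words = text.split()
    score = min(len(words), 50)                    # rewards sheer length
    score -= 10 * sum(w in SLOP_WORDS for w in words)
    return score

# A "paper" any human reviewer would reject on sight...
nonsense = " ".join(random.choice(["foo", "bar", "baz"]) for _ in range(50))

# ...maxes out the score, while an honestly written sentence that
# happens to trip the public penalty list scores far worse.
honest_survey = "We delve into the tapestry of recent transformer results ..."
print(judge(nonsense), ">", judge(honest_survey))
```

Once the scorer is known (or can be queried), the optimizer finds the degenerate maximum, which is exactly what the RL learner above did.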
> The advent of large language models have made this type of content relatively easy to churn out on demand, and the majority of the review articles we receive are little more than annotated bibliographies, with no substantial discussion of open research issues.
I have to agree with their justification. Since "Attention Is All You Need" (2017) I have seen maybe four papers with similar impact in the AI/ML space. The signal to noise ratio is really awful. If I had to pick a semi-related paper published since 2020 that I actually found interesting, it would have to be this one: https://arxiv.org/abs/2406.19108 I cannot think of a close second right now.
All of the machine learning papers are pure slop to me now. The last one I looked at had an abstract that was so long it put me to sleep. Many of these papers aren't attempting basic decorum anymore. Mandatory peer review would fix a lot of this. I don't think it is acceptable for the staff at arXiv to have to endure a Sisyphean mountain of LLM shit. They definitely need to push back.
Isn’t the signal to noise problem what journals are supposed to be for? I thought arxiv was supposed to just be a record keeper, to make it easy to share papers and preprints.
You picked the arguably most impactful AI/ML paper of the century so far, no wonder you don't find others with similar impact.
Not every paper can be a world-changing breakthrough. Which doesn't mean that more modest papers are noise (although some definitely are). What Kuhn calls "normal science" is also needed for science to work.
This is only for review/position papers, though I agree that pretty much all ML papers for the past 20 years have been slop. I also consider the big names like "Adam", "Attention", or "Diffusion" slop, because even though they are powerful and useful, the presentation is so horrible (for the first two) or they contain major mistakes in the justification of why they work (the last two) that they should never have gotten past review without major rewrites.
A better policy might be for arXiv to do the following:
1. Require LLM produced papers to be attributed to the relevant LLM and not the person who wrote the prompt.
2. Treat submissions that misrepresent authorship as plagiarism. Remove the article, but leave an entry for it so that there is a clear indication that the author engaged in an act of plagiarism.
Review papers are valuable. Writing one is a great way to gain, or deepen, mastery over a field. It forces you to branch out and fully assimilate papers that you may have only skimmed, and then place them in their proper context. Reading quality review papers is also valuable. They're a great way for people new to a field to get up to speed and they can bring things that were missed to the fore, even for veterans of the field.
While the current generation of AI does a poor job of judging significance and highlighting what is actually important, they could improve in the future. However, there's no need for arXiv to accept hundreds of review papers written by the same model on the same field, and readers certainly don't want to sift through them all.
Clearly marking AI submissions and removing credit from the prompters would adequately future-proof things for when, and if, AI can produce high quality review papers. Clearly marking authors who engage in plagiarism as plagiarists will, hopefully, remove most of the motivation to spam arXiv with AI slop that is misrepresented as the work of humans.
My only concern would be for the cost to arXiv of dealing with the inevitable lawsuits. The policy arXiv has chosen is worse for science, but is less likely to get them sued by butt-hurt plagiarists or the very occasional false positive.
That doesn't solve the problem they're trying to solve, which is that their all-volunteer staff is being flooded with LLM slop and doesn't have the time to adequately moderate.
If you want to blame someone, blame all the people LARPing as AI researchers.
I've seen quite a few preprints posted on HN with clearly fantastical claims that only seem to reinforce or ride the coattails of the current hype cycle. It's no longer research, it's becoming "top of funnel thought leadership".
arXiv was built on a good-faith assumption: a long paper meant the author had at least put some effort behind it, and every idea deserved attention. AI-generated text breaks that assumption, and anybody uploading it is not acting in good faith.
And it's an unequal arms race, in which generating endless slop is way cheaper than storing it, because slop generators are subsidised (by operating at a loss) while arXiv has to pay the full price for its hosting.
A very weird move. They are now taking a stance on what science is supposed to be.
As someone commented, due to the increasing volume, we would actually need and benefit from more reviews -- preferably on a fixed cycle, and I do not mean LLM slop but systematic literature reviews (SLRs). And contrary to someone's post, it is actually nice to read things from industry, and I would want more of that.
And not only are they taking a stance on science but they have also this allegation:
"Please note: the review conducted at conference workshops generally does not meet the same standard of rigor of traditional peer review and is not enough to have your review article or position paper accepted to arXiv."
In fact -- and this is supposedly related to the peer-review crisis -- the situation is exactly the opposite. That is, reviews today are usually of much higher quality at specialized workshops organized by experts in a particular, often niche, area.
Maybe arXiv people should visit PubPeer once in a while to see what kind of fraud is going on with conferences (i.e., not workshops and usually not review papers) and their proceedings published by all notable CS publishers? The same goes for journals.
This should honestly have been implemented a long time ago. Much of academia is pressured to churn out papers month after month as academia is prioritizing volume over quality or impact.
It doesn't apply to CS papers in general - only to opinion pieces and surveys of existing papers, i.e. it only bans preprints for papers that contribute nothing new.
I had a convo with a senior CS prof at Stanford two years ago. He was excited about LLM use in paper writing to, e.g., "lower barriers" to idk, "historically marginalized groups" and to "help non-native English speakers produce coherent text". Etc, etc - all the normal tech folk gobbledygook, which tends to forecast great advantage with minimal cost...and then turn out to be wildly wrong.
There are far more ways to produce expensive noise with LLMs than signal. Most non-psychopathic humans tend to want to produce veridical statements. (Except salespeople, who have basically undergone forced sociopathy training.) At the point where a human has learned to produce coherent language, he's also learned lots of important things about the world. At the point where a human has learned academic jargon and mathematical nomenclature, she has likely also learned a substantial amount of math. Few people want to learn the syntax of a language with little underlying understanding. Alas, this is not the case with statistical models of papers!
I wonder why they can't facilitate LLMs in the review process (like fighting fire with fire). Are even the best models not capable enough, or are the costs too high?
The problem is generally the same as with generative adversarial networks: the capability to meaningfully detect some set of hallmarks of LLMs automatically is equivalent to the capability to avoid producing those hallmarks, and LLMs are trained to predict (i.e. be indistinguishable from) their source corpus of human-written text.
So the LLM detection problem is (theoretically) impossible for SOTA LLMs; in practice, it could be easier due to the RLHF stage inserting idiosyncrasies.
Curious for the state on things here. Can we reliably tell if a text was LLM generated? I just heard of a prof screening assignments for this, but not sure how that would work.
I always figured if I wrote a paper, the peer review would be public scrutiny. As in, it would have revolutionary (as opposed to evolutionary) innovations that disrupt the status quo. I don't see how blocking that kind of paper from arXiv helps hacker culture in any way, so I oppose their decision.
They should solve the real problem of obtaining more funding and volunteers so that they can take on the increased volume of submissions. Especially now that AI's here and we can all be 3 times as productive for the same effort.
It’s weird to say that you can be three times more efficient at taking down AI slop now that AI is here, given that the problem is exacerbated by AI in the first place. At least without AI authors were forced to actually write the slop themselves…
This does not seem like a win even if your “fight AI with AI plan works.”
noobermin|4 months ago
Blame people, bad actors, systems of incentives, the gods, the devils, but never broach the fault of LLMs and their widespread abuse.
_jsmh|4 months ago
What would an online world optimized for humans, not algorithms, look like?
Should content creators get paid?
epolanski|4 months ago
Sure, publishing important papers has its weight, but not as much as getting cited.
kergonath|4 months ago
You cannot upload the journal’s version, but you can upload the text as accepted (so, the same content minus the formatting).
JadeNB|4 months ago
Why not? I don't know about in CS, but, in math, it's increasingly common for authors to have the option to retain the copyright to their work.
uniqueuid|4 months ago
Her suggestion was simple: kick out all non-Ivy-League and most international researchers. Then you have a working reputation system.
Make of that what you will...
dimava|4 months ago
ArXiv CS requires peer review for surveys amid flood of AI-written ones
- nothing happened to preprints
- "summarization" articles always required it; they are just now saying it out loud
exasperaited|4 months ago
These things will ruin everything good, and that is before we even start talking about audio or video.
JumpCrisscross|4 months ago
The problem is you can’t. Not without careful review of the output. (Certainly not if you’re writing about anything remotely novel and thus useful.)
But not everyone knows that, which turns private ignorance into a public review problem.
ninetyninenine|4 months ago
I don’t understand why they restricted one category when the problem spans multiple categories.
tarruda|4 months ago
Are you saying that there's an automated method for reliably verifying that something was created by an LLM?
Quizzical4230|4 months ago
PaperMatch [1] helps solve this problem (large influx of papers) by running a semantic search on top of abstracts, for all of arXiv.
[1]: https://papermatch.me/
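For a sense of what abstract-level search involves, here is a rough sketch. PaperMatch presumably uses learned embeddings; plain bag-of-words cosine similarity stands in here, and the paper IDs and abstracts are made up.

```python
import math
from collections import Counter

# Hypothetical corpus: arXiv-style IDs mapped to abstract text.
abstracts = {
    "paper-a": "attention mechanism for transformer sequence models",
    "paper-b": "graph algorithms for shortest path computation",
    "paper-c": "survey of reinforcement learning reward hacking",
}

def vec(text):
    """Bag-of-words term counts (a stand-in for a real embedding)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query, k=2):
    """Rank all abstracts by similarity to the query, return top k."""
    q = vec(query)
    return sorted(abstracts,
                  key=lambda pid: cosine(q, vec(abstracts[pid])),
                  reverse=True)[:k]

print(search("transformer attention models"))
```

Swapping `vec` for a neural sentence embedding plus an approximate-nearest-neighbor index is what makes this scale to all of arXiv.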
GMoromisato|4 months ago
If so, I think the solution is obvious.
(But I remind myself that all complex problems have a simple solution that is wrong.)
HL33tibCe7|4 months ago
Doubt
LLMs are experts in generating junk. And generally terrible at anything novel. Classifying novel vs junk is a much harder problem.
bob1029|4 months ago
I have to agree with their justification. Since "Attention Is All You Need" (2017) I have seen maybe four papers with similar impact in the AI/ML space. The signal to noise ratio is really awful. If I had to pick a semi-related paper published since 2020 that I actually found interesting, it would have to be this one: https://arxiv.org/abs/2406.19108 (I cannot think of a close second right now.)
All of the machine learning papers are pure slop to me now. The last one I looked at had an abstract that was so long it put me to sleep. Many of these papers aren't attempting basic decorum anymore. Mandatory peer review would fix a lot of this. I don't think it is acceptable for the staff at arXiv to have to endure a Sisyphean mountain of LLM shit. They definitely need to push back.
Al-Khwarizmi|4 months ago
Not every paper can be a world-changing breakthrough. Which doesn't mean that more modest papers are noise (although some definitely are). What Kuhn calls "normal science" is also needed for science to work.
beloch|4 months ago
1. Require LLM-produced papers to be attributed to the relevant LLM and not the person who wrote the prompt.
2. Treat submissions that misrepresent authorship as plagiarism. Remove the article, but leave an entry for it so that there is a clear indication that the author engaged in an act of plagiarism.
Review papers are valuable. Writing one is a great way to gain, or deepen, mastery over a field. It forces you to branch out and fully assimilate papers that you may have only skimmed, and then place them in their proper context. Reading quality review papers is also valuable. They're a great way for people new to a field to get up to speed and they can bring things that were missed to the fore, even for veterans of the field.
While the current generation of AI does a poor job of judging significance and highlighting what is actually important, they could improve in the future. However, there's no need for arXiv to accept hundreds of review papers written by the same model on the same field, and readers certainly don't want to sift through them all.
Clearly marking AI submissions and removing credit from the prompters would adequately future-proof things for when, and if, AI can produce high quality review papers. Clearly marking authors who engage in plagiarism as plagiarists will, hopefully, remove most of the motivation to spam arXiv with AI slop that is misrepresented as the work of humans.
My only concern would be for the cost to arXiv of dealing with the inevitable lawsuits. The policy arXiv has chosen is worse for science, but is less likely to get them sued by butt-hurt plagiarists or the very occasional false positive.
habinero|4 months ago
If you want to blame someone, blame all the people LARPing as AI researchers.
iberator|4 months ago
Let's say a 50000€ fine, or 1 year in prison. :)
anthk|4 months ago
https://pubmed.ncbi.nlm.nih.gov/18955255/
https://pubmed.ncbi.nlm.nih.gov/16136218/
Maken|4 months ago
And it's an unequal arms race, in which generating endless slop is way cheaper than storing it, because slop generators are subsidised (by operating at a loss) while arXiv has to pay the full price for their hosting.
jruohonen|4 months ago
As someone commented, due to the increasing volume, we would actually need and benefit from more reviews -- with a fixed cycle preferably, and I do not mean LLM slop but SLRs. And contrary to someone's post, it is actually nice to read things from the industry, and I would actually want more of that.
And not only are they taking a stance on science but they have also this allegation:
"Please note: the review conducted at conference workshops generally does not meet the same standard of rigor of traditional peer review and is not enough to have your review article or position paper accepted to arXiv."
In fact -- and supposedly related to the peer review crisis -- the situation is exactly the opposite. That is, reviews today are usually of much higher quality at specialized workshops organized by experts in a particular, often niche area.
Maybe arXiv people should visit PubPeer once in a while to see what kind of fraud is going on with conferences (i.e., not workshops and usually not review papers) and their proceedings published by all notable CS publishers? The same goes for journals.
whatever1|4 months ago
Sorry folks but we lost.
jsrozner|4 months ago
There are far more ways to produce expensive noise with LLMs than signal. Most non-psychopathic humans tend to want to produce veridical statements. (Except salespeople, who have basically undergone forced sociopathy training.) At the point where a human has learned to produce coherent language, he's also learned lots of important things about the world. At the point where a human has learned academic jargon and mathematical nomenclature, she has likely also learned a substantial amount of math. Few people want to learn the syntax of a language with little underlying understanding. Alas, this is not the case with statistical models of papers!
_jsmh|4 months ago
How will journals or conferences handle AI slop?
DroneBetter|4 months ago
so the LLM detection problem is (theoretically) impossible for SOTA LLMs; in practice, it could be easier due to the RLHF stage inserting idiosyncrasies.
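Those RLHF idiosyncrasies are exactly what naive detectors key on in practice: certain stock phrases are heavily over-represented in RLHF-tuned output. A toy sketch (the phrase list and threshold are illustrative assumptions, nowhere near a reliable detector):

```python
import re

# Stock phrases often over-represented in RLHF-tuned model output.
# List and threshold are assumptions for demonstration only.
TELLTALES = ["delve", "tapestry", "in conclusion", "it is important to note"]

def idiosyncrasy_score(text: str) -> int:
    # Count occurrences of each telltale phrase, case-insensitively.
    low = text.lower()
    return sum(len(re.findall(re.escape(p), low)) for p in TELLTALES)

def looks_generated(text: str, threshold: int = 2) -> bool:
    # Crude heuristic: flag text containing several telltale phrases.
    return idiosyncrasy_score(text) >= threshold

human = "We measured latency under load and report the raw numbers."
slop = ("In conclusion, it is important to note that we delve "
        "into a rich tapestry of results.")
```

Of course, such surface heuristics are trivially defeated by prompting the model to avoid the flagged phrases, which is the arms-race point made above.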
zackmorris|4 months ago
They should solve the real problem of obtaining more funding and volunteers so that they can take on the increased volume of submissions. Especially now that AI's here and we can all be 3 times as productive for the same effort.
raddan|4 months ago
This does not seem like a win even if your “fight AI with AI” plan works.