top | item 34599106

New AI classifier for indicating AI-written text

403 points | davidbarker | 3 years ago | openai.com

337 comments

[+] ilaksh|3 years ago|reply
It can't possibly work reliably. It's going to be very challenging for honest kids because almost everyone is going to be cheating.

The reality is that learning to think and write will be harder because of the ubiquity of text generation AI. This may be the last generation of kids where most are good at doing it on their own.

On the other hand, at least a few will be able to use this as an instant feedback mechanism or personal tutor, so the potential for some carefully supervised students to learn faster is there.

And it should increase the quality of writing overall if people start taking advantage of these tools. It's going to fairly quickly become somewhat like using a calculator.

Actually it probably means that informal text will really stand out more.

I am giving it the ability to do simple tasks given commands like !!create filename file content etc.

It's actually now very important for kids to adapt quickly and learn how to take advantage of these tools if they are going to be able to find jobs or just adapt in general even if they don't have jobs. It actually is starting to look like everyone is either an entrepreneur or just unemployed.

Learning about all the ways to use these tools and the ones coming up in the next few years could be quite critical for children's education.

There are always going to be luddites of course. But it's looking like ChatGPT etc. are going to be the least of our problems. It is not hard to imagine that within twenty years or so, anyone without a high bandwidth connection to an advanced AI will be essentially irrelevant because their effective IQ will be less than half of those who are plugged in.

[+] o_____________o|3 years ago|reply
Schools are going to have to flip the format: watch lectures at home, do homework in class.
[+] ElFitz|3 years ago|reply
> The reality is that learning to think and write will be harder because of the ubiquity of text generation AI. This may be the last generation of kids where most are good at doing it on their own.

I think the form will change, but the substance is going to be like it always has.

Those who give a damn, want to and are able to improve will be able to do it 100x by getting access to a new source of diversified ideas and view points, variations on their own, and information. And efficiently and more or less reliably delegating low-value stuff at a low marginal cost, freeing up bandwidth & increasing impact.

And the grifters who're always looking for shortcuts and constantly try to game whatever system they are in, without any desire whatsoever to learn anything or grow in any way… will keep doing just that. And the only value they'll be able to bring will be… access to a tool anyone (but, yes, not everyone) can access.

And that matters, because in the end many of the problems worth solving aren’t technical problems, but people problems.

Writing this, I realise I may have completely missed your point and gone way off topic here.

[+] logifail|3 years ago|reply
> it should increase the quality of writing overall if people start taking advantage of these tools

Perversely, it might also dramatically decrease reading, if there's no incentive for anyone to need to properly understand anything.

A pretty dire scenario :(

[+] Gabriel_Martin|3 years ago|reply
Sorry, it is unclear if this comment is AI generated or not, I can't give you full credit for it.

(No seriously. "The classifier considers the text to be unclear if it is AI-generated.")

[+] kobalsky|3 years ago|reply
> The reality is that learning to think and write will be harder because of the ubiquity of text generation AI

is this just a baseless assertion or do you have something to back it up?

what I'm seeing is that a lot of kids will have a 1:1 coach for a lot of topics.

I know now it's not perfect, but this thing may be a few years away from having a likeable personality and fact checking what it says.

[+] sdenton4|3 years ago|reply
/This may be the last generation of kids where most are good at doing it on their own./

Ironically, students almost all suck at writing...

[+] anothernewdude|3 years ago|reply
It's going to falsely label any recent innovation in language as generated, because the training set will skew human-written as old and GPT generated as new language.
[+] jb_s|3 years ago|reply
I've defeated it already using basic prompt engineering.
[+] GOONIMMUNE|3 years ago|reply
This seems like a sort of unwinnable arms race. Can't the people who work on generative text models use this classifier as a feedback mechanism so that their output doesn't flag it? I'm not an AI expert, but I believe this is even the core mechanism behind Generative Adversarial Networks.
[+] londons_explore|3 years ago|reply
Detectors can be a black box "pay $5 per detection" type service.

That way, you can't fire thousands of texts at it to retrain your generative net.

Plagiarism detectors in schools and universities work the same. In fact, some plagiarism detection companies now offer the same software to students to allow them to pay some money to pre-scan their classwork to see if it will be detected...

[+] mritchie712|3 years ago|reply
There's also always going to be more capital going towards building better generators than better detectors.
[+] PartiallyTyped|3 years ago|reply
Language Models produce a high probability sequence of words given history (or an approximation of it). This is the only paradigm that we know works for language synthesis.

What the creators of this page did is turn that on its head, using exactly that reasoning to identify candidate passages as computer-generated, precisely because they have access to those probabilities. So it's not a viable approach to improving the language model directly.

With ChatGPT, however, we have two models working: a language model and a ranking model. The ranking model is trained to order the results of the language model so they look better to humans. The suggested approach could be used to help fit the model by ranking lower-probability sequences higher, but this comes at the cost of increased computation time (generating many more sequences) and risks constructing incoherent output.
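The probability-based detection idea above can be sketched minimally. This assumes a scoring model that returns per-token log-probabilities; the numbers and the threshold below are made up for illustration:

```python
def avg_logprob(token_logprobs):
    """Mean per-token log-probability; higher means the text is
    'less surprising' to the scoring model."""
    return sum(token_logprobs) / len(token_logprobs)

def looks_ai_generated(token_logprobs, threshold=-2.0):
    """Flag text whose tokens the model finds consistently probable.
    The threshold is arbitrary here; a real detector would calibrate
    it on labeled human/AI corpora."""
    return avg_logprob(token_logprobs) > threshold

# Hypothetical per-token logprobs from a scoring model:
human_like = [-4.1, -0.5, -6.2, -1.8, -3.9]   # spiky, surprising
ai_like    = [-0.9, -1.1, -0.7, -1.3, -1.0]   # uniformly probable
```

The human sample averages -3.3 and passes; the machine-like sample averages -1.0 and gets flagged. This is why "high-probability sequence" is itself the detection signal.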

[+] black_puppydog|3 years ago|reply
Yup, an arms race indeed. With the companies involved selling to both sides, as in any good conflict... :|
[+] hfbff|3 years ago|reply
You're right, that's the core mechanism of GANs. The current state-of-the-art models aren't using a GAN structure, but it's plausible that GANs achieve state-of-the-art numbers in the future.
[+] michaericalribo|3 years ago|reply
I foresee a dystopian education outcome:

1. Classifiers like this are used to flag possible AI-generated text

2. Non-technical users (teachers) treat this like a 100% certainty

3. Students pay the price.

Especially with a true positive rate of only 26% and a false positive rate of 9%, this seems next to useless.

[+] gillesjacobs|3 years ago|reply
I found a great way to fool these detectors: piping output through multiple generative models.

1. Generate text by prompting ChatGPT.

2. Rewrite / copyedit with Wordtune [1], InstaText [2] or Jasper [3].

This fools GPTZero [4] consistently.

Of course, these emotive, genre, or communication-style specialisations will soon be promptable by a single model too. Detectors will be integrated as adversarial agents in training. There is no stopping generative text tooling; better to adopt it and integrate it fully into education and work.

1. https://www.wordtune.com/

2. https://instatext.io/

3. https://www.jasper.ai/

4. https://gptzero.me/

[+] bioemerl|3 years ago|reply
Now they get to monetize ChatGPT and this new classifier. Starting fires and providing the extinguishers, charging for both of them.

All while pretending to be morally responsible in order to do it.

[+] dakiol|3 years ago|reply
No way. If I were a student trying to use ChatGPT in order to improve my writing, I would definitely not pay for it if I know my teachers are using their AI Classifier. I mean, what's the point? I don't think OpenAI will be able to reach that (big) chunk of potential customers that want to use ChatGPT to write essays, social media comments, etc. if OpenAI at the same time sells their classifier. It's just nuts.
[+] dxbydt|3 years ago|reply
There was a merchant who said - Buy my sword! It will pierce through any shield !!

So the gullible people bought the swords and soon the merchant ran out of swords to sell.

So the merchant said - Buy my shield! They can defend against any sword !!

Once again the gullible people rushed to buy the shields.

But one curious onlooker asked - what happens when your sword meets your shield?

[+] earthboundkid|3 years ago|reply
A compound noun meaning sword-shield, 矛盾 mujun, is the word for contradiction in Japanese, based on this Chinese folktale.
[+] astrange|3 years ago|reply
ChatGPT doesn't make any promises to beat AI text classifiers. If you asked it to it'd probably tell you that's unethical.
[+] e1ghtSpace|3 years ago|reply
It depends who's holding them, I guess.
[+] kypro|3 years ago|reply
The existence of this tool might actually do more damage if people are using it with any level of confidence to check content as important as exams. I understand why they felt the need to release something, but I think it would be better if this didn't exist.

My guess is that it's very easily gamed. Something ChatGPT is very good at is producing text in different styles, so if you're a student and you run your text through an AI detector, you can always ask ChatGPT to rewrite it in a style more likely to pass detection.

Finally, I wouldn't be surprised if this detector is mostly just detecting grammatical and spelling mistakes. It's obvious I'm a human given how awful I am at writing, but I wouldn't be surprised if a good writer who uses very good grammar, has good sentence structure, and whose writing looks a little too "perfect" might end up triggering the detector more often.

[+] minimaxir|3 years ago|reply
> Our classifier is not fully reliable. In our evaluations on a “challenge set” of English texts, our classifier correctly identifies 26% of AI-written text (true positives) as “likely AI-written,” while incorrectly labeling human-written text as AI-written 9% of the time (false positives).

That is an interesting mathematical description of "not fully reliable".

[+] rafaelero|3 years ago|reply
26% of true positive and 9% of false positive is just terrible. I don't see how this can be usable.
[+] robomc|3 years ago|reply
it can't be used usefully for anything. the only time it's better than flipping a coin is when there's known to be a majority of human texts in your corpus but even under those conditions it will fail to flag the majority of AI texts.

In a set of 100 texts, with 20 being AI, the most likely outcome would be 5 AI texts correctly flagged, along with 7 falsely accused human texts. For like, 22 incorrect answers.

For 100 texts where 90 are AI, it would be better to just flip a coin. A coin flip would give you around half correct, and this system would apparently give you around 68 wrong answers (three quarters of the 90 AI ones wrong, then one of the human ones wrong).
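The arithmetic above can be checked directly against OpenAI's stated rates (26% true positives, 9% false positives):

```python
def expected_outcomes(n_texts, ai_fraction, tpr=0.26, fpr=0.09):
    """Expected confusion-matrix counts given OpenAI's reported
    true-positive and false-positive rates."""
    n_ai = n_texts * ai_fraction
    n_human = n_texts - n_ai
    flagged_ai = n_ai * tpr               # AI texts correctly flagged
    missed_ai = n_ai - flagged_ai         # AI texts that slip through
    false_accusations = n_human * fpr     # humans wrongly flagged
    return flagged_ai, false_accusations, missed_ai + false_accusations

# 100 texts, 20 AI: ~5 flagged, ~7 falsely accused, ~22 wrong overall
# 100 texts, 90 AI: ~68 wrong, worse than flipping a coin
```

The two scenarios in the comment fall straight out of the formula, including the 22 and ~68 wrong-answer counts.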

[+] yboris|3 years ago|reply
Quote:

> In our evaluations on a “challenge set” of English texts

I wonder if they mean "challenge" in the sense that these are some of the hardest-to-discern passages. Meaning that with average human writing / average type of text, the % is better. I'm unsure.

[+] kriro|3 years ago|reply
I'd rather try to empower students to use ChatGPT as a tool or incorporate it into class work than worry about cheating. This is a pretty unique time for teachers to step up and give their students a nice edge in life by teaching them how to become early adopters for these kinds of things.
[+] discreteevent|3 years ago|reply
The purpose of writing an essay is to teach students how to think. Being able to prompt is a subset of being able to think. If you only teach them to prompt you have taken away any edge they might have had. It's like those schools that think getting more iPads will make the kids smarter.
[+] a257|3 years ago|reply
Perhaps when we have sufficiently capable OSS models, but as it stands GPT is a paid service and not a public good.
[+] bhouston|3 years ago|reply
I used ChatGPT to rewrite a number of paragraphs of my own writing earlier today. It rewrote them completely. I just pasted those into this detection tool, and for both it responded "The classifier considers the text to be unlikely AI-generated."

So it seems it cannot detect AI-rewritten/augmented text, even text that ChatGPT itself generates.

[+] mitchdoogle|3 years ago|reply
Well OpenAI admits it is wrong most of the time, so your results are consistent with what is expected
[+] lumost|3 years ago|reply
I don't see why teachers don't use this as an opportunity to accelerate curriculum. Every student now has a cheap personal instructor. Why not raise the bar on difficulty and quality expectations for assignments?
[+] dakiol|3 years ago|reply
Isn't this a poor business move from OpenAI? I mean, if they make it possible to distinguish (100% in the future) between AI-written text and human-written text... then a big chunk of OpenAI's potential customers will not use ChatGPT and similar tools because "they are gonna be caught" (e.g., students, writers, social media writers, etc.)
[+] ycombiredd|3 years ago|reply
My first reaction-thought to seeing this is: implement a GAN with GPT inference on the generator side and this classifier (or DetectGPT or GPTZero, or whatever) they’ve developed as the detector. I would think this would very quickly a) achieve state of the art whatever-the-fool-a-human-reader-test-measurement-is results and b) render the classifier, and any subsequent AI-text-detecting classifier, useless.

I could be way off base with that idea, but it seemed good enough to ponder, though not so great that I was motivated to do anything more than post the thought.

[+] cuteboy19|3 years ago|reply
I think the detection performance is bad enough that it might just degrade chat
[+] jerpint|3 years ago|reply
This wouldn’t work in practice because you don’t have access to GPT’s activations
[+] brink|3 years ago|reply
I miss the 90's and the early 00's. Take me away from this AI hell.
[+] shagie|3 years ago|reply
Musicians Wage War Against Evil Robots - https://www.smithsonianmag.com/history/musicians-wage-war-ag...

From the March, 1931 issue of Modern Mechanix magazine:

> The time is coming fast when the only living thing around a motion picture house will be the person who sells you your ticket. Everything else will be mechanical. Canned drama, canned music, canned vaudeville. We think the public will tire of mechanical music and will want the real thing. We are not against scientific development of any kind, but it must not come at the expense of art. We are not opposing industrial progress. We are not even opposing mechanical music except where it is used as a profiteering instrument for artistic debasement.

[+] sekai|3 years ago|reply
> Take me away from this AI hell

People used to say that about electricity too, and cars, and planes, and computers. This is just the next step in the chain.

[+] blueberrychpstx|3 years ago|reply
Doesn't this get us into a sort of perpetual motion machine with the back and forth being

1) Generate a paragraph of my essay

2) Feed it into this classifier

3a) If AI -> make it sound more human

3b) If human -> $$$ profit?

Obviously it could be more fine-tuned than this, and it's in general good to know, but I just love watching this game play out of... err, how do we manage the fact that humans are relatively less and less creative compared to their counterparts?
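The back-and-forth in the comment is essentially rejection sampling against the classifier. A sketch, where `generate` and `detect` are hypothetical stand-ins for a text model and a detector, not real APIs:

```python
def evade_detector(prompt, generate, detect, max_tries=5):
    """Regenerate until the detector stops flagging the output.
    `generate` produces text from a prompt; `detect` returns True
    when the text is flagged as AI-written."""
    for _ in range(max_tries):
        text = generate(prompt)
        if not detect(text):
            return text          # passed as "human"
    return None                  # every attempt was flagged
```

Note that each detector query leaks information to the person doing the evading, which is the londons_explore point about charging per detection upthread.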

[+] dakiol|3 years ago|reply
The thing is, point 1 costs money (I imagine at some point ChatGPT will cost money), but point 2 will also cost money. So OpenAI will charge you double for generating AI-written text that is undetectable. Poor move. I would happily pay a lot for ChatGPT, but if they also commercialize a (more accurate) classifier then I won't use ChatGPT at all.
[+] anshumankmr|3 years ago|reply
What I would love to see in GPT-3 is some sort of confidence score it could return: how sure the model is that what it returned is accurate and not gibberish. Could this classifier help with that? I am working on a requirement where we are using ElasticSearch to map a query to an article in a knowledge base; the plan is then to send the article to GPT-3 to summarize.

Since the ElasticSearch integration is still WIP, I made a POC that scrapes the knowledge base (with mixed results; a lot of the content is poorly organized, so the scraped content acting as the prompt for GPT-3 wasn't all that good either) and feeds it to GPT-3, but it couldn't always give the most accurate answers. The answers were sometimes spot on or quite good, but other times not so much; I'd say it made sense about 30% of the time. So I want a way to tell whether an answer is sensible, so we can return an error response when GPT-3's response doesn't make sense.

The reason we're doing it this way is that the client has a huge knowledge base, and mapping each question to an answer would be difficult for them.
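The confidence gate being asked for might look something like this. `search` and `summarize` are hypothetical stand-ins for the ElasticSearch lookup and the GPT-3 call; in practice a confidence number could be derived from the API's token log-probabilities:

```python
def answer_with_fallback(question, search, summarize, min_confidence=0.6):
    """Retrieve an article, summarize it, and refuse to answer when
    the model's confidence score is too low. The 0.6 cutoff is an
    arbitrary placeholder to be tuned on real traffic."""
    article = search(question)
    answer, confidence = summarize(article)
    if confidence < min_confidence:
        return "Sorry, no reliable answer was found."
    return answer
```

Gating on a score and returning an explicit error is usually safer in a knowledge-base setting than letting a low-confidence summary through.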

[+] siliconc0w|3 years ago|reply
Horrible idea. You can't eliminate the false positives, and they are going to hurt innocent students or be used to reinforce teacher biases.
[+] BulgarianIdiot|3 years ago|reply
I wrote some text about the subjectivity of communication and the nature of natural language, and I kept it very neutral, formal and verbose. And it said "this text is likely AI".

So, as was honestly predictable, people who rely on this tool being accurate will inflict a lot of pain on unsuspecting individuals who simply write like GPT writes.