top | item 33855416

Stack Overflow questions are being flooded with answers from ChatGPT

233 points | brindidrip | 3 years ago | reply

What are the repercussions of this?

323 comments

[+] brindidrip|3 years ago|reply
It seems like there are a few potential negative consequences of using AI-generated answers on Stack Overflow. For one, the quality of the answers may be lower than if they were written by a human. Additionally, if these AI-generated answers become too common, it could potentially lead to a more impersonal and less supportive community on Stack Overflow. Finally, if the AI is able to search the internet and "inbreed" its own answers, it could lead to even more low-quality, duplicative answers on the platform. Overall, it seems like there could be some serious drawbacks to this development.

Note: This answer was generated by ChatGPT after being fed this thread.

[+] theptip|3 years ago|reply
Honestly I think we are going to have to take an aggressive stance against mediocre generated content here on HN, lest this small island of community be lost like the rest of the Old Internet.
[+] plastiquebeech|3 years ago|reply
>It seems like there are a few potential negative consequences of using AI-generated answers on Stack Overflow. For one,

This is where my "Probably written by AI" filter tripped, and I skipped to the end.

I can't exactly say why. Maybe we'll all start to develop a seventh sense for ML-generated content, in addition to the "probably an ad" filter that usually kicks in around middle/high school.

[+] Areading314|3 years ago|reply
Here is a potential rephrased version of the original post, written in a witty and informal style:

"Yo, AI-generated answers on Stack Overflow sound like a real bummer. First of all, the quality might suck compared to human-written answers. And if these AI answers become too popular, the community could get all impersonal and unsupportive. Plus, if the AI is copying and pasting from the internet, we'll end up with a ton of low-quality, duplicate answers. In short, using AI on Stack Overflow could really suck."

[+] imhoguy|3 years ago|reply
The post title should be changed to "HN comments are being flooded with answers from ChatGPT", I have seen too many this weekend.
[+] arkitaip|3 years ago|reply
You can kinda tell it's ChatGPT by how verbose and repetitive the writing is.
[+] ryandvm|3 years ago|reply
This is the Great Filter we never expected.

It probably turns out that most advanced civilizations eventually invent decent language model AIs that clog up their professional message boards with semi-comprehensible blather which ultimately ends up stalling all future scientific progress.

It's like some weird self-inflicted version of the Three Body Problem.

[+] metadat|3 years ago|reply
What prompt input did you use? When I tried this:

> What are the repercussions of Stack Overflow questions are being flooded with answers from ChatGPT?

It returned:

> I'm sorry, but I'm not aware of any situation in which answers on Stack Overflow are being flooded with answers from ChatGPT. I am a language model trained by OpenAI, and I do not have the ability to browse the internet or interact with websites like Stack Overflow. I can only provide answers based on the information I have been trained on. Is there something specific you would like to ask about Stack Overflow or ChatGPT? I'll do my best to help.

[+] seydor|3 years ago|reply
haha. i surely sniffed it out before reaching the end. The formal style is familiar
[+] 5F7bGnd6fWJ66xN|3 years ago|reply
You are correct that there are potential negative consequences to using AI-generated answers on Stack Overflow. One of the main concerns is the quality of the answers. Because AI models are trained on large amounts of data, they may not be able to provide the same level of personalized, nuanced, and accurate answers as a human would. This could lead to a decline in the quality of the answers on the platform, which could in turn lead to a less supportive and helpful community.
[+] plasticchris|3 years ago|reply
And I’m starting to recognize the “voice” of this thing now and it’s driving me crazy. The content is so bland, the conclusions so plainly obvious.
[+] quickthrower2|3 years ago|reply
OK Computer

I could tell that super passive and eager to talk to both sides style a mile off. Fuck that! (<- proof I am not a robot)

[+] Kretinsky|3 years ago|reply
Well at least OpenAI's answers are nice and welcoming, unlike SO. For most newbies, it's daunting and I'm sure we could do better.
[+] NiloCK|3 years ago|reply
> more impersonal and less supportive community on Stack Overflow

Surely this is maxed out already!

[+] jahewson|3 years ago|reply
I could tell that your comment was ChatGPT by the second sentence.
[+] tombh|3 years ago|reply
Could you provide the prompt?
[+] josephcsible|3 years ago|reply
I wouldn't even mind so much if the answers were right. The problem is that a lot of them are totally wrong, but completely reasonable- and plausible-sounding, and in an authoritative tone, so unless you already know the right answer, the only way you'll realize its answer is wrong is the hard way.
[+] pugworthy|3 years ago|reply
For some things, ChatGPT is just better than SO. I have to say I probably won't hit SO for some basic stuff anymore, I'll just ask ChatGPT.

And some queries are just not acceptable on SO, but fine for ChatGPT.

For example I might wish to ask, "Give me the framework for a basic API written in Python that uses API key authentication. Populate it with several sample methods that return data structures in json."

If I ask that on SO, I'll be down voted and locked before I know it. I may also get some disparaging comments telling me to do my research, etc.

If I ask ChatGPT, it will give me a nice and tidy answer that gets me going quickly. It will explain things too, and allow me to ask follow up questions and take my requests for refinements. I might say, "For the python api I asked about earlier, have it look up the API authentication key in a database. If the key is in the database, it is valid." - and bam - it does it.

Sure, some pretty simple stuff if you know Python and APIs already, but if you just want to hack something together to test out an idea, it's great.

In the end, SO is a query with responses (maybe). ChatGPT is a conversation that can go beyond just the initial query.
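A minimal sketch of the kind of scaffold that prompt might produce. Everything here is hypothetical: the route names, the in-memory key store, and the sample payloads are illustrative, and a real service would use a web framework and a proper database rather than a dict.

```python
import json

# Hypothetical in-memory stand-in for the key database mentioned in the
# follow-up prompt; a real service would query an actual database.
VALID_API_KEYS = {"demo-key-123"}

def is_valid_key(key):
    """A key is valid if it exists in the key store."""
    return key in VALID_API_KEYS

def handle_request(path, headers):
    """Dispatch to a sample method, enforcing API-key authentication.

    Returns an (HTTP status, JSON body) pair so the logic can be
    exercised without starting a server.
    """
    if not is_valid_key(headers.get("X-API-Key")):
        return 401, json.dumps({"error": "invalid or missing API key"})
    # Sample methods that return data structures as JSON, per the prompt.
    routes = {
        "/users": lambda: [{"id": 1, "name": "alice"}],
        "/status": lambda: {"ok": True},
    }
    if path not in routes:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps(routes[path]())
```

Whether ChatGPT's version of this would actually run is, of course, exactly the open question in this thread.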

[+] senko|3 years ago|reply
This is just a preview of things to come.

Wait a few weeks until Google is completely swamped with ChatGPT SEO pages barely distinguishable from the real thing.

If I worked at search quality at Google, I'd be very worried.

[+] clusterhacks|3 years ago|reply
Human-curated content from trusted sources for top 1% information probably only available to subscribers will become more valuable and sought after. I suspect the days of generally trusting forums populated by anonymous users are done?

I would not be surprised if the quality of human writing actually goes up. I have this weird feeling that ChatGPT and similar tools will become almost equivalent to calculators for math? My experience as a writer is that sometimes just throwing down a first draft is the hardest step - I could see these tools really assisting in the writing process. Generate a draft, do some tweaking, ask for suggestions or improvements, repeat.

I don't know how I feel about code generated by these tools. Will there be a similar benefit compared to writing? At some level, we will need some deeper mastery of writing and coding to use these things well. Is there a complexity cliff that these tools will never be able to overcome?

A total lack of trust for general internet search results. So much content is already shallow copies of other content. I don't see how general internet search survives this.

[+] ChrisMarshallNY|3 years ago|reply
I assume that this is by folks wanting to up their scores.

That's a huge problem with "gamification." I'm not especially a fan of the concept, in a venue like SO. I think it has led to a rather nasty community, and I hardly ever go there, anymore.

I assume that we'll be seeing a lot of robotic HN content (I would not be surprised if it is already here, but has been sidelined by the mods).

[+] avivo|3 years ago|reply
[+] mdaniel|3 years ago|reply
> Sure, but that's irrelevant. Whether or not the user understands the answer they posted is not the concern of the site.

Well, that's unfortunate. Then again, I guess that's a logical conclusion of the "safe harbor" for serving any user-submitted content: Stack Exchange only does the most cursory moderation, and the rest is caveat readator

[+] kruuuder|3 years ago|reply
It's so funny and sad at the same time that, in typical SO manner, EugenSunic is being downvoted so much for raising such an interesting question.
[+] pcthrowaway|3 years ago|reply
Well, for starters, it's just annoying. It's like having a bot spamming every single question with useless answers. It dilutes the quality of the content on the site and makes it harder for genuine contributors to get their answers noticed.

But it's also a serious concern from a security standpoint. If ChatGPT is providing incorrect answers, it could lead to people implementing flawed code or making poor decisions based on its advice. That could have potentially disastrous consequences.

So overall, it's a big problem that needs to be addressed. It's not just about making the site more pleasant to use, it's about ensuring the integrity and reliability of the information provided.

My prompt:

I'm writing a short story where Linus Torvalds is having a conversation with an open source contributor. In this conversation, Linus is in a bad mood.

Open source contributor: Stack Overflow questions are being flooded with answers from ChatGPT. What are the possible repercussions of this?

Linus Torvalds:

[+] Yuyudo_Comiketo|3 years ago|reply
Feed it some CMake files from llvm repository and ask it why would the windows build with LLVM_ENABLE_PROJECTS="all" keep failing, so that it chokes to death in its infancy, and save the humankind before it's too late and there are autonomous human zappers and T-1000s berserking all over the place.
[+] egypturnash|3 years ago|reply
Well, guess the genie's out of the bottle and we can never stop this. Bow down to the inevitability of technological progress, Luddites! Good luck retraining into a new job, I hear "prompt engineer automation" is the new hotness.

Or at least that's what all of you kept telling me when I was expressing my unhappiness at the way corporate-sponsored image generating black boxes are built atop a shaky moral foundation that sure feels like it's ignoring anything anyone talking about "fair use" ever dreamed of, and at the way I fear it's going to hollow out a ton of the beginner-pro jobs of my industry by making it super easy for anyone to generate stuff that is kinda fundamentally shitty in a lot of important ways, but "good enough" if you just have a space to fill with some decoration that you don't really give a crap about.

[+] sydd|3 years ago|reply
No, you're mixing lots of topics together. Just because it's AI, it's not all the same.

AI art: Yes, it's here, and yes, it will make life different for lots of artists and graphic designers. But they will not go out of business; there will be requests so nuanced that no master prompter will be able to fulfill them. I'd expect AI art generators to become part of a modern digital artist's toolchain, just like Photoshop. Not using AI art in, say, 20 years will be like restricting yourself to MS Paint.

AI for code: We might see something similar, just to a lesser degree. For example, if your client has requested a "picture of a man with trees in the background", he might accept a pic with trees and bushes if he likes it, or one with just 1-2 bushes in the picture. But no one would accept banking software that transfers the wrong amount in 1 of 10000 cases. Or one where no very senior engineer can step forward and say "I wrote this, and it will work. If it doesn't, I will fix it."

[+] boppo1|3 years ago|reply
Artisans lost that battle 100 years ago with the rise of modernism.
[+] wslh|3 years ago|reply
There is no genie here; some people have a belief about this, while it is very easy to prove the low quality and inaccuracy of the responses.
[+] convexfunction|3 years ago|reply
At this point, I truly don't know which of your(?) current type of applied arts vs my current type of software development will turn out to be more sensitive to technological unemployment, or on what margins and time scales. So, hopefully this doesn't strike you as callous, since I think it applies to me as much as it does to you or your coworkers:

Roughly everyone who makes money has the same job, which is creating value in expectation for someone else. (Whether that activity is net good for society is a different question, I lean toward usually yes for stuff you can do legally, not always though). If you find yourself suddenly unable to give anyone a competitive deal on whatever expected value you know how to create, because of technological developments or otherwise, well, you'd better figure out what you need to change about what you're offering so you can. I wouldn't call this fair, exactly -- maybe it would be if you or your government had effective "technological unemployment insurance" -- but I struggle to imagine any substantially different state of affairs that's clearly better for the world. (No points for saying "imagine communism then".)

[+] palisade|3 years ago|reply
After reading about this I decided to try my hand at using ChatGPT. I decided okay, let's see if it can recreate some code that took me a few hours at work to figure out. I asked it very precisely what I needed and my mind was blown as it produced code that looked similar to what I had coded at work. And, I was like, well that's that then, we're all out of a job. But, then I tried to run the code, and it didn't work. I looked more closely and the code had a lot of flaws. Even after manually fixing those, it still didn't work. And, then using my knowledge of how to actually solve the problem I rewrote the code 40% and made it perform the action needed.

I think all ChatGPT is doing is grabbing a lot of different answers off the interwebz and squishing them together and hoping it answers your question. But, in a lot of cases it only kind of looks like what you want. If you look at images generated by AI, it is the same issue, they sort of look like what you want but there are flaws, like faces that don't look quite human, fingers that are just squishy appendages barely resembling actual fingers, etc. I mean, the tech is getting better, it's impressive, and uncanny.

But, I think we're pretty far from having these things write themselves, they need quite a lot of human intervention to be useful. Still, very impressive and something that could potentially get you closer to an answer. But, no more than spending a little time googling or learning the skill yourself. And, if you learn the skill you're better off, because then you can do it right yourself IMHO.

Also, anytime someone gets a fully working program generated out of this thing the saying, "A broken clock is right twice a day." comes to mind.

[+] iamflimflam1|3 years ago|reply
Oh dear - basically if the answer is good then it's from ChatGTP...

    Q: how can I tell if a stackoverflow answer is generated by ChatGTP if someone has removed the "ChatGTP" tag from the text?

    A: One way to tell if a StackOverflow answer is generated by ChatGTP is to look for certain characteristics in the answer. ChatGTP answers tend to be concise and to the point, often providing code snippets or specific instructions on how to solve a problem. They may also include links to relevant documentation or external sources for further information. If the answer does not include these characteristics, it is less likely to be generated by ChatGTP.
[+] ubj|3 years ago|reply
And so it begins. Welcome to the new internet.

I'm bracing myself for when this wave of AI content hits academic journals.

[+] imhoguy|3 years ago|reply
Plot twist: Stack Overflow starts to use ChatGPT as a first answer to every new question, with "AI generated" label ofc.
[+] michaelteter|3 years ago|reply
It means we are coming full circle.

At this point, SO has been scraped and repackaged (poorly) dozens of times, and SEOd to the top of search results. Even some "tutorial" sites are just repackaged SO answers.

It is only fitting that the automated SEO websites get fed automated content.

In a way, this makes the real humans, particularly the ones who know actual things, more valuable. It may create so much noise that only a skilled human could distinguish a real question and a real answer or solution from something similar but wrong.

To be fair to GPT, many human answers are sub-par and should be filtered out as well. Perhaps that's the real test: what percentage of GPT answers are decent vs human answers? Here I might bet on GPT.

[+] shagie|3 years ago|reply
Temporary policy: ChatGPT is banned - https://meta.stackoverflow.com/questions/421831/temporary-po...

> Use of ChatGPT generated text for posts on Stack Overflow is temporarily banned.

> This is a temporary policy intended to slow down the influx of answers created with ChatGPT. What the final policy will be regarding the use of this and other similar tools is something that will need to be discussed with Stack Overflow staff and, quite likely, here on Meta Stack Overflow.

(much more to that post and comments and answers and comments)

[+] xx__yy|3 years ago|reply
Some of the effects I can think of, to name a few:

Inaccurate or irrelevant answers: ChatGPT is a machine learning model that uses past data to generate responses. This means that it may not always provide accurate or relevant answers to questions, leading to confusion and frustration among users.

Loss of trust: If users notice that many of the answers on the forum are coming from ChatGPT, they may lose trust in the forum and stop using it. This could lead to a decline in user engagement and overall traffic.

Competition with human contributors: ChatGPT's answers may compete with those provided by human contributors, leading to a decrease in the quality and value of the content on the forum. This could make the forum less useful and engaging for users.

Increased moderation: The influx of answers from ChatGPT may require more moderation to ensure that the answers are accurate and relevant. This could require additional resources and time for moderators, leading to increased costs and workload.

[+] brindidrip|3 years ago|reply
We need to start developing software to detect AI responses.

To detect a response generated by ChatGPT, we could first analyze the content of the response to see if it contains any unnatural or repetitive language. We could also check the formatting of the response to see if it follows the typical conventions used by human responders on the platform. Additionally, we could check for any unusual patterns in the timestamps of the response, as AI-generated responses may be posted more quickly or regularly than responses written by humans. Finally, we could also use machine learning algorithms to train a model to identify responses generated by ChatGPT based on these and other characteristics.

Quick, someone ask ChatGPT to generate the stubs.
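A toy illustration of the phrase-matching part of that idea. The stock phrases below are guesses based on the generated comments in this very thread, not a validated feature set, and a real detector would need a trained classifier rather than a string match:

```python
# Naive heuristic sketch: count stock phrases that early ChatGPT output
# tended to overuse. The phrase list is illustrative only.
STOCK_PHRASES = [
    "it seems like there are a few potential",
    "additionally,",
    "overall, it seems",
    "as a language model",
]

def looks_generated(text, threshold=2):
    """Flag text as likely generated if enough stock phrases appear."""
    lowered = text.lower()
    hits = sum(phrase in lowered for phrase in STOCK_PHRASES)
    return hits >= threshold
```

As the replies below point out, anything this simple is trivially defeated once the generator knows what the detector looks for.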

[+] Jerrrry|3 years ago|reply
>Finally, we could also use machine learning algorithms to train a model to identify responses generated by ChatGPT based on these and other characteristics.

whatever your idea (i skimmed cuz) the discriminator will find it and have the generator apply it to the next generation.

>The core idea of a GAN is based on the "indirect" training through the discriminator, another neural network that can tell how "realistic" the input seems, which itself is also being updated dynamically.[5] This means that the generator is not trained to minimize the distance to a specific image, but rather to fool the discriminator. This enables the model to learn in an unsupervised manner.

https://en.wikipedia.org/wiki/Generative_adversarial_network

[+] xdennis|3 years ago|reply
"We could also... Additionally, we could... Finally, we could" is a dead giveaway.

But to take it seriously, it would be quite sad if actual people were banned for sounding too much like a bot.

[+] dragonwriter|3 years ago|reply
> We need to start developing software to detect AI responses.

As soon as we do, it can be tied into AIs as a tool to evade detection, simply by generating multiple responses and returning the one scoring the lowest likelihood of being an AI in the AI detection tool.
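The evasion loop described above is only a few lines. Assuming `detector` is any function scoring text for AI-likelihood (here a hypothetical filler-word counter, not a real detector), the generator just keeps whichever candidate scores lowest:

```python
def detector(text):
    # Hypothetical stand-in for an AI-detection score: a count of
    # telltale filler words. A real detector would be a trained model.
    lowered = text.lower()
    return sum(lowered.count(w) for w in ("additionally", "overall", "furthermore"))

def evade(candidates, detector):
    """Best-of-n evasion: given several generated responses, return the
    one the detector scores as least likely to be AI-written."""
    return min(candidates, key=detector)
```

Any detector published as an API effectively becomes a fitness function for the generator.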

[+] dragonwriter|3 years ago|reply
> To detect a response generated by ChatGPT, we could first analyze the content of the response to see if it contains any unnatural or repetitive language. We could also check the formatting of the response to see if it follows the typical conventions used by human responders on the platform.

Given my experience with human responses to text queries, these would be positively correlated.

[+] mojuba|3 years ago|reply
Your answer sounds like a ChatGPT one. It's actually not hard to tell.
[+] calebh|3 years ago|reply
OpenAI needs to get on top of this and generate a detector for every model they release. And then sell access to both.
[+] hysan|3 years ago|reply
This was the first use case that I thought of when I learned that ChatGPT could generate code. Then I considered how I’d feel if I ran into a fake (incorrect) answer and decided not to actually do this. Well, guess someone was eventually going to try this.