Stack Overflow, the go-to question-and-answer site for programmers, has temporarily banned users from sharing responses generated by the AI chatbot ChatGPT.
The site’s mods said the ban was temporary and that a final ruling would be made after consultation with the community. But, as the mods explained, ChatGPT simply makes it too easy for users to generate answers and flood the site with responses that appear correct on the surface but are often wrong on closer inspection.
“The primary problem is […] the answers that ChatGPT produces are often incorrect.”
“The primary problem is that while the answers ChatGPT produces are often incorrect, they typically look like they might be good, and the answers are very easy to produce,” the mods wrote (emphasis theirs). “As such, we need to reduce the volume of these messages […] So for now, using ChatGPT to create posts here on Stack Overflow is not allowed. If a user is believed to have used ChatGPT after this temporary policy was posted, sanctions will be imposed to prevent them from continuing to post such content, even if the posts would otherwise be acceptable.”
ChatGPT is an experimental chatbot created by OpenAI and based on the autocomplete text generator GPT-3.5. A web demo for the bot was released last week and has since been enthusiastically embraced by users across the web. The bot’s interface encourages people to ask questions and in return provides impressive and fluid results for a range of queries, from generating poems, songs and TV scripts to answering trivia questions and writing and debugging lines of code.
But while many users are impressed with ChatGPT’s capabilities, others have noticed its persistent tendency to generate plausible but false answers. For example, ask the bot to write a biography of a public figure, and it may well insert incorrect biographical information with complete confidence. Ask it to explain how to program software for a specific function, and it can produce similarly credible but ultimately incorrect code.
AI text models like ChatGPT learn by searching for statistical regularities in text
This is one of the many known shortcomings of text generation AI models, also known as large language models or LLMs. These systems are trained by analyzing patterns in massive amounts of text scraped from the Internet. They look for statistical regularities in this data and use them to predict which word is likely to come next in a given sentence. However, this means they don’t have hard-coded rules for how certain systems in the world work, leading to their tendency to generate “fluent nonsense”.
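The principle can be illustrated with a toy sketch: a tiny bigram model that “learns” by counting which word follows which in a corpus, then always predicts the statistically most likely next word. This is of course nothing like ChatGPT’s actual implementation (which uses a large neural network trained on vastly more data), but it shows how a purely statistical predictor can produce fluent output with no rules about, or understanding of, what the words mean.

```python
from collections import Counter, defaultdict

# Toy corpus; the "training data" for our bigram model.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count, for each word, how often each other word follows it.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    """Return the word most frequently observed after `word`."""
    return counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

Chaining such predictions yields grammatical-looking sequences that need not be true or even meaningful, which is the small-scale analogue of the failure mode Stack Overflow is worried about.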
Given the sheer size of these systems, it is impossible to say for sure what percentage of their output is false. But in the case of Stack Overflow, the company has preliminarily ruled that the risk of misleading users is simply too great.
Stack Overflow’s decision is particularly noteworthy as experts in the AI community are currently debating the potential threat posed by these large language models. Yann LeCun, chief AI scientist at Facebook parent company Meta, has argued, for example, that while LLMs can certainly generate bad output such as misinformation, they do not make the actual sharing of this text any easier, and it is the sharing that causes harm. But others say the ability of these systems to cheaply generate text at scale necessarily increases the risk that it is shared later on.
To date, there is little evidence of LLMs causing harmful effects in the real world. But these recent events at Stack Overflow support the argument that the sheer scale of these systems does indeed create new challenges. The site’s mods said as much when announcing the ban on ChatGPT, noting that the “volume of these [AI-generated] answers (thousands) and the fact that the answers often need to be read in detail by someone with at least some subject matter expertise to determine if the answer is really bad has effectively swamped our volunteer-based quality management infrastructure.”
The concern is that this pattern could be repeated on other platforms, with a deluge of AI content drowning out real users’ voices with plausible but incorrect data. However, exactly how this might play out in different domains on the internet would depend on the exact nature of the platform and its moderation capabilities. Whether these issues can be addressed in the future with tools such as improved spam filters remains to be seen.
“The scary thing was how confidently incorrect it was.”
Meanwhile, responses to Stack Overflow’s policy announcement on the site’s own discussion forums and on related forums like Hacker News have been broadly supportive, though users added the caveat that it may be difficult for Stack Overflow’s mods to identify AI-generated responses.
Many users have shared their own experiences using the bot, with one person on Hacker News saying they found that the answers to questions about coding issues were more often wrong than right. “The scary thing was how confidently incorrect it was,” said the user. “The text looked very good, but it had major mistakes.”
Others took the AI moderation issue to ChatGPT itself and asked the bot to generate arguments for and against the ban. In one comment, the bot came to the exact same conclusion as Stack Overflow’s own mods: “All in all, whether or not to allow AI-generated answers on Stack Overflow is a complex decision that needs careful consideration by the community.”