OpenAI Plans Tool to Detect ChatGPT-Generated Texts
OpenAI is developing a tool to detect text generated by ChatGPT using a watermarking method. The approach makes subtle changes to the model’s word-selection process, embedding an invisible watermark. A specialized detector can then identify the watermark, making it possible to determine whether ChatGPT was used to generate the content.
ChatGPT is built on an AI system that predicts which word or word fragment (known as a token) should come next in a sentence. The tool under development at OpenAI slightly alters how tokens are chosen, leaving a statistical pattern referred to as a watermark. These marks are imperceptible to human readers but can be detected with the company’s technology. The detector reports a probability estimate of whether all or part of a document was written by ChatGPT. Internal company documents indicate the watermarks are 99.9% effective when ChatGPT generates a sufficient amount of new text.
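OpenAI has not published the details of its scheme, but the general idea of biasing token selection and then testing for that bias statistically can be illustrated with a toy sketch. The code below implements a hypothetical "green-list" watermark, a well-known approach from the research literature, not OpenAI's actual method; the vocabulary, the 50% split, and all function names are assumptions for illustration. The generator favors a pseudorandom half of the vocabulary keyed on the previous token, and the detector scores how often a text lands in that half.

```python
import hashlib
import random

VOCAB = [f"tok{i}" for i in range(1000)]  # toy stand-in for a real tokenizer vocabulary
GREEN_FRACTION = 0.5                      # half the vocabulary is "green" at each step

def green_list(prev_token: str) -> set:
    """Pseudorandomly partition the vocabulary, keyed on the previous token.

    The hash acts as a shared secret: anyone holding this function can
    both embed and detect the watermark.
    """
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(VOCAB) * GREEN_FRACTION)])

def generate(n_tokens: int, seed: int = 0) -> list:
    """Toy 'model' that always samples from the green half (a hard watermark).

    A real system would instead softly boost green-token probabilities,
    so that text quality is preserved.
    """
    rng = random.Random(seed)
    tokens = ["tok0"]
    for _ in range(n_tokens):
        tokens.append(rng.choice(sorted(green_list(tokens[-1]))))
    return tokens

def z_score(tokens: list) -> float:
    """Detector: compare observed green-token hits to the 50% chance rate."""
    hits = sum(tok in green_list(prev) for prev, tok in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    expected = n * GREEN_FRACTION
    variance = n * GREEN_FRACTION * (1 - GREEN_FRACTION)
    return (hits - expected) / variance ** 0.5

watermarked = generate(200)
rng = random.Random(1)
unmarked = ["tok0"] + [rng.choice(VOCAB) for _ in range(200)]
print(f"watermarked z = {z_score(watermarked):.1f}")  # far above chance
print(f"unmarked    z = {z_score(unmarked):.1f}")     # close to chance
```

The z-score grows with the length of the text, which mirrors the article's caveat that detection is reliable only when ChatGPT generates enough new tokens. It also shows why evasion is possible: translating the text or inserting and deleting characters changes the token sequence, scrambling the statistical signal the detector looks for.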
Concerns and Delays in Release
Despite the tool’s technical readiness, OpenAI is delaying its release. The company believes that using this technology could cause dissatisfaction among users. According to a survey, about one-third of regular ChatGPT users would react negatively to the introduction of anti-cheating technology. These concerns are heightened by the risk that the technology could be bypassed, for example, by translating the text into another language or by adding and removing special characters.
Additionally, OpenAI is considering the potential negative impact on various user groups, such as non-native English speakers. There are concerns that watermarking could create barriers for these groups in using AI for educational purposes.
Internal Debate and Access Control
Active debate continues within the company over whether to release the tool. Some employees believe it would help prevent academic dishonesty and preserve the integrity of educational assessments. Others argue that access to the tool must be strictly controlled, so that malicious actors cannot study and circumvent the watermarks.
The question of who should have access to the detector remains unresolved. Limited access could reduce its effectiveness, while overly broad access could reveal detection methods. OpenAI is currently considering various distribution strategies, including working directly with educational institutions or partnering with companies specializing in plagiarism detection.
Part of a Broader Strategy
This tool is part of OpenAI’s broader strategy to develop technologies for detecting AI-generated content, including watermarks for audio and video. This is especially important for preventing misinformation, particularly during election campaigns.
The decision to release the text-detection tool will be made after further evaluation of its impact on users and the broader ecosystem. OpenAI aims to act in accordance with principles of transparency and responsibility, which requires a careful approach to deploying new technologies.