YouTuber Fights AI Content Theft with Subtitle Poisoning

YouTuber F4mi has discovered an interesting way to combat AI tools that are used to steal other creators’ content. She uses .ass subtitle files, filling the subtitles for her videos with junk data that is invisible to viewers but confuses AI systems.

Recently, there has been a surge in YouTube channels where content production is fully automated using AI tools. On these channels, AI creates everything: from scripts and voiceovers to images and music.

As a result, more and more YouTubers have been complaining that these faceless channels steal embedded video transcripts, run them through AI summarizers, and quickly generate their own knockoff content. This is exactly what F4mi is fighting against.

F4mi explains that she aims to poison any AI summarizer that tries to steal her content. The key to her method is the .ass (Advanced SubStation Alpha) subtitle format, which has been around for decades. Unlike simpler formats, .ass supports styling features such as fonts, colors, positioning, bold, italics, underline, and much more.

This allows F4mi to hide junk data in her video transcripts that confuses AI but is invisible to humans. For each real text fragment, F4mi adds “two fragments of text outside the boundaries using the positioning feature in the .ass format.” This extra text is shrunk to zero size and made fully transparent, so viewers never see it.
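To make the idea concrete, here is a minimal Python sketch of how one real caption could be surrounded by two hidden decoys. It is not F4mi's script: the function names, the off-screen coordinates, and the decoy strings are assumptions for illustration. The `\pos` and `\alpha` override tags are standard .ass tags; the sketch relies on positioning and full transparency only, because renderers may treat a zero font size differently.

```python
def ass_time(seconds: float) -> str:
    """Format seconds as an .ass timestamp: H:MM:SS.cc (centiseconds)."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


# Hypothetical hiding tags: push the text off-screen and make it fully
# transparent (&HFF& is the maximum alpha value in .ass).
HIDDEN = r"{\pos(-1000,-1000)\alpha&HFF&}"


def dialogue(start: float, end: float, text: str, tags: str = "") -> str:
    """Build one Dialogue event using the default style and zero margins."""
    return (
        f"Dialogue: 0,{ass_time(start)},{ass_time(end)},"
        f"Default,,0,0,0,,{tags}{text}"
    )


def poison_event(start, end, real_text, decoy_before, decoy_after):
    """Return the real caption wrapped by two invisible decoy events."""
    return [
        dialogue(start, end, decoy_before, HIDDEN),
        dialogue(start, end, real_text),
        dialogue(start, end, decoy_after, HIDDEN),
    ]


if __name__ == "__main__":
    for line in poison_event(0, 2.5, "Welcome to the video",
                             "It was the best of times",
                             "it was the worst of times"):
        print(line)
```

All three events share the same timestamps, so a player shows only the visible caption while the file contains three times as much text.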

How the Subtitle Poisoning Works

In these “invisible” subtitles, F4mi inserts text from public-domain works (sometimes swapping words for synonyms to avoid detection) or LLM-generated text of her own, filled with made-up facts.

When an AI summarizer processes these transcripts, the hidden text overwhelms the real content, resulting in complete nonsense that can’t be used for quick content theft.
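The effect is easy to see with a naive transcript extractor. The Python sketch below is an assumption about how a simple scraper might behave, not a description of any particular tool: it strips the `{...}` override blocks and joins whatever text remains, so it cannot tell visible captions from hidden ones. The sample lines are made up for illustration.

```python
import re

# Override blocks in .ass are enclosed in curly braces, e.g. {\pos(10,20)}.
OVERRIDE_BLOCK = re.compile(r"\{[^}]*\}")


def naive_transcript(ass_lines):
    """Join the text of every Dialogue event, ignoring all styling."""
    texts = []
    for line in ass_lines:
        if not line.startswith("Dialogue:"):
            continue
        # The text is the 10th comma-separated field and may contain commas.
        text = line.split(",", 9)[9]
        texts.append(OVERRIDE_BLOCK.sub("", text).strip())
    return " ".join(texts)


# Made-up sample: one visible caption between two hidden decoys.
SAMPLE = [
    r"Dialogue: 0,0:00:00.00,0:00:02.00,Default,,0,0,0,,"
    r"{\pos(-1000,-1000)\alpha&HFF&}It was the best of times",
    "Dialogue: 0,0:00:00.00,0:00:02.00,Default,,0,0,0,,Welcome to the video",
    r"Dialogue: 0,0:00:00.00,0:00:02.00,Default,,0,0,0,,"
    r"{\pos(-1000,-1000)\alpha&HFF&}it was the worst of times",
]

if __name__ == "__main__":
    print(naive_transcript(SAMPLE))
```

Once the styling is discarded, the real caption is a minority of the text handed to the summarizer.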

F4mi notes that advanced models like ChatGPT o1 can sometimes filter out the junk and still generate fairly accurate summaries of her videos. To counter this, she can split the subtitles into individual letters, each with its own timestamp, and scramble their order in the .ass file. The letters still display correctly in the final video, but the file itself becomes a puzzle that even advanced AIs can’t solve.
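A rough Python sketch of the letter-scrambling idea follows. It assumes each letter is placed with its own `\pos` tag at a fixed horizontal spacing, which is a simplification chosen for illustration; the function names, coordinates, and spacing are hypothetical and the article does not describe F4mi's exact layout.

```python
import random
import re


def scatter_letters(text, start, end, x0=100, y=650, advance=24, seed=0):
    """Emit one Dialogue event per letter, then shuffle the file order.

    A renderer places each letter by its \\pos tag and timestamps, so the
    order of events in the file does not affect what the viewer sees.
    """
    events = []
    for i, ch in enumerate(text):
        if ch == " ":
            continue  # spaces are implied by the gap between positions
        tag = "{\\pos(%d,%d)}" % (x0 + i * advance, y)
        events.append(f"Dialogue: 0,{start},{end},Default,,0,0,0,,{tag}{ch}")
    random.Random(seed).shuffle(events)
    return events


def read_in_file_order(events):
    """What a naive scraper sees: letters in whatever order the file has."""
    return "".join(e[-1] for e in events)


def read_by_position(events):
    """What the viewer sees: letters ordered left to right on screen."""
    def x_of(event):
        return int(re.search(r"\\pos\((\d+),", event).group(1))
    return "".join(e[-1] for e in sorted(events, key=x_of))


if __name__ == "__main__":
    evs = scatter_letters("hi there", "0:00:01.00", "0:00:03.00")
    print(read_in_file_order(evs))
    print(read_by_position(evs))
```

Recovering the text requires interpreting positions and timing the way a renderer does, which is more work than reading the file top to bottom.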

Technical Challenges and Limitations

Although YouTube doesn’t natively support the .ass format, there are tools that let creators convert .ass subtitles to YouTube’s preferred .ytt format. Unfortunately, these subtitles display incorrectly on the mobile version, where they appear as black boxes covering the video.

F4mi says she worked around this by writing a Python script that hides the junk subtitles as black text on a black background, filling the screen whenever the image turns black. However, she notes in her video description that “some people’s phones freeze because the subtitles have become too heavy.”

The YouTuber also admits that this method is far from perfect. For example, tools like OpenAI’s Whisper, which actually listen to the audio track, can still generate usable transcripts since they don’t rely on subtitles. Additionally, almost any AI-based screen reader will likely be able to extract the subtitles visible to humans from the video.

Other Efforts to Protect Content from AI

F4mi is not the only one trying to fight back against AI bots and crawlers to protect her content. Recently, we reported on other enthusiasts who are creating traps for AI crawlers and developing tools to “poison” AI systems.
