Researchers Show How to Outsmart the Best Deepfake Detectors
A team of scientists from the University of California, San Diego has demonstrated that even the most advanced deepfake detection systems can be deceived. The key is to insert adversarial examples, that is, subtly manipulated input data, into every frame of a deepfake video.
Adversarial examples are slightly altered inputs that cause artificial intelligence systems to make mistakes. Notably, this method works even after the video has been compressed.
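As a concrete illustration of the general idea (not the team's own method, since their code was not released), the sketch below applies the classic fast gradient sign method to nudge an image so that a differentiable classifier misjudges it. The `model`, tensor shapes, and step size are assumptions made for the example.

```python
import torch

def fgsm_perturb(model, image, label, epsilon=0.01):
    """Fast Gradient Sign Method: a classic way to craft an adversarial example.

    `model` is any differentiable classifier (here, a hypothetical real-vs-fake
    face detector), `image` a tensor in [0, 1] with shape (N, 3, H, W), and
    `label` a LongTensor of class indices the model should be pushed away from.
    """
    image = image.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss, i.e. degrades the prediction.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```

With a small enough `epsilon`, the change is typically invisible to the eye yet sufficient to flip the classifier's output.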
“Our work shows that attacks on deepfake detectors can be a real threat,” said study co-author Shehzeen Hussain. According to her, it is possible to create such deepfakes without any knowledge of the machine learning model used by the detector.
How Deepfake Detectors Work
Typical deepfake detectors focus on the faces in a video: they first track the faces, then feed the extracted facial data into a neural network that decides whether each face is real or fake. For example, deepfakes often fail to reproduce natural blinking, so some detectors pay particular attention to eye movements.
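A minimal sketch of such a two-stage pipeline, with a toy classifier and a hypothetical `detect_and_crop_faces` face tracker standing in for the components a real detector would use, could look like this:

```python
import torch
import torch.nn as nn

class FrameClassifier(nn.Module):
    """A toy real-vs-fake classifier standing in for the CNN stage of a detector."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 2)  # logits: [real, fake]

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def classify_frame(frame, detect_and_crop_faces, classifier):
    """Run the two-stage pipeline on one frame.

    `detect_and_crop_faces` is a hypothetical face tracker returning a list of
    (3, H, W) tensors in [0, 1]; any face detector could fill this role.
    """
    fake_probs = []
    for face in detect_and_crop_faces(frame):
        logits = classifier(face.unsqueeze(0))
        fake_probs.append(torch.softmax(logits, dim=1)[0, 1].item())
    return fake_probs  # one fake-probability per detected face
```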
If attackers have some knowledge of how detectors work, they can design input data to target the detectors’ blind spots.
The Attack Method
The researchers created an adversarial example for each face that appears in a video frame. Their algorithm evaluates the perturbation over a set of input transformations, mirroring how the detector model evaluates real and fake images, and uses this evaluation to alter the face so that the attack remains effective even after the video is compressed and decompressed. The modified face is then inserted back into the frame, and the process is repeated for every frame to produce the final deepfake video.
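Because the team withheld its code, the sketch below only illustrates the underlying idea of optimizing a perturbation over several input transformations (often called expectation over transformations) so that it survives distortions such as compression. The `detector`, the `transforms` list, and all hyperparameters are assumptions for the example.

```python
import torch

def robust_adversarial_face(detector, face, transforms, steps=100, alpha=0.005, epsilon=0.03):
    """Craft a perturbation that survives a set of input transformations.

    `detector` maps a (1, 3, H, W) face tensor to [real, fake] logits, and
    `transforms` is a list of differentiable approximations of distortions
    such as compression, blurring, or resizing. All names are illustrative.
    """
    real_label = torch.tensor([0])          # class index the attacker wants
    delta = torch.zeros_like(face, requires_grad=True)

    for _ in range(steps):
        # Average the loss over the transformations so the perturbation
        # does not overfit to one exact pixel layout.
        loss = sum(
            torch.nn.functional.cross_entropy(detector(t(face + delta)), real_label)
            for t in transforms
        ) / len(transforms)
        loss.backward()

        with torch.no_grad():
            delta -= alpha * delta.grad.sign()   # push the prediction toward "real"
            delta.clamp_(-epsilon, epsilon)      # keep the change imperceptible
            delta.grad.zero_()

    return (face + delta).clamp(0.0, 1.0).detach()
```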
Testing the Deepfakes
The researchers tested their deepfakes in two scenarios:
- Full Access: Hackers have complete access to the detector model, including the face extraction pipeline, model architecture, and classification parameters.
- Limited Access: Attackers can only query the machine learning model to determine the probability that a frame will be classified as real or fake (see the query-only sketch below).
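A rough picture of the limited-access setting, assuming only a hypothetical `query_fake_prob` scoring function is available, is a query-based search that tweaks the face at random and keeps changes that lower the detector's fake score. This is a much simpler strategy than the one used in the study, but it shows what "query-only" access means in practice.

```python
import torch

def black_box_attack(query_fake_prob, face, steps=500, sigma=0.01, epsilon=0.03):
    """Query-only attack: no gradients, just the detector's fake-probability score.

    `query_fake_prob` is a hypothetical function mapping a (3, H, W) face
    tensor in [0, 1] to the probability that the detector labels it fake.
    """
    best = face.clone()
    best_score = query_fake_prob(best)

    for _ in range(steps):
        # Propose a small random tweak, kept within an epsilon budget of the original.
        candidate = best + sigma * torch.randn_like(best)
        candidate = face + (candidate - face).clamp(-epsilon, epsilon)
        candidate = candidate.clamp(0.0, 1.0)

        score = query_fake_prob(candidate)
        if score < best_score:              # keep the tweak only if the score drops
            best, best_score = candidate, score

    return best, best_score
```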
In the full-access scenario, the attack success rate exceeded 99% for uncompressed videos and reached 84.96% for compressed videos. In the limited-access scenario, the success rate was 86.43% for uncompressed videos and 78.33% for compressed ones.
The team chose not to publish their code to prevent misuse by malicious actors.
Improving Deepfake Detectors
To harden detectors, the researchers recommend an approach similar to adversarial training: during training, an adversary keeps generating new deepfakes capable of bypassing the current detector, while the detector keeps improving so it learns to catch them.
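A minimal sketch of one such training step, assuming a `craft_attack` adversary (for instance, the FGSM routine above) and standard PyTorch components, might look like this:

```python
import torch

def adversarial_training_step(detector, optimizer, faces, labels, craft_attack):
    """One training step in which the detector also sees attacked inputs.

    `craft_attack` is a hypothetical adversary that perturbs a batch of faces
    against the current detector; all names here are illustrative.
    """
    detector.train()
    adversarial_faces = craft_attack(detector, faces, labels)

    # Train on clean and adversarial examples together so the detector
    # learns to resist the perturbations it will face at test time.
    inputs = torch.cat([faces, adversarial_faces])
    targets = torch.cat([labels, labels])

    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(detector(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```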
Previously, researchers from Binghamton University and Intel proposed detecting deepfakes based on invisible skin-color changes caused by blood flow. Photoplethysmography tracks these changes in blood flow using an infrared or visible light source and a photoresistor or phototransistor.
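As a rough illustration of the kind of signal such detectors rely on, assuming the measurement is taken from video frames rather than a dedicated sensor, one can average the green channel over a skin region frame by frame. The function and region below are purely illustrative, and real systems track the face and filter the signal far more carefully.

```python
import numpy as np

def green_channel_ppg(frames, face_box):
    """Crude photoplethysmography proxy from video: mean green intensity over time.

    `frames` is a list of (H, W, 3) RGB arrays and `face_box` a fixed
    (top, bottom, left, right) skin region; this sketch only extracts the raw trace.
    """
    top, bottom, left, right = face_box
    signal = np.array([
        frame[top:bottom, left:right, 1].mean()  # channel index 1 = green
        for frame in frames
    ])
    # Remove the slow baseline so the periodic blood-flow component stands out.
    return signal - signal.mean()
```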