The Nuclear GIF: How the Zero-Click iMessage Exploit Works
Israel’s NSO Group developed a zero-click exploit for iMessage that caused a major stir. The exploit was used to deploy the Pegasus spyware onto the phones of public figures and politicians, and Apple has since filed a lawsuit against NSO. But let’s set politics aside: this article focuses on the exploit itself, which is truly explosive. It infects devices without any user interaction, hides inside a GIF, and even includes a tiny virtual computer.
NSO and the Birth of Pegasus
Our story begins in August 2016, when Pegasus, spyware built by the Israeli cyberweapons company NSO Group to infect mobile devices running Android and iOS, was first publicly exposed. Pegasus could read text messages, track calls and location, collect passwords, capture audio and video from the microphone and camera, and retrieve other personal user data.
The original 2016 Pegasus used a “one-click” exploit. When a victim received a “loaded” message on their smartphone, they had to do something—like click a link—to activate the malicious payload. Infection was easy to avoid: just don’t click on suspicious things.
- Examples of phishing SMS messages
In July 2021, researchers from Citizen Lab at the University of Toronto managed to study a “zero-click” exploit for iMessage, discovered on the smartphone of a Saudi Arabian activist. The exploit worked without any user interaction—the hacker only needed to send a payload via the messenger.
The Dangerous Entry Point
The entry point for Pegasus on the iPhone is the iMessage app. This means the attacker only needs to know the victim’s phone number or Apple ID.
iMessage natively supports GIF animation. A GIF sent in a chat loops endlessly. As soon as iMessage receives a message, even before it is displayed on the screen, a method is called in the IMTranscoderAgent process, which runs outside the BlastDoor sandbox. The method receives any image with a .gif extension as a parameter, like this:
[IMGIFUtils copyGifFromPath:toDestinationPath:error]
This is Objective-C code. Judging by the selector name, the intent was probably just to copy the GIF file before editing the loop counter field, but the method’s semantics are different. Inside, it uses the CoreGraphics API to render the source image into a new GIF file at the destination path. However, just because a file has a .gif extension doesn’t mean it’s actually a GIF.
The ImageIO library identifies the file format from the file’s contents and parses it accordingly, completely ignoring the extension. Thanks to this “fake GIF” trick, more than 20 image codecs become part of the zero-click attack surface in iMessage. Some of these codecs are extremely complex, with hundreds of thousands of lines of code: a huge playground for hackers!
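The idea of picking a decoder from the file contents rather than from the name can be sketched in a few lines of C++. This is only an illustration and not ImageIO’s code: the function name `sniffFormat` is hypothetical, and only the well-known GIF and PDF signatures are checked.

```cpp
#include <string>

// Hypothetical helper (not an ImageIO API): guess the format from the first
// bytes of the data. The file name and its extension are never consulted.
std::string sniffFormat(const std::string& data) {
    auto startsWith = [&data](const std::string& magic) {
        return data.compare(0, magic.size(), magic) == 0;
    };
    if (startsWith("GIF87a") || startsWith("GIF89a")) return "gif";
    if (startsWith("%PDF-")) return "pdf";
    return "unknown";
}
```

With content sniffing like this, a file named picture.gif that begins with `%PDF-` is routed to the PDF parser, which is exactly the kind of mismatch the “fake GIF” trick relies on.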
According to Google Project Zero, starting with iOS 14.8.1 (October 26, 2021), Apple limited the formats available in ImageIO from the IMTranscoderAgent process. In iOS 15.0 (September 20, 2021), Apple completely removed GIF access code from IMTranscoderAgent, moving GIF decoding entirely inside BlastDoor.
PDF Inside a GIF
NSO used the “fake GIF” loophole to exploit a vulnerability in the CoreGraphics PDF parser.
For years, the PDF format has been a favorite target for attacks—it’s widely supported and complex enough. A bonus for hackers: PDF supports JavaScript. While CoreGraphics PDF doesn’t interpret JavaScript, NSO found something just as powerful deep in the parser.
Ian Beer and Samuel Groß from Project Zero dissected the NSO exploit and explained in detail how it works. Let’s follow their research to see how an image turned into a real computer and helped the exploit escape the sandbox.
Extreme Compression: JBIG2
In the late 1990s, few people had fast, stable internet. Most users connected via dial-up at laughably slow speeds, and disks had limited capacity, so data compression was crucial. PNG, JPEG, and GIF are still familiar today, but there were others.
The JBIG2 format was designed for compressing monochrome images (where pixels are only black or white). It was used in high-end office scanners like the Xerox WorkCentre.
- If you used the “scan to PDF” function on such a device 10–20 years ago, your resulting PDF likely contained a JBIG2 stream.
Remarkably, these files, even at decent scan resolutions, were only a few kilobytes. JBIG2 uses two methods to achieve such powerful compression. Let’s discuss them—they’re directly related to the iMessage exploit!
Technique 1: Segmentation and Substitution
A text document, especially in languages with small alphabets (like English or Russian), consists of a fairly small set of symbols repeated many times. Letters, diacritics, punctuation, and other marks are called glyphs.
JBIG2 tries to segment each page into glyphs, then uses simple pattern matching to identify glyphs that look the same.
- Pattern matching allows finding all forms, e.g., all the letter “e”s.
JBIG2 doesn’t know anything about the glyphs themselves and doesn’t try to recognize or match them to an alphabet (no OCR). The JBIG2 encoder just finds connected pixel regions and groups similar ones. The compression algorithm then replaces all sufficiently similar regions with a copy of just one of them.
- Replacing all instances with a single glyph copy achieves high compression ratios.
The text remains readable, but the amount of stored data is reduced. Instead of storing pixel data for the whole page, only the compressed “reference glyph” for each symbol and the relative coordinates for its copies are needed. During decompression, the algorithm places the glyphs in the right spots, as if drawing them on a canvas.
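The decompression side of this scheme can be sketched as follows. This is a toy model rather than a JBIG2 decoder: the types `Bitmap` and `Placement` and the function `renderPage` are made up for illustration, and pixels are stored as characters (`#` for black, `.` for white) instead of packed bits.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy bitmap: one string per row, '#' is a black pixel, '.' is a white one.
using Bitmap = std::vector<std::string>;

// One entry of the page description: which reference glyph to draw and where.
struct Placement {
    std::size_t glyph; // index into the glyph dictionary
    std::size_t x;     // left edge on the page
    std::size_t y;     // top edge on the page
};

// Rebuild a page by stamping reference glyphs onto a blank canvas.
Bitmap renderPage(std::size_t width, std::size_t height,
                  const std::vector<Bitmap>& dictionary,
                  const std::vector<Placement>& placements) {
    Bitmap page(height, std::string(width, '.'));
    for (const Placement& p : placements) {
        if (p.glyph >= dictionary.size()) continue; // ignore invalid references
        const Bitmap& g = dictionary[p.glyph];
        for (std::size_t row = 0; row < g.size(); ++row) {
            for (std::size_t col = 0; col < g[row].size(); ++col) {
                const std::size_t py = p.y + row;
                const std::size_t px = p.x + col;
                // Only black pixels are drawn, and only inside the page.
                if (py < height && px < width && g[row][col] == '#') {
                    page[py][px] = '#';
                }
            }
        }
    }
    return page;
}
```

The page is stored as one small glyph plus a list of coordinates, however many times the glyph appears. Note the bounds check before every write: the exploit described later works by defeating the equivalent check in the real decoder.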
This approach has a major drawback: a poor encoder might confuse similar-looking symbols, with unfortunate consequences. In his blog, David Kriesel gives striking examples where scanned PDF invoices and drawings had numbers swapped. For our purposes, these issues aren’t important, except to explain why JBIG2 is nearly extinct.
Technique 2: Refinement Coding
The result of substitution-based compression is lossy. After compression and decompression, the output won’t exactly match the input. JBIG2 also supports lossless compression, which includes an intermediate “less lossy” step.
Here, additional information about the difference between the substituted glyph and the original is stored—also compressed. For example, the encoder saves a difference mask, which is then XORed with the substituted symbol during decompression to restore the exact pixels of the original symbol.
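The XOR step can be illustrated with a minimal sketch. It assumes a simplified layout that is not part of JBIG2 itself: each glyph row is a single byte (8 pixels), and the helper names `makeDiffMask` and `applyDiffMask` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Row = std::uint8_t; // assumption: 8 pixels per row, one bit per pixel

// Encoder side: the mask has a 1 bit wherever the substituted glyph
// differs from the original glyph.
std::vector<Row> makeDiffMask(const std::vector<Row>& original,
                              const std::vector<Row>& substitute) {
    std::vector<Row> mask;
    for (std::size_t i = 0; i < original.size() && i < substitute.size(); ++i) {
        mask.push_back(static_cast<Row>(original[i] ^ substitute[i]));
    }
    return mask;
}

// Decoder side: XORing the mask onto the substituted glyph flips exactly
// the differing pixels and restores the original.
std::vector<Row> applyDiffMask(const std::vector<Row>& substitute,
                               const std::vector<Row>& mask) {
    std::vector<Row> restored;
    for (std::size_t i = 0; i < substitute.size() && i < mask.size(); ++i) {
        restored.push_back(static_cast<Row>(substitute[i] ^ mask[i]));
    }
    return restored;
}
```

Because most pixels of a substituted glyph already match, the mask is mostly zeros and compresses well.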
Instead of encoding the entire difference at once, this can be done in steps, using logical operators (AND, OR, XOR, XNOR) at each iteration to set, reset, or toggle bits. Each refinement step brings the result closer to the original, allowing control over quality loss during compression. The implementation is very flexible. There’s also the ability to read values already present in the output workspace. As you might guess, this leads us to Turing completeness… But first, let’s discuss a critical vulnerability.
The JBIG2 Stream and the Vulnerability
Most of the CoreGraphics PDF decoder is Apple’s proprietary code, but the JBIG2 implementation comes from the open-source Xpdf project.
The JBIG2 format is a set of segments, which can be seen as a series of drawing commands executed sequentially in one pass. The CoreGraphics JBIG2 parser supports 19 different segment types, including operations like defining a new page, decoding a Huffman table, and rendering a bitmap image at specified coordinates.
Segments are represented by the JBIG2Segment class and its subclasses JBIG2Bitmap and JBIG2SymbolDict. JBIG2Bitmap is a rectangular pixel array, with a data field pointing to the backing buffer for rendering. JBIG2SymbolDict groups bitmaps. The target page is represented as a JBIG2Bitmap and consists of individual glyphs. Segments can be referenced by number, and a vector type GList stores pointers to all segments. To find a segment by its number, GList is scanned sequentially.
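A sequential lookup of this kind can be sketched as below. This is a simplified stand-in and not the Xpdf source: `Segment` and `findSegmentByNumber` are hypothetical names, and `std::vector` plays the role of GList.

```cpp
#include <vector>

// Simplified stand-in for a segment: only the fields the lookup needs.
struct Segment {
    unsigned segNum; // number other segments use to refer to this one
    int type;        // placeholder for the segment type tag
};

// Scan the pointer table front to back and return the first segment whose
// number matches, or nullptr if there is none.
const Segment* findSegmentByNumber(const std::vector<const Segment*>& segments,
                                   unsigned segNum) {
    for (const Segment* s : segments) {
        if (s != nullptr && s->segNum == segNum) {
            return s;
        }
    }
    return nullptr;
}
```

The lookup trusts whatever pointers are stored in the table, which is why overwriting entries in that table is so useful to an attacker.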
The vulnerability is a classic integer overflow when matching reference segments. Here’s a simplified code snippet:
```cpp
Guint numSyms; // (1)

numSyms = 0;
for (i = 0; i < nRefSegs; ++i) {
  if ((seg = findSegment(refSegs[i]))) {
    if (seg->getType() == jbig2SegSymbolDict) {
      numSyms += ((JBIG2SymbolDict *)seg)->getSize(); // (2)
    } else if (seg->getType() == jbig2SegCodeTable) {
      codeTables->append(seg);
    }
  } else {
    error(errSyntaxError, getPos(),
          "Invalid segment reference in JBIG2 text region");
    delete codeTables;
    return;
  }
}

// Get the symbol bitmaps
syms = (JBIG2Bitmap **)gmallocn(numSyms, sizeof(JBIG2Bitmap *)); // (3)

kk = 0;
for (i = 0; i < nRefSegs; ++i) {
  if ((seg = findSegment(refSegs[i]))) {
    if (seg->getType() == jbig2SegSymbolDict) {
      symbolDict = (JBIG2SymbolDict *)seg;
      for (k = 0; k < symbolDict->getSize(); ++k) {
        syms[kk++] = symbolDict->getBitmap(k); // (4)
      }
    }
  }
}
```
The variable numSyms is a 32-bit integer (see (1)). By referencing specially crafted symbol dictionary segments several times (2), an attacker can make numSyms overflow and wrap around to a small, controlled value. This value is then used as the element count for the heap allocation (3), so syms points to an undersized buffer. In the inner loop (4), JBIG2Bitmap pointers are written into the too-small syms buffer.
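The wraparound itself is easy to reproduce in isolation. The sketch below is not exploit code and uses made-up dictionary sizes; `sumSizes32`, `sumSizes64`, and `checkedAdd32` are hypothetical helpers that contrast a 32-bit counter with a wider one and show one possible guard.

```cpp
#include <cstdint>
#include <vector>

// Accumulate sizes in a 32-bit unsigned counter, as numSyms does.
// Unsigned arithmetic wraps modulo 2^32, so the total can end up tiny.
std::uint32_t sumSizes32(const std::vector<std::uint32_t>& sizes) {
    std::uint32_t total = 0;
    for (std::uint32_t s : sizes) {
        total += s;
    }
    return total;
}

// The same sum in 64 bits shows how many entries will really be written.
std::uint64_t sumSizes64(const std::vector<std::uint32_t>& sizes) {
    std::uint64_t total = 0;
    for (std::uint32_t s : sizes) {
        total += s;
    }
    return total;
}

// One possible guard: refuse an addition that would wrap.
bool checkedAdd32(std::uint32_t a, std::uint32_t b, std::uint32_t& out) {
    if (a > UINT32_MAX - b) {
        return false; // would overflow
    }
    out = a + b;
    return true;
}
```

With the illustrative sizes 0x80000000, 0x80000000, and 3, the 32-bit total is 3 while the true total is 0x100000003, so a buffer sized for three pointers would be asked to hold more than four billion.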
Without extra tricks, the copy loop (4) would write over 32 GB of data into an undersized buffer, causing a crash. To avoid this, the heap is groomed so that the first few writes past the end of syms land in the GList backing buffer. GList stores all known segments and is used by findSegment to map the segment numbers in refSegs to JBIG2Segment pointers. The overflow therefore overwrites JBIG2Segment pointers in GList with JBIG2Bitmap pointers (4).
Since JBIG2Bitmap inherits from JBIG2Segment, the virtual call seg->getType() works even on devices with pointer authentication (used for weak type checking of virtual calls). But the returned type is no longer jbig2SegSymbolDict, so further writes stop (4), limiting memory corruption.
Unlimited Access
After the segments in GList are corrupted, the attacker moves on to corrupting the JBIG2Bitmap object representing the current page (where drawing commands render the image). In short, JBIG2Bitmap is a wrapper for the backing buffer, storing the buffer’s width and height (in bits), and a line value indicating how many bytes are stored per line.
If refSegs is carefully structured, the overflow can be stopped after writing exactly three more JBIG2Bitmap pointers past the end of the GList segment buffer. These writes overwrite the vtable pointer and the first four fields of the JBIG2Bitmap representing the current page.
Due to iOS’s address space layout, these pointers are likely to lie in the second 4 GB of virtual memory (addresses from 0x100000000 to 0x1ffffffff). iOS devices are little-endian, so the w and line fields are likely overwritten with 0x1 (the most significant half of a JBIG2Bitmap pointer), while the segNum and h fields are overwritten with the least significant half: a fairly random value that depends on heap layout and ASLR, somewhere between 0x100000 and 0xffffffff.
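How one 64-bit pointer turns into two 32-bit field values can be shown with a short sketch. The address used in the example is made up, the exact field layout of JBIG2Bitmap is not reproduced here, and `splitPointer` is a hypothetical helper.

```cpp
#include <cstdint>

// The two 32-bit values that a 64-bit pointer occupies in memory.
// On a little-endian machine the low half sits at the lower address.
struct Halves {
    std::uint32_t low;  // least significant 32 bits
    std::uint32_t high; // most significant 32 bits
};

Halves splitPointer(std::uint64_t address) {
    Halves h;
    h.low = static_cast<std::uint32_t>(address & 0xffffffffULL);
    h.high = static_cast<std::uint32_t>(address >> 32);
    return h;
}
```

For an illustrative heap address such as 0x123456780, the high half is 0x1 and the low half is 0x23456780. A 32-bit field that receives the low half ends up holding a very large number.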
As a result, the JBIG2Bitmap for the target page gets an excessively large h value. Since this value is used for bounds checking and should reflect the allocated buffer size, the effect is to expand the working area for image output. This means subsequent JBIG2 segment commands can read and write memory beyond the original buffer boundaries.
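Why a corrupted height defeats the check can be seen in a small model. Everything here is an assumption made for illustration: the `PageView` struct, its field types, and the helper functions are not taken from Xpdf, and the code only computes offsets without touching memory.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified model of a page bitmap header plus the real allocation size.
struct PageView {
    std::uint32_t w;            // width in pixels (bits)
    std::uint32_t h;            // height in rows
    std::uint32_t line;         // bytes per row
    std::size_t allocatedBytes; // true size of the backing buffer
};

// The check a renderer would perform: it trusts w and h.
bool passesBoundsCheck(const PageView& p, std::uint32_t x, std::uint32_t y) {
    return x < p.w && y < p.h;
}

// Byte offset of pixel (x, y) from the start of the backing buffer.
std::size_t byteOffset(const PageView& p, std::uint32_t x, std::uint32_t y) {
    return static_cast<std::size_t>(y) * p.line + x / 8;
}

// Ground truth: does that offset actually fall inside the allocation?
bool insideAllocation(const PageView& p, std::uint32_t x, std::uint32_t y) {
    return byteOffset(p, x, y) < p.allocatedBytes;
}
```

For an honest 64x16 page with 8 bytes per row (128 bytes in total), row 1000 is rejected by the check. If h is replaced with a value like 0x23456780, the same coordinate passes the check even though its offset, 8000 bytes, lies far outside the 128-byte buffer.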
The heap allocator also places the current page’s backing buffer just below the undersized syms buffer, so when the page’s JBIG2Bitmap is unbounded, it can read and write its own fields.
By drawing four-byte bitmaps at the right coordinates, it’s possible to write to all fields of the page’s JBIG2Bitmap. By carefully choosing new values for w, h, and line, the attacker can then write to arbitrary offsets from the page’s backing buffer.
At this point, it would also be possible to write to absolute memory addresses, if their offsets from the page’s backing buffer were known. But how can these offsets be calculated? So far, the exploit has acted much like a traditional scripting-language exploit. In JavaScript, this stage could end with an unbounded ArrayBuffer giving access to memory, which opens the way to arbitrary code execution. How can the same be done in a single-pass image parser?
Another Compression Format—Turing Complete!
Recall that the sequence of refinement steps in JBIG2 is very flexible. Refinement steps can reference both the output bitmap and any previously created segments, and can render output either on the current page or a segment. By cleverly manipulating context-dependent refinement decompression, it’s possible to create segment sequences where only the refinement combination operators have an effect.
In practice, this means logical operators AND, OR, XOR, and XNOR can be applied between memory regions at arbitrary offsets in the current page’s JBIG2Bitmap buffer. With no restrictions, these logical operations can be performed on memory at arbitrary offsets beyond the buffer’s boundaries.
This complicates things, but with effort, it’s possible to work with individual bits instead of glyphs. As input, a set of JBIG2 segment commands can be provided that implement a sequence of logical bitwise operations on the page. Since the page buffer is unbounded, these bitwise operations can manipulate arbitrary memory.
With the logical operators AND, OR, XOR, and XNOR available, any computable function can be implemented.
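To make this concrete, here is a sketch of arithmetic built only from those four operators. It is an ordinary ripple-carry adder and an equality comparator written as C++ functions on single bits. It illustrates the principle and is not a reconstruction of NSO’s circuit; all function names are made up.

```cpp
#include <cstdint>

// The only primitives used below, mirroring the four combination operators.
bool gateAnd(bool a, bool b)  { return a && b; }
bool gateOr(bool a, bool b)   { return a || b; }
bool gateXor(bool a, bool b)  { return a != b; }
bool gateXnor(bool a, bool b) { return a == b; }

// 64-bit ripple-carry adder: one full adder per bit position.
std::uint64_t add64(std::uint64_t a, std::uint64_t b) {
    std::uint64_t result = 0;
    bool carry = false;
    for (int i = 0; i < 64; ++i) {
        const bool x = ((a >> i) & 1u) != 0;
        const bool y = ((b >> i) & 1u) != 0;
        const bool partial = gateXor(x, y);
        const bool sum = gateXor(partial, carry);
        // The carry out uses the carry in, so it is updated after the sum.
        carry = gateOr(gateAnd(x, y), gateAnd(partial, carry));
        if (sum) {
            result |= (std::uint64_t{1} << i);
        }
    }
    return result; // the final carry is dropped, as in normal 64-bit addition
}

// 64-bit equality comparator: every bit pair must match.
bool equal64(std::uint64_t a, std::uint64_t b) {
    bool all = true;
    for (int i = 0; i < 64; ++i) {
        const bool x = ((a >> i) & 1u) != 0;
        const bool y = ((b >> i) & 1u) != 0;
        all = gateAnd(all, gateXnor(x, y));
    }
    return all;
}
```

Negation also comes for free, since XNOR of a bit with a constant 0 inverts that bit. In the exploit, each such gate corresponds to JBIG2 segment commands operating on bits in memory rather than to a C++ function call.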
Conclusion
JBIG2 doesn’t support scripting, but combined with the vulnerability, it can emulate circuits of arbitrary logic gates operating on arbitrary memory. So why not use this to create a custom computer architecture for running your own scripts?
This is exactly what the NSO exploit does. It implements a small computer architecture using more than 70,000 segment commands. It has registers and a full 64-bit adder and comparator, which are used to search memory and perform arithmetic. It’s not as fast as JavaScript, but similar results can be achieved.
The bootstrapping operations for the sandbox escape are written to run entirely on this bizarre emulated logic circuit, created from a single pass of JBIG2 stream decompression.
While NSO’s activities raise ethical questions, you have to admire the ingenuity of this scheme!