Dissecting a Highly Obfuscated RTF Exploit
A technical dive into a sophisticated CVE-2017-11882 RTF exploit featuring dynamic XOR encryption, position-independent shellcode, and multi-layer obfuscation techniques.
Today we’re going deep on a CVE-2017-11882 exploit that’s a masterclass in evasion. This sample uses dynamic XOR encryption, position-independent shellcode, and enough obfuscation to slow down any analyst.
The Target: Microsoft’s Ancient Equation Editor
CVE-2017-11882 targets a stack-based buffer overflow in Microsoft Equation Editor (EQNEDT32.EXE). We’re dealing with legacy code that predates modern memory protections like ASLR and DEP.
The vulnerability is simple: when processing font names in embedded equation objects, the code copies user data into a 40-byte stack buffer without length validation. Game over.
Buffer Layout:
┌─────────────────┐
│ Return Address │ ← Overwritten to jump to shellcode
├─────────────────┤
│ Saved EBP │
├─────────────────┤
│ Font Name │ ← Buffer overflow starts here
└─────────────────┘
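To make the layout concrete, here is a minimal Python sketch that models the frame as a flat byte array. The 40-byte buffer size comes from the description above; the 4-byte saved EBP and return address slots assume a standard 32-bit frame, and the helper `unchecked_copy` and all addresses are purely illustrative.

```python
import struct

BUF_SIZE = 40  # font name buffer size, as described above

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Model a copy with no length check: it writes past the buffer into the frame."""
    n = min(len(data), len(frame))  # the model stops at the frame end; real code would not
    frame[0:n] = data[:n]

# Buffer followed by placeholder values for saved EBP and the return address.
frame = bytearray(BUF_SIZE) + struct.pack("<II", 0x0012F000, 0x00401000)

# 40 bytes fill the buffer, 4 bytes clobber saved EBP, the last 4 replace the return address.
font_name = b"A" * BUF_SIZE + b"B" * 4 + struct.pack("<I", 0x41414141)
unchecked_copy(frame, font_name)

saved_ebp, ret_addr = struct.unpack_from("<II", frame, BUF_SIZE)
print(hex(saved_ebp), hex(ret_addr))
```

Anything past byte 44 of the font name ends up in the return address slot, which is why the exploit needs no more than a short, carefully sized string.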
RTF Container: The Art of Fragmentation
The malicious document uses RTF as its container, but not in any straightforward way. The attackers fragmented the embedded OLE object:
# RTF structure discovered via rtfdump
rtfdump.py sample.rtf
# Stream 3: Main objdata
The hex data is nibble-misaligned, meaning standard extraction tools fail. You need the -S (hex-shift) parameter to reconstruct the object properly:
# This fails
rtfdump.py -s 3 -H sample.rtf -d > failed.bin
# This works - corrects the nibble alignment
rtfdump.py -s 3 -S -H -c 0x64: sample.rtf -d > success.bin
Fragmentation breaks signature-based detection, while nibble corruption may defeat automated analysis tools.
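The effect of a one-nibble misalignment is easy to reproduce. The sketch below is not rtfdump's implementation; it only assumes that the misalignment amounts to a single stray hex digit in front of the data, and `decode_hex` is a hypothetical helper.

```python
def decode_hex(hex_text: str, shift: bool = False) -> bytes:
    """Decode a hex string, optionally dropping the first nibble to realign it."""
    if shift:
        hex_text = hex_text[1:]
    if len(hex_text) % 2:  # pad a trailing half-byte so decoding never fails
        hex_text += "0"
    return bytes.fromhex(hex_text)

original = b"Equation.3"
misaligned = "f" + original.hex()  # one stray nibble shifts every byte boundary

print(decode_hex(misaligned))              # misaligned: no readable class name
print(decode_hex(misaligned, shift=True))  # realigned data
```

Every byte in the misaligned decode is built from the low nibble of one real byte and the high nibble of the next, so byte-pattern signatures never match.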
OLE Object Structure: Hiding in Plain Sight
Once reconstructed, the object reveals a corrupted OLE1 Native Stream:
Offset 0x60: 02 00 00 00  0B 00 00 00  65 71 75 41 70 36 C9 E4
             [OLE1 Type]  [Class Len]  [Corrupted "Equation.3"]
The class name corruption (“Equation.3” becomes “equAp6…IoN.3”) is intentional. It’s anti-analysis obfuscation meant to break tools that match on the exact class name.
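As a rough illustration, the 16 bytes from the dump can be unpacked with Python's `struct` module. The field names follow the labels in the dump; treating both fields as little-endian 32-bit integers is an assumption consistent with the values shown (a length of 11 fits "Equation.3" plus a terminating NUL).

```python
import struct

# The 16 bytes shown at offset 0x60 in the dump above.
header = bytes.fromhex("02000000 0B000000 65717541 7036C9E4")

ole1_type, class_len = struct.unpack_from("<II", header, 0)
name_prefix = header[8:8 + class_len]  # only 8 of the 11 name bytes are visible in the dump

print(ole1_type, class_len, name_prefix)

# An exact, case-sensitive comparison against the expected class name fails...
exact_hit = name_prefix == b"Equation.3"[:len(name_prefix)]
# ...while a loose, case-insensitive check on the first characters still matches.
loose_hit = name_prefix[:3].lower() == b"equ"
print(exact_hit, loose_hit)
```

A scanner keyed on the literal string misses this object, so matching on structure (type and length fields) is more robust than matching on the name.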
The MTEF (MathType Equation Format) header follows the standard structure:
Offset 0x84: 00 00 03 7E 81 EB 47 62 01 05 3B 1E BF EC 90 00
[MTEF Ver] [Header] [Font Record Marker]
That 01 05 pattern is the font record marker triggering the vulnerable parsing code.
The Overflow: Precision Engineering
The malicious font data extends exactly to offset 0xDA, overwriting the return address with:
E9 65 01 00 00 ; JMP +0x165 (jumps to shellcode at offset 0x244)
Jump calculation: 0xDA + 5 + 0x165 = 0x244 (shellcode entry point)
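The same arithmetic can be checked mechanically. This is a small sketch that decodes an `E9` (JMP rel32) instruction; `rel32_jmp_target` is a hypothetical helper, and the surrounding buffer is empty filler rather than the real sample.

```python
import struct

def rel32_jmp_target(code: bytes, offset: int) -> int:
    """Return the destination offset of an E9 (JMP rel32) instruction at `offset`."""
    if code[offset] != 0xE9:
        raise ValueError("not a JMP rel32 opcode")
    (rel,) = struct.unpack_from("<i", code, offset + 1)  # signed 32-bit displacement
    return offset + 5 + rel  # displacement is relative to the next instruction

# Place the five bytes from the sample at offset 0xDA of an otherwise empty buffer.
blob = bytearray(0x300)
blob[0xDA:0xDA + 5] = bytes.fromhex("E965010000")

target = rel32_jmp_target(blob, 0xDA)
print(hex(target))
```

The `+ 5` term is the length of the jump instruction itself, since x86 relative jumps are measured from the address of the following instruction.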
This lands at the position-independent code prologue:
seg000:00000244 E8 00 00 00 00 call $+5 ; Push return address
seg000:00000249 5A pop edx ; Get current EIP
seg000:0000024A 81 C2 36 01 00 00 add edx, 136h ; Calculate data offset
Classic PIC technique—the shellcode calculates its runtime location without hardcoded addresses.
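A short sketch shows why this works at any load address. It assumes the offsets in the listing are relative to wherever the shellcode happens to land; the base addresses are arbitrary examples and `pic_data_pointer` is a hypothetical name.

```python
def pic_data_pointer(load_base: int) -> int:
    """Follow the call/pop/add prologue for shellcode loaded at `load_base`."""
    call_addr = load_base + 0x244     # address of `call $+5`
    edx = call_addr + 5               # `call` pushes the next address; `pop edx` retrieves it
    edx = (edx + 0x136) & 0xFFFFFFFF  # `add edx, 136h`
    return edx

for base in (0x00000000, 0x0012F000, 0x02A40000):  # arbitrary example bases
    print(hex(base), hex(pic_data_pointer(base) - base))
```

Whatever the base, EDX ends up the same distance from the start of the blob, so the data it points to can be found without any fixed address.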
Dynamic XOR Encryption: Adaptive Keying
Instead of static XOR, the shellcode uses dynamic key generation:
; Key evolution: ECX_new = (ECX_old × 0x787F3959) + 0x5EB59C59
seg000:000002BB imul ecx, 787F3959h ; Multiply by LCG constant
seg000:000002C1 add ecx, 5EB59C59h ; Add LCG increment
seg000:00000290 xor [edx], ecx ; XOR 4 bytes with dynamic key
seg000:00000292 add edx, 4 ; Advance pointer
Each 4-byte block uses a different XOR key. Python equivalent:
def dynamic_xor_decrypt(encrypted_data):
    decrypted = bytearray()
    ecx = 0  # Initial key state
    for i in range(0, len(encrypted_data), 4):
        ecx = ((ecx * 0x787F3959) + 0x5EB59C59) & 0xFFFFFFFF
        chunk = encrypted_data[i:i+4]
        key_bytes = ecx.to_bytes(4, 'little')
        for j in range(len(chunk)):
            decrypted.append(chunk[j] ^ key_bytes[j])
    return bytes(decrypted)
Plain string extraction finds nothing in the encrypted blob, but the key schedule is fully deterministic, so the strings can be recovered offline by reimplementing it.
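Because XOR is its own inverse and the key sequence depends only on the starting state, the same routine both encrypts and decrypts. The sketch below separates the key generator from the XOR step to make that visible. The seed of 0 mirrors the initial state assumed above and may differ in the real sample; `keystream` and `xor_blocks` are illustrative names.

```python
from itertools import islice

MULT, INC, MASK = 0x787F3959, 0x5EB59C59, 0xFFFFFFFF

def keystream(seed: int = 0):
    """Yield successive 32-bit keys from the LCG (the seed of 0 is an assumption)."""
    state = seed
    while True:
        state = (state * MULT + INC) & MASK
        yield state

def xor_blocks(data: bytes, seed: int = 0) -> bytes:
    """XOR each 4-byte little-endian block with the next key; the same call decrypts."""
    out = bytearray()
    keys = keystream(seed)
    for i in range(0, len(data), 4):
        key = next(keys).to_bytes(4, "little")
        out.extend(b ^ k for b, k in zip(data[i:i + 4], key))
    return bytes(out)

sample = "URLDownloadToFileW".encode("ascii")
encrypted = xor_blocks(sample)
print([hex(k) for k in islice(keystream(), 3)])  # first three keys
print(xor_blocks(encrypted) == sample)           # round trip restores the plaintext
```

If the real initial state is unknown, a known plaintext such as an expected API name can be XORed against the first ciphertext block to recover the first key directly.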
Shellcode Obfuscation: Junk Code
; Save registers and flags
seg000:00000308 pushf ; Save flags
seg000:00000309 push edi ; Save EDI register
seg000:0000030A push edi ; Save EDI again
; Extensive junk math operations on the saved register
seg000:0000030B lea edi, [edi+5CD7h] ; Add 0x5CD7 to EDI
seg000:00000311 lea edi, [edi+6C02h] ; Add 0x6C02 to EDI
seg000:00000317 sub edi, 4E03h ; Subtract 0x4E03
seg000:0000031D sub edi, 1D71h ; Subtract 0x1D71
seg000:00000323 sub edi, 4EA1h ; Subtract 0x4EA1
seg000:00000329 lea edi, [edi+5A49h] ; Add 0x5A49
; Restore registers and flags to original state
seg000:0000032F pop edi ; Restore original EDI
seg000:00000330 pop edi ; Restore again
seg000:00000331 popf ; Restore flags
The actual functionality—address calculation and control transfer—is buried in over 300 bytes of noise.
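One way to confirm that a block like this is dead code is to emulate it and compare register state before and after. The following is a simplified model of the listing above: flags are ignored and `run_junk_block` is a hypothetical helper, not part of any tool.

```python
MASK = 0xFFFFFFFF

def run_junk_block(edi: int):
    """Emulate the push/arithmetic/pop sequence; returns (final EDI, junk value before the pops)."""
    stack = [edi, edi]  # push edi; push edi (pushf/popf omitted for brevity)
    for delta in (+0x5CD7, +0x6C02, -0x4E03, -0x1D71, -0x4EA1, +0x5A49):
        edi = (edi + delta) & MASK  # lea/sub arithmetic, wrapping at 32 bits
    junk = edi
    edi = stack.pop()  # pop edi: the computed value is thrown away
    edi = stack.pop()  # pop edi: same saved value again
    return edi, junk

final, junk = run_junk_block(0x1000)
print(hex(final), hex(junk - 0x1000))
```

The arithmetic does not even cancel out (it nets to +0x690D), but that does not matter: the two pops discard the result, so the block leaves EDI exactly as it found it.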
The Payload: No PowerShell Required
Decrypted strings reveal the complete attack chain.
| Library | Function | Purpose |
|---|---|---|
| kernel32 | LoadLibraryW | Load additional DLLs |
| kernel32 | ExpandEnvironmentStringsW | Expand %APPDATA% path |
| UrlMon | URLDownloadToFileW | Download from C2 server |
| kernel32 | CreateProcessW | Execute payload directly |
The payload downloads petitzx.exe from the C2 server, saves it as petitgxk8523.exe in the user’s AppData folder, and executes it with CreateProcessW.
Wrapping Up
This CVE-2017-11882 exploit layers multiple evasion techniques:
- Document fragmentation breaks signature detection
- Nibble corruption defeats automated extraction
- Dynamic encryption prevents static string analysis
- Code obfuscation wastes analyst time
Stay tuned for more malware-related tips and tricks!