Self-Modification - Welcome to UnHaxed

Hello folks, a bit late, but happy new year to everyone! Life has been busy lately.

This writeup looks at an old but still effective trick: using FPU instructions to recover the instruction pointer, followed by a polymorphic XOR decoder that unpacks the real payload at runtime. Despite its age, this technique still shows up in malware samples, including ones derived from Metasploit’s Shikata Ga Nai encoder. Static analysis tools tend to hate it, and for good reason: the real payload does not exist in readable form until the stub has run. Let’s walk through the decoder stub and see what’s going on.

Using the FPU to Find EIP

Instead of relying on the usual call/pop trick, the malware gets its own address using the floating-point unit:

Snippet

At first glance, fcmovbe looks pointless, and as arithmetic it mostly is. Its only real job is to be an FPU instruction: once it executes, the FPU records its address as the last FPU instruction pointer. The interesting part is fnstenv.

fnstenv saves the FPU environment to memory. Buried inside that structure is the instruction pointer of the last FPU instruction (here, the fcmovbe), stored at offset 0x0C. The environment is written to the stack so that this field ends up on top, so when pop ebx executes, EBX holds an address inside the decoder stub itself. No call or jump is needed, unlike the usual GetPC patterns.
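To make the offset arithmetic concrete, here is a small Python model of the store-then-pop sequence. It is only a sketch: it assumes the 28-byte 32-bit protected-mode environment layout and that the stub stores the environment at [esp-0Ch] (the usual Shikata Ga Nai choice, not confirmed for this sample). The helper names build_env and fnstenv_then_pop are made up for illustration.

```python
import struct

FENV_SIZE = 28     # assumed: 32-bit protected-mode FNSTENV image (7 dwords)
FIP_OFFSET = 0x0C  # FPU instruction pointer field inside that image


def build_env(fip, control=0x037F, status=0, tag=0xFFFF):
    """Pack a fake FPU environment; fields after the FIP are zeroed."""
    return struct.pack("<7I", control, status, tag, fip, 0, 0, 0)


def fnstenv_then_pop(stack, esp, env):
    """Model `fnstenv [esp-0Ch]` followed by `pop ebx` on a byte-array stack."""
    base = esp - FIP_OFFSET              # environment starts 0x0C below ESP
    stack[base:base + FENV_SIZE] = env   # so the FIP field lands exactly at [esp]
    (ebx,) = struct.unpack_from("<I", stack, esp)
    return ebx


stack = bytearray(64)
esp = 32                                 # arbitrary index standing in for ESP
fake_fip = 0x00401000                    # hypothetical address of the fcmovbe
ebx = fnstenv_then_pop(stack, esp, build_env(fake_fip))
print(hex(ebx))  # 0x401000
```

The point of the model is that nothing clever happens in the pop itself. The whole trick is choosing the store address so that the one interesting field is the dword on top of the stack.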

The XOR Decryption Loop

Once the malware knows where it’s executing, it starts decrypting itself:

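mov     edi, 673CF4h        ; initial XOR key
mov     cx, 0B7h            ; loop counter: 0xB7 iterations
sub     ebx, 0FFFFFFFCh     ; ebx += 4
xor     [ebx+16h], edi      ; XOR one dword in place with the current key
add     edi, [ebx-1Eh]      ; key feedback: add a dword from memory to the key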

The sub ebx, 0FFFFFFFCh is just a slightly sneaky way of adding 4. EBX is being nudged forward to line up with the encrypted data.
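If the equivalence is not obvious, a quick check with 32-bit wraparound arithmetic shows it. The helpers sub32 and add32 are illustrative names, and the starting EBX value is made up.

```python
MASK32 = 0xFFFFFFFF


def sub32(a, b):
    """32-bit subtraction with wraparound, like the x86 SUB instruction."""
    return (a - b) & MASK32


def add32(a, b):
    """32-bit addition with wraparound, like the x86 ADD instruction."""
    return (a + b) & MASK32


ebx = 0x00401000                        # hypothetical pointer value
assert sub32(ebx, 0xFFFFFFFC) == add32(ebx, 4)
print(hex(sub32(ebx, 0xFFFFFFFC)))      # 0x401004
```

0xFFFFFFFC is -4 in two's complement, so subtracting it is the same as adding 4.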

The interesting part is the key update:

add edi, [ebx-1Eh]

Instead of using a static XOR key, the decoder mutates the key on every iteration based on the bytes it is processing. Because of this feedback, repeated data does not produce a repeating XOR pattern, and once the encoder picks a different starting key, an identical payload encrypts to completely different output.

This is classic Shikata Ga Nai behavior: polymorphism designed to break signatures and frustrate bulk detection.
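A rough Python model of an additive-feedback XOR scheme helps show why signatures struggle. This is a sketch under assumptions, not the exact algorithm of this sample: it works on little-endian dwords and advances the key by the decoded dword. Which dword a given stub actually feeds back depends on the generated code. The function names are invented for the example.

```python
import struct

MASK32 = 0xFFFFFFFF


def encode(plain, key):
    """XOR each dword with the key, then add the plaintext dword to the key."""
    out = bytearray()
    for (dword,) in struct.iter_unpack("<I", plain):
        out += struct.pack("<I", dword ^ key)
        key = (key + dword) & MASK32       # assumed feedback source: plaintext
    return bytes(out)


def decode(cipher, key):
    """Inverse of encode: recover each dword, then apply the same key update."""
    out = bytearray()
    for (dword,) in struct.iter_unpack("<I", cipher):
        plain = dword ^ key
        out += struct.pack("<I", plain)
        key = (key + plain) & MASK32
    return bytes(out)


payload = b"\x90\x90\x90\x90" * 4          # four identical dwords
a = encode(payload, 0x00673CF4)            # starting key from the listing above
b = encode(payload, 0x1234ABCD)            # a different, made-up starting key

assert decode(a, 0x00673CF4) == payload    # round trip works
assert a != b                              # same payload, different bytes
assert len({a[i:i + 4] for i in range(0, 16, 4)}) == 4  # no repeating dwords
```

Even a payload made of four identical dwords comes out as four different encrypted dwords, and changing the starting key changes all of them again.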

Practical Analysis Strategy

If you encounter this in the wild:

1. Let the sample run under a debugger.
2. Break after the XOR loop finishes (after 0xB7 iterations here).
3. Dump the decrypted memory region.
4. Load the dump into a fresh IDA session.
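For the dump step you need a start address and a size. The sketch below estimates both from the constants in the stub, assuming one dword is decoded per iteration, that the pointer is advanced by 4 before the first XOR (as in the listing), and that the upper half of ECX is zero so the count really is 0xB7. The function name and the example address are hypothetical, so check the result against what the debugger actually shows.

```python
def decoded_region(fip, disp=0x16, step=4, count=0xB7):
    """Estimate (start, size) of the region the XOR loop rewrites.

    fip   : address recovered by the fnstenv/pop sequence
    disp  : displacement used in `xor [ebx+16h], edi`
    step  : bytes the pointer advances per iteration
    count : loop counter loaded by `mov cx, 0B7h`
    """
    start = fip + step + disp   # pointer is bumped by 4 before the first XOR
    size = count * step         # one dword per iteration
    return start, size


start, size = decoded_region(0x00401000)   # made-up stub address
print(hex(start), size)                    # 0x40101a 732
```

With these numbers the loop touches 732 bytes, so dumping a little more than that from the computed start is a reasonable first attempt.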

Trying to statically decrypt polymorphic payloads is usually a pain. Let the malware do the work for you.

Stay tuned for more malware-related tips and tricks!