Hack.lu 2012 CTF Challenge #12 (500)

The famous zombie researcher “Donn Beach” almost created an immunization against the dipsomanie virus. This severe disease leads to the inability to defend against Zombies, later causes a complete loss of memory and finally turns you into one of them.

Inexplicably Donn forgot where he put the license key for his centrifuge. Provide him a new one and humanity will owe you a debt of gratitude for fighting one of the most wicked illnesses today.

https://ctf.fluxfingers.net/challenges/donn_beach.exe

ctf.fluxfingers.net tcp/2055

Summary: reverse engineering, VM, MiTM attack, AES

This task is solved by snk and me =D Thank you, snk!

The given file is non-packed PE x86. At first stage we needed to bruteforce easy hash function:

The final hash value is supposed to be equal  0x4B17E245

So, we got few possible strings quite fast:
KCoEun
PP_Pzn
TrKyVo
mFsdVo

But actually this string does not participate in next computation, so you can just patched this check (it is even faster than bruteforce =P )
After that the program asks to input three more numbers and initializes a virtual machine by given numbers:

vm_set_params(&vm, 5u, 
		byte_404020, 
		hash[3], 
		hash[2], 
		hash[1], 
		hash[0]);

Also for initialization it uses a pointer to bytes array (byte_404020). If we look precisely we can see that this array is S-boxes encryption table for Rijndael (AES). So it gives us some clue for further analyze.

But first lets indentificate some VM functions. Authors forgot cleared debug output from some functions and we got their names

Not much usefull information but we know that it is VM and VM have 5 parameters.

To collect information about called functions we traced a program and make call-graph.

As we can see two functions vm_set_params and vm_get_output copy memory from/to main program and VM context. Third function called once from main (vm_sub_2 0x00401C67) obfuscate vm context to MMX registers {MM0..MM1}. Virtual Machine operates own registers (or have internal memory buffer) that stored in MMX in a special way. VM registers are spread over MMX like this.

r0 = 0x14131211 (will be stored in MM1 MM2 MM0 and MM4)
r1 = 0x24232221
r2 = 0x34333231
..
r7 = 0x84838281

MM0 73 23 32 81 13 51 53 61
MM1 31 72 53 22 41 11 63 71
MM2 62 52 82 74 24 12 43 34
MM3 83 14 44 21 84 33 64 42

r0 will be stored in MM1 MM2 MM0 and MM4
1st byte r0 in 6th byte MM1
2nd byte r0 in 5th byte MM0

Function at 0x004012E3 just deubfuscate data from MM{0..3} and stored result in MMX4.

.text:004012E3 pxor    mm4, mm4           ; MM4: 00 00 00 00 00 00 00 00
.text:004012E6 pxor    mm7, mm7           ; MM7: 00 00 00 00 00 00 00 00
.text:004012E9 pcmpeqd mm7, mm7           ; MM7: FF FF FF FF FF FF FF FF
.text:004012EC palignr mm7, mm7, 0Fh      ; MM7: FF 00 00 00 00 00 00 00
.text:004012F1 palignr mm7, mm7, 1        ; MM7: 00 00 00 00 00 00 00 FF
.text:004012F6 pxor    mm6, mm6

.text:004012F9 por     mm6, mm3           ; MM6 = MM3
.text:004012FC palignr mm7, mm7, 6        ; MM7: 00 FF 00 00 00 00 00 00
.text:00401301 pand    mm6, mm7           ; get 2nd byte from MM3
.text:00401304 palignr mm6, mm6, 6        ; shift value to 4th byte
.text:00401309 por     mm4, mm6           ; MM4 |= MM6  -> MM4[4] = MM3[3]
...
.text:0040131F por     mm4, mm6           ; MM4[3] = MM0[5]
...
.text:00401335 por     mm4, mm6           ; MM4[2] = MM2[6]
...
.text:0040134B por     mm4, mm6           ; MM4[1] = MM1[6]

So by same analysis we was identificate functions group that load and store VM registers (functions on lowest layer on the picture). Rest function was identified by same way, for example function at 0x004060B2 just do:
hacklu12:004060B2 ; reg(op[1]) += reg(op[2])
hacklu12:004060B2 ; reg7 += 2

or function at 0x00406B8F:
hacklu12:00406B8F ; reg6 += 4
hacklu12:00406B8F ; reg(op[1]) = [reg6-4]
hacklu12:00406B8F ; reg7 += 2

reg6 – VM stack pointer
reg7 – VM intruction pointer

In result we got disassembler for VM with the following instructions:
0x004060B2: [‘add‘,dis_add],
0x004062AA: [‘xor‘,dis_xor],
0x0040649A: [‘push r’,dis_push_r],
0x0040681A: [‘push #i’,dis_push_i],
0x00406B8F: [‘pop r’,dis_pop_r],
0x00406F18: [‘never called’,None],
0x004070F7: [‘r = [r]‘,dis_load],
0x004072DA: [‘shr‘,dis_shr],
0x004074C2: [‘mov’,dis_mov],
0x0040769C: [‘and‘,dis_and],
0x0040788E: [‘shl‘,dis_shl],

After disassembling we got pseudo code:

 r1 = 0x4B17E245
 r2 = arg[1]
 r3 = arg[2]
 r4 = arg[3]
 
 ; SubBytes
 ; TL = Table lookup
 r1 = TL(r1)
 r2 = TL(r2)
 r3 = TL(r3)
 r4 = TL(r4)
 
 ; ShiftRows
 r2 = (r2<<8) ^ (r2>>24)
 r3 = (r3 >> 16) ^ (r3 << 16)
 r4 = (r4>>8) ^ (r4<<24)
 
 ; XORs round 1
 r0 = r1
 r1 ^= r2
 r2 ^= r3
 r3 ^= r4
 r4 ^= r0
 
 ; SubBytes
 r1 = TL(r1)
 r2 = TL(r2) 
 r3 = TL(r3) 
 r4 = TL(r4 ^ CONST)
 
 ; ShiftRows
 r2 = (r2 << 8) ^ (r2 >> 24)  ; ROL(r2,8)
 ; reverce: (r2<<24 | r4>>8)
 r3 = (r3 >> 16) ^ (r3 << 16) ; ROL(r2,16)
 ; reverce: (r3>>16 | r3<<16)
 r4 = (r4 >> 8) ^ (r4 << 24)  ; ROL(r2,24)
 ; reverce: (r4>>24 | r4<<8)
 
 ; XORs round 2
 r0 = r1
 r1 ^= r2
 r2 ^= r3
 r3 ^= r4
 r4 ^= r0
 
 r1 == 0x1020304
 r2 == 0x5060708
 r3 == 0x9101112
 r4 == 0xD14151E

After we analyzed this pseudo-code we understood that two round of AES encryption (without MixColumn stage) were implemented by this VM.
So, because we had the one correct input parameter (0x4B17E245) and four output parameters (0x1020304, 0x5060708, 0x9101112, 0xD14151E), which the program requires for succeed execution, we decided to do mitm attack to bruteforce possible input values.
So, the first step of the attack is to pre-calculate r0:

r1 = 0x4B17E245
 r2 = arg[1]
 r3 = arg[2]
 r4 = arg[3]

 ; SubBytes
 ; TL = Table lookup
 r1 = TL(r1)
 r2 = TL(r2)
 r3 = TL(r3)
 r4 = TL(r4)

 ; ShiftRows
 r2 = (r2<>24)
 r3 = (r3 >> 16) ^ (r3 << 16)  r4 = (r4>>8) ^ (r4<

The second step is to bruteforce r0 which uses in last xor operation:

; XORs round 2
 r0 = r1
 r1 ^= r2
 r2 ^= r3
 r3 ^= r4
 r4 ^= r0

The conditions for successful stop is when in XORs round1 the pre-calculated r0 is equal to bruted r1

; XORs round 1
r0 == r1

The code of the bruteforce program is below. By using this program we got four keys in ~30 seconds.

#include "header.h"
#include <stdio.h>
 
u32 prepare_r0()
{
	u32 r1 = 0x4B17E245;
 
	u32 t0 =	Te4[(r1      ) & 0xff]       ^
			Te4[(r1 >>  8) & 0xff] <<  8 ^
			Te4[(r1 >> 16) & 0xff] << 16 ^
			Te4[(r1 >> 24)       ] << 24;
 
	return t0;
}
u32 brute(u32 r0, u32 pre_brute_value)
{
	u32 r1 = 0x1020304;
	u32 r2 = 0x5060708;
	u32 r3 = 0x9101112;
	u32 r4 = 0xD14151E;
 
	r4 ^= r0;
	r3 ^= r4;
	r2 ^= r3;
	r1 ^= r2;
 
	r4 = (r4 << 8) ^ (r4 >> 24);
	r3 = (r3 >> 16) ^ (r3 << 16);
	r2 = (r2 >> 8) ^ (r2 << 24);
 
	u32 t[4] = {0,0,0,0};
 
    t[0] =	Td4[(r1      ) & 0xff]       ^
	Td4[(r1 >>  8) & 0xff] <<  8 ^
	Td4[(r1 >> 16) & 0xff] << 16 ^
	Td4[(r1 >> 24)       ] << 24;
    t[1] =	Td4[(r2      ) & 0xff]       ^
	Td4[(r2 >>  8) & 0xff] <<  8 ^
	Td4[(r2 >> 16) & 0xff] << 16 ^
	Td4[(r2 >> 24)       ] << 24;
    t[2] =	Td4[(r3      ) & 0xff]       ^
	Td4[(r3 >>  8) & 0xff] <<  8 ^
	Td4[(r3 >> 16) & 0xff] << 16 ^
	Td4[(r3 >> 24)       ] << 24;
    t[3] =	Td4[(r4      ) & 0xff]       ^
	Td4[(r4 >>  8) & 0xff] <<  8 ^
	Td4[(r4 >> 16) & 0xff] << 16 ^
	Td4[(r4 >> 24)       ] << 24;
 
	r1 = t[0];
	r2 = t[1];
	r3 = t[2];
	r4 = t[3];
 
	r4 ^= pre_brute_value;
	r3 ^= r4;
	r2 ^= r3;
	r1 ^= r2;
 
	return r1 == pre_brute_value;
}
 
u32 brute_check(u32 r0, u32 pre_brute_value)
{
	u32 r1 = 0x1020304;
	u32 r2 = 0x5060708;
	u32 r3 = 0x9101112;
	u32 r4 = 0xD14151E;
 
	r4 ^= r0;
	r3 ^= r4;
	r2 ^= r3;
	r1 ^= r2;
 
	r4 = (r4 << 8) ^ (r4 >> 24);
	r3 = (r3 >> 16) ^ (r3 << 16);
	r2 = (r2 >> 8) ^ (r2 << 24);
 
	u32 t[4] = {0,0,0,0};
 
    t[0] =	Td4[(r1      ) & 0xff]       ^
	Td4[(r1 >>  8) & 0xff] <<  8 ^
	Td4[(r1 >> 16) & 0xff] << 16 ^
	Td4[(r1 >> 24)       ] << 24;
    t[1] =	Td4[(r2      ) & 0xff]       ^
	Td4[(r2 >>  8) & 0xff] <<  8 ^
	Td4[(r2 >> 16) & 0xff] << 16 ^
	Td4[(r2 >> 24)       ] << 24;
    t[2] =	Td4[(r3      ) & 0xff]       ^
	Td4[(r3 >>  8) & 0xff] <<  8 ^
	Td4[(r3 >> 16) & 0xff] << 16 ^
	Td4[(r3 >> 24)       ] << 24;
    t[3] =	Td4[(r4      ) & 0xff]       ^
	Td4[(r4 >>  8) & 0xff] <<  8 ^
	Td4[(r4 >> 16) & 0xff] << 16 ^
	Td4[(r4 >> 24)       ] << 24;
 
	r1 = t[0];
	r2 = t[1];
	r3 = t[2];
	r4 = t[3];
 
	r4 ^= pre_brute_value;
	r3 ^= r4;
	r2 ^= r3;
	r1 ^= r2;
 
	r2 = (r2>>8) ^ (r2<<24);
	r3 = (r3 >> 16) ^ (r3 << 16);
	r4 = (r4<<8) ^ (r4>>24);
 
	t[0] =	Td4[(r1      ) & 0xff]       ^
	Td4[(r1 >>  8) & 0xff] <<  8 ^
	Td4[(r1 >> 16) & 0xff] << 16 ^
	Td4[(r1 >> 24)       ] << 24;
    t[1] =	Td4[(r2      ) & 0xff]       ^
	Td4[(r2 >>  8) & 0xff] <<  8 ^
	Td4[(r2 >> 16) & 0xff] << 16 ^
	Td4[(r2 >> 24)       ] << 24;
    t[2] =	Td4[(r3      ) & 0xff]       ^
	Td4[(r3 >>  8) & 0xff] <<  8 ^
	Td4[(r3 >> 16) & 0xff] << 16 ^
	Td4[(r3 >> 24)       ] << 24;
    t[3] =	Td4[(r4      ) & 0xff]       ^
	Td4[(r4 >>  8) & 0xff] <<  8 ^
	Td4[(r4 >> 16) & 0xff] << 16 ^
	Td4[(r4 >> 24)       ] << 24;
 
	r1 = t[0];
	r2 = t[1];
	r3 = t[2];
	r4 = t[3];
 
	printf("%x-%x-%x\n", r2, r3, r4);
 
	return r1 == pre_brute_value;
}
int main(int argc, char* argvp[])
{
	u32 pre_bruted_r0 = prepare_r0();
	for (u32 i = 0; i < 0xffffffff; i++)
		if (brute(i, pre_bruted_r0))
			brute_check(i, pre_bruted_r0);
}

Keys:

b6b09bf0-f23daa06-ac4ee747
ee25052c-e6d61a0d-45e9cfe0
2b94f8c8-e1db8003-f6edf54
e5304760-47b7c45f-f59a8f29

2 comments

  1. I guess “XORs round” is MixColumns stage indeed, but with another, simple polynomial: c(x) = x + 1 instead of c(x) = 3x**3 + x**2 + x + 2
    And this poly is not inversible so you can’t simply reverse it, so bruteforce is ok :)

    • rm -rf on April 11, 2013 at 02:37
    • Reply

    How did you calculate keys
    ee25052c-e6d61a0d-45e9cfe0
    2b94f8c8-e1db8003-f6edf54
    ???

    I tried 3 ways (and your code too) to decode this AES, but only found
    b6b09bf0-f23daa06-ac4ee747
    e5304760-47b7c45f-f59a8f29

    I believe there are only two key sequences possibly

    P.S.
    Thank you so much for this write-up!

Leave a Reply

Your email address will not be published.