The famous zombie researcher “Donn Beach” almost created an immunization against the dipsomanie virus. This severe disease leads to the inability to defend against Zombies, later causes a complete loss of memory and finally turns you into one of them.
Inexplicably Donn forgot where he put the license key for his centrifuge. Provide him a new one and humanity will owe you a debt of gratitude for fighting one of the most wicked illnesses today.
https://ctf.fluxfingers.net/challenges/donn_beach.exe
ctf.fluxfingers.net tcp/2055
Summary: reverse engineering, VM, MiTM attack, AES
This task is solved by snk and me =D Thank you, snk!
The given file is non-packed PE x86. At first stage we needed to bruteforce easy hash function:
The final hash value is supposed to be equal 0x4B17E245
So, we got few possible strings quite fast:
KCoEun
PP_Pzn
TrKyVo
mFsdVo
But actually this string does not participate in next computation, so you can just patched this check (it is even faster than bruteforce =P )
After that the program asks to input three more numbers and initializes a virtual machine by given numbers:
vm_set_params(&vm, 5u, byte_404020, hash[3], hash[2], hash[1], hash[0]); |
Also for initialization it uses a pointer to bytes array (byte_404020). If we look precisely we can see that this array is S-boxes encryption table for Rijndael (AES). So it gives us some clue for further analyze.
But first lets indentificate some VM functions. Authors forgot cleared debug output from some functions and we got their names
Not much usefull information but we know that it is VM and VM have 5 parameters.
To collect information about called functions we traced a program and make call-graph.
As we can see two functions vm_set_params and vm_get_output copy memory from/to main program and VM context. Third function called once from main (vm_sub_2 0x00401C67) obfuscate vm context to MMX registers {MM0..MM1}. Virtual Machine operates own registers (or have internal memory buffer) that stored in MMX in a special way. VM registers are spread over MMX like this.
r0 = 0x14131211 (will be stored in MM1 MM2 MM0 and MM4)
r1 = 0x24232221
r2 = 0x34333231
..
r7 = 0x84838281
MM0 73 23 32 81 13 51 53 61
MM1 31 72 53 22 41 11 63 71
MM2 62 52 82 74 24 12 43 34
MM3 83 14 44 21 84 33 64 42
r0 will be stored in MM1 MM2 MM0 and MM4
1st byte r0 in 6th byte MM1
2nd byte r0 in 5th byte MM0
Function at 0x004012E3 just deubfuscate data from MM{0..3} and stored result in MMX4.
.text:004012E3 pxor mm4, mm4 ; MM4: 00 00 00 00 00 00 00 00 .text:004012E6 pxor mm7, mm7 ; MM7: 00 00 00 00 00 00 00 00 .text:004012E9 pcmpeqd mm7, mm7 ; MM7: FF FF FF FF FF FF FF FF .text:004012EC palignr mm7, mm7, 0Fh ; MM7: FF 00 00 00 00 00 00 00 .text:004012F1 palignr mm7, mm7, 1 ; MM7: 00 00 00 00 00 00 00 FF .text:004012F6 pxor mm6, mm6 .text:004012F9 por mm6, mm3 ; MM6 = MM3 .text:004012FC palignr mm7, mm7, 6 ; MM7: 00 FF 00 00 00 00 00 00 .text:00401301 pand mm6, mm7 ; get 2nd byte from MM3 .text:00401304 palignr mm6, mm6, 6 ; shift value to 4th byte .text:00401309 por mm4, mm6 ; MM4 |= MM6 -> MM4[4] = MM3[3] ... .text:0040131F por mm4, mm6 ; MM4[3] = MM0[5] ... .text:00401335 por mm4, mm6 ; MM4[2] = MM2[6] ... .text:0040134B por mm4, mm6 ; MM4[1] = MM1[6]
So by same analysis we was identificate functions group that load and store VM registers (functions on lowest layer on the picture). Rest function was identified by same way, for example function at 0x004060B2 just do:
hacklu12:004060B2 ; reg(op[1]) += reg(op[2])
hacklu12:004060B2 ; reg7 += 2
or function at 0x00406B8F:
hacklu12:00406B8F ; reg6 += 4
hacklu12:00406B8F ; reg(op[1]) = [reg6-4]
hacklu12:00406B8F ; reg7 += 2
reg6 – VM stack pointer
reg7 – VM intruction pointer
In result we got disassembler for VM with the following instructions:
0x004060B2: [‘add‘,dis_add],
0x004062AA: [‘xor‘,dis_xor],
0x0040649A: [‘push r’,dis_push_r],
0x0040681A: [‘push #i’,dis_push_i],
0x00406B8F: [‘pop r’,dis_pop_r],
0x00406F18: [‘never called’,None],
0x004070F7: [‘r = [r]‘,dis_load],
0x004072DA: [‘shr‘,dis_shr],
0x004074C2: [‘mov’,dis_mov],
0x0040769C: [‘and‘,dis_and],
0x0040788E: [‘shl‘,dis_shl],
After disassembling we got pseudo code:
r1 = 0x4B17E245 r2 = arg[1] r3 = arg[2] r4 = arg[3] ; SubBytes ; TL = Table lookup r1 = TL(r1) r2 = TL(r2) r3 = TL(r3) r4 = TL(r4) ; ShiftRows r2 = (r2<<8) ^ (r2>>24) r3 = (r3 >> 16) ^ (r3 << 16) r4 = (r4>>8) ^ (r4<<24) ; XORs round 1 r0 = r1 r1 ^= r2 r2 ^= r3 r3 ^= r4 r4 ^= r0 ; SubBytes r1 = TL(r1) r2 = TL(r2) r3 = TL(r3) r4 = TL(r4 ^ CONST) ; ShiftRows r2 = (r2 << 8) ^ (r2 >> 24) ; ROL(r2,8) ; reverce: (r2<<24 | r4>>8) r3 = (r3 >> 16) ^ (r3 << 16) ; ROL(r2,16) ; reverce: (r3>>16 | r3<<16) r4 = (r4 >> 8) ^ (r4 << 24) ; ROL(r2,24) ; reverce: (r4>>24 | r4<<8) ; XORs round 2 r0 = r1 r1 ^= r2 r2 ^= r3 r3 ^= r4 r4 ^= r0 r1 == 0x1020304 r2 == 0x5060708 r3 == 0x9101112 r4 == 0xD14151E |
After we analyzed this pseudo-code we understood that two round of AES encryption (without MixColumn stage) were implemented by this VM.
So, because we had the one correct input parameter (0x4B17E245) and four output parameters (0x1020304, 0x5060708, 0x9101112, 0xD14151E), which the program requires for succeed execution, we decided to do mitm attack to bruteforce possible input values.
So, the first step of the attack is to pre-calculate r0:
r1 = 0x4B17E245 r2 = arg[1] r3 = arg[2] r4 = arg[3] ; SubBytes ; TL = Table lookup r1 = TL(r1) r2 = TL(r2) r3 = TL(r3) r4 = TL(r4) ; ShiftRows r2 = (r2<>24) r3 = (r3 >> 16) ^ (r3 << 16) r4 = (r4>>8) ^ (r4<
The second step is to bruteforce r0 which uses in last xor operation:
; XORs round 2 r0 = r1 r1 ^= r2 r2 ^= r3 r3 ^= r4 r4 ^= r0
The conditions for successful stop is when in XORs round1 the pre-calculated r0 is equal to bruted r1
; XORs round 1 r0 == r1
The code of the bruteforce program is below. By using this program we got four keys in ~30 seconds.
#include "header.h" #include <stdio.h> u32 prepare_r0() { u32 r1 = 0x4B17E245; u32 t0 = Te4[(r1 ) & 0xff] ^ Te4[(r1 >> 8) & 0xff] << 8 ^ Te4[(r1 >> 16) & 0xff] << 16 ^ Te4[(r1 >> 24) ] << 24; return t0; } u32 brute(u32 r0, u32 pre_brute_value) { u32 r1 = 0x1020304; u32 r2 = 0x5060708; u32 r3 = 0x9101112; u32 r4 = 0xD14151E; r4 ^= r0; r3 ^= r4; r2 ^= r3; r1 ^= r2; r4 = (r4 << 8) ^ (r4 >> 24); r3 = (r3 >> 16) ^ (r3 << 16); r2 = (r2 >> 8) ^ (r2 << 24); u32 t[4] = {0,0,0,0}; t[0] = Td4[(r1 ) & 0xff] ^ Td4[(r1 >> 8) & 0xff] << 8 ^ Td4[(r1 >> 16) & 0xff] << 16 ^ Td4[(r1 >> 24) ] << 24; t[1] = Td4[(r2 ) & 0xff] ^ Td4[(r2 >> 8) & 0xff] << 8 ^ Td4[(r2 >> 16) & 0xff] << 16 ^ Td4[(r2 >> 24) ] << 24; t[2] = Td4[(r3 ) & 0xff] ^ Td4[(r3 >> 8) & 0xff] << 8 ^ Td4[(r3 >> 16) & 0xff] << 16 ^ Td4[(r3 >> 24) ] << 24; t[3] = Td4[(r4 ) & 0xff] ^ Td4[(r4 >> 8) & 0xff] << 8 ^ Td4[(r4 >> 16) & 0xff] << 16 ^ Td4[(r4 >> 24) ] << 24; r1 = t[0]; r2 = t[1]; r3 = t[2]; r4 = t[3]; r4 ^= pre_brute_value; r3 ^= r4; r2 ^= r3; r1 ^= r2; return r1 == pre_brute_value; } u32 brute_check(u32 r0, u32 pre_brute_value) { u32 r1 = 0x1020304; u32 r2 = 0x5060708; u32 r3 = 0x9101112; u32 r4 = 0xD14151E; r4 ^= r0; r3 ^= r4; r2 ^= r3; r1 ^= r2; r4 = (r4 << 8) ^ (r4 >> 24); r3 = (r3 >> 16) ^ (r3 << 16); r2 = (r2 >> 8) ^ (r2 << 24); u32 t[4] = {0,0,0,0}; t[0] = Td4[(r1 ) & 0xff] ^ Td4[(r1 >> 8) & 0xff] << 8 ^ Td4[(r1 >> 16) & 0xff] << 16 ^ Td4[(r1 >> 24) ] << 24; t[1] = Td4[(r2 ) & 0xff] ^ Td4[(r2 >> 8) & 0xff] << 8 ^ Td4[(r2 >> 16) & 0xff] << 16 ^ Td4[(r2 >> 24) ] << 24; t[2] = Td4[(r3 ) & 0xff] ^ Td4[(r3 >> 8) & 0xff] << 8 ^ Td4[(r3 >> 16) & 0xff] << 16 ^ Td4[(r3 >> 24) ] << 24; t[3] = Td4[(r4 ) & 0xff] ^ Td4[(r4 >> 8) & 0xff] << 8 ^ Td4[(r4 >> 16) & 0xff] << 16 ^ Td4[(r4 >> 24) ] << 24; r1 = t[0]; r2 = t[1]; r3 = t[2]; r4 = t[3]; r4 ^= pre_brute_value; r3 ^= r4; r2 ^= r3; r1 ^= r2; r2 = (r2>>8) ^ (r2<<24); r3 = (r3 >> 16) ^ (r3 << 16); r4 = (r4<<8) ^ (r4>>24); t[0] = Td4[(r1 ) & 0xff] ^ Td4[(r1 >> 8) & 0xff] << 8 ^ Td4[(r1 >> 16) & 0xff] << 16 ^ Td4[(r1 >> 24) ] << 24; t[1] = Td4[(r2 ) & 0xff] ^ Td4[(r2 >> 8) & 0xff] << 8 ^ Td4[(r2 >> 16) & 0xff] << 16 ^ Td4[(r2 >> 24) ] << 24; t[2] = Td4[(r3 ) & 0xff] ^ Td4[(r3 >> 8) & 0xff] << 8 ^ Td4[(r3 >> 16) & 0xff] << 16 ^ Td4[(r3 >> 24) ] << 24; t[3] = Td4[(r4 ) & 0xff] ^ Td4[(r4 >> 8) & 0xff] << 8 ^ Td4[(r4 >> 16) & 0xff] << 16 ^ Td4[(r4 >> 24) ] << 24; r1 = t[0]; r2 = t[1]; r3 = t[2]; r4 = t[3]; printf("%x-%x-%x\n", r2, r3, r4); return r1 == pre_brute_value; } int main(int argc, char* argvp[]) { u32 pre_bruted_r0 = prepare_r0(); for (u32 i = 0; i < 0xffffffff; i++) if (brute(i, pre_bruted_r0)) brute_check(i, pre_bruted_r0); } |
Keys:
b6b09bf0-f23daa06-ac4ee747
ee25052c-e6d61a0d-45e9cfe0
2b94f8c8-e1db8003-f6edf54
e5304760-47b7c45f-f59a8f29
2 comments
I guess “XORs round” is MixColumns stage indeed, but with another, simple polynomial: c(x) = x + 1 instead of c(x) = 3x**3 + x**2 + x + 2
And this poly is not inversible so you can’t simply reverse it, so bruteforce is ok :)
How did you calculate keys
ee25052c-e6d61a0d-45e9cfe0
2b94f8c8-e1db8003-f6edf54
???
I tried 3 ways (and your code too) to decode this AES, but only found
b6b09bf0-f23daa06-ac4ee747
e5304760-47b7c45f-f59a8f29
I believe there are only two key sequences possibly
P.S.
Thank you so much for this write-up!