Seeing that it is not all.
Summary: VM analysis, python decompiling
First of all thanks to snk for help in preparing this writeup.
Also, thanks to snk, blackzert, pzbitskiy and everybody who helped to solve this task!
There are two files. The first, vm2x.exe, is a PE x86 executable. The second, vm2x.dat, is a script for Immunity Debugger. At first glance vm2x.exe looks like a normal VC executable with library functions and so on, but closer analysis makes it obvious that all the functionality is hidden inside a VM. So it seemed easier to analyze the Python script for Immunity Debugger first. Let’s look inside the file:
import marshal, imp
if imp.get_magic() != '\x03\xf3\r\n':
    raise ImportError('Wrong!!')
__code = marshal.loads("...")
del marshal, imp
exec __code
del __code
Okay, the script requires Python 2.7 (the magic number '\x03\xf3\r\n' is the one used by Python 2.7). It deserializes a Python code object and executes it. Let’s try to disassemble __code:
$ python2.7
>>> import marshal, dis
>>> __code = marshal.loads("...")
>>> __code.co_consts
(-1, None, code, code)
>>> __code.co_names
('immlib', 'toString', 'main')
>>> dis.dis(__code)
  1           0 LOAD_CONST               0 (-1)
              3 LOAD_CONST               1 (None)
              6 IMPORT_FROM              0 (immlib)
              9 STORE_NAME               0 (immlib)

  3          12 LOAD_CONST               2 (code)
             15 MAKE_FUNCTION            0
             18 STORE_NAME               1 (toString)

 12          21 LOAD_CONST               3 (code)
             24 MAKE_FUNCTION            0
             27 STORE_NAME               2 (main)
             30 LOAD_CONST               1 (None)
             33 RETURN_VALUE
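The same inspection steps can be tried on a code object you control. The following is a minimal sketch in Python 3 (the challenge blob itself needs Python 2.7, whose marshal format and magic number differ); the source string and the helper name `nested_code_names` are made up for illustration.

```python
import marshal
import types

# Hypothetical stand-in for the challenge script: a module with two functions.
SOURCE = "def toString(s):\n    return s\n\ndef main(args):\n    return 'ok'\n"


def nested_code_names(code):
    """Return the names of the code objects stored in code.co_consts."""
    return [c.co_name for c in code.co_consts if isinstance(c, types.CodeType)]


# Round-trip through marshal, the same way the loader handles its blob.
blob = marshal.dumps(compile(SOURCE, "<demo>", "exec"))
loaded = marshal.loads(blob)

print(nested_code_names(loaded))
```

Each nested code object can then be passed to `dis.dis` on its own, which is what the session does with `co_consts[3]`.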
The script imports immlib and defines two functions, “toString” and “main”. Let’s analyze the main function in more depth:
>>> main = __code.co_consts[3]
>>> main.co_names
('immlib', 'Debugger', 'readMemory', 'toString', 'getRegs', 'log')
>>> main.co_consts
(None, 4237456, 80, 'EIP', 4273157, 29, 52, 69, 65, 46, 68, 63, 'Nice work, Key1 : "', '"', 'But, Find Next Key!', 4278021, 2, 0, 61, 'Nice work, Key2 : "', 'Input Key : Key1 + Key2', 'Nothing found ..')
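The large integers in co_consts are easier to recognise once printed in hex: one is a data address and the other two are values compared against EIP. A small Python 3 sketch (the helper name `as_hex` is ours):

```python
# Integer constants copied from main.co_consts in the session above.
CONSTS = [4237456, 4273157, 4278021]


def as_hex(values):
    """Format each integer as a hexadecimal string."""
    return [hex(v) for v in values]


print(as_hex(CONSTS))
```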
It calls readMemory to read 80 bytes from 0x40a890 (4237456), converts the result with toString and stores it in variable b. It then constructs two strings from b like this:
 20          70 LOAD_FAST                3 (b)
             73 LOAD_CONST               5 (29)
             76 BINARY_SUBSCR
             77 LOAD_FAST                3 (b)
             80 LOAD_CONST               6 (52)
             83 BINARY_SUBSCR
             84 BINARY_ADD
             85 LOAD_FAST                3 (b)
             88 LOAD_CONST               7 (69)
             91 BINARY_SUBSCR
             92 BINARY_ADD
             93 LOAD_FAST                3 (b)
             96 LOAD_CONST               6 (52)
             99 BINARY_SUBSCR
            100 BINARY_ADD
            101 LOAD_FAST                3 (b)
            104 LOAD_CONST               8 (65)
            107 BINARY_SUBSCR
            108 BINARY_ADD
            109 LOAD_FAST                3 (b)
            112 LOAD_CONST               9 (46)
            115 BINARY_SUBSCR
            116 BINARY_ADD
            117 LOAD_FAST                3 (b)
            120 LOAD_CONST              10 (68)
            123 BINARY_SUBSCR
            124 BINARY_ADD
            125 LOAD_FAST                3 (b)
            128 LOAD_CONST              11 (63)
            131 BINARY_SUBSCR
            132 BINARY_ADD
            133 STORE_FAST               5 (str1)
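Reading the indices off such a listing by hand is error-prone, so here is a rough Python 3 sketch that pulls them out of the disassembly text. It assumes every subscript is a LOAD_CONST immediately followed by BINARY_SUBSCR, as in the listing above; `subscript_indices` is a name chosen for this example.

```python
import re

# The listing above, one instruction per line (byte offsets dropped).
LISTING = """
LOAD_FAST 3 (b)
LOAD_CONST 5 (29)
BINARY_SUBSCR
LOAD_FAST 3 (b)
LOAD_CONST 6 (52)
BINARY_SUBSCR
BINARY_ADD
LOAD_FAST 3 (b)
LOAD_CONST 7 (69)
BINARY_SUBSCR
BINARY_ADD
LOAD_FAST 3 (b)
LOAD_CONST 6 (52)
BINARY_SUBSCR
BINARY_ADD
LOAD_FAST 3 (b)
LOAD_CONST 8 (65)
BINARY_SUBSCR
BINARY_ADD
LOAD_FAST 3 (b)
LOAD_CONST 9 (46)
BINARY_SUBSCR
BINARY_ADD
LOAD_FAST 3 (b)
LOAD_CONST 10 (68)
BINARY_SUBSCR
BINARY_ADD
LOAD_FAST 3 (b)
LOAD_CONST 11 (63)
BINARY_SUBSCR
BINARY_ADD
STORE_FAST 5 (str1)
"""

CONST_RE = re.compile(r"LOAD_CONST\s+\d+\s+\((-?\d+)\)")


def subscript_indices(listing):
    """Collect integer constants that are directly consumed by BINARY_SUBSCR."""
    indices = []
    pending = None  # integer loaded by the previous instruction, if any
    for line in listing.splitlines():
        match = CONST_RE.search(line)
        if match:
            pending = int(match.group(1))
        elif "BINARY_SUBSCR" in line and pending is not None:
            indices.append(pending)
            pending = None
        else:
            pending = None
    return indices


print(subscript_indices(LISTING))
```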
After decompiling we had:
import immlib

def toString(s):
    # copy elements of s until a 0 is found
    t = ''
    for i in range(len(s)):
        if(s[i]==0):
            break
        t += s[i]
    return t

def main(args):
    imm = immlib.Debugger()
    a = imm.readMemory(0x40a890,80)
    b = toString(a)
    regs = imm.getRegs()
    if (regs['EIP'] == 0x413405):
        str1 = b[29]+b[52]+b[69]+b[52]+b[65]+b[46]+b[68]+b[63]
        imm.log('Nice work, Key1 : "' + str1 + '"')
        return 'But, Find Next Key!'
    elif( regs['EIP'] == 0x414705 ):
        str2 = b[46]+b[29]+b[2]+b[69]+b[2]+b[65]+b[46]+b[0]+b[61]
        imm.log('Nice work, Key2 : "' + str2 + '"')
        return 'Input Key : Key1 + Key2'
    return 'Nothing found ..'
Data located at 0x40a890:
.rdata:0040A890 a123456789?@abc db '123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopq'
.rdata:0040A890                 db 'rstuvwxyz{|}~'
And the finished script for getting the answer:
b = "123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
print b[29]+b[52]+b[69]+b[52]+b[65]+b[46]+b[68]+b[63] + b[46]+b[29]+b[2]+b[69]+b[2]+b[65]+b[46]+b[0]+b[61]
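Retyping the table by hand is an easy place to slip. As an alternative sketch in Python 3, the table can be generated instead, under the assumption (taken from the dump, not stated by the script) that it is simply the consecutive ASCII range from '1' to '~'. The names `TABLE`, `KEY1_IDX`, `KEY2_IDX` and `pick` are ours.

```python
# Assumption: the table at 0x40a890 is the ASCII range '1' (0x31) .. '~' (0x7e).
TABLE = "".join(chr(c) for c in range(ord("1"), ord("~") + 1))

# Indices taken from the decompiled script.
KEY1_IDX = [29, 52, 69, 52, 65, 46, 68, 63]
KEY2_IDX = [46, 29, 2, 69, 2, 65, 46, 0, 61]


def pick(table, indices):
    """Concatenate the table characters at the given positions."""
    return "".join(table[i] for i in indices)


key = pick(TABLE, KEY1_IDX) + pick(TABLE, KEY2_IDX)
print(key)
```

Copying the printed value rather than retyping it avoids transcription mistakes when submitting.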
So, we tried the answer “Never_up_N3vr_1n”, but it didn’t work!
We thought that was totally fine, because bin500 couldn’t be that easy, and started to analyze vm2x.exe.
After careful analysis we completely understood the VM structure. It is a three-layer virtual machine.
VM layer 1 -> VM layer 2 -> Payload code
The first layer of the virtual machine is shown in the scheme.
When the program starts, it allocates a memory block for each VM, containing an execution buffer, a VM stack and the saved native registers, and then starts executing VM1. The format of a VM instruction is as follows: the first byte is XORed with the second byte and the result is interpreted as a length. Then ‘len’ bytes are decrypted by the function at 0x0041338B, whose prototype is:
void __stdcall decode_buffer(char *buf, int size, int key)
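To make the record layout concrete (a length derived from the first two bytes, a decryption pass, then the 0xFFFF marker check on the decrypted bytes), here is a rough Python 3 sketch. It is not the real decoder: the cipher inside decode_buffer is not shown in this writeup, so `xor_decode` is a placeholder, and the assumption that the encrypted body directly follows the two header bytes is ours as well. `parse_record` and `make_record` are hypothetical helper names.

```python
def xor_decode(buf, key):
    """Placeholder cipher (assumption): the real routine at 0x0041338B differs."""
    return bytes(b ^ key for b in buf)


def parse_record(blob, offset, key, decode=xor_decode):
    """Parse one record and return (kind, decoded_body, next_offset)."""
    length = blob[offset] ^ blob[offset + 1]
    start = offset + 2  # assumption: the body follows the two header bytes
    body = decode(blob[start:start + length], key)
    kind = "vm" if body[:2] == b"\xff\xff" else "native"
    return kind, body, start + length


def make_record(body, key, pad=0x5A):
    """Build a toy record in the assumed layout, for testing only."""
    header = bytes([pad, pad ^ len(body)])
    return header + xor_decode(body, key)


record = make_record(b"\xff\xff\x01\x02", 0x33)
print(parse_record(record, 0, 0x33))
```

Passing the cipher in as a parameter keeps the framing logic separate from the unknown decryption routine, so the placeholder can be swapped out once the real one is reversed.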
After decryption the first two bytes are checked: if they are 0xFFFF, it is a VM instruction, otherwise it is native code. We wrote an IDA script to decode the VM; if you are interested, have a look at it. The logic of the VM is to generate executable code in a special buffer and execute one instruction per round. The first layer generates the code for the second layer, and the second layer generates the third layer of code, which is the payload. The generated code looks like:
The third-layer code is executed in the same way, one instruction per iteration. Two gateways are used for transitions between layers (at 0x04135f8 and 0x04113a7). They look like
and
So the general working scheme looks like
Thus, to get the payload code we need to trace the layer 3 code while the program is executing. For this purpose we wrote an IDA script:
import idaapi
import binascii
from idc import Byte, Dword, GetRegValue  # IDC helpers used below (pre-7.x IDAPython names)

class Xglob:
    def __init__(self):
        self.i = 0
        self.dmp = None

glob = Xglob()

class ida_SDbg(idaapi.DBG_Hooks):
    # Dispatch breakpoint hits to the callbacks registered in dbg_commands
    def dbg_bpt(self, tid, ea):
        if( dbg_commands.has_key(ea) ):
            callback = dbg_commands[ea]
            if( callback() == 0 ):
                idaapi.continue_process()
        return 0

def GetMem(ea,size):
    r = ''
    for ea in range(ea,ea+size):
        r += chr(Byte(ea))
    return r

def DumpInit():
    glob.dmp = file('bin500_1.dmp','wb')

def Dump(s):
    if(glob.dmp!=None):
        glob.dmp.write(s)

def OnVM2():
    # Read the dword at [ESP], decode the instruction it points to and dump its bytes
    idaapi.refresh_debugger_memory()
    ret = Dword(GetRegValue("ESP"))
    l = idaapi.decode_insn(ret)
    Dump(GetMem(ret,l))
    return 0

dbg_commands = {}
script_dbg = ida_SDbg()
script_dbg.hook()
dbg_commands[0x004113AC] = OnVM2
DumpInit()
After execution we had all the executed code in a binary file. This file can be loaded into IDA as an additional binary file to create a new segment. After these manipulations the payload code is easy to analyze. For example, the code that creates the window looks like:
If we push the “MD5” button, the code looks like:
From this code we found out that the MD5 is calculated over 9000h bytes from the beginning of the first section (0x401000). Here is a part of the MD5 calculation code:
When we finished analyzing the code we had found nothing interesting: just the usual MD5 calculation, picture drawing and message box functionality. We spent a lot of time before somebody found out that the answer is “Never_up_N3v3r_1n”, and that we had simply made a typo when we checked this answer the first time. So all we had to do was decompile the Python script. Guys, is this really binary 500, are you kidding? Anyway, the VM was very interesting to analyze and we had a fun time =)
Key: Never_up_N3v3r_1n