CodeGate 2012 Quals – Binary 500

Seeing that it is not all.


Summary: VM analysis, python decompiling

First of all thanks to snk for help in preparing this writeup.

Also, thanks to snk, blackzert, pzbitskiy and everybody who helped to solve this task!

There are two files. The first, vm2x.exe, is an x86 PE executable. The second, vm2x.dat, is a script for Immunity Debugger. At first glance vm2x.exe looks like a normal Visual C++ executable with library functions and so on, but closer analysis shows that all of the functionality is hidden inside a VM. So it seemed easier to analyze the Python script for Immunity Debugger first. Let’s look inside the file:

import marshal, imp
if imp.get_magic() != '\x03\xf3\r\n':
    raise ImportError('Wrong!!')
__code = marshal.loads("...")
del marshal, imp
exec __code
del __code

OK, the script requires Python 2.7. It deserializes and executes a Python code object. Let’s try to disassemble __code:

>>> import marshal,dis
>>> __code = marshal.loads("...")
>>> __code.co_consts
(-1, None, code, code)
>>> __code.co_names
('immlib', 'toString', 'main')
>>> dis.dis(__code)
  1           0 LOAD_CONST               0 (-1)
              3 LOAD_CONST               1 (None)
              6 IMPORT_NAME              0 (immlib)
              9 STORE_NAME               0 (immlib)
  3          12 LOAD_CONST               2 (code)
             18 STORE_NAME               1 (toString)
 12          21 LOAD_CONST               3 (code)
             27 STORE_NAME               2 (main)
             30 LOAD_CONST               1 (None)
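
The same inspection steps can be tried on any code object. Below is a minimal sketch, assuming a Python 3 interpreter and a small stand-in source string (SOURCE is made up for illustration; the real payload is only expected to load under Python 2.7, which is why the script checks the magic number):

```python
import dis
import marshal
import types

# Stand-in for the challenge script: a module that defines two functions.
SOURCE = (
    "def toString(s):\n"
    "    return ''.join(s)\n"
    "def main(args):\n"
    "    return 'Nothing found ..'\n"
)

# Compile, serialize, and load back, mirroring the marshal.loads() call above.
blob = marshal.dumps(compile(SOURCE, "<demo>", "exec"))
code = marshal.loads(blob)

# Function bodies are stored as nested code objects inside co_consts.
nested = {c.co_name: c for c in code.co_consts if isinstance(c, types.CodeType)}

print(code.co_names)      # names used at module level
print(sorted(nested))     # function bodies found in co_consts
dis.dis(nested["main"])   # disassemble one function body
```

Picking the nested code objects out of co_consts is the same step as taking __code.co_consts[3] for main in the session that follows.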

The module imports immlib and defines two functions, “toString” and “main”. Let’s analyze the main function more closely:

>>> main = __code.co_consts[3]
>>> main.co_names
('immlib', 'Debugger', 'readMemory', 'toString', 'getRegs', 'log')
>>> main.co_consts
(None, 4237456, 80, 'EIP', 4273157, 29, 52, 69, 65, 46, 68, 63,
'Nice work, Key1 : "', '"', 'But, Find Next Key!', 4278021, 2, 0, 61,
'Nice work, Key2 : "', 'Input Key : Key1 + Key2', 'Nothing found ..')
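
The large integers in co_consts are easier to recognise once printed in hexadecimal. A quick check, using only the values shown above:

```python
# Integer constants taken from main.co_consts.
constants = [4237456, 4273157, 4278021, 80]

as_hex = ["0x%X" % value for value in constants]
for value, text in zip(constants, as_hex):
    print("%d = %s" % (value, text))
```

The first value is the address passed to readMemory, 80 is the number of bytes read, and the other two are the addresses that EIP is compared against.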

main calls readMemory to read 80 bytes at 0x40a890, converts them to the string b with toString, and then builds two strings from b like this:

 20          70 LOAD_FAST                3 (b)
             73 LOAD_CONST               5 (29)
             76 BINARY_SUBSCR
             77 LOAD_FAST                3 (b)
             80 LOAD_CONST               6 (52)
             83 BINARY_SUBSCR
             84 BINARY_ADD
             85 LOAD_FAST                3 (b)
             88 LOAD_CONST               7 (69)
             91 BINARY_SUBSCR
             92 BINARY_ADD
             93 LOAD_FAST                3 (b)
             96 LOAD_CONST               6 (52)
             99 BINARY_SUBSCR
            100 BINARY_ADD
            101 LOAD_FAST                3 (b)
            104 LOAD_CONST               8 (65)
            107 BINARY_SUBSCR
            108 BINARY_ADD
            109 LOAD_FAST                3 (b)
            112 LOAD_CONST               9 (46)
            115 BINARY_SUBSCR
            116 BINARY_ADD
            117 LOAD_FAST                3 (b)
            120 LOAD_CONST              10 (68)
            123 BINARY_SUBSCR
            124 BINARY_ADD
            125 LOAD_FAST                3 (b)
            128 LOAD_CONST              11 (63)
            131 BINARY_SUBSCR
            132 BINARY_ADD
            133 STORE_FAST               5 (str1)
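
A listing like this can be turned back into source by running the opcodes on a stack of text fragments instead of values. The sketch below handles only the five opcodes that appear in the listing; the LISTING table and the decompile helper are illustrative names, not part of any tool we used:

```python
# (opcode, argument) pairs transcribed from the listing above.
LISTING = []
for index in (29, 52, 69, 52, 65, 46, 68, 63):
    LISTING += [("LOAD_FAST", "b"), ("LOAD_CONST", index), ("BINARY_SUBSCR", None)]
    if len(LISTING) > 3:
        # Every subscript after the first is added to the running result.
        LISTING.append(("BINARY_ADD", None))
LISTING.append(("STORE_FAST", "str1"))


def decompile(listing):
    """Run the opcodes on a stack of source-text fragments."""
    stack, assignments = [], {}
    for op, arg in listing:
        if op == "LOAD_FAST":
            stack.append(arg)
        elif op == "LOAD_CONST":
            stack.append(repr(arg))
        elif op == "BINARY_SUBSCR":
            index = stack.pop()
            container = stack.pop()
            stack.append("%s[%s]" % (container, index))
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append("%s+%s" % (left, right))
        elif op == "STORE_FAST":
            assignments[arg] = stack.pop()
        else:
            raise ValueError("unhandled opcode: %s" % op)
    return assignments


result = decompile(LISTING)
print("str1 = " + result["str1"])
```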

After decompiling we had:

import immlib

def toString(s):
    t = ''
    for i in range(len(s)):
        t += s[i]
    return t

def main(args):
    imm = immlib.Debugger()
    a = imm.readMemory(0x40a890, 80)
    b = toString(a)
    regs = imm.getRegs()
    if regs['EIP'] == 0x413405:
        str1 = b[29]+b[52]+b[69]+b[52]+b[65]+b[46]+b[68]+b[63]
        imm.log('Nice work, Key1 : "' + str1 + '"')
        return 'But, Find Next Key!'
    elif regs['EIP'] == 0x414705:
        str2 = b[46]+b[29]+b[2]+b[69]+b[2]+b[65]+b[46]+b[0]+b[61]
        imm.log('Nice work, Key2 : "' + str2 + '"')
        return 'Input Key : Key1 + Key2'
    return 'Nothing found ..'
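
The decompiled logic can be exercised without Immunity Debugger by passing in a stub. This is only a sketch: FakeDebugger and run_main are hypothetical names, and the table contents are an assumption (the contiguous ASCII range from '1' to '~', which is what the final key implies):

```python
class FakeDebugger(object):
    """Hypothetical stand-in for immlib.Debugger with a chosen EIP."""

    # Assumption: the bytes at 0x40A890 are ASCII '1' through '~'.
    TABLE = "".join(chr(c) for c in range(0x31, 0x7F))

    def __init__(self, eip):
        self.eip = eip
        self.messages = []

    def readMemory(self, address, size):
        return self.TABLE[:size]

    def getRegs(self):
        return {"EIP": self.eip}

    def log(self, message):
        self.messages.append(message)


def run_main(imm):
    """Same logic as the decompiled main(), with the debugger passed in."""
    b = imm.readMemory(0x40A890, 80)
    eip = imm.getRegs()["EIP"]
    if eip == 0x413405:
        key = "".join(b[i] for i in (29, 52, 69, 52, 65, 46, 68, 63))
        imm.log('Nice work, Key1 : "' + key + '"')
        return "But, Find Next Key!"
    elif eip == 0x414705:
        key = "".join(b[i] for i in (46, 29, 2, 69, 2, 65, 46, 0, 61))
        imm.log('Nice work, Key2 : "' + key + '"')
        return "Input Key : Key1 + Key2"
    return "Nothing found .."


# Try both interesting EIP values plus one that matches neither branch.
for eip in (0x413405, 0x414705, 0):
    dbg = FakeDebugger(eip)
    print(run_main(dbg), dbg.messages)
```

Injecting the debugger object keeps the branch logic identical while removing the dependency on immlib.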

Data located at 0x40a890:

.rdata:0040A890 a123456789?@abc db '123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopq'
.rdata:0040A890                 db 'rstuvwxyz{|}~'

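And the finished script for getting the answer:

b = "123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
print(b[29]+b[52]+b[69]+b[52]+b[65]+b[46]+b[68]+b[63] +
      b[46]+b[29]+b[2]+b[69]+b[2]+b[65]+b[46]+b[0]+b[61])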

So we tried the answer “Never_up_N3vr_1n”, but it didn’t work!

We thought that was only to be expected, because bin500 couldn’t be that easy, and started to analyze vm2x.exe.

After careful analysis we completely understood the VM structure. It is a three-layer virtual machine:

VM layer 1 -> VM layer 2 -> Payload code

The first layer of the virtual machine is shown in the diagram.

When the program starts, it allocates a memory block for each VM, containing an execution buffer, the VM stack, and the saved native registers, and then starts executing VM1. The VM instruction format is as follows: the first byte is XORed with the second byte and the result is interpreted as a length. The next ‘len’ bytes are then decrypted by the function at 0x0041338B, whose prototype is:

void __stdcall decode_buffer(char *buf, int size, int key)
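
The record format can be sketched in a few lines: the length comes from XORing the two header bytes, the body is decrypted, and a decrypted body that starts with 0xFFFF is a VM instruction while anything else is native code. Several parts of this sketch are assumptions rather than facts from the binary: xor_decode is a hypothetical placeholder for decode_buffer (whose real algorithm is not reproduced here), the body is assumed to follow the two header bytes directly, and the key is simply passed in.

```python
def xor_decode(buf, key):
    """Hypothetical placeholder for decode_buffer(); the real routine differs."""
    return bytes(byte ^ (key & 0xFF) for byte in buf)


def read_record(stream, offset, key, decode=xor_decode):
    """Parse one record and return (kind, body, next_offset)."""
    # Length = first byte XOR second byte.
    length = stream[offset] ^ stream[offset + 1]
    start = offset + 2  # assumption: the body follows the two header bytes
    body = decode(stream[start:start + length], key)
    # A decoded body starting with FF FF marks a VM instruction.
    kind = "vm" if body[:2] == b"\xff\xff" else "native"
    return kind, body, start + length


def make_record(body, key, pad=0x5A):
    """Build a test record in the assumed layout (inverse of read_record)."""
    return bytes([pad, pad ^ len(body)]) + xor_decode(body, key)


# Two synthetic records: one VM instruction, one native snippet (nop; ret).
stream = make_record(b"\xff\xff\x01\x02", 0x37) + make_record(b"\x90\xc3", 0x37)
offset = 0
records = []
while offset < len(stream):
    kind, body, offset = read_record(stream, offset, 0x37)
    records.append((kind, body))
    print(kind, body.hex())
```

Passing the decoder in as a parameter means the parsing loop stays the same once the real decryption routine is ported.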

After decryption, the first two bytes are checked: if they are 0xFFFF, it is a VM instruction; otherwise it is native code. We wrote an IDA script to decode the VM; take a look at it if you are interested. The logic of the VM is to generate executable code in a special buffer and execute one instruction per round. The first layer generates the code for the second layer, and the second layer generates the third layer of code, which is the payload. The generated code looks like:

The third-layer code is likewise executed one instruction per iteration. Two gateways are used for transitions between layers (at 0x04135f8 and 0x04113a7). They look like:

So the general working scheme looks like:

Thus, to get the payload code we need to trace the layer 3 code while the program is executing. For this purpose we wrote an IDA script:

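import idaapi
from idc import Byte, Dword, GetRegValue

class Xglob:
    def __init__(self):
        self.i = 0
        self.dmp = None

glob = Xglob()

class ida_SDbg(idaapi.DBG_Hooks):
    def dbg_bpt(self, tid, ea):
        # Dispatch to the callback registered for this breakpoint address.
        if ea in dbg_commands:
            callback = dbg_commands[ea]
            if callback() == 0:
                idaapi.continue_process()
        return 0

def GetMem(ea, size):
    # Read `size` bytes of debuggee memory starting at `ea`.
    r = ''
    for addr in range(ea, ea + size):
        r += chr(Byte(addr))
    return r

def DumpInit():
    glob.dmp = open('bin500_1.dmp', 'wb')

def Dump(s):
    if glob.dmp is not None:
        glob.dmp.write(s)

def OnVM2():
    # The return address on top of the stack points at the generated instruction.
    ret = Dword(GetRegValue("ESP"))
    l = idaapi.decode_insn(ret)  # instruction length in bytes
    Dump(GetMem(ret, l))         # assumption: append the instruction bytes to the dump
    return 0

dbg_commands = {}
script_dbg = ida_SDbg()
dbg_commands[0x004113AC] = OnVM2
# Assumption: open the dump file and install the hooks before running the target.
DumpInit()
script_dbg.hook()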

After execution we had all of the executed code in a binary file. This file can be loaded into IDA as an additional binary file to create a new segment. After these manipulations the payload code is easy to analyze. For example, the code that creates the window looks like:

If we press the “MD5” button, the executed code looks like:

From this code we found that the MD5 is calculated over 9000h bytes starting at the beginning of the first section (0x401000). Here is part of the MD5 calculation code:
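
Separately from the disassembly, the same range can be hashed offline for comparison. This is a rough sketch and not code from the binary: md5_of_region is a hypothetical helper, the raw file offset of the first section has to be looked up in the PE headers (0x1000 below is only a placeholder), and the bytes on disk may differ from what the program sees in memory at run time.

```python
import hashlib

REGION_SIZE = 0x9000  # number of bytes hashed, from the text


def md5_of_region(image, raw_offset, size=REGION_SIZE):
    """MD5 over `size` bytes of `image` starting at file offset `raw_offset`."""
    region = image[raw_offset:raw_offset + size]
    if len(region) != size:
        raise ValueError("image too short for the requested region")
    return hashlib.md5(region).hexdigest()


# Demo on synthetic data. For the real file, read vm2x.exe in binary mode
# and pass the raw offset of the section mapped at 0x401000.
fake_image = bytes(0x1000) + bytes(range(256)) * (REGION_SIZE // 256)
digest = md5_of_region(fake_image, 0x1000)
print(digest)
```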

When we finished analyzing the code, we had found nothing interesting: just the usual MD5 calculation, picture drawing, and message box functionality. We spent a lot of time before somebody figured out that the answer is “Never_up_N3v3r_1n”, and that we had simply made a typo when we checked this answer the first time. So all we had to do was decompile the Python script. Guys, is this really binary 500? Are you kidding? Anyway, the VM was very interesting to analyze and we had a fun time =)

Key: Never_up_N3v3r_1n
