VMProtect, Part 0: Basics
Author: RolfRolles
Translator:千里之外
VMProtect is a virtualization protector. Like other protections in the genre, among them ReWolf's x86 Virtualizer and CodeVirtualizer, it works by disassembling the x86 bytecode of the target executable and compiling it into a proprietary, polymorphic bytecode which is executed in a custom interpreter at run-time. This is unlike the traditional notions of packing, in which the x86 bytecode is simply encrypted and/or compressed: with virtualization, the original x86 bytecode in the protected areas is gone, never to be seen again. Or so the idea goes.
If you've never looked at VMProtect before, I encourage you to take a five-minute look in IDA (here's a sample packed binary). As far as VMs go, it is particularly skeletal and easily comprehended. The difficulty lies in recreating working x86 bytecode from the VM bytecode. Here's a two-minute analysis of its dispatcher.
push edi ; push all registers
push ecx
push edx
push esi
push ebp
push ebx
push eax
push edx
pushf
push 0 ; imagebase fixup
mov esi, [esp+8+arg_0] ; esi = pointer to VM bytecode
mov ebp, esp ; ebp = VM's "stack" pointer
sub esp, 0C0h
mov edi, esp ; edi = "scratch" data area
VM__FOLLOW__Update:
add esi, [ebp+0]
VM__FOLLOW__Regular:
mov al, [esi] ; read a byte from ESI
movzx eax, al
sub esi, -1 ; increment ESI
jmp ds:VM__HandlerTable[eax*4] ; execute instruction handler
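The fetch-and-dispatch step of this loop can be modeled in a few lines of OCaml. This is only a sketch: the `vm` record, the `handler` type and the `step` function are illustrative names of mine, and the handler table is represented as an array of closures rather than code addresses.

```ocaml
(* Hypothetical model of the dispatcher; every name here is illustrative. *)
type vm = {
  code : int array;   (* VM bytecode, one byte per element *)
  mutable pc : int;   (* plays the role of ESI *)
}

(* A handler receives the VM state and the opcode byte that selected it,
   mirroring the fact that AL still holds that byte on handler entry. *)
type handler = vm -> int -> unit

(* One pass through VM__FOLLOW__Regular. *)
let step (table : handler array) (vm : vm) : unit =
  let opcode = vm.code.(vm.pc) land 0xff in  (* mov al, [esi] / movzx eax, al *)
  vm.pc <- vm.pc + 1;                        (* sub esi, -1 *)
  table.(opcode) vm opcode                   (* jmp VM__HandlerTable[eax*4] *)
```

Passing the opcode to the handler matters because, as the handlers further on show, some of them reuse bits of that byte as an operand.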
A feature worth discussing is the "scratch space", referenced by the register edi throughout the dispatch loop. This is a 16-dword-sized area on the stack where VMProtect saves the registers upon entering the VM, modifies them throughout the course of a basic block, and from whence it restores the registers upon exit. For each basic block protected by the VM, the layout of the registers in the scratch space can potentially be different.
Here's a disassembly of some instruction handlers. Notice that A) VMProtect is a stack machine and that B) each handler -- though consisting of scant few instructions -- performs several tasks, e.g. popping several values, performing multiple operations, pushing one or more values.
#00: x = [ESI-1] & 0x3C; y = popd; [edi+x] = y
.text:00427251 and al, 3Ch ; al = instruction number
.text:00427254 mov edx, [ebp+0] ; grab a dword off the stack
.text:00427257 add ebp, 4 ; pop the stack
.text:0042725A mov [edi+eax], edx ; store the dword in the scratch space
#01: x = [ESI-1] & 0x3C; y = [edi+x]; pushd y
.vmp0:0046B0EB and al, 3Ch ; al = instruction number
.vmp0:0046B0EE mov edx, [edi+eax] ; grab a dword out of the scratch space
.vmp0:0046B0F1 sub ebp, 4 ; subtract 4 from the stack pointer
.vmp0:0046B0F4 mov [ebp+0], edx ; push the dword onto the stack
#02: x = popw, y = popw, z = x + y, pushw z, pushf
.text:004271FB mov ax, [ebp+0] ; pop a word off the stack
.text:004271FF sub ebp, 2
.text:00427202 add [ebp+4], ax ; add it to another word on the stack
.text:00427206 pushf
.text:00427207 pop dword ptr [ebp+0] ; push the flags
#03: x = [ESI++]; w = popw; [edi+x] = Byte(w)
.vmp0:0046B02A movzx eax, byte ptr [esi] ; read a byte from ESI
.vmp0:0046B02D mov dx, [ebp+0] ; pop a word off the stack
.vmp0:0046B031 inc esi ; ESI++
.vmp0:0046B032 add ebp, 2 ; adjust stack pointer
.vmp0:0046B035 mov [edi+eax], dl ; write a byte into the scratch area
#04: x = popd, y = popb, z = x >> y, pushd z, pushf
.vmp0:0046B095 mov eax, [ebp+0] ; pop a dword off the stack
.vmp0:0046B098 mov cl, [ebp+4] ; pop a byte off the stack
.vmp0:0046B09B sub ebp, 2
.vmp0:0046B09E shr eax, cl ; shr (logical right shift) the dword by the byte
.vmp0:0046B0A0 mov [ebp+4], eax ; push the result
.vmp0:0046B0A3 pushf
.vmp0:0046B0A4 pop dword ptr [ebp+0] ; push the flags
#05: x = popd, pushd ss:[x]
.vmp0:0046B5F7 mov eax, [ebp+0] ; pop a dword off the stack
.vmp0:0046B5FA mov eax, ss:[eax] ; read a dword from ss
.vmp0:0046B5FD mov [ebp+0], eax ; push that dword
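To make the stack discipline of handlers #00, #01 and #02 concrete, here is a small OCaml model of them. It is a sketch under several assumptions: the `state` record and all function names are mine, the stack is a private 256-byte buffer rather than the real process stack, only CF and ZF are computed for the pushed flags, and a 64-bit OCaml (4.08 or later, for the `Bytes` integer accessors) is assumed.

```ocaml
(* Hypothetical model of the VM state; all names are illustrative. *)
type state = {
  stack : Bytes.t;    (* memory backing the VM stack *)
  mutable sp : int;   (* plays the role of EBP; the stack grows downward *)
  scratch : Bytes.t;  (* 16 dwords, plays the role of [EDI .. EDI+0x3F] *)
}

let make () =
  { stack = Bytes.make 256 '\000'; sp = 256; scratch = Bytes.make 64 '\000' }

(* Little-endian pushes and pops of dwords and words. *)
let pushd st v =
  st.sp <- st.sp - 4;
  Bytes.set_int32_le st.stack st.sp (Int32.of_int v)

let popd st =
  let v = Int32.to_int (Bytes.get_int32_le st.stack st.sp) land 0xffffffff in
  st.sp <- st.sp + 4;
  v

let pushw st v =
  st.sp <- st.sp - 2;
  Bytes.set_uint16_le st.stack st.sp (v land 0xffff)

let popw st =
  let v = Bytes.get_uint16_le st.stack st.sp in
  st.sp <- st.sp + 2;
  v

(* #00: x = opcode & 0x3C; y = popd; scratch[x] = y *)
let pop_into_register st opcode =
  let x = opcode land 0x3c in
  let y = popd st in
  Bytes.set_int32_le st.scratch x (Int32.of_int y)

(* #01: x = opcode & 0x3C; y = scratch[x]; pushd y *)
let push_dword_from_register st opcode =
  let x = opcode land 0x3c in
  pushd st (Int32.to_int (Bytes.get_int32_le st.scratch x) land 0xffffffff)

(* #02: x = popw; y = popw; pushw (x + y); pushf.
   Simplification: only CF (bit 0) and ZF (bit 6) are modeled. *)
let add_words st =
  let x = popw st in
  let y = popw st in
  let sum = x + y in
  let z = sum land 0xffff in
  let cf = if sum > 0xffff then 1 else 0 in
  let zf = if z = 0 then 0x40 else 0 in
  pushw st z;
  pushd st (cf lor zf)
```

Note how `add_words` leaves the flags dword on top of the result word, which matches the layout produced by the `pushf` / `pop dword ptr [ebp+0]` pair in the listing.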
Part 1: Bytecode and IR
The approach I took with ReWolf's x86 Virtualizer is also applicable here, although a more sophisticated compiler is required. What follows are some preliminary notes on the design and implementation of such a component. These are not complete details on breaking the protection; I confess to having only looked at a few samples, and I am not sure which protection options were enabled.
As before, we begin by constructing a disassembler for the interpreter. This is immediately problematic, since the bytecode language is polymorphic. I have created an IDA plugin that automatically constructs OCaml source code for a bytecode disassembler. In a production-quality implementation, this should be implemented as a standalone component that returns a closure.
The generated disassembler, then, looks like this:
let disassemble bytearray index =
  match (bytearray.(index) land 0xff) with
    0x0 -> (VM__Handler0__PopIntoRegister(0),[index+1])
  | 0x1 -> (VM__Handler1__PushDwordFromRegister(0),[index+1])
  | 0x2 -> (VM__Handler2__AddWords,[index+1])
  | 0x3 -> (VM__Handler3__StoreByteIntoRegister(bytearray.(index+1)),[index+2])
  | 0x4 -> (VM__Handler0__PopIntoRegister(4),[index+1])
  | 0x5 -> (VM__Handler1__PushDwordFromRegister(4),[index+1])
  | 0x6 -> (VM__Handler4__ShrDword,[index+1])
  | 0x7 -> (VM__Handler5__ReadDword__FromStackSegment,[index+1])
  | ... -> ...
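The function above returns a pair: the decoded instruction and a list of successor indices. A sketch of how such a function might be driven follows. The `instr` declaration is inferred from the constructor names in the listing and is an assumption, `sweep` is a hypothetical driver, and `toy` is a deliberately tiny stand-in decoder used only to exercise it.

```ocaml
(* Instruction type inferred from the constructor names in the listing;
   this declaration is an assumption, not the plugin's actual output. *)
type instr =
  | VM__Handler0__PopIntoRegister of int
  | VM__Handler1__PushDwordFromRegister of int
  | VM__Handler2__AddWords
  | VM__Handler3__StoreByteIntoRegister of int
  | VM__Handler4__ShrDword
  | VM__Handler5__ReadDword__FromStackSegment

(* Hypothetical driver. [decode] is any function shaped like the generated
   disassembler. Only the first successor is followed, so this is a plain
   linear walk and does not explore branches. *)
let sweep (decode : int array -> int -> instr * int list) bytearray =
  let rec go index acc =
    if index >= Array.length bytearray then List.rev acc
    else
      match decode bytearray index with
      | (ins, next :: _) -> go next ((index, ins) :: acc)
      | (ins, []) -> List.rev ((index, ins) :: acc)
  in
  go 0 []

(* Toy decoder covering two opcodes, just to exercise [sweep]. It raises
   on anything else, and on an operand byte that runs past the array. *)
let toy bytearray index =
  match bytearray.(index) land 0xff with
  | 0x2 -> (VM__Handler2__AddWords, [index + 1])
  | 0x3 ->
    (VM__Handler3__StoreByteIntoRegister (bytearray.(index + 1)), [index + 2])
  | op -> failwith (Printf.sprintf "opcode 0x%02x not covered" op)
```

Returning a successor list rather than a single index leaves room for instructions with zero or several successors, which a real driver would have to handle with a worklist.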
Were we to work with the instructions individually in their natural granularity, depicted above, the bookkeeping on the semantics of each would likely prove tedious. For illustration, compare and contrast handlers #02 and #04. Both have the same basic pattern: pop two values (words vs. dwords), perform a binary operation (add vs. shr), push the result, then push the flags. The current representation of instructions does not express these, or any, similarities.
Handler #02:                 Handler #04:
mov ax, [ebp+0]              mov eax, [ebp+0]
sub ebp, 2                   mov cl, [ebp+4]
add [ebp+4], ax              sub ebp, 2
pushf                        shr eax, cl
pop dword ptr [ebp+0]        mov [ebp+4], eax
                             pushf
                             pop dword ptr [ebp+0]
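One way to make the shared pattern visible is to describe each handler as a sequence of smaller, parameterized operations. The OCaml sketch below is purely illustrative: the `micro` type, its constructors and the two handler descriptions are my own hypothetical names and should not be read as the representation actually used for the analysis.

```ocaml
(* Hypothetical micro-operations; none of these names come from the article. *)
type size = Byte | Word | Dword
type binop = Add | Shr

type micro =
  | Pop of size            (* pop one operand of the given size *)
  | Binop of binop * size  (* combine the two popped operands *)
  | Push of size           (* push the result *)
  | PushFlags              (* push EFLAGS as a dword *)

(* Handler #02: pop two words, add, push the word result, push flags. *)
let handler_02 =
  [ Pop Word; Pop Word; Binop (Add, Word); Push Word; PushFlags ]

(* Handler #04: pop a dword and a shift count, shr, push the dword, push flags. *)
let handler_04 =
  [ Pop Dword; Pop Byte; Binop (Shr, Dword); Push Dword; PushFlags ]

(* Erase sizes and operators to expose the common skeleton. *)
type shape = SPop | SBinop | SPush | SPushFlags

let shape = function
  | Pop _ -> SPop
  | Binop _ -> SBinop
  | Push _ -> SPush
  | PushFlags -> SPushFlags

let same_shape a b = List.map shape a = List.map shape b
```

With this description, `same_shape handler_02 handler_04` holds even though the two lists differ in sizes and operator, which is exactly the similarity the flat per-handler constructors cannot express.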