The moment I heard of machine code and its opcodes… I fell in love. Being able to understand machine code from just looking at the binary (okay, mostly its hexadecimal representation) felt like magic. And since many simple x86 assembly instructions are quite easy to decipher, I really liked the fact I could not only ‘read some of the code’ by just looking at binary, but also use that knowledge to patch code here and there, too.
Of course, today everyone knows about nopping code with 0x90, or changing conditional jumps from 0x74 or 0x75 to 0xEB, but back then it was something special. Unfortunately, once you learn the basics, this feeling doesn’t last long, because the opcodes got … complicated, and pretty quickly, too. The FPU, MMX, SSEn, and AVXn instructions are not for the faint-hearted, and it takes a lot of effort to understand them on a mathematical level, let alone memorize their opcodes. On top of that, new CPUs arrived, bytecode in many different forms is a thing, and we also have code virtualizers, so now it’s really prohibitive to even think of learning any of it… unless you are a dedicated low-level code fan.
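For anyone who has not done this kind of patching, here is a minimal Python sketch of the two classic tricks. The helper names (`nop_out`, `force_jump`) and the sample bytes are made up for illustration; a real patch would also need to respect instruction boundaries, which this sketch does not check.

```python
NOP = 0x90        # one-byte NOP
JZ_SHORT = 0x74   # JZ/JE rel8
JNZ_SHORT = 0x75  # JNZ/JNE rel8
JMP_SHORT = 0xEB  # JMP rel8


def nop_out(code: bytearray, offset: int, length: int) -> None:
    """Overwrite `length` bytes starting at `offset` with NOPs (hypothetical helper)."""
    if offset < 0 or offset + length > len(code):
        raise ValueError("patch range is outside the buffer")
    code[offset:offset + length] = bytes([NOP]) * length


def force_jump(code: bytearray, offset: int) -> None:
    """Turn a short JZ/JNZ at `offset` into an unconditional short JMP (hypothetical helper)."""
    if code[offset] not in (JZ_SHORT, JNZ_SHORT):
        raise ValueError(f"no short JZ/JNZ at offset {offset}")
    code[offset] = JMP_SHORT  # the rel8 byte that follows stays untouched


# Illustrative bytes only: a short JZ followed by a 5-byte CALL rel32.
check = bytearray.fromhex("74 05 E8 00 00 00 00")
force_jump(check, 0)   # 74 05 becomes EB 05
print(check.hex(" "))

call = bytearray.fromhex("E8 00 00 00 00 C3")
nop_out(call, 0, 5)    # the CALL is replaced by five NOPs, RET is kept
print(call.hex(" "))
```

Both jumps keep their one-byte displacement, which is why swapping only the opcode byte is enough.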
Still, even in 2023 it really helps to know some of the most important opcodes, at least in the x86/x64 world. Malware uses many tricks to obfuscate code, abuses opcodes to force incorrect disassembly, or triggers exceptions on undocumented instructions. Patching is also still a thing, and knowing at least a subset of the most popular opcodes helps to quickly understand what is going on. For example, if some random routine is looking for specific byte values that correspond to known opcodes, it’s really handy to recognize them and quickly make an educated guess that we are looking at some sort of length disassembler, or a hooking/unhooking routine…
Let’s admit it though – we can’t learn it all, so, it’s time to cheat a bit and then hopefully win some…
Knowing how complicated all of this became, for a long time I dreamed of a tool that takes a series of bytes, interprets it as code, and breaks it down into smaller chunks where the respective parts of the alleged machine instruction are clearly deconstructed, described, and represented: the prefixes, the opcode itself, the operation direction, the size of the argument, the MOD, REG, and R/M fields, the SIB byte, the IMM and DISP parts, etc., all extracted and presented to the user in a nice way…
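To make the wish list a bit more concrete, here is a rough Python sketch of the first two steps: peeling off legacy prefixes and reading the direction and size bits of a classic opcode. This is my own simplification, not how any real disassembler is implemented; the function names are invented, and REX, VEX, and EVEX prefixes are deliberately ignored.

```python
# Legacy (pre-REX) prefixes; 64-bit REX/VEX/EVEX prefixes are not handled here.
LEGACY_PREFIXES = {
    0xF0: "LOCK",
    0xF2: "REPNE",
    0xF3: "REP/REPE",
    0x26: "ES segment override",
    0x2E: "CS segment override",
    0x36: "SS segment override",
    0x3E: "DS segment override",
    0x64: "FS segment override",
    0x65: "GS segment override",
    0x66: "operand-size override",
    0x67: "address-size override",
}


def split_prefixes(raw: bytes):
    """Return (prefix names, remaining bytes starting at the opcode)."""
    names = []
    i = 0
    while i < len(raw) and raw[i] in LEGACY_PREFIXES:
        names.append(LEGACY_PREFIXES[raw[i]])
        i += 1
    return names, raw[i:]


def direction_and_width(opcode: int):
    """Read the d and w bits of a classic 'reg, r/m' opcode such as 88..8B.

    d (bit 1): 1 means the REG field is the destination, 0 means it is the source.
    w (bit 0): 1 means a full-size operand, 0 means a byte operand.
    These bits are only meaningful for that family of opcodes.
    """
    return (opcode >> 1) & 1, opcode & 1


# 66 89 C8 encodes mov ax, cx: one prefix, then opcode 89 and a ModRM byte.
prefixes, rest = split_prefixes(bytes.fromhex("66 89 C8"))
print(prefixes, rest.hex(" "))
print(direction_and_width(rest[0]))
```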
And after thinking about it for a long time, it was only last week that I finally asked about a tool like this…
Thanks to Steve Eckels, we now know that such a tool does exist! It’s called ZydisInfo, and it was created by Joel Höner.
Over the last few days I spent some time playing around with ZydisInfo and I am really impressed. This is a fantastic educational tool that many students and assembler lovers will find absolutely delightful to work with.
Let’s see a few examples:
ZydisInfo -64 "90" (NOP)
no surprise here…
ZydisInfo -64 "74 01" (short conditional jump, JZ rel8)
no surprise here either…
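The only arithmetic hiding in a short jump like this one is the target calculation: the second byte is a signed 8-bit displacement counted from the end of the instruction. A small Python sketch, with a made-up load address of 0x1000 and an invented helper name:

```python
def short_jump_target(address: int, instruction: bytes) -> int:
    """Target of a 2-byte short jump (opcode + rel8) located at `address`."""
    if len(instruction) != 2:
        raise ValueError("expected exactly opcode + rel8")
    opcode = instruction[0]
    # 70..7F are the short conditional jumps, EB is the short unconditional JMP.
    if not (0x70 <= opcode <= 0x7F or opcode == 0xEB):
        raise ValueError(f"opcode {opcode:#04x} is not a short jump")
    rel8 = int.from_bytes(instruction[1:2], "little", signed=True)
    return address + len(instruction) + rel8


# 0x1000 is an arbitrary, hypothetical address.
print(hex(short_jump_target(0x1000, bytes.fromhex("74 01"))))  # 0x1003
print(hex(short_jump_target(0x1000, bytes.fromhex("EB FE"))))  # 0x1000, jumps to itself
```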
ZydisInfo -64 "67 8B 04 C1" (mov eax, dword ptr ds:[ecx+eax*8])
a more complicated example and it still works like a charm…
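If you want to double-check that last one by hand, the ModRM and SIB bytes can be split with a few shifts and masks. The Python below is a toy sketch that handles only this one instruction shape (67 8B with mod=00 and a SIB byte); the function names are mine, and the special cases (no index, disp32 base) are rejected rather than decoded.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def split_modrm(byte: int):
    """ModRM layout: mod (bits 7-6), reg (bits 5-3), r/m (bits 2-0)."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def split_sib(byte: int):
    """SIB layout: scale (bits 7-6, as a power of two), index (5-3), base (2-0)."""
    return 1 << (byte >> 6), (byte >> 3) & 7, byte & 7


def decode_mov_r32_sib(raw: bytes) -> str:
    """Decode only '67 8B modrm sib' with mod=00 and r/m=100 (toy example)."""
    if len(raw) != 4 or raw[0] != 0x67 or raw[1] != 0x8B:
        raise ValueError("this sketch only understands 67 8B /r with a SIB byte")
    mod, reg, rm = split_modrm(raw[2])
    if mod != 0 or rm != 4:
        raise ValueError("expected mod=00 and r/m=100 (SIB follows)")
    scale, index, base = split_sib(raw[3])
    if index == 4 or base == 5:
        raise ValueError("no-index and disp32 special cases are not handled")
    return f"mov {REG32[reg]}, dword ptr [{REG32[base]}+{REG32[index]}*{scale}]"


print(decode_mov_r32_sib(bytes.fromhex("67 8B 04 C1")))
# mov eax, dword ptr [ecx+eax*8]
```

The 0x67 prefix is what switches the address calculation to 32-bit registers in 64-bit mode, which is why ecx and eax show up inside the brackets.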
Isn’t that cool?
Joel, you really killed it! Touché!