x86 Programming

Assemblers

Three major 'families' of assembly languages with different syntax

AT&T - used in Unix systems

destination argument is the last one
register names start with %
explicit argument size specification in instruction name - MOVB, MOVL
memory operands written as displacement(base, index, scale) - 8(%ebp), (%ebx,%esi,4)

Intel/Microsoft (MASM) - as in Intel documents

memory displacement may appear before brackets - 8[ebp]
interpretation of a symbolic argument (address or memory contents, operand size) depends on how the symbol was declared

NASM - similar to Intel, simpler, no ambiguities

memory references always use []

MASM vs NASM syntax differences

Explicit specification of argument sizes

For most instructions the argument size is determined by the register name used
mov eax, [ebx] ; load 32-bit word
add dl, [esi] ; add byte
If an instruction has no register arguments, or its arguments have different lengths, the size of the memory argument may be impossible to determine
inc [ebx] ; unspecified size - any size possible
movzx eax, [esi] ; may be byte or 16-bit word
In these cases an explicit argument size specification is required
In NASM syntax, byte, word, dword or qword is written before the argument
inc word [ebx] ; 16-bit word
movzx eax, byte [esi] ; byte
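Why the size keyword matters can be sketched in C. The functions below are only a model of the instructions named in the comments (the function names are made up for this illustration): the same address gives different results depending on whether it is treated as a byte or as a 16-bit little-endian word.

```c
#include <stdint.h>

/* Model of `movzx eax, byte [esi]`: read one byte, zero-extend to 32 bits. */
uint32_t movzx_byte(const uint8_t *p) {
    return (uint32_t)p[0];
}

/* Model of `movzx eax, word [esi]`: read a 16-bit little-endian word,
   zero-extend to 32 bits. */
uint32_t movzx_word(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8);
}

/* Model of `inc byte [ebx]`: only one byte changes, the carry out is lost. */
void inc_byte(uint8_t *p) {
    p[0] = (uint8_t)(p[0] + 1u);
}

/* Model of `inc word [ebx]`: the carry propagates into the second byte. */
void inc_word(uint8_t *p) {
    uint16_t v = (uint16_t)(p[0] | ((unsigned)p[1] << 8));
    v = (uint16_t)(v + 1u);
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
}
```

With the bytes FF 12 in memory, the byte read yields 0xFF and the word read yields 0x12FF; incrementing as a byte leaves 00 12, incrementing as a word leaves 00 13.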


Zeroing and testing register value

Zeroing - xor eax, eax

(xor a register with itself)
Binary encoding shorter than mov eax, 0 (2 bytes vs. 5)
In 64-bit mode xor eax, eax clears the whole rax
32-bit instruction encoding is usually shorter than 64-bit (no REX prefix needed)
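The 64-bit behaviour can be modelled in C as a rough sketch. Here rax is represented by a uint64_t, and the function names are hypothetical helpers for this illustration only. A write to a 32-bit register zero-extends into the full 64-bit register, whereas a 16-bit write (shown for contrast) keeps the upper bits.

```c
#include <stdint.h>

/* Model of writing eax in 64-bit mode: the result is zero-extended,
   so the old upper 32 bits of rax are discarded. */
uint64_t write_eax(uint64_t rax, uint32_t value) {
    (void)rax; /* the previous contents do not survive */
    return (uint64_t)value;
}

/* Model of writing ax: only the low 16 bits change, the rest is preserved. */
uint64_t write_ax(uint64_t rax, uint16_t value) {
    return (rax & ~(uint64_t)0xFFFF) | (uint64_t)value;
}

/* Model of `xor eax, eax`: eax ^ eax is 0, written back as a 32-bit result. */
uint64_t xor_eax_eax(uint64_t rax) {
    uint32_t eax = (uint32_t)rax;
    return write_eax(rax, eax ^ eax);
}
```

In this model xor_eax_eax returns 0 for any input, while write_ax(rax, 0) leaves the upper 48 bits of rax unchanged.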

Testing for zero/non-zero/negative value

test eax, eax (bitwise AND with itself, no result stored)
js negative - jump if the sign flag is set (value is negative)
jz zero - jump if the zero flag is set (value is zero)
Modern x86 processors optimize test followed by a conditional jump - the pair is fused into a single internal operation (macro-fusion)

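No need to test for 0 or sign after most arithmetic/logic instructions (add, sub, and, or, xor, inc, dec, neg) - the flags are already set. Exceptions: not and lea do not change flags, mul/imul leave ZF and SF undefined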
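A small C model, assuming only ZF and SF are of interest, shows why a separate test is redundant after an instruction such as sub. The Flags struct and function names are invented for this sketch and do not model the other flags (CF, OF, PF, AF).

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool zf; /* zero flag: result is 0 */
    bool sf; /* sign flag: top bit of the result */
} Flags;

/* ZF and SF are derived from the 32-bit result only. */
Flags flags_from_result(uint32_t r) {
    Flags f;
    f.zf = (r == 0);
    f.sf = ((r >> 31) & 1u) != 0;
    return f;
}

/* Model of `test a, b`: bitwise AND, result discarded, flags kept. */
Flags test32(uint32_t a, uint32_t b) {
    return flags_from_result(a & b);
}

/* Model of `sub a, b`: result stored, ZF/SF set from that result. */
Flags sub32(uint32_t a, uint32_t b, uint32_t *result) {
    *result = a - b;
    return flags_from_result(*result);
}
```

In this model the flags returned by sub32 always equal those of test32(result, result), so js/jz can follow the sub directly.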


LEA usage

3-argument multiplication by 2, 3, 4, 5, 8, 9
3-argument left shift by 1, 2, 3
3- and 4-argument addition
3-argument logical OR/XOR with optional left shift by 1, 2, 3 for one source argument

if 1's in source arguments don't overlap, OR is equivalent to XOR or addition
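The uses listed above all follow from the address formula base + index*scale + displacement, with scale restricted to 1, 2, 4 or 8. The C sketch below models that formula with 32-bit wraparound; the function names are chosen for this illustration, and the assembly in the comments shows one possible NASM form of each case.

```c
#include <stdint.h>

/* Model of `lea dst, [base + index*scale + disp]` (scale is 1, 2, 4 or 8).
   Unsigned arithmetic wraps modulo 2^32, like the 32-bit instruction. */
uint32_t lea(uint32_t base, uint32_t index, uint32_t scale, uint32_t disp) {
    return base + index * scale + disp;
}

/* Multiplication by 3, 5, 9: x + x*2, x + x*4, x + x*8. */
uint32_t mul3(uint32_t x) { return lea(x, x, 2, 0); } /* lea eax, [ebx+ebx*2] */
uint32_t mul5(uint32_t x) { return lea(x, x, 4, 0); } /* lea eax, [ebx+ebx*4] */
uint32_t mul9(uint32_t x) { return lea(x, x, 8, 0); } /* lea eax, [ebx+ebx*8] */

/* Left shift by 3 (multiplication by 8) with a separate destination. */
uint32_t shl3(uint32_t x) { return lea(0, x, 8, 0); } /* lea eax, [ebx*8] */

/* 4-argument addition: dst = a + b + constant. */
uint32_t add_const(uint32_t a, uint32_t b, uint32_t c) {
    return lea(a, b, 1, c);                           /* lea eax, [ebx+ecx+12] */
}

/* OR of `low` with `high << 3`. Valid only when the set bits do not overlap,
   here when low < 8, because then addition produces no carries. */
uint32_t or_shl3(uint32_t low, uint32_t high) {
    return lea(low, high, 8, 0);                      /* lea eax, [ebx+ecx*8] */
}
```

For example, or_shl3(5, 6) gives 53, the same value as 5 | (6 << 3) and 5 ^ (6 << 3), because the bits of the two operands are disjoint.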