Binary arithmetic
Binary addition
    1111
  1001011
+ 1100101
────────────
1 0110000
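In this case, we had an extra carry out of the leftmost column, so the 8-bit result is one bit wider than the 7-bit operands. When the operands already fill the register (e.g., all 8 bits of a byte), such a carry means the true result of the addition is too big to fit.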
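The column-by-column procedure can be sketched in Python, used here only as a modelling language. The helper name `add_bits` and its `width` parameter are illustrative assumptions, not something the notes define.

```python
def add_bits(a: str, b: str, width: int) -> tuple[int, str]:
    """Add two binary strings column by column, right to left.

    Returns (carry_out, result), where result has exactly `width` bits.
    """
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    digits = []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        digits.append(str(total % 2))  # bit written under this column
        carry = total // 2             # carry into the next column
    return carry, "".join(reversed(digits))


# The 7-bit example from above: the carry out is the extra leading 1.
print(add_bits("1001011", "1100101", width=7))  # (1, '0110000')
```

A carry out of 1 is exactly the situation the carry flag (CF) reports after an `add`.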
Binary subtraction
Subtract-with-borrow, but see also the negate-then-add method below.
Subtraction follows a similar pattern, but with “borrowing” instead of carrying. E.g.,
  110110
- 100001
────────────
0 - 1 = -1, so we borrow a 1 from the next column (i.e., we are doing 10 - 1 = 1):
      1
  110110
- 100001
────────────
       1
1 - 0 = 1, less the 1 we borrowed, leaves 0:
      1
  110110
- 100001
────────────
      01
1 - 0 = 1:
      1
  110110
- 100001
────────────
     101
0 - 0 = 0:
      1
  110110
- 100001
────────────
    0101
1 - 0 = 1:
      1
  110110
- 100001
────────────
   10101
And 1 - 1 = 0 (we could drop the leading 0 in the answer):
      1
  110110
- 100001
────────────
  010101
It’s possible to end up with an extra “borrow”, indicating underflow.
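As a rough sketch, the borrowing procedure can be modelled in Python. The function name `sub_bits` is a hypothetical helper chosen for this illustration.

```python
def sub_bits(a: str, b: str) -> tuple[int, str]:
    """Subtract binary string b from a, column by column, right to left.

    Returns (borrow_out, result); borrow_out == 1 signals underflow.
    """
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    borrow = 0
    digits = []
    for x, y in zip(reversed(a), reversed(b)):
        diff = int(x) - int(y) - borrow
        borrow = 1 if diff < 0 else 0          # must borrow from the next column
        digits.append(str(diff + 2 * borrow))  # a borrow adds binary 10 to this column
    return borrow, "".join(reversed(digits))


print(sub_bits("110110", "100001"))  # (0, '010101')
print(sub_bits("0001", "0010"))      # (1, '1111'): extra borrow, i.e. underflow
```

The second call computes 1 - 2 in four bits: the leftover borrow is what the carry flag reports after a `sub`.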
Binary to decimal, and the reverse
To binary: repeatedly divide by two. Each remainder is the next bit of the result, starting from the lowest bit.
| Input | Remainder | Binary |
|---|---|---|
| 1234 | 0 | __________0 |
| 617 | 1 | _________10 |
| 308 | 0 | ________010 |
| 154 | 0 | _______0010 |
| 77 | 1 | ______10010 |
| 38 | 0 | _____010010 |
| 19 | 1 | ____1010010 |
| 9 | 1 | ___11010010 |
| 4 | 0 | __011010010 |
| 2 | 0 | _0011010010 |
| 1 | 1 | 10011010010 |
To decimal: multiply each bit by its power of two and add up the results.
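Both directions can be sketched in a few lines of Python. The function names `to_binary` and `to_decimal` are illustrative, not part of the notes.

```python
def to_binary(n: int) -> str:
    """Repeatedly divide by 2; each remainder is the next bit, lowest bit first."""
    if n == 0:
        return "0"
    bits = ""
    while n > 0:
        n, remainder = divmod(n, 2)
        bits = str(remainder) + bits  # new bits go on the left, as in the table
    return bits


def to_decimal(bits: str) -> int:
    """Multiply each bit by its power of two and sum the products."""
    return sum(int(bit) * 2**power for power, bit in enumerate(reversed(bits)))


print(to_binary(1234))            # 10011010010
print(to_decimal("10011010010"))  # 1234
```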
Two’s complement representation
Represent the negation of a value by
1. Flipping all the bits
2. Adding 1
E.g., 00110110 negated gives
00110110
11001001 Flip all bits
11001010 Add 1
Negative values will always have the high bit set.
Addition/subtraction can be done normally. (To do subtraction, just negate the second operand and then add.)
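A minimal Python model of 8-bit two's complement, assuming a byte-sized register. The names `negate`, `subtract` and `as_signed` are hypothetical helpers for this sketch.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0xFF: keep only the low 8 bits


def negate(x: int) -> int:
    """Two's complement negation: flip all bits, then add 1."""
    return ((x ^ MASK) + 1) & MASK


def subtract(a: int, b: int) -> int:
    """a - b computed as a + (-b), discarding any carry out of the top bit."""
    return (a + negate(b)) & MASK


def as_signed(x: int) -> int:
    """Interpret an 8-bit pattern as signed (high bit set means negative)."""
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


print(format(negate(0b00110110), "08b"))  # 11001010
print(as_signed(negate(0b00110110)))      # -54
print(subtract(0b00110110, 0b00100001))   # 21
```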
Registers, and their uses
Syscall register use
| 64-bits | Low 32-bits | Low 16-bits | Low 8-bits | Comment |
|---|---|---|---|---|
| rax | eax | ax | al | Accumulator; syscall code and return |
| rbx | ebx | bx | bl | Base |
| rcx | ecx | cx | cl | Count (syscall clobbered) |
| rdx | edx | dx | dl | Dword accum.; 3rd syscall arg. |
| rsi | esi | si | sil | Source index; 2nd syscall arg. |
| rdi | edi | di | dil | Dest. index; 1st syscall arg. |
| rbp | ebp | bp | bpl | Stack base pointer |
| rsp | esp | sp | spl | Stack pointer |
| r8 | r8d | r8w | r8b | 5th syscall arg. |
| r9 | r9d | r9w | r9b | 6th syscall arg. |
| r10 | r10d | r10w | r10b | 4th syscall arg. |
| r11 | r11d | r11w | r11b | (syscall clobbered) |
| … | … | … | … | |
| r15 | r15d | r15w | r15b | |
The first four registers allow access to their second byte (the high byte of
the word-sized register): ah, bh, ch, dh. These cannot be used in the same
instruction as any of the newer registers, i.e., r8 through r15 in any size, or
sil, dil, bpl, spl (e.g., mov r15b, ah is invalid).
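How the narrower names overlap the 64-bit register can be shown with a small Python model. The function `views` is a hypothetical helper; it only reads the sub-registers and does not model what happens on writes.

```python
def views(rax: int) -> dict[str, int]:
    """Return the sub-register views of a 64-bit rax value."""
    return {
        "rax": rax & 0xFFFF_FFFF_FFFF_FFFF,
        "eax": rax & 0xFFFF_FFFF,  # low 32 bits
        "ax": rax & 0xFFFF,        # low 16 bits
        "al": rax & 0xFF,          # low 8 bits
        "ah": (rax >> 8) & 0xFF,   # second byte (high byte of ax)
    }


for name, value in views(0x1122334455667788).items():
    print(f"{name:>3} = {value:#x}")
```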
C-style functions
Registers:
| Register | Use |
|---|---|
| rax | Return value |
| rbx | Callee-preserved |
| rcx | 4th argument |
| rdx | 3rd argument |
| rsi | 2nd argument |
| rdi | 1st argument |
| rbp | Callee-preserved |
| rsp | Stack pointer |
| r8 | 5th argument |
| r9 | 6th argument |
| r10 | Temporary (caller-preserved) |
| r11 | Temporary (caller-preserved) |
| r12-r15 | Callee-preserved |
Stack (rsp) must be aligned to a multiple of 16 immediately before any call.
Because call pushes the 8-byte return address, rsp is at a multiple of 16, plus 8,
immediately after function entry, so usually we can just do either
sub rsp, 8
or
push rbp
mov rbp, rsp
Either way, we have to undo the process (add rsp, 8 or pop rbp) before returning (ret).
Arithmetic operations
add
sub
inc
dec
mul and div (and imul and idiv) and their register usage.
mul rm ; Multiply rax by rm, store the 128-bit product into rdx:rax
div rm ; Divide rdx:rax by rm, store the results back into rax and rdx
Note that div uses rdx:rax as a 128-bit input; if your dividend fits in rax
alone, you should zero rdx before the operation (or, for idiv, sign-extend rax
into rdx with cqo).
Division stores the quotient into rax and the remainder (modulo) into rdx.
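The register usage of the unsigned forms can be modelled with Python's arbitrary-precision integers. This is a sketch under stated assumptions: `mul` and `div` here are plain Python functions named after the instructions, and the `OverflowError` stands in for the fault the CPU raises when the quotient does not fit in rax.

```python
MASK64 = (1 << 64) - 1


def mul(rax: int, rm: int) -> tuple[int, int]:
    """Unsigned mul: rax * rm gives a 128-bit product, returned as (rdx, rax)."""
    product = rax * rm
    return product >> 64, product & MASK64


def div(rdx: int, rax: int, rm: int) -> tuple[int, int]:
    """Unsigned div: rdx:rax / rm, returned as (quotient for rax, remainder for rdx)."""
    dividend = (rdx << 64) | rax
    quotient, remainder = divmod(dividend, rm)
    if quotient > MASK64:
        raise OverflowError("quotient does not fit in rax")
    return quotient, remainder


print(mul(2**63, 4))  # (2, 0): the product 2**65 spills into rdx
print(div(0, 17, 5))  # (3, 2): rdx zeroed first, so this is just 17 / 5
```

Calling `div(5, 17, 5)` in this model raises `OverflowError`, which illustrates why leftover garbage in rdx is a problem: the dividend is then far larger than intended.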
Comparisons
cmp – basically just a sub which discards the results (but keeps the flags).
test – just an and which discards the results. Mostly useful for testing
individual bits.
Flags and their meanings: CF, OF, SF, ZF
CF – Carry flag, set if addition/subtraction generated an extra carry/borrow. Indicates overflow for unsigned arithmetic operations. Meaningless if the operands are signed.
OF – Overflow flag, set if addition/subtraction generated an overflow when interpreted as signed. (I.e., set if the sign of the result is not what it should be.) Meaningless for unsigned operations.
SF – Sign flag, copy of the high bit of the result.
ZF – Zero flag, set if the result = 0.
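To see how the four flags come out of one operation, here is a hedged Python model of 8-bit `add` and `sub` (`cmp` sets the same flags as `sub`). The function `flags` and its parameters are assumptions made for this sketch.

```python
def flags(a: int, b: int, op: str = "sub", bits: int = 8) -> dict[str, int]:
    """Model CF, OF, SF, ZF after `add a, b` or `sub a, b` on `bits`-wide values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a, b = a & mask, b & mask
    if op == "add":
        result = (a + b) & mask
        cf = a + b > mask                    # carry out of the top bit
        of = (a ^ result) & (b ^ result) & sign  # same-sign inputs, different-sign result
    elif op == "sub":
        result = (a - b) & mask
        cf = a < b                           # borrow into the top bit
        of = (a ^ b) & (a ^ result) & sign   # different-sign inputs, result sign differs from a
    else:
        raise ValueError(f"unsupported operation: {op}")
    return {
        "CF": int(cf),
        "OF": int(of != 0),
        "SF": int(result & sign != 0),
        "ZF": int(result == 0),
    }


print(flags(0x7F, 0x01, "add"))  # {'CF': 0, 'OF': 1, 'SF': 1, 'ZF': 0}
print(flags(0xFF, 0x01, "add"))  # {'CF': 1, 'OF': 0, 'SF': 0, 'ZF': 1}
```

The two calls show the split between the interpretations: 127 + 1 overflows only as a signed value (OF), while 255 + 1 overflows only as an unsigned value (CF).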
Condition codes
a, b – “above” and “below” for unsigned comparisons. a means CF == 0 and ZF == 0 (if a > b then a - b > 0 with no borrow). ae means CF == 0, ZF ignored. be means CF == 1 or ZF == 1. b means CF == 1, ZF ignored.
l, g – “less than” and “greater than” for signed comparisons. l means SF != OF, le means SF != OF or ZF == 1. g means SF == OF and ZF == 0, ge means SF == OF.
e, ne – “equal” and “not equal”. e means ZF == 1 (a - a = 0) and ne means ZF == 0.
There are condition codes for each of the flags. E.g., ns means SF == 0.
There are negated forms of all the conditions. E.g., nae means “not above-or-equal” and is equivalent to b.
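The flag tests behind the condition codes can be written out as a Python table and checked against an 8-bit `cmp`. The names `CONDITIONS`, `cmp8` and `taken` are hypothetical; the flag formulas follow the standard x86 definitions.

```python
CONDITIONS = {
    "e":  lambda f: f["ZF"] == 1,
    "ne": lambda f: f["ZF"] == 0,
    "a":  lambda f: f["CF"] == 0 and f["ZF"] == 0,
    "ae": lambda f: f["CF"] == 0,
    "b":  lambda f: f["CF"] == 1,
    "be": lambda f: f["CF"] == 1 or f["ZF"] == 1,
    "g":  lambda f: f["SF"] == f["OF"] and f["ZF"] == 0,
    "ge": lambda f: f["SF"] == f["OF"],
    "l":  lambda f: f["SF"] != f["OF"],
    "le": lambda f: f["SF"] != f["OF"] or f["ZF"] == 1,
}


def cmp8(a: int, b: int) -> dict[str, int]:
    """Flags after an 8-bit `cmp a, b` (computes a - b, discards the result)."""
    a, b = a & 0xFF, b & 0xFF
    result = (a - b) & 0xFF
    return {
        "CF": int(a < b),
        "OF": int(((a ^ b) & (a ^ result) & 0x80) != 0),
        "SF": result >> 7,
        "ZF": int(result == 0),
    }


def taken(cc: str, a: int, b: int) -> bool:
    """Would `cmp a, b` followed by `jCC` jump?"""
    return CONDITIONS[cc](cmp8(a, b))


# 0xFF is 255 unsigned but -1 signed, so the two families disagree:
print(taken("a", 0xFF, 1))  # True: 255 is above 1
print(taken("g", 0xFF, 1))  # False: -1 is not greater than 1
```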
Jumps and branches
jmp target: unconditional jump
jCC target: conditional jump, replace CC with condition code
loop target: decrement rcx, jump to target if not 0.
Memory operands and arrays/strings
[displacement + scale*offset + base]
displacement is an immediate address (typically a label).
scale is 1, 2, 4 or 8. If omitted, 1 is assumed.
offset is the offset register.
base is the base register.
Memory-memory operations are generally forbidden.
lea reg, mem computes the effective address of mem (i.e., does the math)
and then stores the address, not the value in memory, into reg.
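The address arithmetic that a memory operand (and therefore `lea`) performs can be sketched as a Python function. The name `effective_address` and the example address 0x1000 are made up for illustration.

```python
def effective_address(displacement: int = 0, base: int = 0,
                      offset: int = 0, scale: int = 1) -> int:
    """Compute displacement + scale*offset + base, wrapped to 64 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (displacement + scale * offset + base) & (2**64 - 1)


# Element 3 of an array of qwords at a hypothetical label `table` = 0x1000,
# i.e. the address used by [table + 8*rcx] with rcx = 3:
print(hex(effective_address(displacement=0x1000, offset=3, scale=8)))  # 0x1018
```

A `mov rax, [table + 8*rcx]` would load the qword stored at that address, whereas `lea rax, [table + 8*rcx]` would leave 0x1018 itself in rax.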
String operations
String operations implicitly use [rdi] and [rsi] as their operands.
| Instruction | Description |
|---|---|
| lodsb | Load byte [rsi] into al |
| stosb | Write byte from al into byte [rdi] |
| movsb | Copy byte from [rsi] to [rdi] |
| cmpsb | Compare byte [rsi] with byte [rdi] and update flags |
| scasb | Compare al with byte [rdi] and update flags |
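Replace b with w for word-sized, d for dword, q for qword.
All of these implicitly advance rsi and/or rdi (whichever they use) by the operand size: forward when the direction flag is clear (the usual state, set with cld), backward when it is set (std).
Repetition prefixes:
rep – Repeat rcx many times. Can be used with lodsb, stosb, movsb.
repe/repne – Repeat rcx many times, or until the comparison is not equal (repe) or equal (repne). Can be used with cmpsb and scasb.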
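As a rough model of how the repetition prefixes drive rsi, rdi and rcx, the following Python sketch treats memory as a `bytearray` and registers as plain integers. It assumes the direction flag is clear, and the function names are simply the instruction names with underscores.

```python
def rep_movsb(mem: bytearray, rsi: int, rdi: int, rcx: int) -> tuple[int, int, int]:
    """Model `rep movsb`: copy rcx bytes from [rsi] to [rdi]."""
    while rcx != 0:
        mem[rdi] = mem[rsi]  # movsb: byte [rdi] = byte [rsi]
        rsi += 1
        rdi += 1
        rcx -= 1
    return rsi, rdi, rcx


def repne_scasb(mem: bytearray, rdi: int, rcx: int, al: int) -> tuple[int, int, bool]:
    """Model `repne scasb`: scan until byte [rdi] == al or rcx hits 0.

    Returns (rdi, rcx, ZF).
    """
    zf = False
    while rcx != 0:
        zf = mem[rdi] == al  # scasb: compare al with byte [rdi]
        rdi += 1             # rdi advances even when the bytes match
        rcx -= 1
        if zf:               # repne stops once the comparison is equal
            break
    return rdi, rcx, zf


mem = bytearray(b"hello\0.....")
rdi, rcx, zf = repne_scasb(mem, rdi=0, rcx=len(mem), al=0)
length = rdi - 1             # rdi stopped one past the terminator
rep_movsb(mem, rsi=0, rdi=6, rcx=length)
print(length, bytes(mem))    # 5 b'hello\x00hello'
```

This is the familiar strlen-then-copy pattern: scan for the zero byte, derive the length from how far rdi moved, then copy that many bytes.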
Structures and alignments
struc/endstruc – Shortcut for defining a bunch of equ definitions. E.g.,
struc thing
a: resb 1
b: resb 1
c: resw 1
d: resd 1
e: resq 1
endstruc
defines the following constants:
thing: equ 0
a: equ 0
b: equ 1
c: equ 2
d: equ 4
e: equ 8
thing_size: equ 16
To be C-compatible, the elements of a structure must be aligned (placed
in memory at a multiple of their size). So a qword member must start at a multiple
of 8. Padding bytes can be added with extra resbs or with the alignb directive
(align's counterpart for reserved space).
Instances of structures must be placed in memory at the structure's alignment, which is the size of the largest element of the structure. E.g., the above structure would have 8-byte alignment.
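One way to check a layout against what a C compiler would produce is Python's `ctypes` module, which applies the platform's C alignment rules. The class names `Thing` and `Padded` are illustrative, and the figures in the comments assume a typical x86-64 platform.

```python
import ctypes


class Thing(ctypes.Structure):
    _fields_ = [
        ("a", ctypes.c_uint8),   # resb 1
        ("b", ctypes.c_uint8),   # resb 1
        ("c", ctypes.c_uint16),  # resw 1
        ("d", ctypes.c_uint32),  # resd 1
        ("e", ctypes.c_uint64),  # resq 1
    ]


class Padded(ctypes.Structure):
    # A byte followed directly by a qword: C inserts padding before `value`.
    _fields_ = [("flag", ctypes.c_uint8), ("value", ctypes.c_uint64)]


offsets = {name: getattr(Thing, name).offset for name, _ in Thing._fields_}
print(offsets)                   # {'a': 0, 'b': 1, 'c': 2, 'd': 4, 'e': 8}
print(ctypes.sizeof(Thing))      # 16
print(ctypes.alignment(Thing))   # typically 8 on x86-64
print(Padded.value.offset)       # typically 8 on x86-64: 7 padding bytes after flag
```

The `Thing` layout needs no padding because each member already falls on a multiple of its size; `Padded` shows the case where an assembly definition would need extra reserved bytes to match C.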
Floating-point operations
Floating point registers are xmm0 through xmm15. Operations are
suffixed with their operand size: ss for single-precision (float), sd for
double-precision (double).
Use movss, movsd to move float values into and out of registers. There are no
float immediates; store floating-point constants in the .data section and
then movs* them into a register.
addss dest, src ; dest += src (float)
addsd dest, src ; dest += src (double)
subss dest, src ; dest -= src (float)
subsd dest, src ; dest -= src (double)
mulss dest, src ; dest *= src (float)
mulsd dest, src ; dest *= src (double)
divss dest, src ; dest /= src (float)
divsd dest, src ; dest /= src (double)
All of these are also available in three-operand (AVX) forms:
vaddss dest, src1, src2 ; dest = src1 + src2
vaddsd dest, src1, src2 ; dest = src1 + src2
vsubss dest, src1, src2 ; dest = src1 - src2
vsubsd dest, src1, src2 ; dest = src1 - src2
vmulss dest, src1, src2 ; dest = src1 * src2
vmulsd dest, src1, src2 ; dest = src1 * src2
vdivss dest, src1, src2 ; dest = src1 / src2
vdivsd dest, src1, src2 ; dest = src1 / src2
Comparisons use ucomiss, ucomisd which update the flags as if for an
unsigned comparison.
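The difference between the ss and sd suffixes, and the flag pattern of a floating-point comparison, can be sketched in Python. Python floats are doubles, so single precision is emulated here by rounding through the `struct` module. The function names mirror the instructions but are only models, and the parity flag (PF) reported for NaN operands is standard x86 behavior that the notes above do not cover.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def addss(dest: float, src: float) -> float:
    """Model of addss: operands and result are single precision."""
    return to_single(to_single(dest) + to_single(src))


def addsd(dest: float, src: float) -> float:
    """Model of addsd: ordinary double-precision addition."""
    return dest + src


def ucomisd(a: float, b: float) -> dict[str, int]:
    """Flags after `ucomisd a, b`; a NaN operand gives the 'unordered' result."""
    if a != a or b != b:  # NaN is the only value not equal to itself
        return {"ZF": 1, "PF": 1, "CF": 1}
    return {"ZF": int(a == b), "PF": 0, "CF": int(a < b)}


print(addsd(0.1, 0.2) == 0.3)             # False: rounding error in double
print(addss(0.1, 0.2) == to_single(0.3))  # True: the single-precision sum rounds to 0.3f
print(ucomisd(1.5, 2.5))                  # {'ZF': 0, 'PF': 0, 'CF': 1}
```

Since "less than" shows up as CF == 1, the unsigned condition codes (a, b, ae, be) are the ones that pair with these comparisons.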
Bitwise operations
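and, or, not, xor, andn (three-operand: inverts the first source, then ANDs it with the second; requires the BMI1 extension).
These set flags (except not, which leaves the flags unchanged), so they can be followed directly by a conditional jump (e.g., and followed by jz).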
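A small Python model of these operations on byte-sized values, with results masked to 8 bits because Python integers are unbounded. The helper names `not_`, `andn` and `is_bit_set` are assumptions for this sketch.

```python
MASK = 0xFF  # model 8-bit registers


def not_(x: int) -> int:
    """not: flip every bit."""
    return ~x & MASK


def andn(src1: int, src2: int) -> int:
    """andn: dest = (not src1) and src2."""
    return ~src1 & src2 & MASK


def is_bit_set(x: int, n: int) -> bool:
    """What `test x, 1 << n` followed by jnz checks."""
    return (x & (1 << n)) != 0


value = 0b0011_0110
print(format(value & 0b0000_1111, "08b"))  # 00000110  and: keep the low nibble
print(format(value | 0b1000_0000, "08b"))  # 10110110  or: set the high bit
print(format(value ^ 0b0000_0110, "08b"))  # 00110000  xor: toggle bits 1 and 2
print(format(not_(value), "08b"))          # 11001001  not: flip all bits
```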
Shifts and rotates
shl – Shift left, fill in low bits with 0
shr – Shift right, fill in high bits with 0
sar – Shift arithmetic right, for signed values, fill high bits with copies of the existing sign bit.
rol, ror – Rotate left/right.
The shift/rotate amount can either be an immediate or the byte-sized register cl.
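The five operations can be compared side by side in a Python model of an 8-bit register. This is a sketch: the functions are named after the instructions, and details such as how the hardware masks large shift counts are not modelled.

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl(x: int, n: int) -> int:
    return (x << n) & MASK              # low bits filled with 0


def shr(x: int, n: int) -> int:
    return (x & MASK) >> n              # high bits filled with 0


def sar(x: int, n: int) -> int:
    x &= MASK
    if x & (1 << (BITS - 1)):           # negative: shift in copies of the sign bit
        return ((x | ~MASK) >> n) & MASK
    return x >> n


def rol(x: int, n: int) -> int:
    n %= BITS
    x &= MASK
    return ((x << n) | (x >> (BITS - n))) & MASK  # bits leaving the top re-enter at the bottom


def ror(x: int, n: int) -> int:
    n %= BITS
    x &= MASK
    return ((x >> n) | (x << (BITS - n))) & MASK  # bits leaving the bottom re-enter at the top


x = 0b10010110
for op in (shl, shr, sar, rol, ror):
    print(f"{op.__name__} {format(op(x, 1), '08b')}")
```

For this input, shr and ror happen to agree because the bit shifted out is 0, while sar differs from shr by keeping the sign bit set.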