Endian Escapades



x86 stores every multi-byte value little-endian (low byte first), and that fact turns up everywhere you look at raw memory: hex dumps, debuggers, exploits. To work properly in these contexts, you need to understand byte order. This module makes it second nature.

You'll crack a series of programs that hide a password in their own compiled code, reading it back out of the disassembly at every size, whether it's a "qword", a lone byte, or a structure. By the end, byte order won't trip you up in a hex dump, a debugger, or an exploit ever again (or we'll add more challenges to make it so!).


Every value wider than a single byte (a 16-, 32-, or 64-bit number, a memory address, etc.) has to be split into individual bytes to live in memory, because memory is addressed one byte at a time. You are used to writing large decimal numbers (e.g., 1337) with the "most significant" digit on the left and the "least significant" digit on the right, so you might expect the CPU to do something similar. For example, if you were storing the 16-bit (2-byte) value 0x1234, you might expect it to be stored as two consecutive bytes, first 12 and then 34.

Some CPUs do work like this (they are called "Big Endian", or BE), but most do not. Most architectures store the least significant byte first, at the lowest address, and the most significant byte last. In these architectures, the value 0x1234 is stored in two consecutive bytes as 34 12. Because the "little" (least significant) end goes first, these are called "Little Endian" (LE) architectures, and they include x86 and, as commonly configured, most other modern CPU architectures.

Of course, though this seems extremely silly to anyone who encounters it for the first time, there are a number of solid reasons behind it:

  1. A lot of arithmetic is done from the little end. Consider long addition: you start from the least significant digits and carry the 1 to the left. An Arithmetic Logic Unit does the same thing, and Little Endian is natural here.
  2. A value's address is the address of its low byte, so reading a 4-byte int as a 1-byte char (or a 2-byte short) uses the same address --- you just read fewer bytes. In Big Endian systems, the low byte sits at the highest address, so the same narrowing read requires adjusting the address first.
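
The second reason can be checked with a small Python sketch. It is only a model, not how any CPU is implemented: `struct.pack` stands in for "store to memory", `struct.unpack_from` for "load from an address", and the value 0x41 is an arbitrary example.

```python
import struct

value = 0x41  # a small number stored as a 4-byte int

little = struct.pack("<I", value)  # bytes in memory: 41 00 00 00
big = struct.pack(">I", value)     # bytes in memory: 00 00 00 41

# Little-endian: narrower reads start at the same offset, 0.
assert struct.unpack_from("<B", little, 0)[0] == value
assert struct.unpack_from("<H", little, 0)[0] == value

# Big-endian: the low byte sits at the far end, so the offset must be adjusted.
assert struct.unpack_from(">B", big, 3)[0] == value
assert struct.unpack_from(">H", big, 2)[0] == value
```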

Hopefully, you're convinced. Now, get familiar with some more examples:

Size in bits (bytes)  Decimal value  Hex value           Big-endian bytes (NOT x86)  Little-endian bytes (x86)
8 (1 byte)            65             0x41                41                          41
16 (2 bytes)          4660           0x1234              12 34                       34 12
32 (4 bytes)          1145258561     0x44434241          44 43 42 41                 41 42 43 44
64 (8 bytes)          1145258561     0x0000000044434241  00 00 00 00 44 43 42 41     41 42 43 44 00 00 00 00

A single byte is identical in both columns: with only one byte there's no order to pick, so endianness only matters once a value is two bytes or wider. The 32- and 64-bit rows hold the same number --- the 64-bit version just pads with zero bytes, which, being the most-significant bytes, sit at the higher addresses.
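
If you want to generate more rows yourself, the following Python sketch reproduces the byte columns of the table. The helper name `show` is made up for this example; `int.to_bytes` and `bytes.hex` are the standard-library calls doing the work.

```python
def show(value, size):
    """Return (big-endian, little-endian) byte layouts of value as hex strings."""
    big = value.to_bytes(size, "big").hex(" ")
    little = value.to_bytes(size, "little").hex(" ")
    return big, little

for value, size in [(0x41, 1), (0x1234, 2), (0x44434241, 4), (0x44434241, 8)]:
    big, little = show(value, size)
    print(f"{size} byte(s)  {value:#x}  big: {big}  little: {little}")
```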

Endianness is very much a CPU-level concept. When you move beyond the CPU (e.g., when sending data over the network), big-endian encoding of numbers rears its head. And even in assembly, most of the time you don't have to think about endianness. For example, you've already stored and loaded multi-byte values without reversing byte order, because a value written to memory (say, mov rax, 0x1234; push rax) and read straight back (e.g., pop rax) comes out unchanged: push writes it to the stack in little-endian order, and pop endian-corrects it as it reads it back into the register. Endianness only matters in memory, but the moment you look at memory as bytes (in a hex dump, in a debugger, in an exploit) the byte order is right there, and you have to read it the way the CPU wrote it.

The easiest place to get turned around is the boundary between memory order and the value printed from a register. Suppose rdi points at these eight bytes:

Address    Byte
[rdi+0]    41
[rdi+1]    42
[rdi+2]    43
[rdi+3]    44
[rdi+4]    45
[rdi+5]    46
[rdi+6]    47
[rdi+7]    48

A 64-bit load reads those bytes starting at the lowest address:

mov rax, [rdi]

Because x86 is little-endian, [rdi+0] becomes the low byte of rax, [rdi+1] becomes the next byte, and so on. The register value is therefore 0x4847464544434241. Written as hex, the most-significant byte prints on the left, so the bytes look reversed compared to address order:

memory address order:  41 42 43 44 45 46 47 48
register hex order:   48 47 46 45 44 43 42 41
rax value:            0x4847464544434241

The bytes did not move in memory. The CPU interpreted the byte at the lowest address as the least-significant part of the number.
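
The same load can be modeled in Python as a sanity check. This is a sketch, with `memory` standing in for the eight bytes at rdi and `int.from_bytes` standing in for the CPU's interpretation of them.

```python
memory = bytes([0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48])  # [rdi+0] .. [rdi+7]

# What `mov rax, [rdi]` computes on x86: the lowest address is least significant.
rax = int.from_bytes(memory, "little")
print(hex(rax))  # 0x4847464544434241

# What a big-endian CPU would compute from the very same bytes.
print(hex(int.from_bytes(memory, "big")))  # 0x4142434445464748
```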

When you work in assembly, you're constantly choosing how many bytes an operation touches: one, two, four, or eight. x86 has a name for each of these sizes, and they're worth knowing, because you'll meet them everywhere in disassembly, in assembler size directives, and baked into the register names you already use.

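Name                Bits  Bytes  Partial rax access  Memory access
byte                8     1      mov al, [rdi]       mov BYTE PTR [rdi], 0x11
word                16    2      mov ax, [rdi]       mov WORD PTR [rdi], 0x1122
doubleword (dword)  32    4      mov eax, [rdi]      mov DWORD PTR [rdi], 0x11223344
quadword (qword)    64    8      mov rax, [rdi]      mov QWORD PTR [rdi], 0x11223344

(The qword store still writes 8 bytes, but its immediate is limited to 32 bits, sign-extended to 64. A full 64-bit constant has to be loaded into a register first and stored from there.)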

You've already met some of these registers: al is the low byte of rax, ax the low 2 bytes, eax the low 4, and rax all 8. The size names are just another way of saying the same thing --- al holds a byte, eax holds a dword, rax holds a qword.
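
As a rough model of how the partial registers overlap, the Python sketch below masks a 64-bit value down to each size. It covers only reading a partial register; the example value is arbitrary, and the variable names simply mirror the register names.

```python
rax = 0x1122334455667788

al = rax & 0xFF            # low 1 byte  (byte)
ax = rax & 0xFFFF          # low 2 bytes (word)
eax = rax & 0xFFFFFFFF     # low 4 bytes (dword)

print(hex(al), hex(ax), hex(eax), hex(rax))
# 0x88 0x7788 0x55667788 0x1122334455667788
```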

Why is a "word" 16 bits? The names trace back to Intel's early chips, each working in the chunk of data that was natural to it. The ancestors of the x86-64 processor (the 8008 (1972), and the 8080 after it) were 8-bit processors, moving data one 8-bit byte at a time, so the byte (8 bits) is where the sizes start. Intel's next upgrade was the 8086 (1978), a 16-bit chip. Its registers were 16 bits wide, and 16 bits was the natural chunk of data it handled. But because the 8086 was designed, for commercial reasons, so that existing 8080 assembly programs could be carried over to it (the compatibility was at the source level: programs were translated and reassembled, not run unmodified), the term "byte" had to remain 8 bits, and Intel needed a new term for the 16-bit width. They used "word". When 32-bit (the 386) and then 64-bit (x86-64) chips arrived, the designers again couldn't redefine "word" without breaking every program and assembler that already relied on it. Instead, they named the new sizes relative to that original word: a doubleword (dword) is two words (32 bits), and a quadword (qword) is four words (64 bits).

The "word" collision. Outside of x86's size directives, computer architects use "word" more loosely: the machine word (or "word size", or "word width") means the natural width a processor works in --- essentially its register width. By that definition, x86-64 is a "64-bit word" machine. So the same term points at two different sizes: in x86 assembly a WORD is 16 bits (no matter how wide the machine is), but "the machine's word" is the full register width --- 64 bits on x86-64. When you read "word", work out which is meant: the fixed 16-bit x86 size, or the loose "how wide is a register" sense. To avoid this confusion, a 16-bit value is often called a short instead (after the C type, which is 16 bits on x86). Byte, short, dword, and qword don't have this problem: in x86 contexts they mean 8, 16, 32, and 64 bits.

You have seen that byte, word, dword, and qword describe how many bytes an instruction reads or writes. Now add one more wrinkle: a smaller value can be copied into a larger register as either unsigned or signed.

If the byte is unsigned, filling the high bytes with zero is fine: 0x7f becomes 0x000000000000007f. But a signed byte uses two's complement. The byte 0xff is -1, so extending it to 64 bits must fill the new high bits with 1s: 0xffffffffffffffff.

That is sign extension. It copies the sign bit, not zeroes, into the new high bits. On x86-64, the form you need here is:

movsx rax, BYTE PTR [rdi]

This reads one byte from the address in rdi, treats that byte as signed, and leaves the sign-extended 64-bit value in rax.
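
To check the expected register contents for a given input byte, here is a minimal Python model of the two kinds of extension. The function names are invented for this sketch, and it models only the arithmetic, not the instruction itself.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def sign_extend_byte(b):
    """Model sign extension of one byte: return the resulting 64-bit pattern."""
    signed = int.from_bytes(bytes([b]), "little", signed=True)  # -128..127
    return signed & MASK64  # view the result as 64-bit two's complement

def zero_extend_byte(b):
    """Model zero extension of one byte: the high bits are simply zero."""
    return b & 0xFF

print(hex(sign_extend_byte(0x7F)))  # 0x7f
print(hex(sign_extend_byte(0xFF)))  # 0xffffffffffffffff
print(hex(zero_extend_byte(0xFF)))  # 0xff
```

The sign bit is the top bit of the byte, so every byte from 0x80 upward picks up a run of leading 1s when sign-extended.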

Write a function called solve that takes a pointer to one byte in rdi. Load that byte as a signed 8-bit value, sign-extend it to 64 bits, return it in rax, and export it with .global solve.

Build it into a shared library and hand it to the grader:

hacker@dojo:~$ as -o solve.o solve.s
hacker@dojo:~$ ld -shared -o solve.so solve.o
hacker@dojo:~$ /challenge/check solve.so

You just read how x86 stores multi-byte values little-endian --- low byte first. Time to use it: /challenge/reverse-me hides an 8-character password in a single qword, deep in its own code.

It loads your input and compares all 8 bytes at once against a hard-coded value:

movabs rbx, 0x4847464544434241
mov    rax, [rdi]
cmp    rax, rbx
jne    fail

(movabs is new, but it's just a mov: a normal mov's immediate maxes out at 32 bits, so the assembler uses this wider form --- "move absolute" --- when the constant fills all 64. Read it as a mov.)

That immediate is the password as the CPU read it from memory --- little-endian, so its bytes are the characters in reverse:

0x4847464544434241  ->  bytes 48 47 46 45 44 43 42 41  (high to low, as printed)
                    ->  low byte first: 41 42 43 44 45 46 47 48  ->  "ABCDEFGH"

Disassemble it, read that one movabs immediate, reverse its eight bytes into the password, and run it:

hacker@dojo:~$ objdump -d -M intel /challenge/reverse-me
hacker@dojo:~$ /challenge/reverse-me YOUR_PASSWORD_HERE

WARNING: /challenge/reverse-me is a SUID binary, so debugging it drops its privileges and the open("/flag") inside will silently fail under gdb. Use objdump to read it, but run it directly to get the flag.

You've seen that a multi-byte value sits in memory low byte first, so reading one back into a register reverses its bytes. A single register tops out at eight bytes, but plenty of values are wider than that (and wider than the CPU can easily access at one time).

Depending on the program, such values might be accessed sequentially, register-width by register-width. For example, consider a 16-byte password, as you will encounter in this challenge. Its sixteen ASCII bytes, which the program reads as two 8-byte qwords, might sit in memory like this:

Address Value
0x1337000 0x41
0x1337001 0x42
0x1337002 0x43
0x1337003 0x44
0x1337004 0x45
0x1337005 0x46
0x1337006 0x47
0x1337007 0x48
0x1337008 0x49
0x1337009 0x4a
0x133700a 0x4b
0x133700b 0x4c
0x133700c 0x4d
0x133700d 0x4e
0x133700e 0x4f
0x133700f 0x50

A string is stored in memory byte by byte, so its ordering is not affected by endianness. What endianness flips is how those bytes end up in a register after the mov: each 8-byte load is a multi-byte value, so it still reads back low byte first. So if rdi points to this buffer, mov rsi, [rdi] leaves rsi with the value 0x4847464544434241, and a following mov rsi, [rdi+8] leaves it with 0x504f4e4d4c4b4a49.

So to rebuild a longer value: keep the qwords in the order you get them, but reverse the bytes inside each one.
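
A short Python sketch of that rule, using the two example qwords from above (not the values in the actual challenge binary). It also shows the tempting mistake of treating the sixteen bytes as one big number and reversing all of it.

```python
qwords = [0x4847464544434241, 0x504F4E4D4C4B4A49]  # in the order they are read

# Right: reverse the bytes inside each qword, keep the qwords in order.
buffer = b"".join(q.to_bytes(8, "little") for q in qwords)
print(buffer.decode("ascii"))  # ABCDEFGHIJKLMNOP

# Wrong: glue the qwords into one 128-bit number and reverse the whole thing.
glued = (qwords[0] << 64) | qwords[1]
wrong = glued.to_bytes(16, "little")
print(wrong.decode("ascii"))   # IJKLMNOPABCDEFGH
```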

Disassemble /challenge/reverse-me, read the two qword values it checks your input against, endian-correct each one, concatenate them in order, and run it with the result:

hacker@dojo:~$ objdump -d -M intel /challenge/reverse-me
hacker@dojo:~$ /challenge/reverse-me YOUR_PASSWORD_HERE

WARNING: /challenge/reverse-me is a SUID binary, so debugging it drops its privileges and the open("/flag") inside will silently fail under gdb. Use objdump to read it, but run it directly to get the flag.

Endianness isn't special to 64-bit values. Every integer read from memory comes back little-endian, and the bytes that get reversed are exactly the ones covered by that read. For example, consider the following bytes, contiguously chilling in memory pointed to by rdi:

rdi -> 11 22 33 44 55 66 77 88

If you read all 8 bytes into, say, rsi with mov rsi, [rdi], rsi will have the value 0x8877665544332211. You could also read the same memory as two 32-bit (4-byte) values (into, say, esi and edx, the low 32 bits of rsi and rdx, respectively):

mov esi, [rdi]      ; results in 0x44332211 in esi
mov edx, [rdi+4]    ; results in 0x88776655 in edx

This makes sense written out, but it can confuse some people. Specifically, what does not happen is a reversal across the whole 8-byte span: each 4-byte read covers only its own four bytes, so the 0x88 at [rdi+7] lands in edx and never in esi.
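
The three reads can be modeled with Python's struct module, where "<Q" and "<I" mean little-endian 8-byte and 4-byte unsigned integers. This is an illustrative sketch; the variable names just mirror the registers in the text.

```python
import struct

memory = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])

(rsi,) = struct.unpack_from("<Q", memory, 0)  # mov rsi, [rdi]
(esi,) = struct.unpack_from("<I", memory, 0)  # mov esi, [rdi]
(edx,) = struct.unpack_from("<I", memory, 4)  # mov edx, [rdi+4]

print(hex(rsi))  # 0x8877665544332211
print(hex(esi))  # 0x44332211
print(hex(edx))  # 0x88776655
```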

You'll practice this in this challenge. /challenge/reverse-me checks the same kind of password four bytes at a time, as 32-bit values (termed "dwords"), so you get four integers instead of two. Each dword still reverses its own four bytes; the dwords themselves stay in address order, just like the qwords did. Because a dword fits the normal immediate size, each check is a direct cmp eax, 0x........ --- the value to recover is right there in the instruction:

mov eax, [rdi+0]
cmp eax, 0x44434241
jne fail

Disassemble /challenge/reverse-me, read the four cmp eax immediates, endian-correct each dword, concatenate them in address order, and run it with the result:

hacker@dojo:~$ objdump -d -M intel /challenge/reverse-me
hacker@dojo:~$ /challenge/reverse-me YOUR_PASSWORD_HERE

WARNING: /challenge/reverse-me is a SUID binary, so debugging it drops its privileges and the open("/flag") inside will silently fail under gdb. Use objdump to read it, but run it directly to get the flag.

Smaller still: /challenge/reverse-me now checks two bytes at a time, as words.

The rule never changes --- the bytes that reverse are exactly the ones in the read. A word read swaps its two bytes; the words stay in address order. (And a one-byte read has nothing to swap, which is why single bytes never need endian-correcting.)
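
To see that the rule really is the same at every width, the sketch below decodes one 8-byte example four ways. The helper `decode` is hypothetical, and the immediates are illustrative values for the string "ABCDEFGH", not the ones in the challenge.

```python
def decode(immediates, width):
    """Turn compare immediates (in address order) back into the bytes they test."""
    return b"".join(v.to_bytes(width, "little") for v in immediates)

as_qwords = decode([0x4847464544434241], 8)
as_dwords = decode([0x44434241, 0x48474645], 4)
as_words = decode([0x4241, 0x4443, 0x4645, 0x4847], 2)
as_bytes = decode([0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48], 1)

# Every read size describes the same eight bytes in memory.
print(as_qwords, as_dwords, as_words, as_bytes)
```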

Disassemble /challenge/reverse-me, read the eight cmp ax, 0x.... immediates, swap each pair, keep the words in address order, and run it:

hacker@dojo:~$ objdump -d -M intel /challenge/reverse-me
hacker@dojo:~$ /challenge/reverse-me YOUR_PASSWORD_HERE

WARNING: /challenge/reverse-me is a SUID binary, so debugging it drops its privileges and the open("/flag") inside will silently fail under gdb. Use objdump to read it, but run it directly to get the flag.

One byte at a time, in the 8-bit register al:

mov al, [rdi+0]
cmp al, 0x41
jne fail

And here's the payoff. A single byte has no order to reverse, so the immediates are the characters, already in order --- 0x41 is just 'A'. Sixteen byte-compares, and no endian-correcting at all.

This is the far end of the rule you've been applying: the reversal unit is the read size, so a one-byte read reverses nothing. Read the immediates straight down the disassembly and run it.

hacker@dojo:~$ objdump -d -M intel /challenge/reverse-me
hacker@dojo:~$ /challenge/reverse-me YOUR_PASSWORD_HERE

WARNING: /challenge/reverse-me is a SUID binary, so debugging it drops its privileges and the open("/flag") inside will silently fail under gdb. Use objdump to read it, but run it directly to get the flag.

Real programs rarely read a buffer at one uniform size. They read structs: a handful of fields of different sizes, laid out one after another in memory ("struct" is simply short for "structure").

This /challenge/reverse-me treats your password as a struct. You might not know the C programming language, but if you did, this is how you would define the structure:

struct { uint64_t a; uint32_t b; uint16_t c; uint8_t d; uint8_t e; };

The disassembly loads each field at its own width and offset:

movabs rbx, 0x................   ; a: 8-byte field at +0
mov    rax, [rdi+0]
cmp    rax, rbx
mov    eax, [rdi+8]               ; b: 4-byte field at +8
cmp    eax, 0x........
mov    ax,  [rdi+12]              ; c: 2-byte field at +12
cmp    ax,  0x....
mov    al,  [rdi+14]              ; d: 1-byte field at +14
cmp    al,  0x..
mov    al,  [rdi+15]              ; e: 1-byte field at +15
cmp    al,  0x..

This is the whole module in one challenge. For each field, read three things off the access: its width (from rax/eax/ax/al), its offset ([rdi+X]), and its value (endian-correct the immediate according to the field's width). Reassemble the fields in offset order and you have the password.
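
The field layout can be mirrored with Python's struct module, which makes it easy to see what immediates a given 16-byte buffer would produce. In this sketch the format string "<QIHBB" is my translation of the C definition above (little-endian, no padding), and `sample` is a made-up buffer, not the real password.

```python
import struct

LAYOUT = "<QIHBB"  # a: 8 bytes, b: 4 bytes, c: 2 bytes, d: 1 byte, e: 1 byte
print(struct.calcsize(LAYOUT))  # 16

sample = b"ABCDEFGHIJKLMNOP"  # illustrative 16-byte buffer only
a, b, c, d, e = struct.unpack(LAYOUT, sample)
for name, value in zip("abcde", (a, b, c, d, e)):
    print(name, hex(value))
```

Running the unpacking in reverse (field values to bytes) is the endian-correction step described above, applied once per field at that field's width.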

hacker@dojo:~$ objdump -d -M intel /challenge/reverse-me
hacker@dojo:~$ /challenge/reverse-me YOUR_PASSWORD_HERE

WARNING: /challenge/reverse-me is a SUID binary, so debugging it drops its privileges and the open("/flag") inside will silently fail under gdb. Use objdump to read it, but run it directly to get the flag.

The last struct read its fields top to bottom, in memory order, so reading the disassembly straight down handed you the password already in order. Nothing guarantees that. A program can check a struct's fields in any order it likes --- the order the compares appear in the code has nothing to do with where those bytes live in memory.

This /challenge/reverse-me checks the same five fields, but scrambled:

mov    ax,  [rdi+12]              ; the +12 word might be checked first...
cmp    ax,  0x....
mov    al,  [rdi+15]              ; ...then a byte from the very end...
cmp    al,  0x..
movabs rbx, 0x................    ; ...then the +0 qword, and so on.
mov    rax, [rdi+0]
cmp    rax, rbx

So you can no longer read the password straight down the disassembly. Recover each field's value exactly as before, but now also read its offset from the [rdi+X] load --- that offset is where the bytes belong. Place each field at its offset, concatenate in offset order, and you have the password.
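
One way to keep track of scrambled checks is to write each field into a buffer at its offset as you find it. The sketch below assumes you have already noted each check as an (offset, width, immediate) tuple; the tuples shown are invented example values, listed deliberately out of order.

```python
# (offset, width in bytes, immediate), in the order the checks appear in the code.
checks = [
    (12, 2, 0x4E4D),
    (15, 1, 0x50),
    (0, 8, 0x4847464544434241),
    (14, 1, 0x4F),
    (8, 4, 0x4C4B4A49),
]

buffer = bytearray(16)
for offset, width, value in checks:
    # Endian-correct at the field's own width, then place it where it lives.
    buffer[offset:offset + width] = value.to_bytes(width, "little")

print(buffer.decode("ascii"))  # ABCDEFGHIJKLMNOP
```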

hacker@dojo:~$ objdump -d -M intel /challenge/reverse-me
hacker@dojo:~$ /challenge/reverse-me YOUR_PASSWORD_HERE

WARNING: /challenge/reverse-me is a SUID binary, so debugging it drops its privileges and the open("/flag") inside will silently fail under gdb. Use objdump to read it, but run it directly to get the flag.
