As you know, bytes are what is actually stored in your computer's memory. As you might also know, computers think in binary: just a bunch of ones and zeroes. For historical reasons, we express these ones and zeroes ("bits") in groups of 8, and call each group of 8 a "byte". This grouping is essentially arbitrary: early computers (pre-1960s or so) didn't have this grouping at all, or had other arbitrary groupings. There could easily be an alternate universe in which a byte is 16, 32, or really any number of bits (though for math reasons, it'd likely remain a power of 2).
A single binary digit (bit) can represent two values (0 and 1), two bits can represent four values (00, 01, 10, and 11), three bits can represent eight values (000, 001, 010, 011, 100, 101, 110, 111), and four bits can represent sixteen values.
Comparatively, a single decimal digit can represent 10 values (from 0 to 9).
Representing ten values takes roughly log2(10) == 3.3219... bits, and you get weird situations like binary 1001 being decimal 9 (one decimal digit), but binary 1100 (still four binary digits) being decimal 12 (two decimal digits!).
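If you want to see this mismatch for yourself, here's a quick Python sketch (just one way to poke at it; `bin` and `math.log2` are standard library):

```python
import math

# Each extra bit doubles the number of representable values: 2**n.
for n in range(1, 5):
    print(f"{n} bit(s) -> {2**n} values")

# A decimal digit "costs" about log2(10) bits.
print(math.log2(10))  # 3.321928...

# Four binary digits can span one or two decimal digits:
print(bin(9))   # 0b1001 -> one decimal digit
print(bin(12))  # 0b1100 -> two decimal digits
```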
Another way of expressing this digit desynchronization between decimal and binary is that decimal does not have clean bit boundaries.
The lack of bit boundaries makes reasoning about the relationship between decimal and binary complex.
For example, it is hard to spot-translate numbers between decimal and binary in general: we can work out that 97 is 1100001, but it's hard to see that at a glance.
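If you'd rather not work it out by hand, a throwaway snippet (Python shown here) does the spot-check in both directions:

```python
# Decimal -> binary string (note the 0b prefix Python adds):
print(bin(97))            # 0b1100001

# Binary string -> decimal, going the other way:
print(int("1100001", 2))  # 97
```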
It's much easier to spot-translate between bases that have more alignment between digits.
For example, a single hexadecimal (base 16) digit can represent 16 values (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f): the same number of values that binary can represent in 4 digits!
This allows us to have a super simple mapping:
| Hex | Binary | Decimal |
|---|---|---|
| 0 | 0000 | 0 |
| 1 | 0001 | 1 |
| 2 | 0010 | 2 |
| 3 | 0011 | 3 |
| 4 | 0100 | 4 |
| 5 | 0101 | 5 |
| 6 | 0110 | 6 |
| 7 | 0111 | 7 |
| 8 | 1000 | 8 |
| 9 | 1001 | 9 |
| a | 1010 | 10 |
| b | 1011 | 11 |
| c | 1100 | 12 |
| d | 1101 | 13 |
| e | 1110 | 14 |
| f | 1111 | 15 |
This mapping from a hex digit to 4 bits is easy to memorize (most important: memorize the place values 1, 2, 4, and 8, and you can quickly derive the rest).
Better yet, two hex digits is 8 bits, which is one byte!
Unlike decimal, where you'd have to memorize 16 mappings for 4 bits and 256 mappings for 8 bits, with hexadecimal you only need those same 16 mappings for both: an 8-bit byte is just two hexadecimal digits concatenated!
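To see that concatenation in action, here's a small sketch (Python, with illustrative variable names) that splits a byte into its two 4-bit halves, or "nibbles":

```python
byte = 0x3e

high_nibble = byte >> 4   # top 4 bits: 0x3
low_nibble = byte & 0xf   # bottom 4 bits: 0xe

# Each nibble maps to 4 bits; the byte is just the two, concatenated:
print(format(high_nibble, "04b"))  # 0011
print(format(low_nibble, "04b"))   # 1110
print(format(byte, "08b"))         # 00111110
```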
Some examples:
| Hex | Binary | Decimal |
|---|---|---|
| 00 | 0000 0000 | 0 |
| 0e | 0000 1110 | 14 |
| 3e | 0011 1110 | 62 |
| e3 | 1110 0011 | 227 |
| ee | 1110 1110 | 238 |
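You can reproduce that table in a few lines (a Python sketch; the hex strings are the same ones as above):

```python
for h in ["00", "0e", "3e", "e3", "ee"]:
    value = int(h, 16)           # parse the hex string
    bits = format(value, "08b")  # zero-padded 8-bit binary
    print(f"{h} | {bits[:4]} {bits[4:]} | {value}")
```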
Now you're starting to see the beauty. This gets even more obvious when you expand beyond one byte of input, but we'll let you find that out through future challenges!
Now, let's talk about notation.
How do you differentiate 11 in decimal, 11 in binary (which equals 3 in decimal), and 11 in hex (which equals 17 in decimal)?
For numerical constants, the common convention is to prefix binary with 0b and hexadecimal with 0x, keeping decimal as is, resulting in 11 == 0b1011 == 0xb, 3 == 0b11 == 0x3, and 17 == 0b10001 == 0x11.
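Many languages accept these prefixes as literal syntax (Python shown below; Rust and JavaScript use the same 0b/0x prefixes, among others), so you can check the equalities directly:

```python
print(11 == 0b1011 == 0xb)    # True
print(3 == 0b11 == 0x3)       # True
print(17 == 0b10001 == 0x11)  # True
```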