diff --git a/chapters/binary-introduction/taming-the-stack/guides/function-calls/README.md b/chapters/binary-introduction/taming-the-stack/guides/function-calls/README.md new file mode 100644 index 00000000..369911a7 --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/guides/function-calls/README.md @@ -0,0 +1,46 @@ +# Function Calls + +Use `objdump` to investigate the prologue of the `read_array()` and `print_array()` functions. + +```console +root@kali:~$ objdump -d -M intel main +``` + +Notice how, in the prologue, `ebp` saves the `esp` value before the local variables are stored on the stack: + +```asm +080491a6 <read_array>: + 80491a6: 55 push ebp + 80491a7: 89 e5 mov ebp,esp + 80491a9: 83 ec 18 sub esp,0x18 + 80491ac: 83 ec 08 sub esp,0x8 +``` + +What's more, take a closer look at how the parameters are handled: + +```asm + 80491af: ff 75 0c push DWORD PTR [ebp+0xc] ; the second argument of read_array() + 80491b2: 68 08 a0 04 08 push 0x804a008 + 80491b7: e8 c4 fe ff ff call 8049080 <__isoc99_scanf@plt> + + 8049213: 8b 45 08 mov eax,DWORD PTR [ebp+0x8] ; the first argument of print_array() +``` + +Now, inside `gdb`, let's take a look at where the return address is saved: + +```console +pwndbg> info frame +Stack level 0, frame at 0xffffcd30: + eip = 0x80491ac in read_array (main.c:5); saved eip = 0x8049273 + Saved registers: + ebp at 0xffffcd28, eip at 0xffffcd2c + +pwndbg> x 0xffffcd2c +0xffffcd2c: 0x08049273 +``` + +Let's do the math: + +- `ebp` points at `0xffffcd28` +- `ebp + 4` will then point at `0xffffcd2c` +- the value stored at `0xffffcd2c` is `0x08049273`, the same as the saved `eip` diff --git a/chapters/binary-introduction/taming-the-stack/guides/function-calls/support/Makefile b/chapters/binary-introduction/taming-the-stack/guides/function-calls/support/Makefile new file mode 100644 index 00000000..bcd83401 --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/guides/function-calls/support/Makefile @@ -0,0 +1,15 @@ +CFLAGS = 
-m32 -Wall -fno-PIC -g -O0 +LDFLAGS = -m32 -no-pie + +.PHONY: all clean + +all: main + +main: main.o + +main.o: main.c + +clean: + -rm -f main.o + -rm -f main + -rm -f *~ diff --git a/chapters/binary-introduction/taming-the-stack/guides/function-calls/support/main b/chapters/binary-introduction/taming-the-stack/guides/function-calls/support/main new file mode 100644 index 00000000..325e0ec5 Binary files /dev/null and b/chapters/binary-introduction/taming-the-stack/guides/function-calls/support/main differ diff --git a/chapters/binary-introduction/taming-the-stack/guides/function-calls/support/main.c b/chapters/binary-introduction/taming-the-stack/guides/function-calls/support/main.c new file mode 100644 index 00000000..3d899017 --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/guides/function-calls/support/main.c @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: BSD-3-Clause +/* + * Copyright 2023 University POLITEHNICA of Bucharest + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are met: + * + * 1. Redistributions of source code must retain the above copyright notice, this + * list of conditions and the following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above copyright notice, + * this list of conditions and the following disclaimer in the documentation + * and/or other materials provided with the distribution. + * + * 3. Neither the name of the copyright holder nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + * DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR + * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include <stdio.h> + +void read_array(int *v, int *n) +{ + scanf("%d", n); + + for (int i = 0; i < *n; ++i) + scanf("%d", &v[i]); +} + +void print_array(int *v, int n) +{ + for (int i = 0; i < n; ++i) + printf("%d ", v[i]); + + printf("\n"); +} + +int main(void) +{ + int v[16], n; + + read_array(v, &n); + print_array(v, n); + + return 0; +} diff --git a/chapters/binary-introduction/taming-the-stack/media/function-stack.jpg b/chapters/binary-introduction/taming-the-stack/media/function-stack.jpg new file mode 100644 index 00000000..03515222 Binary files /dev/null and b/chapters/binary-introduction/taming-the-stack/media/function-stack.jpg differ diff --git a/chapters/binary-introduction/taming-the-stack/media/stack-array.svg b/chapters/binary-introduction/taming-the-stack/media/stack-array.svg new file mode 100644 index 00000000..7bfa688b --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/media/stack-array.svg @@ -0,0 +1,4 @@ + + + +
[stack-array.svg text labels: array cells arr[0], arr[1], ..., arr[N]; an "Addresses Grow" arrow running from 0x00000000 to 0xffffffff]
\ No newline at end of file diff --git a/chapters/binary-introduction/taming-the-stack/media/stack-high-low.png b/chapters/binary-introduction/taming-the-stack/media/stack-high-low.png new file mode 100644 index 00000000..07b0e257 Binary files /dev/null and b/chapters/binary-introduction/taming-the-stack/media/stack-high-low.png differ diff --git a/chapters/binary-introduction/taming-the-stack/reading/README.md b/chapters/binary-introduction/taming-the-stack/reading/README.md deleted file mode 100644 index c4c2cb47..00000000 --- a/chapters/binary-introduction/taming-the-stack/reading/README.md +++ /dev/null @@ -1,40 +0,0 @@ ---- -linkTitle: 11. Taming the Stack -type: docs -weight: 10 ---- - -# Taming the Stack - -## Table of Contents - -Use [gh-md-toc](https://github.com/ekalinin/github-markdown-toc). - -## Introduction - -Objectives and rationale for the current session. - -## Reminders and Prerequisites - -- Information required for this section -- Commands / snippets that should be known, useful to copy-paste throughout the practical session - -## Content Sections: - -- Content split in sections, according to session specifics -- Demos will be part of the session presentation and will be referenced (snippets, images, links) in the content - -## Summary - -- Sumamrizing session concepts -- Summarizing commands / snippets that are useful for tutorials, challenges (easy reference, copy-paste) - -## Activities - -Tasks for the students to solve. They may be of two types: -- **Tutorials** - simpler tasks accompanied by more detailed, walkthrough-like explanations -- **Challenges** - the good stuff - -## Further Reading - -Any links or references to extra information. 
diff --git a/chapters/binary-introduction/taming-the-stack/reading/data-data-everywhere.md b/chapters/binary-introduction/taming-the-stack/reading/data-data-everywhere.md new file mode 100644 index 00000000..c2a1ce7d --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/reading/data-data-everywhere.md @@ -0,0 +1,13 @@ +# Data, Data Everywhere + +Up until now, we've learnt that our application (or program) is made out of data and code. +While the code is the engine of the process, as it tells the processor what work to do, data is the most interesting (and dangerous) part when it comes to changing the execution of an app. +Why, you might ask? +Well, because it's modifiable; +the majority of the data used by your program lies in the `.stack`, `.heap` or `.data` sections of the executable, which makes it **writable**. +And, therefore, even more appealing to attackers. + +Attacks on `.rodata` variables are rarely possible due to the protections enforced by the permissions, or lack thereof. +Even though less protected, the `.text` section also gets fewer attacks, as the `W ^ X` security feature becomes the norm. + +The gate remains open for malicious endeavours on the `.stack`, `.heap` and `.data` sections, and, today, we'll discuss the most prolific one: the stack. diff --git a/chapters/binary-introduction/taming-the-stack/reading/functions-and-the-stack.md b/chapters/binary-introduction/taming-the-stack/reading/functions-and-the-stack.md new file mode 100644 index 00000000..a0277095 --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/reading/functions-and-the-stack.md @@ -0,0 +1,61 @@ +# Functions and the Stack + +Every function has two classes of values, usually stored on the stack, that are extremely important for its well-being: + +1. the return address + +1. the parameters / arguments + +Meddling with these might get you to a big fat **SEGFAULT** or to great power. 
+ + ## `ebp`, the Stack Frame + +But before discussing that, we have to shed light on another obscure register, `ebp`. +We kind of used it before, in our journey, as it has a great advantage. +It stores the stack pointer value right before the stack begins to hold local variables and preserved register values. +In other words, it keeps a pointer to the stack at the beginning of the function, enabling us to actually move freely through the stack. +We can now refer to values stored on it, even if they are not at the top. + +```asm +push ebp +mov ebp, esp + +push dword 3 +push dword 4 +push dword 5 + +; at this point esp has decreased by 3 * 4 = 12 bytes +; traditionally we can access the last value only, +; however the stack is like an array, so we will use the pointers +; it offers us + +mov eax, [esp + 8] ; eax = 3 +mov eax, [ebp - 4] ; eax = 3 +``` + +## The Return Address + +The return address of a function is one of the **most targeted** pieces of information in an attack. +There is even a special class of attacks that takes its name from it, [ROP](https://security-summer-school.github.io/binary/return-oriented-programming/) (Return Oriented Programming). +Moreover, the return address can also be defined as a **code pointer**, a pointer that stores the address of an instruction. +Remember that instructions are stored in the code (or text) section, hence the **code pointer** label. + +The reason for this kind of popularity is obvious: it represents one of the rare instances when the program **performs a jump to a code pointer saved on the stack**, which, combined with the stupidity or the laziness of the programmer, can result in a nasty backdoor to the system. + +On 32-bit x86 systems, the return address is usually stored at `ebp + 4`. + +## The Parameters + +The parameters follow a similar story to that of the return address, with a slight modification, though. +On 64-bit x86 they are placed in special registers, if possible. 
+If there are too many parameters to fit in registers, the remaining ones are passed on the stack, just as it happens on 32-bit x86. + +The address at which the first parameter gets stored on 32-bit x86 systems is `ebp + 8`. + +The address at which the second parameter gets stored on 32-bit x86 systems is `ebp + 12`. + +The address at which the third parameter gets stored on 32-bit x86 systems is `ebp + 16`. + +And so on. + +![parameters and ebp](../media/function-stack.jpg) diff --git a/chapters/binary-introduction/taming-the-stack/reading/further-reading.md b/chapters/binary-introduction/taming-the-stack/reading/further-reading.md new file mode 100644 index 00000000..86297684 --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/reading/further-reading.md @@ -0,0 +1,7 @@ +# Further Reading + +[Stack Immersion](https://github.com/systems-cs-pub-ro/iocla/blob/master/laborator/content/stiva/README.md) + +[Function Calls Immersion](https://github.com/systems-cs-pub-ro/iocla/blob/master/laborator/content/apel-functii/README.md) + +[ROP attacks](https://resources.infosecinstitute.com/topics/hacking/return-oriented-programming-rop-attacks/) diff --git a/chapters/binary-introduction/taming-the-stack/reading/introduction.md b/chapters/binary-introduction/taming-the-stack/reading/introduction.md new file mode 100644 index 00000000..158dc4ab --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/reading/introduction.md @@ -0,0 +1,10 @@ +# Introduction + +While the last three sessions introduced concepts regarding processes, executables and the means to investigate them, this session will, **finally**, focus on how to manipulate apps into strange and (sometimes) undesired behaviour. 
+ + ## Reminders and Prerequisites + +- hexadecimal representation +- process address space +- x86 Assembly knowledge: memory addressing and basic instructions +- `Ghidra`, `objdump`, `nm`, `gdb` diff --git a/chapters/binary-introduction/taming-the-stack/reading/stack.md b/chapters/binary-introduction/taming-the-stack/reading/stack.md new file mode 100644 index 00000000..ff35ad8f --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/reading/stack.md @@ -0,0 +1,158 @@ +# Stack + +## An Abstract Data Structure + +The stack, as a data structure, functions on the LIFO (Last In First Out) concept: one can access only one element at a time, and it's always the last one. + +The two operations accepted by the stack data structure are: + +- `push` (add one element at the back) +- `pop` (retrieve the last element, while removing it from the stack) + +## Real Life Use Case + +Enough about abstract data structures for now, let's get down to the OS business! +Every process on your system has a stack, used for a plethora of reasons. +While the program stack behaves similarly to the abstract data structure from which it got its name, its role is of great interest to attackers and programmers alike: + +- **it stores the return address** + + Have you ever wondered how a program knows how to return and execute the next instruction after a function call? + Here comes the stack. + The return address (meaning the address of the next instruction after a function call) gets **pushed** on the stack right before entering the function and **popped** at the end of the function. + + ```asm + call my_func + + ; this is conceptually equivalent to the pseudocode below + ; (eip cannot be pushed directly; by now it already points past the call) + + push eip + jmp my_func + ``` + +- **it saves the content of the registers** + + Let's say we have multiple **nested** functions that all use the same `ecx` register. 
+ In this scenario, every nested function call overwrites the original `ecx` value (set in the calling function), therefore producing garbage results and even critical errors. + As a result, the stack is used to preserve the values of the affected registers. + + For example, corrupting registers would look something like this: + + ```asm + f: + mov ecx, 1 + call g + + ; this should result in ecx = 2, + ; however ecx gets corrupted by the function calls + inc ecx + + g: + mov ecx, 128 + shr ecx, 2 + call h + + h: + xor ecx, ecx + ``` + + We should preserve the register value by saving it on the stack: + + ```asm + f: + mov ecx, 1 + + ; preserve the ecx value + push ecx + call g + ; restore the ecx value, WE ARE SAFE + pop ecx + + inc ecx + + g: + mov ecx, 128 + shr ecx, 2 + + ; preserve the ecx value + push ecx + call h + ; restore the ecx value, WE ARE SAFE + pop ecx + + h: + ; no need to save ecx, there are no further function calls + xor ecx, ecx + ``` + +- **it stores local variables** + + Remember the local variables from C/C++? + The ones that were disposable and didn't preserve their value between different function calls? + Well, surprise! + They are stored on the stack. + + For instance, the following C function gets translated to Assembly like so: + + ```C + void my_func() + { + int a = 5; + } + ``` + + ```asm + my_func: + ; preserve the caller's frame pointer, ebp + push ebp + mov ebp, esp + + ; allocate space on stack for an integer + sub esp, 4 + ; initialize the integer with 5 + mov dword [esp], 5 + + ; restore the stack pointer and the frame pointer + mov esp, ebp + pop ebp + ret + ``` + +## `push` and `pop` Under the Hood + +As you probably already figured, the fact that the stack operates with its last added value only means that we somehow need to store the exact address of the top somewhere. +Here comes the `esp` register. +Whenever a `push` or `pop` occurs, this register gets decreased or increased, respectively, by the size of the value stored on the stack. 
+You read that right: the **`push` operation decreases the `esp` value** and a **`pop` operation increases it**! +That's because the stack grows downwards. + +Why so? +Think about an array, let's call it `arr`. +Its elements are consecutive and the one at the lowest address is the first one: `arr[0]`. +The next one, at a higher address, is `arr[1]` and so on. +Therefore, if we place this stack in memory it'll look like this: + +![stack array](../media/stack-array.svg) + +![stack growth](../media/stack-high-low.png) + +So, the following 4 snippets are equivalent in pairs: + +```asm +push dword 5 +``` + +```asm +sub esp, 4 +mov [esp], dword 5 +``` + +```asm +pop ecx +``` + +```asm +mov ecx, [esp] +add esp, 4 +``` diff --git a/chapters/binary-introduction/taming-the-stack/reading/summary.md b/chapters/binary-introduction/taming-the-stack/reading/summary.md new file mode 100644 index 00000000..3e016220 --- /dev/null +++ b/chapters/binary-introduction/taming-the-stack/reading/summary.md @@ -0,0 +1,10 @@ +# Summary + +Both the curse and the blessing of modern C/C++ code is the absolute control over memory it gives the programmer. +This comes as a double-edged sword: + +- the stack is just an array: we can modify and access it with `push` and `pop`, but also by using the special stack registers, `esp` and `ebp` +- direct access to the return address of the function at `ebp + 4` +- direct access to the parameters, found either in registers or on the stack, at `ebp + 8`, `ebp + 12`, `ebp + 16` etc. + +making the program vulnerable to ROP attacks. 
diff --git a/config.yaml b/config.yaml index bfd98c8a..e9405ad3 100644 --- a/config.yaml +++ b/config.yaml @@ -266,8 +266,22 @@ docusaurus: - Call Me Little Sunshine/: drills/tasks/call-me-little-sunshine/README.md - Taming the Stack: path: chapters/binary-introduction/taming-the-stack/ + extra: + - media/ subsections: - - Reading: reading/README.md + - Reading: + path: reading/ + subsections: + - Introduction: introduction.md + - Data, Data everywhere: data-data-everywhere.md + - Stack: stack.md + - Functions and the Stack: functions-and-the-stack.md + - Summary: summary.md + - Further Reading: further-reading.md + - Guides: + path: guides/ + subsections: + - Function Calls/: function-calls/README.md static_assets: - application-lifetime: /build/make_assets/chapters/binary-introduction/application-lifetime/slides/_site