Pwn Learning

0x00 Environment Setup

Virtual machine on macOS: VirtualBox

Image: ubuntu-18.04.4-desktop-amd64.iso

Installation steps omitted.

I wrote (well, adapted) a command-line script, vbox_ubuntu.sh:

#!/bin/sh
# Control a VirtualBox VM from the command line.
# Note: `if (( $# > 0 ))` isn't supported by POSIX shell, so [ ... ] is used.
vmname="Ubuntu"
if [ $# -gt 0 ]
then
    case "$1" in
      off)
        VBoxManage controlvm "$vmname" poweroff
        ;;
      pause)
        VBoxManage controlvm "$vmname" pause
        ;;
      resume)
        VBoxManage controlvm "$vmname" resume
        ;;
      status)
        VBoxManage showvminfo "$vmname" | grep State
        ;;
      ssh)
        ssh -p 2222 soreatu@127.0.0.1
        ;;
      ip)
        # The property prints as "Value: <ip>"; keep the second field.
        VBoxManage guestproperty get "$vmname" "/VirtualBox/GuestInfo/Net/0/V4/IP" | cut -d' ' -f 2
        ;;
      *)
        echo 'Usage: ubuntu [off/pause/resume/status/ssh/ip]'
        ;;
    esac
else
  # Count the lines reporting a running state; 0 means the VM is not running.
  is_running=$(VBoxManage showvminfo "$vmname" | grep -c "running (since")
  if [ "$is_running" = '0' ]
  then
    echo 'starting...'
    VBoxManage startvm "$vmname" --type headless
  else
    echo 'Ubuntu is already running...'
  fi
fi

Then append the following to the end of ~/.zshrc:

# virtual machine
alias ubuntu='~/GitHub/MyMacSetup/vbox_ubuntu.sh'

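Run source ~/.zshrc to apply it.

ubuntu: start the Ubuntu VM headless (no GUI)
ubuntu off: power off the Ubuntu VM
ubuntu pause: pause the Ubuntu VM
ubuntu resume: resume from the paused state back to running
ubuntu status: show the VM's current state (running, powered off, paused)
ubuntu ip: show the Ubuntu VM's IP address
ubuntu ssh: ssh into the Ubuntu VM

About ssh: normally you would have to look up the VM's IP before connecting, which gets tedious every time.

So I simply forwarded port 22 of the Ubuntu VM (the ssh service) to port 2222 on the host machine:

ubuntu → Settings → Network → Port Forwarding

After that, ssh -p 2222 soreatu@127.0.0.1 is all it takes to connect to the Ubuntu VM.

Python means python3, of course. I followed a tutorial found online and installed python3.8.

Next, install pwntools: https://docs.pwntools.com/en/stable/install.html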

$ sudo apt-get update
$ sudo apt-get install python3 python3-pip python3-dev git libssl-dev libffi-dev build-essential
$ python3.8 -m pip install --upgrade pip
$ python3.8 -m pip install --upgrade pwntools

gef for gdb: https://gef.readthedocs.io/en/master/

The environment should be about ready now.

Binary Exploitation / Memory Corruption by LiveOverflow

I'm starting with LiveOverflow's binary exploitation playlist.

Playlist link: https://www.youtube.com/playlist?list=PLhixgUqwRTjxglIswKp9mpkfPNfHkzyeN

0x01 Introduction to Linux Operating System

Installs Ubuntu in VMware.

Introduces a few basic Linux commands.

0x02 Writing a simple program in C

Vim

: syntax on
: set number

0x03 Writing a simple Program in Python

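vim -O matrix.c matrix.py opens both files side by side in Vim.

Move the cursor between windows: Ctrl + w, then ←/→

o / Shift + o: open a new line below/above the current line

Running shell commands from inside Vim: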

: !ls
: !python

shebang: #!/usr/bin/python3

0x04 How a CPU works and Introduction to Assembler

0x05 Reversing and Cracking first simple program

A simple crackme; the control-flow graph is drawn by hand.

0x06 Simple Tools and Techniques for Reversing a binary

Assorted tools for analyzing ELF files:

file
strings
hexdump -C
objdump -d -x
strace
ltrace
Hopper (mac)
radare2

And remember that no tool is better than the other, it makes sense to master them all. Except radare, some say radare is the best, but nobody ever masters radare.

0x07 Uncrackable Programs? Key validation with Algorithm and creating a Keygen - Part 1/2

LiveOverflow shows off his fluency with radare2 and his impressive disassembly analysis skills.

0x08 Uncrackable Program? Finding a Parser Differential in loading ELF - Part 2/2

How can an ELF file be made to run normally while gdb and radare2 fail to parse it?

In this video, LiveOverflow demonstrates one possible fuzzing approach: randomly modify a single byte of the ELF file.
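The single-byte mutation step can be sketched in Python as below. This is only an illustration of the idea, not the script from the video; the function name mutate_one_byte is made up for this note, and the surrounding loop (run the mutated file, then check whether gdb or radare2 still parse it) is left out.

```python
import random


def mutate_one_byte(data: bytes, rng: random.Random) -> bytes:
    """Return a copy of data with exactly one randomly chosen byte changed."""
    if not data:
        raise ValueError("cannot mutate empty input")
    pos = rng.randrange(len(data))
    # Adding 1..255 modulo 256 guarantees the new value differs from the old one.
    new_value = (data[pos] + rng.randrange(1, 256)) % 256
    mutated = bytearray(data)
    mutated[pos] = new_value
    return bytes(mutated)


if __name__ == "__main__":
    original = bytes(range(16))          # stand-in for the contents of an ELF file
    fuzzed = mutate_one_byte(original, random.Random())
    print(original.hex())
    print(fuzzed.hex())
```

A fuzzing run would write each mutated copy to disk, keep the ones that still execute correctly, and then test the analysis tools against them.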

0x09 Syscalls, Kernel vs. User Mode and Linux Kernel Source Code

syscalls, user-mode, kernel-mode

Ring 0, Ring 3

[Figure: Priv rings.svg, Wikimedia Commons]

The system call is the fundamental interface between an application and the linux kernel.

System calls are generally not invoked directly, but rather via wrapper functions in glibc.

Often the glibc wrapper function is quite thin, doing little work other than copying arguments to the right registers before invoking the system call.

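For example, printf in C ultimately relies on the write syscall: it formats (and buffers) the output in user space and then hands the bytes to the write wrapper.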
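The same layering can be seen from Python, where os.write is a thin wrapper around write(2). The sketch below mimics the printf idea: format in user space, then issue one write. The helper names raw_write and formatted_write are invented for this note.

```python
import os


def raw_write(fd: int, data: bytes) -> int:
    """Thin wrapper: pass the bytes straight to the write syscall."""
    return os.write(fd, data)


def formatted_write(fd: int, fmt: str, *args) -> int:
    """printf-like helper: format in user space, then do a single write."""
    text = (fmt % args).encode()
    return raw_write(fd, text)


if __name__ == "__main__":
    # File descriptor 1 is standard output.
    formatted_write(1, "Hello, %s! %d\n", "syscall", 42)
```

Running this under strace should show the write call with the already formatted string, which is a handy way to see where user-space work ends and the kernel takes over.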

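After the CPU starts, it performs some initialization and then drops from Ring 0 into Ring 3 (user mode). From Ring 3, the way for a program to get back into Ring 0 is a syscall, but a syscall only runs a piece of code that is already defined in the kernel; it does not let us do arbitrary things. When the syscall finishes, execution returns to Ring 3.

After that comes a random dive into the kernel code, where I basically couldn't keep up with LiveOverflow's pace.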

0x0A The deal with numbers: hexadecimal, binary and decimals

One hex character represents 4 bits, so 1 byte = 8 bits maps cleanly onto 2 hex characters.
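A small Python sketch of that mapping: split the byte into its high and low 4-bit halves and look each one up as a hex digit. The function name byte_to_hex is just for illustration.

```python
HEX_DIGITS = "0123456789abcdef"


def byte_to_hex(value: int) -> str:
    """Render one byte (0..255) as two hex characters, one per 4-bit nibble."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    high = value >> 4      # upper 4 bits
    low = value & 0xF      # lower 4 bits
    return HEX_DIGITS[high] + HEX_DIGITS[low]


if __name__ == "__main__":
    print(byte_to_hex(0b10101111))
```

Here the nibbles 1010 and 1111 become a and f.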

0x0B Smashing the Stack for Fun and Profit - setuid, ssh and exploit.education

http://www.phrack.org/ (papers on binary security)
- http://www.phrack.org/issues/49/14.html#article (a 1996 paper, possibly the earliest, introducing buffer overflows)
  
  Paper Study
exploit-exercises: introductory binary exploitation exercises?

Then the video covers setuid. Some commands (programs) run as root (rwsr-xr-x), for example sudo and ping, so a vulnerability in one of these programs may let us execute commands as root.

In the permission string rwsr-xr-x, the s means the SUID bit is set: an unprivileged user runs the program with the identity of its owner, a privileged user such as root.

SUID (Set User ID) is a type of permission which is given to a file and allows users to execute the file with the permissions of its owner. There are plenty of reasons why a Linux binary can have this type of permission set. For example the ping utility requires root privileges in order to open a network socket but it needs to be executed by standard users as well to verify connectivity with other hosts.
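To check the SUID bit programmatically, Python's stat module exposes the same mode bits that ls prints. The sketch below is a minimal example; the helper names are made up here, and the mode 0o104755 is simply a regular file with rwsr-xr-x.

```python
import os
import stat


def is_setuid(mode: int) -> bool:
    """True if the SUID bit is set in a st_mode value."""
    return bool(mode & stat.S_ISUID)


def path_is_setuid(path: str) -> bool:
    """Check the SUID bit of a file on disk."""
    return is_setuid(os.stat(path).st_mode)


if __name__ == "__main__":
    mode = 0o104755                      # regular file, SUID, rwxr-xr-x
    print(stat.filemode(mode), is_setuid(mode))
```

Pointing path_is_setuid at binaries such as sudo or ping on a given system is a quick way to see which ones run with their owner's privileges.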

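escalate privileges: gain more privileges than the account originally had (e.g. root)

Further study:

Paper Study

Update 2021.06.08: last Sunday I attended a tech salon that Sangfor (深信服) held at my university. One of the speakers gave a very fluent talk on "the evolution of binary vulnerability attack and defense", and it seems worthwhile to follow that timeline and study the papers along it.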

Smashing The Stack For Fun And Profit (1996)

Introduction

This paper attempts to explain what buffer overflows are, and how their exploits work.

Basic knowledge of assembly is required. An understanding of virtual memory concepts, and experience with gdb are very helpful but not necessary. We also assume we are working with an Intel x86 CPU, and that the operating system is Linux.

buffer: a contiguous block of computer memory that holds multiple instances of the same data type.

Process Memory Organization

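First, let's look at how a process is laid out in memory.

A process is divided into three regions:

Text: program code (instructions) + read-only data (literals, I think?); mapped read-only (r--)
Data: initialized and uninitialized data (static variables), i.e. global variables (+ heap?); its size can be changed with the brk(2) system call
Stack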

What Is A Stack?

An abstract data type with last-in, first-out (LIFO) behavior.

PUSH: adds an element at the top of the stack
POP: reduces the stack size by one by removing the last element at the top of the stack.
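As a quick illustration of PUSH and POP, here is a minimal Python stack built on a list. The class is only a teaching sketch and has nothing to do with how the hardware stack is implemented.

```python
class Stack:
    """Minimal LIFO stack: the last element pushed is the first one popped."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Add an element at the top of the stack.
        self._items.append(item)

    def pop(self):
        # Remove and return the element at the top of the stack.
        return self._items.pop()

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    s = Stack()
    for value in (1, 2, 3):
        s.push(value)
    print(s.pop(), s.pop(), s.pop())
```

The values come back in the reverse of the order in which they were pushed.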

Why Do We Use A Stack?

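High-level languages have functions (procedures). A function call jumps (call xxxx) to another region of code, and when that code finishes it jumps back (ret) to continue with the instruction after the call. The stack is what makes this possible.

The stack is also used to:

store a function's local variables
pass arguments to functions
hold a function's return value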

The Stack Region

A stack is a contiguous block of memory containing data.

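A dedicated register, (e)sp, points to the top of the stack, while the bottom of the stack sits at a fixed address. The size of the stack changes dynamically at run time.

Assume 32-bit from here on.

Stack frame: allocated when a function is called (call) and destroyed when it returns (ret).

A frame contains, for that function:

the arguments
the local variables (temporaries)
the data needed to restore the previous frame (push ebp, i.e. the caller's ebp)
the return address (e.g. 0x565555b7 in the instructions below)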

  0x565555b2 call xxx
→ 0x565555b7 add  eax, 1  ; not exactly this instruction

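The stack grows toward lower addresses (although some implementations grow toward higher addresses).

esp points to the last address on the stack (in some implementations it points to the next free, unused address instead).

Besides SP, which points to the top of the stack, there is also a frame pointer FP/LB (ebp on Intel CPUs) that points to a fixed location inside the frame (typically ebp points at the saved ebp, the slot right next to the return address at ebp + 0x4).

Because esp changes as values are pushed and popped, using it as the base for referencing a local variable or an argument in the frame is awkward (the offsets keep changing). With ebp as the base, the offsets are fixed, which is presumably why ebp exists. Arguments then have positive offsets from ebp and local variables have negative offsets.

[Figure: Screen Shot 2020-09-09 at 11.17.20 AM]

Once a procedure has been called, the first thing it does is save the previous frame's ebp (so the previous frame can be fully restored on return). It then copies the current esp into ebp, which gives the base address of the new frame, and finally decreases esp to reserve space for the frame's local variables. These setup steps are called the procedure prolog.

When the procedure exits, i.e. the function returns, the current frame has to be cleaned up. First esp is moved back to ebp and the saved value is popped into ebp (ebp = *ebp), which restores the previous frame's base and leaves esp pointing at the return address (ebp + 0x4). Then ret jumps back using the return address at the top of the stack (*esp). These cleanup steps are called the procedure epilog.

Intel CPUs provide the ENTER and LEAVE instructions to do these two jobs.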
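The prolog and epilog can be traced with a toy model. The Python sketch below simulates a 32-bit stack that grows downward; the class Stack32, its tiny address range, and the caller's ebp value are all assumptions made for illustration, while the return address 0x56555518 is borrowed from the disassembly later in these notes.

```python
import struct


class Stack32:
    """Toy 32-bit stack that grows toward lower addresses."""

    def __init__(self, top=0x1000, size=0x100):
        self.low = top - size            # lowest valid address
        self.mem = bytearray(size)
        self.esp = top                   # empty stack: esp sits at the top
        self.ebp = 0

    def read(self, addr):
        return struct.unpack_from("<I", self.mem, addr - self.low)[0]

    def push(self, value):
        self.esp -= 4
        struct.pack_into("<I", self.mem, self.esp - self.low, value)

    def pop(self):
        value = self.read(self.esp)
        self.esp += 4
        return value

    def call(self, return_address):
        self.push(return_address)        # call pushes the next instruction's address

    def prolog(self, locals_size):
        self.push(self.ebp)              # push ebp
        self.ebp = self.esp              # mov ebp, esp
        self.esp -= locals_size          # sub esp, locals_size

    def epilog(self):
        self.esp = self.ebp              # leave: mov esp, ebp
        self.ebp = self.pop()            #        pop ebp
        return self.pop()                # ret: pop the return address


def demo():
    s = Stack32()
    start_esp = s.esp
    s.ebp = 0x0FF0FF00                   # hypothetical caller frame base
    for arg in (3, 2, 1):                # arguments are pushed right to left
        s.push(arg)
    s.call(0x56555518)
    s.prolog(0x10)
    first_arg = s.read(s.ebp + 8)        # arguments: positive offsets from ebp
    saved_ret = s.read(s.ebp + 4)        # return address sits at ebp + 4
    returned_to = s.epilog()
    s.esp += 0xC                         # caller removes the 3 arguments
    return first_arg, saved_ret, returned_to, s.esp == start_esp


if __name__ == "__main__":
    print(demo())
```

The model ignores everything a real CPU does besides moving esp and ebp, but it makes the fixed offsets (ebp + 4 for the return address, ebp + 8 for the first argument) easy to verify.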

Next, let's use a simple example to see what the stack actually looks like.

// example1.c
void function(int a, int b, int c) {
    char buffer1[5];
    char buffer2[10];
}

void main() {
    function(1, 2, 3);
}

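Compile it with gcc: gcc -g -m32 -fno-stack-protector -o example1 example1.c

-g keeps the debugging information in the program.

-m32 compiles it as a 32-bit program (otherwise it defaults to 64-bit on my VM).

-fno-stack-protector turns off the stack protector (canary); otherwise the disassembly would contain extra canary-related instructions.

Then disassemble its main function with gdb: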

$ gdb example1
...
gef➤  b main
Breakpoint 1 at 0x50d: file example1.c, line 8.
gef➤  r
...
gef➤  disassemble main
Dump of assembler code for function main:
   0x56555500 <+0>:		push   ebp
   0x56555501 <+1>:		mov    ebp,esp
   0x56555503 <+3>:		call   0x5655551e <__x86.get_pc_thunk.ax>
   0x56555508 <+8>:		add    eax,0x1ad4
=> 0x5655550d <+13>:	push   0x3						; push the 3rd argument: 3
   0x5655550f <+15>:	push   0x2						; push the 2nd argument: 2
   0x56555511 <+17>:	push   0x1						; push the 1st argument: 1
   0x56555513 <+19>:	call   0x565554ed <function>	; call invokes function
   0x56555518 <+24>:	add    esp,0xc
   0x5655551b <+27>:	nop
   0x5655551c <+28>:	leave
   0x5655551d <+29>:	ret
End of assembler dump.

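First the 3 arguments (3, 2, 1) are pushed onto the stack (the first argument is pushed last).
Then comes call function. The call instruction pushes the address of the next instruction (add esp, 0xc at 0x56555518), which we call the return address RET, onto the stack and jumps to the code of function.

Now disassemble function: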

gef➤  disassemble function
Dump of assembler code for function function:
   0x565554ed <+0>:		push   ebp
   0x565554ee <+1>:		mov    ebp,esp
   0x565554f0 <+3>:		sub    esp,0x10
   0x565554f3 <+6>:		call   0x5655551e <__x86.get_pc_thunk.ax>
   0x565554f8 <+11>:	add    eax,0x1ae4
   0x565554fd <+16>:	nop
   0x565554fe <+17>:	leave
   0x565554ff <+18>:	ret
End of assembler dump.

The two instructions call 0x... <__x86.get_pc_thunk.ax>; add eax, 0x... are related to position-independent code; ignore them for now.

At the start of function we see:

push   ebp
mov    ebp,esp
sub    esp,0x10

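These three instructions set up a new frame (the procedure prologue). function defines the local variables buffer1 and buffer2 of 5 bytes and 10 bytes, so sub esp, 0x10 reserves 0x10 bytes for them.

There is presumably some alignment involved, and maybe some compiler optimization: in the paper each buffer is rounded up to a multiple of the 4-byte word size (8 + 12 bytes), giving subl $20, %esp, while here the 15 bytes apparently fit into a single 0x10-byte block.

At the end of function we see: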

leave
ret

These two instructions tear the frame down (the procedure epilogue). leave is equivalent to mov esp, ebp; pop ebp: it restores ebp to the previous frame's base address (*ebp) and leaves esp pointing at the return address (ebp + 0x4). ret then pops the value at the top of the stack (*esp) as the return address and jumps to it.

Let's step through it in the debugger and watch how the stack changes.

First, push 0x3 passes the third argument:

[Figure: Screen Shot 2020-09-09 at 11.40.01 AM]

Then push 0x2; push 0x1 pass the second and first arguments:

[Figure: Screen Shot 2020-09-09 at 11.40.46 AM]

Then call 0x... <function> makes the call:

[Figure: Screen Shot 2020-09-09 at 2.33.31 PM]

OK, we are now inside function's territory. Next comes push ebp:

[Figure: Screen Shot 2020-09-09 at 2.40.50 PM]

Then mov ebp, esp fixes the base address of the current frame:

[Figure: Screen Shot 2020-09-09 at 2.42.48 PM]

Then sub esp, 0x10 reserves space for the frame's local variables:

[Figure: Screen Shot 2020-09-09 at 2.47.15 PM]

At this point function's frame is fully initialized, and whatever function does comes next.

When function returns, leave runs first:

[Figure: Screen Shot 2020-09-09 at 2.52.14 PM]

Then ret returns to the caller's frame:

[Figure: Screen Shot 2020-09-09 at 2.54.37 PM]

Oh, right, the 3 arguments still have to be cleaned up, which is why the instruction after call 0x... <function> is add esp, 0xc:

[Figure: Screen Shot 2020-09-09 at 3.00.53 PM]

OK, everything is back to the initial state.

The stack layout at the time function is called:

[Figure: Screen Shot 2020-09-09 at 4.10.19 PM]

Buffer Overflows

A buffer overflow is the result of stuffing more data into a buffer than it can handle.

For example, consider this sample code:

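// example2.c
#include <string.h>  // strcpy

void function(char *str) {
   char buffer[16];

   // No bounds check: copies until the terminating '\0' in str.
   strcpy(buffer, str);
}

void main() {
  char large_string[256];
  int i;

  for (i = 0; i < 255; i++)
    large_string[i] = 'A';
  large_string[255] = '\0';  // terminate so strcpy stops after 255 'A's

  function(large_string);
}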

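When function is called, the stack layout looks like this:

[Figure: Screen Shot 2020-09-09 at 4.19.17 PM]

The local variable buffer in function is only 16 bytes, but the str passed in can be as long as 256 bytes. strcpy copies everything in str into buffer (it stops only at the null byte \x00 and performs no bounds check), so it overflows, and the excess overwrites whatever follows buffer on the stack at higher addresses, including the saved ebp and the return address.

So a buffer overflow allows us to change the return address of a function. In this way we can change the flow of execution of the program.

A buffer overflow vulnerability can therefore be used to take control of the program's execution flow.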
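The overwrite can be imitated safely in Python with a bytearray standing in for the frame. Everything here is a model: the offsets assume a 16-byte buffer followed directly by the saved ebp and the return address, the two saved values are made-up addresses, and unsafe_strcpy only mimics strcpy's copy-until-NUL behaviour.

```python
import struct

# Assumed frame layout (offsets from the start of buffer):
BUF, SAVED_EBP, RET = 0, 16, 20


def make_frame() -> bytearray:
    """Build a 32-byte toy frame: buffer[16], saved ebp, return address, one argument."""
    mem = bytearray(32)
    struct.pack_into("<I", mem, SAVED_EBP, 0xBFFFF000)   # hypothetical saved ebp
    struct.pack_into("<I", mem, RET, 0x56555518)         # hypothetical return address
    return mem


def unsafe_strcpy(mem: bytearray, dst: int, src: bytes) -> None:
    """Copy src up to and including its first NUL byte, with no bounds check."""
    length = src.index(0) + 1
    mem[dst:dst + length] = src[:length]


def return_address(mem: bytearray) -> int:
    return struct.unpack_from("<I", mem, RET)[0]


if __name__ == "__main__":
    frame = make_frame()
    unsafe_strcpy(frame, BUF, b"A" * 24 + b"\x00")       # 24 bytes into a 16-byte buffer
    print(hex(return_address(frame)))
```

With 24 bytes of 'A' the return address slot ends up holding 0x41414141, which is the classic sign in a debugger that the input controls where the function returns.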

OK, let's modify example1 from the beginning:

// example3.c
#include<stdio.h>

void function(int a, int b, int c) {
    char buffer1[5];
    char buffer2[10];
    int *ret;

    ret = buffer1 + 12;
    (*ret) += 8;
}

void main() {
    int x;

    x = 0;
    function(1, 2, 3);
    x = 1;
    printf("%d\n", x);
}

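This code adds 8 to the 32-bit value (1 word) located 12 bytes past buffer1, i.e. return address + 8. As a result the statement x = 1; is skipped and execution goes straight to printf("%d\n", x);, so x prints as 0 instead of 1!

Damn, the binary I compiled locally doesn't match: the alignment differs and there is other stuff on the stack. I'll skip local testing and just follow the paper.

Shell Code

OK, now that a buffer overflow lets us control the program's flow, we should execute something with real impact. We can spawn a shell; with a shell we can do whatever we want.

We can write a piece of shellcode onto the stack and then change the ret address to point to the start of the shellcode.

Spawning a shell means executing: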

// shellcode.c
#include <stdio.h>
#include <unistd.h>  // declares execve

void main() {
   char *name[2];

   name[0] = "/bin/sh";
   name[1] = NULL;
   execve(name[0], name, NULL);
}

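Compiling it gives us its assembly instructions (and from those, the machine code).

[Figure: image-20200911192931394]

But there is a problem: how do we determine the return address? The shellcode is on the stack, but we don't know its exact address, so we can't tell ret where to jump, nor do we know the absolute address of the "/bin/sh" string.

The latter problem can be solved with the relative jumps of the call and jump instructions.

We place the shellcode in front of ret and return to a location before it, which lands us in the shellcode. We can also exploit the fact that call pushes the address of the next instruction onto the stack to obtain the absolute address of the "/bin/sh" string. So the stack can be laid out like this:

JJ is the jump instruction, SSSSS is the shellcode, CC is the call instruction, and sss is the string.

Test program:

However, the shellcode turns out to contain null bytes "\x00". A buffer overflow usually goes through a dangerous function such as strcpy, which treats "\x00" as the end of the string, so the shellcode must not contain any "\x00". This is easy to fix: generate zeros with instructions such as xor eax, eax; mov ebx, eax instead of encoding literal zero bytes.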
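The null-byte problem is easy to see by comparing two encodings that both zero eax. The Python sketch below uses the standard x86 encodings b8 00 00 00 00 (mov eax, 0) and 31 c0 (xor eax, eax); the helper names are invented for this note, and copied_by_strcpy only models the fact that strcpy stops at the first NUL.

```python
MOV_EAX_0 = bytes.fromhex("b800000000")   # mov eax, 0x0  (immediate contains NUL bytes)
XOR_EAX_EAX = bytes.fromhex("31c0")       # xor eax, eax  (same effect, no NUL bytes)


def has_null(payload: bytes) -> bool:
    """True if the payload contains a byte that would terminate strcpy."""
    return b"\x00" in payload


def copied_by_strcpy(payload: bytes) -> bytes:
    """Return the part of the payload that precedes the first NUL byte."""
    return payload.split(b"\x00", 1)[0]


if __name__ == "__main__":
    for name, code in (("mov eax, 0", MOV_EAX_0), ("xor eax, eax", XOR_EAX_EAX)):
        print(name, code.hex(), "contains NUL" if has_null(code) else "NUL-free")
```

Running a check like has_null over a finished shellcode string is a cheap sanity test before trying it against a strcpy-style overflow.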

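The modified shellcode:

Writing an Exploit (or how to mung the stack)

In many cases we need to pin down the address of our shellcode so that ret jumps exactly to it.

In this section the author gives 2 approaches:

Starting from the initial base address of the stack, brute-force the offset of the shellcode from that base
Pad the front of the shellcode with many nop instructions, ret into the nops, and slide down into the shellcode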
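The second approach, a NOP sled, can be sketched as a payload builder. This is a minimal illustration under stated assumptions: the distance from the buffer to the return address (32), the guessed address 0xbffff510, and the placeholder "shellcode" (eight int3 bytes, 0xcc) are all hypothetical values chosen for the example, not taken from the paper.

```python
import struct

NOP = b"\x90"   # x86 nop


def build_payload(shellcode: bytes, distance_to_ret: int, guessed_addr: int) -> bytes:
    """Lay out [NOP sled][shellcode][return address] for a 32-bit little-endian target."""
    if len(shellcode) > distance_to_ret:
        raise ValueError("shellcode does not fit before the return address")
    sled = NOP * (distance_to_ret - len(shellcode))
    # The guessed address only has to land somewhere inside the sled.
    return sled + shellcode + struct.pack("<I", guessed_addr)


if __name__ == "__main__":
    placeholder = b"\xcc" * 8            # stand-in bytes, not real shellcode
    payload = build_payload(placeholder, 32, 0xBFFFF510)
    print(payload.hex())
```

The longer the sled, the more guessed addresses lead to the shellcode, which is what makes this more forgiving than brute-forcing an exact offset.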

Small Buffer Overflow

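In other cases only a few bytes can be overflowed, which severely limits the shellcode.

We can instead place the shellcode in one of the program's environment variables; when the program is loaded, the environment variables sit at the top of the stack:

We can then ret to envp and execute the shellcode from there.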

Finding Buffer Overflows

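In general, the root cause of buffer overflows is that C has no built-in bounds checking. For example:

strcat, strcpy, sprintf, vsprintf and similar functions do no bounds checking and stop only at the null terminator
gets reads until a newline \n
scanf with %s reads a run of non-white-space characters, again without a bound

Some code also reads user input with getc, fgetc, or getchar in a while loop that runs until a newline (or some other delimiter) is seen; such while loops are easily overflowed too.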

To conclude, grep(1) is your friend.
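In the spirit of "grep is your friend", here is a small Python sketch that flags calls to the functions listed above in a piece of C source. The function list and the regular expression are a rough heuristic assumed for this note; it will miss indirect calls and will also flag calls that happen to be safe.

```python
import re

DANGEROUS = ("strcat", "strcpy", "sprintf", "vsprintf", "gets", "scanf")
# \b keeps sprintf from matching inside vsprintf, and scanf inside fscanf.
PATTERN = re.compile(r"\b(" + "|".join(DANGEROUS) + r")\s*\(")


def find_dangerous_calls(source: str):
    """Return (line number, function name) pairs for suspicious calls."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = (
        "void function(char *str) {\n"
        "   char buffer[16];\n"
        "   strcpy(buffer, str);\n"
        "}\n"
    )
    for lineno, name in find_dangerous_calls(sample):
        print(f"line {lineno}: {name}")
```

Each hit is only a starting point for manual review: whether it is exploitable depends on where the source data comes from and how large the destination is.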

Finally, the paper's appendix gives shellcode for the various architectures of the time (1996), plus a Generic Buffer Overflow Program.

StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks (1998)

Here are the notes.

Building Fundamentals

COMPILER, ASSEMBLER, LINKER AND LOADER: A BRIEF STORY

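Source code => executable => running image: edit, compile, assemble, link, load

[Figure: ccompilerlinker001]

Linking, Loading, and Libraries

Preprocessing: handle # directives, strip comments, add line markers

Compilation: lexical analysis, syntax analysis, semantic analysis, intermediate code generation, target code generation and optimization

Assembly: translate assembly code into machine instructions

Linking: combine object files, perform relocation, produce the executable

Loading: load the executable from disk (secondary memory) into main memory (primary memory) for the CPU to execute

Static linking vs dynamic linking

section vs segment

Program layout in memory

[Figure: ccompilerlinker006]

Runtime linker

Load-time dynamic linking: once loaded into memory, all external symbols are resolved in one go
Run-time dynamic linking: once loaded into memory, nothing is resolved up front; each symbol is resolved the first time it is needed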
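The difference between resolving everything up front and resolving on first use can be modelled with a small lookup table. This Python sketch is only an analogy: the classes LazyTable and EagerTable and the dictionary acting as a "shared library" are invented for illustration and do not reflect how the real dynamic linker is implemented.

```python
class LazyTable:
    """Resolve a symbol the first time it is called, then cache the result."""

    def __init__(self, resolver):
        self._resolver = resolver
        self._resolved = {}              # plays the role of an address table
        self.lookups = 0                 # how many symbol resolutions happened

    def _resolve(self, name):
        self.lookups += 1
        self._resolved[name] = self._resolver(name)

    def call(self, name, *args):
        if name not in self._resolved:   # first use: resolve now
            self._resolve(name)
        return self._resolved[name](*args)


class EagerTable(LazyTable):
    """Resolve every listed symbol immediately at construction time."""

    def __init__(self, resolver, names):
        super().__init__(resolver)
        for name in names:
            self._resolve(name)


if __name__ == "__main__":
    library = {"strlen": len, "upper": str.upper}   # stand-in for a shared library
    lazy = LazyTable(library.__getitem__)
    print(lazy.lookups, lazy.call("strlen", "abc"), lazy.lookups)
    eager = EagerTable(library.__getitem__, library)
    print(eager.lookups)
```

The lazy table pays the lookup cost only for symbols that are actually used, while the eager one pays for all of them at start-up, which is the trade-off between the two linking modes.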

got && plt

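What exactly are the GOT and PLT in Linux?

What happens before and after main?

Linux x86 program startup: how does main get executed?

[Figure: callgraph]

C runtime library, C standard library

System calls

System calls: the boundary between user level and kernel level

Windows is built entirely on the DLL mechanism

System calls are precisely defined and backward compatible

Linux: int 0x80 / sysenter

Windows: Windows API

The runtime library is portable across operating systems such as Windows and Linux

System calls are made differently on each system, so the runtime library offers a uniform interface and implements the interface functions (wrapping the system calls) separately for each operating system

helloworld

How does the helloworld program get from source code all the way to output on the screen?

Edit the source file helloworld.c: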

#include <stdio.h>

int main() {
    printf("Hello world!\n");

    return 0;
}

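Compile and link to produce the executable: gcc helloworld.c -o helloworld

Run the executable: ./helloworld

The command gcc helloworld.c -o helloworld is executed in the shell: the shell process forks a new gcc process, and that process does the compilation.

gcc is a toolchain that includes a preprocessor, compiler, assembler, linker, and so on.

What gcc actually does (compiler theory): it preprocesses the source code, compiles it into an object program (lexical analysis, syntax analysis, semantic analysis and intermediate code generation, code optimization, target code generation), then links it against the libraries it uses, and finally produces the executable helloworld and saves it on disk.

The command ./helloworld creates a new process. The operating system gives it its own virtual address space, reads the file header, sets up the mapping between the virtual address space and the executable file, and points eip at the program's entry point (_start or the dynamic linker); the CPU then executes the instructions one by one. At that moment nothing is in physical memory yet, so a page fault is raised and the data is brought in from disk page by page.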
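What the shell does when it runs gcc or ./helloworld (fork a child, replace it with the new program, wait for it) can be sketched in a few lines of Python on a Unix system. The function name run_like_a_shell is made up for this note, and a real shell handles far more (signals, job control, redirections).

```python
import os
import sys


def run_like_a_shell(argv):
    """Fork a child, exec the requested program in it, and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child process: replace this process image with the new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Reached only if exec failed; 127 mirrors the shell's "command not found".
            os._exit(127)
    # Parent process: wait until the child terminates.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = run_like_a_shell([sys.executable, "-c", "print('Hello world!')"])
    print("exit code:", code)
```

The exec step is where the loader work described above happens: the kernel maps the executable into the fresh address space and starts it at its entry point.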