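Introduction
angr is a binary analysis framework written in Python. It implements binary analysis techniques such as program instrumentation and symbolic execution. The project lives on GitHub: https://github.com/angr/angr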
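Installation
Official documentation
According to the documentation, angr depends on modified versions of libz3 and libVEX, which can break other programs that rely on the stock libraries, so the angr developers recommend installing it in a Python virtual environment such as virtualenvwrapper.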
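Ubuntu 18.04 virtual machine
Installing virtualenvwrapper
virtualenvwrapper official documentation
My environment is Ubuntu 18.04, and I followed the official documentation straight through:

    ➜ ~ pip3 install virtualenvwrapper
    ...
    ➜ ~ export WORKON_HOME=~/Envs
    ➜ ~ mkdir -p $WORKON_HOME
    ➜ ~ VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
    ➜ ~ source /home/lantern/.local/bin/virtualenvwrapper.sh
    ....
    ➜ ~ mkvirtualenv env1
    ....
    (env1) ➜ ~

That is the entire installation process. Because of my environment, /usr/local/bin/virtualenvwrapper.sh could not be found at first, and after that the virtualenvwrapper module could not be found because my default interpreter was Python 2. Both problems were resolved, as the commands above show: the script is sourced from ~/.local/bin and VIRTUALENVWRAPPER_PYTHON points at python3.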
Installing angr
Next, leave env1 and install angr in an environment of its own:

    ➜ ~ source /home/lantern/.local/bin/virtualenvwrapper.sh
    ➜ ~ mkvirtualenv --python=$(which python3) angr && python -m pip install angr

Once the installation succeeds, angr is ready to use:

    (angr) ➜ ~ python3
    Python 3.6.9 (default, Apr 18 2020, 01:56:04)
    [GCC 8.4.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import angr
    >>>
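WSL Ubuntu 18.04
Installing virtualenvwrapper
Following this guide is enough: https://gist.github.com/fedme/442e7d1d7eb7d68e02cfbf6441d42759
Creating the virtual environment
Specify python3 as the interpreter:

    mkvirtualenv angr --python=python3

Installing angr
One command does it:

    python3 -m pip install angr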
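Adding to the path (WSL)
Otherwise the following has to be run every time a terminal starts (replace ~/.local/bin/virtualenvwrapper.sh with your own path):

    export PATH=~/.local/bin:$PATH
    export WORKON_HOME=~/.venvs
    source ~/.local/bin/virtualenvwrapper.sh
    export PIP_VIRTUALENV_BASE=~/.venvs

These lines can therefore go into the shell's configuration file. I use zsh, so I appended them to the end of ~/.zshrc. They then run automatically whenever a terminal starts, and workon can be used to enter the angr virtual environment:

    ➜ Desktop workon angr
    (angr) ➜ Desktop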
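Learning usage from the documentation
https://docs.angr.io/examples : I cannot even follow the challenges used in the documentation yet, so I am putting this off for now and will come back to it once my skills have improved.
https://docs.angr.io/appendix/migration : this page covers some of the differences between the Python 2 and Python 3 versions.
Surprise: learning angr on Bilibili
漏洞银行丨二进制自动化解题技术-蓝鲸塔主丨咖面64期 (a talk on automated binary challenge solving). "Bilibili University" lives up to its name.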
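Symbolic execution
Symbolic execution runs a program with symbols in place of concrete values. Its advantage over concrete execution is coverage: with concrete values a run follows exactly one path through the program, whereas a symbol can stand for any value, so the engine can follow both sides of every branch and try to cover as many paths as possible. If the program can produce the correct output at all, at least one of the explored paths reaches it. Each path is described by a set of constraints on the symbols, and a constraint solver can then compute a concrete input that satisfies the constraints of the path we want.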
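To make the idea of "one constraint set per path" concrete, here is a minimal sketch in plain Python. It does not use angr: the three paths, their conditions, and the brute-force `solve` function are all made up for illustration, and a real engine would hand the constraints to an SMT solver instead of trying every value.

```python
# Toy "program" with one input byte x. Each path is the list of branch
# conditions that must all hold for execution to follow that path.
# Paths and conditions are hypothetical, chosen only for illustration.
PATHS = {
    "win": [lambda x: x > 10, lambda x: x % 7 == 3],
    "lose_a": [lambda x: x > 10, lambda x: x % 7 != 3],
    "lose_b": [lambda x: x <= 10],
}


def solve(constraints, domain=range(256)):
    """Stand-in for a constraint solver: return the first value in
    `domain` that satisfies every constraint, or None if there is none."""
    for x in domain:
        if all(constraint(x) for constraint in constraints):
            return x
    return None


# One concrete input per path, found from the path's constraints alone.
solutions = {name: solve(constraints) for name, constraints in PATHS.items()}
for name, value in solutions.items():
    print(name, value)
```

The point of the sketch is that the input for the "win" path is derived from the path conditions rather than guessed by running the program repeatedly.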
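Brute-force solving with angr
- Load the binary into the analysis platform
- Translate the binary into an intermediate representation (IR)
- Translate the IR into a semantic description (what it does rather than what it is)
- Run the actual analysis, which includes:
  - partial or full static analysis
  - symbolic exploration of the program's state space
  - some mixture of the above
Loading a binary
angr's binary loading component is CLE. It loads a binary object (together with any libraries it depends on) and hands it to the rest of angr in a form that is easy to work with.

    import angr

    b = angr.Project("ctf_game")
    print(b.entry)  # program entry point
    print(b.loader.min_addr, b.loader.max_addr)  # lowest and highest addresses of the loaded memory space
    print(b.filename)  # full name of the file

Intermediate representation
angr has to handle many different architectures, so it needs an intermediate representation (IR) on which to run its analyses. It uses VEX, the IR from Valgrind. VEX abstracts away the differences between architectures, so a single analysis can run on all of them.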
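Basic usage flow
Load the binary. auto_load_libs controls whether the shared libraries the binary depends on are loaded automatically:

    import angr

    proj = angr.Project('./ctf_game', auto_load_libs=False)

Then get the state at the program's entry point:

    state = proj.factory.entry_state()

With the entry state in hand, create a simulation manager to execute from it:

    simgr = proj.factory.simgr(state)

Simulated execution produces many states. We pick the one we ultimately want to reach (find) and filter out the ones we do not want (avoid); the exercises further down give concrete examples:

    simgr.explore(find=0x400844, avoid=0x400855)

Finally, read the result from the found state:

    simgr.found[0].posix.dumps(0)  # dumps(0) returns the bytes that were fed to standard input (fd 0)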
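As a rough mental model of what find and avoid do, the sketch below walks a tiny hand-written control-flow graph and sorts states into stashes. It is a toy, not angr's implementation: the graph is hypothetical (only 0x400844 and 0x400855 come from the call above), and a "state" here is just an address.

```python
from collections import deque

# Hypothetical control-flow graph: block address -> successor addresses.
CFG = {
    0x400800: [0x400820, 0x400855],
    0x400820: [0x400844, 0x400830],
    0x400830: [],
    0x400844: [],
    0x400855: [],
}


def explore(start, find, avoid):
    """Breadth-first walk that files each state into a stash,
    stopping as soon as one state reaches the `find` address."""
    stashes = {"found": [], "avoid": [], "deadended": []}
    active = deque([start])
    while active and not stashes["found"]:
        addr = active.popleft()
        if addr == find:
            stashes["found"].append(addr)
        elif addr == avoid:
            stashes["avoid"].append(addr)  # dropped, never stepped again
        elif not CFG[addr]:
            stashes["deadended"].append(addr)  # no successors left
        else:
            active.extend(CFG[addr])  # both sides of the branch stay active
    return stashes


result = explore(0x400800, find=0x400844, avoid=0x400855)
print({name: [hex(a) for a in addrs] for name, addrs in result.items()})
```

Giving an avoid address matters because every state that reaches it is discarded immediately instead of being explored further.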
Exercises
Exercise 1: no arguments
I could not find the challenge used in the video, so I wrote a simple program of my own to practise on. The code is:

    #include <stdio.h>
    #include <stdbool.h>
    #include <string.h>

    bool check(char flag[9]) {
        char check_str[] = "aoeqh`gi{";
        for (int i = 0; i < 9; ++i) {
            flag[i] ^= i;
        }
        if (strcmp(flag, check_str) != 0) {
            return false;
        } else {
            return true;
        }
    }

    int main() {
        puts("Please input your flag:");
        char flag[10];
        if (fgets(flag, 10, stdin)) {
            if (check(flag)) {
                puts("True!");
            } else {
                puts("False");
            }
        }
        return 0;
    }

The flag is angrleans.
After opening the binary in IDA, locate the address we want to reach and the address we want to avoid.
Brute-force it with a script:

    import angr

    proj = angr.Project('./angr1', auto_load_libs=False)
    state = proj.factory.entry_state()
    simgr = proj.factory.simgr(state)
    simgr.explore(find=0x40086D, avoid=0x40087B)
    print(simgr.found[0].posix.dumps(0))

Run the script:

    (angr) ➜ r100 python3 s.py
    .....
    b'angrleans'

That is exactly our flag!
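Because the check only XORs each byte with its index, the result can also be confirmed by hand. The following sketch inverts the transformation from the C source above; `recover_flag` is a helper name introduced here for illustration.

```python
# The string that check() compares the transformed input against.
CHECK_STR = b"aoeqh`gi{"


def recover_flag(target: bytes) -> bytes:
    """Undo `flag[i] ^= i`: XOR is its own inverse, so applying the
    same per-index XOR to the target yields the original input."""
    return bytes(byte ^ index for index, byte in enumerate(target))


flag = recover_flag(CHECK_STR)
print(flag)
```

This agrees with what angr found, which is a useful sanity check when the find and avoid addresses were picked by eye.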
Exercise 2: with an argument
A slight modification of the code from Exercise 1:

    #include <stdio.h>
    #include <stdbool.h>
    #include <string.h>

    bool check(char flag[9]) {
        char check_str[] = "aoeqh`gi{";
        for (int i = 0; i < 9; ++i) {
            flag[i] ^= i;
        }
        if (strcmp(flag, check_str) != 0) {
            return false;
        } else {
            return true;
        }
    }

    int main(int argc, char **argv) {
        if (argc == 1) {
            puts("Usage: angr2 your_flag");
            return 0;
        }
        if (check(argv[1])) {
            puts("True!");
        } else {
            puts("False");
        }
        return 0;
    }

As before, open the binary in IDA to get the find and avoid addresses, then write the script:

    import angr
    import claripy

    proj = angr.Project('./angr2', auto_load_libs=False)
    argv1 = claripy.BVS("argv1", 9 * 8)  # the width is in bits, hence the multiplication by 8
    state = proj.factory.entry_state(args=['./angr2', argv1])  # pass the symbolic argument
    simgr = proj.factory.simgr(state)
    print(simgr.explore(find=0x4007DC, avoid=0x4007EA))
    # eval returns an integer by default; cast_to=bytes converts it to bytes
    print(simgr.found[0].solver.eval(argv1, cast_to=bytes))

Run the script:

    (angr) ➜ angr2 python3 s.py
    ......
    <SimulationManager with 1 active, 1 found>
    b'angrleans'
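Some of the things used in these scripts, explained:
- claripy: angr's solver engine
- claripy.BVS('password', 32): creates a 32-bit symbolic bitvector named password
- claripy.BVV(8, 32): creates a 32-bit bitvector with the concrete value 8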
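Bitvector widths are given in bits, which is why a 9-character argument needs a width of 9 * 8. The plain-Python sketch below shows the arithmetic and how an integer solution maps back to the byte string; it assumes the big-endian byte order that, as far as I know, `cast_to=bytes` uses, and it does not call claripy itself.

```python
FLAG_LEN_BYTES = 9
bit_width = FLAG_LEN_BYTES * 8  # the width one would pass to claripy.BVS

# A solver returns the solution as one big integer; here the integer is
# built from the known flag so the conversion can be shown both ways.
value = int.from_bytes(b"angrleans", "big")
as_bytes = value.to_bytes(FLAG_LEN_BYTES, "big")

print(bit_width)
print(hex(value))
print(as_bytes)
```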
Challenge practice
2018 网鼎杯 (Wangding Cup) Matricks
Opened in a decompiler, the program flow is as follows:

    __int64 __fastcall main(__int64 a1, char **a2, char **a3)
    {
      unsigned __int8 v4;
      signed int v5;
      signed int v6;
      int v7;
      signed int v8;
      signed int i;
      int ia;
      int n_23;
      int n_23a;
      int v13;
      int v14;
      signed int v15;
      char flag[56];
      unsigned __int64 v17;
      __int64 savedregs;

      v17 = __readfsqword(0x28u);
      puts("input your flag:");
      __isoc99_scanf("%49s", flag);
      v15 = 1;
      i = 0;
      n_23 = 23;
      while ( i <= 48 )
      {
        *((_BYTE *)&savedregs + 7 * (n_23 / 7) + n_23 % 7 - 192) = flag[i] ^ n_23;
        *((_BYTE *)&savedregs + 7 * (i / 7) + i % 7 - 128) = xor_bytes[n_23] ^ i;
        ++i;
        n_23 = (n_23 + 13) % 49;
      }
      ia = 41;
      v13 = 3;
      v14 = 4;
      v7 = 5;
      v5 = 0;
      while ( v5 <= 6 && v15 )
      {
        v6 = 0;
        while ( v6 <= 6 && v15 )
        {
          v4 = 0;
          v8 = 0;
          while ( v8 <= 6 )
          {
            v4 += *((_BYTE *)&savedregs + 7 * v7 + v14 - 128) * *((_BYTE *)&savedregs + 7 * v13 + v7 - 192);
            ++v8;
            v7 = (v7 + 5) % 7;
          }
          for ( n_23a = 17; n_23a != ia; n_23a = (n_23a + 11) % 49 )
            ;
          if ( check_num[7 * (n_23a / 7) + n_23a % 7] != ((unsigned __int8)n_23a ^ v4) )
            v15 = 0;
          ia = (ia + 31) % 49;
          ++v6;
          v14 = (v14 + 4) % 7;
        }
        ++v5;
        v13 = (v13 + 3) % 7;
      }
      if ( v15 )
        puts("congrats!");
      else
        puts("wrong flag!");
      return 0LL;
    }

This has the same shape as Exercise 1, so brute-force it directly:

    import angr

    proj = angr.Project('./martricks', auto_load_libs=False)
    state = proj.factory.entry_state()
    simgr = proj.factory.simgr(state)
    simgr.explore(find=0x0000000000400A84, avoid=0x0000000000400A90)
    print(simgr.found[0].posix.dumps(0))

This one does not run as fast, but it still produces the result directly:

    (angr) ➜ martricks_1 python3 s.py
    b'flag{Everyth1n_th4t_kill5_m3_m4kes_m3_fee1_aliv3}'
2020 网鼎杯 (Wangding Cup) Signal
For the detailed analysis, see the 2020 Wangding Cup write-up; only the angr brute-force script is given here.

    import angr
    import claripy

    project = angr.Project("./signal.exe")
    state = project.factory.entry_state()


    class ReplacementScanf(angr.SimProcedure):
        def run(self, format_string, string_address):
            # 0xf symbolic bytes stand in for the string scanf would read
            flag = claripy.BVS('flag', 8 * 0xf)
            self.state.memory.store(string_address, flag)
            # restrict every byte to the range '0'..'z'
            for char in flag.chop(bits=8):
                self.state.add_constraints(char >= ord('0'), char <= ord('z'))
            self.state.globals['solutions'] = flag


    scanf_symbol = "scanf"
    project.hook_symbol(scanf_symbol, ReplacementScanf())

    simgr = project.factory.simgr(state)
    simgr.explore(find=0x0040179E, avoid=0x04016E6)
    if simgr.found:
        solver_state = simgr.found[0]
        stored_solution = solver_state.globals['solutions']
        print(solver_state.solver.eval(stored_solution, cast_to=bytes))
angr_ctf
Challenge repository
A complete set of walkthrough videos is available on Bilibili: [angr_ctf] angr符号执行 (angr symbolic execution)
00_angr_find
Opened in IDA, this is a textbook case for angr:

    int __cdecl main(int argc, const char **argv, const char **envp)
    {
      int i;
      char s1[9];
      unsigned int v6;

      v6 = __readgsdword(0x14u);
      print_msg();
      printf("Enter the password: ");
      __isoc99_scanf("%8s", s1);
      for ( i = 0; i <= 7; ++i )
        s1[i] = complex_function(s1[i], i);
      if ( !strcmp(s1, "QWSYJIQP") )
        puts("Good Job.");
      else
        puts("Try again.");
      return 0;
    }

Script

    import angr

    proj = angr.Project('./00_angr_find', auto_load_libs=False)
    state = proj.factory.entry_state()
    simgr = proj.factory.simgr(state)
    simgr.explore(find=0x804867D, avoid=0x804866B)
    print(simgr.found[0].posix.dumps(0))

The answer:

    (angr) ➜ 00_angr_find git:(master) ✗ python3 s.py
    ............
    b'QTMPXTYU'
01_angr_avoid
Much the same as the first challenge, except that this time the avoid target is the address of the avoid_me() function rather than the address where the failure string is printed.
Script

    import angr

    proj = angr.Project('./01_angr_avoid', auto_load_libs=False)
    state = proj.factory.entry_state()
    simgr = proj.factory.simgr(state)
    print(simgr.explore(find=0x080485E0, avoid=0x080485A8))
    print(simgr.found[0].posix.dumps(0))

The answer:

    (angr) ➜ 01_angr_avoid git:(master) ✗ python3 s.py
    ......
    <SimulationManager with 1 active, 16 deadended, 1 found, 10 avoid>
    b'RNGFXITY'
02_angr_find_condition
Much the same as the previous two, but written differently. In the first two challenges we passed addresses; this time we write the find and avoid conditions directly as functions that inspect the program's output.
Script

    import angr
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        init_state = project.factory.entry_state()
        simgr = project.factory.simgr(init_state)

        def is_successful(state):
            # dumps(1) is everything the program has written to stdout so far
            return b"Good Job." in state.posix.dumps(1)

        def should_abort(state):
            return b"Try again." in state.posix.dumps(1)

        print(simgr.explore(find=is_successful, avoid=should_abort))
        if simgr.found:
            print(simgr.found[0].posix.dumps(0))


    if __name__ == "__main__":
        main(sys.argv)

Run:

    (angr) ➜ dist git:(master) ✗ python3 s.py 02_angr_find_condition
    ......
    <SimulationManager with 1 found, 17 avoid>
    b'HETOBRCU'
03_angr_symbolic_registers
This challenge is about learning to symbolize registers. (angr can now solve scanf calls with several arguments by itself.)
Here we skip the get_user_input() function entirely and set the registers eax, ebx and edx directly:

    .text:0804897B call get_user_input
    .text:08048980 mov [ebp+var_14], eax
    .text:08048983 mov [ebp+var_10], ebx
    .text:08048986 mov [ebp+var_C], edx

Script

    import angr
    import sys
    import claripy


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        # Start at the instruction after `call get_user_input` so that the call is skipped
        start_addr = 0x8048980
        init_state = project.factory.blank_state(addr=start_addr)

        # One 32-bit symbol per value that get_user_input() would have returned
        pass1 = claripy.BVS("pass1", 32)
        pass2 = claripy.BVS("pass2", 32)
        pass3 = claripy.BVS("pass3", 32)
        init_state.regs.eax = pass1
        init_state.regs.ebx = pass2
        init_state.regs.edx = pass3

        simgr = project.factory.simgr(init_state)

        def is_successful(state):
            return b"Good Job." in state.posix.dumps(1)

        def should_abort(state):
            return b"Try again." in state.posix.dumps(1)

        print(simgr.explore(find=is_successful, avoid=should_abort))
        if simgr.found:
            found = simgr.found[0]
            # Nothing was read from stdin, so solve for the register symbols instead
            print(" ".join(format(found.solver.eval(p), "x") for p in (pass1, pass2, pass3)))


    if __name__ == "__main__":
        main(sys.argv)

Run:

    (angr) ➜ dist git:(master) ✗ python3 s.py 03_angr_symbolic_registers
    ......
    <SimulationManager with 1 found, 3 avoid>
    b'b9ffd04e ccf63fe8 8fd4d959'

That gives the result.
04_angr_symbolic_stack
Here we again pretend that angr cannot handle multi-argument input; the point is to learn how to symbolize values on the stack.
Opening the binary in IDA shows the function handle_user(). We need to skip scanf() and symbolize the values on the stack:

    .text:0804868A push offset aUU ; "%u %u"
    .text:0804868F call ___isoc99_scanf
    .text:08048694 add esp, 10h
    .text:08048697 mov eax, [ebp-12] ; argument 1
    .text:0804869A sub esp, 0Ch
    .text:0804869D push eax
    .text:0804869E call complex_function0
    .text:080486A3 add esp, 10h
    .text:080486A6 mov [ebp-12], eax
    .text:080486A9 mov eax, [ebp-16] ; argument 2
    .text:080486AC sub esp, 0Ch
    .text:080486AF push eax
    .text:080486B0 call complex_function1
    .text:080486B5 add esp, 10h
    .text:080486B8 mov [ebp+var_10], eax

A blank state has no stack frame set up, so we have to build one ourselves and pad it so that the symbols land at [ebp-12] and [ebp-16].
Script

    import angr
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        start_addr = 0x8048697
        init_state = project.factory.blank_state(addr=start_addr)

        # Fake the frame: 8 bytes of padding so the next two pushes
        # land at ebp-12 and ebp-16
        padding_size = 8
        init_state.regs.ebp = init_state.regs.esp
        init_state.regs.esp -= padding_size

        pass1 = init_state.solver.BVS("pass1", 32)
        pass2 = init_state.solver.BVS("pass2", 32)
        init_state.stack_push(pass1)
        init_state.stack_push(pass2)

        simgr = project.factory.simgr(init_state)

        def is_successful(state):
            return b"Good Job." in state.posix.dumps(1)

        def should_abort(state):
            return b"Try again." in state.posix.dumps(1)

        print(simgr.explore(find=is_successful, avoid=should_abort))
        if simgr.found:
            pass1 = simgr.found[0].solver.eval(pass1)
            pass2 = simgr.found[0].solver.eval(pass2)
            print(pass1, pass2)
        else:
            raise Exception("Solution not found.")


    if __name__ == "__main__":
        main(sys.argv)

Running it gives the result:

    (angr) ➜ dist git:(master) ✗ python3 s.py 04_angr_symbolic_stack
    ......
    <SimulationManager with 1 found, 2 avoid>
    1704280884 2382341151

Verify:

    (angr) ➜ dist git:(master) ✗ ./04_angr_symbolic_stack
    Enter the password: 1704280884 2382341151
    Good Job.
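The 8 bytes of padding can be checked with a small model of the stack. This is a plain-Python sketch, not angr code: `ToyStack` is a made-up class, and the starting ESP value is only illustrative.

```python
WORD = 4  # bytes per 32-bit stack slot


class ToyStack:
    """Minimal model of an x86 stack: push decrements esp, then stores."""

    def __init__(self, esp):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= WORD
        self.mem[self.esp] = value


stack = ToyStack(esp=0x7FFEFFFC)  # illustrative starting value
ebp = stack.esp                   # ebp = esp, as in the script
stack.esp -= 8                    # the padding
stack.push("pass1")
stack.push("pass2")

# Where each symbol ended up, relative to ebp
offsets = {name: addr - ebp for addr, name in stack.mem.items()}
print(offsets)
```

With 8 bytes of padding the two pushes land exactly on the slots the disassembly reads, [ebp-12] and [ebp-16]; any other padding would shift both values.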
05_angr_symbolic_memory
This challenge is about learning to symbolize memory.
Open it in IDA and look at the main function:

    int __cdecl main(int argc, const char **argv, const char **envp)
    {
      signed int i;

      memset(user_input, 0, 0x21u);
      printf("Enter the password: ");
      __isoc99_scanf("%8s %8s %8s %8s", user_input, &unk_A1BA1C8, &unk_A1BA1D0, &unk_A1BA1D8);
      for ( i = 0; i <= 31; ++i )
        *(_BYTE *)(i + 169583040) = complex_function(*(char *)(i + 169583040), i);
      if ( !strncmp(user_input, "NJPURZPCDYEAXCSJZJMPSOMBFDDLHBVN", 0x20u) )
        puts("Good Job.");
      else
        puts("Try again.");
      return 0;
    }

We need to symbolize four variables: user_input, unk_A1BA1C8, unk_A1BA1D0 and unk_A1BA1D8.
First get their addresses:

    .bss:0A1BA1C0 ; char user_input[8]
    .bss:0A1BA1C0 user_input db 8 dup(?) ; DATA XREF: main+18↑o
    .bss:0A1BA1C0 ; main+47↑o ...
    .bss:0A1BA1C8 ; char byte_A1BA1C8[8]
    .bss:0A1BA1C8 byte_A1BA1C8 db ? ; DATA XREF: main+42↑o
    .bss:0A1BA1C9 db ? ;
    .bss:0A1BA1CA db ? ;
    .bss:0A1BA1CB db ? ;
    .bss:0A1BA1CC db ? ;
    .bss:0A1BA1CD db ? ;
    .bss:0A1BA1CE db ? ;
    .bss:0A1BA1CF db ? ;
    .bss:0A1BA1D0 byte_A1BA1D0 db ? ; DATA XREF: main+3D↑o
    .bss:0A1BA1D1 db ? ;
    .bss:0A1BA1D2 db ? ;
    .bss:0A1BA1D3 db ? ;
    .bss:0A1BA1D4 db ? ;
    .bss:0A1BA1D5 db ? ;
    .bss:0A1BA1D6 db ? ;
    .bss:0A1BA1D7 db ? ;
    .bss:0A1BA1D8 unk_A1BA1D8 db ? ;

This time start_addr is 0x08048601, which skips scanf.
Script

    import angr
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        start_addr = 0x08048601
        init_state = project.factory.blank_state(addr=start_addr)

        user_input = 0x0A1BA1C0
        # Four 8-byte symbols, one per %8s field
        password = [init_state.solver.BVS("pass%d" % i, 64) for i in range(4)]
        for i in range(4):
            init_state.memory.store(user_input + i * 8, password[i])

        simgr = project.factory.simgr(init_state)

        def is_successful(state):
            return b"Good Job." in state.posix.dumps(1)

        def should_abort(state):
            return b"Try again." in state.posix.dumps(1)

        print(simgr.explore(find=is_successful, avoid=should_abort))
        if simgr.found:
            for i in range(4):
                print(simgr.found[0].solver.eval(password[i], cast_to=bytes), end=" ")
        else:
            raise Exception("Solution not found.")


    if __name__ == "__main__":
        main(sys.argv)

Result:

    (angr) ➜ dist git:(master) ✗ python3 s.py 05_angr_symbolic_memory
    ......
    <SimulationManager with 1 found, 65 avoid>
    b'NAXTHGNR' b'JVSFTPWE' b'LMGAUHWC' b'XMDCPALU' %
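The decimal constant in the decompiled loop and the four buffer addresses are the same memory, which a few lines of arithmetic confirm. The sketch uses only values that appear in this section; the variable names are mine.

```python
USER_INPUT = 0x0A1BA1C0   # address of user_input from the .bss listing
CHUNK = 8                 # each %8s field fills 8 bytes
LOOP_BASE = 169583040     # constant used by the decompiled loop

# The four consecutive buffers that the script fills with symbols
chunk_addrs = [USER_INPUT + i * CHUNK for i in range(4)]
print([hex(a) for a in chunk_addrs])

# The solved chunks in the form the program's scanf expects
parts = [b"NAXTHGNR", b"JVSFTPWE", b"LMGAUHWC", b"XMDCPALU"]
password_line = b" ".join(parts)
print(password_line)
```

Since the loop walks 32 bytes starting at 169583040, it covers all four buffers, which is why storing four 64-bit symbols back to back is enough.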
06_angr_symbolic_dynamic_memory
Here the goal is to symbolize dynamically allocated memory.
The main function:

    int __cdecl main(int argc, const char **argv, const char **envp)
    {
      char *v3;
      char *v4;
      signed int i;

      buffer0 = (char *)malloc(9u);
      buffer1 = (char *)malloc(9u);
      memset(buffer0, 0, 9u);
      memset(buffer1, 0, 9u);
      printf("Enter the password: ");
      __isoc99_scanf((int)"%8s %8s", (int)buffer0, (int)buffer1);
      for ( i = 0; i <= 7; ++i )
      {
        v3 = &buffer0[i];
        *v3 = complex_function(buffer0[i], i);
        v4 = &buffer1[i];
        *v4 = complex_function(buffer1[i], i + 32);
      }
      if ( !strncmp(buffer0, "UODXLZBI", 8u) && !strncmp(buffer1, "UAORRAYF", 8u) )
        puts("Good Job.");
      else
        puts("Try again.");
      free(buffer0);
      free(buffer1);
      return 0;
    }

First we find out where ESP is:

    import angr
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        start_addr = 0x08048601
        init_state = project.factory.blank_state(addr=start_addr)
        print("ESP:", init_state.regs.esp)


    if __name__ == "__main__":
        main(sys.argv)

Running it gives:

    (angr) ➜ dist git:(master) ✗ python3 s.py 06_angr_symbolic_dynamic_memory
    ESP: <BV32 0x7ffefffc>

Because we skip scanf() and everything before it, the following two statements never run:

    buffer0 = (char *)malloc(9u);
    buffer1 = (char *)malloc(9u);

So we have to make the buffer0 and buffer1 pointers point at memory we have prepared ourselves.
The pointers buffer0 and buffer1 live at these addresses:

    .bss:0ABCC8A4 buffer0
    .bss:0ABCC8AC buffer1

Script

    import angr
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        start_addr = 0x08048699
        init_state = project.factory.blank_state(addr=start_addr)
        print("ESP:", init_state.regs.esp)

        # Scratch memory below the stack pointer stands in for the malloc'd buffers
        buffer0 = init_state.regs.esp - 0x100
        buffer1 = init_state.regs.esp - 0x200
        buffer0_addr = 0x0ABCC8A4
        buffer1_addr = 0x0ABCC8AC
        # Pointers are stored in the architecture's byte order
        init_state.memory.store(buffer0_addr, buffer0, endness=project.arch.memory_endness)
        init_state.memory.store(buffer1_addr, buffer1, endness=project.arch.memory_endness)

        password = [init_state.solver.BVS("password%d" % i, 8 * 8) for i in range(2)]
        init_state.memory.store(buffer0, password[0])
        init_state.memory.store(buffer1, password[1])

        simgr = project.factory.simgr(init_state)

        def is_successful(state):
            return b"Good Job." in state.posix.dumps(1)

        def should_abort(state):
            return b"Try again." in state.posix.dumps(1)

        print(simgr.explore(find=is_successful, avoid=should_abort))
        if simgr.found:
            for i in range(2):
                print(simgr.found[0].solver.eval(password[i], cast_to=bytes))
        else:
            raise Exception("Solution not found.")


    if __name__ == "__main__":
        main(sys.argv)

Result:

    (angr) ➜ dist git:(master) ✗ python3 s.py 06_angr_symbolic_dynamic_memory
    ESP: <BV32 0x7ffefffc>
    ......
    <SimulationManager with 1 found, 34 avoid>
    b'UBDKLMBV'
    b'UNOERNYS'

Check it:

    (angr) ➜ dist git:(master) ✗ ./06_angr_symbolic_dynamic_memory
    Enter the password: UBDKLMBV UNOERNYS
    Good Job.

Correct.
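The script passes `endness=project.arch.memory_endness` when it writes the two pointers, because a 32-bit x86 program reads pointers in little-endian order, while the password bytes are stored in plain string order. A plain-Python sketch of the difference, using the ESP value printed above (`struct` stands in for angr's memory store):

```python
import struct

ESP = 0x7FFEFFFC          # value printed by the ESP probe above
buffer0 = ESP - 0x100     # fake heap buffer for buffer0
buffer1 = ESP - 0x200     # fake heap buffer for buffer1

# How the pointer value must be laid out in memory for x86 to read it back
little_endian = struct.pack("<I", buffer0)
# What a big-endian store would write instead
big_endian = struct.pack(">I", buffer0)

print(hex(buffer0), little_endian.hex(), big_endian.hex())
```

If the pointer were stored big-endian, the program would dereference a byte-swapped address and never see the symbolic password.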
07_angr_symbolic_file
This one is about learning to symbolize the contents of a file.
The main function:

    int __cdecl __noreturn main(int argc, const char **argv, const char **envp)
    {
      signed int i; // [esp+Ch] [ebp-Ch]

      memset(buffer, 0, 0x40u);
      printf("Enter the password: ");
      __isoc99_scanf("%64s", buffer);
      ignore_me((int)buffer, 0x40u);
      memset(buffer, 0, 0x40u);
      fp = fopen("OJKSQYDP.txt", "rb");
      fread(buffer, 1u, 0x40u, fp);
      fclose(fp);
      unlink("OJKSQYDP.txt");
      for ( i = 0; i <= 7; ++i )
        *(_BYTE *)(i + 134520992) = complex_function(*(char *)(i + 134520992), i);
      if ( strncmp(buffer, "AQWLCTXB", 9u) )
      {
        puts("Try again.");
        exit(1);
      }
      puts("Good Job.");
      exit(0);
    }

ignore_me writes the input that was read first into OJKSQYDP.txt, so we do not need to create the file ourselves; the program then reads the data back from OJKSQYDP.txt into buffer.
Here we symbolize the contents of the file. Execution starts at 0x080488D6, where buffer is re-initialized:

    .text:080488CE call ignore_me
    .text:080488D3 add esp, 10h ; stack cleanup for ignore_me
    .text:080488D6 sub esp, 4
    .text:080488D9 push 40h ; n
    .text:080488DB push 0 ; c
    .text:080488DD push offset buffer ; s
    .text:080488E2 call _memset

Script

    import angr
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        start_addr = 0x080488D6
        init_state = project.factory.blank_state(addr=start_addr)

        filename = "OJKSQYDP.txt"
        file_size = 0x40
        # The BVS width is in bits, so this symbol is 0x40 bits = 8 bytes,
        # the length of the password that is actually compared
        password = init_state.solver.BVS("password", file_size)
        sim_file = angr.storage.SimFile(filename, content=password, size=file_size)
        init_state.fs.insert(filename, sim_file)

        simgr = project.factory.simgr(init_state)

        def is_successful(state):
            return b"Good Job." in state.posix.dumps(1)

        def should_abort(state):
            return b"Try again." in state.posix.dumps(1)

        print(simgr.explore(find=is_successful, avoid=should_abort))
        if simgr.found:
            print(simgr.found[0].solver.eval(password, cast_to=bytes))
        else:
            raise Exception("Solution not found.")


    if __name__ == "__main__":
        main(sys.argv)

Result:

    (angr) ➜ dist git:(master) ✗ python3 s.py 07_angr_symbolic_file
    .....
    <SimulationManager with 1 found, 17 avoid>
    b'AZOMMMZM'

Verify:

    (angr) ➜ dist git:(master) ✗ ./07_angr_symbolic_file
    Enter the password: AZOMMMZM
    Good Job.
08_angr_constraints
Earlier we passed auto_load_libs=False to keep library code out of the analysis, mainly because symbolic execution suffers from the path explosion problem. A function such as strcmp, for example, compares one character at a time, and every comparison is a branch, so it produces a very large number of paths.
This challenge is about solving path explosion by adding constraints.
The main function:

    int __cdecl main(int argc, const char **argv, const char **envp)
    {
      signed int i;

      qmemcpy(password, "AUPDNNPROEZRJWKB", sizeof(password));
      memset(&buffer, 0, 0x11u);
      printf("Enter the password: ");
      __isoc99_scanf("%16s", &buffer);
      for ( i = 0; i <= 15; ++i )
        *(_BYTE *)(i + 134520912) = complex_function(*(char *)(i + 134520912), 15 - i);
      if ( check_equals_AUPDNNPROEZRJWKB((int)&buffer, 0x10u) )
        puts("Good Job.");
      else
        puts("Try again.");
      return 0;
    }

The check_equals_AUPDNNPROEZRJWKB() function compares character by character, which causes path explosion. Our fix is to stop when execution reaches that function and implement the check ourselves by adding a constraint with add_constraints.
As before, we first skip the scanf() function:

    .text:08048613 push offset buffer
    .text:08048618 push offset a16s ; "%16s"
    .text:0804861D call ___isoc99_scanf
    .text:08048622 add esp, 10h
    .text:08048625 mov [ebp+var_C], 0
    .text:0804862C jmp short loc_8048663

Execution starts at mov [ebp+var_C], 0, that is, at 0x08048625.
Script

    import angr
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        start_addr = 0x08048625
        init_state = project.factory.blank_state(addr=start_addr)

        buff_addr = 0x0804A050
        password = init_state.solver.BVS("password", 16 * 8)
        init_state.memory.store(buff_addr, password)

        simgr = project.factory.simgr(init_state)
        # Stop at the entry of the check function instead of executing it
        check_addr = 0x08048565
        simgr.explore(find=check_addr)

        if simgr.found:
            check_state = simgr.found[0]
            desired_string = b"AUPDNNPROEZRJWKB"
            check_param1 = buff_addr
            check_param2 = 0x10
            # The transformed input as it sits in memory when the check is reached
            check_bvs = check_state.memory.load(check_param1, check_param2)
            # One equality constraint replaces the 16 per-character branches
            check_constraint = check_bvs == desired_string
            check_state.add_constraints(check_constraint)
            print(check_state.solver.eval(password, cast_to=bytes))


    if __name__ == "__main__":
        main(sys.argv)

Result:

    (angr) ➜ dist git:(master) ✗ python3 s.py 08_angr_constraints
    .....
    b'LGCRCDGJHYUNGUJB'

Verify:

    (angr) ➜ dist git:(master) ✗ ./08_angr_constraints
    Enter the password: LGCRCDGJHYUNGUJB
    Good Job.
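To get a feel for why the character-by-character check is a problem, the sketch below counts the paths such a loop creates. It assumes the check examines every character and branches once per character without an early exit; that is an assumption about the function body, which is not shown here.

```python
from itertools import product


def count_paths(n_chars: int) -> int:
    """Enumerate every match/mismatch outcome of an n-character
    comparison loop; each outcome is one execution path."""
    return sum(1 for _ in product((True, False), repeat=n_chars))


small = count_paths(4)   # small enough to enumerate quickly
full = 2 ** 16           # the same count for the 16-character password
print(small, full)
```

A single equality constraint over the 16 loaded bytes describes the one path we care about without creating any of the others.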
09_angr_hooks
Here we learn to use angr's hook mechanism to deal with path explosion. Since angr handles scanf with several arguments, the elaborate setups from the earlier challenges are not needed; those were mainly for learning.
The main function:

    int __cdecl main(int argc, const char **argv, const char **envp)
    {
      _BOOL4 v3;
      signed int i;
      signed int j;

      qmemcpy(password, "XYMKBKUHNIQYNQXE", 16);
      memset(buffer, 0, 0x11u);
      printf("Enter the password: ");
      __isoc99_scanf("%16s", buffer);
      for ( i = 0; i <= 15; ++i )
        *(_BYTE *)(i + 134520916) = complex_function(*(char *)(i + 134520916), 18 - i);
      equals = check_equals_XYMKBKUHNIQYNQXE((int)buffer, 0x10u);
      for ( j = 0; j <= 15; ++j )
        *(_BYTE *)(j + 134520900) = complex_function(*(char *)(j + 134520900), j + 9);
      __isoc99_scanf("%16s", buffer);
      v3 = equals && !strncmp(buffer, password, 0x10u);
      equals = v3;
      if ( v3 )
        puts("Good Job.");
      else
        puts("Try again.");
      return 0;
    }

Here we hook the check_equals_XYMKBKUHNIQYNQXE() function. The method is as follows:

    .text:080486AC push 10h
    .text:080486AE push offset buffer
    .text:080486B3 call check_equals_XYMKBKUHNIQYNQXE
    .text:080486B8 add esp, 10h

First find the address to hook, 0x080486B3.
Next comes the length to hook. We are replacing call check_equals_XYMKBKUHNIQYNQXE, and that instruction is 5 bytes long.
Then install the hook in the script:

    check_equals_called_address = 0x80486B3
    instruction_to_skip_length = 5
    @project.hook(check_equals_called_address, length=instruction_to_skip_length)

Then emulate the function. Its definition follows the @project.hook line directly:

    def skip_check_equals_(state):
        user_input_buff_address = 0x804a054
        user_input_buff_length = 0x10
        user_input_string = state.memory.load(
            user_input_buff_address,
            user_input_buff_length
        )
        # Reference string for the comparison (specific to the binary)
        check_against_string = b"XKSPZSJKJYQCQXZV"
        # The return value goes in eax: 1 on match, 0 otherwise
        state.regs.eax = claripy.If(
            user_input_string == check_against_string,
            claripy.BVV(1, 32),
            claripy.BVV(0, 32)
        )

Script

    import angr
    import claripy
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        initial_state = project.factory.entry_state()

        check_equals_called_address = 0x80486B3
        instruction_to_skip_length = 5

        @project.hook(check_equals_called_address, length=instruction_to_skip_length)
        def skip_check_equals_(state):
            user_input_buff_address = 0x804a054
            user_input_buff_length = 16
            user_input_string = state.memory.load(
                user_input_buff_address,
                user_input_buff_length
            )
            # Reference string for the comparison (specific to the binary)
            check_against_string = b"XKSPZSJKJYQCQXZV"
            state.regs.eax = claripy.If(
                user_input_string == check_against_string,
                claripy.BVV(1, 32),
                claripy.BVV(0, 32)
            )

        simulation = project.factory.simgr(initial_state)

        def is_successful(state):
            stdout_output = state.posix.dumps(1)
            return b"Good Job." in stdout_output

        def should_abort(state):
            stdout_output = state.posix.dumps(1)
            return b"Try again." in stdout_output

        simulation.explore(find=is_successful, avoid=should_abort)
        if simulation.found:
            print(simulation.found[0].posix.dumps(0))
        else:
            raise Exception("Could not find the solution")


    if __name__ == "__main__":
        main(sys.argv)

Run:

    (angr) ➜ dist git:(master) ✗ python3 s.py 09_angr_hooks
    ......
    'ZXIDRXEORJOTFFJNWUFAOUBLOGLQCCGK'

Verify:

    (angr) ➜ dist git:(master) ✗ ./09_angr_hooks
    Enter the password: ZXIDRXEORJOTFFJNWUFAOUBLOGLQCCGK
    Good Job.
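The hook length of 5 comes from the encoding of a near `call` on 32-bit x86: one opcode byte (E8) followed by a 4-byte relative displacement. The arithmetic below checks it against the disassembly listing, and also confirms that the decimal constant in the decompiled loop is the buffer address the hook reads from. The variable names are mine.

```python
CALL_ADDR = 0x080486B3   # address of `call check_equals_...` in the listing
CALL_LEN = 1 + 4         # opcode byte E8 + 32-bit relative displacement

# Where execution resumes once the hook has replaced the call
resume_addr = CALL_ADDR + CALL_LEN
print(hex(resume_addr))

BUFFER_ADDR = 0x804A054  # address the hook loads the user input from
LOOP_BASE = 134520916    # constant used by the first decompiled loop
print(BUFFER_ADDR == LOOP_BASE)
```

The resume address matches the `add esp, 10h` line that follows the call, so the stack cleanup still runs after the hook and the frame stays balanced.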
10_angr_simprocedures
Here we learn how to hook a function by its name.
Opening the binary in IDA shows check_equals_ORSDDWXHZURJRBDH() being called in a great many places.
With IDA's help we could quickly work out which of those calls is the one that finally executes, but that is not what we are here to learn. The focus is on hooking a function by name.
The hook looks like this:

    class mySimPro(angr.SimProcedure):
        def run(self, user_input, user_input_length):
            angr_bvs = self.state.memory.load(
                user_input,
                user_input_length
            )
            check_string = b"ORSDDWXHZURJRBDH"
            # The returned value becomes the function's return value
            return claripy.If(
                angr_bvs == check_string,
                claripy.BVV(1, 32),
                claripy.BVV(0, 32)
            )

    check_symbol = "check_equals_ORSDDWXHZURJRBDH"
    project.hook_symbol(check_symbol, mySimPro())

Script

    import angr
    import claripy
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        initial_state = project.factory.entry_state()

        class mySimPro(angr.SimProcedure):
            def run(self, user_input, user_input_length):
                angr_bvs = self.state.memory.load(
                    user_input,
                    user_input_length
                )
                check_string = b"ORSDDWXHZURJRBDH"
                return claripy.If(
                    angr_bvs == check_string,
                    claripy.BVV(1, 32),
                    claripy.BVV(0, 32)
                )

        check_symbol = "check_equals_ORSDDWXHZURJRBDH"
        project.hook_symbol(check_symbol, mySimPro())

        simulation = project.factory.simgr(initial_state)

        def is_successful(state):
            stdout_output = state.posix.dumps(1)
            return b"Good Job." in stdout_output

        def should_abort(state):
            stdout_output = state.posix.dumps(1)
            return b"Try again." in stdout_output

        simulation.explore(find=is_successful, avoid=should_abort)
        if simulation.found:
            print(simulation.found[0].posix.dumps(0))
        else:
            raise Exception("Could not find the solution")


    if __name__ == "__main__":
        main(sys.argv)

Result:

    (angr) ➜ dist git:(master) ✗ python3 s.py 10_angr_simprocedures
    ......
    'MSWKNJNAVTTOZMRY'

Verify:

    (angr) ➜ dist git:(master) ✗ ./10_angr_simprocedures
    Enter the password: MSWKNJNAVTTOZMRY
    Good Job.
11_angr_sim_scanf
Here we learn to hook the scanf function.
The approach is much the same as in challenge 10:
- Find out the function's name
- Write a class to replace it
- Hook the function
Script

    import angr
    import claripy
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        initial_state = project.factory.entry_state()

        class ReplacementScanf(angr.SimProcedure):
            def run(self, format_string, scanf0_address, scanf1_address):
                scanf0 = claripy.BVS('scanf0', 32)
                scanf1 = claripy.BVS('scanf1', 32)
                self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
                self.state.memory.store(scanf1_address, scanf1, endness=project.arch.memory_endness)
                # Keep the symbols so they can be solved for after exploration
                self.state.globals['solutions'] = (scanf0, scanf1)

        scanf_symbol = "__isoc99_scanf"
        project.hook_symbol(scanf_symbol, ReplacementScanf())

        simulation = project.factory.simgr(initial_state)

        def is_successful(state):
            stdout_output = state.posix.dumps(1)
            return b"Good Job." in stdout_output

        def should_abort(state):
            stdout_output = state.posix.dumps(1)
            return b"Try again." in stdout_output

        simulation.explore(find=is_successful, avoid=should_abort)
        if simulation.found:
            solution_state = simulation.found[0]
            stored_solutions = solution_state.globals['solutions']
            # The program reads unsigned integers, so evaluate to integers
            scanf0_solution = solution_state.solver.eval(stored_solutions[0])
            scanf1_solution = solution_state.solver.eval(stored_solutions[1])
            print(scanf0_solution, scanf1_solution)
        else:
            raise Exception("Could not find the solution")


    if __name__ == "__main__":
        main(sys.argv)

Run:

    (angr) ➜ dist git:(master) ✗ python3 s.py 11_angr_sim_scanf
    ......
    1448564819 1398294103

Verify:

    (angr) ➜ dist git:(master) ✗ ./11_angr_sim_scanf
    Enter the password: 1448564819 1398294103
    Good Job.
12_angr_veritesting
Learn to use the Veritesting technique to deal with path explosion.
Veritesting
- Combines static symbolic execution with dynamic symbolic execution
- Merges the constraints into a single path
- Reduces the impact of path explosion

    project.factory.simgr(initial_state, veritesting=True)

Open the binary in IDA. This loop branches in two on every iteration, which leads to path explosion:

    for ( i = 0; i <= 31; ++i )
    {
      v5 = *((char *)s + i + 3);
      if ( v5 == complex_function(75, i + 93) )
        ++v15;
    }

Script

    import angr
    import claripy
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        initial_state = project.factory.entry_state()
        simulation = project.factory.simgr(initial_state, veritesting=True)

        def is_successful(state):
            stdout_output = state.posix.dumps(1)
            return b"Good Job." in stdout_output

        def should_abort(state):
            stdout_output = state.posix.dumps(1)
            return b"Try again." in stdout_output

        simulation.explore(find=is_successful, avoid=should_abort)
        if simulation.found:
            solution_state = simulation.found[0]
            print(solution_state.posix.dumps(0))
        else:
            raise Exception("Could not find the solution")


    if __name__ == "__main__":
        main(sys.argv)

Result:

    (angr) ➜ dist git:(master) ✗ python3 s.py 12_angr_veritesting
    ......
    b'OQSUWYACEGIKMOQSUWYACEGIKMOQSUWY'

Verify:

    (angr) ➜ dist git:(master) ✗ ./12_angr_veritesting
    Enter the password: OQSUWYACEGIKMOQSUWYACEGIKMOQSUWY
    Good Job.
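The effect of merging can be illustrated with a toy model of the counting loop in this challenge. This is not how angr implements Veritesting (which merges states into symbolic expressions); it is a simplified sketch in which a "state" is just the value of the match counter, and states that agree on it are treated as one.

```python
def explore_without_merging(iterations: int) -> int:
    """Every iteration forks each state into 'matched' and 'not matched'."""
    states = [0]  # one entry per path; the value is the match counter
    for _ in range(iterations):
        states = [count + hit for count in states for hit in (0, 1)]
    return len(states)


def explore_with_merging(iterations: int) -> int:
    """Same loop, but states with an identical counter are merged
    after each iteration (a set keeps one copy of each)."""
    states = {0}
    for _ in range(iterations):
        states = {count + hit for count in states for hit in (0, 1)}
    return len(states)


print(explore_without_merging(10), explore_with_merging(10))
print(explore_with_merging(32))
```

Without merging the state count doubles on every iteration, so the 32-iteration loop would need 2**32 paths; with merging at the loop's join point it grows only linearly.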
13_angr_static_binary
Learn how angr can solve a statically compiled program.
Challenge 13 differs from all the earlier ones because it is statically linked:

    ➜ 13_angr_static_binary git:(master) ✗ file 13_angr_static_binary
    13_angr_static_binary:...., statically linked, .....

angr already ships its own implementations of these library functions; the script uses a few of the common ones. We need to find the addresses of the statically linked library functions that the program calls and hook them.
Script

    import angr
    import claripy
    import sys


    def main(argv):
        bin_path = argv[1]
        project = angr.Project(bin_path)
        initial_state = project.factory.entry_state()
        simulation = project.factory.simgr(initial_state)

        # Replace the statically linked libc functions with angr's SimProcedures
        project.hook(0x804ed40, angr.SIM_PROCEDURES['libc']['printf']())
        project.hook(0x804ed80, angr.SIM_PROCEDURES['libc']['scanf']())
        project.hook(0x804f350, angr.SIM_PROCEDURES['libc']['puts']())
        project.hook(0x8048d10, angr.SIM_PROCEDURES['glibc']['__libc_start_main']())

        def is_successful(state):
            stdout_output = state.posix.dumps(1)
            return b"Good Job." in stdout_output

        def should_abort(state):
            stdout_output = state.posix.dumps(1)
            return b"Try again." in stdout_output

        simulation.explore(find=is_successful, avoid=should_abort)
        if simulation.found:
            solution_state = simulation.found[0]
            print(solution_state.posix.dumps(0))
        else:
            raise Exception("Could not find the solution")


    if __name__ == "__main__":
        main(sys.argv)

Run:

    (angr) ➜ dist git:(master) ✗ python3 s.py 13_angr_static_binary
    ......
    b'PNMXNMUD'

Verify:

    (angr) ➜ dist git:(master) ✗ ./13_angr_static_binary
    Enter the password: PNMXNMUD
    Good Job.

With the hooks removed, I could not get it to finish at all.
14_angr_shared_library
This challenge is about analysing a binary that is not a typical executable.
Opened in IDA, the main function is:

    int __cdecl main(int argc, const char **argv, const char **envp)
    {
      char s;
      unsigned int v5;

      v5 = __readgsdword(0x14u);
      memset(&s, 0, 0x10u);
      print_msg();
      printf("Enter the password: ");
      __isoc99_scanf("%8s", &s);
      if ( validate((int)&s, 8) )
        puts("Good Job.");
      else
        puts("Try again.");
      return 0;
    }

validate() is an external function, as the import table shows.
Next, open the lib14_angr_shared_library.so file supplied with the challenge.
The pseudo-C for validate() is:

    _BOOL4 __cdecl validate(char *s1, int a2)
    {
      char *v3;
      char s2[4];
      int v5;
      int j;
      int i;

      if ( a2 <= 7 )
        return 0;
      for ( i = 0; i <= 19; ++i )
        s2[i] = 0;
      *(_DWORD *)s2 = 0x474B4C57;
      v5 = 0x48574A4C;
      for ( j = 0; j <= 7; ++j )
      {
        v3 = &s1[j];
        *v3 = complex_function(s1[j], j);
      }
      return strcmp(s1, s2) == 0;
    }

Here s1 is the password and a2 is the string length, 8.
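The two integer constants in validate() are really the reference string: on little-endian x86, writing 0x474B4C57 into s2 and 0x48574A4C into the adjacent v5 lays out eight ASCII bytes in a row. A short sketch with `struct` recovers it (this assumes, as the decompiled zeroing loop suggests, that v5 sits directly after s2 in memory):

```python
import struct

# "<II" packs two 32-bit values in little-endian order, back to back,
# which mirrors how the two stores land in the stack buffer.
s2 = struct.pack("<II", 0x474B4C57, 0x48574A4C)
print(s2)
```

This is the string the transformed password is compared with, so the input angr finds is whatever complex_function maps onto it.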
So we run symbolic execution directly on the library.
A shared library uses position-independent code, where every address is base + offset, so we have to choose the base address ourselves:

    base = 0x4000000
    # Newer releases of angr/CLE call this option 'base_addr'
    project = angr.Project(path_to_binary, load_options={
        'main_opts': {
            'custom_base_addr': base
        }
    })

    .text:000006D7 ; int __cdecl validate(char *s1, int)
    .text:000006D7 public validate
    .text:000006D7 validate proc near ; DATA XREF: LOAD:00000250↑o
    .text:000006D7

The address of the validate() function is therefore:

    validate_function_address = base + 0x6d7

    buffer_pointer = claripy.BVV(0x3000000, 32)
    initial_state = project.factory.call_state(validate_function_address, buffer_pointer, claripy.BVV(8, 32))

Here buffer_pointer is where our password will be stored, and claripy.BVV(8, 32) is the function's other argument, the string length.
Next, create the symbolic bitvector and store it at buffer_pointer:

    password = claripy.BVS('password', 8 * 8)
    initial_state.memory.store(buffer_pointer, password)

    simgr = project.factory.simgr(initial_state)
    success_address = base + 0x783
    simgr.explore(find=success_address)

    if simgr.found:
        solution_state = simgr.found[0]
        # validate() returns non-zero in eax when the password is correct
        solution_state.add_constraints(solution_state.regs.eax != 0)
        solution = solution_state.solver.eval(password, cast_to=bytes)
        print(solution)
    else:
        raise Exception('Could not find the solution')

Script

    import angr
    import claripy
    import sys


    def main(argv):
        path_to_binary = argv[1]
        base = 0x4000000
        project = angr.Project(path_to_binary, load_options={
            'main_opts': {
                'custom_base_addr': base
            }
        })

        buffer_pointer = claripy.BVV(0x3000000, 32)
        validate_function_address = base + 0x6d7
        initial_state = project.factory.call_state(
            validate_function_address,
            buffer_pointer,
            claripy.BVV(8, 32))

        password = claripy.BVS('password', 8 * 8)
        initial_state.memory.store(buffer_pointer, password)

        simgr = project.factory.simgr(initial_state)
        success_address = base + 0x783
        simgr.explore(find=success_address)

        if simgr.found:
            solution_state = simgr.found[0]
            solution_state.add_constraints(solution_state.regs.eax != 0)
            solution = solution_state.solver.eval(password, cast_to=bytes)
            print(solution)
        else:
            raise Exception('Could not find the solution')


    if __name__ == '__main__':
        main(sys.argv)

Result:

    (angr) ➜ 14_angr_shared_library git:(master) ✗ python3 s.py lib14_angr_shared_library.so
    ......
    b'WWGNDMKG'

To verify, note that running the program directly fails:

    ➜ 14_angr_shared_library git:(master) ✗ ./14_angr_shared_library
    ./14_angr_shared_library: error while loading shared libraries: lib14_angr_shared_library.so: cannot open shared object file: No such file or directory

The reason is that the program cannot find the library it needs to link against dynamically. The following command solves this:

    LD_LIBRARY_PATH=. ./14_angr_shared_library

LD_LIBRARY_PATH=. tells the dynamic loader to look for the shared library in the current directory when starting 14_angr_shared_library.
The result verifies as correct:

    (angr) ➜ 14_angr_shared_library git:(master) ✗ LD_LIBRARY_PATH=. ./14_angr_shared_library
    placeholder
    Enter the password: WWGNDMKG
    Good Job.
15_angr_arbitrary_read
This challenge teaches how to achieve an arbitrary read, which feels a little like pwn.
Opened in IDA, the main logic is simple:

    int __cdecl main(int argc, const char **argv, const char **envp)
    {
      char v4;
      char *s;

      s = try_again;
      print_msg();
      printf("Enter the password: ");
      __isoc99_scanf("%u %20s", &key, &v4);
      if ( key == 0x129B961 )
        puts(s);
      else
        puts(try_again);
      return 0;
    }

Our goal is to make the program finally print Good Job.
Look at the stack layout here:

    -0000001C var_1C db ?
    -0000001B db ? ; undefined
    -0000001A db ? ; undefined
    -00000019 db ? ; undefined
    -00000018 db ? ; undefined
    -00000017 db ? ; undefined
    -00000016 db ? ; undefined
    -00000015 db ? ; undefined
    -00000014 db ? ; undefined
    -00000013 db ? ; undefined
    -00000012 db ? ; undefined
    -00000011 db ? ; undefined
    -00000010 db ? ; undefined
    -0000000F db ? ; undefined
    -0000000E db ? ; undefined
    -0000000D db ? ; undefined
    -0000000C s dd ?

var_1C is our v4. Clearly, the string we are allowed to enter is just long enough to overwrite s.
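The overflow can be worked out from the stack layout alone. The sketch below computes how many of the 20 input characters reach s and builds an input line of the right shape. The target address 0x41424344 is a placeholder chosen for illustration (the real address of the "Good Job." string is not given here), and `build_input` is a helper name introduced for the sketch.

```python
import struct

VAR_1C_OFFSET = -0x1C   # start of the input buffer (v4)
S_OFFSET = -0x0C        # the pointer that puts() will print
MAX_INPUT = 20          # from the "%20s" format

padding_len = S_OFFSET - VAR_1C_OFFSET   # bytes before s is reached
overwritten = MAX_INPUT - padding_len    # bytes of s we control


def build_input(key: int, target_addr: int) -> bytes:
    """Key as decimal text, a space, filler up to s, then the
    little-endian address that should replace s."""
    filler = b"A" * padding_len
    return str(key).encode() + b" " + filler + struct.pack("<I", target_addr)


# 0x41424344 is a hypothetical address used only to show the layout
line = build_input(0x129B961, 0x41424344)
print(padding_len, overwritten, line)
```

The 20 characters split into 16 bytes of filler and exactly 4 bytes for the 32-bit pointer, which is why the input limit is "just enough".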
This time we still need to hook the scanf function, but we add a restriction: the string is limited to printable characters (here, the uppercase letters A to Z).

    class ReplacementScanf(angr.SimProcedure):
        def run(self, format_string, param0, param1):
            scanf0 = claripy.BVS('scanf0', 32)       # the %u value
            scanf1 = claripy.BVS('scanf1', 20 * 8)   # the %20s string
            # Restrict every byte of the string to 'A'..'Z'
            for char in scanf1.chop(bits=8):
                self.state.add_constraints(char >= ord('A'), char <= ord('Z'))
            scanf0_address = param0
            self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
            scanf1_address = param1
            self.state.memory.store(scanf1_address, scanf1)
            self.state.globals['solutions'] = (scanf0, scanf1)

    scanf_symbol = '__isoc99_scanf'
    project.hook_symbol(scanf_symbol, ReplacementScanf())