Two debugging scenarios for an unknown TCP port
$ nc -4 -l 12345
$ netstat -natp | grep 12345
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 94352/nc
$ lsof -lnPR -i 4tcp | grep 12345
nc 94352 98085 0 3u IPv4 31876655 0t0 TCP *:12345 (LISTEN)
The listening socket is file descriptor 3.
$ gdb -q -nx -x /tmp/gdbinit_x64.txt -x "/tmp/ShellPipeCommand.py" -x "/tmp/GetOffset.py" -ex 'display/5i $pc' -p 94352
(gdb) b *accept
Breakpoint 1 at 0x7ffff781a750
(gdb) info symbol 0x7ffff781a750
accept in section .text of /lib64/libc.so.6
This time nc calls the accept() provided by libc.
(gdb) catch syscall accept
Catchpoint 2 (syscall 'accept' [43])
(gdb) c
Continuing.
Catchpoint 2 (call to syscall 'accept'), 0x00007ffff781a760 in __accept_nocancel () from /lib64/libc.so.6
1: x/5i $pc
=> 0x7ffff781a760 <__accept_nocancel+7>: cmp rax,0xfffffffffffff001
0x7ffff781a766 <__accept_nocancel+13>: jae 0x7ffff781a799 <accept+73>
0x7ffff781a768 <__accept_nocancel+15>: ret
0x7ffff781a769 <accept+25>: sub rsp,0x8
0x7ffff781a76d <accept+29>: call 0x7ffff7829360 <__libc_enable_asynccancel>
(gdb) bt
#0 0x00007ffff781a760 in __accept_nocancel () from /lib64/libc.so.6
#1 0x0000000000402c12 in ?? ()
#2 0x00007ffff774fd1d in __libc_start_main () from /lib64/libc.so.6
#3 0x0000000000401349 in ?? ()
(gdb) c
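Two c commands are needed. In this example "catch syscall accept" fires immediately on the first one, because the server nc is already blocked inside "syscall accept".
Connect from a client nc to 12345/TCP on the server:
$ nc -vv -n 192.168.65.25 12345
Catchpoint 2 (returned from syscall 'accept'), 0x00007ffff781a760 in __accept_nocancel () from /lib64/libc.so.6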
1: x/5i $pc
=> 0x7ffff781a760 <__accept_nocancel+7>: cmp rax,0xfffffffffffff001
0x7ffff781a766 <__accept_nocancel+13>: jae 0x7ffff781a799 <accept+73>
0x7ffff781a768 <__accept_nocancel+15>: ret
0x7ffff781a769 <accept+25>: sub rsp,0x8
0x7ffff781a76d <accept+29>: call 0x7ffff7829360 <__libc_enable_asynccancel>
(gdb) bt
#0 0x00007ffff781a760 in __accept_nocancel () from /lib64/libc.so.6
#1 0x0000000000402c12 in ?? ()
#2 0x00007ffff774fd1d in __libc_start_main () from /lib64/libc.so.6
#3 0x0000000000401349 in ?? ()
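Note that "catch syscall accept" hit but "b *accept" did not. This is easy to understand: while the server nc is listening on the port, rip has already passed the entry point of libc!accept().
When "catch syscall accept" hits on return, rax holds the connected socket: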
(gdb) i r rax
rax 0x4 4
The connected socket is file descriptor 4; subsequent read/write operations use descriptor 4.
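The split between the listening descriptor and the per-connection descriptor can be reproduced outside gdb. The following is a minimal Python sketch, not part of the original session: the function name accept_demo is made up for illustration, and the loopback address and kernel-chosen port are assumptions. It shows that accept() hands back a new descriptor, distinct from the listener, and that the payload is read from that new descriptor.

```python
import socket

def accept_demo():
    """Return (listening_fd, connected_fd, payload) for one loopback connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    conn, _peer = listener.accept()      # accept() returns a new socket object
    conn.settimeout(2.0)                 # avoid hanging if something goes wrong
    client.sendall(b"scz@nsfocus\n")
    client.close()

    # Read until EOF from the connected socket, never from the listener.
    chunks = []
    while True:
        data = conn.recv(2048)
        if not data:
            break
        chunks.append(data)

    listening_fd, connected_fd = listener.fileno(), conn.fileno()
    conn.close()
    listener.close()
    return listening_fd, connected_fd, b"".join(chunks)

if __name__ == "__main__":
    lfd, cfd, payload = accept_demo()
    print("listening fd:", lfd, "connected fd:", cfd, "payload:", payload)
```

The actual descriptor numbers depend on what the process already has open, so they will not necessarily be 3 and 4 as in the nc session.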
$ netstat -natp | grep 12345
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 94352/nc
tcp 0 0 192.168.65.25:12345 192.168.65.20:36586 ESTABLISHED 94352/nc
$ lsof -lnPR -i 4tcp | grep 12345
nc 94352 98085 0 3u IPv4 31876655 0t0 TCP *:12345 (LISTEN)
nc 94352 98085 0 4u IPv4 31880089 0t0 TCP 192.168.65.25:12345->192.168.65.20:36586 (ESTABLISHED)
Intercept __read_chk() and stop when the descriptor is 4:
b *__read_chk
commands $bpnum
silent
if ($edi==4)
display
i r rdi rsi rdx
else
c
end
end
This breakpoint is hit directly, because nc does call __read_chk().
=> 0x7ffff7831b80 <__read_chk>: sub rsp,0x8
0x7ffff7831b84 <__read_chk+4>: cmp rdx,rcx
0x7ffff7831b87 <__read_chk+7>: ja 0x7ffff7831b9d <__read_chk+29>
0x7ffff7831b89 <__read_chk+9>: movsxd rdi,edi
0x7ffff7831b8c <__read_chk+12>: xor eax,eax
rdi 0x4 4
rsi 0x7fffffff64c0 140737488315584
rdx 0x800 2048
(gdb) bt
#0 0x00007ffff7831b80 in __read_chk () from /lib64/libc.so.6
#1 0x0000000000401a62 in ?? ()
#2 0x0000000000402bca in ?? ()
#3 0x00007ffff774fd1d in __libc_start_main () from /lib64/libc.so.6
#4 0x0000000000401349 in ?? ()
(gdb) disas __read_chk
Dump of assembler code for function __read_chk:
=> 0x00007ffff7831b80 <+0>: sub rsp,0x8
0x00007ffff7831b84 <+4>: cmp rdx,rcx
0x00007ffff7831b87 <+7>: ja 0x7ffff7831b9d <__read_chk+29>
0x00007ffff7831b89 <+9>: movsxd rdi,edi
0x00007ffff7831b8c <+12>: xor eax,eax
0x00007ffff7831b8e <+14>: syscall
0x00007ffff7831b90 <+16>: cmp rax,0xfffffffffffff000
0x00007ffff7831b96 <+22>: ja 0x7ffff7831ba2 <__read_chk+34>
0x00007ffff7831b98 <+24>: add rsp,0x8
0x00007ffff7831b9c <+28>: ret
0x00007ffff7831b9d <+29>: call 0x7ffff78316c0 <__chk_fail>
0x00007ffff7831ba2 <+34>: mov rdx,QWORD PTR [rip+0x28d3ff] # 0x7ffff7abefa8
0x00007ffff7831ba9 <+41>: neg eax
0x00007ffff7831bab <+43>: mov DWORD PTR fs:[rdx],eax
0x00007ffff7831bae <+46>: or rax,0xffffffffffffffff
0x00007ffff7831bb2 <+50>: jmp 0x7ffff7831b98 <__read_chk+24>
End of assembler dump.
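The instruction right after the syscall (+16, followed by the ja at +22) follows the usual Linux x86-64 convention: a raw syscall return in the range -4095..-1 is a negated errno, and anything else is a normal result. The wrapper then stores the errno and returns -1 (+34 through +46). As a rough illustration, here is a Python sketch of that decoding; the function name decode_syscall_return is hypothetical and the code only models the comparison seen in the listing.

```python
import errno

MAX_ERRNO = 4095
MASK64 = (1 << 64) - 1

def decode_syscall_return(rax):
    """Split a raw 64-bit rax value into (result, errno_name).

    Mirrors the check in the listing: unsigned values above
    0xfffffffffffff000 are negated errno codes, anything else is a
    normal result (for read, the number of bytes read).
    """
    rax &= MASK64
    if rax > MASK64 - MAX_ERRNO:             # i.e. rax > 0xfffffffffffff000
        code = (1 << 64) - rax               # two's-complement negation
        return -1, errno.errorcode.get(code, str(code))
    return rax, None
```

For example, decode_syscall_return(0xc) yields (12, None), a 12-byte read with no error.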
Intercept a return point of "syscall read", stop when the descriptor is 4, and display the data that was read:
b *0x7ffff7831b90
commands $bpnum
silent
if ($edi==4)
display
i r rdi rsi rdx rax
db $rsi $rax
else
c
end
end
Send "scz@nsfocus" from the client nc; the breakpoint hits:
=> 0x7ffff7831b90 <__read_chk+16>: cmp rax,0xfffffffffffff000
0x7ffff7831b96 <__read_chk+22>: ja 0x7ffff7831ba2 <__read_chk+34>
0x7ffff7831b98 <__read_chk+24>: add rsp,0x8
0x7ffff7831b9c <__read_chk+28>: ret
0x7ffff7831b9d <__read_chk+29>: call 0x7ffff78316c0 <__chk_fail>
rdi 0x4 4
rsi 0x7fffffff64c0 140737488315584
rdx 0x800 2048
rax 0xc 12
00007fffffff64c0: 73 63 7a 40 6e 73 66 6f 63 75 73 0a scz@nsfocus.
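The db command used in the breakpoint comes from the author's gdbinit, which is not shown here. To make the dump line easier to read, here is a Python sketch that formats a buffer in a similar address / hex bytes / ASCII layout. The function name hexdump and the exact column spacing are assumptions, not the real db implementation.

```python
def hexdump(data, base=0, width=16):
    """Return lines of 'address: hex bytes  ascii'.

    Non-printable bytes are shown as '.', as in the dump line above.
    """
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 0x20 <= b < 0x7f else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on short rows.
        lines.append(f"{base + off:016x}: {hexpart.ljust(width * 3 - 1)}  {text}")
    return lines

if __name__ == "__main__":
    for line in hexdump(b"scz@nsfocus\n", base=0x7fffffff64c0):
        print(line)
```

Here rsi supplies the base address and rax the length: 12 bytes, which is the 11-character string plus the trailing newline (0a) that nc sends.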
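Although nc plays the server in this demo, the techniques above are general. To summarize:
a) Use "catch syscall accept" to obtain the connected socket; do not use "b *accept"
b) Intercept read/write operations at a suitable location, filtering on the connected socket
If "catch syscall read" is not buggy, use it directly
If "catch syscall read" is buggy, look for a location like 0x7ffff7831b90
Now for the other debugging scenario. The server nc listens on 12345/TCP; a client nc has established a TCP connection but has not sent any data yet. Use netstat and lsof to inspect the established TCP connection: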
$ netstat -natp | grep 12345
tcp 0 0 0.0.0.0:12345 0.0.0.0:* LISTEN 115692/nc
tcp 0 0 192.168.65.25:12345 192.168.65.20:36588 ESTABLISHED 115692/nc
$ lsof -lnPR -i 4tcp | grep 12345
nc 115692 98085 0 3u IPv4 32028398 0t0 TCP *:12345 (LISTEN)
nc 115692 98085 0 4u IPv4 32033202 0t0 TCP 192.168.65.25:12345->192.168.65.20:36588 (ESTABLISHED)
Attach gdb to the server nc and use this breakpoint directly:
b *0x7ffff7831b90
commands $bpnum
silent
if ($edi==4)
display
i r rdi rsi rdx rax
db $rsi $rax
else
c
end
end
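To summarize:
a) Establish a TCP connection with the client nc, without sending data yet
b) Use lsof to determine which connected socket corresponds to the TCP connection
c) Intercept read/write operations at a suitable location, filtering on the connected socket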
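Step b) can be scripted. Below is a minimal Python sketch that pulls the PID and descriptor number of ESTABLISHED connections out of lsof output. The function name connected_fds is hypothetical, and the column order (COMMAND PID PPID USER FD TYPE DEVICE SIZE/OFF NODE NAME) is an assumption inferred from the `lsof -lnPR` sample output in this article; other flag combinations produce different columns.

```python
import re

def connected_fds(lsof_lines, port):
    """Yield (pid, fd) for ESTABLISHED TCP sockets whose local port matches.

    Assumed column order (as in the `lsof -lnPR` samples above):
    COMMAND PID PPID USER FD TYPE DEVICE SIZE/OFF NODE NAME (STATE)
    """
    for line in lsof_lines:
        fields = line.split()
        if len(fields) < 11 or fields[-1] != "(ESTABLISHED)":
            continue                          # skips the (LISTEN) entry
        local = fields[-2].split("->", 1)[0]  # NAME is local->remote
        if local.rsplit(":", 1)[-1] != str(port):
            continue
        m = re.match(r"\d+", fields[4])       # FD column, e.g. "4u" -> 4
        if m:
            yield int(fields[1]), int(m.group())

if __name__ == "__main__":
    sample = [
        "nc 115692 98085 0 3u IPv4 32028398 0t0 TCP *:12345 (LISTEN)",
        "nc 115692 98085 0 4u IPv4 32033202 0t0 TCP 192.168.65.25:12345->192.168.65.20:36588 (ESTABLISHED)",
    ]
    print(list(connected_fds(sample, 12345)))
```

The resulting descriptor number is what goes into the `$edi==4` condition of the breakpoint.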
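The second scenario looks much like the first, merely using lsof to skip intercepting accept. But there is a subtle difference: in the first scenario the server is blocked in accept(), whereas in the second the server is blocked in read() or a similar operation.
Suppose the server actively sends data to the client as soon as the TCP connection is established. To intercept that, use the technique from the first scenario: start from accept and combine it with breakpoints such as "catch syscall write".
Why demonstrate the second scenario at all? Consider cases where the listening socket and the connected socket are not used in the same process, and not even in a parent/child pair, but in two completely separate processes. Or they are used in a parent and child, but you do not want to deal with fork/vfork in gdb.
Both scenarios rely on TCP connections being long-lived; the techniques do not apply to UDP communication.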
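One way to see why UDP differs: a UDP server has no accept() and no per-client socket, so datagrams from every peer arrive on the same bound descriptor and there is no descriptor number that isolates one client. The following Python sketch illustrates this on loopback; the function name udp_demo and the payloads are made up for the example.

```python
import socket

def udp_demo():
    """Send from two clients to one UDP socket.

    Returns (server_fd, received, client_ports), where received is a list
    of (payload, peer_port) in arrival order.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))            # port 0: kernel picks a free port
    server.settimeout(2.0)                   # avoid hanging if a datagram is lost
    addr = server.getsockname()
    server_fd = server.fileno()

    clients = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(2)]
    received, client_ports = [], []
    for i, c in enumerate(clients):
        c.bind(("127.0.0.1", 0))
        client_ports.append(c.getsockname()[1])
        c.sendto(f"client{i}".encode(), addr)
        payload, peer = server.recvfrom(2048)  # same descriptor for every peer
        received.append((payload, peer[1]))

    for c in clients:
        c.close()
    server.close()
    return server_fd, received, client_ports

if __name__ == "__main__":
    print(udp_demo())
```

With UDP, any per-client filtering would have to be based on the peer address returned with each datagram rather than on a descriptor number.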