NVIDIA nvdisasm REL section header parsing out-of-bounds write vulnerability
NVIDIA nvdisasm 12.9.88 contains an out-of-bounds write vulnerability in its REL section header parsing that can lead to arbitrary code execution. 2025-09-24 00:01:00 Author: talosintelligence.com

SUMMARY

An out-of-bounds write vulnerability exists in the REL section header parsing functionality of NVIDIA nvdisasm 12.9.88. A specially crafted ELF file can lead to arbitrary code execution. An attacker can provide a malicious file to trigger this vulnerability.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

NVIDIA nvdisasm 12.9.88

PRODUCT URLS

nvdisasm - https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html

CVSSv3 SCORE

7.8 - CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-122 - Heap-based Buffer Overflow

DETAILS

The nvdisasm tool provided by the NVIDIA CUDA Toolkit is used to display information about CUDA ELF files. Apart from disassembling compiled CUDA binary code, it can display control flow graphs, register life-ranges, debug information, and more.

The REL section of an ELF file contains relocation information needed when the binary is loaded into memory for execution. Each section is described by a section header whose layout is defined in the ELF specification. In our case, nvdisasm handles 32-bit ELF files, so we use the 32-bit definition:

#include <stdint.h>

typedef uint16_t Elf32_Half;
typedef uint32_t Elf32_Word;
typedef uint32_t Elf32_Addr;
typedef uint32_t Elf32_Off;

typedef struct {
    Elf32_Word sh_name;
    Elf32_Word sh_type;
    Elf32_Word sh_flags;
    Elf32_Addr sh_addr;
    Elf32_Off  sh_offset;
    Elf32_Word sh_size;
    Elf32_Word sh_link;
    Elf32_Word sh_info;
    Elf32_Word sh_addralign;
    Elf32_Word sh_entsize;
} Elf32_Shdr;
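
As a minimal sketch of how these fields map onto raw bytes, the following Python snippet unpacks one Elf32_Shdr with the standard struct module. It assumes a little-endian file. The helper name parse_shdr and the sample values are hypothetical and only illustrate the layout, not how nvdisasm itself reads the header.

```python
import struct
from collections import namedtuple

# Ten consecutive 32-bit words, matching the Elf32_Shdr layout above.
# "<" assumes a little-endian ELF file.
SHDR_FORMAT = "<10I"
SHDR_SIZE = struct.calcsize(SHDR_FORMAT)  # 40 bytes

SHT_REL = 9  # section type for relocation entries without addends

Elf32Shdr = namedtuple(
    "Elf32Shdr",
    "sh_name sh_type sh_flags sh_addr sh_offset "
    "sh_size sh_link sh_info sh_addralign sh_entsize",
)

def parse_shdr(data: bytes, offset: int = 0) -> Elf32Shdr:
    """Unpack a single section header starting at `offset`."""
    return Elf32Shdr(*struct.unpack_from(SHDR_FORMAT, data, offset))

# Hypothetical REL section header with sh_size = 0x17.
sample = struct.pack(SHDR_FORMAT, 1, SHT_REL, 0, 0, 0x200, 0x17, 2, 3, 4, 8)
shdr = parse_shdr(sample)
print(shdr.sh_type == SHT_REL, hex(shdr.sh_size))
```

The sh_size field read this way is the attacker-controlled value that the rest of the analysis follows.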

nvdisasm parses the REL sections of an ELF file in the function at 0x45a670. At offset 0x45bf69, the sh_size of the current REL section is loaded into the rsi register. The right shift by 3 at (1) and the left shift by 4 at (2) then compute the size argument for the internal allocation function called at (3). Together, the instructions at (1) and (2) clear the low 3 bits of sh_size and then multiply the result by 2.

0045bf69  mov     rsi, qword [rsp {var_b8_1}]
0045bf6d  mov     rdi, qword [rax+0x18]
0045bf71  shr     rsi, 0x3                              (1)
0045bf75  shl     rsi, 0x4                              (2)
0045bf79  call    sub_410af0                            (3)
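
The arithmetic at (1) and (2) can be sketched in Python as follows. This is only a model of the two shift instructions on a 64-bit register; the function name alloc_size is hypothetical, and what sub_410af0 does with the value internally is not modeled.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit register width

def alloc_size(sh_size: int) -> int:
    """Model of (1) shr rsi, 0x3 followed by (2) shl rsi, 0x4."""
    rsi = sh_size & MASK64
    rsi >>= 3                  # (1) drop the low 3 bits
    rsi = (rsi << 4) & MASK64  # (2) multiply by 16
    return rsi

for sh_size in (0x10, 0x17, 0x18, 0x21):
    print(f"sh_size={sh_size:#x} -> size argument={alloc_size(sh_size):#x}")
```

For any input, the result equals (sh_size & ~7) * 2, which is the "discard the last 3 bits, then multiply by 2" behavior described above.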

The code then calculates the size of the buffer again. At offset 0x45bfbf, together with an earlier instruction at 0x45b94b, we have:

0045b94b  lea     r14, [rbx+rsi]                        (4.a)
...
0045bfbf  lea     rax, [r14+0x7]                        (4.b)
0045bfc3  lea     rsi, [rbx+0x8]                        (4.c)
0045bfc7  sub     rax, rsi                              (4.d)

Simplifying the sub operation at (4.d) by taking into account the previous instructions (rsi holds sh_size at (4.a) and is overwritten with rbx + 0x8 at (4.c)), we have:

buffer_size = rax - rsi = (r14 + 0x7) - (rbx + 0x8) = (rbx + sh_size + 0x7) - (rbx + 0x8) = sh_size - 1
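
The simplification can be checked with a small Python model of instructions (4.a) through (4.d). This is a sketch under the assumption, taken from the derivation above, that rsi holds sh_size at (4.a); the function name buffer_size and the sample pointer value are hypothetical.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit register width

def buffer_size(rbx: int, sh_size: int) -> int:
    """Replay (4.a)-(4.d) for a given input pointer and sh_size."""
    r14 = (rbx + sh_size) & MASK64  # (4.a) lea r14, [rbx+rsi]
    rax = (r14 + 0x7) & MASK64      # (4.b) lea rax, [r14+0x7]
    rsi = (rbx + 0x8) & MASK64      # (4.c) lea rsi, [rbx+0x8]
    return (rax - rsi) & MASK64     # (4.d) sub rax, rsi

# The result does not depend on rbx: it is always sh_size - 1.
print(hex(buffer_size(0x7F0000001000, 0x17)))
```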

In short, the sh_size is decremented by 1. Then, we have a series of shift operations:

0045bfca  mov     r8, rax
0045bfcd  shr     r8, 0x3
0045bfd1  add     r8, 0x1

Simplifying, we have:

offset = (((sh_size - 1) >> 3) + 1)
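
For sh_size values of at least 1, this expression is the same as rounding sh_size / 8 up to the next integer. The following Python sketch models the three instructions and checks that equivalence; the function name entry_offset is hypothetical.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit register width

def entry_offset(sh_size: int) -> int:
    """Model of: mov r8, rax ; shr r8, 0x3 ; add r8, 0x1 (rax = sh_size - 1)."""
    rax = (sh_size - 1) & MASK64
    r8 = rax >> 3
    return (r8 + 1) & MASK64

def ceil_div8(n: int) -> int:
    """Integer ceiling of n / 8."""
    return -(-n // 8)

for sh_size in (0x08, 0x17, 0x18, 0x21):
    print(f"sh_size={sh_size:#x} -> offset={entry_offset(sh_size):#x}")
```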

Later, the code at 0x45c092 ANDs the offset with the value 0xfffffffffffffffc at (6), clearing its lowest 2 bits and so rounding it down to a multiple of 4. At (7), this value is shifted left by 4, and at (8) it is added to the rcx register, which holds the pointer to the buffer allocated at (3).

0045c092  mov     rax, r8                               // offset
0045c095  and     rax, 0xfffffffffffffffc               (6)
0045c099  mov     rdx, rax                              (7.a)
...
0045c0a0  shl     rdx, 0x4                              (7.b)
0045c0a4  add     rcx, rdx                              (8)
0045c0a7  cmp     rax, r8                               (9)
0045c0aa  je      0x45b960                              (10)
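
Instructions (6) through (10) can be modeled in Python as below. This is a sketch of the register arithmetic only; the function name tail_path and its return shape are hypothetical.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit register width

def tail_path(offset: int):
    """Return (masked offset, pointer advance in bytes, jump taken?)."""
    rax = offset & 0xFFFFFFFFFFFFFFFC  # (6) round down to a multiple of 4
    rdx = (rax << 4) & MASK64          # (7) scale by 16
    jump_taken = rax == offset         # (9)/(10) je 0x45b960
    return rax, rdx, jump_taken

for offset in (3, 4, 5):
    masked, advance, taken = tail_path(offset)
    print(f"offset={offset} masked={masked} advance={advance:#x} je_taken={taken}")
```

Only offsets that are already multiples of 4 take the jump at (10); every other value falls through to the code at 0x45c0b0.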

In effect, this code adds the size of the buffer to the pointer at (8). Then, if the offset is the same before and after the AND operation at (6), the jump at (10) is taken. Otherwise, meaning that the offset value is not a multiple of 4, execution proceeds to 0x45c0b0.

0045c0b0  lea     rsi, [rbx+0x8]
0045c0b4  mov     eax, dword [rbx]                      (11)
0045c0b9  mov     qword [rcx], rax                      (12)

At (11), the code loads a 4-byte value from the input into the rax register (the write to eax zero-extends into rax), and at (12) it stores the full 8 bytes to the memory location pointed to by rcx. Note, however, that this pointer was advanced by the total size of the allocated buffer at (8), meaning that rcx points beyond the allocated buffer, leading to a heap out-of-bounds write with controlled data.

The code continues to store input data at various offsets from rcx, which leaves ample room for successful exploitation.

0045c0ce  mov     qword [rcx+0x8], rax
...
0045c0d8  mov     eax, dword [rbx+0x8]
0045c0db  mov     edx, dword [rbx+0xc]
0045c0de  mov     qword [rcx+0x10], rax
...
0045c0ee  add     rax, rdx
0045c0f1  mov     qword [rcx+0x18], rax

As we saw earlier, in order for the vulnerable code to be reached, the jump instruction at (10) must not be executed. This means that for a REL section, the following condition must be true:

((((sh_size - 1) >> 3) + 1) & 0x3) != 0

It is trivial to write code that finds sh_size values satisfying this condition:

#!/usr/bin/env python3

def calc_offset(sh_size):
    # Mirrors the disassembly: ((sh_size - 1) >> 3) + 1
    return ((sh_size - 1) >> 3) + 1

count = 0

for sh_size in range(0xff):
    offset = calc_offset(sh_size)

    # The vulnerable path is reached when the offset is not a multiple of 4.
    if (offset & 0x3) != 0:
        count += 1
        print(f"sh_size: {sh_size:#04x}, offset: {offset:#04x}")

print(f"Total valid sh_size vulnerable values: {count}")

Executing the code we get:

...
sh_size: 0x17, offset: 0x03
sh_size: 0x18, offset: 0x03
sh_size: 0x21, offset: 0x05
...

Crash Information

Using valgrind we get:

==141955== Memcheck, a memory error detector
==141955== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. 
==141955== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==141955== Command: ./nvdisasm-12.9.88 --life-range-mode wide --print-instr-offsets-cfg --print-life-ranges --print-line-info --print-line-info-inline --print-line-info-ptx --sort-sections inputs/queue/id:002880,time:0,execs:0,orig:id:001568,src:001545+000635,time:4890772,execs:963632,op:splice,rep:4
==141955== Parent PID: 115760
==141955== 
==141955== VALGRIND_ERROR_START
==141955== Invalid write of size 8
==141955==    at 0x45C0B9: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x45C775: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x403041: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x488B249: (below main) (libc_start_call_main.h:58)
==141955==  Address 0x4a55298 is 0 bytes after a block of size 328 alloc'd
==141955==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==141955==    by 0x45D619: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x410E79: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x45BF7D: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x45C775: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x403041: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x488B249: (below main) (libc_start_call_main.h:58)
==141955== 
==141955== VALGRIND_ERROR_END
==141955== VALGRIND_ERROR_START
==141955== Invalid write of size 8
==141955==    at 0x45C0CE: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x45C775: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x403041: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x488B249: (below main) (libc_start_call_main.h:58)
==141955==  Address 0x4a552a0 is 8 bytes after a block of size 328 alloc'd
==141955==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==141955==    by 0x45D619: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x410E79: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x45BF7D: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x45C775: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x403041: ??? (in /home/dtatsis/nvdisasm-fuzz/nvdisasm-12.9.88)
==141955==    by 0x488B249: (below main) (libc_start_call_main.h:58)
==141955== 
==141955== VALGRIND_ERROR_END
==141955== 
==141955== HEAP SUMMARY:
==141955==     in use at exit: 26,737 bytes in 153 blocks
==141955==   total heap usage: 244 allocs, 91 frees, 42,944 bytes allocated
==141955== 
==141955== For a detailed leak analysis, rerun with: --leak-check=full
==141955== 
==141955== For lists of detected and suppressed errors, rerun with: -s
==141955== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

TIMELINE

2025-06-30 - Vendor Disclosure
2025-09-23 - Vendor Patch Release
2025-09-24 - Public Release

Discovered by Dimitrios Tatsis of Cisco Talos.


Source: https://talosintelligence.com/vulnerability_reports/TALOS-2025-2204