Vulnhalla: Picking the true vulnerabilities from the CodeQL haystack

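Researchers built Vulnhalla, an open-source tool that combines CodeQL with GPT-4o to filter false positives out of static analysis results. In two days it surfaced multiple CVEs in projects including the Linux Kernel, sharply cutting triage effort and cost. 2025-12-21 10:47:14 Author: www.reddit.com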

Full disclosure: I'm a researcher at CyberArk Labs.

This is a technical deep dive from our threat research team: no marketing fluff, just code and methodology.
Static analysis tools like CodeQL are great at identifying "maybe" issues, but the signal-to-noise ratio is often overwhelming. You get thousands of alerts, and triaging all of them by hand is not realistic.

We built an open-source tool, Vulnhalla, to address this issue. It feeds CodeQL's "haystack" of alerts into GPT-4o, which reasons about the surrounding code context to judge whether each alert is legitimate.
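
To make the idea concrete, here is a minimal sketch of that kind of filter loop. It is not Vulnhalla's actual code: it assumes the CodeQL results were exported as a SARIF log, and `iter_alerts`, `snippet`, `build_prompt`, `triage`, and the `ask_model` callback (a stand-in for whatever LLM client you use) are all hypothetical names made up for illustration.

```python
from typing import Callable, Dict, Iterator, List


def iter_alerts(sarif: dict) -> Iterator[dict]:
    """Yield one flat record per result in a SARIF log."""
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            phys = locations[0].get("physicalLocation", {})
            yield {
                "rule": result.get("ruleId", "unknown"),
                "message": result.get("message", {}).get("text", ""),
                "file": phys.get("artifactLocation", {}).get("uri", ""),
                "line": phys.get("region", {}).get("startLine", 0),
            }


def snippet(source: str, line: int, radius: int = 5) -> str:
    """Return numbered source lines around a 1-based line number."""
    lines = source.splitlines()
    start = max(line - 1 - radius, 0)
    end = min(line + radius, len(lines))
    return "\n".join(f"{n + 1}: {lines[n]}" for n in range(start, end))


def build_prompt(alert: dict, context: str) -> str:
    """Hypothetical prompt; a real one would carry far more context."""
    return (
        f"A static analyzer raised rule {alert['rule']} at "
        f"{alert['file']}:{alert['line']}: {alert['message']}\n"
        f"Code:\n{context}\n"
        "Answer with TRUE POSITIVE or FALSE POSITIVE, then explain."
    )


def triage(
    sarif: dict,
    sources: Dict[str, str],
    ask_model: Callable[[str], str],
) -> List[dict]:
    """Keep only the alerts the model labels as true positives."""
    kept = []
    for alert in iter_alerts(sarif):
        context = snippet(sources.get(alert["file"], ""), alert["line"])
        verdict = ask_model(build_prompt(alert, context)).strip().upper()
        if verdict.startswith("TRUE"):
            kept.append(alert)
    return kept
```

Passing the model in as a plain callable keeps the sketch independent of any particular API client and makes it easy to test with a stub. Whatever survives the filter still needs a human to confirm it.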

The sheer volume of false positives often tricks us into thinking a codebase is "clean enough" just because we can't physically get through the backlog. That is frustrating, and the vulnerabilities are still there, hidden in the noise.
Once we used GPT-4o to strip away ~96% of the false positives, we uncovered confirmed CVEs in the Linux Kernel, FFmpeg, Redis, Bullet3, and RetroArch. We found these in just 2 days of running the tool and triaging the output (total API cost <$80).
Running the tool for longer periods, with improved models, can reveal many additional vulnerabilities.
Write-up & Tool:

Technical Blog: https://www.cyberark.com/resources/threat-research-blog/vulnhalla-picking-the-true-vulnerabilities-from-the-codeql-haystack
GitHub: https://github.com/cyberark/Vulnhalla

Source: https://www.reddit.com/r/netsec/comments/1ps3taw/vulnhalla_picking_the_true_vulnerabilities_from/