By Mark Brand, Project Zero
In 2018, in the v8.5a version of the ARM architecture, ARM proposed a hardware implementation of tagged memory, referred to as MTE (Memory Tagging Extensions).
In Part 1 we discussed testing the technical (and implementation) limitations of MTE on the hardware that we've had access to. This post will now consider the implications of what we know on the effectiveness of MTE-based mitigations in several important products/contexts.
To summarize - there are two key classes of bypass techniques for memory-tagging based mitigations, and these are the following (for some examples, see Part 1):
There are two main modes for MTE enforcement:
Since Spectre, it has been clear that using standard memory-tagging approaches as a "hard probabilistic mitigation"1 is not generally possible. In any context where an attacker can construct a speculative side-channel, known-tag-bypasses are a fundamental weakness that must be accounted for.
Another proposed approach is combining MTE with another software approach to construct a "hard deterministic mitigation"2. The primary example of this would be the *Scan+MTE combinations proposed in Chrome to mitigate use-after-free vulnerabilities by ensuring that a tag is not re-used for an allocation while there are any stale pointers pointing to that allocation.
In this case, is async-MTE sufficient as an effective mitigation? We have demonstrated techniques that allow an unknown-tag-bypass when using async-MTE, so it seems clear that for a "hard" mitigation, (at least) sync-MTE will be necessary. This should not, however, be interpreted as implying that such a "soft" mitigation would not prove a significant inconvenience to attackers - we're going to discuss that in detail below.
In order to understand the "additional difficulty" that attackers will face in writing exploits that can bypass MTE based mitigations, we need to consider carefully the context in which the attacker finds themself.
We have made some assumptions here about the target reliability that a high-tier attacker would want as around 95% - this is likely lower than currently expected in most cases, but probably higher than the absolute limit at which an exploit might become too unreliable for their operational needs. We also note that in some contexts an attacker might be able to use even an extremely unreliable exploit without significantly increasing their risk of detection. While we'd expect attackers to desire (and invest in) achieving reliability, it's unlikely that even if we could force an upper bound to that reliability this would be enough to completely prevent exploitation. However, any such drop in reliability should generally be expected to increase detection of in-the-wild usage of exploits, increasing the risk to attackers accordingly. An additional note is that most unknown-tag-bypasses would be prevented by the use of sync-MTE, at least in the absence of specific weaknesses in the application which would over time likely be fixed as exploits are observed exploiting those weaknesses.
We consider 4 main contexts here, as we believe these are the most relevant/likely use-cases for a usermode mitigation:
Context |
Mode |
Bypass techniques | |||||
known-tag-bypass |
unknown-tag-bypass | ||||||
Chrome: Renderer Exploit |
async |
Trivial ♻️ |
Likely trivial ♻️ | ||||
sync |
Trivial ♻️ |
Bypass techniques should be rare 🛠️ | |||||
Chrome: IPC Sandbox Escape |
async |
Likely possible in many cases ♻️ |
Likely possible in many cases 🐛* | ||||
sync |
Likely possible in many cases ♻️ |
Bypass techniques should be rare 🛠️ | |||||
Android: Binder Sandbox Escape |
async |
Difficulty will depend on service |
Difficulty will depend on service 🐛* | ||||
sync |
Difficulty will depend on service |
Bypass techniques should be rare 🛠️ | |||||
Android: Messaging App Oneshot |
async |
Likely impossible in most cases |
Good enough bugs will be very rare 🐛* | ||||
sync |
Likely impossible in most cases |
Bypass techniques should be rare 🛠️ |
The degree of pain for attackers caused by needing to bypass MTE is roughly assessed from low to very high.
♻️: Once developed, generic bypass technique can likely be shared between exploits.
🛠️: Limited supply of bypass techniques that could be fixed, eventually eliminating this bypass.
🐛: Additional constraints imposed by bypass techniques mean that the subset of issues that are exploitable is significantly reduced with MTE.
* Note that it's also potentially possible to design software to make exploitation of these unknown-tag-bypass techniques more restrictive by eg. inserting system calls in specific choke-points. We haven't investigated the practicality or limitations of such an approach at this time, but it is unlikely to be generally applicable especially where third-party code is involved.
Spectre-type speculative side-channels can be used to break tag confidentiality, so known-tag-bypasses are a certainty.
For unknown-tag-bypasses, javascript should in most situations be able to avoid system calls for the duration of their exploit, so the exploit needs to complete within the soft time constraint. It's likely that both known-tag-bypass and unknown-tag-bypass techniques can be made generic and reused across multiple different vulnerabilities.
It is likely that Spectre-type speculative side-channels can be used to break tag confidentiality, so known-tag-bypasses are a possibility.
For unknown-tag-bypasses, we believe that there are circumstances under which multiple IPC messages can be processed without a system call in between. This does not hold for all situations, and the current public state-of-the-art techniques for Browser process IPC exploitation would perform multiple system calls and would not be suitable.
It's possible that a generic speculative-side-channel attack could be developed that would allow reuse of techniques for known-tag-bypasses against a specific privileged process (likely the Browser process). Any unknown-tag-bypass exploit would likely require additional per-bug cost to develop due to the additional complexity required in avoiding system calls for the duration of the exploit.
It is likely that Spectre-type speculative side-channels can be used to break tag confidentiality, so known-tag-bypasses are a possibility.
As there are a large number of different Binder IPC services hosted in different processes, a known-tag-bypass technique is less likely to be reusable between different vulnerabilities. The exploitation of unknown-tag-bypasses is unlikely to be reusable, and will require specific per-vulnerability development.
For unknown-tag-bypasses we note that there is a mandatory system call ioctl(..., BINDER_WRITE_READ, …) between different IPC messages. This implies that an exploit would need to complete within the execution of the triggering IPC message in addition to meeting the soft time constraint.
It is unlikely that Spectre-type speculative side-channels can be used to break tag confidentiality, so known-tag-bypasses are highly unlikely, unless the application allows alternative side channels (or eg. repeated exploitation attempts until the tag is correct).
For unknown-tag-bypasses, attackers will need a very good one-shot bug, likely in complex file-format parsing. An exploit would still need to meet the soft time constraint.
It's likely that exploitation techniques here will be bespoke and both per-application and per-vulnerability, even if there is significant shared code between different messaging applications.
Part 3 continues this series with a more detailed discussion of the specifics of applying MTE to the kernel, which has some additional nuance (and may still contain some information of interest even if you're only interested in user-space applications).