$300,000 RCE @ Wordpress
2021-10-11 17:06:14 Author: medium.com

When you first start reviewing Wordpress you’ll be immediately hit by what looks like weak system design and poor programming practices.

SQL Injection

Writing SQL queries in WordPress involves, in many cases, string concatenation, and relies on developers remembering to sanitize their inputs.

The following is an example from the class WP_Term_Query, which is used to query the WordPress taxonomy, a feature that lets WordPress developers create custom categories to describe posts. The pattern itself is extremely prevalent in the WordPress codebase. In this class, a LIMIT SQL clause is constructed using string concatenation:

src/wp-includes/class-wp-term-query.php
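A minimal sketch of the concatenation pattern, not the verbatim WordPress source. The helper name `build_limit_clause` and the table name `demo` are hypothetical and exist only for illustration:

```php
<?php
// Hypothetical helper illustrating the concatenation pattern; not WordPress code.
function build_limit_clause($offset, $number): string
{
    // The values are spliced straight into the SQL text, so the result is
    // only as safe as whatever validation ran before this point.
    if ($offset) {
        return 'LIMIT ' . $offset . ',' . $number;
    }
    return 'LIMIT ' . $number;
}

echo build_limit_clause(20, 10), PHP_EOL;                   // LIMIT 20,10
echo build_limit_clause(0, '10; DROP TABLE demo'), PHP_EOL; // LIMIT 10; DROP TABLE demo
```

The second call shows why every caller matters: nothing in the helper itself stops a non-numeric string from reaching the query text.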

Eventually the various clauses are combined further down in the file to form the full SQL statement:

src/wp-includes/class-wp-term-query.php

The next question is how $offset and $number are validated. They’re validated further up the class in the parse_query function, which converts the input into an integer.

src/wp-includes/class-wp-term-query.php
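The integer conversion can be sketched as below. WordPress ships its own helper, absint, for this kind of cast; `to_non_negative_int` is a hypothetical stand-in written for this example, under the assumption that the goal is a non-negative integer:

```php
<?php
// Hypothetical stand-in for integer coercion: whatever comes in, only a
// non-negative integer reaches the SQL string.
function to_non_negative_int($value): int
{
    return abs((int) $value);
}

var_dump(to_non_negative_int('10; DROP TABLE demo')); // int(10)
var_dump(to_non_negative_int('-5'));                  // int(5)
var_dump(to_non_negative_int('abc'));                 // int(0)
```

PHP’s integer cast keeps only the leading numeric part of a string, so the injected SQL text is discarded before concatenation.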

Finally, parse_query is called within the main entrypoint get_terms. This means that the validation will always be called no matter how any developer integrates with this class.

src/wp-includes/class-wp-term-query.php

So what’s the problem with this code then? We’ve established that validation works in this case and can’t be bypassed, everything is ok — right?

What is a Secure System?

Well, the answer to that question gets to the core of what it means to design a secure system. Perhaps we can define a system as secure if security researchers are not uncovering vulnerabilities. It looks like the last publicly disclosed SQL injection in the WordPress core was uncovered in 2017. So in this sense, perhaps WordPress is sufficiently secure against SQL injection vulnerabilities.

That doesn’t mean a security flaw will never emerge, though. Code is constantly being added to and removed from systems, so a vulnerability that doesn’t exist today may exist tomorrow, and vice versa. Perhaps a better definition of a secure system is one where the emergence of a flaw is less likely to lead to complete compromise of the system, i.e. there are sufficient mitigations in place. Under this definition, WordPress, in my opinion, isn’t sufficiently secure against SQL injection. For every endpoint, any mistake in type validation could potentially lead to a serious vulnerability.

A secure system is one where the emergence of a flaw is unlikely to lead to complete compromise of the system.

The bigger issue with how Wordpress implements SQL queries is that developers and review processes are fallible. By relying on string concatenation and userland validation, every line of query code becomes security critical — since only one small mistake, one oversight, is all that will be needed for complete compromise of the database.

Other platforms and frameworks generally isolate this functionality into a specific place, a core abstraction that developers use to implement queries. The result is that only one area of code must be tested and secured. For example, PHP’s Symfony framework is commonly used with an ORM (Doctrine). The limit clause creation would look something like this:

$entities = $em
    ->createQuery('...')
    ->setMaxResults($limit)
    ->setFirstResult($offset)
    ->execute();

In this case, even if a developer forgets to convert the string input into a number, the request may fail, but no vulnerability arises.
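To make the "one place to secure" idea concrete, here is a minimal sketch of such an abstraction. `PagedQuery` is a hypothetical class, not Doctrine’s implementation; only its method names are borrowed from the snippet above. A real ORM would typically bind parameters rather than format them into the SQL text:

```php
<?php
// Hypothetical query object: the only place where LIMIT/OFFSET text is produced.
final class PagedQuery
{
    private $limit = null;
    private $offset = 0;

    public function setMaxResults($limit): self
    {
        $this->limit = self::requireInt($limit, 'limit');
        return $this;
    }

    public function setFirstResult($offset): self
    {
        $this->offset = self::requireInt($offset, 'offset');
        return $this;
    }

    public function toSql(string $select): string
    {
        if ($this->limit === null) {
            return $select;
        }
        return sprintf('%s LIMIT %d OFFSET %d', $select, $this->limit, $this->offset);
    }

    private static function requireInt($value, string $name): int
    {
        // Reject anything that is not a non-negative integer instead of trusting callers.
        if (!is_int($value) || $value < 0) {
            throw new InvalidArgumentException("$name must be a non-negative integer");
        }
        return $value;
    }
}

echo (new PagedQuery())->setMaxResults(10)->setFirstResult(20)->toSql('SELECT * FROM terms'), PHP_EOL;

try {
    (new PagedQuery())->setMaxResults('10; DROP TABLE demo');
} catch (InvalidArgumentException $e) {
    echo 'rejected: ', $e->getMessage(), PHP_EOL;
}
```

A caller who forgets to validate gets an exception rather than an injectable query, which is the failure mode the paragraph above describes.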

A Counter-Argument?

Let’s wait a bit longer before cracking out the pitchforks

All of this so far sounds fairly damning and critical of the WordPress developers, but before going on I think it’s important to reevaluate our own views and preconceptions in light of our findings. After all, my review did not uncover any SQL injection vulnerabilities, nor have there been any publicly disclosed in quite some time. Perhaps the reality is that these flaws are of such a trivial nature, are so easy to review for, grep for and fix, that the risk is just theoretical. Perhaps the lack of any core abstractions, i.e. an ORM, makes the code easier to implement/review and therefore critical flaws have nowhere to hide except in plain sight.

The argument sounds reasonable, but am I convinced? Not really. Are you? I’d love to hear your thoughts below.

A Brief Aside on Deserialization

I explored many other options in the time I allocated to WordPress. In particular, I looked at how values are serialized/deserialized in the WordPress database. You’ll find that PHP’s unserialize and serialize are used in a number of key places in the core application. Some notable examples of this are the various _meta tables and the options/transients API, which is used for managing settings and cached data.

When reviewing how metadata values are retrieved, I found the code below interesting — why would a value only “maybe” be unserialized?

get_metadata_raw @ src/wp-includes/meta.php

The answer is that when data is written, it’s only serialized if the value is an array or an object, or if it passes the is_serialized check. In other words, WordPress wants to support writing scalar strings into the database, possibly to avoid the overhead of deserializing every entry in the metadata/options/transients tables.

src/wp-includes/functions.php

If the alarm bells are ringing at this point, you’d be thinking along the same lines as me — is it possible we can find a string that will be written as a string but deserialized as a PHP object?

src/wp-includes/functions.php

The answer was no: the write and read logic are symmetrical enough to avoid exploitation. Even with complete control over the string value being written to a metadata/transients/options table, this is currently unexploitable. Interestingly, the WordPress developers did once ship an arbitrary deserialization flaw in this logic, which was fixed in WordPress 3.6.1 back in 2013. This is ancient history in terms of security developments, so I don’t regard this as something to hold against the WordPress developers.

maybe_serialize @ src/wp-includes/functions.php
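The symmetry argument can be sketched as follows. All three functions are simplified, hypothetical stand-ins: `looks_serialized` uses a trial unserialize, whereas WordPress’s is_serialized relies on pattern checks, so treat this as an illustration of the idea rather than the actual logic:

```php
<?php
// Simplified stand-in for is_serialized(); an assumption made for this sketch.
function looks_serialized($value): bool
{
    if (!is_string($value)) {
        return false;
    }
    if ($value === 'b:0;') {
        return true; // serialized boolean false
    }
    // allowed_classes => false keeps the probe from instantiating real classes.
    return @unserialize($value, ['allowed_classes' => false]) !== false;
}

function sketch_maybe_serialize($value)
{
    if (is_array($value) || is_object($value)) {
        return serialize($value);
    }
    // A string that already looks serialized is wrapped once more, so it
    // comes back out as the same string rather than as a decoded value.
    if (looks_serialized($value)) {
        return serialize($value);
    }
    return $value;
}

function sketch_maybe_unserialize($stored)
{
    return looks_serialized($stored) ? unserialize($stored) : $stored;
}

$attacker = 'O:8:"stdClass":0:{}';
$stored   = sketch_maybe_serialize($attacker);
var_dump(sketch_maybe_unserialize($stored) === $attacker); // bool(true)
```

Because the write path wraps anything the read path would decode, an attacker-supplied string that resembles a serialized object round-trips as a plain string.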

So why am I telling you about this if it is not a flaw? It’s notable for two reasons. The first is that it gives us a target for database writes if we find an SQL injection vulnerability. If we can combine this with the right deserialization gadgets, it could potentially be leveraged to achieve RCE.

The second reason is that PHP’s native serialization is inherently risky to use. It doesn’t implement any functionality that validates that the serialized value was issued by the application or another known actor. This means that if an attacker can write something that looks like a serialized PHP value, it will be deserialized without any verification. Since most WordPress applications include third-party plugins, some of which will have a poor security posture, it’s certainly possible that attractive deserialization gadgets exist in many installations. Any developments in this space will be crucial to keep an eye on.
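For contrast, one common way to check that a serialized value was issued by the application is to sign it with a secret before storing it. This is a sketch under that assumption, not something the article says WordPress does; `sign_payload`, `verify_and_decode` and the key value are hypothetical:

```php
<?php
// Hypothetical helpers; $key is an application secret kept outside the database.
function sign_payload(string $serialized, string $key): string
{
    return hash_hmac('sha256', $serialized, $key) . ':' . $serialized;
}

function verify_and_decode(string $stored, string $key)
{
    $parts = explode(':', $stored, 2);
    if (count($parts) !== 2) {
        throw new UnexpectedValueException('missing signature');
    }
    [$mac, $serialized] = $parts;
    // hash_equals compares in constant time.
    if (!hash_equals(hash_hmac('sha256', $serialized, $key), $mac)) {
        throw new UnexpectedValueException('signature mismatch');
    }
    // Even after verification, refuse to instantiate classes.
    return unserialize($serialized, ['allowed_classes' => false]);
}

$key    = 'example-secret-not-for-production';
$stored = sign_payload(serialize(['theme' => 'dark']), $key);
var_dump(verify_and_decode($stored, $key));
```

With a scheme like this, an attacker who can write to the database but does not know the key cannot produce a value that reaches unserialize.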

Must-Use Plugins

Totally benign, as all things should be

Anyone who knows anything about WordPress will have heard about WordPress plugins. WordPress is built around the concept that third-party developers can write extensions to the WordPress core. These plugins can then be installed by non-technical website administrators via the WordPress admin UI, if it is enabled. PHP plugins are a trope for achieving RCE in both the real world and CTFs.

However, we’re not interested in admin functionality, since WordPress administrators are already extremely highly privileged. What I found interesting is that WordPress has a secondary mechanism for loading plugins, named must-use plugins, which are loaded on boot regardless of whether an administrator has enabled them. The bootstrap code that does this is in wp-settings.php, which is included in all WordPress pages.

src/wp-settings.php

The only requirement is that a PHP file is dropped in the WPMU_PLUGIN_DIR folder, which is the wp-content/mu-plugins folder by default.

src/wp-includes/load.php
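From a defender’s point of view, the discovery step can be sketched as a small audit helper. This assumes the loader picks up top-level files ending in .php; `list_autoloaded_php_files` is a hypothetical name and the path in the usage line is a placeholder:

```php
<?php
// Hypothetical audit helper: lists the PHP files a loader of this kind would pick up.
function list_autoloaded_php_files(string $dir): array
{
    if (!is_dir($dir)) {
        return [];
    }
    $found = [];
    foreach (scandir($dir) as $entry) {
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        // Only top-level regular files ending in .php are considered.
        if (is_file($path) && substr($entry, -4) === '.php') {
            $found[] = $path;
        }
    }
    sort($found);
    return $found;
}

// Placeholder path; adjust to the installation being audited.
print_r(list_autoloaded_php_files('/path/to/wp-content/mu-plugins'));
```

Any file this returns that the site owner does not recognise deserves a closer look, since it runs on every request without appearing as an enabled plugin.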

Whilst this alone isn’t a vulnerability, since as an unauthenticated user we have no mechanism for writing to the WPMU_PLUGIN_DIR directory, this directory does offer an attractive and relatively stealthy alternative target for WordPress installations with plugins installed. If a situation arises in which you have an arbitrary filesystem write, but can’t write to the /var/www directory, can’t get the web server to execute your script as PHP, or need persistence, this could be a potentially useful alternative.

The Traditional Path to RCE

The analysis I’ve done so far has looked at a very traditional route to RCE. Putting it all together, if a bug does emerge that gives us SQL Injection, the path to RCE could look like one of these scenarios:

  1. SQL Injection -> Auth Bypass -> WP Admin -> Admin Bugs/File Upload
  2. SQL Injection -> Deserialization Attack -> Secondary Flaws

The final area which I think is potentially a fruitful direction to focus research efforts is libxml2. It’s a widely used C library that PHP integrates with in order to parse and query XML documents. In the case of Wordpress, it’s used in many places throughout the application, but the xmlrpc integration is relevant for preauthentication exploits.

libxml2 is a software library for parsing XML documents.
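To show how little PHP code sits between a request body and the parser, here is a minimal sketch using PHP’s XML parser functions. It assumes a PHP build where those functions are backed by libxml2 (the default), and the request body and `extract_method_name` helper are illustrative rather than taken from WordPress:

```php
<?php
// Illustrative XML-RPC style request body; the method name is only an example.
$body = '<methodCall><methodName>demo.sayHello</methodName><params/></methodCall>';

function extract_method_name(string $xml): ?string
{
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    $values = [];
    $index  = [];
    // The whole document is handed to the underlying C parser in this call.
    if (xml_parse_into_struct($parser, $xml, $values, $index) !== 1) {
        return null; // parser reported an error
    }
    $position = $index['methodName'][0] ?? null;
    return $position === null ? null : ($values[$position]['value'] ?? null);
}

var_dump(extract_method_name($body)); // string(13) "demo.sayHello"
```

Everything interesting happens inside the single parse call, which is why bugs in the C library matter to an application written entirely in PHP.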

So why is libxml2 notable? Parsers are notoriously difficult to implement, and even more so in a language without memory safety, like C. I invite you to browse the code in libxml2 to get a sense of how it’s implemented, but at a glance it seems complex. For example, the bulk of the parsing logic exists in a 10,000-line file. Is it easy for the developers working on libxml2 to reason about and understand the consequences of all changes?

Is all of the parser state handled correctly?

In 2021 alone we’ve seen a use-after-free and an out-of-bounds read, so memory corruption flaws do not exist solely in the realm of imagination. However, the reality is that RCE via memory corruption today is complex: there are many countermeasures that must be bypassed, which means that specific gadgets are needed, and those could simply not exist. Furthermore, just because libxml2 is complex and written in C doesn’t mean it’s definitely insecure… but the risk is certainly non-zero.

Is $300k a sufficient incentive to cover the research time involved in finding and building an exploit chain for libxml2? It is a large commitment in terms of time, and potentially a fruitless one, but overall my opinion is that it probably is.

Comment at top of the parser logic

At some point in the future I’d love to devote some time to reviewing libxml2 in detail, but it’s beyond my capabilities at the moment. However, a determined researcher with sufficient time could potentially uncover useful gadgets here and you never know — maybe the set needed for RCE currently exists, maybe not.


Source: https://medium.com/@_ip_/300-000-rce-wordpress-29700ad6a993?source=rss-76269375e0ff------2