Developed in 2012 and made open source in 2015 by Facebook, GraphQL (Graph Query Language) has been under the umbrella of the GraphQL Foundation since 2019.
GraphQL is a query language, i.e. a language used to access data in a database or any other information system, in the same way as SQL (Structured Query Language).
It also comes with an SDL (Schema Definition Language), which is used to describe the schema of an API. GraphQL itself is a specification rather than a piece of software. The various existing implementations (Apollo Server, Express GraphQL, graphql-yoga, etc.) follow this specification.
Like all APIs, GraphQL enables data to be transferred between a client and a server. It is an alternative to REST (Representational State Transfer) APIs. The main advantage of GraphQL is that it can provide an application with all the data it needs through a single request, delegating the task of structuring the data to the server.
In 2023, a survey conducted by Postman showed that GraphQL APIs are the third most widely used API architecture. A pentester will therefore regularly be confronted with this type of API during pentests.
In this article, we will look at how GraphQL APIs work, the vulnerabilities and attacks common to this type of system, and the best practices and measures to implement to secure your systems. We will also look at the methodology and tools used during a GraphQL API pentest.
Before diving into exploiting the various vulnerabilities associated with this type of API, it is important to take the time to understand it.
From an auditor’s point of view, a thorough understanding of how GraphQL works is crucial.
We will therefore begin by detailing the structure and main concepts of GraphQL, before looking at the various possible attack vectors and exploitable vulnerabilities.
The schema is the central element of a GraphQL API. It is the fundamental structure, defining all possible interactions between the server and the client. For a given API, there is only one schema which acts as a reference.
The schema specifies the types of requests that a client can submit, the types of data that can be retrieved from the server, and the relationships between these different types of data.
All the data handled is organised by “type”. We are now going to look at the most commonly encountered types and those that will be the focus of our attention during a pentest.
The scalar type is the most elementary data type in GraphQL. It represents the primitive data assigned to a field. The built-in scalars are Int, Float, String, Boolean and ID.
The following example illustrates the use of scalar types to define the Book object, which has three fields of types String, String and Int respectively:
These are the building blocks for the more complex types we will see later.
Object types are the central construct for modelling complex data structures with GraphQL. The vast majority of types defined in a GraphQL API are objects.
An object is made up of fields, each with its own associated data type. This type can be a scalar (String, Int, etc.) for primitive data, or another object to represent nested data.
Other more advanced types such as enumerations, unions or interfaces can also be used, but we won’t go into them here.
Taking the previous example, Book is an object made up of scalar fields only.
We can imagine a more complex object, Library. This will allow us to introduce lists and non-nulls, and to discover a slightly more complex implementation of an object.
The notation [Type] indicates a list whose elements are of type Type. An exclamation mark marks a value as non-null: in [Type!] the elements of the list cannot be null, and in [Type!]! the list itself cannot be null either (be careful not to confuse an empty list with a null list).
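The nullability rules of this notation can be sketched in a few lines of Python. This is only an illustration: the `conforms` helper is hypothetical, and it checks list shape and nullability only, not whether a value really is a `Book`.

```python
def conforms(value, type_ref):
    """Check only nullability and list shape of `value` against a type
    written in GraphQL notation, e.g. "[Book!]!" (illustrative helper)."""
    if type_ref.endswith("!"):
        if value is None:
            return False          # non-null position, null value
        type_ref = type_ref[:-1]
    elif value is None:
        return True               # nullable position accepts null
    if type_ref.startswith("[") and type_ref.endswith("]"):
        if not isinstance(value, list):
            return False
        inner = type_ref[1:-1]
        return all(conforms(item, inner) for item in value)
    return True                   # named type: not checked further here


print(conforms(None, "[Book!]"))    # the list itself may be null
print(conforms([], "[Book!]!"))     # an empty list is not a null list
print(conforms([None], "[Book!]"))  # but its elements may not be null
```

The three calls show the distinction made above: a null list, an empty list and a list containing a null element are three different cases.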
Objects are essential building blocks in GraphQL, enabling data models to be structured.
While most of the types defined in a GraphQL schema are objects, there are also special root types, the two most common being Query and Mutation (a third, Subscription, is not covered here).
These are the types we will use to retrieve or modify data.
A Query is used to retrieve data from the GraphQL server. It follows the founding principle of GraphQL by returning only the fields explicitly requested in the query.
Here is an example of a basic query to retrieve certain fields from the Book object:
Here we have a very basic query, called GetLibrary, where we explicitly request certain fields. In response, we’ll get only the fields requested.
Queries can be much more complex by adding arguments, variables, fragments, directives and so on. We won’t go into these details here.
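To make this concrete, here is a minimal Python sketch of how such a query is usually sent: as a JSON body in an HTTP POST request, with the keys `query`, `operationName` and `variables`. The operation name GetLibrary comes from the text above, but the field names (`book`, `title`, `author`) are assumptions made for the example.

```python
import json

# Hypothetical query document: only the name GetLibrary comes from the
# article, the selected fields are illustrative.
GET_LIBRARY = """
query GetLibrary {
  book {
    title
    author
  }
}
"""


def build_request_body(query, variables=None, operation_name=None):
    """Serialise a GraphQL operation into the JSON body of a POST request."""
    body = {"query": query}
    if variables is not None:
        body["variables"] = variables
    if operation_name is not None:
        body["operationName"] = operation_name
    return json.dumps(body)


print(build_request_body(GET_LIBRARY, operation_name="GetLibrary"))
```

The server's response mirrors the shape of the selection: only `title` and `author` would come back, nested under `data.book`.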
Mutations, on the other hand, can be used to modify data on the server side (create, update, delete). A Mutation can also return data in its response.
Let’s say we wanted to add a new book to our library. We could use a mutation like this:
This mutation allows us to add a book to a library. AddBookInLibrary specifies the expected fields for this mutation, and addBook is the field that will be executed.
In return, we request the title field, which will be returned in the response if the request is successful.
Like Queries, Mutations can be very complex and nested as required. But this simple example illustrates their main role of modifying data.
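As an illustration, the mutation described above could be sent with variables as in the following Python sketch. The names AddBookInLibrary, addBook and title come from the text; the arguments (`title`, `author`) and the helper functions are assumptions.

```python
import json

# Operation and field names come from the article; the arguments are assumed.
ADD_BOOK = """
mutation AddBookInLibrary($title: String!, $author: String!) {
  addBook(title: $title, author: $author) {
    title
  }
}
"""


def mutation_body(title, author):
    """Build the JSON body, passing user data as variables instead of
    concatenating it into the query string."""
    return json.dumps({
        "query": ADD_BOOK,
        "operationName": "AddBookInLibrary",
        "variables": {"title": title, "author": author},
    })


def read_result(response_text):
    """Return the `data` part of a response, raising if the server reported errors."""
    response = json.loads(response_text)
    if response.get("errors"):
        raise RuntimeError(response["errors"][0].get("message", "GraphQL error"))
    return response["data"]


# Shape of a successful response for this mutation (simulated, not a real server).
print(read_result('{"data": {"addBook": {"title": "Dune"}}}'))
```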
We have now covered the basics of GraphQL. A thorough understanding of these concepts is essential to mastering this language and optimising penetration tests on APIs using it.
Before getting to the heart of the matter, it is important to remember that the pentesting of a GraphQL API generally follows the same rules as for other types of API.
The testing methodology will therefore be similar. However, as GraphQL has its own specificities, certain stages and attack vectors are unique.
A reconnaissance phase is necessary to assess the attack surface, unless the client provides full documentation of the API. We will describe the tools and techniques available to us below.
Once we have this information, we’ll move on to the vulnerability identification part. Here we will test vulnerabilities common to all APIs (injections, access control flaws, exposure of accessible data, etc.) as well as vulnerabilities specific to GraphQL.
To find out more about the objectives and testing methodology of an API pentest, we refer you to our dedicated article: API Penetration Testing: Objective, Methodology, Black Box, Grey Box and White Box Tests.
The first crucial step in pentesting a GraphQL API is to discover its endpoint. This is not always obvious, particularly if the API is only used for certain functions or for particular roles.
So we’re going to fuzz our target to find the endpoint, using this Seclists wordlist for example, with the query body “query{__typename}”.
If we get a response containing {"data":{"__typename":"Query"}}, then this will confirm the presence of a GraphQL API on the URL under test.
Alternatively, you can simply use the application legitimately and the graphql endpoint will be discovered.
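The fuzzing step can be sketched in Python as follows. The list of paths is a short hypothetical sample (a real test would load a complete wordlist), and the `probe` function should only be run against targets you are authorised to test.

```python
import json
import urllib.error
import urllib.request

PROBE = json.dumps({"query": "query{__typename}"}).encode()

# Hypothetical sample of candidate paths, not a complete wordlist.
CANDIDATE_PATHS = ["/graphql", "/api/graphql", "/v1/graphql", "/graphiql"]


def looks_like_graphql(body_text):
    """True if the body is a JSON object whose `data` contains `__typename`."""
    try:
        data = json.loads(body_text)
    except ValueError:
        return False
    if not isinstance(data, dict):
        return False
    payload = data.get("data")
    return isinstance(payload, dict) and "__typename" in payload


def probe(base_url, paths=CANDIDATE_PATHS, timeout=5):
    """POST the probe query to each candidate path and return the ones that answer."""
    found = []
    for path in paths:
        request = urllib.request.Request(
            base_url.rstrip("/") + path,
            data=PROBE,
            headers={"Content-Type": "application/json"},
        )
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                text = response.read().decode("utf-8", "replace")
        except urllib.error.HTTPError as error:
            text = error.read().decode("utf-8", "replace")  # 4xx bodies can still be GraphQL
        except OSError:
            continue  # connection refused, timeout, DNS failure...
        if looks_like_graphql(text):
            found.append(path)
    return found
```

Calling `probe("https://target.example")` (a placeholder URL) would return the paths that answered like a GraphQL endpoint.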
Once the endpoint has been discovered, we can begin the schema discovery and enumeration.
During a GraphQL API pentest, auditors can rely on various tools to facilitate their task at different stages.
Here is an overview of the main tools you need to know about:
Introspection is not a tool, but a GraphQL feature. It is used to retrieve the complete schema of an API, which defines its data structure.
We will use this introspection request:
{__schema{queryType{name}mutationType{name}subscriptionType{name}types{...FullType}directives{name description locations args{...InputValue}}}}fragment FullType on __Type{kind name description fields(includeDeprecated:true){name description args{...InputValue}type{...TypeRef}isDeprecated deprecationReason}inputFields{...InputValue}interfaces{...TypeRef}enumValues(includeDeprecated:true){name description isDeprecated deprecationReason}possibleTypes{...TypeRef}}fragment InputValue on __InputValue{name description type{...TypeRef}defaultValue}fragment TypeRef on __Type{kind name ofType{kind name ofType{kind name ofType{kind name ofType{kind name ofType{kind name ofType{kind name ofType{kind name}}}}}}}}
If enabled, this request will return a JSON response detailing all the types, fields, arguments, etc. defined in the schema.
Having access to the complete schema is extremely useful, and enables the enumeration phase to be completed.
Once in our possession, it is possible to identify potentially sensitive areas or data that should not be exposed publicly.
However, for security reasons, introspection is generally disabled on GraphQL APIs in production (in theory).
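When introspection is enabled, the JSON response can be long and hard to read. The following Python sketch flattens it into one line per field, rebuilding the type notation from the nested `kind`/`ofType` structure returned by introspection. The `SAMPLE` data is invented for the example; in practice the input would be the `data.__schema` object of the response.

```python
def render_type(ref):
    """Turn an introspection type reference into notation such as [Paste!]!."""
    if ref["kind"] == "NON_NULL":
        return render_type(ref["ofType"]) + "!"
    if ref["kind"] == "LIST":
        return "[" + render_type(ref["ofType"]) + "]"
    return ref["name"]


def summarize(schema):
    """List every field of every user-defined type as 'Type.field: type'."""
    lines = []
    for gql_type in schema["types"]:
        if gql_type["name"].startswith("__") or not gql_type.get("fields"):
            continue  # skip introspection types, scalars, enums, inputs
        for field in gql_type["fields"]:
            lines.append(f'{gql_type["name"]}.{field["name"]}: {render_type(field["type"])}')
    return lines


# Invented excerpt of an introspection result, for illustration only.
SAMPLE = {"types": [
    {"name": "Query", "fields": [{"name": "pastes", "type": {
        "kind": "NON_NULL", "name": None, "ofType": {
            "kind": "LIST", "name": None, "ofType": {
                "kind": "NON_NULL", "name": None, "ofType": {
                    "kind": "OBJECT", "name": "Paste", "ofType": None}}}}}]},
    {"name": "String", "fields": None},
]}

print(summarize(SAMPLE))
```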
When schema introspection is disabled on a GraphQL API, the Clairvoyance tool can be used as an alternative to attempt to reconstruct the schema.
It works from a wordlist: the tool sends a large number of requests containing candidate names and relies on GraphQL’s suggestions feature, whereby the server proposes similar valid names in its error messages.
It should be noted that if the server does not return these suggestions, the tool is totally ineffective.
By analysing the responses, Clairvoyance is able to reconstruct part of the API schema in the form of JSON.
Although less complete than introspection, this technique provides a good overview of the schema.
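The principle behind this technique can be sketched in Python. The snippet below is not Clairvoyance's code: it only shows how valid names can be harvested from error messages, assuming they follow the `Did you mean "..."?` wording used by several GraphQL engines (the exact wording varies between implementations).

```python
import re


def suggested_names(error_message):
    """Extract the quoted names that follow 'Did you mean' in an error message."""
    _, found, tail = error_message.partition("Did you mean")
    if not found:
        return []
    return re.findall(r'"([^"]+)"', tail)


def harvest(response):
    """Collect every suggested name from the `errors` array of a parsed response."""
    names = set()
    for error in response.get("errors", []):
        names.update(suggested_names(error.get("message", "")))
    return names


# Simulated response to a guess from the wordlist (illustrative message).
response = {"errors": [{
    "message": 'Cannot query field "past" on type "Query". Did you mean "paste" or "pastes"?'
}]}
print(sorted(harvest(response)))
```

Each newly discovered name can then be fed back into the wordlist, which is how a partial schema is rebuilt step by step.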
GraphQL Voyager is a valuable tool that can be used to visualise the schema of a GraphQL API, based on the schema that we were able to retrieve using introspection or Clairvoyance.
It is possible to retrieve an introspection query from its interface.
Once the schema has been loaded, the tool will generate a graphical representation of the schema structure. All types, fields and relationships are displayed, making it much easier to understand the API architecture.
A panel at the bottom left can be used to explore the list of available queries, mutations and subscriptions in greater detail.
In short, GraphQL Voyager is very useful for quickly identifying areas of interest to investigate.
Postman is a tool initially designed for developing and testing APIs. Its use during an audit can prove to be a valuable asset, facilitating the repetition of requests with its intuitive interface.
Once the GraphQL endpoint has been configured in Postman, it will automatically perform an introspection request and generate interactive documentation of the API. All types, queries, mutations and subscriptions are listed.
The dedicated interface lets you build and replay queries by selecting the required fields and arguments. Variables can also be easily defined.
The work of an auditor can therefore be greatly simplified by using this tool for an API pentest.
InQL is an extension to the Burp Suite proxy designed for testing GraphQL APIs. In terms of features, it is similar to Postman. It can be installed from the Burp BApp Store, in the Extensions section.
It is possible to perform an introspection request to retrieve the schema in two different ways.
The first, directly from a GraphQL request:
And the second, from the InQL tab:
Whichever method we use, we can then access the list of queries and mutations from the extension interface. It lists the parameters required and the fields available.
We can then send these queries directly to the Repeater panel, in order to run our tests on the chosen routes.
Graphql-cop tests for the main vulnerabilities likely to be present in a GraphQL API. It is an open source tool available on GitHub. However, it has not been maintained since 2022.
It is very easy to use:
By simply specifying the URL of our target, the tool will attempt to find the default graphql paths. Note that the list of paths is very limited:
It is therefore preferable not to rely on this tool to find the graphql endpoint.
A series of tests will then be carried out, including:
Once this has been done, we can look at the list of vulnerabilities affecting our target.
The last tool we will discuss here is graphw00f. As mentioned previously, GraphQL has no implementation of its own and requires an execution engine to work.
This is where the graphw00f tool comes in. It works by sending specially designed GraphQL requests to the API and will be able to identify the engine thanks to unique signatures present in the errors returned or in the metadata.
From there, we can refer to this table, which identifies potential vulnerabilities depending on the implementation:
As mentioned with graphql-cop, the list of endpoints to find the path is limited:
It is therefore preferable to identify the endpoint before using the tool, or to supply a custom wordlist with the -w flag. We’ll use -t to specify the URL of our target, and on successful identification we’ll get output like this:
In this example, the Graphene engine has been identified.
At this point, the enumeration of our target is complete, and we now need to move on to the phase of testing the available routes.
Many tests are carried out during our audits, and we won’t present them all here, but we’ll look at some of them to better understand how they work.
Some of these tests will be directly linked to the architecture and operation of GraphQL, while others will concern the APIs in general.
DoS attacks aim to overload the targeted server, making it much slower or even inaccessible for the duration of the attack.
This has a major impact on other users, who can no longer use the application, and it also damages the company’s image.
GraphQL APIs can be particularly vulnerable to this type of attack if there is no limit on the depth of queries.
To illustrate this, consider the following image:
When we make a getPaste request referencing an owner, itself referencing pastes, and so on, this request can become extremely heavy for the server. Such a structure risks compromising the smooth running of the application.
The number of objects that the server must return grows exponentially with the depth of the query, which overloads it.
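A short Python sketch illustrates both the shape of such a query and the growth it causes. The fields `pastes` and `owner` follow the example above; the `id` leaf field and the figures used are assumptions.

```python
def nested_query(cycles):
    """Build a query that loops pastes -> owner -> pastes `cycles` times."""
    inner = "id"
    for _ in range(cycles):
        inner = "owner { pastes { " + inner + " } }"
    return "query { pastes { " + inner + " } }"


def leaf_count(initial_pastes, pastes_per_owner, cycles):
    """Objects at the deepest level: each cycle multiplies by pastes_per_owner."""
    return initial_pastes * pastes_per_owner ** cycles


print(nested_query(2))
for cycles in range(1, 6):
    # Assumption for the arithmetic: 10 pastes listed, 10 pastes per owner.
    print(cycles, leaf_count(10, 10, cycles))
```

With these assumed figures, each extra level of nesting multiplies the work by ten, while the query itself only grows by a few characters.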
We can see that the server was momentarily overloaded, which led to an extremely long response (almost 12 seconds). During this time, the server was unavailable.
This was because the query was particularly demanding in terms of resources. An even deeper query would have rendered the server unavailable for a longer period of time, and could even have caused a complete shutdown.
It should be borne in mind that these tests were carried out locally on our own machine, which explains the significant impact observed. As a general rule, corporate servers are more robust and can handle several simultaneous requests (in theory).
However, let’s consider a scenario where this attack is launched from several machines at the same time; this could lead to the same consequences.
Batched queries and aliases are another aspect that can create vulnerabilities in GraphQL. Although they can be considered from the point of view of denial of service, in this example we are going to use them in a brute force attack scenario.
Let’s imagine that a client has set up an HTTP request limit on a login form, allowing only 10 requests per second to be sent from the same IP address. This measure is designed to counter brute force attacks on the login form.
If batched queries or aliases are authorised, this will not be enough to stop the attacker. Attackers could send a multitude of requests in a single HTTP request, thus circumventing the limitation and succeeding in their attack.
This method considerably speeds up a brute force attack. By sending 100 requests, each containing 100 aliases or batched queries, the attacker already makes 10,000 authentication attempts.
Packing many operations into a single HTTP request multiplies the number of attempts made per request, so the attack scales up while staying under the HTTP rate limit.
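The following Python sketch shows how aliases allow several login attempts to travel in one HTTP request. The `login` mutation, its arguments and the `token` field are hypothetical; the real names depend on the schema of the API being tested.

```python
import json


def aliased_login_body(username, passwords):
    """Build one request body containing one aliased login attempt per password.
    `login` and `token` are hypothetical names, not taken from a real schema."""
    declarations = ["$u: String!"]
    variables = {"u": username}
    fields = []
    for i, password in enumerate(passwords):
        declarations.append(f"$p{i}: String!")
        variables[f"p{i}"] = password
        fields.append(f"  attempt{i}: login(username: $u, password: $p{i}) {{ token }}")
    query = "mutation (" + ", ".join(declarations) + ") {\n" + "\n".join(fields) + "\n}"
    return json.dumps({"query": query, "variables": variables})


def total_attempts(http_requests, attempts_per_request):
    """Attempts grow as a simple product, while the rate limit only sees http_requests."""
    return http_requests * attempts_per_request


print(json.loads(aliased_login_body("admin", ["123456", "password"]))["query"])
print(total_attempts(100, 100))
```

A rate limit that counts HTTP requests sees a single request here, whatever the number of aliases it contains.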
It is common practice to use whitelists or blacklists to authorise or prohibit users from making GraphQL requests. However, this approach can be vulnerable if it is not implemented correctly.
Consider a GraphQL API that exposes a systemHealth request that checks the state of the server. This request requires authentication as an administrator. Without a valid authentication token, the request is rejected, which is expected behaviour.
However, it is possible to bypass this restriction by using an authorised operation name without an authentication token, while querying the systemHealth field which should be protected.
For example, suppose an unauthenticated user is authorised to execute a getPastes request. An attacker could then send the following request:
In this request, the operation name getPastes is authorised for an unauthenticated user. However, the attacker has also included the systemHealth field, which they should not have access to.
The problem is that the authorisation check is performed only on the operation name, which is freely chosen by the client, while the fields actually requested are not checked individually.
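The flaw and a possible correction can be sketched in Python. The names getPastes and systemHealth come from the example above; the `pastes` field, the allowlists and the deliberately naive parsing are assumptions. A real server would enforce authorisation in its resolvers or with a proper GraphQL parser.

```python
import re

ALLOWED_ANONYMOUS_OPERATIONS = {"getPastes"}   # what the flawed check looks at
PUBLIC_ROOT_FIELDS = {"pastes"}                # assumed public field


def operation_name(query):
    match = re.match(r"\s*(?:query|mutation|subscription)\s+(\w+)", query)
    return match.group(1) if match else None


def root_fields(query):
    """Rough extraction of top-level field names (no fragments, no braces in strings)."""
    text = re.sub(r"\([^)]*\)", "", query)            # drop argument lists
    fields, depth = [], 0
    for token in re.findall(r"[{}]|\w+\s*:\s*\w+|\w+", text):
        if token == "{":
            depth += 1
        elif token == "}":
            depth -= 1
        elif depth == 1:
            fields.append(token.split(":")[-1].strip())  # "alias: field" -> "field"
    return fields


def flawed_check(query):
    """Trusts the operation name, which the client chooses freely."""
    return operation_name(query) in ALLOWED_ANONYMOUS_OPERATIONS


def field_level_check(query):
    """Checks what is actually requested instead of how it is labelled."""
    return all(field in PUBLIC_ROOT_FIELDS for field in root_fields(query))


attack = "query getPastes { systemHealth }"
print(flawed_check(attack), field_level_check(attack))
```

The attack passes the first check and fails the second, because only the second one looks at the requested resource.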
Although GraphQL introduces specific vulnerabilities, it should not be forgotten that a GraphQL API is fundamentally a web API.
As such, it can be exposed to the same types of vulnerabilities as other types of APIs. We are going to review 3 vulnerabilities likely to be encountered during an audit.
A stored Cross-Site Scripting (XSS) vulnerability can have serious consequences. It allows an attacker to inject malicious JavaScript code that will be stored server-side, usually in a database. This code is then executed in the browser of any user who loads a page on which the stored value is displayed.
The consequences can range from defacement of web pages to theft of user accounts.
To find out more about XSS vulnerabilities, please see our article: XSS (Cross-Site Scripting) vulnerabilities: principles, types of attacks, exploitations and security best practices.
Let’s take the example of a forum platform where users can publish messages. These messages are then taken up and displayed on the web application’s public pages.
The expected use of this feature would look like this:
Now let’s add a malicious user, who decides to test this application and posts a malicious message. We can see that the comment is sent using the API, but we can also see that it is stored without having validated the user’s input.
The infected content is then retrieved and displayed publicly without encoding the special characters, so the JavaScript code is interpreted and executed in the browsers of legitimate users of the site:
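The missing step, encoding special characters at display time, can be illustrated with Python's standard library. The `render_comment` function is a hypothetical stand-in for the application's templating layer, which in a real project should provide contextual auto-escaping.

```python
import html


def render_comment(text):
    """Encode HTML special characters before inserting user content into a page."""
    return "<p>" + html.escape(text, quote=True) + "</p>"


malicious = "<script>alert(document.cookie)</script>"
print(render_comment(malicious))
```

Once encoded, the browser displays the payload as text instead of executing it.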
Another type of vulnerability that can be encountered during an API pentest is an arbitrary file upload combined with a path traversal.
When the right conditions are met, this vulnerability can have disastrous consequences, allowing malicious files to be written to arbitrary locations on the server.
During our exploration we discovered a mutation that allows files to be uploaded to the server. This mutation expects 2 input fields, the name of the file and its content, and will return the result.
A classic upload works perfectly normally, and a response is sent back with the contents of the file that has been uploaded.
If you take a look at what’s happening on the server side, you’ll see that everything went well, and you’ll find this new file at /opt/dvga/pastes:
By changing the filename to “../../../tmp/pwn”, the query still works but we see that the file is not present in the expected path. Instead, we can find it in the /tmp folder.
If we look at the source code responsible for saving our file, we can see that there is no check on the file name, which explains its presence in the other folder.
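A check of the kind that is missing here could look like the following Python sketch. It is not the code of the application under test: the `safe_destination` helper is hypothetical, and only the /opt/dvga/pastes directory comes from the example above.

```python
from pathlib import Path

UPLOAD_DIR = Path("/opt/dvga/pastes")


def safe_destination(filename, base=UPLOAD_DIR):
    """Resolve the final path and refuse anything that ends up outside `base`."""
    base = base.resolve()
    target = (base / filename).resolve()
    if base not in target.parents:
        raise ValueError(f"refusing to write outside {base}: {filename!r}")
    return target


print(safe_destination("note.txt"))
try:
    safe_destination("../../../tmp/pwn")
except ValueError as error:
    print("rejected:", error)
```

Resolving the path before comparing it is what defeats the `../` sequences; comparing raw strings would not be enough.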
One of the most critical vulnerabilities that can be found during a pentest is remote code execution (RCE), here the execution of arbitrary commands on the server. This can lead to a number of consequences, including data theft, privilege escalation, setting up a backdoor, etc.
We are going to use the importPaste mutation to exploit this vulnerability.
Originally, this functionality was designed to import data from a URL entered by the user, in order to save it on the server.
The implementation of such a feature needs to be particularly careful from a security point of view, and this is the kind of feature that auditors will be particularly interested in.
Here’s the implementation:
The corresponding GraphQL mutation expects different parameters, host, port, path and scheme. So a ‘well-formed’ query might look like this:
Now consider that a malicious user decides not to comply with what is expected, and replaces ‘/’ in the path parameter with ‘/; sleep 10’.
In this way, the final command executed by the server would become: “curl http://example.com:80/; sleep 10”, and both commands would be executed.
If we look at the server side, we can confirm this theory.
An attacker could thus attempt to escalate their privileges, pivot into the internal network, disrupt the application, etc.
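The mechanism can be reproduced without running anything. In the Python sketch below, `unsafe_command` mimics the vulnerable pattern (a string built from user input and handed to a shell), while `safe_argv` shows one possible correction: validate each parameter, then pass an argument list that is never interpreted by a shell. Both functions and the validation rules are assumptions, not the application's actual code.

```python
import re


def unsafe_command(scheme, host, port, path):
    """Vulnerable pattern: user input concatenated into a shell command string."""
    return f"curl {scheme}://{host}:{port}{path}"


def safe_argv(scheme, host, port, path):
    """Validate every parameter, then build an argument list for use with
    subprocess.run(argv), i.e. without shell=True."""
    if scheme not in ("http", "https"):
        raise ValueError("unsupported scheme")
    if not re.fullmatch(r"[A-Za-z0-9.-]+", host):
        raise ValueError("invalid host")
    port = int(port)
    if not 1 <= port <= 65535:
        raise ValueError("invalid port")
    if not re.fullmatch(r"/[A-Za-z0-9._~/%-]*", path):
        raise ValueError("invalid path")
    return ["curl", f"{scheme}://{host}:{port}{path}"]


print(unsafe_command("http", "example.com", 80, "/; sleep 10"))
print(safe_argv("http", "example.com", 80, "/"))
```

In the first case, the shell sees two commands separated by `;`. In the second, the malicious path is rejected, and even an unexpected value would reach curl as a single argument.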
We have now taken a look at GraphQL, how it works and the main vulnerabilities that can be encountered during an audit.
As we have seen, GraphQL is not immune to the attack vectors impacting traditional web APIs, but also includes vulnerabilities specific to its implementation. Rigorous security measures must therefore be put in place.
Several general protections can be put in place to secure a GraphQL API.
GraphQL implementations often ship with default configurations that should be changed:
GraphQL is particularly vulnerable to denial of service attacks if it is incorrectly configured. This type of attack impacts the availability and stability of the API, making it slower or even unavailable.
Here are a few recommendations that can be followed to protect against this type of attack:
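One common measure against the nested queries seen earlier is a maximum query depth. The Python sketch below approximates depth by counting nested braces; the limit of 3 is an arbitrary assumption, and the function also counts braces belonging to input objects. In production, the depth-limiting feature or plugin of the GraphQL engine in use is preferable to a hand-written check.

```python
MAX_DEPTH = 3  # arbitrary value chosen for the example


def query_depth(query):
    """Approximate the nesting depth of a query by tracking braces outside strings."""
    depth = max_depth = 0
    in_string = False
    previous = ""
    for char in query:
        if char == '"' and previous != "\\":
            in_string = not in_string
        elif not in_string:
            if char == "{":
                depth += 1
                max_depth = max(max_depth, depth)
            elif char == "}":
                depth -= 1
        previous = char
    return max_depth


def is_allowed(query, limit=MAX_DEPTH):
    return query_depth(query) <= limit


deep = "query { pastes { owner { pastes { id } } } }"
print(query_depth(deep), is_allowed(deep))
```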
As we saw earlier, batching attacks can be used to carry out brute force attacks. To defend against this type of attack, it is necessary to impose limits on incoming requests:
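As an illustration of such limits, the following Python sketch rejects request bodies that contain too many batched operations or too many top-level fields (each alias counts as one field). The thresholds and the naive parsing are assumptions; rate limiting should also be applied per sensitive operation, not only per HTTP request.

```python
import re

MAX_BATCH_SIZE = 5     # operations per HTTP request (arbitrary)
MAX_ROOT_FIELDS = 10   # top-level fields or aliases per operation (arbitrary)


def root_field_count(query):
    """Roughly count top-level fields, aliases included (no fragments handled)."""
    text = re.sub(r"\([^)]*\)", "", query)            # drop argument lists
    count = depth = 0
    for token in re.findall(r"[{}]|\w+\s*:\s*\w+|\w+", text):
        if token == "{":
            depth += 1
        elif token == "}":
            depth -= 1
        elif depth == 1:
            count += 1
    return count


def within_limits(payload):
    """`payload` is the parsed JSON body: one operation (dict) or a batch (list)."""
    operations = payload if isinstance(payload, list) else [payload]
    if len(operations) > MAX_BATCH_SIZE:
        return False
    return all(root_field_count(op.get("query", "")) <= MAX_ROOT_FIELDS for op in operations)


single = {"query": "query { pastes { title } }"}
print(within_limits(single), within_limits([single] * 6))
```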
The user input sent to the GraphQL API must be strictly validated. This input is often reused in multiple contexts, whether HTTP, SQL or other requests. If it is incorrectly validated, this could lead to injection vulnerabilities.
To validate user input, you can follow these recommendations:
Author: Théo ARCHIMBAUD – Pentester @Vaadata