I decided to make this guide due to the lack of material on this topic and my own struggles with GraphQL. Its purpose is to provide pentesters with the necessary tools to perform tests against GraphQL implementations. I encourage you to do further research and practice on your own with the references provided at the end.
GraphQL is a language used for data query and manipulation, primarily used with APIs. It is used to handle data from a server to a client through the use of various types of operations.
It is currently open-source and there are many frameworks that use it or integrate with it. It is becoming more and more common in web applications nowadays, especially ones that use APIs.
Although it is technically an API, all queries through GraphQL are performed against a single endpoint, for example /graphql. Think of it as middleware between a web/mobile application and an API or other solution.
In terms of functionality, it is quite similar to an SQL database, as it allows to perform operations that
can retrieve or alter data.
GraphQL allows for One-to-one, One-to-many, Many-to-one, and Many-to-many relationships,
as well as combining multiple relationships, and relational mutations:
GraphQL has the following benefits:
GraphQL uses a human-readable schema definition language (or SDL) that defines the schema and stores it as a string. The GraphQL schema is a description of the data that can be requested from a
GraphQL endpoint.
It defines available queries, mutations, fragments, fields, and supported types:
Queries are used to fetch specific data from a GraphLQ instance. They are interactive, meaning they can be changed to shape the field objects they return upon execution.
The fields that are allowed to be retrieved when using a query can be limited so that unauthorised users cannot access sensitive information:
Mutations are used to modify data within a GraphQL instance. Just like in queries, mutations can return the value of the newly mutated fields. Mutations can also contain multiple fields.
While query fields are executed in parallel, mutation fields run in series, meaning if we send two AddMoney mutations in one request, the first will finish before the second begins, preventing race conditions.
The structure used by GraphQL to compose operations can be observed below:
Although operations in GraphQL can be executed by inserting arguments inside the query string, this is not feasible when they need to be dynamic. GraphQL has a way to factor dynamic values out of the query and pass them as a separate dictionary through variables.
This is achieved by replacing static values in the query with $variable, declaring it as a variable accepted by the query, and passing it in the variables dictionary:
Fragments are reusable units that let you construct sets of fields and include them in queries where needed. This avoids writing very repetitive queries. The concept of fragments is often used to
split complicated application datasets.
It is possible for fragments to access variables declared in the query or mutation:
Directives are attached to fields to affect queries. They are useful where you otherwise would need to
do string manipulation to change a query. The core GraphQL specification includes two directives:
A common use of directives is to implement permissions:
There are many ways to perform authentication in GraphQL, however the most common one is
JSON Web Tokens.
JWT is a standard in which information can be securely transmitted between two entities through a compact JSON object. It is generally managed by an authorisation server; companies often use third-party services such as Auth0 to handle JWT tokens.
JWT is a standard used to safely transmit information (often user identity) between parties. It is a small and simple token that is used by protocols such as OpenID and OAuth 2.0 to represent identity to an application or access token for API authorization.
It is a format that can be signed and/or encrypted. When signed it uses JSON Web Signature (JWS), when
encrypted it uses JSON Web Encryption (JWE). When encrypted, the body cannot be viewed without the encryption key.
JSON web tokens are made of three base64-encoded, dot-separated components: the header, the payload and the signature.
The single endpoint approach reduces the effort required to enumerate existing operations. It contains default functionality that is designed to help developers and it is often not disabled in production.
Additionally, it is meant to be extremely easy to work with, therefore it will advise when running an invalid operation and it will help you build the right query structure.
Look in your Burp Suite HTTP history for any of the GraphQL keywords such as query, mutation etc. Perform a directory bruteforce attack against the web application, some common GraphQL endpoints are /graphql, /graphiql, /gql.
The Nmap GraphQL Introspection NSE script can also be used for this task, and it contains a comprehensive list of potential GraphQL endpoints.
graphw00f sends a mix of benign and malformed queries to determine the GraphQL engine in use. It provides insights into the security defences each technology uses and whether are on by default, to have an idea of how the instance can be attacked.
This is possible due to how GraphQL responds to specially crafted requests:
Introspection allows to query a GraphQL server for information about the schema in use. It can enumerate the available types, fields, queries, mutations, fragments etc. It can often be used unauthenticated.
It is normally enabled by default, although new frameworks such as Apollo are disabling it in production.
If introspection is disabled, there may be another way to enumerate the schema in use.
Thanks to GraphQL’s validation capability as well as verbose errors by default, information about the
schema can be easily enumerated by providing incorrect values.
When providing an invalid operation, variable, or value, GraphQL will suggest operations that match a certain portion of the one provided. This helps construct valid operations to further enumerate GraphQL.
Tools such as Clairvoyance or ShapeShifter can be used to automate this type of attack.
Use the application through its UI and observe the operations being made within Burp’s HTTP history to get an idea of the underlying schema. This can be very tedious, though there may be no other options.
If introspection and verbose errors are disabled, it will only be possible to enumerate operations that are accessible through the UI. Luckily, thanks to how GraphQL works, filtering by the endpoint in Burp Suite narrows down to all GraphQL operations.
By now you hopefully have a lot of information about the target schema. These can be your next steps:
Pretty much all of the REST API vulnerabilities found in the OWASP Security Top 10 are also applicable to GraphQL, in particular:
Information disclosure is one of the most common vulnerabilities in GraphQL. It can arise from:
If access controls are poorly implemented, it could allow unauthorised data access.
As a middleware application, GraphQL could be used to ingest malicious data. This can introduce injection attacks into the applications sitting on the other end.
This type of attack could lead to XSS, SQL Injection, or command injection:
Although GraphQL is very well optimized and designed to return only specific sets of data, it won’t necessarily stop attackers from abusing a badly configured implementation.
For example, attacks leveraging nested queries to loop through the same data over and over again could cause GraphQL to hang, or potentially run out of resources and eventually crash.
DOS in GraphQL can manifest itself in several forms:
Batch Query Attack Example:
Improper or missing authorization checks could allow users to perform unauthorised actions. While testing, ensure tokens/keys/cookies are being validated on each request and that permissions are implemented appropriately.
If OAuth is being used, there could be issues with the way the authorization server performs authentication.
JWT Tool can be used to automate JWT testing.
Mass assignment is a vulnerability where requests are abused to access or modify data that the user should not have access to. This can be abused by adding extra, unintended fields to the operation.
Knowledge of the underlying schema is often required for this attack to work.
GraphQL is generally safe from CSRF as long as it uses proper JWT or other token/key-based authentication methods. There are still circumstances in which CSRF could be exploited:
The implementation may have functionality for fetching or pushing data to an external or internal service by passing its URL within a parameter. If appropriate controls (i.e. whitelisting/sanitisation) have not been implemented, an attacker may be able to temper with the URL.
This may allow to interact with services that are not directly exposed on the internet. Additionally, attacks performed this way will originate from the vulnerable GraphQL application.
Other issues affecting GraphQL can be: