When we perform security assessments at Include Security, we like to have a holistic view of the application and attack from multiple angles. For me, this means going deeper than just looking at client code and also exploring third-party libraries and frameworks used by the application. Doing a full security assessment of every third-party component is not feasible during an engagement, but documentation and comments for these projects have given me a lot of insight quickly. This has yielded very practical security bugs in the context of real assessments at Include Security. Because of this, I always include this step as part of my methodology. Join me as we look at some security footguns that I’ve encountered during security assessments across a wide variety of languages and libraries. But before we continue, let’s define what a footgun is first.
> Any feature likely to lead to the programmer shooting themself in the foot.
>
> — Wiktionary
Enough said. The focus of this post is on footguns that have security implications. We'll start off with two footguns: one in an Elixir library and the other in a Python library. Afterwards, we'll dig a little deeper into three footguns in the Golang standard library that many people might not be aware of, unless you happened to be part of the conversations in the GitHub issues we'll cover below.
I also want to spark a larger conversation around how we can better track these types of issues, which might not fall cleanly into the security vulnerability category.
The first footgun can be described as unexpectedly flexible behavior, designed for convenience rather than security. Tesla is a commonly used HTTP client in the Elixir world. This library had an interesting behavior that felt unexpected, yet was fully documented. While I was performing a security assessment, I came across code similar to this pseudocode:
base_url = "https://example.com" client = build_client(config, base_url, headers) user_controlled_path = "/hello" response = Tesla.get(client, user_controlled_path)
The application took a user-controlled path and appended it to the defined base URL. As a result, it would make a GET request to `https://example.com/hello`. All fine and dandy, right?
Let’s look a bit closer at the library code, particularly this function:
```elixir
defp apply_base(env, base) do
  if Regex.match?(~r/^https?:\/\//i, env.url) do
    # skip if url is already with scheme
    env
  else
    %{env | url: join(base, env.url)}
  end
end
```
So when `base` is applied, the function first checks whether the URL passed to the client already starts with a scheme like `http://` or `https://`, and if it does, it skips applying the base URL entirely. By supplying a full URL in `user_controlled_path`, such as `https://attacker.com/hello`, we can bypass the defined `base_url`. This behavior resulted in server-side request forgery in a real engagement, but I couldn’t shake the feeling that the default behavior was wrong. So I went to the documentation and read this:
```
Set base URL for all requests.

The base URL will be prepended to request path/URL only
if it does not include http(s).

## Examples

defmodule MyClient do
  use Tesla

  plug Tesla.Middleware.BaseUrl, "https://example.com/foo"
end

MyClient.get("/path")                  # equals to GET https://example.com/foo/path
MyClient.get("path")                   # equals to GET https://example.com/foo/path
MyClient.get("")                       # equals to GET https://example.com/foo
MyClient.get("http://example.com/bar") # equals to GET http://example.com/bar
```
The documentation clearly defined this behavior, so did that make this a security bug or just user error? I wonder how many people would expect this behavior without having read the documentation.
For our second footgun, a subtle difference in syntax is all it takes to introduce a vulnerability. Psycopg is a PostgreSQL driver for Python, and from a security point of view SQL injection is the primary concern. While performing a security review, I went to the documentation for the library and found it filled with a ton of “wrong” and “correct” ways of performing queries. One of the examples was the following:
```python
>>> cur.execute("INSERT INTO numbers VALUES (%s, %s)" % (10, 20)) # WRONG
>>> cur.execute("INSERT INTO numbers VALUES (%s, %s)", (10, 20))  # correct
```
Yes, that one-character difference is all it takes for a SQL injection vulnerability to appear. To be fair, the documentation has a pretty clear warning that states the following:
> Never, **never**, **NEVER** use Python string concatenation (`+`) or string parameters interpolation (`%`) to pass variables to a SQL query string. Not even at gunpoint.
In general, when I read documentation for libraries that show lots of examples of wrong and correct ways of doing things my radar lights up and I start looking closer. Often those examples were included because of developers “misusing” the library. What are the chances that the developers for the security assessment I was performing also misused the library?
I was working on a project that used Golang, and I was looking for security issues related to a particular package in the standard library. So I went to the Golang repository and started searching. As I was going through issues, I had a sudden realization: a lot of reported issues with security implications have sparked some interesting discussion, maybe been resolved, and then life just continued. For lots of these issues, there was never a CVE filed or a security advisory created, and maybe that’s okay. But it does mean that there may be security bugs or footguns lying around that many people might not know about.
For each of these GitHub issues, I will dive a bit into the issue with references to some code and talk about its security implications. I reference each issue from where I first discovered it, so feel free to also read through the comments there for more context.
This footgun is due to the library correctly implementing an RFC, but causing an unexpected clash with the high-level design of another component. The GitHub issue was very thorough and well written, so I would recommend jumping there to read it first and then coming back here.
`httputil.ReverseProxy` is an HTTP handler that is commonly used for proxying requests and responses in Go. The `Director` function is used for modifying incoming requests and is commonly used for adding headers. The issue with `Director` is that `ReverseProxy`, in compliance with RFC 7230, removes hop-by-hop headers after the `Director` function may have added them. As stated in the issue:
> When ReverseProxy forwards a request, it follows the following steps in order:
>
> 1. Clone the incoming request to produce an outbound request, `outreq`.
> 2. Pass `outreq` to the `Director` function, which may modify it.
> 3. Remove hop-by-hop headers from `outreq`.
> 4. Send `outreq` to the backend.
A request made by a user could include a header such as `Connection: forwarded`. This would cause the `Forwarded` header added by `Director` to be dropped right after it was added.
To maintain backwards compatibility, a new function called `Rewrite` was added that is not affected by hop-by-hop headers. The following note was left for the `Director` function:
```go
// Hop-by-hop headers are removed from the request after
// Director returns, which can remove headers added by
// Director. Use a Rewrite function instead to ensure
// modifications to the request are preserved.
```
There are likely still a lot of people using `Director` who are unaware of this behavior, and still others who come across tutorials and guides that push `Director` as the best way to modify requests for `ReverseProxy`. This behavior isn’t a security issue by itself, but if your origin server expects a header and never receives it because a user forced it to be dropped, it could lead to some funky behavior.
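To make the difference concrete, here is a minimal sketch of the same proxy written both ways. The backend address and the `X-Internal-Auth` header are made up for illustration; the point is only where headers get added relative to hop-by-hop stripping.

```go
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Hypothetical backend for illustration purposes.
	backend, _ := url.Parse("http://localhost:9000")

	// Director-based proxy: hop-by-hop headers are stripped *after* Director
	// runs, so a client sending "Connection: X-Internal-Auth" causes the
	// header added below to be removed before the request reaches the backend.
	directorProxy := &httputil.ReverseProxy{
		Director: func(req *http.Request) {
			req.URL.Scheme = backend.Scheme
			req.URL.Host = backend.Host
			req.Header.Set("X-Internal-Auth", "trusted") // may silently disappear
		},
	}

	// Rewrite-based proxy (Go 1.20+): hop-by-hop headers are removed from the
	// outbound request *before* Rewrite runs, so headers added here survive.
	rewriteProxy := &httputil.ReverseProxy{
		Rewrite: func(pr *httputil.ProxyRequest) {
			pr.SetURL(backend)
			pr.Out.Header.Set("X-Internal-Auth", "trusted")
		},
	}

	_ = directorProxy
	_ = rewriteProxy
}
```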
To help others catch this behavior we created a Semgrep rule that is part of the official ruleset.
This footgun involves the mutation of a shared struct in a way that may not be apparent to the developers using it. A typical scenario is a shared URL struct being reused across HTTP requests from different users, where a mutation made during one request affects a separate request later on. The following walks through one such case involving redirect URLs.
From the GitHub issue that was reported:
> It’s pretty easy to accidentally mutate a shared URL struct when trying to create a URL with query parameters:
```go
var redirectURL, _ = url.Parse("https://example.com") // #1

func main() {
	http.ListenAndServe("", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u := redirectURL // #2
		q := u.Query()   // #3

		// opaque process that might fail
		token, err := getRedirectToken()
		if err != nil {
			q.Set("error", err.Error())
		} else {
			q.Set("token", token) // #4
		}

		u.RawQuery = q.Encode() // #5
		http.Redirect(w, r, u.String(), http.StatusFound)
	}))
}
```
So what’s actually happening in this example code? Let’s walk through it step by step.
1. First, the `redirectURL` variable is created to store a `URL` struct pointer (since that’s what `url.Parse` returns). This is the URL that HTTP clients will be redirected to.
2. When a request comes in, create a temporary variable `u` to hold the same pointer as `redirectURL`.
3. Call `u.Query()` to parse `RawQuery` from `redirectURL` and create a map stored in `q` with the existing URL parameters.
4. Call `q.Set("token", token)` to set a query parameter. If `getRedirectToken` fails, then we set the `error` query parameter instead.
5. Set `u.RawQuery` to `q.Encode()`, thus overriding the shared `URL` struct that will be used for future requests.
This has some interesting security implications since now we are mutating the `URL` struct that is used between requests. This could lead to unexpected behavior. A lot of developers might not be aware that they are even dealing with a pointer to the URL struct!
For example, a future request might fail when executing `getRedirectToken` and never get to #4. Now, instead of setting the `token` URL parameter, the `error` parameter gets set instead. This means that the redirect URL will now have both the previous `token` parameter and the new `error` parameter, due to starting off with the `RawQuery` from the previous request.
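Here is a small, self-contained sketch (with a hypothetical token value) that makes the aliasing visible: because `u` copies the pointer rather than the struct, mutating `u.RawQuery` also mutates the package-level `redirectURL` that every later request starts from.

```go
package main

import (
	"fmt"
	"net/url"
)

var redirectURL, _ = url.Parse("https://example.com")

func main() {
	u := redirectURL // copies the *url.URL pointer, not the struct

	q := u.Query()
	q.Set("token", "first-request-token") // hypothetical value
	u.RawQuery = q.Encode()

	// The shared URL now carries this request's token; any later request
	// that reuses redirectURL starts from it.
	fmt.Println(redirectURL.String()) // https://example.com?token=first-request-token
}
```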
As of writing this post, the issue was still open, with some ideas around implementing a `Clone()` method floating around. For now, it might be a better idea to do a shallow copy of the URL like `u := *redirectURL` or just re-parse the URL on every request, as sketched below.
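Here is a minimal sketch of the shallow-copy approach, with a hypothetical handler and token value. One caveat: a shallow copy still shares the `User` field’s pointer, which only matters if the base URL carries credentials.

```go
package main

import (
	"net/http"
	"net/url"
)

var redirectURL, _ = url.Parse("https://example.com")

func handler(w http.ResponseWriter, r *http.Request) {
	// Dereference the shared pointer so this request mutates its own copy;
	// the package-level redirectURL is never modified.
	u := *redirectURL

	q := u.Query()
	q.Set("token", "example-token") // hypothetical value
	u.RawQuery = q.Encode()

	http.Redirect(w, r, u.String(), http.StatusFound)
}

func main() {
	http.ListenAndServe(":8080", http.HandlerFunc(handler))
}
```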
To help others catch this behavior we created a Semgrep rule that is part of the official ruleset.
This footgun is a bit different from the rest, as it involves an unexpected library bug or inconsistency that is now baked-in behavior. The GitHub issue reporter included the following example code for JSON marshalling.
```go
package main

import (
	"encoding/json"
	"os"
)

type T bool

func (v *T) MarshalJSON() ([]byte, error) {
	return []byte{'1'}, nil
}

type S struct {
	X T
}

func main() {
	v := S{true}
	e := json.NewEncoder(os.Stderr)
	e.Encode(v)  // should print {"X":true}
	e.Encode(&v) // should print the same value
}
```
What did you expect to see?

```
{"X":true}
{"X":true}
```

What did you see instead?

```
{"X":true}
{"X":1}
```
Depending on whether the struct being encoded was a pointer or not, the output changed. Everyone in the thread agreed that this is inconsistent and unexpected behavior. This could also lead to security bugs depending on the context.
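If you control the type, one way to sidestep the inconsistency is to declare the marshaller on the value receiver, which puts it in the method set of both `T` and `*T`. A minimal sketch based on the issue’s example:

```go
package main

import (
	"encoding/json"
	"os"
)

type T bool

// A value receiver means MarshalJSON is found whether or not the
// enclosing struct value is addressable.
func (v T) MarshalJSON() ([]byte, error) {
	return []byte{'1'}, nil
}

type S struct {
	X T
}

func main() {
	v := S{true}
	e := json.NewEncoder(os.Stderr)
	e.Encode(v)  // {"X":1}
	e.Encode(&v) // {"X":1}
}
```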
This issue had an existing Semgrep rule created by dgryski. On August 21, 2024, issue 606495 on the Go bug tracker resolved this inconsistency, but it doesn’t appear to have made it into the recently released Go 1.23. The GODEBUG setting `xmlinconsistentmarshal` will allow users to revert the new behavior once it has been released.
All of the behaviors that I talked about here have never had a CVE assigned. But should they even have had one? The issue is that a lot of these behaviors aren’t necessarily security vulnerabilities in themselves, but in just the right context and usage they very well may be. A lot of these behaviors can also be avoided by reading the code and understanding the limitations or unique behaviors. I think footguns are a good way to describe all of these issues, since unless you are intimately familiar with the relevant code and context, you will probably end up shooting yourself in the foot. If we’re being honest, sometimes the real solution is re-designing the library to make it harder to do the wrong thing, whether that’s through interface re-design or better defaults.
At minimum, these behaviors should be documented. Additionally, rules for Semgrep or your language-specific static analysis tool would be an ideal place to document these types of behaviors. I think this is a really great way to raise awareness among developers at a lot of different companies that are already using static analysis tools in their pipelines. So if you come across an almost-but-not-quite security issue, I would highly recommend creating a rule and sharing it with the world. You never know who you’ll help out!