Introducing CS2BR pt. I – How we enabled Brute Ratel Badgers to run Cobalt Strike BOFs
2023-5-15 15:0:0 Author: blog.nviso.eu(查看原文) 阅读量:29 收藏


If you know all about CS, BRC4 and BOFs you might want to skip this introduction and get right into the problem statement. You can also jump right to the solution.

Introduction

When we conduct Red Team assessments at NVISO, we employ a wide variety of proprietary and open source tools. One central component in these assessments is the command & control (C2) framework we use to remotely interact with compromised machines and move laterally through our targets’ networks. They usually feature a C2 server for central access, implants (analogous to bots in botnets) that execute commands, and client interfaces that allow red team operators to interact with the implants. Among others, there are two popular C2 frameworks that we use: Cobalt Strike and Brute Ratel C4.

Both C2s are proprietary and they have a lot of features in common. A particular capability they share is execution of beacon object files (BOFs). Normally you work with object files during compilation of C and C++ programs as they contain the compiled code of individual C/C++ source files and are not directly executable.

CS and BRC4 provide a mechanism to send BOFs to implants and execute their code on the remote machines. And the best thing about it is: one can write their own BOFs and have implants execute them. This comes with quite a set of benefits:

  • Implants don’t need to implement a lot of capabilities as those capabilities can be streamed and executed on demand, reducing the implant’s footprint.
  • Using custom BOFs, operators have finer control over the exact way an implant interacts with the target system. They can choose to implement new features or operate more covertly and OPSEC-safe.
  • While the C2s might be proprietary, BOFs can be open-source and shared with everyone.

There are many open-source BOFs available, such as TrustedSec’s CS-Situational-Awareness, that can easily be used in various C2s like CS and sliver. Nearly all of these BOFs use Cobalt Strike’s de-facto BOF API standard – which isn’t compatible with Brute Ratel’s BOF API. Thus, the vast majority of available BOFs is incompatible with BRC4.

Turns out there are only very few BRC4 BOFs!

In this blog post, we present an approach to solve this problem that enables Brute Ratel’s implants (“badgers”) to run BOFs written for Cobalt Strike. The tool we developed based on this approach will be presented in a follow-up blog post.

I. So what’s the exact problem?

In theory any BOF can be executed by either C2 framework so long as they don’t make any use of the C2-specific APIs. Practically, this doesn’t make much sense since using these APIs is required for basic tasks such as sending back information to operators.

The following paragraphs break down both BOF APIs in order to help understand how they’re incompatible.

Cobalt Strike’s BOF C API

Cobalt Strike BOF C API

Cobalt Strike splits its APIs into roughly four distinct groups:

  • Data Parser API: provides utilities to parse data passed to the BOF. This allows BOFs to receive arbitrary data as input, such as regular string-based values but also arbitrary binary data like files.
  • Output API: lets BOFs output raw buffers and format output. The output is sent back to operators.
  • Format API: allows BOFs to format output in buffers for later transmission.
  • Internal APIs: feature several utilities related to user impersonation, privileges and process injection. BRC4 doesn’t currently have an equivalent API.

Furthermore, the signature of CS BOFs entrypoints is void go(char*, int) and explicitly expects binary data to be passed and to be used with the Data Parser API.

Brute Ratel C4’s BOF C API

Brute Ratel BOF C API

Brute Ratel C4’s API on the other hand comes as a loose list that I grouped in this diagram for simplicity:

  • Output API: contains output printf-like functions for regular ANSI-C strings and wide-char strings.
  • String API: features various strlen and strcmp functions for regular ANSI-C strings and wide-char strings.
  • Memory API provides convenient memory-related functions for allocating, freeing, copying and populating buffers.

The signature of BRC4 BOF entrypoints is void coffee(char**, int, WCHAR**) and explicitely expects string-based inputs (similar to regular executable’s void main(int argc, char* argv[])).

Comparison & Conclusion

When comparing their APIs, it becomes apparent that CS and BRC4 follow different approaches to their APIs:

  • While both C2s provide some convenience APIs, Cobalt Strike’s APIs feature a higher abstraction level. As a result, CS doesn’t only feature an API dedicated to output (as does BRC4) but also one for formatting output.
  • CS provides advanced APIs (e.g. the “internal” ones) while BRC4 provides mostly low-level APIs.
  • Both differ greatly in their approach to passing inputs to BOFs: Cobalt Strike allows passing arbitrary binary data and provides a separate API for this task while BRC4 sticks to the traditional main entrypoint and its GUI only allows operators to pass strings to BOFs.
CS BOFs are (almost) the same as BRC4 BOFs - or at least BRC4 would like you to think that.

Actually BRC4’s documentation makes porting BOFs from CS to BRC4 look like an easy task. Simply trying to map the CS’s BOF API to BRC4’s shows that this is a more intricate task:

Cobalt Strike and Bute Ratel C4 BOF API mapping

As you can see, there are only very few CS APIs that (more or less) can be mapped to BRC4’s APIs. What are the implications for porting CS BOFs to BRC4 then? Well, it’s going to require some engineering.

II. Working out a solution

Now that we know how BRC4’s and CS’s BOF APIs are different from each other, we can work out a solution. Well, I’d love to tell you that that was the approach I took: read up on the problem’s intricacies first and then work out a well-structured and thought out solution. Things went a little different though, and I’d like to show you.

Approach 1: The naïve way

My first approach at porting BOFs from CS to BRC4 was based on the BRC4 documentation and involved only three steps:

  • Replace the go(char*, int) entrypoint with coffee(char**, int, WCHAR**).
  • Remove CS API imports (“beacon.h”) and add BRC4 API imports (“badger_exports.h”).
  • Replace uses of CS APIs with BRC4 APIs.

That looks easy to do! So let’s test this using the DsGetDcNameA example they posted in the documentation:

BRC4 DsGetDcNameA BOF

Nice, that worked very well! How about a real world example of an open source BOF: Outflank’s Winver BOF grabs the exact windows version of the victim machine. Again we replace the entrypoint, API imports and API uses:

#include "badger_exports.h"

//snip

VOID coffee(char** Args, int len, WCHAR** dispatch) {
	// snip
	dwUBR = ReadUBRFromRegistry();
	if (dwUBR != 0) {
		BadgerDispatch(dispatch, "Windows version: %ls, OS build number: %u.%u\n", chOSMajorMinor, pPEB->OSBuildNumber, dwUBR);
	}
	else {
		BadgerDispatch(dispatch, "Windows version: %ls, OS build number: %u\n", chOSMajorMinor, pPEB->OSBuildNumber);
	}
	
	return;
}
Common output with BOFs: not much to work with!

Running this gives us… nothing! This is a problem you’ll encounter when working with BOFs: you won’t receive any feedback when they aren’t executed or crash, making debugging and identifying the root cause just a bit harder.

But what was the problem in this case? Well, I got stuck for a bit at this point and after digging into the compiled BOF I noticed that there were WinAPI calls in the code that were not explicitly declared as imports:

DWORD ReadUBRFromRegistry() {
	//snip
	_RtlInitUnicodeString RtlInitUnicodeString = (_RtlInitUnicodeString)
		GetProcAddress(GetModuleHandleA("ntdll.dll"), "RtlInitUnicodeString");

Cobalt Strike’s BOF documentation says that “GetProcAddress, LoadLibraryA, GetModuleHandle, and FreeLibrary are available within BOF files” and don’t need to be explicitly imported by BOFs. This doesn’t apply to BRC4 though so imports for those need to be added:

WINBASEAPI FARPROC WINAPI KERNEL32$GetProcAddress (HMODULE hModule, LPCSTR lpProcName);
WINBASEAPI HMODULE WINAPI KERNEL32$GetModuleHandleA (LPCSTR lpModuleName);
WINBASEAPI HMODULE WINAPI KERNEL32$GetModuleHandleW (LPCWSTR lpModuleName);
WINBASEAPI HMODULE WINAPI KERNEL32$LoadLibraryA (LPCSTR lpLibFileName);
WINBASEAPI HMODULE WINAPI KERNEL32$LoadLibraryW (LPCWSTR lpLibFileName);
WINBASEAPI BOOL  WINAPI KERNEL32$FreeLibrary (HMODULE hLibModule);

#ifdef GetProcAddress
#undef GetProcAddress
#endif
#define GetProcAddress KERNEL32$GetProcAddress
#ifdef GetModuleHandleA
#undef GetModuleHandleA
#endif
#define GetModuleHandleA KERNEL32$GetModuleHandleA
#ifdef GetModuleHandleW
#undef GetModuleHandleW
#endif
#define GetModuleHandleW KERNEL32$GetModuleHandleW
#ifdef LoadLibraryA
#undef LoadLibraryA
#endif
#define LoadLibraryA KERNEL32$LoadLibraryA
#ifdef LoadLibraryW
#undef LoadLibraryW
#endif
#define LoadLibraryW KERNEL32$LoadLibraryW
#ifdef FreeLibrary
#undef FreeLibrary
#endif
#define FreeLibrary KERNEL32$FreeLibrary

Note that the macros allow us to leave the original function calls in the BOF untouched. If we didn’t do that, we would need to prepend KERNEL32$ to every call of the functions listed above.

After adding those and recompiling, we can run the BOF again and now it runs just fine:

Running a CS BOF in BRC4 after adding default imports

That’s great! However this approach is pretty limited. Let’s have a look at some of its shortcomings.

Caveat

The naïve approach works great for very simple BOFs that don’t use any of CS’s higher-level APIs. Many of the advanced BOFs use some of those APIs though.

Let’s examine TrustedSec’s sc_enum as it’s a very useful BOF and great example: it allows an operator to enumerate Windows services on a target machine. If we were to perform our simple 3-step approach again we’d hit a roadblock:

VOID go( 
	IN PCHAR Buffer, 
	IN ULONG Length 
) 
{
	const char * hostname = NULL;
	const char * servicename = NULL;
	DWORD result = ERROR_SUCCESS;
	datap parser;
	init_enums();
	BeaconDataParse(&parser, Buffer, Length);
	hostname = BeaconDataExtract(&parser, NULL);
	//snip
	if ((gscManager = ADVAPI32$OpenSCManagerA(hostname, SERVICES_ACTIVE_DATABASEA, SC_MANAGER_CONNECT | GENERIC_READ)) == NULL)

You can see that this BOF takes the hostname parameter from the “Data Parser API” (BeaconDataExtract) that isn’t present in BRC4. There’s no equivalent to this API in BRC4.

At this point I figured that instead of coming up with some hacky fix I’d work out a proper solution that works more reliably and is more flexible: after all, I didn’t want to manually edit all the BOFs I use on a regular basis and troubleshoot API replacements.

Approach 2: True Compatibility

Since replacing APIs was already tricky in some cases and just impossible for higher-level APIs, I was searching for a solution that allowed me to (ideally) not touch any API calls in BOFs’ source code at all. There are two challenges to solve: allowing BOFs to call CS APIs and transforming their entrypoint to BRC4’s signature.

Compatibility Layer

Luckily, I wasn’t the first one to attempt this: TrustedSec’s COFFLoader is able to execute arbitrary compiled CS BOFs which means that it treats BOFs as blackboxes and introduces a compatibility layer that implements the CS BOF API. With this approach in mind I modelled the following design:

The compatibility layer

The idea is simple:

The CS API definitions included in BOF source code, usually called beacon.h, are replaced with stubs that don’t import the actual API but use the compatibility layer code. The compatibility layer imports the BRC4 BOF API and calls it as needed. COFFLoader’s compatibility layer is very readable and straight-forward to understand. It implements all the higher-level concepts missing in the BRC4 API. One only needs to copy their implementation and swap out some bits that require imports, such as string or memory utilities. They should be replaced with BRC4’s equivalents (e.g. replacing memcpy with BadgerMemcpy) or, less ideally, with MSVCRT imports (e.g. vsnprintf for string formatting). For example, the BeaconFormatAlloc API can be implemented as follows:

void BeaconFormatAlloc(formatp* format, int maxsz) {
	if (format == NULL) return;
	format->original = (char*)BadgerAlloc(maxsz);
	format->buffer = format->original;
	format->length = 0;
	format->size = maxsz;
}

For the sake of completeness: the compatibility layer should also include imports to the WinAPI functions included by default in CS (GetProcAddress, LoadLibraryA, GetModuleHandle, and FreeLibrary).

As a result, following this approach won’t tamper with the BOFs’ original logic but lets us implement the CS API ourselves, which in turn allows our BOFs to run in BRC4 now. Well, almost: the entrypoint isn’t compatible yet, and that’s not necessarily trivial.

Wrapping the Entrypoint

As we saw in the first attempt, porting the entrypoint from CS to BRC4 BOFs isn’t really tricky as we only need to change the function signature. It does get tricky if our BOF uses its start parameters (and thereby CS’s Data Parser API) though:

This API allows passing arbitrary data to CS BOFs. To achieve this, CS BOFs can ship with CNA scripts that allow the CS client to query input data (such as files) from operators, which the CNA assembles into a binary blob. This blob is sent along with the BOF itself to the implant (“beacon”). The BeaconData* APIs (which make up the Data Parser API) allow BOFs to disassemble this blob into structured data again. BRC4 doesn’t have this scripting capability and its BOF entrypoint only allows passing string-based arguments instead.

Again, COFFLoader solved the same problem before: it comes with a Python script that encodes arbitrary input into a hex-string that can be deserialized to a byte-buffer and passed to CS BOF entrypoints. Following the same approach, I worked out the following rather simple addition to the design above:

Wrapped entrypoint

Once more, the idea is simple:

Operators encode their inputs to string and pass it to the BOF using BRC4’s coffexec command. A minimal BRC4 entrypoint is appended to the BOF source code. This entrypoint decodes supplied input strings to a buffer and passes that buffer to the original CS entrypoint.

Summary

In essence, this approach consists of only three steps:

  1. Replace CS API imports with compatibility layer implementations
  2. Wrap CS entrypoint with a custom BRC4 entrypoint that prepares input for the Data parser API
  3. Manually encode execution parameters

This still isn’t a perfect solution but leaves us with a couple of pros and cons:

  • ✅ Doesn’t touch the original BOF’s logic
  • ✅ Flexibility: the same approach works for most (if not all) BOFs out there
  • ❌ Requires (somewhat) elaborate compatibility implementation
  • ❌ Requires some way to inject the compatibility layer (e.g. via source code)

III. Coming up next

Now that we have a solid and flexible approach to run CS BOFs on BRC4, there’s only one thing missing – a tool that automates it all!

We will publish CS2BR – a tool that does just that – as an open source project on Github along with a follow-up blogpost all about it soon. Stay tuned!

Moritz Thomas

Moritz Thomas

Moritz is a senior IT security consultant and red teamer at NVISO.
When he isn’t infiltrating networks or exfiltrating data, he is usually knees deep in research and development, working on new techniques and tools in red teaming.


文章来源: https://blog.nviso.eu/2023/05/15/introducing-cs2br-pt-i-how-we-enabled-brute-ratel-badgers-to-run-cobalt-strike-bofs/
如有侵权请联系:admin#unsafe.sh