Developing a BugSleep C2 Server and Monitoring Its Traffic with Snort

Short Summary:

In June 2024, researchers analyzed a new remote access tool (RAT) named “MuddyRot” or “BugSleep,” which utilizes a custom command and control (C2) protocol for reverse shell and file I/O capabilities. The article discusses the methodology for reversing BugSleep’s protocol, building a functional C2 server, and detecting its traffic using Snort.

Key Points:

BugSleep is a remote access tool (RAT) with reverse shell and file I/O capabilities.
It employs a bespoke command and control (C2) protocol over plain TCP sockets.
The protocol uses a pseudo-TLV (Type Length Value) structure for communication.
BugSleep implements file obfuscation techniques to evade detection.
Key functions include C2Loop for establishing connections and CommandHandler for processing commands.
Commands include Ping, GetFile, PutFile, and a reverse shell command.
Detection strategies using Snort involve monitoring beacons and command IDs.
Published Snort SIDs for BugSleep traffic are 63937 and 63938.

MITRE ATT&CK TTPs – created by AI

Remote Access Tool (RAT) – T1219
- Procedures:
  - Utilizes a custom C2 protocol for remote access.
  - Enables reverse shell and file I/O operations on the target system.
Non-Application Layer Protocol – T1095
- Procedures:
  - Communicates over TCP sockets with a custom protocol.
  - Beacons to the C2 server for command execution.
Obfuscated Files or Information – T1027
- Procedures:
  - Employs file obfuscation techniques to avoid detection.
  - Uses payload encryption by modifying byte values.

In June 2024, security researchers published their analysis of a novel implant dubbed “MuddyRot” (aka “BugSleep”). This remote access tool (RAT) gives operators reverse shell and file input/output (I/O) capabilities on a victim’s endpoint using a bespoke command and control (C2) protocol. This blog demonstrates the practice and methodology of reversing BugSleep’s protocol, writing a functional C2 server, and detecting this traffic with Snort.

The BugSleep implant implements a bespoke C2 protocol over plain TCP sockets.
BugSleep operators have demonstrated multiple file-obfuscation techniques to avoid detection.
BugSleep implements reverse shell, file I/O, and persistence capabilities on the target system.

This blog will use sample b8703744744555ad841f922995cef5dbca11da22565195d05529f5f9095fbfca for analysis. Two of the lowest-level functions in the C2 stack, referred to as SendSocket (FUN_1400034c0) and ReadSocket (FUN_140003390), are very light wrappers around the send and receive API functions and also handle payload encryption. They include some error handling: each attempts to send or receive data up to 10 times before failing.

This protocol uses a pseudo-TLV (Type Length Value) structure with only two types: integer or string. Integers are sent as little-endian 4- or 8-byte values, and each string is prefixed with its length as a 4-byte value. Payloads are then encrypted by subtracting a static value from each byte in the buffer (in this sample, the value is three).

	Plain text	Cipher text
IntegerMsg	06 00 00 00	03 FD FD FD
StringMsg	05 00 00 00 48 65 6C 6C 6F	02 FD FD FD 45 62 69 69 6C

Figure 1: Example of data encryption used by BugSleep
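The encoding described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code from the C2 server: the function names are our own, and the key of 3 is the value reported for this sample only. The first call reproduces the IntegerMsg row of Figure 1.

```python
import struct

KEY = 3  # static value reported for this sample; other samples may differ


def encrypt(data: bytes, key: int = KEY) -> bytes:
    """Subtract the key from every byte, wrapping modulo 256."""
    return bytes((b - key) & 0xFF for b in data)


def decrypt(data: bytes, key: int = KEY) -> bytes:
    """Undo encrypt() by adding the key back, wrapping modulo 256."""
    return bytes((b + key) & 0xFF for b in data)


def integer_msg(value: int, size: int = 4) -> bytes:
    """Pack a little-endian 4- or 8-byte unsigned integer."""
    return struct.pack("<I" if size == 4 else "<Q", value)


def string_msg(text: bytes) -> bytes:
    """Prefix the string with its 4-byte little-endian length."""
    return integer_msg(len(text)) + text


print(encrypt(integer_msg(6)).hex(" "))        # 03 fd fd fd
print(encrypt(string_msg(b"Hello")).hex(" "))
```

Because the subtraction wraps modulo 256, a plain-text zero byte becomes 0xFD, which is what makes the length prefix recognizable on the wire later on.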

There are two main functions for handling C2 communications: C2Loop (FUN_1400012c0) and CommandHandler (FUN_1400028a0). C2Loop is responsible for setting up socket connections with the server and sending a beacon, while CommandHandler is responsible for processing and executing commands from the server.

After setting up the socket connection, the implant beacons (FUN_140003d80) to the C2 server for a command. The beacon is a StringMsg in the form ComputerName/Username. If the server responds with an IntegerMsg equal to 0x03, BugSleep will terminate itself. We suspect this is either a remnant of an old kill command or an emergency kill that avoids the overhead of reading the real kill command later.

Each BugSleep command is sent as an IntegerMsg after the beacon response. The following enumeration defines all the command IDs discovered.

The implant communicates using plain TCP sockets, which can be seen using a Netcat listener and Wireshark.

Figure 3: BugSleep beacon as seen through Wireshark.

Recalling the message encryption demonstrated in Figure 1, the beacon can be decrypted with a little bit of Python (Figure 4). This will be used again when building the rest of the C2 server.

Figure 4: Decoding beacon data
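Since the original screenshot is not reproduced here, the following is a rough sketch of what such a decoder could look like. It is not the code from Figure 4: the function name and the sample host and user names are hypothetical, and it assumes the whole beacon (length prefix plus data) is available in one buffer.

```python
import struct

KEY = 3  # assumed per-sample key, as described for this sample


def decode_beacon(wire: bytes, key: int = KEY):
    """Return (computer_name, username) from an encrypted beacon buffer."""
    plain = bytes((b + key) & 0xFF for b in wire)  # add the key back
    if len(plain) < 4:
        raise ValueError("beacon is missing its length prefix")
    (length,) = struct.unpack_from("<I", plain, 0)
    body = plain[4:4 + length]
    if len(body) != length:
        raise ValueError("beacon is truncated")
    computer, _, user = body.decode("ascii", errors="replace").partition("/")
    return computer, user


# Hypothetical capture: "DESKTOP-01/alice" as the implant would encode it.
sample = b"DESKTOP-01/alice"
wire = bytes((b - KEY) & 0xFF for b in struct.pack("<I", len(sample)) + sample)
print(decode_beacon(wire))
```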

With an understanding of the protocol basics, it is time to start building the C2 server. Full source code can be found here.

Beacon

As mentioned earlier, the BugSleep beacon function sends a StringMsg and reads an IntegerMsg response from the server. Since the IntegerMsg returned can be anything but 0x03, we returned the length of the Computer Name/Username string received by the server.

Figure 5: Output from C2 server receiving beacon data
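One edge case is worth spelling out when echoing the string length: a three-character identity would produce the very value that makes the implant exit. The sketch below shows one way to guard against that. The helper name is hypothetical, and it assumes the kill check applies to the decoded integer value.

```python
KILL_VALUE = 0x03  # response value that makes the implant terminate


def beacon_response(identity: bytes) -> int:
    """Pick the IntegerMsg value used to answer a beacon."""
    value = len(identity)
    if value == KILL_VALUE:
        value += 1  # assumption: any value other than 0x03 is acceptable
    return value


print(beacon_response(b"DESKTOP-01/alice"))
```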

Ping command

The simplest command to implement is the Ping command. It has the command ID of 0x63 (BugSleep subtracts one from whatever ID it receives). The code is simple: send back 4 bytes.

Figure 6: Switch case for handling ping command

Once the beacon comes in, the server is responsible for:

Sending 4 bytes for beacon response
Sending 4 bytes for Ping command ID
Reading 4 bytes of Ping data

The Ping command was observed sending back 4 bytes from a freshly allocated heap buffer, so the contents of the reply are not predictable. To validate that everything is really working, a breakpoint can be set in WinDbg and the memory set manually before it is sent.
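The three server-side steps for Ping could be sketched as follows. Several details here are assumptions rather than facts from the sample: the helper names are ours, 0x63 is sent exactly as written (it is not spelled out whether that is the value before or after the implant subtracts one), and server-to-implant bytes are shifted up by the key, which we infer from zero bytes from the server appearing as |03| on the wire.

```python
import socket
import struct

KEY = 3
PING_ID = 0x63  # command ID as given in the text


def to_implant(data: bytes) -> bytes:
    # Assumption: traffic toward the implant is shifted up by KEY.
    return bytes((b + KEY) & 0xFF for b in data)


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)


def serve_ping(sock: socket.socket, beacon_reply: int) -> bytes:
    """Run the three steps that follow a received beacon."""
    sock.sendall(to_implant(struct.pack("<I", beacon_reply)))  # beacon response
    sock.sendall(to_implant(struct.pack("<I", PING_ID)))       # Ping command ID
    return recv_exact(sock, 4)                                  # raw Ping data
```

The Ping data is returned undecoded, since its contents come from uninitialized heap memory and carry no meaning beyond confirming that the round trip worked.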

File commands

The next set of commands are responsible for downloading files onto the compromised system or uploading files to the C2 server (PutFile and GetFile, respectively). These commands are inverses of each other, so only the GetFile command will be discussed in detail. The methodology was to trace each call to SendSocket or ReadSocket and implement the response for that call in Python. In CommandHandler, the implant reads the length and value of a string off the wire. This string is the path of the file to be retrieved.

Figure 8: GetFile reading path string length and path string from socket

The CmdGetFile function opens the target file and chunks it over the socket one page at a time. The list of SendSocket calls is as follows:

The PutFile command differs slightly from the GetFile command with how it uses pointer math to process incoming pages.

Figure 11: Tricky file pointer math

This translates to each page starting with a 4-byte page number followed by 1020 bytes (or 0x3fc) of file data, which the GetFile command does not do; it sends full 1024-byte pages of file data without page numbers.
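The difference between the two page layouts can be illustrated with a pair of generators. This is a sketch under stated assumptions: the function names are hypothetical, the page number is packed little-endian like the protocol's other integers, numbering starts at zero, and the final page is left unpadded. None of those three details is confirmed by the text above.

```python
import struct

PAGE_SIZE = 1024       # GetFile sends full pages of raw file data
PUT_DATA_SIZE = 0x3FC  # PutFile carries 1020 data bytes per page


def getfile_pages(data: bytes):
    """Yield pages of raw file data with no page numbers."""
    for offset in range(0, len(data), PAGE_SIZE):
        yield data[offset:offset + PAGE_SIZE]


def putfile_pages(data: bytes, first_page: int = 0):
    """Yield pages made of a 4-byte page number plus up to 1020 data bytes."""
    offsets = range(0, len(data), PUT_DATA_SIZE)
    for index, offset in enumerate(offsets, first_page):
        yield struct.pack("<I", index) + data[offset:offset + PUT_DATA_SIZE]


blob = bytes(2500)  # placeholder file contents
print([len(p) for p in getfile_pages(blob)])  # [1024, 1024, 452]
print([len(p) for p in putfile_pages(blob)])  # [1024, 1024, 464]
```

Both layouts produce 1024-byte pages on the wire for full pages, but a PutFile page holds four fewer bytes of file data, so the same file can need more pages in that direction.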

Reverse shell

The last command is the reverse shell. This is the most complex because it requires many reads and writes over the socket. The disassembly is long and its socket calls are difficult to follow, so we have omitted it. Effectively, the implant spawns a cmd.exe process (FUN_1400016e0) and reads the command to execute from the socket. The shell command and its output are marshaled between the processes via pipes during the session. The complexity of this operation comes from BugSleep incrementally reporting return values from pipe API calls while attempting to read shell output (FUN_140003840). The implant will stay in this loop of reading commands and sending output until it receives the string “terminate\n”.

Figure 12: Example output from C2 server running the reverse shell command

The rest of the commands are less complex but have been implemented and are viewable here.

This server gives Talos the ability to emulate any number of conversations between BugSleep and its operators. This traffic is crucial for writing and validating our detections’ performance in the wild.

The initial candidate for detection would be the beacon. It is the first opportunity to shut down communications, isolating any BugSleep instance from receiving commands. It was observed that each beacon has the form of <len><data>, where data is sub_string(COMPUTER_NAME + “/” + USERNAME, 3). This string is not long or static, which makes it a poor candidate for a fast_pattern; however, recall that each beacon is prepended with a 4-byte length of this string. A Computer Name/Username string from any given victim is unlikely to be longer than 255 characters. This means most length fields are going to look like |XX 00 00 00| or |XX FD FD FD| when encoded. This could be a quick match, early in the stream, at a static offset, making it a decent fast_pattern candidate.

Figure 13: Detecting higher order encoded zero bytes of beacons sent from BugSleep
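The same check can be expressed outside of Snort as a small Python predicate, which is handy for testing the idea against captured payloads. It is only an illustration of the matching logic, not the Snort rule itself, and the function name is hypothetical.

```python
ENCODED_ZERO = 0xFD  # 0x00 after subtracting the key of 3


def looks_like_beacon_length(payload: bytes) -> bool:
    """True when bytes 1..3 are three encoded zero bytes."""
    return len(payload) >= 4 and payload[1:4] == bytes([ENCODED_ZERO] * 3)


print(looks_like_beacon_length(bytes.fromhex("0dfdfdfd")))  # length 16
print(looks_like_beacon_length(bytes.fromhex("29fefdfd")))  # length 300
```

A length of 300 (0x012C) has a non-zero second byte, so it encodes to |29 FE FD FD| and the predicate rejects it. This is the trade-off behind assuming identities of 255 characters or fewer.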

This will work but is likely to cause false-positives (FP) in the wild. Every sample of BugSleep was seen using port 443. The implant is also reaching outside the network to a C2 server, so traffic to be inspected by this rule can be reduced using the following header:

Figure 14: Restricting rule to inspect traffic leaving the network to port 443

The flow:to_server,established option can be used to restrict Snort to data coming from a client over established TCP streams. The FP-rate on this rule still isn’t great. Any TCP traffic leaving the network on port 443 with |FD FD FD| at offset 1 will alert. That might sound unique, but it does not indicate with confidence that the traffic is a BugSleep beacon.

One powerful tool in Snort to add more logic or state to rules is flowbits. These allow a writer to have a sense of state within a stream across multiple rules. In this case, the beacons aren’t enough to reliably alert on. What if we use flowbits to chain beacons with the commands being sent back? The commands themselves don’t provide much content, as they are variable length non-deterministic strings (e.g., get, put, etc.) or a nondeterministic 4-byte integer (e.g., heartbeat, increment timeout, etc.). They do, however, all start with a 4-byte command ID. Setting a flowbit when a beacon leaves the network will allow another rule down the line to alert with higher confidence if it sees a command ID come back in the same stream.

Command rules

The pcre rule option can be used to reduce 11 rules down to one. Like the beacon rule, the three zero bytes, encoded as |03|, can be used as a fast_pattern. Once the rule has entered, the bugsleep_beacon flowbit check can be performed to help the rule exit quickly in the event of a false positive. After the three |03| bytes are confirmed to be at offset five, a PCRE can verify one of the command IDs is present.
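To make the matching logic concrete, here is a Python analogue built on the `re` module. It is not the published rule. The command ID set is supplied by the caller, and the value 0x66 in the example is purely hypothetical. The sketch also assumes the four bytes before the ID are the beacon response, which is what places the three |03| bytes at offset five.

```python
import re


def build_command_matcher(encoded_ids):
    """Compile: 4 bytes of beacon response, one ID byte, then three 0x03 bytes."""
    if not encoded_ids:
        raise ValueError("at least one command ID is required")
    alternatives = b"|".join(re.escape(bytes([i])) for i in sorted(encoded_ids))
    # DOTALL lets "." match any byte value, including 0x0A.
    return re.compile(b"^.{4}(?:" + alternatives + b")\x03{3}", re.DOTALL)


matcher = build_command_matcher({0x66})  # hypothetical encoded command ID
print(bool(matcher.match(bytes.fromhex("1303030366030303"))))
print(bool(matcher.match(bytes.fromhex("1303030370030303"))))
```

A single alternation over the ID bytes plays the role of the PCRE here: one pattern covers every command instead of one rule per command.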

Sharp edges

Sometimes, we are reminded that Snort can handle or interpret data differently than expected. Conveniently, this sample’s traffic was a perfect example and opportunity to peek under the hood and see what Snort sees. Originally, our beacon rule looked like this, trying to catch the encoded forward-slash that is always present in the Computer Name/Username string (encoded as a comma).

Recall that the implant will:

Connect to the server
Send a string length (4 bytes)
Send the PC/User string (N bytes)
Read 4 bytes back to ensure a response
Read a 4-byte command ID and N bytes of command data
Start sending command responses

As Snort is reading data over the wire, it is interpreting it and sorting it into different buffers (pkt_data, file_data, js_data, http_*, etc.). In this case, as TCP data is being chunked along the wire, Snort is looking at those individual TCP segments. Only after it has enough data will it flush into the larger “TCP stream” buffer so a rule can parse the entire stream sent from a client or server.

Initially, the get command traffic was alerting while the put command traffic was not. Fortunately, Snort 3 comes with a tracing module to help debug these issues. The buffer option will print out Snort’s different buffers as they are filled and rule_eval will trace the rule as it is evaluated. The following screenshots are output from individual runs of Snort against each PCAP. “snort.raw” represents an individual packet, while “snort.stream_tcp” represents a reassembled TCP stream.

At the start of the working GetFile command, the beacon size and data can be seen as two separate packets (Figure 17).

Further down, the reassembled TCP stream can be seen being inspected and alerted on. Moving from the top to bottom in Figure 18, the cursor position and state of the buffer can be observed changing as the rule is evaluated. At the end, the flowbit is set and made available for the command rule.

Figure 18: Snort trace output setting flowbit for BugSleep beacon

Further down, the TCP stream for the command data is processed. The higher-order zeroes of the command are found, the flowbit checked, the PCRE performed, and the SID alerts as expected.

Figure 19: Get file command rule alerts on traffic as expected

When the results of the put file command traffic are inspected, a different behavior is observed. The individual packets for beacon length and beacon data are seen coming in; however, the first reassembled TCP stream that Snort is inspecting is the command being sent back to the implant. Figure 20 shows the command ID being found and then the flowbit check failing.

Scrolling further in the log reveals the TCP stream for the beacon data is eventually populated and Snort sets the flowbit as expected. The stream for the command ID, however, has already passed and failed analysis because of the unset flowbit, resulting in no alert. The cause of this issue is the raw packets coming from the client not being reassembled into a TCP stream by the time the server packets are reassembled and inspected. This happens because Snort only reassembles when it has enough data, and 20 bytes is not enough yet.
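The ordering problem can be reproduced with a toy model of the two rules. This is a deliberately simplified stand-in for Snort's behavior, not a description of its internals: the function name, the event format, and the sample payloads (including the 0x66 command byte) are all hypothetical.

```python
def evaluate(events):
    """Replay inspected buffers in order and return the rules that alert.

    Each event is a ("beacon", payload) or ("command", payload) tuple,
    listed in the order the buffers become available for inspection.
    """
    flowbits = set()
    alerts = []
    for kind, payload in events:
        if kind == "beacon" and payload[1:4] == b"\xfd\xfd\xfd":
            flowbits.add("bugsleep_beacon")      # beacon rule sets the bit
        elif kind == "command" and payload[5:8] == b"\x03\x03\x03":
            if "bugsleep_beacon" in flowbits:    # command rule checks the bit
                alerts.append("command")
    return alerts


beacon = ("beacon", bytes.fromhex("0dfdfdfd"))
command = ("command", bytes.fromhex("1303030366030303"))

print(evaluate([beacon, command]))  # ['command']
print(evaluate([command, beacon]))  # []
```

When the command buffer is inspected before the beacon buffer, the flowbit check fails and nothing alerts, even though both buffers were eventually seen. That mirrors the put file case described above.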

The fix

Unfortunately, the beacon rule must be tweaked so it can alert as soon as possible and not rely on TCP reassembly. Recall that the beacon function invokes SendSocket twice: once for the 4-byte length and again for the beacon data. This means the first packet Snort sees will be only 4 bytes long. Adding “bufferlen:=4” restricts the rule to 4-byte packets, significantly reducing the FP rate. Our solution ended up being this:

Now the rules work as expected!

Since BugSleep is a new implant and weekly releases were observed being deployed, this protocol might change and bypass these rules. However, two things have been accomplished:

This variant will no longer communicate over our customers’ networks.
Attackers must invest development time and money to use BugSleep again.

The published Snort SIDs covering this traffic are 63937 and 63938.

Hosts:

1[.]235[.]234[.]202
146[.]19[.]143[.]14
46[.]19[.]143[.]14
5[.]239[.]61[.]97

Hashes

The following Windows executables were collected during our research. Assuming these have not been manipulated, the compilation time for this set of binaries indicates weekly releases of BugSleep.

	Compile Time
b8703744744555ad841f922995cef5dbca11da22565195d05529f5f9095fbfca	Wed., May 8 00:55:53 2024 UTC
94278fa01900fdbfb58d2e373895c045c69c01915edc5349cd6f3e5b7130c472	Wed., May 22 21:56:39 2024 UTC
73c677dd3b264e7eb80e26e78ac9df1dba30915b5ce3b1bc1c83db52b9c6b30e	Fri., May 31 23:29:21 2024 UTC
5df724c220aed7b4878a2a557502a5cefee736406e25ca48ca11a70608f3a1c0	Sun., Jul 07 21:09:49 2024 UTC
960d4c9e79e751be6cad470e4f8e1d3a2b11f76f47597df8619ae41c96ba5809	Sat., Jul 15 09:15:20 2079 UTC

Source: https://blog.talosintelligence.com/writing-a-bugsleep-c2-server/
