Ghidra Basics – Manual Shellcode Decryption

This post is a continuation of “Malware Unpacking With Hardware Breakpoints”.

Here we will use Ghidra to locate the shellcode and analyse the decryption logic, then obtain the final decrypted content using CyberChef.

At the point where the hardware breakpoint was first triggered, the primary executable was likely in the middle of the decryption function. We can use this information to locate the same decryption function within Ghidra.

From here, we can do some interesting things, which are covered in the following seven sections:

  • Locating the Shellcode Decryption Function In Ghidra
  • Identifying Decryption Routine Logic With ChatGPT
  • Identifying the Decryption Key Using Ghidra
  • Locating the Encrypted Shellcode Using Entropy
  • Performing Manual Decoding Using Cyberchef
  • Hunting For Additional Samples Using Decryption Bytes
  • Creating a Yara Rule Using Decryption Code

Locating the Shellcode Decryption Function In Ghidra

If we run the malware again, we can stop at the initial hardware breakpoint trigger and scroll up slightly in the disassembly window.

This will reveal the decryption logic used to obtain the shellcode.

In addition to the notes above, we can observe that the instruction pointer RIP is inside of a loop that contains an XOR instruction.

A looping XOR instruction can be a strong indicator of decryption/decoding logic.

If we copy the contents of the loop, we can use this to investigate the logic inside of Ghidra.

In x64dbg, we can select the loop and use Right-Click -> Binary -> Edit to copy out the bytes associated with the suspected decoding loop.

We can load the file within Ghidra and perform a memory search on these suspicious bytes. This can lead us to the decryption function.

Ghidra -> Search -> Memory: make sure to use the “hex” format and “All Blocks”.

By clicking “Next” or “Search All” within the search menu, we are taken straight to the decryption function inside of Ghidra.
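Outside of Ghidra, the same byte search can be approximated with a few lines of Python. This is only a minimal sketch: the function name find_pattern and the sample bytes are hypothetical placeholders (not the bytes from this sample), and the offsets returned are raw file offsets rather than the virtual addresses Ghidra displays.

```python
def find_pattern(data: bytes, pattern_hex: str) -> list[int]:
    """Return every offset in data where the hex pattern occurs."""
    pattern = bytes.fromhex(pattern_hex)  # accepts "de ad" or "dead"
    if not pattern:
        return []
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        # Step one byte forward so overlapping matches are also reported.
        start = data.find(pattern, start + 1)
    return offsets


# Placeholder buffer standing in for the file contents; in practice this
# would be the bytes of the executable, e.g. open(path, "rb").read().
sample = bytes.fromhex("90 90 de ad be ef 90 de ad be ef")
print([hex(o) for o in find_pattern(sample, "de ad be ef")])
```

A single hit is the ideal outcome. Multiple hits would suggest the copied byte sequence is too short to be unique.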

We can observe the calls to VirtualAlloc, VirtualProtect and CreateThread inside of the Decompiler.

We can also view the decompiled decryption logic inside of the for loop. A primary giveaway here is the XOR operator (^).

If the above output is confusing, you can increase the readability by disabling type casts. Edit -> Tool Options -> Decompiler -> Disable Printing of Type Casts.

We can also see that the variable lpAddress, which holds the result of VirtualAlloc, receives the decoded content (the result of the XOR operation). That memory is then modified by VirtualProtect and executed via CreateThread.

Now that we’ve identified the function and logic associated with decryption, we can go ahead and try to identify the type of encryption/obfuscation used.

Identifying Decryption Routine Logic With ChatGPT

Using ChatGPT, we can attempt to gather additional information about the decompiled code.

We can copy out the decompiled code and ask ChatGPT something like: “In 3 sentences or less, can you summarise the purpose of this Ghidra decompiled code?”

This identifies the general gist of the code, but doesn’t provide a lot of information about the decryption routine.

To gather more information about the encoding itself, we can take out the contents of the for loop and summarise it with ChatGPT.

ChatGPT is able to recognise that a simple 4-byte key is used to decrypt some bytes and write them to the buffer we identified at lpAddress.
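The kind of loop described here can be sketched in Python. This is a reconstruction under assumptions, not the decompiler output: the names xor_decode, encrypted and key are illustrative, and indexing the key with the position modulo its length is assumed from the description of a repeating 4-byte key.

```python
def xor_decode(encrypted: bytes, key: bytes) -> bytes:
    """XOR each byte with the key byte at the same position modulo the key length."""
    decoded = bytearray(len(encrypted))  # plays the role of the lpAddress buffer
    for i in range(len(encrypted)):
        decoded[i] = encrypted[i] ^ key[i % len(key)]
    return bytes(decoded)


# XOR is its own inverse, so applying the same key twice restores the input.
key = bytes([0x01, 0x02, 0x03, 0x04])  # placeholder key for illustration only
scrambled = xor_decode(b"example shellcode", key)
assert xor_decode(scrambled, key) == b"example shellcode"
```

Because encoding and decoding are the same operation, recovering the key is enough to reverse the obfuscation.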

Identifying the Decryption Key Using Ghidra

If we return to the decompiler output, we can observe the 4-byte key param_3 that was referenced by ChatGPT.

We can confirm that param_3 is part of the for loop used for decoding, and also that the value of param_3 is not visible within this function.

By right-clicking on the function name from the above screenshot (FUN_0040152e), we can use “Show References To” to identify where the function is called.

We can use this to identify the value that is passed in param_3, which likely contains the 4-byte decryption key.

By clicking on the value in the 3rd argument (param_3), we can jump to the location where the 4-byte key is stored.

This can be seen in the left window of the below screenshot. The 4-byte decryption key is 32 2f 0d 96.

Locating the Encrypted Shellcode Using Entropy

The process of locating the encrypted shellcode is slightly more complex.

Cobalt Strike uses a system of named pipes to move around encrypted data. It is quite tedious to locate the shellcode from the point of the previous screenshot.

Instead, we will use entropy to locate the encrypted shellcode content.

We can begin this process by:

  • Enabling the entropy view
  • Identifying a high-entropy section
  • Locating the beginning of the high-entropy section using “recent labels”

We can begin by enabling the entropy view and clicking on the area with the highest entropy.

Typically, high-entropy areas are indicated by a red section within the entropy view. However, for some reason Ghidra also highlights high-entropy areas with a bright white colour.

(There is an entropy colour reference within Ghidra, but it’s blank when using dark mode.)
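To make the idea behind the entropy view concrete, here is a minimal Python sketch that computes Shannon entropy over fixed-size windows. It is not Ghidra's implementation: the 256-byte window size and the function names shannon_entropy and window_entropies are arbitrary choices for illustration, and the input is placeholder data.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy in bits per byte: 0 for constant data, 8 for uniform data."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def window_entropies(data: bytes, window: int = 256) -> list[tuple[int, float]]:
    """Entropy of each consecutive, non-overlapping window as (offset, entropy)."""
    return [
        (offset, shannon_entropy(data[offset:offset + window]))
        for offset in range(0, len(data), window)
    ]


# Placeholder data: a run of zero bytes followed by every byte value once.
data = bytes(256) + bytes(range(256))
offset, entropy = max(window_entropies(data), key=lambda item: item[1])
print(f"highest-entropy window starts at {offset:#x} ({entropy:.2f} bits/byte)")
```

Encrypted or compressed regions tend to score close to 8 bits per byte, which is why they stand out against ordinary code and data.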

We can move on by clicking anywhere within the white section.

We will now be somewhere within the encrypted section.

We want to go to the start of the encrypted region, which we can do by selecting the “L” (most recent label) button in Ghidra.

We should now be at the start of the encrypted content.

If we want to obtain the encrypted content for manual decoding, we can highlight it and select Copy Special -> Byte String.

Performing Manual Decoding Using Cyberchef

From here, we can paste the encrypted content into CyberChef and decrypt it with an XOR operation using the 4-byte key identified from param_3 in the earlier section.

In the CyberChef output, we can observe the same strings previously identified within the decrypted shellcode.
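For readers who prefer a script, the CyberChef step can be approximated in Python. This is a minimal sketch, assuming the byte string copied from Ghidra is hex text and that the key repeats every four bytes. decode_byte_string and ascii_strings are hypothetical helper names, and the encrypted input below is generated for the demonstration rather than taken from the sample.

```python
import re
from itertools import cycle

KEY = bytes.fromhex("32 2f 0d 96")  # 4-byte key located via param_3


def decode_byte_string(byte_string: str, key: bytes = KEY) -> bytes:
    """Parse a hex byte string (as copied from Ghidra) and XOR it with the key."""
    encrypted = bytes.fromhex(byte_string)  # whitespace between bytes is ignored
    return bytes(b ^ k for b, k in zip(encrypted, cycle(key)))


def ascii_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII, similar to the `strings` utility."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


# Demonstration input only: encrypt a known string with the key, then format
# it the way a copied byte string would look.
plain = b"\x90\x90example string\x00"
demo = " ".join(f"{b ^ k:02x}" for b, k in zip(plain, cycle(KEY)))
print(ascii_strings(decode_byte_string(demo)))
```

Listing the printable strings in the decoded output is a quick way to check the result against the strings seen earlier in the debugger.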

Hunting For Additional Samples Using Decryption Bytes

Decryption and decoding routines are often unique enough to be used for malware hunting and Yara rules.

If we go back and take the bytes that we copied from x64dbg and searched for in Ghidra, we can go hunting for additional samples.

For example, we can search for additional samples using unpac.me. In this case, 58 results were obtained which all appeared to be Cobalt Strike samples.

The samples returned by the search all had 50+ results for Cobalt Strike on VirusTotal.

Creating a Yara Rule Using Decryption Code

Working on the (generally safe) assumption that the decryption logic remains the same across similar samples, we can use the identified decryption bytes to create a simple Yara rule. This should return the same results as the previous byte search with unpac.me.
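As a rough illustration of what such a rule looks like, the Python sketch below formats a byte sequence into a Yara rule with a single hex string. The helper name build_yara_rule, the rule name and the bytes are all hypothetical placeholders. The real rule would use the bytes copied from the decryption loop.

```python
def build_yara_rule(rule_name: str, loop_bytes_hex: str) -> str:
    """Format raw bytes as a Yara rule containing a single hex string."""
    raw = bytes.fromhex(loop_bytes_hex)
    if not raw:
        raise ValueError("at least one byte is required")
    hex_string = " ".join(f"{b:02X}" for b in raw)
    return (
        f"rule {rule_name}\n"
        "{\n"
        "    strings:\n"
        f"        $decrypt_loop = {{ {hex_string} }}\n"
        "    condition:\n"
        "        $decrypt_loop\n"
        "}\n"
    )


# Placeholder bytes: substitute the bytes copied from the decryption loop.
print(build_yara_rule("xor_decrypt_loop_example", "de ad be ef"))
```

A rule this simple matches any file containing the byte sequence, so the sequence needs to be long and distinctive enough to avoid false positives.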

MITRE TTPs:

  1. Obfuscated Files or Information (T1027): The use of encrypted shellcode and a simple XOR decryption routine indicates obfuscation of information to hinder analysis.
  2. Deobfuscate/Decode Files or Information (T1140): The process of identifying and manually decoding the encrypted shellcode using CyberChef falls under this technique, as it involves reversing obfuscation techniques to reveal the original content.
  3. Process Injection (T1055): The use of VirtualAlloc, VirtualProtect, and CreateThread to allocate memory, decode shellcode, and execute it within the context of the same process suggests process injection techniques for executing malicious code.
  4. Inter-Process Communication (T1559): The use of a named pipe to transfer encrypted data, as indicated by the CreateNamedPipeA function, is an example of this technique, as named pipes are often used for inter-process communication and can be abused by malware for stealthy communication.
  5. Data Encoding (T1132): The presence of encrypted content being written into a named pipe indicates the use of data encoding techniques to conceal the actual data being transferred.
  6. Command and Scripting Interpreter (T1059): The execution of shellcode or scripts via CreateThread falls under this technique, as it involves using the command and scripting capabilities of the host environment to execute commands.
  7. Dynamic API Resolution (T1027.007): The dynamic resolution of API functions at runtime, as indicated by the decryption and use of API names like VirtualAlloc, VirtualProtect, and others, is a technique used to evade static analysis tools.
  8. Ingress Tool Transfer (T1105): The transfer of encrypted shellcode through a named pipe might be indicative of this technique, as it involves the transfer of tools or other files from an external source into the compromised environment.

Source: Original Post