This post is a continuation of "Malware Unpacking With Hardware Breakpoints".
Here we will use Ghidra to locate the shellcode, analyse the decryption logic and obtain the final decrypted content using CyberChef.
Locating the Shellcode Decryption Function In Ghidra
At the point where the hardware breakpoint was first triggered, the primary executable was likely in the middle of the decryption function. We can use this information to locate the same decryption function within Ghidra.
From here, we can do some interesting things which are covered in the next 7 sections.
- Locating the Shellcode Decryption Function In Ghidra
- Identifying Decryption Routine Logic With ChatGPT
- Identifying the Decryption Key Using Ghidra
- Locating the Encrypted Shellcode Using Entropy
- Performing Manual Decoding Using Cyberchef
- Hunting For Additional Samples Using Decryption Bytes
- Creating a Yara Rule Using Decryption Code
Locating the Shellcode Decryption Function In Ghidra
If we run the malware again, we can stop at the initial hardware breakpoint trigger and scroll up slightly in the disassembly window.
This will reveal the decryption logic used to obtain the shellcode.
In addition to the notes above, we can observe that the instruction pointer RIP is inside of a loop that contains an XOR instruction.
A looping XOR instruction can be a strong indicator of decryption/decoding logic.
If we copy the loop's contents, we can use this to investigate the logic inside of Ghidra.
We can use Right-Click -> Binary -> Edit to copy out the opcode bytes associated with the suspected decoding loop.
We can load the file within Ghidra and perform a memory search on these suspicious bytes. This can lead us to the decryption function.
Ghidra -> Search -> Memory (make sure to use the "hex" format and "All Blocks")
By clicking "Next" or "Search All" within the search menu, we are taken straight to the decryption function inside Ghidra.
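The same byte search can be sketched outside of Ghidra. The snippet below is a minimal illustration in Python; the pattern bytes are a hypothetical placeholder (the real bytes are the ones copied from x64dbg), and the function name is my own. Note that it reports raw file offsets, which will generally differ from the virtual addresses Ghidra displays.

```python
from typing import List


def find_pattern(data: bytes, pattern: bytes) -> List[int]:
    """Return every offset at which `pattern` occurs in `data`."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        # Resume one byte later so overlapping matches are also found.
        start = data.find(pattern, start + 1)
    return offsets


# Hypothetical placeholder: replace with the hex copied from x64dbg.
LOOP_BYTES = bytes.fromhex("de ad be ef")

# Small in-memory demo; for a real sample, read the file with
# open(path, "rb").read() and pass the result as `data`.
demo = b"\x00\x01" + LOOP_BYTES + b"\x02" + LOOP_BYTES
matches = find_pattern(demo, LOOP_BYTES)
```

A single match is the ideal outcome here, since it suggests the byte sequence is specific enough to identify the decryption loop.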
We can observe the calls to VirtualAlloc, VirtualProtect and CreateThread inside of the Decompiler.
We can also view the decompiled decryption logic inside of the for loop. A primary giveaway here is the ^ (xor) operator.
If the above output is confusing, you can increase the readability by disabling type casts: Edit -> Tool Options -> Decompiler -> Disable Printing of Type Casts.
We can also see that the variable lpAddress (assigned the result of VirtualAlloc) receives the decoded content, since it is assigned the result of the xor (^) operation. It is then modified by VirtualProtect and executed via CreateThread.
Now that we've identified the function and logic associated with decryption, we can try to identify the type of encryption/obfuscation used.
Identifying Decryption Routine Logic With ChatGPT
Using ChatGPT, we can attempt to gather additional information about the decompiled code.
We can copy out the decompiled code and ask ChatGPT something like "In 3 sentences or less, can you summarise the purpose of this Ghidra Decompiled code".
This identifies the general gist of the code but doesn't provide a lot of information about the decryption routine.
To gather more information about the encoding itself, we can take out the contents of the for loop and summarise it with ChatGPT.
ChatGPT can recognise that a simple 4-byte key is used to decrypt some bytes and write them to the buffer we identified at lpAddress.
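To make the described logic concrete, here is a minimal Python sketch of a repeating-key XOR. It is an assumption about the shape of the loop (key byte selected by index modulo the key length), not a transcription of the decompiled code, and the key used in the demo is a hypothetical placeholder.

```python
def xor_decode(data: bytes, key: bytes) -> bytes:
    """XOR each byte of `data` with the key byte at the same index modulo len(key)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical 4-byte key, used only to demonstrate the round trip.
DEMO_KEY = bytes([0x01, 0x02, 0x03, 0x04])

encoded = xor_decode(b"example shellcode bytes", DEMO_KEY)
decoded = xor_decode(encoded, DEMO_KEY)
```

Because XOR is its own inverse, the same function both encodes and decodes, which is why a single loop is all the malware needs.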
Identifying the Decryption Key Using Ghidra
If we return to the decompiler output, we can observe the 4-byte key param_3 that was referenced by ChatGPT.
We can confirm that param_3 is part of the for loop used for decoding and also that the value of param_3 is not visible within this function.
By right-clicking on the function name from the above screenshot, FUN_0040152e, we can use "Show References To" to identify where the function is called.
We can use this to identify the value passed in param_3, which likely contains the 4-byte decryption key.
By clicking on the value in the 3rd argument (param_3), we can jump to the location where the 4-byte key is stored.
This can be seen in the left window of the below screenshot. The 4-byte decryption key is 32 2f 0d 96.
Locating the Encrypted Shellcode Using Entropy
The process of locating the encrypted shellcode is slightly more complex.
Cobalt Strike uses a system of named pipes to move around encrypted data. It is quite tedious to locate the shellcode from the point of the previous screenshot.
Instead, we will use entropy to locate the encrypted shellcode content.
We can begin this process by:
- Enabling the entropy view
- Identifying a high-entropy section
- Locating the beginning of the high-entropy section using recent labels
We can begin by enabling the entropy view and clicking on the area with the highest entropy.
High entropy areas are indicated by a red section within the entropy view. However, for some reason, Ghidra also highlights high entropy areas with a bright white colour.
(There is an entropy colour reference within Ghidra, but it's blank when using dark mode)
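For readers who want to see what the entropy view is measuring, the sketch below computes Shannon entropy over fixed-size windows of a byte buffer. It is an illustration only: the window size and the 7.0 threshold are arbitrary choices of mine and are not taken from Ghidra's implementation.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 = constant data, 8.0 = uniform)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(chunk).values())


def high_entropy_offsets(data: bytes, window: int = 256, threshold: float = 7.0) -> List[int]:
    """Return the start offsets of non-overlapping windows whose entropy meets the threshold."""
    return [
        offset
        for offset in range(0, len(data), window)
        if shannon_entropy(data[offset:offset + window]) >= threshold
    ]


# Demo: zero padding around one block in which every byte value appears once.
demo = b"\x00" * 512 + bytes(range(256)) + b"\x00" * 256
hot_spots = high_entropy_offsets(demo)
```

Encrypted or compressed data tends to sit close to 8 bits per byte, whereas code and plain text are usually noticeably lower, which is why the encrypted shellcode stands out.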
We can move on by clicking anywhere within the white section.
We will now be somewhere within the encrypted section.
We want to go to the start of the encrypted region, which we can do by selecting the "L" (most recent label) button in Ghidra.
We should now be at the start of the encrypted content.
If we want to obtain the encrypted content for manual decoding, we can highlight it and select Copy Special -> Byte String.
Performing Manual Decoding Using Cyberchef
We can paste the encrypted content into CyberChef and decrypt it using the 4-byte key identified from param_3 in the previous section.
In the CyberChef output, we can observe the same strings previously identified within the decrypted shellcode.
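The CyberChef step can also be approximated offline. The sketch below assumes the text copied from Ghidra is plain hex pairs (with or without spaces) and that the key 32 2f 0d 96 is applied from the first byte of the copied region; both are assumptions, and the function names are my own. It decodes the bytes and then lists printable ASCII runs, similar to what we read off the CyberChef output.

```python
import re
from typing import List

# 4-byte key identified from param_3 earlier in the post.
KEY = bytes.fromhex("32 2f 0d 96")


def decode_byte_string(hex_text: str, key: bytes = KEY) -> bytes:
    """Parse a hex byte string and XOR it with a repeating key."""
    data = bytes.fromhex(hex_text)
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def printable_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII characters at least `min_len` long."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Demo with synthetic data: encode a known buffer, then decode it again.
plain = b"\x00\x01hello world\xff"
cipher_hex = bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(plain)).hex()
recovered = decode_byte_string(cipher_hex)
strings_found = printable_strings(recovered)
```

If the decoded output contains no readable strings, the most likely causes are a wrong starting offset (which shifts the key alignment) or a selection that does not begin at the start of the encrypted region.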
Hunting For Additional Samples Using Decryption Bytes
Decryption and decoding routines are often unique enough for malware hunting and Yara rules.
If we go back and take the bytes that we obtained from x64dbg and searched for in Ghidra, we can use them to hunt for additional samples.
For example, we can search for additional samples using unpac.me. In this case, 58 results were obtained which all appeared to be Cobalt Strike samples.
The samples returned by the search each had 50+ results for Cobalt Strike on VirusTotal.
Creating a Yara Rule Using Decryption Code
Working on the (generally safe) assumption that the decryption logic remains the same across similar samples, we can use the identified decryption bytes to create a simple Yara rule. This should return the same results as the previous byte search with unpac.me.
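As a rough illustration, the helper below formats a byte sequence into a single-string Yara rule. The rule name and the pattern bytes are hypothetical placeholders; the real rule would use the decryption loop bytes copied from x64dbg.

```python
def build_yara_rule(rule_name: str, pattern: bytes) -> str:
    """Return the text of a Yara rule matching `pattern` as a hex string."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    hex_body = " ".join(f"{b:02x}" for b in pattern)
    return (
        f"rule {rule_name}\n"
        "{\n"
        "    strings:\n"
        f"        $decrypt_loop = {{ {hex_body} }}\n"
        "    condition:\n"
        "        $decrypt_loop\n"
        "}\n"
    )


# Hypothetical placeholder bytes and rule name, for illustration only.
rule_text = build_yara_rule("suspected_xor_decrypt_loop", bytes.fromhex("de ad be ef"))
```

If the loop contains bytes that vary between samples (for example relative offsets), those positions can be replaced with Yara's `??` wildcard in the hex string.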