Apache Applications Targeted by Stealthy Attacker

Researchers at Aqua Nautilus have uncovered a new attack targeting Apache Hadoop and Flink applications. This attack is particularly intriguing due to the attacker’s use of packers and rootkits to conceal the malware. The simplicity with which these techniques are employed presents a significant challenge to traditional security defenses.

The exploited misconfigurations

Apache Hadoop and YARN

Apache Hadoop is an open source framework designed for distributed storage and processing of large datasets across clusters of computers using simple programming models. It is highly scalable and designed to handle failures at the application layer.

Hadoop YARN (Yet Another Resource Negotiator) is a component of Hadoop that provides a platform for managing computing resources in clusters and using them for scheduling users’ applications. YARN separates the resource management and job scheduling/monitoring functions into separate daemons, allowing for more efficient cluster management.

Over the past few weeks, we discovered a new and interesting attack that targeted our cloud honeypots. The exploited misconfiguration is in the ResourceManager of Hadoop YARN, which permits unauthenticated users to create and run applications. An unauthenticated, remote attacker can exploit it with a specially crafted HTTP request, potentially leading to the execution of arbitrary code, depending on the privileges of the user on the node where the code is executed.
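As a minimal sketch of how this misconfiguration can be recognized on a cluster you operate, the helper below classifies the response to a POST to /ws/v1/cluster/apps/new-application. The response fields (application-id, maximum-resource-capability) follow the publicly documented YARN ResourceManager REST API; the function names are our own illustrative choices, and the sketch assumes you have already obtained the status code and body with an HTTP client of your choice.

```python
import json


def looks_unauthenticated(status_code: int, body: str) -> bool:
    """Return True if the ResourceManager handed out an application ID.

    Assumption: the caller sent the request WITHOUT credentials, so a
    200 response carrying an "application-id" means anyone can submit.
    """
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and "application-id" in payload


def advertised_resources(body: str) -> dict:
    """Extract the memory and vCores values the ResourceManager advertises."""
    capability = json.loads(body).get("maximum-resource-capability", {})
    return {"memory": capability.get("memory"), "vCores": capability.get("vCores")}
```

If looks_unauthenticated returns True for a request sent without credentials, the endpoint should be placed behind authentication or network restrictions. The same response also exposes memory and vCores, which is the kind of environment information an attacker can collect before submitting a job.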

Apache Flink

Apache Flink is an open-source, unified stream-processing and batch-processing framework developed by the Apache Software Foundation. The core of Apache Flink is a distributed streaming data-flow engine written in Java and Scala. Flink is designed to run in all common cluster environments, perform computations at in-memory speed, and at any scale.

Apache Flink Unauthenticated RCE refers to a misconfiguration that allows a remote attacker to execute arbitrary code on a system running Apache Flink without needing to authenticate.

Neither misconfiguration is new; both are well known and were reported by our team in the past (in the TeamTNT campaigns, for instance).

Hadoop attack flow

The two attacks are nearly identical, so this blog focuses on the Hadoop attack flow.

The attack flow

Figure 1: The attack flow

As illustrated in Figure 1, the attack itself is simple and straightforward. A misconfiguration in Hadoop is exploited to drop and execute the binary dca, which downloads two other binaries (rootkits) and writes a Monero cryptominer to disk. Below we further explain the techniques used during this attack.

Initial access

The targeted application is Apache Hadoop, and the attacker exploits a misconfiguration in the YARN ResourceManager. As seen in Figure 2 below, the attacker sends an unauthenticated request to deploy a new application.

Figure 2: The threat actor is discovering our environment

Next, as illustrated in Figure 3 below, the attacker achieves remote code execution by sending a POST request to YARN, asking it to launch the new application with the attacker’s command.

Figure 3: The attacker uploads a new application with a malicious command

Primary payload

A closer look at the code execution, as seen in Figure 4 below, reveals that the attacker deletes the contents of the /tmp directory, downloads a file (dca) from the C2 server into /tmp, executes the file, and then deletes the contents of the /tmp directory again.

Figure 4: The malicious command is set to download a packed ELF binary (dca)
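The command pattern described above (wipe /tmp, download, execute, wipe again) lends itself to a simple heuristic over YARN application submissions. The sketch below is an illustration under stated assumptions, not the detection logic of any product: the am-container-spec / commands / command nesting follows the documented YARN submit-application request body, while the regular expressions, trait names, and function names are hypothetical choices of ours.

```python
import re

# Heuristic patterns; tune them for your own environment.
WIPE_TMP = re.compile(r"\brm\s+-\w+\s+(?:/var)?/tmp\b")
DOWNLOAD = re.compile(r"\b(?:curl|wget)\b[^;&|]*https?://", re.IGNORECASE)
CHMOD_X = re.compile(r"\bchmod\s+(?:\+x|[0-7]{3,4})\b")


def extract_command(submission: dict) -> str:
    """Pull the launch command out of a YARN application submission body."""
    spec = submission.get("am-container-spec", {})
    return spec.get("commands", {}).get("command", "")


def suspicious_traits(command: str) -> list[str]:
    """List the download-and-execute traits found in a launch command."""
    traits = []
    if WIPE_TMP.search(command):
        traits.append("wipes-tmp")
    if DOWNLOAD.search(command):
        traits.append("downloads-file")
    if CHMOD_X.search(command):
        traits.append("marks-executable")
    return traits
```

A submission whose command shows two or more of these traits is worth a closer look. A legitimate application master launch (for example, a java command line) would normally show none of them.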

Secondary payloads and main impact

The file ‘dca’ is a packed ELF binary (MD5=901ac649b47e0261d88f568f02c90412).

Figure 5: Indication within the packed binary of a packer named BGP

On VirusTotal it is undetected. However, behavioral analysis indicates that its true functionality is to serve as a downloader of two different rootkits that aim to hide sh and another binary, tmp. The tmp binary is actually a Monero cryptominer contained within dca.

Figure 6: VirusTotal classification of dca

The Monero cryptominer is written to /var/tmp/tmp and executed.

Figure 7: Execution of the binary tmp (a Monero cryptominer)

As illustrated in Figure 7 above, the /var/tmp/ directories are deleted. Next, the tmp binary is written to disk and executed with the pool location ns1[.]disponibletogether[.]com. Finally, the binary and the contents of the /tmp directories are deleted to evade detection.

Figure 8: Cryptominer network communication

As illustrated in Figure 9 below, the malware is set to delete the dynamic linker configuration at /etc/ld.so.preload and all the shared objects under /usr/local/lib/. It is then set to download two .so files, initrc.so (MD5=0a100f6a07e7fd611553ef7c42f37f5a) and pthread.so (MD5=38d898459a3f530e2db083e1bb1e1524). Both are detected on VirusTotal as a Processhider rootkit. An analysis of the rootkits shows that the threat actor is hiding all shell processes.

Figure 9: Deletion of LD_Preload and download of rootkits
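Because the rootkits described above are loaded through the dynamic linker's preload mechanism, reviewing /etc/ld.so.preload is a quick sanity check on a suspect host. The sketch below parses the file's text and reports entries that are not on an allowlist. It assumes the file holds a whitespace- or colon-separated list of shared objects with optional # comments; the function names and the sample paths in the usage note are hypothetical. Keep in mind that a process-hiding rootkit may also tamper with what user-space tools see, so reading the file from a trusted environment (for example, a mounted disk snapshot) is more reliable.

```python
def preload_entries(preload_text: str) -> list[str]:
    """Return the shared objects listed in an ld.so.preload file body."""
    entries = []
    for line in preload_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        entries.extend(line.replace(":", " ").split())
    return entries


def unexpected_preloads(preload_text: str, allowlist=frozenset()) -> list[str]:
    """Return preload entries that are not explicitly allowed."""
    return [entry for entry in preload_entries(preload_text) if entry not in allowlist]
```

On most servers the file is empty or absent, so any entry returned by unexpected_preloads(open("/etc/ld.so.preload").read()) deserves investigation, particularly one pointing into /usr/local/lib/.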

Lastly, to ensure persistence, the cron job locations are deleted and rewritten with a command that downloads the dca.sh script.

Figure 10: Cron Job creation

This script is actually a deployer of the dca binary, as can be seen in Figure 11 below.

Figure 11: the dca.sh script content
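Cron-based persistence of this kind can be hunted with a small scan of crontab contents. The following sketch flags entries that fetch a remote resource and either pipe it to a shell or reference a remote .sh script. The patterns and function name are illustrative assumptions of ours, and the sample entry in the test is hypothetical rather than the attacker's exact cron line.

```python
import re

FETCH = re.compile(r"\b(?:curl|wget)\b.*https?://\S+", re.IGNORECASE)
PIPE_TO_SHELL = re.compile(r"\|\s*(?:ba|da)?sh\b")
SCRIPT_URL = re.compile(r"https?://\S+\.sh\b")


def suspicious_cron_lines(crontab_text: str) -> list[tuple[int, str]]:
    """Return (line number, entry) pairs that download and run remote scripts."""
    hits = []
    for number, line in enumerate(crontab_text.splitlines(), start=1):
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comments
        fetches = FETCH.search(entry)
        runs_remote = PIPE_TO_SHELL.search(entry) or SCRIPT_URL.search(entry)
        if fetches and runs_remote:
            hits.append((number, entry))
    return hits
```

Running it over the output of crontab -l for each user, plus the files under /etc/cron.d/ and similar locations, gives a quick list of entries to review.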

The threat actor’s infrastructure

Analysis of the threat actor’s infrastructure shows that they used the IP address 20[.]150[.]209[.]84 to scan and infect our honeypots. Specifically, this address targeted our Hadoop, Flink, Redis, and Spring (CVE-2022-22965) honeypots.

Figure 12: A screenshot from Shodan search engine

When inspected in Shodan (a search engine for IP addresses), the infecting IP address appears to serve odd Java bytecode encoded in hexadecimal. Once decoded, it appears to be a Java interface named Stage within the javapayload.stage package. This interface is presumably part of the Metasploit Framework’s Java payload implementation.

Figure 13: The ‘Stage’ script
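For readers who want to repeat this kind of triage on a hex-encoded banner, the sketch below decodes the hex text, checks for the Java class file magic number (0xCAFEBABE), and lists printable strings such as class or package names. The function names are hypothetical, and the input in the test is a synthetic stand-in, not the data observed on the attacker's host.

```python
import re

JAVA_CLASS_MAGIC = bytes.fromhex("cafebabe")


def decode_hex_dump(text: str) -> bytes:
    """Decode hex text into bytes (whitespace between bytes is ignored)."""
    return bytes.fromhex(text)


def is_java_class(data: bytes) -> bool:
    """Check whether the data starts with the Java class file magic number."""
    return data[:4] == JAVA_CLASS_MAGIC


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [match.decode("ascii") for match in pattern.findall(data)]
```

In a compiled class, names are stored with slashes, so a type in the javapayload.stage package would show up in the strings output as javapayload/stage/ followed by the type name.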

As mentioned above, the threat actor is using a staging server, 185[.]196[.]9[.]190, and the domain ns1[.]disponibletogether[.]com, which resolves to the IP address of the staging server. This domain was registered on October 31st, 2023:

Figure 14: Whois screenshot indicating the domain was recently registered

Under the http[:]//185[.]196[.]9[.]190/srv/ path you can find the binaries dcd and dca, as well as the .so files, the crontab, and the download scripts dca.sh and dcd.sh. The two binaries, dca and dcd, appear to be very similar.

Expanding our search on this IP address linked it to two other IPs, 185[.]196[.]9[.]181 and 185[.]196[.]9[.]200. Further inspection of our data revealed that the threat actor used these IPs to attack our honeypots as well.

Mapping the Campaign to the MITRE ATT&CK Framework

Our investigation showed that the attackers have been using some common techniques throughout the campaign. However, the defense evasion tactics have evolved:

The following MITRE ATT&CK techniques and sub-techniques are relevant to this attack:

  1. Initial Access: Exploit Public-Facing Application (T1190). The attackers exploited a misconfiguration in the ResourceManager of Hadoop YARN, allowing unauthenticated users to create and run applications.
  2. Execution: Command and Scripting Interpreter: Unix Shell (T1059.004). The attacker executed arbitrary code through specially crafted HTTP requests.
  3. Persistence: Scheduled Task/Job: Cron (T1053.003). The threat actor deletes all cron jobs and creates a new cron job to establish persistence.
  4. Defense Evasion:
     - Obfuscated Files or Information: Software Packing (T1027.002). The threat actor uses the BGP packer to pack and obfuscate the main payload, the ‘dca’ ELF binary.
     - Obfuscated Files or Information: Stripped Payloads (T1027.008). Once unpacked, the ‘dca’ ELF binary was found to have its symbols and strings removed to make analysis more difficult.
     - Obfuscated Files or Information: Embedded Payloads (T1027.009). The Monero cryptominer is embedded within the ‘dca’ ELF binary.
     - File and Directory Permissions Modification: Linux File and Directory Permission Modification (T1222.002). Permissions are modified so that the payload can be executed.
     - Rootkit (T1014). The threat actor uses two Processhider rootkits to hide the cryptominer (tmp) and all sh commands.
  5. Discovery: System Information Discovery (T1082). Initially, the threat actor performs vCore (virtual CPU core) and memory discovery.
  6. Impact: Resource Hijacking (T1496). The deployment of a Monero cryptominer suggests resource hijacking for cryptocurrency mining.

Summary of key points

In this blog post we reviewed a cyberattack targeting Apache Hadoop and Flink servers in containerized environments. Here’s a summary of the key points:

  1. Attack Target and Method: The attack focuses on Apache Hadoop, an open-source framework for distributed storage and processing. Specifically, it exploits a misconfiguration in Hadoop’s YARN ResourceManager, which allows unauthenticated users to create and run applications.
  2. Attack Flow: The attackers send an unauthenticated request to deploy a new application on Hadoop, followed by a POST request to execute arbitrary code. This misconfiguration is not new and has been previously reported.
  3. Malware Deployment: The primary payload involves a binary named ‘dca’, which is downloaded and executed. This binary further downloads two other binaries (rootkits) and contains a Monero cryptominer which is written to disk.
  4. Defense Evasion Techniques: The attack employs sophisticated evasion techniques, including the use of packed ELF binaries and rootkits that are undetected by regular security solutions. The malware deletes contents of specific directories and modifies system configurations to evade detection.
  5. Persistence Mechanisms: Persistence is achieved by manipulating cron jobs to download and execute a script that deploys the ‘dca’ binary.
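  6. Threat Actor’s Infrastructure: The infrastructure used by the attackers includes a scanning IP address (20[.]150[.]209[.]84), a staging server (185[.]196[.]9[.]190) with related download servers, and the domain ns1[.]disponibletogether[.]com.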
  7. Detection and Recommendations: Agent-based runtime solutions that detect suspicious or malicious behavior are a good way to detect cryptominers, rootkits, obfuscated or packed binaries, as well as container drift. Aqua’s customers who deployed our CNAPP agent-based runtime solution are protected from these kinds of attacks.

Indications of Compromise (IOCs)

Type | Value | Comment
IP address | 20[.]150[.]209[.]84 | Scan and infect IP address (Hadoop)
IP address | 185[.]196[.]9[.]181 | Download server (Hadoop)
IP address | 185[.]196[.]9[.]190 | Download server (Hadoop)
IP address | 185[.]196[.]9[.]200 | Download server (Hadoop)
IP address | 185[.]196[.]9[.]5 | Scan and infect IP address (Flink)
IP address | 185[.]196[.]9[.]7 | Download server (Flink)
IP address | 185[.]196[.]9[.]8 | Download server (Flink)
Domain | ns1[.]disponibletogether[.]com | Mining address
File (ASCII) | Name: dca.sh, MD5: 58794e43c039fe20281bf0777721c8ce | dca binary download shell script
File (ASCII) | Name: dcd.sh, MD5: 94e0f679758facf683a217774e29c2b2 | dcd binary download shell script
File (ELF) | Name: dca, MD5: 901ac649b47e0261d88f568f02c90412 | Main payload, dca malware
File (ELF) | Name: dcd, MD5: cebadcafee4ed6a69c64ab08496163d7 | Main payload, dcd malware
File (ELF) | Name: tmp, MD5: d37e385f2fa64173c44b001b40ce48a3 | XMRIG cryptominer
File (SO) | Name: pthread.so, MD5: 0a100f6a07e7fd611553ef7c42f37f5a | Processhider rootkit
File (SO) | Name: initrc.so, MD5: 38d898459a3f530e2db083e1bb1e1524 | Processhider rootkit
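
The indicators listed above can be loaded into a small matcher for triaging files and network logs. This is a minimal sketch: the hashes, IP addresses, and domain are copied from the list, while the helper names and short labels are our own illustrative choices. Indicators are stored defanged and converted back ("refanged") only in memory.

```python
import hashlib

IOC_MD5 = {
    "58794e43c039fe20281bf0777721c8ce": "dca.sh download script",
    "94e0f679758facf683a217774e29c2b2": "dcd.sh download script",
    "901ac649b47e0261d88f568f02c90412": "dca main payload",
    "cebadcafee4ed6a69c64ab08496163d7": "dcd main payload",
    "d37e385f2fa64173c44b001b40ce48a3": "XMRIG cryptominer",
    "0a100f6a07e7fd611553ef7c42f37f5a": "Processhider rootkit",
    "38d898459a3f530e2db083e1bb1e1524": "Processhider rootkit",
}

DEFANGED_HOSTS = [
    "20[.]150[.]209[.]84", "185[.]196[.]9[.]181", "185[.]196[.]9[.]190",
    "185[.]196[.]9[.]200", "185[.]196[.]9[.]5", "185[.]196[.]9[.]7",
    "185[.]196[.]9[.]8", "ns1[.]disponibletogether[.]com",
]


def refang(indicator: str) -> str:
    """Turn a defanged indicator back into its real form."""
    return indicator.replace("[.]", ".").replace("[:]", ":")


IOC_HOSTS = {refang(host) for host in DEFANGED_HOSTS}


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def match_file(data: bytes):
    """Return the IOC label for file contents, or None if unknown."""
    return IOC_MD5.get(md5_of(data))


def match_host(host: str) -> bool:
    """Check whether a hostname or IP address is a known indicator."""
    return host.strip().lower() in IOC_HOSTS
```

For example, match_file(open(path, "rb").read()) can be run over files found in /tmp and /var/tmp, and match_host over destination addresses from flow or DNS logs. A hash match only covers these exact samples, so it complements rather than replaces behavioral detection.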

Source: https://blog.aquasec.com/threat-alert-apache-applications-targeted-by-stealthy-attacker