Summary: Researchers from NTT Security and the University of Tokyo have developed a technique called “Bytecode Jiu-Jitsu” that allows attackers to insert malicious bytecode into the memory of interpreters like VBScript and Python, evading detection by most security software. This method exploits the way interpreters execute code, highlighting a significant vulnerability in current security measures against bytecode attacks.
Threat Actor: NTT Security Holdings Corp. and University of Tokyo Researchers
Victim: Software Interpreters
Key Points:
- Bytecode Jiu-Jitsu allows attackers to insert malicious commands into the bytecode held in memory, avoiding detection by security tools that typically scan source code.
- The technique takes advantage of the interpreter’s execution model: bytecode is data that the interpreter reads, so it needs no memory with execution privileges, and attackers can skip the suspicious steps that native code injection requires.
- Existing defenses, such as pointer checksums, may not be effective against this method, prompting the need for stricter memory write protections in interpreters.
- This research serves as a warning to security professionals about the evolving landscape of bytecode attacks and the necessity for improved defenses.
Attackers can hide their attempts to execute malicious code by inserting commands into the bytecode that software interpreters for many programming languages, such as VBScript and Python, hold in memory, a group of Japanese researchers will demonstrate at next week’s Black Hat USA conference.
Interpreters take human-readable software code and translate each line into bytecode — granular programming instructions understood by the underlying, often virtual, machine. The research team successfully inserted malicious instructions into the bytecode held in memory prior to execution, and because most security software does not scan bytecode, their changes escaped detection.
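To make the source-to-bytecode step concrete, here is a minimal sketch using CPython's standard `compile()` and `dis` facilities. The variable names and the one-line source string are illustrative assumptions, and the exact instructions emitted vary between Python versions.

```python
import dis

# Illustrative source text (an assumption for this sketch).
source = "total = price * quantity"

# compile() turns source text into a code object holding CPython bytecode.
code = compile(source, "<example>", "exec")

# co_code is the raw bytecode: a plain bytes object, not CPU instructions.
raw = code.co_code
print(type(raw).__name__, len(raw))

# dis decodes those bytes into named instructions for the Python virtual machine.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)
```

A tool that inspects only the source string never sees `raw`, which is the form the interpreter actually executes.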
The technique could allow attackers to hide their malicious activity from most endpoint security software. Researchers from NTT Security Holdings Corp. and the University of Tokyo will demonstrate the capability at Black Hat using the VBScript interpreter, says Toshinori Usui, research scientist with NTT Security. The researchers have already confirmed that the technique also works against running Python and Lua interpreter processes.
“Malware often hides its behavior by injecting malicious code into benign processes, but existing injection-type attacks have characteristic behaviors … which are easily detected by security products,” Usui says. “The interpreter does not care about overwriting by a remote process, so we can easily replace generated bytecode with our malicious code — it’s that feature we exploit.”
Bytecode attacks are not unprecedented, but they are still a relatively recent development. In 2018, a group of researchers from the University of California at Irvine published a paper, “Bytecode Corruption Attacks Are Real — And How to Defend Against Them,” introducing bytecode attacks and defenses. Last year, the administrators of the Python Package Index (PyPI) removed a malicious package, known as fshec2, which escaped initial detection because all its malicious code was compiled as bytecode. Python stores compiled bytecode in PYC files, which the Python interpreter can execute directly.
“It may be the first supply chain attack to take advantage of the fact that Python byte code (PYC) files can be directly executed, and it comes amid a spike in malicious submissions to the Python Package Index,” Karlo Zanki, reverse engineer at ReversingLabs, said in a June 2023 analysis of the incident. “If so, it poses yet another supply chain risk going forward, since this type of attack is likely to be missed by most security tools, which only scan Python source code (PY) files.”
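The point about PYC files being directly executable can be illustrated with a harmless sketch. A PYC file is essentially a short header followed by a code object serialized with the standard `marshal` module; the example below skips the file and header and works with the serialized bytes in memory. The source string and variable names are assumptions chosen for illustration.

```python
import marshal

# Benign stand-in for a module's source (an assumption for this sketch).
source = "result = sum(range(5))"
code = compile(source, "<example>", "exec")

# Serialize the code object the same way a .pyc body is stored.
payload = marshal.dumps(code)

# From here on only the bytes are needed; no source text is involved.
restored = marshal.loads(payload)
namespace = {}
exec(restored, namespace)
print(namespace["result"])  # 10
```

Nothing in `payload` is readable Python source, which is why a scanner limited to PY files would have nothing to inspect.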
Going Beyond Precompiled Malware
After an initial compromise, attackers have a few options to expand their control of a targeted system: They can perform reconnaissance, try to further compromise the system using malware, or run tools already existing on the system — the so-called strategy of “living off the land.”
The NTT researchers’ variation of bytecode attack techniques essentially falls into the last category. Rather than using pre-compiled bytecode files, their attack — dubbed Bytecode Jiu-Jitsu — involves inserting malicious bytecode into the memory space of a running interpreter. Because most security tools do not look at bytecode in memory, the attack is able to hide the malicious commands from inspection.
The approach allows attackers to skip other, more obviously malicious steps, such as calling suspicious APIs to create threads, allocating executable memory, and modifying instruction pointers, Usui says.
“While native code has instructions directly executed by the CPU, bytecode is just data to the CPU and is interpreted and executed by the interpreter,” he says. “Therefore, unlike native code, bytecode does not require execution privilege, [and our technique] does not need to prepare a memory region with execution privilege.”
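Usui's distinction between native code and bytecode can be shown with a toy stack machine. This is a hypothetical four-opcode interpreter invented for illustration, not the VBScript, Python, or Lua virtual machine, and it does not reproduce the researchers' technique. It shows only that the program is ordinary data in a writable buffer, so changing one byte changes what the interpreter does.

```python
# Hypothetical opcodes for a toy virtual machine (assumptions for this sketch).
PUSH, ADD, MUL, HALT = 0, 1, 2, 3

def run(program):
    """Interpret a toy bytecode program and return the top of the stack."""
    stack = []
    pc = 0
    while True:
        op = program[pc]
        if op == PUSH:
            stack.append(program[pc + 1])  # operand is the next byte
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")

# The "bytecode" lives in an ordinary writable buffer.
program = bytearray([PUSH, 6, PUSH, 7, ADD, HALT])
before = run(program)  # 6 + 7

# Overwriting one byte of data changes what the interpreter does next time.
program[4] = MUL
after = run(program)   # 6 * 7
print(before, after)   # 13 42
```

At no point does the buffer need execute permission: the CPU only ever runs the `run` function, which reads the bytes as data.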
Better Interpreter Defenses
Interpreter developers, security-tool developers, and operating-system architects can all have some impact on the problem. Attacks targeting bytecode do not exploit vulnerabilities in interpreters but rather the way they execute code. Even so, certain security modifications, such as pointer checksums, could mitigate the risk, according to the UC Irvine paper.
The NTT Security researchers noted that checksum defenses would likely not be effective against their technique and recommended that developers enforce memory write protections to help eliminate the risk. “The ultimate countermeasure is to restrict the memory write to the interpreter,” Usui says.
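The idea of restricting writes can be sketched as a language-level analogy. The example below is an assumption-laden illustration, not the researchers' proposal or an operating-system mechanism: it uses Python's standard `memoryview.toreadonly()` to hand out a read-only view of a hypothetical toy bytecode buffer, so an attempted in-place patch is refused.

```python
# Hypothetical toy bytecode buffer (values are illustrative only).
program = bytearray([0, 6, 0, 7, 1, 3])

# Hand out a read-only view instead of the writable buffer.
locked = memoryview(program).toreadonly()

try:
    locked[4] = 2      # attempted in-place patch through the view
    patched = True
except TypeError:
    patched = False    # the write is refused

print(patched, bytes(locked) == bytes(program))  # False True
```

The analogy is deliberately limited: the original `bytearray` is still writable by anyone who holds it, and a remote process writing into the interpreter's memory bypasses language-level checks entirely. Protection of the kind Usui describes would have to be enforced on the memory itself by the interpreter or the operating system.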
The purpose of presenting a new attack technique is to show security researchers and defenders what could be possible, and not to inform attackers’ tactics, he stresses. “Our goal is not to abuse defensive tactics, but to ultimately be an alarm bell for security researchers around the world,” he says.