Maldocs: Word and Excel’s Ageless Vigor

Research by: Raman Ladutska

We chose a fantasy decoration style at certain points of the article to attract attention to the described problem. We hope that visualizing a fantasy adventure as a fight against the source of evil will transform the real world and make it a safer and better place.

Figure 1 – The Title Page

Chasing new exploits, vulnerabilities, and threats is the way to go in the ever-changing cybercrime landscape. However, in a constant flow of information, the focus on yesterday’s highlights is low: every day, new CVEs occur, and new threats emerge. With this state of affairs, old menaces can be easily overlooked and still used by the attackers, even considering the exploits’ venerable age.

In this article, we focus on three old and well-known CVEs exploited through Microsoft Word and Microsoft Excel documents: CVE-2017-11882, CVE-2017-0199, and CVE-2018-0802.

While not being 0-day or even 1-day exploits, they still threaten the cyber community. As painfully showcased by the infamous WannaCry attack in 2017, which leveraged the EternalBlue exploit, N-day exploits can be equally successful in causing massive damage.

In our research, we show the statistics on attacked industries and countries and highlight the payloads – many of them are in the top prevalent malware lists – delivered by maldocs. We investigate lures used in different attack campaigns and describe several tricks that can help maldocs fool automated sandboxes, even though the CVEs used are well-known and well-aged. We do not provide an extensive technical analysis in the article, and instead share links for corresponding resources where this analysis was made 5 or more years ago.

We gathered the data to summarize the state of affairs in these 3 CVEs as we saw quite a few correlations in their usage among different malware actors and campaigns. Similarities include lure topics to trick the users into opening the document and technical tricks to hide the maldocs’ malicious intentions – to name a few.

While the threat is decreasing, it is not eliminated. More than 13,000 samples that use these old CVEs are lurking in the wild in 2023. Different formats (DOC(X), XLS(X), RTF) and tricks are used, all with the same purpose: to lure the victim into clicking and to let the subsequent malware spread.

Over their rich history, maldocs exploiting the described CVEs have been used to spread many infamous malware families, including Dridex in 2017 (CVE-2017-0199), Guloader in 2021 (CVE-2017-11882), and others. The situation did not change in 2023: some notable additions to the delivered payloads were detected, including samples used by Gamaredon APT, Agent Tesla, and Formbook/Xloader. The sectors targeted by the operators behind maldocs are highly profitable spheres such as Finance/Banking, Governmental, and Healthcare.

Now, as the scene is set and the introduction is made, let us take a look at what malware is spread with the help of maldocs in 2023.

Still used by top cybercrime gangs?

As we have already mentioned, quite a few prominent malware families were spread via maldocs in 2023. We provide some of the hashes linked with such attacks so that the reader can investigate further.

Among the most notable are samples used in Gamaredon APT operations. Gamaredon APT is an infamous Russian state-sponsored hacking group.

Figure 2 – Connection of the maldoc exploiting CVE-2017-0199 with Gamaredon APT

Another prominent malware family is Agent Tesla, which topped the most prevalent malware list in October 2022:

Figure 3 – Connection of the maldoc exploiting CVE-2017-11882 with Agent Tesla

0fd5e881a9ed54f69c35f9db17c4ea12fc7c10500b339a7fa11a695b4019954c

Another malware family caught being spread by maldocs is GuLoader. GuLoader is a prominent shellcode-based downloader that has been used in a large number of attacks to deliver a wide range of the “most wanted” malware.

Figure 4 – Connection of the maldoc exploiting CVE-2017-0199 with GuLoader

aac88dbc105d5dcc83b431181c093c752ab9189dcc47576f8e0d961eb3c0c044

Another example is Formbook. Formbook is an infostealer malware that was first discovered in 2016. It steals various types of data from infected systems, including credentials cached in web browsers, screenshots, and keystrokes. It also can act as a downloader, enabling it to download and execute additional malicious files.

Figure 5 – Connection of the maldoc exploiting CVE-2017-11882 with FormBook

f211a5b6b757111a8094e290bf015ead9ebe8d79646a44684e9d9b88b0f68e52

We didn’t list each and every malware example delivered via maldocs, but the general picture is noted here: as we see, old CVEs in maldocs are used by seasoned players in the malware industry, rather than amateurs.

You may suspect that spreading such serious malware threats should not be a problem for contemporary security solutions. The methodology of the 5-year-old spreading method must be well known, and this malware must be detected and stopped as early as possible. Well, things are a bit more complicated than this, as shown in the next chapter.

Detected at earliest stages?

Let us make it explicit: keeping a close watch on maldocs matters most because of their short activity time. For the majority of maldocs, the distribution stage is active for less than a week, and this is when the maximum damage is caused:

Figure 6 – Breakdown of the timeframes within which the maldoc distribution stage is active

With this information in hand, let us take a look at an example of a malicious document that has only 1 detection 6 days after being submitted for analysis:

Figure 7 – Low detection ratio of the maldoc exploiting CVE-2017-0199

Another example is the detection history of a maldoc that was initially detected by only 4 vendors; it took a day for the detection rate to become much more appealing:

Figure 8 – Detection rate history of one of the maldocs

As we see, even in modern systems with automated environments, certain samples can remain weakly detected, despite using a well-known exploitation method. Apparently, some documents are easier to detect as malicious than others. For example, maldocs can use remote templates or links without extensions, so that it is not obvious what the contacted site will return and what type of payload it will be. In this case, security solutions have a harder time stating explicitly that the document is malicious, even though the exploitation logic is the same.

You may now ask, “Should I be worried? Is my organization included in the high-risk list of the most obvious targets?” Let us explore these questions and see the breakdown of affected industries and countries.

Who should raise caution on these threats?

Whether it is a targeted attack or a mass-spread campaign, it has never been a more convenient time for choosing potential victims than this year. In 2023, we constantly see news about leaked emails, with one of the more recent examples shown below:

Figure 9 – News about leaked sensitive emails
Figure 10 – Forbes warns about data of Twitter users offered for free

We also managed to find a leaked 13GB database of emails, which is huge in itself:

Figure 11 – Leaked 13GB emails database available for download

As stated above, the sectors chosen by the operators behind maldocs are highly profitable spheres such as Finance/Banking, Governmental, and Healthcare, according to Check Point telemetry:

Figure 12 – Breakdown by attacked industries
Figure 13 – Breakdown by affected countries

There are 12 countries in the 15 top-3 places (Turkey and Russia appear in several rows). 5 of these 12 countries are among the 20 most technologically advanced countries in 2023, which means lucrative markets for the attackers to operate in. The spread of affected countries differs across the industries: it is more even for the “Healthcare” and “Other” spheres and more skewed in the other three industries. The choice of spheres promises financial gain for the attackers either directly (the most obvious choice, the Financial sphere) or in the form of future gains (think of ransomware attacks) for the Governmental and Healthcare spheres.

We cannot say for sure why exactly these countries made the list. There is a combination of factors to consider:

• What were the emails of the attacked organizations? Were they biased towards a particular country or organization?

• Who received the document to be opened? Is this person cybersecurity-aware or not?

• Are employees in one organization more cybersecurity-aware than in another?

We cannot calculate the exact values of probabilities for these questions, and what we get is similar to neural network output: some inputs are thrown, and the result is obtained. According to the statistics above, malware operators can be satisfied with their approach.

Malware actors are well aware of users’ natural curiosity and get creative when crafting what is called a “lure”: the collective term for the name, the look, and the contents of a malicious document. Below we discuss this important topic and show how cybercrime actors trick victims into opening malicious documents.

What do maldocs look like inside?

Maldocs’ lures come in all varieties and forms, and we list several examples in this chapter.

Let us consider the file named “robertozx.doc”, the hash is:
5cd806c0a528ca7ea6b3e2139c4c4165992d22610c50b0fecd47e08720835b4a

Inside we see badly formatted text which nevertheless tells the user that it is necessary to “enable editing” for this document:

Figure 14 – Badly formatted text inside the maldoc

Sometimes the name itself is enough, like in the case of “Calvin-Ellis-CV.docx”, and the document inside is completely empty. The hash is the following:

59943c6c6f823b9fed47873c27db84710fd7b639698eca736af1b901c0f002b1

After opening, the maldoc gets straight to the business and tries to download a payload from the server:

Figure 15 – Web-link accessed right away after opening the maldoc
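One way such an immediately contacted link can be surfaced without opening the document is to list the relationships marked as external inside the DOCX package. The snippet below is a minimal sketch under the assumption that the link is stored as an OOXML relationship with TargetMode="External"; the function name external_targets is ours and purely illustrative, and real maldocs may hide the link elsewhere.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OOXML relationship parts (*.rels).
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_targets(docx_bytes):
    """Return (part_name, target) pairs for relationships marked External."""
    found = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(archive.read(name))
            for rel in root.iter(RELS_NS + "Relationship"):
                # Internal parts have no TargetMode; remote ones say "External".
                if rel.get("TargetMode") == "External":
                    found.append((name, rel.get("Target")))
    return found
```

The function only reads the package directory and the small .rels parts, so nothing is rendered or executed while the targets are collected.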

In other cases, lures contain text relevant to the file name, like in the case of “INDIAN_STATE_SPONSOR_OF_TERRORISM__DESTABILIZATION_IN_PAKISTAN.docx”, with the hash: 66a9b9955fa7240b45137d09dc265306ae751541de510cd9f4288f1a9972b02c

Figure 16 – Text inside matching the name of the document – about the conflict of India and Pakistan

Another example shows a quite humble approach with a screenshot asking the user to grant the full permissions for the document by clicking the “Enable Editing” button. The name is neutral, “facility_Request_Order.docx”, and the hash of the maldoc is: f28cb523ca32452c2efdb1cbe1c921ab0388a158b80661e65b08c9951c674c1f

Figure 17 – The maldoc with a statement of “alleged” protection needed to be removed

Now, as we made an overview of how maldocs present themselves to the victims – from different angles – let us dive deeper into technical examination and investigate certain techniques used in the malicious documents to complicate their analysis by researchers and automated labs.

Technical tricks

Below, we describe several techniques that help maldoc operators increase the chances of their creations flying under the radar of security solutions. These techniques can be used in any combination as the maldoc operators see fit.

Encryption

Excel maldocs can come encrypted to make the analysis more difficult:

Figure 18 – Encrypted content inside password-protected Excel maldoc

The encryption/decryption is performed with the help of MS Enhanced RSA and AES crypto-provider:

Figure 19 – Microsoft crypto-provider used for protection of password-protected documents

Following the process of opening and decrypting the document, the password can be seen in plain text in the debugger:

Figure 20 – “VelvetSweatshop” password is seen inside the debugger

The use of the “VelvetSweatshop” password has been well known since at least 2006 (17 years already!):

Figure 21 – The “VelvetSweatshop” password feature has been known for 17 years already

It is not clear what the intention of adding this feature to Excel was. However, there is a clue as to why exactly this password was chosen. Back in 1989, the description “Velvet Sweatshop” was first used to describe the Microsoft company. This dubious characterization meant that the company provided a lot of benefits to its employees in exchange for their time spent at work.

MS Word documents have no such self-decrypting option. As we saw above, it is possible to create a password-protected Excel document that does not require password input on opening, by using “VelvetSweatshop” as the password. This way the document is protected from analysis, yet remains fully functional without any user interaction. The same scenario cannot be accomplished with Word: a password-protected Word document always requires the user to input a password.

Figure 22 – Password input form present upon Word document opening
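For triage, it can help to notice that a document is encrypted before any parser is run against it. The sketch below relies on the fact that an encrypted OOXML file (for example, .xlsx) is stored as an OLE compound file rather than as a plain ZIP package. It is a heuristic only: legacy .xls and .doc files are always OLE containers regardless of encryption, file extensions can be misleading, and the helper names are our own, not part of any tool.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 / Compound File Binary header
ZIP_MAGIC = b"PK\x03\x04"                       # plain (unencrypted) OOXML package


def container_kind(data):
    """Classify a file by its leading magic bytes."""
    if data.startswith(OLE_MAGIC):
        return "ole"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def looks_encrypted_ooxml(filename, data):
    """Heuristic: an OOXML extension on top of an OLE container."""
    ooxml_ext = filename.lower().endswith((".xlsx", ".xlsm", ".docx", ".docm"))
    return ooxml_ext and container_kind(data) == "ole"
```

A positive result only says that the package is wrapped in an encrypted container; whether the default “VelvetSweatshop” password opens it still has to be checked with a decryption tool.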

Peculiar URLs

Some URLs inside malicious documents (most often seen with CVE-2017-0199) come in a peculiar format: http://<junk>@<digital_ip>/<long_path>.<ext>

An example of such a URL is shown below:

Figure 23 – Peculiar URL extracted from the maldoc

However, the presence of @ is perfectly valid in the URL: https://stackoverflow.com/questions/19509028/can-i-use-an-at-symbol-inside-urls

In such cases, the part before the @ is treated as authentication data (a username) that is sent with the request.

If such a URL is opened in a browser, there will be an explicit note on the authentication data sent:

Figure 24 – A warning shown when visiting site with a username

This warning does not block anything: once it is dismissed, the file behind the link is accessed anyway:

Figure 25 – The document can be downloaded without any issues, even when accessed with a username

Of course, no authentication warnings are shown when the URL is accessed without a GUI.

It’s important to note the unusual form of the IP addresses in these URLs: they come without dots, as a single decimal number. A Python snippet to convert such IP addresses to their more readable dotted form is provided below:

import ipaddress

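def convert_digital_ip(digital_ip):
    # Interpret the dotless string as one integer and let ipaddress format it.
    try:
        return ipaddress.ip_address(int(digital_ip))
    except ValueError:
        # Not a number, or outside the valid address range.
        return None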
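The same idea can be applied to a whole URL of the http://<junk>@<digital_ip>/... form. The following is a minimal sketch using only the standard library; the function name and the sample URL are hypothetical, and the address used is from a range reserved for documentation.

```python
import ipaddress
from urllib.parse import urlsplit


def normalize_peculiar_url(url):
    """Return (userinfo, host) with a dotless numeric host rewritten in dotted form."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.isdigit():
        try:
            # A purely numeric host is read as a 32-bit integer IPv4 address.
            host = str(ipaddress.IPv4Address(int(host)))
        except ValueError:
            pass  # not a valid IPv4 integer: keep the host unchanged
    return parts.username, host


# Hypothetical sample: the junk before "@" is reported as the username.
print(normalize_peculiar_url("http://junk@3221225985/some/long/path.doc"))
# -> ('junk', '192.0.2.1')
```

Separating the junk from the host this way makes it easier to match the URL against IP-based indicators.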

Shellcode with junk

Shellcodes inside maldocs can be beasts of their own to analyze. Their logic is obfuscated and scattered among junk instructions and spaghetti jumps across the code. The key part of the code is stored encrypted and is decrypted by the shellcode itself. Let us take the example of this hash:

34b82dfffb003d39b09dc4c071432c17145165ece3a0ae193c564c7d0a2ab550

The structure of the shellcode can be envisioned in the following image:

Figure 26 – Junk and encrypted instructions inside the shellcode

Automated analysis of this code can be complicated, as a state machine must work in conjunction with a disassembler to decide whether the current instruction belongs to the execution flow or not. In the end, the logic of the shellcode boils down to these few instructions:

0A66 call sub_B9A
0B9A pop  ebp             <- get base value for encrypted code start (0xA6B)
0BBA add  ebp, 1C4h       <- add some value to the start base (0xC2F)
0BD2 lea  edx, [ebp+2A3h] <- get the end of the encrypted block (0xED2)
; perform decryption routine
0B3D imul eax, 0
0BFB imul eax, 122222ADh
0B42 add  eax, 49260C1Fh
0C08 xor  [ebp+0], eax
; advance the pointer and check if the last block is reached
0BF3 add  ebp, 4
0C2B cmp  ebp, edx
0C2D jb   short loc_BFB
; continue execution from now decrypted code
0C2F sahf

The accessed URL in the example is this one:

http[://107.172.73.137/abc/loader5.exe
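The decryption loop from the listing above can be sketched in Python as a rolling 32-bit key XORed over the encrypted block, one dword at a time. The sketch assumes that the listing order reflects the execution order and that eax is not modified anywhere else inside the loop; the function name is ours. The original loop always works on whole dwords, while the sketch simply truncates the key for a short final chunk.

```python
import struct

MULTIPLIER = 0x122222AD  # constant from "imul eax, 122222ADh"
INCREMENT = 0x49260C1F   # constant from "add eax, 49260C1Fh"
MASK32 = 0xFFFFFFFF


def lcg_xor(block):
    """XOR each little-endian dword of `block` with a rolling 32-bit key."""
    key = 0  # "imul eax, 0" clears the key before the loop
    out = bytearray()
    for offset in range(0, len(block), 4):
        # Update the key exactly as the imul/add pair does, modulo 2**32.
        key = (key * MULTIPLIER + INCREMENT) & MASK32
        chunk = block[offset:offset + 4]
        # Apply only as many key bytes as the (possibly short) final chunk has.
        key_bytes = struct.pack("<I", key)[:len(chunk)]
        out.extend(a ^ b for a, b in zip(chunk, key_bytes))
    return bytes(out)
```

Because the key stream does not depend on the data, the same function both encrypts and decrypts; feeding it the bytes between the start (0xC2F) and the end (0xED2) of the encrypted block should yield the decrypted code if the assumptions hold.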

Enormous oleObject

Let us examine the document with the following hash:

b5296c6e715e656b052ad5fbf0687610519916aa96ea4005be3f3cd2117273a7

From the outside, it looks completely ordinary, just a 2MB maldoc:

Figure 27 – Maldoc of the usual size

Things get more exciting when this document is unpacked and what we see inside is a huge oleObject of 2GB size:

Figure 28 – Huge OLE object inside the maldoc

Extracting the objects directly with dedicated tools such as the “oletools” package may fail or be slow, so the trick here is to unzip the document manually and analyze its parts (like the VBA macro) from there.
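Before unzipping anything, the declared sizes of the archive members can be inspected to spot such an inflated object. Below is a minimal sketch based on the standard zipfile module; the thresholds and the function name are arbitrary assumptions chosen for illustration, not values from our tooling.

```python
import io
import zipfile


def oversized_members(doc_bytes, max_size=100 * 1024 * 1024, max_ratio=100):
    """List archive members whose declared size or compression ratio looks abnormal."""
    suspicious = []
    with zipfile.ZipFile(io.BytesIO(doc_bytes)) as archive:
        for info in archive.infolist():
            # Sizes come from the ZIP directory, so nothing is decompressed here.
            ratio = info.file_size / max(info.compress_size, 1)
            if info.file_size > max_size or ratio > max_ratio:
                suspicious.append((info.filename, info.file_size, info.compress_size))
    return suspicious
```

Members flagged this way can be skipped or streamed in chunks, while the small parts of the package are extracted and analyzed as usual.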

A huge number of meaningless bytes is present inside:

Figure 29 – Meaningless bytes inside huge OLE object

Besides this, there is an “Equation Native” object inside exactly this big “oleObject3.bin” container:

Figure 30 – “Equation Native” part inside the OLE Object

Looking inside this equation object, we see a simple command:

Figure 31 – A command inside the equation object

This command is a part of an execution chain which is related to the obfuscated VBA macro:

Figure 32 – Obfuscated VBA macro inside the maldoc

After the decryption we see an embedded URL:

Figure 33 – Decrypted URL used in the maldoc

We did not manage to attribute this particular maldoc to a specific campaign, as the URL returned a 404 status when it was spotted for the first time:

Figure 34 – Accessed site was down at the time of analyzing the maldoc

Although we have no direct proof of this particular example being malicious, it’s hard to imagine why this document could be made for legitimate reasons and humble intentions. The maldoc crafted in such a way can be a real burden for the automated environments to take care of and the analysis may fail completely.

Obfuscated VBA macros

The list of tricks used in maldocs would not be complete without mentioning obfuscated VBA macros. An example was seen in the previous chapter, “Enormous oleObject”. There is no one-size-fits-all method of dealing with different types of obfuscation: sometimes the olevba tool is enough; sometimes ViperMonkey can help; other times, manual processing is required.

Macros are not widely used together with these CVEs. However, some cases (like the example above) from recent years are present:

Figure 35 – Number of maldocs with macros used inside

Conclusion

When a disaster notoriously perpetuated in historical books appears, the most logical reaction from humanity is to take measures against repeating the same event in the future. Sometimes this approach works, and sometimes the works of eternal classics like Homer, Shakespeare, or Pushkin remain bestsellers across generations – because history repeats itself in a spiral, and what once was an issue for ancestors is once again the issue for descendants.

Let us also add the reality of a modern, fast-paced world where certain bits of information can be easily overlooked, and we get a situation that carries over smoothly to the cybersecurity area as well. New threats appear daily. However, some old ones (as old as 5-6 years) stay relevant and come in huge numbers in the wild, helping to deliver some of the most prevalent malware families to date. The tricks used in maldocs, aimed at both humans and machines, can significantly increase the chances of fetching the subsequent payload and launching the whole infection chain.

Check Point Research raises a red flag of warning and would like to remind you of several simple rules of thumb when it comes to dealing with Microsoft Office documents:

  • Keep the OS and used applications up-to-date
  • Do not open links in unexpected emails from unknown senders
  • Raise cybersecurity awareness across employees
  • Consult with a security specialist in case of any uncertainty: an incident is better prevented than cured

Protections

Check Point customers remain protected against the threat described in this research.

Check Point Threat Emulation and Harmony Endpoint provide comprehensive coverage of attack tactics, file-types, and operating systems and protect against the type of attacks and threats described in this report.

Against CVE-2017-11882:

  • RTF.CVE-2017-11882.gen.TC.*
  • Win32.CVE-2017-11882.TC.*
  • HEUR:Exploit.MSOffice.CVE-2017-11882..TC.

Against CVE-2017-0199:

  • MSOffice.CVE-2017-0199..TC.
  • RTF.CVE-2017-0199..TC.
  • Win32.CVE-2017-0199.TC.*
  • HEUR:Exploit.MSOffice.CVE-2017-0199.gen.TC.*
  • Wins.Maldoc_cve-2017-0199.*

Against CVE-2018-0802:

  • MSOffice.CVE-2018-0802.gen.TC.*
  • RTF.CVE-2018-0802.gen.TC.*
  • Win32.CVE-2018-0802.TC.*
  • HEUR:Exploit.MSOffice.CVE-2018-0802.gen.TC.*

Resources

CVE-2017-11882:

CVE-2017-0199:

CVE-2018-0802:

Spreading history:

Spreading Agent Tesla in 2023:

About Excel encryption:

Dotless URLs:

Graphics:

The post Maldocs of Word and Excel: Vigor of the Ages appeared first on Check Point Research.