How to Create F.L.I.R.T Signature Using Yara Rules for Static Analysis of ELF Malware – JPCERT/CC Eyes

ELF malware is often stripped of its symbol information when it is built. Because the function names are then unknown, analysts must spend extra effort identifying each function. In addition, the existing F.L.I.R.T signatures [1] (hereafter abbreviated as FLIRT signatures in this article) in IDA, an analysis tool, often do not apply to the functions in ELF malware, which makes analysis difficult when the right signatures cannot be found.

This blog article describes how to identify function names using Yara rules. It also explains how to use “AutoYara4FLIRT,” an IDA script that automatically generates Yara rules for this method. The tool is available on GitHub, and you can download and use it from the following webpage.

JPCERTCC/AutoYara4FLIRT – GitHub
https://github.com/JPCERTCC/AutoYara4FLIRT

Problems in Static Analysis of ELF Malware

When a Linux program is built, its libraries are linked and a single ELF binary is generated, as shown in Figure 1. Information such as function names is generally stored in sections such as .symtab and .strtab. ELF malware is commonly built with the strip option, which removes these sections. As a result, the original function names are lost, including those of every statically linked library, and analysis takes much more time.

Figure 1: Static links and symbol information in ELF binary

Difficulty in finding the right FLIRT signatures

FLIRT signatures are a feature that allows IDA to automatically identify known libraries and their function names. Without FLIRT signatures, malware analysts who encounter a complex function like the one shown in Figure 2 have a hard time understanding its behavior. If they know in advance that the function is memcpy, however, no such analysis is needed. With a FLIRT signature that includes the memcpy function, IDA can identify the function name, which saves analysis time.

Figure 2: memcpy function

The FLIRT signatures that IDA includes by default do not match in most cases, so a suitable FLIRT signature must be found or created. However, finding one is very difficult because a FLIRT signature fails to match when the compile conditions or the library version differ. In the traditional method of creating FLIRT signatures, shown in Figure 3, the libraries and their versions, the compiler version, and the compile conditions all affect the ELF binary that is generated. For this reason, it is difficult to find the right FLIRT signature without having signatures for many different conditions.

Figure 3: Traditional method of creating FLIRT signatures in IDA

How to create FLIRT signatures using Yara rules

Figure 4 shows an overview of this method. First, a Yara rule is created from the ELF malware. The rule is designed to search for ELF binaries that still contain symbol information. Next, the rule is run with Retrohunt [2] on VirusTotal, and a FLIRT signature (.sig) is created from the ELF binary that is obtained. Because that binary was found by a rule based on the target ELF malware, the resulting FLIRT signature is more likely to match the malware.

Figure 4: Overview of IDA’s signature creation method using Yara rules

In addition, to search for ELF binaries with remaining symbol information, the following condition needs to be added to the Yara rule.

import "elf"

rule Template
{
    condition:
        // At least two sections must be a .symtab or .strtab section,
        // i.e. the binary still contains symbol information.
        for 2 i in (0 .. elf.number_of_sections) : (
            ((elf.sections[i].name == ".symtab") and (elf.sections[i].type == elf.SHT_SYMTAB))
         or ((elf.sections[i].name == ".strtab") and (elf.sections[i].type == elf.SHT_STRTAB))
        )
        and
        // No section may be a .dynamic section.
        not (
            for 1 i in (0 .. elf.number_of_sections) : (
                (elf.sections[i].name == ".dynamic") and (elf.sections[i].type == elf.SHT_DYNAMIC)
            )
        )
}
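To make the logic of this condition easier to follow, the same check can be written as a short Python sketch. It is only an illustration and not part of AutoYara4FLIRT: the function name `has_symbols_and_no_dynamic` and the list of `(name, type)` tuples are assumptions made for this example, and the sketch does not parse ELF files itself. The numeric values are the standard ELF section type constants.

```python
# Standard ELF section type constants (values from the ELF specification).
SHT_SYMTAB = 2
SHT_STRTAB = 3
SHT_DYNAMIC = 6


def has_symbols_and_no_dynamic(sections):
    """Mirror the Yara condition for a list of (name, type) tuples.

    `sections` is a hypothetical, already-parsed section list.
    """
    # "for 2 i in (...)": at least two sections must be .symtab or .strtab.
    symbol_sections = sum(
        1
        for name, sh_type in sections
        if (name == ".symtab" and sh_type == SHT_SYMTAB)
        or (name == ".strtab" and sh_type == SHT_STRTAB)
    )
    # "not (for 1 i in (...))": no .dynamic section may be present.
    has_dynamic = any(
        name == ".dynamic" and sh_type == SHT_DYNAMIC
        for name, sh_type in sections
    )
    return symbol_sections >= 2 and not has_dynamic


# Example section lists (made up for illustration).
unstripped = [(".text", 1), (".symtab", SHT_SYMTAB), (".strtab", SHT_STRTAB)]
stripped = [(".text", 1), (".shstrtab", SHT_STRTAB)]
print(has_symbols_and_no_dynamic(unstripped))  # True
print(has_symbols_and_no_dynamic(stripped))    # False
```

A stripped binary fails the first part of the check because its remaining string table is named .shstrtab, not .strtab.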

Evaluation results

We applied this method to 50 ELF malware samples each for x86 and ARM and used the created FLIRT signatures to resolve function names. On average, the FLIRT signatures resolved about 60% of the functions in the x86 ELF malware and about 30% of those in the ARM ELF malware. Although there were some exceptional x86 samples for which the method did not work well, the match rate was about 60% for most samples, and for some it exceeded 90%. The match rate for ARM ELF malware was lower, which indicates that the method needs to be improved. Refer to Tables 1 and 2 in the Appendix for details of the results. In those tables, Total functions is the number of all functions in the target sample, and Matched functions is the number of functions matched by the FLIRT signatures created from the ELF binaries found with Retrohunt.
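The match ratio of each sample in the Appendix tables is the number of matched functions divided by the total number of functions. The following minimal Python sketch reproduces that calculation for three rows of Table 1. The function name `match_ratio` and the rounding to one decimal place are assumptions made for this example.

```python
def match_ratio(matched, total):
    """Return the match ratio in percent, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * matched / total, 1)


# Rows taken from Table 1 (x86): sample number -> (matched, total).
rows = {1: (82, 141), 2: (193, 3267), 13: (923, 1006)}

for number, (matched, total) in rows.items():
    print(number, match_ratio(matched, total))
# 1 58.2
# 2 5.9
# 13 91.7
```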

For sample selection, we obtained the 300 ELF malware samples most recently uploaded to MalwareBazaar [3] as of November 2022 and selected 50 relevant samples for each of the x86 and ARM architectures. The type of malware was therefore not considered in the selection, but the evaluated samples included mirai, gafgyt, TSCookie, Coinminer, and others. Please also note that the results are reference values, since they vary depending on whether ELF binaries containing the target libraries are available on VirusTotal.

AutoYara4FLIRT, an automatic Yara rule generator for creating FLIRT signatures

This method requires Yara rules to be created from ELF malware, and AutoYara4FLIRT generates them automatically. The tool is implemented as an IDA plugin. To use it, move AutoYara4FLIRT.py to IDA’s plugin folder and select AutoYara4FLIRT from the IDA plugin menu. A Yara file is generated in the same folder, and the results can be viewed in IDA. An example of executing AutoYara4FLIRT is shown in Figure 5.

Figure 5: Result of executing AutoYara4FLIRT

Normally, Yara rules are written by manually identifying strings and disassembled instruction sequences contained in the malware. AutoYara4FLIRT automatically extracts the instruction sequences needed for rule creation and generates the Yara rules from them. Because the extracted instruction sequences should cover multiple statically linked libraries, they are taken from several separate blocks of the binary, and the longest sequence among them is used in order to capture differences such as compile conditions. Running Retrohunt with the generated Yara rules finds ELF binaries that contain symbol information, which saves time and effort when creating FLIRT signatures.
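The idea of taking the longest instruction sequence from each of several separate blocks can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the actual AutoYara4FLIRT implementation: the helper names, the input format (a list of blocks, each holding candidate byte sequences), the `$code` string names, and the `any of them` condition are all hypothetical choices made for this example.

```python
def to_yara_hex(code):
    """Format raw bytes as a Yara hex string, e.g. { 55 89 E5 }."""
    return "{ " + " ".join(f"{byte:02X}" for byte in code) + " }"


def build_rule(rule_name, blocks):
    """Build Yara rule text with one hex string per non-empty block.

    `blocks` is a hypothetical input: a list of blocks, where each block
    is a list of candidate instruction byte sequences.
    """
    lines = [f"rule {rule_name}", "{", "    strings:"]
    index = 0
    for candidates in blocks:
        if not candidates:
            continue  # skip blocks without any candidate sequence
        # Keep only the longest candidate sequence of this block.
        longest = max(candidates, key=len)
        lines.append(f"        $code{index} = {to_yara_hex(longest)}")
        index += 1
    if index == 0:
        raise ValueError("no instruction sequences given")
    # Placeholder condition; the real tool may combine strings differently.
    lines += ["    condition:", "        any of them", "}"]
    return "\n".join(lines)


# Made-up byte sequences standing in for extracted instructions.
example_blocks = [
    [b"\x55\x89\xe5", b"\x90"],
    [b"\xc3", b"\x31\xc0\xc3\x90"],
]
print(build_rule("Sample", example_blocks))
```

In practice, the symbol-section condition shown earlier in this article would be combined with the string match so that only ELF binaries with remaining symbol information are returned.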

CLI automation tool to apply this method to multiple types of malware

To apply this method to multiple types of malware, you can use the following CLI tool to generate Yara rules for all of them at once. It can also generate FLIRT signatures (.sig) at once for multiple ELF binaries obtained through Retrohunt with those Yara rules. Refer to the following link for more information.

JPCERTCC/AutoYara4FLIRT#CLI_AutoYara – GitHub
https://github.com/JPCERTCC/AutoYara4FLIRT#CLI_AutoYara

Using services other than VirusTotal

This method assumes that Retrohunt is available, which requires a paid VirusTotal account. However, by using the following services, you can search for files by Yara rules for free.

In Closing

This method can be used to create FLIRT signatures for ELF malware that is difficult to analyze statically because existing FLIRT signatures are not applicable. In addition to x86 and ARM, the method can also be used for x86-64 and MIPS, so we recommend trying it on a variety of ELF malware that has been difficult to analyze.

Yuma Masubuchi

Translated by Takumi Nakano

References

[1] F.L.I.R.T
https://hex-rays.com/products/ida/tech/flirt/

[2] Retrohunt
https://support.virustotal.com/hc/en-us/articles/360001293377-Retrohunt

[3] MalwareBazaar
https://bazaar.abuse.ch/

Appendix

Table 1: Verification results of the method (x86)

No. Matched functions / Total functions (x86) Match ratio (x86) [%]
1 82 / 141 58.2
2 193 / 3267 5.9
3 87 / 152 57.2
4 79 / 138 57.2
5 87 / 145 60.0
6 84 / 148 56.8
7 93 / 201 46.3
8 168 / 273 61.5
9 85 / 153 55.6
10 82 / 155 52.9
11 83 / 147 56.5
12 1 / 183 0.5
13 923 / 1006 91.7
14 87 / 146 59.6
15 85 / 153 55.6
16 93 / 176 52.8
17 83 / 149 55.7
18 83 / 149 55.7
19 1 / 161 0.6
20 96 / 175 54.9
21 1 / 162 0.6
22 82 / 147 55.8
23 1 / 163 0.6
24 86 / 157 54.8
25 95 / 154 61.7
26 93 / 205 45.4
27 132 / 204 64.7
28 89 / 152 58.6
29 85 / 153 55.6
30 83 / 159 52.2
31 82 / 151 54.3
32 87 / 146 59.6
33 84 / 145 57.9
34 96 / 161 59.6
35 81 / 142 57.0
36 129 / 197 65.5
37 82 / 147 55.8
38 85 / 151 56.3
39 128 / 214 59.8
40 136 / 214 63.6
41 1 / 162 0.6
42 168 / 271 62.0
43 82 / 139 59.0
44 128 / 207 61.8
45 84 / 142 59.2
46 78 / 148 52.7
47 133 / 204 65.2
48 1010 / 1342 75.3
49 1009 / 1336 75.5
50 1010 / 1336 75.6
Average 62.5

Table 2: Verification results of the method (ARM)

No. Matched functions / Total functions (ARM) Match ratio (ARM) [%]
1 78 / 196 39.8
2 70 / 223 31.4
3 35 / 186 18.8
4 103 / 299 34.4
5 121 / 266 45.5
6 38 / 170 22.4
7 26 / 153 17.0
8 70 / 186 37.6
9 31 / 158 19.6
10 75 / 237 31.6
11 26 / 156 16.7
12 75 / 236 31.8
13 56 / 253 22.1
14 76 / 236 32.2
15 55 / 255 21.6
16 106 / 252 42.1
17 31 / 163 19.0
18 68 / 192 35.4
19 79 / 225 35.1
20 26 / 153 17.0
21 31 / 163 19.0
22 70 / 233 30.0
23 74 / 237 31.2
24 69 / 190 36.3
25 47 / 272 17.3
26 77 / 200 38.5
27 75 / 200 37.5
28 50 / 211 23.7
29 32 / 160 20.0
30 111 / 264 42.0
31 76 / 209 36.4
32 31 / 163 19.0
33 152 / 255 59.6
34 33 / 180 18.3
35 70 / 282 24.8
36 31 / 170 18.2
37 28 / 145 19.3
38 31 / 159 19.5
39 36 / 171 21.1
40 97 / 244 39.8
41 27 / 154 17.5
42 78 / 207 37.7
43 71 / 280 25.4
44 73 / 196 37.2
45 72 / 194 37.1
46 36 / 171 21.1
47 73 / 196 37.2
48 31 / 163 19.0
49 76 / 275 27.6
50 31 / 159 19.5
Average 28.4
