Beginning with macOS 10.12 (Sierra), Apple introduced a key change to how logging was done on their systems. This new logging system replaced common Unix logs with macOS Unified Logs. These logs can provide forensic investigators a valuable artifact to aid in investigating macOS systems or other Apple devices.
In this blog post, we give an overview of the Unified Logs and the challenges of using them during an investigation. Along with this blog post, we also released a tool called "macos-unifiedlogs" to help overcome some of the challenges in parsing log data, and we provide examples of how it can uncover vital information during an investigation.
What are the Unified Logs?
Before the Unified Logs, the primary log source for macOS systems was the Apple System Logs (ASL) and other plaintext logs residing on the endpoint. With the release of macOS 10.12 (Sierra) in 2016, Apple replaced the ASL with a new proprietary format called the Unified Logging System, which centralized the storage of log data in memory and on disk.
The Unified Logs are composed of three components:
- tracev3 files – Binary files containing the log entries
- UUID files – Binary files containing metadata for a log entry
- timesync files – Binary files containing timestamp metadata associated with a log entry
Table 1 shows the directories related to the Unified Log files on a macOS system.
Path | Content | Total Size
/private/var/db/diagnostics/Persist | Tracev3 files | Up to ~520 MB
/private/var/db/diagnostics/Special | Tracev3 files | Varies
/private/var/db/diagnostics/Signpost | Tracev3 files | Varies
/private/var/db/diagnostics/HighVolume | Tracev3 files | Varies
/private/var/db/diagnostics/Timesync | Timesync files | Varies
/private/var/db/uuidtext/{00-FF,dsc} | UUID files | Varies
In addition, since /var is a symbolic link to /private/var, the directories can also be found at /var/db/diagnostics and /var/db/uuidtext. By parsing these three components it is possible to construct the Unified Log data.
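As a rough illustration of how the directories in Table 1 map onto the three components, the sketch below classifies a file path by the directory it sits in. The function name and the sample path are hypothetical, and directory names are compared case-insensitively on the assumption that casing may differ between systems and collections.

```python
from pathlib import PurePosixPath

# Directory names taken from Table 1, lowercased for comparison.
TRACEV3_DIRS = {"persist", "special", "signpost", "highvolume"}


def classify_unified_log_path(path: str) -> str:
    """Return 'tracev3', 'timesync', 'uuid', or 'other' for a file path."""
    parts = [p.lower() for p in PurePosixPath(path).parts]
    if "uuidtext" in parts:
        return "uuid"
    if "diagnostics" in parts:
        idx = parts.index("diagnostics")
        # The directory directly under diagnostics decides the component.
        subdir = parts[idx + 1] if idx + 1 < len(parts) else ""
        if subdir == "timesync":
            return "timesync"
        if subdir in TRACEV3_DIRS:
            return "tracev3"
    return "other"


# Hypothetical file name, used only to exercise the function.
print(classify_unified_log_path("/private/var/db/diagnostics/Persist/example.tracev3"))
```

Because the check only looks at path components, it works the same for /var/db and /private/var/db.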
The Unified Logs contain a large amount of valuable information for forensic investigations such as:
- Process path associated with log entry
- Library path associated with log entry
- PID
- EUID
- Timestamp for the log entry in Unix Epoch
- Log Type – Ex: Default, Error, Debug, Info, Fault
- Event Type – Ex: Log, Signpost, Activity, Statedump, or Simpledump
- Subsystem – Typically the bundle ID associated with log entry, ex: com.apple.news
- Category – An optional addition to the subsystem that can be used to associate log entries with specific categories. Ex: com.apple.news has the categories NewsToday and Network.
- Log Message
Each log entry is associated with an Event Type and, depending on the Event Type, a Log Type (also called a Message Type).
Table 2 shows the Event Types and Log Types commonly found in the Unified Log format.
Event Type | Associated Log Types Observed | Overview
Log | Fault, Default, Error, Debug, Info | Standard log entries
Signpost | Process, System, Thread | Associated with process/app metrics
Activity | Create, Useraction | Standard log entries
Statedump | N/A | Special logs that may contain PLIST data, custom objects, or protocol buffers
Simpledump | N/A | Special logs that contain a simple string
Loss | N/A | The log entry was lost and failed to record
Trace | Default, Info | Standard log entries
Figure 1 shows a partial example of the data types associated with Unified Logs when using the built-in macOS log command.
In Figure 1, the Unified Log recorded a log entry from the softwareupdated process. This entry has the Event Type Log and a Log Type (Message Type) Default. The entry has the subsystem com.apple.SoftwareUpdateMacController and the log was categorized as SU.
Changes in macOS Monterey and Apple Silicon (ARM)
With the release of macOS 12 (Monterey) in 2021, Apple introduced two minor changes to the Unified Log format: the format of UUID files at /private/var/db/uuidtext/dsc changed, and the Event Type Simpledump was added to tracev3 files.
In addition, with the introduction of Apple Silicon, the way macOS keeps track of Mach (kernel) time changed. Mach time is one of the timestamp values needed to calculate the actual timestamp for a Unified Log entry. On Intel-based Apple devices, Mach time is recorded in nanoseconds. On Apple Silicon, however, Mach time is recorded in ticks and must be converted to nanoseconds to get accurate timestamps for the log entries. Apple mentioned this difference in its documentation on porting applications to Apple Silicon.
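The tick-to-nanosecond conversion is a multiplication by a timebase fraction. The sketch below is a minimal illustration and does not reflect how any particular parser does it. The function name is hypothetical, and the timebase values (1/1 for Intel, 125/3 for Apple Silicon) are assumptions based on commonly reported values. In practice the timebase should be read from the system or from the log metadata rather than hard-coded.

```python
def mach_ticks_to_ns(ticks: int, numer: int, denom: int) -> int:
    """Scale a Mach tick count to nanoseconds with a numer/denom timebase."""
    if denom == 0:
        raise ValueError("timebase denominator must be non-zero")
    # Integer arithmetic avoids float rounding on large tick counts.
    return ticks * numer // denom


# Assumed timebases, for illustration only.
INTEL_TIMEBASE = (1, 1)            # ticks are already nanoseconds
APPLE_SILICON_TIMEBASE = (125, 3)  # roughly 41.67 ns per tick

ticks = 24_000_000  # one second of ticks under the assumed 125/3 timebase
print(mach_ticks_to_ns(ticks, *APPLE_SILICON_TIMEBASE))  # 1000000000
print(mach_ticks_to_ns(ticks, *INTEL_TIMEBASE))          # 24000000
```

Treating Apple Silicon ticks as nanoseconds would make elapsed time appear roughly 42 times shorter than it was under this assumed timebase, which is why the conversion matters for timelines.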
Challenges Related to the Unified Logs
When working with Unified Logs, there are three primary challenges with reviewing the data. First, investigators must select a standardized means through which they will perform their analysis of the Unified Logs. Second, the Unified Logs produce large volumes of data even on endpoints with moderate use. Finally, reviewing the logs to identify data of interest requires familiarity with the Unified Logs and their sources.
Analysis of Log Data
There are four primary approaches to analyzing the Unified Log data:
- Apple APIs
- The built-in CLI log command
- The GUI Console application
- Parsing the raw Unified Log format
The first three options require a macOS system and rely on Apple-provided tools or APIs. The fourth option allows an investigator to review logs outside of a macOS system and to access data not exposed via the built-in tools or APIs. However, if Apple changes the log format, the parser must be updated to handle the changes.
Volume of Data
A common challenge of reviewing the Unified Logs is handling the large volume of data the logs provide. The total number of log entries can vary between 18 million and 50 million depending on activity on the system. Because the tracev3 files contain the log data, the amount of data depends on the size and number of those files, which vary by directory and endpoint activity.
Table 3 shows the size and number of tracev3 files based on directory location.
Path | Max File Size | Max Number of Files Observed
/private/var/db/diagnostics/Persist | ~10.5 MB | ~52
/private/var/db/diagnostics/Special | ~2.1 MB | Varies
/private/var/db/diagnostics/Signpost | ~2.1 MB | Varies
/private/var/db/diagnostics/HighVolume | Unknown | Varies
A tracev3 file of approximately 10.5 MB may contain between 300,000 and 400,000 log entries. Due to the volume of log data, additional tools will likely be needed to review the parsed data. Command-line tools like grep, xsv, or jq can assist in reviewing parsed data in CSV or JSON format. In addition, uploading the parsed data to a SIEM or log aggregation tool lets multiple analysts review the data.
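One quick way to get a feel for where the volume comes from is to tally parsed entries by a single column. The sketch below streams a CSV and counts rows per value. The column names (process, subsystem, message) and the sample rows are made up for illustration and may not match the headers a given parser emits.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column: str) -> Counter:
    """Tally how many rows share each value of `column`."""
    reader = csv.DictReader(csv_file)
    # Rows are consumed one at a time, so large files are not loaded whole.
    return Counter(row[column] for row in reader)


# Made-up sample rows standing in for parser output.
SAMPLE_CSV = """process,subsystem,message
/Applications/Example.app/Contents/MacOS/Example,com.example.app,first entry
/Applications/Example.app/Contents/MacOS/Example,com.example.app,second entry
/usr/bin/sudo,,example sudo entry
"""

counts = count_by_column(io.StringIO(SAMPLE_CSV), "process")
for process, total in counts.most_common():
    print(total, process)
```

With real output, passing an open file handle instead of the StringIO object gives the same tally, and the noisiest sources are natural candidates to exclude or review separately.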
Identifying Data of Interest
Due to the potentially overwhelming amount of data, effectively reviewing the Unified Logs requires the application of filters on the full collection of data. Two effective ways to reduce the amount of data to review are Indicators of Compromise (IOCs) and data type filters.
Leveraging Indicators of Compromise (IOCs) from other artifacts, such as a filename or Launchd label, can be a quick way to identify log entries of interest. We can then pivot off any hits to identify additional malicious activity. For example, if the Unified Logs recorded the presence of a known malicious file, we could use the timestamp associated with the log entry and review entries that occurred before and after it. In addition, if the log entry recorded the Process Path, Library Path, or Subsystem, we can search on those values to see if any other malicious files were recorded in the logs.
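The timestamp pivot described above can be sketched as a simple window search: find entries whose message contains the IOC, then keep everything logged within a few minutes of any hit. The dictionary keys, the five-minute default, and the sample entries are all hypothetical.

```python
from datetime import datetime, timedelta


def entries_near_hits(entries, ioc, window=timedelta(minutes=5)):
    """Return entries within `window` of any entry whose message contains `ioc`."""
    hit_times = [e["timestamp"] for e in entries if ioc in e["message"]]
    return [
        e for e in entries
        if any(abs(e["timestamp"] - t) <= window for t in hit_times)
    ]


# Made-up entries; "evil.dylib" is a placeholder IOC.
base = datetime(2022, 1, 1, 12, 0, 0)
sample = [
    {"timestamp": base - timedelta(hours=1), "message": "unrelated early entry"},
    {"timestamp": base - timedelta(minutes=2), "message": "entry shortly before"},
    {"timestamp": base, "message": "opened evil.dylib"},
    {"timestamp": base + timedelta(minutes=4), "message": "entry shortly after"},
]

nearby = entries_near_hits(sample, "evil.dylib")
print([e["message"] for e in nearby])
```

This compares every entry against every hit, which is fine for a filtered subset but would need sorting or indexing before running across tens of millions of entries.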
Even without specific IOCs to leverage, the Unified Logs can be a valuable source of data for threat hunting. Filtering on specific data types such as Process Path, Library Path, Subsystem, or Category narrows the results to specific log data. For example, to review SSH logons you can limit the results to entries associated with the Process Path /usr/sbin/sshd, which shows SSH activity on a macOS system.
Parsing the Unified Logs
At Mandiant, we created a cross-platform Unified Log parser (and simple library) called macos-unifiedlogs and are open sourcing it to help other forensic investigators review the Unified Logs. This tool can parse the raw Unified Log format to CSV or JSON. The parser builds upon previous work by the libyal and UnifiedLogReader projects.
The goal of macos-unifiedlogs is to provide a tool that can parse all the components that make up the Unified Logs on a live macOS system or a logarchive collection created from the log command and construct the log entries. In addition, macos-unifiedlogs also supports parsing individual components that make up the Unified Log.
Since the command-line tool is built on a simple library, other programs can import the library and manipulate the results for additional uses, such as uploading results to a SIEM or applying additional filtering to the parsed data.
The parser has been tested on log data from macOS 10.12 (Sierra) to macOS 12 (Monterey). The macos-unifiedlogs tool includes three example programs to parse the log data. The simplest is unifiedlog_parser which can parse the logs on a live system or a provided logarchive created by the log command.
Figure 2 shows a logarchive created by the built-in log command and its contents. We can view the contents by right-clicking and selecting Show Package Contents.
The logarchive contains the three components for the Unified Logs:
- Directories 00 through FF and dsc contain the UUID files
- Persist, HighVolume, Signpost, and Special contain the tracev3 files
- The file logdata.LiveData.tracev3 contains the log entries that were in memory
- The Timesync directory contains the timesync files
The directory Extra and file Info.plist contain metadata about the Unified Log daemon logd, but they are not needed to parse the Unified Log data.
The logarchive can then be passed as an argument to unifiedlog_parser.
Figure 3 shows a snippet of the execution of unifiedlog_parser.
Once the unifiedlog_parser has finished parsing the data to a CSV file, a variety of tools can be used to review the data such as Microsoft Excel, Splunk, xsv, or grep.
Data the macos-unifiedlogs parser can extract from the logs includes, but is not limited to:
- Timestamps (Intel and Apple Silicon supported)
- PID
- Process path associated with the log entry
- Library path associated with the log entry
- EUID
- Log Type
- Event Type
- Subsystem
- Category
- Log Message
How macos-unifiedlogs Parser Helps Investigators
Since Unified Logs are now the primary logging mechanism for Apple products, they contain a large amount of valuable information for forensic investigators. Depending on system activity, the logs may include:
- Logons to the macOS system
- Sudo commands
- DNS resolutions
- Gatekeeper events – macOS application verifier
- XProtect events – macOS built-in Yara scanner
- AppleScript activity – Built-in macOS scripting language
- Open Directory events – macOS LDAP service events
In addition, third-party applications may also use the Unified Log for logging, and their entries can be referenced during forensic investigations.
The Unified Logs can be leveraged to identify multiple stages of the attack lifecycle. For example, one technique to maintain persistence on macOS is to create a LoginItem. Whenever a user logs on to a macOS system, any registered LoginItems are executed. All LoginItems that execute are recorded in the Unified Logs. We can identify and filter for this activity by searching for the Subsystem com.apple.loginwindow.logging and Messages that contain performAutolaunch.
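The LoginItem filter described above combines an exact match on the Subsystem with a substring match on the Message. A minimal sketch, assuming parsed entries are available as dictionaries with hypothetical subsystem and message keys (the sample messages are placeholders, not real log output):

```python
LOGINITEM_SUBSYSTEM = "com.apple.loginwindow.logging"
LOGINITEM_KEYWORD = "performAutolaunch"


def is_loginitem_launch(entry: dict) -> bool:
    """True when an entry matches the LoginItem subsystem and keyword."""
    return (
        entry.get("subsystem") == LOGINITEM_SUBSYSTEM
        and LOGINITEM_KEYWORD in entry.get("message", "")
    )


# Placeholder entries used only to exercise the filter.
sample = [
    {"subsystem": "com.apple.loginwindow.logging",
     "message": "performAutolaunch placeholder text"},
    {"subsystem": "com.apple.loginwindow.logging",
     "message": "some other loginwindow message"},
    {"subsystem": "com.example.app",
     "message": "performAutolaunch mentioned elsewhere"},
]

matches = [e for e in sample if is_loginitem_launch(e)]
print(len(matches))  # 1
```

Requiring both conditions keeps unrelated loginwindow messages and stray mentions of the keyword in other subsystems out of the results.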
Figure 4 shows a snippet of the log entry created by the popular Lulu Firewall application executed via LoginItem persistence.
macOS commands executed with sudo privileges are also recorded within the Unified Logs. By filtering on the Process Path /usr/bin/sudo we can filter to only show log entries related to sudo activity. It’s possible to further filter out data by only showing Messages that contain the user root.
One of the most verbose logging sources is the Process Path /usr/sbin/mDNSResponder. mDNSResponder generates log entries for DNS lookups by processes. However, many mDNSResponder log entries are marked private because DNS lookups can be sensitive. If private data logging is not enabled, the DNS hostnames are hashed and base64 encoded.
Figure 5 shows a partial example of a /usr/sbin/mDNSResponder entry associated with the process rclone, a popular backup tool. The domain associated with the DNS lookup has been hashed and base64 encoded since private data is not shown.
When reviewing Unified Log data, an analyst will likely see logs that contain the value <private> or a base64 encoded string of raw bytes. The Unified Logs allow developers to mark logged data as private. When private data is logged, the system will either mask it with <private> or hash and base64 encode the data.
It is possible to show private data by installing a custom Profile on a macOS system. However, this will reveal private data for all software that logs to the Unified Log. Enabling private data does not retroactively reveal previously logged private data; it applies only to future log events.
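When triaging parsed output, it can help to know how much of a result set is masked before investing time in it. The sketch below only detects the literal <private> marker. It makes no attempt to recognize hashed, base64 encoded values, and the function names and sample messages are hypothetical.

```python
PRIVATE_MARKER = "<private>"


def is_masked(message: str) -> bool:
    """True when the message contains the literal <private> marker."""
    return PRIVATE_MARKER in message


def masked_fraction(messages) -> float:
    """Fraction of messages that contain the <private> marker."""
    messages = list(messages)
    if not messages:
        return 0.0
    return sum(is_masked(m) for m in messages) / len(messages)


# Made-up messages for illustration.
sample = [
    "connection from <private>",
    "user <private> authenticated",
    "service started",
    "cache flushed",
]
print(masked_fraction(sample))  # 0.5
```

A high fraction for a process of interest suggests that key details are not recoverable from the existing entries.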
If we enable private data via a custom macOS profile, new DNS entries will become visible.
Figure 6 shows a partial example of a /usr/sbin/mDNSResponder entry associated with rclone after private logging has been enabled. Since the custom Profile is installed, the logs now show the domain associated with the DNS lookup, api[.]onedrive[.]com.
Conclusion
By making macos-unifiedlogs free and open source, we hope it will be useful to other forensic investigators when investigating macOS systems. The source code, documentation, and limitations for macos-unifiedlogs are available on Mandiant's GitHub.
Source: https://www.mandiant.com/resources/blog/reviewing-macos-unified-logs