Sandfly Blog

How To Detect and Decloak Linux Stealth Rootkit Data

15 November 2022

Rootkits

Linux stealth rootkits have a variety of mechanisms to hide on a host. Aside from standard tactics such as hiding running processes (which we show you how to decloak here), they can also hide data inside files. This tactic can prevent security teams from detecting malicious modules that load at boot and maintain persistence on a host.

In this article, we'll show you how to decloak this hidden Linux rootkit data using two methods:

1) Using common command line tools ls, cat, and wc.

2) Using memory mapped file I/O with the free sandfly-file-decloak utility.

We are releasing a free tool using the second method to help security teams detect cloaked data.

Intrusion Detection Philosophy - Police Interrogation

To understand how to find stealth rootkits on Linux we suggest you adopt a philosophy used by our product Sandfly. That philosophy is taken directly from police interrogation techniques and is simple:

Ask the same question multiple ways to see if the answers are identical.

If you ask the question multiple ways, but one or more of the answers is different, then you know there is a problem. You see this method used by the police when investigating a case. For instance:

"Where were you last night?"

"We have surveillance footage showing your vehicle at the crime scene, why is that?"

"You were seen with the victim last night. What were you doing with them?"

Now all of the above questions try to get the same data: Was the suspect where the crime happened? The questions start out generic and vague to let the suspect tell a lie. Then they start to get more specific about the incident and police can then use the initial answers to start finding inconsistencies in the story and ask more questions.

Sandfly adopts the same approach with our tactics hunting and we want you to do the same when investigating a Linux host for compromise. Don't ask just one question and take the answer as truth. Instead, ask the same question several different ways and see if the answers are the same.

Using the above, we're going to ask the same question about a system with a Linux Loadable Kernel Module (LKM) stealth rootkit running on it and see if the answers are the same. Almost always, they are not.

Linux Stealth Rootkit Hiding Data

Most rootkits try to maintain persistence by hiding in various start-up files located under /etc on Linux. LKM rootkits need to load their module into the kernel during boot to activate. After activation, they can hide the data in the file they used to start.

For example, a stealth rootkit will often target the following files to load its module on boot:

/etc/modules
/etc/ld.so.conf

Critical system directories such as the following can also contain malicious insertion code:

/etc/modules-load.d
/etc/init.d
/etc/rc*.d

There are any number of places it can happen, but the above are pretty common.

To prevent the module from being seen by the victim, the rootkit uses a special tag sequence that it wraps around its boot-up entry. When the rootkit is active, it will not show the data that is between these tags. For instance, the Reptile rootkit uses a pair of special tags (#<reptile> and #</reptile>) to hide any data that is between them when active.

For example under /etc/modules:

# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.

# malicious content below
#<reptile>
malicious_module_to_load
#</reptile>

The tags appear as comments in the file, but the entry malicious_module_to_load will be read on boot and the module inserted into kernel space. After this happens, commands like cat or even an editor like vi will not show the tags or the data between them to the viewer. We see this in action in this screenshot of opening the file with vi on an infected host:

By cloaking the data the malicious rootkit module can maintain persistence indefinitely unless you are able to see what is going on. We're going to show you how to do just that next.
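
To make the effect concrete, here is a small user-space Python sketch of what tag-based cloaking does to a reader's view of a file. It is only a simulation. A real LKM rootkit filters data inside hooked kernel read paths, and the function name cloak, along with the choice to swallow the newline after the closing tag, are assumptions made for illustration rather than Reptile's actual code.

```python
def cloak(data: bytes, open_tag: bytes, close_tag: bytes) -> bytes:
    """Return data with every open_tag ... close_tag region removed.

    Hypothetical helper: it mimics, in user space, the view a reader
    gets when a rootkit strips tagged regions from read results.
    """
    out = bytearray()
    pos = 0
    while True:
        start = data.find(open_tag, pos)
        if start == -1:
            out += data[pos:]          # no more tags, keep the rest
            break
        end = data.find(close_tag, start + len(open_tag))
        if end == -1:
            out += data[pos:]          # unterminated tag, keep the rest
            break
        out += data[pos:start]         # keep everything before the tag
        pos = end + len(close_tag)
        if data[pos:pos + 1] == b"\n":
            pos += 1                   # assumption: drop the trailing newline too
    return bytes(out)


sample = (
    b"# malicious content below\n"
    b"#<reptile>\n"
    b"malicious_module_to_load\n"
    b"#</reptile>\n"
)
visible = cloak(sample, b"#<reptile>", b"#</reptile>")

print(visible.decode(), end="")
print("bytes on disk:", len(sample), "bytes shown:", len(visible))
```

The point to notice is that the data on disk and the data shown to the reader now differ in length, which is exactly the inconsistency the rest of this article exploits.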

Searching for Incorrect Byte Counts

Hiding data like this is very clever, but it suffers from a big issue on Linux and that is this:

The Linux file system wants consistency and doesn't want to hide how many bytes are present in a file by removing data on the fly.

Something on the system will know the data is there because it is there. We just have to ask the right questions.

To take advantage of this we are going to again ask the system the same question in multiple different ways. Our question for finding this attack is simple:

How many bytes do you see in this file?

We will ask this question of the file system with a simple directory listing. Then we ask the same question of the kernel by having it give us the bytes of data it reads. Then, we see if they match. We will use the tools ls, cat, and wc to do this in the commands below:

ls -al /etc/modules
cat /etc/modules | wc -c

Here we see the file /etc/modules in the directory listing under an active LKM rootkit. It shows 283 bytes according to the file system using ls. Now we ask the kernel to read the file and tell us how many bytes are there using the cat /etc/modules | wc -c command to count the bytes. Here wc reports 222 bytes are present from the cat command output. There is a 61 byte discrepancy between these results. Where are those bytes that the cat command couldn't see?
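
The same two questions can be asked from a script. Below is a minimal Python sketch, assuming that os.stat() reflects what the file system reports (as ls does) and that an ordinary read() goes through the same path cat uses. The helper names byte_counts and looks_cloaked are hypothetical, and the demo runs on a throwaway temporary file rather than a real system file.

```python
import os
import tempfile


def byte_counts(path):
    """Return (size reported by stat, number of bytes returned by read)."""
    stat_size = os.stat(path).st_size       # what the file system says
    with open(path, "rb") as handle:
        read_size = len(handle.read())      # what a normal read returns
    return stat_size, read_size


def looks_cloaked(path):
    """True when the two answers to 'how many bytes?' disagree."""
    stat_size, read_size = byte_counts(path)
    return stat_size != read_size


# Demo on a temporary file: on a clean file both counts agree.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"loop\n")
    demo_path = tmp.name

print(byte_counts(demo_path), looks_cloaked(demo_path))
os.unlink(demo_path)
```

Pointing byte_counts at /etc/modules on a suspect host asks the same pair of questions as the ls and wc commands above. Note that pseudo files such as those under /proc normally report a size of zero, so this comparison is only meaningful for regular files.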

The Case of the Missing Bytes

The byte counts above don't match because the rootkit cannot cover all the ways a file can be seen on a Linux host. Yes, it hooked the file read calls that cat and other command line tools use. But, it is not altering what the file system returns in the internal stat() calls for file size data.

In this case the file system reports the correct size with the ls command. However, reading the data passes through the hooked kernel calls and the data is scrubbed for the cat command. One is correct and the other is not. Basically, we expect the rootkit to lie to us, and we use that lie to expose it by comparing its answer against a known good value it did not expect us to check.

The key takeaway is this:

It's one thing to get the kernel to lie about file contents, but something entirely more complicated to get the file system to agree with this lie.

Again the file system wants to maintain consistency to avoid data corruption. It is much harder to have bogus data returned through the kernel and also get the file system to show the same bogus data. The task would require much more coding to do well and likely would cause serious system performance and stability issues even if it did work in some capacity.

In our above technique we are simply asking the kernel and the file system one question: "Do you agree on how many bytes this file contains?" So the logical next question is: "Can we see what is being hidden?" Yes, you can.

Decloaking LKM Rootkits with Memory Mapped (mmap) I/O

Now that we know there are some bytes hiding in a file, the question is how to read them? Traditionally you would mount or clone the volume on a known good system and read the data without the LKM rootkit loaded. However, this requires taking the host offline which may not be immediately possible. It may also alert the attacker that you have found them which is not always a good idea. In these cases you could try to see if there really is a problem before deciding on your next steps.

Now we know the rootkit has control of the standard file read functions so we need to again ask the question a different way. Instead of using standard file I/O functions to read a file, we'll bypass them and use memory mapped I/O.

This works, again, because of limits on what a rootkit can reasonably do. It's one thing to intercept all file read operations (which is already very intensive and risky). However, it's even more expensive and even riskier to start intercepting memory mapped I/O on Linux. The chances of causing severe system performance and stability issues go up significantly by trying to cover all these bases.

So, we'll use memory mapped I/O to read our file and again compare it against what the standard file I/O says is present. If we find a difference we'll report what it is. Doing this instantly decloaks any hidden data.
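
The comparison described above can be sketched in a few lines of Python with the standard mmap module. This is a minimal illustration of the idea and not the source of sandfly-file-decloak. The function names read_standard, read_mmap, and hidden_lines are hypothetical, and the line-by-line comparison is a simplification that would miss a hidden line identical to a visible one.

```python
import mmap
import os
import tempfile


def read_standard(path):
    """Read the file with ordinary file I/O (the path a rootkit may hook)."""
    with open(path, "rb") as handle:
        return handle.read()


def read_mmap(path):
    """Read the file through a read-only memory mapping instead."""
    with open(path, "rb") as handle:
        if os.fstat(handle.fileno()).st_size == 0:
            return b""                      # an empty file cannot be mapped
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:]                # copy the mapped bytes out


def hidden_lines(path):
    """Lines visible through mmap but missing from the standard read."""
    seen = set(read_standard(path).splitlines())
    return [line for line in read_mmap(path).splitlines() if line not in seen]


# Demo on a temporary file: a clean file has nothing hidden.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"# comment\nloop\n")
    demo_path = tmp.name

print(hidden_lines(demo_path))
os.unlink(demo_path)
```

On a host where reads are being filtered, any lines returned by hidden_lines would be the cloaked content, since the mapping sees bytes that the standard read did not return.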

We have a free utility that does just this as you can see in this screenshot:

Sandfly File Decloak: A Memory Mapped I/O Rootkit Decloaking Tool

The tool used above is called sandfly-file-decloak and is released for free on our GitHub page. This tool is written in Python and can rapidly check if critical system files are hiding data.

Of particular concern for hiding LKM modules and malicious libraries are the following:

/etc/modules 
/etc/ld.so.conf
/etc/modules-load.d (all files in this directory)
/etc/init.d (all files in this directory)
/etc/systemd (all files in this directory)
/etc/rc*.d (all files in these directories)

To add to the above, any text based file under /etc that can be used to hold system configuration or boot scripts should be checked. For instance, files such as /etc/hosts can hold malicious host entries for Command and Control (C2) servers, and rc and init scripts offer other places to store persistence data.
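
Checking that many locations by hand gets tedious, so here is a hedged Python sketch that expands the list above into individual files and flags any whose stat size disagrees with the number of bytes a normal read returns. The names WATCH_PATTERNS, candidate_files, and find_byte_count_mismatches are hypothetical, and this is neither the sandfly-file-decloak tool nor a Sandfly module.

```python
import glob
import os

# Locations from the list above; directories are walked recursively.
WATCH_PATTERNS = [
    "/etc/modules",
    "/etc/ld.so.conf",
    "/etc/modules-load.d",
    "/etc/init.d",
    "/etc/systemd",
    "/etc/rc*.d",
]


def candidate_files(patterns):
    """Expand glob patterns into a sorted list of regular files."""
    found = set()
    for pattern in patterns:
        for match in glob.glob(pattern):
            if os.path.isdir(match):
                for root, _dirs, names in os.walk(match):
                    for name in names:
                        found.add(os.path.join(root, name))
            else:
                found.add(match)
    # isfile() drops broken symlinks, sockets, and other special files
    return sorted(path for path in found if os.path.isfile(path))


def find_byte_count_mismatches(paths):
    """Yield (path, stat_size, read_size) where the two counts differ."""
    for path in paths:
        try:
            stat_size = os.stat(path).st_size
            with open(path, "rb") as handle:
                read_size = len(handle.read())
        except OSError:
            continue                        # unreadable file, skip it
        if stat_size != read_size:
            yield path, stat_size, read_size


if __name__ == "__main__":
    for hit in find_byte_count_mismatches(candidate_files(WATCH_PATTERNS)):
        print("possible cloaked data:", hit)
```

Run with enough privilege to read the files in question. A clean host should print nothing, and any path that is reported deserves a closer look with the memory mapped read described earlier.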

What Next?

Once you confirm a system has a rootkit in operation, it will be up to your organization to decide on next steps. However, confirmation gives you confidence that what is going on is not an accident but an actual incident that needs immediate attention. A Linux stealth rootkit means total system compromise and exposure of all data on the host along with login credentials. A strategy to contain the host and expand the investigation outward needs to be put in place immediately.

Automating Rootkit Detection with Sandfly

Manually searching for rootkits is helpful for doing one-off investigations, but we recommend you automate searching for this problem so it can be spotted immediately. The last thing you want is a stealth rootkit in operation for extended periods because it makes the damage far worse as intruders can linger and learn your network.

Sandfly has a variety of modules designed to interrogate remote systems in different ways to see if something is amiss. For the attack described here, we have a built-in capability to flag files whose byte counts mismatch and raise an alert like the one below. We also check for thousands of other signs of compromise on Linux agentlessly.

The above check is designed to search for suspicious /etc/modules discrepancies, but we also have modules to sweep the entire /etc directory to find other files being targeted. In fact, you can simply clone any of our modules and customize them to sweep any directory you want for this attack using the size_mismatch parameter in a custom Sandfly module. Simply alter the directories you want scanned, save the file, and then send it out to all your systems in seconds to get results.

Don't Fear The Linux Rootkit

There is a lot of hype around Linux stealth rootkits, but the reality is that they can be easily found with command line forensics and simple tools that know what questions to ask. When investigating a system, keep in mind the idea of how you can ask the same question in multiple ways and see if the answers are the same. If they don't match, you might want to dig deeper to see if something is trying to hide.

If you found the above interesting, be sure to check out our free trial of Sandfly to check your systems for Linux rootkits and much more today.
