Writing YARA Rules with Custom GPTs and SOCRadar Platform
YARA rules stand out as essential instruments for identifying and classifying malware. These rules are indispensable for cybersecurity professionals, aiding them in detecting and mitigating cyber threats. However, the task of crafting YARA rules is challenging. It demands a comprehensive understanding of evolving malware patterns and requires staying up-to-date with the latest threat intelligence.
Cyber Threat Intelligence (CTI) plays a pivotal role in this context. It provides an in-depth understanding of the threat landscape by offering detailed insights into potential or ongoing attacks that could impact an organization. You can search the ready-made YARA and Sigma rules on the SOCRadar Platform using the Threat Hunting Rules page, treat them as examples, or use them to hunt for threats within your own systems.
When you need to write the rules yourself, the SOCRadar Platform and artificial intelligence can still help. This article describes an approach to creating YARA rules that combines Custom GPT models with the SOCRadar Platform. The combination is meant to ease the development of YARA rules and support a more efficient, accurate, and timely response to emerging cyber threats.
Basics of YARA Rules
YARA is a powerful tool for the cybersecurity community, providing a mechanism to identify and classify malware. Its name is often expanded, tongue in cheek, as ‘Yet Another Ridiculous Acronym.’ YARA offers a structured way to describe patterns that correspond to malware families or suspicious behaviors. In practice, you create a set of highly specific rules that can quickly scan files or data streams to pinpoint potential threats.
Fundamentals of YARA Rule Structure:
Rule Name: Each YARA rule begins with a unique identifier or name, following the keyword ‘rule.’
Tags: Optional labels placed after the rule name, used for categorization or adding context.
Meta Section: Includes descriptive data about the rule, like author, date, or impact.
Strings Section: The heart of the rule, where actual patterns or sequences to be searched are defined. These can be text strings, binary sequences, or regular expressions.
Condition Section: Defines the logic for when a rule should trigger. It utilizes the defined strings and may include logical and arithmetic operators to create complex criteria.
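The sections listed above can be sketched in a minimal rule. This is only an illustration: the rule name, tags, and every string value are hypothetical placeholders and do not describe any real malware.

```yara
rule Structure_Sketch : example trojan   // rule name, then optional tags after the colon
{
    meta:
        description = "Illustrates the sections of a rule; not a real detection"
        author = "example"

    strings:
        $text  = "hypothetical-marker" ascii wide     // text string
        $hex   = { DE AD BE EF }                      // binary (hex) sequence
        $regex = /update[0-9]{2}\.example\.com/       // regular expression

    condition:
        // logical operators combine the strings defined above
        $hex and ($text or $regex)
}
```

The condition here requires the hex sequence plus at least one of the two textual indicators, which shows how the strings section and the condition section work together.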
Creating a YARA rule involves specifying patterns that are characteristic of malware. These patterns can range from specific strings to binary sequences or traces of particular behaviors. The key is to identify unique and definitive attributes of malware that distinguish it from benign software.
For a deeper understanding, you should also check the official YARA documentation. Now that we understand the general concept, let’s see how we can write these rules with an assistant.
Rule Writing
The syntax required to write a YARA rule is as simple as shown above, and a rule can be written even in a basic text editor such as nano on *nix systems or Notepad on Windows. (Of course, you also need YARA itself installed to run the rules.) ChatGPT is a very helpful tool in this process. Furthermore, a Custom GPT called Malware Rule Master can help with both malware analysis and YARA rule writing, and it can even search web sources and gather malware information from open sources without you having to analyze the malware yourself.
This Custom GPT also claims to use db.pdf, a YARA rules collection distributed under the GNU GPLv2 license. Although ChatGPT 3.5 and 4 can also be used in this process, a Custom GPT makes things easier for this specific task.
The main reason it is more useful is that it stays on topic, is less prone to AI hallucinations, and includes up-to-date information about malware and YARA rules. Although GPT-4 can also browse for up-to-date information, checking its sources is extra work.
Below we use a Custom GPT fed with YARA rules knowledge, but using this specific Custom GPT is not mandatory.
When we ask it to write a YARA rule, it lists what it needs: file signatures, strings, binary patterns, and so on. We have several methods to obtain these. The most reliable process, though not the easiest, may be to conduct a malware analysis first and then provide the outputs to this GPT.
Our article on how to perform Malware Analysis with Custom GPTs and SOCRadar Malware Analysis can help you in this context.
Now let’s create an example template:
rule ExampleMalware
{
    meta:
        description = "Detects ExampleMalware"
        author = "SOCRadar"
        reference = "https://socradar.io"
        date = "2023-**-**"
    strings:
        $a = { 6A 40 68 00 30 00 00 6A 14 8D 91 } // Example binary pattern
        $s1 = "ExampleMalware" ascii wide // Example string
        $s2 = "wallet.dat" ascii wide // Targeting cryptocurrency wallet theft
        $s3 = "password" ascii wide // Common string in credential theft
        // Add more strings or binary patterns as known
    condition:
        any of them
}
The template above is a simple but ineffective rule of the kind that static and string analysis alone usually produces. It is ineffective because malware is generally obfuscated, common strings have a high probability of giving false positives, and the rule can be evaded by even minor changes across different samples and versions of the malware.
So we need to get more specific, which means finding more distinctive strings and binary patterns. Binary patterns are especially useful for identifying malware because they can target unique sequences of machine code or data structures that are specific to a particular malware strain. This can remain effective even if the malware tries to hide through techniques like polymorphism or encryption, as certain core binary structures may still remain unchanged.
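One way to keep a binary pattern useful when parts of the code change between samples is to loosen only the bytes that vary. The sketch below reuses the example byte sequence from the template; which bytes actually vary is an assumption made purely for illustration, and the rule and string names are hypothetical.

```yara
rule Hex_Pattern_Sketch
{
    meta:
        description = "Hypothetical: template byte sequence, loosened where values might vary"

    strings:
        // ?? matches any single byte, e.g. a value that differs per build
        $p_wild = { 6A 40 68 ?? ?? 00 00 6A 14 8D 91 }
        // [0-4] allows up to four arbitrary bytes between the two fixed parts
        $p_jump = { 6A 40 68 00 30 00 00 [0-4] 6A 14 8D 91 }
        // ( A | B ) accepts either alternative at that position
        $p_alt  = { 6A ( 40 | 04 ) 68 00 30 00 00 6A 14 }

    condition:
        any of ($p_*)
}
```

Wildcards, jumps, and alternatives let the fixed core of the sequence carry the detection while tolerating small differences. In a real rule, the choice of which bytes to loosen should come from comparing several disassembled samples.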
To obtain more unique IoCs, you can extract them from the SOCRadar Platform, collect them from malware repositories, security forums, or threat intelligence feeds, or collect samples yourself if you are able to analyze them. You can also take a look at our article reviewing Malware Analysis Tools.
If you prefer the analysis route, static analysis can extract strings and thereby give you URLs, file names, registry keys, or other text-based indicators. Keep in mind that SOCRadar Malware Analysis shows detailed unique strings and IoCs in its reports.
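As a rough illustration, indicators of the kinds mentioned above (a URL, a file name, a registry key) might be turned into strings like the following. Every value here is a made-up placeholder, not an IoC from any real report, and the rule name is hypothetical.

```yara
rule Static_Strings_Sketch
{
    meta:
        description = "Hypothetical indicators of the kinds static analysis tends to surface"

    strings:
        $url  = "http://c2.example.com/gate.php" ascii wide
        $file = "svch0st_update.exe" ascii wide nocase
        // backslashes in text strings must be escaped
        $reg  = "Software\\Microsoft\\Windows\\CurrentVersion\\Run" ascii wide nocase
        $mark = "hypothetical_mutex_7f3a" fullword ascii

    condition:
        // file starts with the "MZ" header and at least two indicators are present
        uint16(0) == 0x5A4D and 2 of them
}
```

The `nocase` and `fullword` modifiers narrow or widen how each string matches, and requiring two indicators rather than one reduces the chance that a single common string, such as the autorun registry path, triggers the rule on its own.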
To obtain binary patterns, you can perform memory dump analysis in a sandbox environment to detect unique patterns in memory, or use tools like IDA Pro, Ghidra, or x64dbg to disassemble the binary. Disassembly lets you see the assembly code and identify unique function calls, algorithms, or other binary patterns.
For example, when you provide byte patterns for a specific malware, this Custom GPT can create a more helpful rule.
If SOCRadar is open in another browser tab, it can streamline your tracking of various Threat Feeds and help you use existing IoCs on the platform for rule creation.
For example, when you search for QakBot malware in the SOCRadar Platform, you can extract the latest IoCs and even directly access the YARA rule written for QakBot. You can use this rule as it is, or evaluate the strings and patterns in it.
Lastly, the condition builds on the strings defined in the preceding section and determines how many matches are needed before the rule triggers. For strong indicators, such as binary patterns linked to a specific malware, the condition can trigger if even one of the listed criteria is met. For vaguer IoCs with a higher risk of false positives, such as common strings, the condition can require a larger number of criteria to be met, for example 4 out of 6.
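The two thresholds described above can be combined in a single condition. In this sketch the `$b*` strings stand for strong binary patterns and the `$s*` strings for weaker text indicators; all values and names are placeholders chosen for illustration.

```yara
rule Weighted_Condition_Sketch
{
    meta:
        description = "Hypothetical: one strong pattern, or four of six weak strings"

    strings:
        // stronger indicators: binary patterns
        $b1 = { 6A 40 68 00 30 00 00 6A 14 8D 91 }
        $b2 = { DE AD BE EF 00 01 02 03 }
        // weaker indicators: common strings that also appear in benign files
        $s1 = "wallet.dat" ascii wide
        $s2 = "password" ascii wide
        $s3 = "login" ascii wide
        $s4 = "cookies.sqlite" ascii wide
        $s5 = "keylog" ascii wide
        $s6 = "screenshot" ascii wide

    condition:
        any of ($b*) or 4 of ($s*)
}
```

Naming strings with a shared prefix makes it possible to address each group with a wildcard, so the threshold for each group can be tuned separately as false positives are observed.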
Conclusion
Using Custom GPT models like Malware Rule Master can be a big step forward in cybersecurity. These AI tools do more than just analyze malware; they also help write YARA rules, which are used to identify malware. They work by searching the internet, collecting malware samples from public sources, and offering important information that makes writing these rules better and more accessible. When combined with SOCRadar, which has a lot of information on cyber threats and malware, these AI tools give cybersecurity professionals new and better ways to create YARA rules.
We’ve shown how YARA rules are structured and given an example to help you understand how to write them. The success of these rules depends on how specific and flexible they are to keep up with changing malware tactics. Adding specific strings and binary patterns, which are either carefully analyzed or taken from platforms like SOCRadar, is crucial. These details make the rules more accurate and able to stand up to the tricky methods malware uses to hide.