As technology evolves, software development has become increasingly intricate and interconnected. Third-party dependencies and open-source libraries have made it easier for developers to build complex applications. However, this convenience comes with risks, and attacks that use malicious packages are a significant threat. Such attacks can compromise the security of an organization’s software and cause severe consequences.
This article delves into the dangers of malicious packages, the methods used by cybercriminals, and how to safeguard your software and organization against them.
What is a Malicious Package?
A malicious package is a piece of code or a software component designed or altered to perform harmful actions. These packages often masquerade as legitimate software and can infiltrate the software supply chain via open-source libraries or third-party dependencies.
In addition to PyPI and npm, other platforms also carry this risk:
- On NuGet, the Microsoft-backed code-sharing platform for .NET and .NET Core, malicious packages with over 100,000 downloads have also been discovered.
- In 2020, researchers found more than 750 malicious RubyGems packages (created mainly by typosquatting), which shows that the risk on RubyGems cannot be ignored either.
- Maven Central, managed by Sonatype, is the default repository for Apache Maven, Scala SBT, and other build systems, and it is considered the largest collection of Java components. It is generally regarded as more reliable than other component ecosystems because every component must be identified by a Group ID, Artifact ID, and Version. Even so, malicious packages have appeared on Maven Central at various times.
Supply Chain Attacks
From an attacker’s perspective, a relatively easy way to infiltrate an organization is to exploit the vulnerabilities of a product or vendor that the organization relies on. No matter how much attention the organization pays to its own security, flaws in the products it uses can undermine it.
Because so many companies use open source, every open-source package a company adopts becomes part of its software supply chain, and a single malicious package can turn the company and its development team into victims of a software supply chain attack. The consequences can be devastating, because these attacks hit most users of a product rather than a single target. Examples include the supply chain attack on SolarWinds by APT29 in 2019, the Clop ransomware group’s attacks on hundreds of companies through a zero-day vulnerability in GoAnywhere MFT, and, most recently, the SmoothOperator supply chain attack that used a trojanized version of the 3CX VoIP desktop client.
What Are the Methods Used by Hackers to Distribute Malicious Packages?
Cybercriminals often use the following methods when creating and publishing malicious packages:
- Typosquatting
Typosquatting is a type of cyber attack in which a malicious actor registers a name that resembles a legitimate one, using a common misspelling or a slight variation of the original so that users are unlikely to notice the difference.
Package managers do not check whether the user has typed the package name correctly; they only verify that a package with the requested name exists and then proceed with the download. Attackers exploit this by predicting common misspellings and registering malicious packages under those names. A good example is a user who wants to download BeautifulSoup4, a well-known HTML and XML parser, but types BaeutifulSoup4 after transposing the letters “ea”, or types beautifulsoup without knowing that the original package name ends with the number “4”.
According to the article “Backstabber’s Knife Collection: A Review of Open Source Software Supply Chain Attacks,” 61% of malicious packages are created using typosquatting. On PyPI, this ratio is 75%.
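The name-similarity idea behind typosquatting can be sketched as a simple pre-install check. This is only an illustration, not a method from the cited research: the allowlist `KNOWN_PACKAGES`, the function name `find_lookalike`, and the 0.85 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical allowlist of packages the team actually intends to use.
KNOWN_PACKAGES = ["beautifulsoup4", "requests", "numpy", "pandas"]


def find_lookalike(name, known=KNOWN_PACKAGES, threshold=0.85):
    """Return the known package that `name` closely resembles, or None.

    An exact (case-insensitive) match is treated as legitimate and
    returns None, as does a name that resembles nothing on the list.
    The threshold is an arbitrary value picked for this sketch.
    """
    candidate = name.lower()
    if candidate in known:
        return None
    best, best_ratio = None, 0.0
    for legit in known:
        ratio = SequenceMatcher(None, candidate, legit).ratio()
        if ratio > best_ratio:
            best, best_ratio = legit, ratio
    return best if best_ratio >= threshold else None


if __name__ == "__main__":
    for requested in ["BaeutifulSoup4", "beautifulsoup", "requests"]:
        print(requested, "->", find_lookalike(requested))
```

A check like this could run before an install command and ask for confirmation whenever a requested name is close to, but not identical to, a package on the allowlist.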
- Masquerading
In the context of malicious packages, masquerading means impersonating a known package or repository so that it looks legitimate and deceives developers. Developers who are not sure of the exact package name can fall for this method, because the attacker copies the original package and its content exactly.
- Trojan Package
A Trojan package is designed to look like a package containing legitimate code but has hidden code that can harm the victim’s computer or steal the victim’s data. For example, the attacker creates a library that works as advertised, such as a command-line color changer, but hides malicious code within it. This malicious code is often small or difficult to understand because it is obfuscated. As a result, it becomes challenging to distinguish the library’s legitimate functionality from its malicious components.
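One rough way to surface the kind of hidden, obfuscated code described above is to scan a package’s source for calls that commonly appear in such payloads. The sketch below uses Python’s standard `ast` module; the list in `SUSPICIOUS_CALLS` and the `SAMPLE` source are assumptions made for illustration, and a hit is only a reason to look closer, not proof of malice.

```python
import ast

# Assumed, non-exhaustive list of calls that often appear in obfuscated payloads.
SUSPICIOUS_CALLS = {"exec", "eval", "compile", "__import__", "b64decode"}


def suspicious_calls(source):
    """Return sorted (line, call_name) pairs for suspicious calls in Python source.

    The source is only parsed, never executed.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            continue
        if name in SUSPICIOUS_CALLS:
            findings.append((node.lineno, name))
    return sorted(findings)


# Hypothetical package source: a harmless helper plus a hidden payload.
SAMPLE = '''import base64
def colorize(text):
    return "[red]" + text
exec(base64.b64decode("cHJpbnQoJ2hpJyk="))
'''

if __name__ == "__main__":
    for line, name in suspicious_calls(SAMPLE):
        print(f"line {line}: call to {name}")
```

Legitimate libraries also use these calls, so the output is best treated as a shortlist for manual review.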
- Dependency Confusion
Many packages declare their own dependencies, and the package manager downloads all of them when the package is installed. According to JFrog, a company that provides a DevOps platform that powers and controls the software supply chain, most package managers by default prefer the external package with the highest version number. Dependency Confusion exploits this behavior: if an attacker publishes a malicious package with the same name as a dependency and gives it a higher version number than the original, the package manager installs the attacker’s version.
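The version-preference behavior described above can be illustrated with a toy resolver. This is a simplified model, not the logic of any real package manager: the function names, the index labels, and the package name `acme-billing` are hypothetical, and versions are assumed to be plain dotted numbers.

```python
def parse_version(version):
    """Turn a dotted numeric version such as '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def naive_resolve(name, indexes):
    """Pick the highest version of `name` across all indexes.

    `indexes` maps an index label to {package_name: [versions]}.
    Returns (index_label, version), or None if the package is unknown.
    """
    candidates = [
        (parse_version(v), label, v)
        for label, packages in indexes.items()
        for v in packages.get(name, [])
    ]
    if not candidates:
        return None
    _, label, version = max(candidates)
    return label, version


def confusable_names(internal_names, public_index):
    """Return internal package names that also exist on the public index."""
    return sorted(set(internal_names) & set(public_index))


if __name__ == "__main__":
    indexes = {
        "private": {"acme-billing": ["1.2.0"]},
        # Attacker-published package with an inflated version number.
        "public": {"acme-billing": ["99.0.0"], "requests": ["2.31.0"]},
    }
    print(naive_resolve("acme-billing", indexes))
    print(confusable_names(["acme-billing", "acme-auth"], indexes["public"]))
```

Running a check like `confusable_names` against the public registry is one way to notice that an internal name has been claimed by someone else.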
- Injections on Package
This method is one of the most difficult to carry out because of its complexity. The attacker must inject malicious code into a package that is already known and used by many developers. This can happen by contributing to an open-source project with hidden or obfuscated snippets inside seemingly harmless code, or by gaining access to the account of the package’s developer or maintainer.
- Repo Hijacking
When it comes to open source and software development, GitHub is the first platform that comes to most developers’ minds, and most open-source repositories are developed there. Repositories can be hijacked by exploiting a logical loophole in GitHub. When a developer changes their username, the old username becomes available to any other user, and the developer’s repositories are automatically redirected to their location under the new username. However, if another user takes the old username and creates a repository with the same name, the redirection is removed, and the user who now holds the old username owns the repository at the original address.
An example is the Arch User Repository (AUR) PKGBUILD file discussed on Vracken’s blog. In that file, makepkg clones the code from blacksphere/blackmagic and installs it. A request for this source is redirected to blacksphere-debug/blackmagic: because the blacksphere username was changed to blacksphere-debug, makepkg automatically reaches the repository under the new username. However, anyone who takes the blacksphere username and creates a repository with the same name puts users of the original blacksphere package at risk.
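The redirect behavior described above can be modeled in a few lines. This is a toy model written for this article, not GitHub’s implementation: `resolve_repo` and its two data structures are hypothetical, and the only rule it encodes is that an existing repository takes precedence over a rename redirect.

```python
def resolve_repo(owner, repo, repositories, redirects):
    """Resolve owner/repo in a toy model of rename redirects.

    `repositories` is a set of (owner, repo) pairs that currently exist.
    `redirects` maps an old (owner, repo) pair to its new location.
    A repository that really exists wins over a redirect, which is the
    behavior the hijack relies on. Returns None if nothing matches.
    """
    key = (owner, repo)
    if key in repositories:
        return key
    return redirects.get(key)


if __name__ == "__main__":
    repositories = {("blacksphere-debug", "blackmagic")}
    redirects = {("blacksphere", "blackmagic"): ("blacksphere-debug", "blackmagic")}

    # Before the hijack, the old address follows the redirect.
    print(resolve_repo("blacksphere", "blackmagic", repositories, redirects))

    # Someone registers the freed username and recreates the repository.
    repositories.add(("blacksphere", "blackmagic"))
    print(resolve_repo("blacksphere", "blackmagic", repositories, redirects))
```

The second lookup returns the attacker-controlled repository even though the build file that references it has not changed, which is why pinning sources to a commit hash or to the new address is worth considering.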
How Can a Malicious Package Harm a System?
From various observations, it seems that malicious packages are most often used for:
- Cryptocurrency theft,
- Information stealing,
- Deploying Remote Access Trojans.
According to the article “Backstabber’s Knife Collection: A Review of Open Source Software Supply Chain Attacks,” 54% of the attacks aim at data exfiltration. Together with the widespread use of stealer malware, this suggests that the danger lies more in the theft of credentials than in damage to the victim’s system.
Recent Threat Observed Using Malicious PyPI Packages: The W4SP Stealer
Towards the end of 2022, multiple malicious packages were discovered on PyPI. These packages deploy stealer malware on the computers of Python developers. Researchers have observed that the stealers, which appear under the names Celestial Stealer, Leaf $tealer, ANGEL Stealer, and @skid stealer, are actually variants of the W4SP Stealer software.
W4SP Stealer was shared on GitHub by a user named loTus04 with the phrase “for educational purposes only” in the description. The repository contains detailed information about using W4SP, its API, and its other features. Although it was removed from GitHub when it was first discovered, the developer loTus04 has republished the stealer.
Although the primary owner of the software is loTus04, the source code also credits an actor named billythegoat356. A look at billythegoat356’s GitHub profile shows many malicious projects (the account currently holds 44 repositories), such as a Discord token grabber, a Python obfuscator, and a Network Layer 4 Denial of Service (DoS) tool. A notable case is the command-line coloring package named “pystyle.”
Although there are currently no malicious commands in the “pystyle” package, its developers include loTus01 (most probably loTus04 under another name), and there is always a risk that malicious code will be added in upcoming package updates.
Recent Observations: The Shift Key Attack – An Emerging Threat in the npm Ecosystem
Recent research by Checkmarx has disclosed new observations in the npm ecosystem. Cybercriminals have been exploiting package dependencies using a technique known as the Shift Key attack. This attack involves adding or removing capital letters in package names to deceive developers into using malicious packages that closely resemble legitimate ones.
The researchers found 3,815 npm packages with capital letters in their titles, and about 50% (1,900) were at risk of lowercase typosquatting. Rogue npm packages can compromise enterprise networks, posing risks such as information theft, ransomware, crypto mining, and Denial of Service (DoS).
Since 2017, hackers have mimicked legitimate npm packages using lowercase spelling. Before a policy change that year, packages on npm could be named using uppercase letters, which left thousands of existing packages with capital letters in their titles vulnerable to lowercase typosquatting. npm has patched this vulnerability, but organizations should still check any downloaded packages they use.
The prevalence of Shift Key attacks in the npm ecosystem is alarming and underscores the importance of securing package management systems. These recent findings demonstrate the need for developers to be increasingly vigilant in using open-source repositories.
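The case-variation risk described in this section can be checked mechanically against a dependency list. The sketch below is an illustration rather than the researchers’ tooling; the function names and the package names in the usage example are hypothetical.

```python
def lowercase_squat_targets(package_names):
    """Map each name containing capitals to the lowercase name an attacker could register."""
    return {name: name.lower() for name in package_names if name != name.lower()}


def colliding_names(package_names):
    """Group names that differ only by letter case."""
    groups = {}
    for name in package_names:
        groups.setdefault(name.lower(), []).append(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical dependency list.
    deps = ["MyParser", "myparser", "left-helper"]
    print(lowercase_squat_targets(deps))
    print(colliding_names(deps))
```

Any dependency reported by `lowercase_squat_targets` deserves a manual check that the lowercase name on the registry belongs to the same publisher.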
A Move from PyPI: Suspension
On May 20, PyPI announced in a status update that it had suspended all user sign-ups and package uploads. The move came after an ongoing attack led to a surge in malicious users and projects on the index, which overwhelmed the maintainers’ ability to respond effectively. According to the official Python Infrastructure Status page, the suspension was enforced as a protective measure.
One day later, on May 21, the suspension was lifted.
RATs hidden in packages: TurkoRAT Infostealer
ReversingLabs recently warned about malicious packages in the npm repository that contained an open-source info-stealer named TurkoRAT. The malware can collect various types of sensitive data from infected machines, including login credentials, cryptocurrency wallets, and website cookies. It is activated when the package is run and relies on commands concealed within the index.js file.
The packages were downloaded approximately 1,200 times in two months before being detected and removed from the npm repository.
TurkoRAT is an open-source Remote Access Trojan, and its GitHub page shows that it continues to be developed and strengthened. The to-do list on that page indicates that the developer is working on a Firefox stealer, a VPN and Gaming Grabber, and a keylogger. As with W4SP Stealer, the use of TurkoRAT may well increase.
Developers and users of these platforms should be careful and proactively protect their systems and data from such threats. It is vital to check the packages used by the development teams and look for unusual details such as typos or strange version numbers. The latest incident reminds us of the persistent and evolving threats of the digital landscape.
As technology has evolved, software development has become more complex and interconnected. Organizations use open-source and third-party dependencies more often because they make the software development cycle easier.
It has become common for software developers to use functions and libraries that are available as open source rather than writing everything themselves. In line with the DRY (Don’t Repeat Yourself) principle, which aims to reduce repetition and redundancy in code, pre-built libraries and frameworks can be reused across projects without forcing developers to reinvent the wheel and disrupt the software development process. However, this convenience also comes with risks, including the possibility of attacks from malicious packages.
These attacks can compromise software security and have severe consequences for organizations. Therefore, developers and organizations must be aware of the dangers of malicious packages and take steps to defend against them. By understanding the methods cybercriminals use to distribute malicious packages and how these packages can harm systems, organizations can better prepare themselves to prevent these attacks and keep their software secure.
In addition, there are some preventive measures that organizations can follow:
- Conduct a software bill of materials (SBOM) analysis: An SBOM is a complete inventory of all the components that make up a software product. Organizations should require their suppliers to provide an SBOM for all software products and components.
- Check the package and its dependencies on the development platform: It is vital to check the GitHub repositories of the components to be used in the development process. The development team’s checklist should contain the following:
- The repository history (which can be checked through Git’s own version control),
- The number of Forks, Stars, and Contributors to the project,
- The creation date of the repository,
- And last but not least, the name of the creator(s), which should also be cross-checked against package management databases such as PyPI and npm.
These steps help distinguish fake (malicious) repositories from real ones and reduce the associated risk.
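The checklist above can be turned into a small automated pre-check once the repository details have been collected. The sketch below assumes the data has already been gathered by hand or by other tooling; the dictionary keys and the thresholds (180 days, 50 stars, 2 contributors) are arbitrary assumptions for illustration, not recommended values.

```python
from datetime import date


def repo_warnings(info, today, min_age_days=180, min_stars=50, min_contributors=2):
    """Return human-readable warnings for a repository described by `info`.

    `info` is a plain dict with the hypothetical keys: created (date), stars,
    contributors, repo_owner, and registry_maintainers (list of names).
    """
    warnings = []
    if (today - info["created"]).days < min_age_days:
        warnings.append("repository is very new")
    if info["stars"] < min_stars:
        warnings.append("few stars")
    if info["contributors"] < min_contributors:
        warnings.append("few contributors")
    if info["repo_owner"] not in info["registry_maintainers"]:
        warnings.append("repository owner is not a registry maintainer")
    return warnings


if __name__ == "__main__":
    # Hypothetical repository details collected during review.
    candidate = {
        "created": date(2023, 5, 1),
        "stars": 3,
        "contributors": 1,
        "repo_owner": "someuser",
        "registry_maintainers": ["otheruser"],
    }
    for warning in repo_warnings(candidate, today=date(2023, 6, 1)):
        print("warning:", warning)
```

Stars and forks can be inflated, so warnings like these are a prompt for a closer look rather than a verdict.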
- Monitor the Software Supply Chain: Monitoring the supply chain has two main benefits: improving general security posture and providing Cyber Threat Intelligence (CTI).
- General security posture benefits include:
- Preventing cyber threats, identifying vulnerabilities, and proactively searching for potential threats within an organization’s systems.
- Reducing the opportunities that a complex supply chain, with its multiple vendors and partners, gives cybercriminals to exploit vulnerabilities.
- Lowering the risk of the reputational damage and revenue loss that a security breach in the supply chain can cause.