Overview of the Internet as an Attack Vector: Censys State of The Internet Report
The Internet is a vast network that has revolutionized our daily lives. It encompasses many technologies, including web servers, content delivery networks, and cloud computing. Web entities’ content served over HTTP (like websites, web-based control panels, and APIs) has become integral to our daily routines. They enable us to shop, read the news, and connect with loved ones.
Since its inception, the Internet has been a transformative force, reshaping how we communicate, conduct business, and access information. As it has grown and evolved, it has incorporated many technologies ranging from web servers, content delivery networks, and cloud computing to the Internet of Things. However, this continuous adoption of new technologies has also brought challenges.
Most people now have at least a basic understanding of security on the Internet, which has been part of our lives for years. Cybersecurity has become an established sector, and network security is critical for organizations and even states. So, what does the Internet look like today? To answer this, Censys publishes a state report that provides a comprehensive overview of the vast domain of the Internet.
Internet or Web?
The internet and the web are closely related but distinct concepts. The internet is a network of networks connecting devices globally. It serves as the foundational infrastructure supporting various services and protocols, with TCP/IP as its fundamental communication protocol suite. The web, or World Wide Web, is a specific subset of the internet: a collection of interconnected documents and resources accessible over the internet via browsers, primarily using HTTP or HTTPS. Censys defines web entities as services on the internet running HTTP, accessed by IP or domain name.
The report then splits this view in two. There are two perspectives: the unnamed internet, where hosts respond to requests addressed to a bare IP address, and the named internet, viewed independently of physical IPs and referenced by a name.
Two protocol features make this duality workable, each for a different reason. HTTP, from version 1.1 onward, mandates the inclusion of a “Host” header in client requests. The header tells the server which hostname the client is asking for, eliminating the need for a dedicated IP address for each domain.
TLS SNI, an extension of the TLS protocol, allows clients to indicate the server’s hostname before establishing a secure connection, ensuring the server responds with a hostname-specific certificate. Together, these mechanisms enable “Virtual Hosting,” where web servers respond differently based on client requests. Censys recognizes the importance of these protocols in data accuracy, adopting name-based network scanning and introducing “web entities” to provide a more comprehensive and modern view of web-based assets beyond bare IP addresses, ensuring a thorough understanding of the internet landscape.
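The name-based dispatch described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Censys or any particular web server implements it; the hostnames `shop.example` and `blog.example`, the `SITES` table, and both function names are hypothetical.

```python
def build_request(hostname: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request carrying the mandatory Host header."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {hostname}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def select_site(raw_request: bytes, sites: dict, default: str) -> str:
    """Pick the content to serve by reading the Host header (virtual hosting)."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii", errors="replace")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower()
            # Drop an optional port such as "shop.example:8080"
            # (IPv6 literals are not handled in this sketch).
            if ":" in host:
                host = host.rsplit(":", 1)[0]
            return sites.get(host, default)
    return default


# Hypothetical sites that share a single IP address
SITES = {"shop.example": "shop front page", "blog.example": "blog front page"}

print(select_site(build_request("shop.example"), SITES, "default page"))
print(select_site(build_request("blog.example"), SITES, "default page"))
```

The same server answers differently depending only on the name the client sends. On the TLS side, Python's `ssl.SSLContext.wrap_socket(sock, server_hostname=...)` is the call that places the hostname in the SNI extension, which lets the server choose the matching certificate.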
HTTP Protocol
The Internet’s most prominent component is the myriad of services running over HTTP. These services are not limited to websites but extend to load balancers, web-based APIs, and more. The previous year’s report highlights how HTTP is omnipresent, accounting for a staggering 88% of the services observed on the Internet.
From a snapshot of the Internet in early 2023, Censys scan data revealed some intriguing numbers:
- Over 740 million hosts were found running approximately 1.3 billion HTTP services.
- This vast number includes 165 million unnamed hosts, which are only accessible by their bare IPv4 address, and over 570 million named or virtual hosts.
Seen from a cybersecurity perspective, these numbers show how many potential targets exist. Roughly three-quarters of a billion hosts and over a billion HTTP services also run a wide range of plugins and components, which shows how many distinct targets there can be when the internet is considered a single attack surface.
For example, a study by Zippia says that 43.1% of all websites use WordPress. More than 300 vulnerabilities have been found in WordPress so far, including critical and known exploited ones. Moreover, vulnerabilities in WordPress plugins can affect hundreds of thousands of hosts. Since websites are the main way most organizations interact with their clients, including them in the security picture is integral to reducing the attack surface and achieving digital resilience.
Diving deeper into these services:
- Nearly 18% ran on servers hosted by major cloud providers such as Amazon, Oracle, GCP, or Azure.
Cloud services already cover a large part of the internet, and their share looks set to keep growing. However, concentrating so many services with a small number of providers also concentrates risk. AWS’s high market share means that a vulnerability in AWS could expose many of the organizations operating on its cloud servers. Likewise, a misconfiguration on Microsoft Azure caused many cloud buckets to leak to the public.
SOCRadar’s abilities extend even to the Cloud. Our Cloud Security Module aims to enhance customers’ cloud storage security. The CSM allows for the identification of cloud buckets as assets and can distinguish between “public,” “private,” or “protected” statuses. SOCRadar issues a “Cloud Bucket Detected” alert when new cloud storage is identified, integrating it into digital assets. The CSM continually monitors the status of buckets and sends a “Cloud Bucket Status Change” alert in case of any alterations, such as a transition from “private” to “public.”
- Web server technologies were observed on over half of all HTTP services on both named and unnamed hosts. Apache and Nginx emerged as the most popular among them.
Likewise, the widespread use of the same few technologies means that a single security vulnerability can affect millions of servers. On the SOCRadar Vulnerability Intelligence page, the “Apache” search query reveals over 2,000 vulnerabilities. These vulnerabilities in Apache, which is used in a quarter of the entire Internet, create a huge vulnerable attack surface in HTTP technology.
In this context, it has become essential to properly manage, configure, and patch the technologies used by your organization. Even if threat actors do not target you specifically, many of your internet-facing assets will be targeted. Of course, the use of these technologies has become mandatory, but it is also our responsibility to protect this attack surface. The SOCRadar Attack Surface Management module, which automatically provides additional visibility and context regarding the severity of unknown external-facing digital assets, may help you in this regard.
- There are instances of discontinued and/or scrutinized web server products, especially on unnamed hosts. This suggests that non-optimal security practices might be in place on these devices.
There are also more apparent vulnerabilities. According to the report, some web servers beyond the top 20 raise security concerns, particularly on unnamed hosts, notably Hikvision and Boa. Hikvision web servers, which manage video surveillance products, had a critical unauthenticated command injection vulnerability (CVE-2021-36260) in 2021, enabling unauthorized access. Approximately 1.5 million Hikvision hosts, primarily unnamed, may still be vulnerable. Boa, an open-source server for embedded applications, has been discontinued but still faces ongoing security issues. With over a million servers detected, many on unnamed hosts, Boa’s vulnerabilities persist, including those exploited in attacks on critical infrastructure.
X.509 Certificates
Another building block, X.509 certificates, commonly called SSL or TLS certificates, is foundational to the modern internet’s security infrastructure. They are the backbone for ensuring encrypted and authenticated communication between clients and servers. While their presence is often taken for granted, their role in maintaining a secure internet cannot be overstated. These certificates have more than one role.
- Encryption for Web Traffic:
Certificates play a pivotal role in enabling encryption for web traffic. This ensures malicious actors cannot easily intercept and decipher the data between a client and a server. Gone are the days when one could simply use a network sniffer on the same broadcast domain to monitor unencrypted web traffic. The widespread adoption of certificates has rendered such practices largely obsolete.
- Identity Verification:
Certificates also serve as a means of identity verification. When an entity seeks a certificate from a trusted certificate authority (CA), it must furnish evidence of its identity. This could range from providing proof of domain ownership (Domain Validation or DV) to more rigorous checks like meeting with an employee from the organization requesting the certificate (Extended Validation or EV). When a site presents a certificate from a CA, it is a testament to its verified identity.
The Issues
One issue is that these certificates have expiration dates, and a certificate that lapses unnoticed leaves visitors facing browser warnings. For more information, be sure to check out our blog “How to Monitor Your SSL Certificates Expiration Easily and Why”.
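As a rough illustration of expiration monitoring, the sketch below classifies a certificate by the time left before its notAfter date. It relies on `ssl.cert_time_to_seconds` from the Python standard library; the function names and the 30-day warning threshold are assumptions chosen for this example, not a recommendation from the report.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return the number of days left before a certificate's notAfter date.

    `not_after` uses the text format found in the dict returned by
    ssl.SSLSocket.getpeercert(), e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def expiry_status(not_after: str, warn_days: int = 30,
                  now: Optional[float] = None) -> str:
    """Classify a certificate as 'expired', 'expiring soon', or 'ok'."""
    remaining = days_until_expiry(not_after, now)
    if remaining < 0:
        return "expired"
    if remaining <= warn_days:
        return "expiring soon"
    return "ok"


# A date in the past, so this prints "expired"
print(expiry_status("Jan  5 09:34:43 2018 GMT"))
```

In practice, the `not_after` string would come from the `notAfter` field of a live connection's `getpeercert()` result, and the check would run on a schedule for every certificate the organization owns.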
Another problem is the false sense of security they provide. The presence of a certificate on a website should not be misconstrued as a guarantee of its legitimacy. Browsers use a padlock icon to indicate that a site uses HTTPS, which might lead users to believe that the site is secure. Yet, threat actors have exploited this perception by creating certificates for phishing sites to make them appear more legitimate.
While free certificate services like Let’s Encrypt have democratized the use of certificates, they have also made it easier for malicious actors to obtain seemingly legitimate certificates for malicious purposes. The Censys report also shows Let’s Encrypt’s dramatic share of the market.
Although most of these certificates are not malicious, the fact that Let’s Encrypt has gained such a large market share only since 2014 also shows how open the ecosystem can be to exploitation.
HTTP plus X.509: HTTPS
HTTPS is a secure version of the standard HTTP used for transferring data between a web browser and a website. The ‘S’ in HTTPS means ‘Secure,’ indicating that the communication is encrypted. This encryption, provided by TLS or its predecessor SSL, protects sensitive transactions like online banking and login credentials, ensuring the security and privacy of information exchanged between the user and the website.
Before the widespread adoption of HTTPS, web entities predominantly used plain text HTTP, exposing users’ sensitive information and passwords to potential interception through unencrypted transports and Man-in-the-Middle attacks. The introduction of certificates has played a pivotal role in promoting HTTPS. Certificates are essential for establishing trust between web clients and properties, forming the foundation for secure communication and safeguarding sensitive information exchanged between them.
A potential weakness that stands out here is that HTTPS is either not used at all or, where it is used, relies on old versions of TLS.
As shown in the above graphs, Censys found that almost 60% of web entities were using unencrypted HTTP without SSL/TLS.
As a side note, HTTPS is typically recommended for secure online communication, but in certain cases, like websites sharing non-sensitive information or prioritizing performance, HTTP might be seen as more practical. During development and testing, HTTP is often chosen to simplify processes. The above graphs include various types of HTTP services on the internet, not just user-facing websites. However, the overall trend in the industry is moving towards using HTTPS universally to ensure security and privacy, even for websites that do not handle sensitive data directly.
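The two weaknesses named above, missing HTTPS and outdated TLS, can be told apart programmatically. The following sketch assumes protocol names in the form returned by Python's `ssl.SSLSocket.version()`; the set of versions treated as legacy and both function names are assumptions of this example (TLS 1.0 and 1.1 were formally deprecated by RFC 8996).

```python
import ssl
from typing import Optional

# Which protocol names count as legacy is an assumption of this sketch.
LEGACY_PROTOCOLS = {"SSLv2", "SSLv3", "TLSv1", "TLSv1.1"}


def classify_transport(negotiated: Optional[str]) -> str:
    """Label a service by the protocol it negotiated (None means plain HTTP)."""
    if negotiated is None:
        return "unencrypted"
    if negotiated in LEGACY_PROTOCOLS:
        return "legacy TLS/SSL"
    return "modern TLS"


def make_strict_context() -> ssl.SSLContext:
    """Create a client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


for version in (None, "TLSv1", "TLSv1.3"):
    print(version, "->", classify_transport(version))
```

A client built with `make_strict_context()` fails the handshake against a server that only offers older protocol versions, which is one simple way to surface such servers during an internal audit.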
Associated Risks
Now that we have an overview of the main technologies and understand the attack surface, we can look at other risks that arise from them.
Data Leaks on the Web
While abuse of a web entity can cause harm in many ways, one of the most critical points is data security. Data leaks and breaches have become alarmingly frequent in the digital age, posing significant risks to organizations and individuals.
Before identifying the risk, the distinction between the two is crucial: while a data leak refers to the unintentional exposure of sensitive information, a data breach is an intentional unauthorized access to such data for malicious purposes. For instance, a data leak might arise from a misconfigured cloud storage service that inadvertently allows unauthorized access to confidential data. Both scenarios can lead to severe repercussions, including financial losses, damage to reputation, and potential regulatory penalties.
As stated in the report, over the past decade, some of the most notable data breaches weren’t the result of sophisticated hacking techniques but rather stemmed from simple misconfigurations or oversight. These incidents underscore the importance of robust security measures and the potential consequences of neglecting them.
The Prevalence of Data Leaks
We can step away from the Censys report for a moment and look at Verizon’s 2023 Data Breach Investigations Report (DBIR) for some insights. According to the DBIR, 74% of all breaches involve the human element: individuals are implicated through errors, privilege misuse, stolen credentials, or social engineering tactics. External actors are involved in 83% of breaches, and the predominant motivation for these attacks is financial gain, which accounts for 95% of breaches. The top methods attackers employ to infiltrate an organization include using stolen credentials, phishing schemes, and exploiting vulnerabilities.
In its report, Censys provides critical numbers. Their research reveals over a thousand hosts exposing more than two thousand SQL database files without authentication on their HTTP services, posing significant security risks. The situation extends to 18,000 CSV files on 147 hosts, commonly used for tabular data storage. Additionally, over 5,000 hosts expose various files and directories related to backups, potentially containing sensitive information. Notably, 400 publicly accessible WordPress configuration files increase the risk of unauthorized database access. With WordPress widely used on over 13,000 hosts for public-facing websites, these vulnerabilities demand urgent attention. The exposures also include diverse risky data types, emphasizing the need for robust security measures to mitigate potential threats.
As the report notes, this information illustrates that a “vulnerable” host extends beyond servers with outdated or exploitable software. Vulnerabilities can stem from diverse sources, such as errors in judgment, misconfigurations, and hastily executed tasks.
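To make the exposure categories above concrete, here is a small sketch that sorts URLs found on one's own HTTP services into the same buckets: SQL files, CSV files, backups, and WordPress configuration files. The matching rules are illustrative assumptions, not the criteria Censys used, and the function and constant names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlsplit

# Illustrative patterns only; real-world rules would need tuning.
RISKY_SUFFIXES = {".sql": "database dump", ".csv": "tabular data export"}
BACKUP_HINTS = ("backup", ".bak", ".old")


def classify_exposure(url: str) -> Optional[str]:
    """Return a risk label for a publicly reachable URL, or None if benign."""
    path = PurePosixPath(urlsplit(url).path.lower())
    if path.name.startswith("wp-config"):
        return "WordPress configuration"
    if path.suffix in RISKY_SUFFIXES:
        return RISKY_SUFFIXES[path.suffix]
    if any(hint in str(path) for hint in BACKUP_HINTS):
        return "backup file or directory"
    return None


for url in (
    "http://192.0.2.10/dump.sql",
    "http://192.0.2.10/exports/users.CSV",
    "http://192.0.2.10/wp-config.php.bak",
    "http://192.0.2.10/backup/site.tar.gz",
    "http://192.0.2.10/index.html",
):
    print(url, "->", classify_exposure(url))
```

Running a check like this against a crawl of an organization's own web roots gives a quick list of files that should not be served without authentication.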
The Consequences of Data Leaks
The ramifications of data leaks can be far-reaching. Beyond the immediate financial implications, organizations may face reputational damage that can have long-term effects on customer trust and business operations. The importance of recognizing that a “vulnerable” host encompasses a broader spectrum, including errors in judgment, misconfigurations, and hastily executed tasks, cannot be overstated.
Quick solutions today may inadvertently result in severe data breaches tomorrow. This comprehensive understanding underscores the need for a proactive and holistic approach to cybersecurity, incorporating robust measures to mitigate risks originating from diverse sources and ensuring the protection of sensitive information in the ever-evolving digital landscape. Regulatory penalties, especially in sectors with stringent data protection regulations, can further compound the challenges.
SOCRadar may help you avoid these threats in the first place, and in the event of a breach, it continues to support your security with its data leak detection service.
Friend or Foe
Organizations are constantly integrating new tools and technologies to enhance their operations and services in the rapidly evolving digital landscape. However, as Censys also stated, with the adoption of these tools comes the responsibility of ensuring their security. A common misconception is that tools, especially those from reputable vendors, are inherently secure right out of the box. This assumption can lead to significant security vulnerabilities.
As organizations and website architectures scale, monitoring and documenting them becomes increasingly intricate. Developers have responded to this challenge by creating tools that leverage dynamic feeds and connectivity to ever-expanding networks and services. These tools are invaluable for tracking system behavior and performance, and even for self-documenting backend API functionality.
However, this convenience comes at a cost. In their quest for flexibility and simplicity, some of these systems may inadvertently prioritize those qualities over security. This can lead to scenarios where monitoring software and API endpoints, if not adequately protected, become potential gateways for malicious actors. The clearest example of this phenomenon is the firewall purchased and deployed to protect the organization: critical vulnerabilities, even in security products, have been a rising attack vector in recent years.
Case of Prometheus
One monitoring tool examined by Censys illustrates the potential risk. Prometheus adapts to the changing web landscape with its capacity to effortlessly discover and monitor an organization’s assets in dynamic environments like Kubernetes or AWS. It pulls data from HTTP endpoints, offering flexibility through static configuration or dynamic auto-discovery via cloud connectors and DNS Service Discovery.
However, a notable security concern arises because Prometheus by default assumes that all users, even untrusted ones, may access its data, which creates a risk of unauthorized access. Over 41,800 Prometheus servers globally monitor 219,400 endpoints, with Amazon hosting a significant portion.
This exposure reveals potential vulnerabilities, especially in Amazon networks, where Prometheus servers may disclose information about private and public networks. Integrating such tools demands careful consideration to align with security needs and restrict access to trusted users.
If an attacker manages to gain access to an organization’s monitoring data, they could potentially identify other assets on the network, understand the organization’s infrastructure, and even pinpoint vulnerabilities to exploit. This could lead to unauthorized data access, system disruptions, and even data breaches.
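To see what such a disclosure looks like from the defender's side, the sketch below inspects the parsed JSON of a Prometheus `/api/v1/targets` response and lists scrape URLs that point at private IP addresses. The `SAMPLE` payload is made up for illustration and contains only the fields this example reads; the function name is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def private_targets(targets_response: dict) -> list:
    """List scrape URLs in a targets response that use private IP addresses."""
    found = []
    active = targets_response.get("data", {}).get("activeTargets", [])
    for target in active:
        scrape_url = target.get("scrapeUrl", "")
        host = urlsplit(scrape_url).hostname
        if host is None:
            continue
        try:
            address = ipaddress.ip_address(host)
        except ValueError:
            continue  # a DNS name rather than an IP literal
        if address.is_private:
            found.append(scrape_url)
    return found


# Made-up sample shaped like a /api/v1/targets response
SAMPLE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"scrapeUrl": "http://10.0.3.17:9100/metrics", "health": "up"},
            {"scrapeUrl": "http://8.8.8.8:9100/metrics", "health": "up"},
            {"scrapeUrl": "http://db.internal.example:9187/metrics", "health": "up"},
        ]
    },
}

print(private_targets(SAMPLE))
```

If an internet-facing Prometheus server returns entries like the first one, anyone who can reach it learns internal addresses and the services running on them, which is the kind of exposure that access restrictions are meant to prevent.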
Thus, it should be said that securely configuring software instances often falls on the end-users. Tool developers might provide the tools with default security settings, assuming that organizations will tailor them to their needs. However, if organizations fail to do so or are unaware of the potential risks, they expose themselves to significant threats.
Conclusion
The Internet is vast, intricate, and ever-evolving. As we navigate this expansive landscape, it becomes increasingly evident that our challenges are not solely the result of sophisticated hacking techniques or advanced exploits. Instead, many security issues highlighted in the “2023 State of the Internet Report” by Censys arise from improper patch management, misconfigurations, exposure issues, and simple mistakes.
To solve these problems, some security reflexes must be developed, and security solutions must be applied. The complicated tasks of asset management, vulnerability assessment, and patch management are paramount in reducing an organization’s attack surface. While these tasks may seem mundane, their significance cannot be overstated. Organizations can significantly bolster their security posture and mitigate potential threats by addressing these foundational issues.