SOCRadar® Cyber Intelligence Inc. | U.S. Elasticsearch Leak: 676M+ Identity Records & SSNs Exposed
Mar 03, 2026
Mar 10, 2026
Moon

U.S. Elasticsearch Leak: 676M+ Identity Records & SSNs Exposed

SOCRadar has identified a publicly accessible Elasticsearch instance containing over 676 million indexed U.S. identity records. The dataset was exposed to the internet without authentication, allowing unrestricted access to complete identity profiles, including full Social Security Numbers (SSNs), dates of birth, historical address records, and phone numbers.

The exposed instance contained highly sensitive personal data at a scale exceeding the current U.S. population, strongly indicating large-scale aggregation, historical retention, or multi-source identity consolidation.

Because automated scanners routinely target exposed Elasticsearch services, any publicly accessible dataset of this nature should be assumed to have been accessed or replicated before remediation.

Our team analyzed the dataset, validated samples, and initiated responsible disclosure procedures. In parallel with the analysis, we began efforts to identify the data owner and hosting provider in order to secure the exposure and take it offline. Below is a technical and threat intelligence overview.

Technical Snapshot of the Exposure

    • Indexed Records: ~676,798,*** (unique count unverified)
    • Total Data Size: 92 GB
    • Exposure Type: Publicly accessible, no authentication
    • Geographic Scope: United States
    • Data Fields: Full name, DOB, full address, city/state/ZIP, phone number, SSN
    • Severity: Critical

Elasticsearch dashboard showing 92 GB and over 676 million documents


The exposed Elasticsearch cluster revealed a single large index alongside default internal indices. The primary dataset contained structured identity records indexed in a searchable format.

What the Dataset Contained

Sample queries revealed structured identity documents containing:

  • First and last names
  • Full date of birth
  • Street address
  • City and state
  • ZIP code
  • Phone number
  • Full SSN

JSON identity record sample showing full name, DOB, address, phone, SSN


Unlike typical PII exposures involving only email or phone numbers, this dataset included government-issued identifiers paired with full identity attributes, significantly increasing misuse potential.

Sampling suggests:

  • Records associated with both living and deceased individuals
  • Historical address tracking (multiple addresses per individual)
  • Aggregated multi-source identity data
  • Duplicates or legacy entries

The total indexed count exceeding the U.S. population strongly indicates the dataset does not represent entirely unique individuals, but rather multiple records per identity over time.
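The gap between indexed records and unique individuals can be illustrated with a small tally. The following is a minimal sketch on synthetic data, not the method used in the analysis: the field name `ssn`, the helper names, and the salt value are all assumptions made for illustration. It groups records by a salted hash of the identifier so that raw values never appear in the tally.

```python
import hashlib
from collections import Counter


def identity_key(ssn, salt):
    """Return a salted SHA-256 digest of the digits in an identifier.

    Hashing keeps raw identifiers out of the tally; formatting such as
    dashes is stripped so "000-00-0001" and "000000001" match.
    """
    digits = "".join(ch for ch in ssn if ch.isdigit())
    return hashlib.sha256(salt + digits.encode("ascii")).hexdigest()


def summarize_duplication(records, salt=b"example-salt"):
    """Count total records versus distinct identities (hypothetical helper)."""
    counts = Counter(identity_key(r["ssn"], salt) for r in records)
    total = sum(counts.values())
    unique = len(counts)
    return {
        "total_records": total,
        "unique_identities": unique,
        "avg_records_per_identity": total / unique if unique else 0.0,
    }


# Synthetic records only; the "ssn" field name is an assumption.
sample = [
    {"ssn": "000-00-0001", "address": "1 First St"},
    {"ssn": "000-00-0001", "address": "22 Second Ave"},
    {"ssn": "000000002", "address": "3 Third Rd"},
]
summary = summarize_duplication(sample)
print(summary)
```

An average above 1.0 records per identity in a sample would be consistent with historical address tracking or duplicate entries, as opposed to one record per person.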

Sample Validation and Authenticity Indicators

Because the dataset is too large to verify record by record, validation relied on sampling.

In one instance, a record was cross-referenced with publicly available obituary information. The name, date of birth, and geographic details matched a recently deceased individual, supporting the authenticity of the sampled data.

Obituary page screenshot used for validation reference


This alignment, combined with previously observed mentions of approximately 250 million related data entries circulating on hacker forums, suggests that portions of this dataset may have already entered underground ecosystems.

Risk Assessment: Why This Exposure Is Severe

The presence of SSNs alongside full identity profiles elevates this case beyond routine PII leaks.

The primary risk consideration is permanence. Unlike passwords, identity attributes such as SSNs and dates of birth cannot be rotated and remain effectively permanent once exposed.

When combined with other breached datasets, this type of exposure enables:

  • High-confidence spear-phishing
  • Account recovery abuse
  • Credit fraud
  • Loan and benefits fraud
  • Executive targeting

From a threat intelligence perspective, large-scale identity datasets serve as infrastructure for fraud ecosystems rather than single-event exploitation.

Why Misconfigured Elasticsearch Instances Remain a Recurring Risk

This incident reflects a broader pattern.

In the comparable exposures we have previously identified, the root cause remained consistent:

  • Port 9200 exposed to the internet
  • Authentication disabled
  • Weak or absent network segmentation
  • Mismanaged cloud security configurations

Recurring exposure patterns of this nature reflect governance and cloud configuration management gaps rather than software vulnerabilities.
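The root causes listed above can often be caught in a configuration review. The sketch below assumes the node's settings have already been loaded into a flat dictionary keyed by setting name (for example, from `elasticsearch.yml`). `network.host` and `xpack.security.enabled` are real Elasticsearch settings, but the function name, the set of flagged values, and the finding messages are illustrative assumptions and not exhaustive.

```python
# Values of network.host that bind beyond the local machine.
# Illustrative and non-exhaustive; an assumption for this sketch.
WILDCARD_HOSTS = {"0.0.0.0", "::", "_global_"}


def audit_settings(settings):
    """Return a list of findings for a flat dict of Elasticsearch settings."""
    findings = []

    hosts = settings.get("network.host", [])
    if isinstance(hosts, str):
        hosts = [hosts]
    if any(str(h).strip() in WILDCARD_HOSTS for h in hosts):
        findings.append("network.host binds to all or global interfaces")

    security = settings.get("xpack.security.enabled")
    if security is None:
        # The default differs between Elasticsearch versions, so flag for review.
        findings.append("xpack.security.enabled is not set; verify the default")
    elif str(security).lower() == "false":
        findings.append("xpack.security.enabled is false (authentication disabled)")

    return findings


risky = {"network.host": "0.0.0.0", "xpack.security.enabled": False}
hardened = {"network.host": "_local_", "xpack.security.enabled": True}
print(audit_settings(risky))
print(audit_settings(hardened))
```

A check like this covers only the node configuration. Firewall rules and cloud security groups, which determine whether port 9200 is actually reachable, need to be reviewed separately.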

Elasticsearch is not inherently insecure; however, when deployed without authentication, network segmentation, and access control enforcement, it effectively becomes a publicly searchable identity repository.

Threat actors continuously scan for open Elasticsearch services. Once identified, extraction requires minimal effort and no exploitation.
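Defenders can run the same kind of check against their own assets. The following is a minimal sketch, intended only for hosts you own or are authorized to test. It requests the root endpoint and classifies the result: an unauthenticated Elasticsearch node typically answers `GET /` with a JSON document containing `cluster_name` and `version`, while a secured node returns HTTP 401. The function names and the return labels are assumptions for illustration.

```python
import json
import urllib.error
import urllib.request


def classify_response(status, body):
    """Classify an HTTP response from a suspected Elasticsearch root endpoint."""
    if status in (401, 403):
        return "auth-required"
    if status == 200:
        try:
            doc = json.loads(body)
        except ValueError:
            return "unknown"
        if isinstance(doc, dict) and "cluster_name" in doc and "version" in doc:
            return "open"
    return "unknown"


def check_own_endpoint(base_url, timeout=5.0):
    """Fetch base_url (e.g. a host on port 9200) and classify the answer.

    Run this only against infrastructure you are authorized to test.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return classify_response(resp.status, body)
    except urllib.error.HTTPError as err:
        # 401/403 arrive here as exceptions, not as normal responses.
        return classify_response(err.code, "")
    except (urllib.error.URLError, OSError):
        return "unreachable"


# Offline demonstration with canned responses; no network call is made here.
print(classify_response(200, '{"cluster_name": "demo", "version": {"number": "8.0.0"}}'))
print(classify_response(401, ""))
```

A result of "open" from an external vantage point means the node answers without credentials and should be treated as exposed.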

Response Actions: Takedown Effort

Upon discovery, SOCRadar initiated response tracks.

The exposed Elasticsearch instance appears to be hosted by a third-party hosting provider, while the actual data owner remains unidentified at the time of publication. Our analysts initiated outreach efforts to identify the responsible entity and coordinate remediation. The objective is clear: ensure the database is secured and no longer publicly accessible.

This structured response model demonstrates how exposure monitoring must move beyond discovery into verification, intelligence integration, and remediation coordination.

Infrastructure Hardening and Identity Risk Monitoring

To reduce exposure risks, organizations should:

  • Restrict Elasticsearch access to internal networks
  • Enable authentication and role-based access control
  • Enforce IP allowlisting
  • Disable direct internet exposure of port 9200
  • Monitor cloud configurations for misconfigured services
  • Implement continuous external attack surface visibility

Exposure discovery must extend beyond internal monitoring. Public-facing databases, shadow IT assets, and forgotten environments frequently remain outside traditional security oversight.

How SOCRadar Detects and Mitigates Identity Exposure

Internal security tools often fail to detect publicly exposed databases. Once a service becomes internet-facing, it moves outside conventional perimeter controls. SOCRadar bridges that visibility gap. By combining capabilities such as External Attack Surface Management (ASM), Sensitive Data Exposure Monitoring, and Brand Protection, we continuously scan for exposed services and identity datasets across the open internet.

When full identity records or SSN-linked datasets are identified, our team analyzes the exposure, validates authenticity, assesses risk, and notifies affected stakeholders.

This enables:

  • Early exposure detection
  • Fraud risk reduction
  • Executive and customer protection
  • Proactive remediation before exploitation escalates

SOCRadar’s Attack Surface Management, Digital Footprint


Industry Impact and Why This Discovery Matters

The exposure of approximately 676 million U.S.-based identity records, including full SSNs, represents one of the largest publicly accessible identity datasets identified in recent monitoring efforts.

Discoveries of this magnitude underscore the importance of continuous external exposure monitoring and structured remediation workflows. They also highlight how large-scale identity datasets can circulate unnoticed until actively identified and reported.

By proactively identifying and disclosing such exposures, SOCRadar reinforces its position as a leading authority in cyber threat intelligence and exposure monitoring. Large-scale discoveries not only protect affected individuals and organizations but also provide critical visibility into systemic infrastructure weaknesses that continue to fuel identity-driven fraud ecosystems.

Conclusion

A publicly exposed Elasticsearch instance containing approximately 676 million U.S.-based identity records, including SSNs and full address history, represents an extreme-scale identity risk.

Even if duplicate or historical entries exist, the presence of structured, searchable, government-issued identifiers in an unauthenticated database places this case in the Critical severity category.

This incident reinforces a persistent reality: misconfiguration, not zero-day exploitation, continues to drive some of the largest identity exposures on the internet.

In large-scale identity exposures, the distinction between misconfiguration and breach is often measured only by detection speed.

Continuous external monitoring, infrastructure hardening, and identity risk assessment are baseline requirements in an internet-facing environment where exposed data can become operational within minutes.