SOCRadar® Cyber Intelligence Inc. | Misconfigurations in Google Kubernetes Engine (GKE) Lead to a Privilege Escalation Exploit Chain

Dec 29, 2023

Misconfigurations in Google Kubernetes Engine (GKE) Lead to a Privilege Escalation Exploit Chain

A recent Unit 42 investigation revealed a dual privilege escalation chain affecting Google Kubernetes Engine (GKE). The chain arises from two misconfigurations: one in GKE’s FluentBit logging agent and one in Anthos Service Mesh (ASM). Combined, they could allow an attacker who already has access to a Kubernetes cluster to escalate privileges.

Kubernetes is a widely adopted open-source container platform, and GKE adds features for cluster deployment and management on top of it. However, the complexity of Kubernetes environments exposes them to security risks, which often stem from misconfigurations and excessive privileges.

The exploit chain becomes available when an attacker can execute code within the FluentBit container, for example through a remotely exploitable vulnerability, and the cluster has ASM enabled. It allows the attacker to take complete control of the Kubernetes cluster, which enables data theft, deployment of malicious pods, and disruption of cluster operations. Palo Alto Networks researchers describe this exploit chain as a second-stage cloud attack, in which attackers use existing access to a Kubernetes cluster to spread within it or escalate privileges.

The sections below walk through this dual privilege escalation chain. It relies on the default configuration of GKE’s logging agent, FluentBit, which runs automatically on all clusters, and on the default privileges of ASM, an optional add-on that provides control over service-to-service communication within a GKE environment.

The Attack Chain

To launch this second-stage attack, the attacker must first compromise the FluentBit container, either through a remote code execution or arbitrary file read vulnerability, or by breaking out of another container to gain access to the Node.

The first phase exploits a misconfiguration in the FluentBit container, which mounts the /var/lib/kubelet/pods volume by default. Under this directory, each pod on the Node has a kube-api-access volume containing its projected service account token. An attacker who compromises the FluentBit pod can read these tokens and use them to impersonate a pod with privileged access to the Kubernetes API. Depending on the privileges attached to the stolen tokens, the attacker may also be able to map the entire cluster and perform further malicious actions.

The second phase targets Anthos Service Mesh’s Container Network Interface (CNI) DaemonSet. The istio-cni-node DaemonSet is installed in the cluster when ASM is enabled, and its purpose is to configure the Istio CNI plugin on each node. The DaemonSet retains excessive permissions after installation, and an attacker can abuse them to create a new pod with powerful permissions.
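From a defender’s point of view, the mount described above can be looked for in workload manifests. The following is a minimal sketch, assuming the manifest has already been parsed into a Python dictionary (for example from JSON output of the cluster API). The function names, the sample manifest, and the `logging-agent` name are hypothetical and are not taken from GKE or from the Unit 42 report.

```python
from typing import List

# Host directory named in the article. Mounting it, or any parent of it,
# exposes the projected service account tokens of pods on the node.
SENSITIVE_HOST_PATH = "/var/lib/kubelet/pods"


def exposes_path(host_path: str, sensitive: str = SENSITIVE_HOST_PATH) -> bool:
    """Return True if host_path is the sensitive directory or one of its parents."""
    mounted = host_path.rstrip("/")
    target = sensitive.rstrip("/")
    if mounted == "":
        # The host root itself is mounted, so everything is exposed.
        return True
    return target == mounted or target.startswith(mounted + "/")


def risky_host_paths(manifest: dict) -> List[str]:
    """List hostPath volumes in a manifest that expose the pod token directory.

    Works for bare Pods (spec.volumes) and for templated workloads such as
    DaemonSets (spec.template.spec.volumes).
    """
    spec = manifest.get("spec", {})
    pod_spec = spec.get("template", {}).get("spec", spec)
    findings = []
    for volume in pod_spec.get("volumes") or []:
        host_path = (volume.get("hostPath") or {}).get("path")
        if host_path and exposes_path(host_path):
            findings.append(f"{volume.get('name', '<unnamed>')}: {host_path}")
    return findings


if __name__ == "__main__":
    # Hypothetical DaemonSet used only to exercise the check.
    sample = {
        "kind": "DaemonSet",
        "metadata": {"name": "logging-agent"},
        "spec": {"template": {"spec": {"volumes": [
            {"name": "varlog", "hostPath": {"path": "/var/log"}},
            {"name": "kubelet-pods", "hostPath": {"path": "/var/lib/kubelet/pods"}},
        ]}}},
    }
    for finding in risky_host_paths(sample):
        print("risky hostPath volume:", finding)
```

The check only flags mounts of the directory itself or of a parent directory. A mount of a single pod’s subdirectory would expose only that pod’s token and is not reported by this sketch.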

The ASM misconfiguration (Palo Alto Networks)

Chaining these two issues allows the attacker to take complete control of the Kubernetes cluster by escalating privileges to cluster admin.

After compromising the FluentBit container, the attacker abuses FluentBit’s default configuration, which mounts the /var/lib/kubelet/pods volume. This gives access to the kube-api-access- directory and token of every pod on the Node. Because FluentBit runs as a DaemonSet, the attacker can repeat this compromise on each node, mapping the entire cluster and identifying tokens, including the Istio-Installer-container token.

The attacker then uses the excessive permissions that the ASM CNI DaemonSet retains after installation to create a new pod in the kube-system namespace. Looking for a powerful service account, the attacker selects the clusterrole-aggregation-controller (CRAC) service account, which can add arbitrary permissions to existing cluster roles. By setting CRAC as the service account in the new pod’s YAML file, the attacker causes the CRAC token to be mounted into that pod. Finally, the attacker exploits the FluentBit misconfiguration once more to read the CRAC token. That token grants cluster admin privileges, which completes the chain.
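The second half of the chain depends on RBAC grants that are broader than they need to be. As a rough illustration, the sketch below scans the rules of a ClusterRole, given as a parsed dictionary, for a few grants of the kind described above: creating pods and modifying cluster roles. The `RISKY` list is an illustrative assumption and is not exhaustive, and the function names are hypothetical.

```python
from typing import List, Tuple

# (apiGroup, resource, verb) combinations this sketch treats as escalation-prone.
# The selection is an assumption for illustration, not a complete list.
RISKY: List[Tuple[str, str, str]] = [
    ("", "pods", "create"),
    ("rbac.authorization.k8s.io", "clusterroles", "update"),
    ("rbac.authorization.k8s.io", "clusterroles", "patch"),
    ("rbac.authorization.k8s.io", "clusterroles", "escalate"),
]


def _allows(allowed: List[str], wanted: str) -> bool:
    """RBAC lists match either the exact value or the '*' wildcard."""
    return "*" in allowed or wanted in allowed


def risky_grants(cluster_role: dict) -> List[Tuple[str, str, str]]:
    """Return the RISKY combinations that any rule of the ClusterRole permits."""
    found: List[Tuple[str, str, str]] = []
    for rule in cluster_role.get("rules") or []:
        groups = rule.get("apiGroups") or []
        resources = rule.get("resources") or []
        verbs = rule.get("verbs") or []
        for group, resource, verb in RISKY:
            permitted = (
                _allows(groups, group)
                and _allows(resources, resource)
                and _allows(verbs, verb)
            )
            if permitted and (group, resource, verb) not in found:
                found.append((group, resource, verb))
    return found


if __name__ == "__main__":
    # Hypothetical role used only to exercise the check.
    role = {
        "kind": "ClusterRole",
        "metadata": {"name": "example-cni-role"},
        "rules": [
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "create"]},
        ],
    }
    for group, resource, verb in risky_grants(role):
        print(f"risky grant: {verb} {resource} (apiGroup '{group}')")
```

A result from this kind of scan is only a starting point for review, since whether a grant is excessive depends on what the component actually needs after installation.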

Is There Any Evidence of Exploitation in the Wild?

As of this writing, there is no evidence that these vulnerabilities have been exploited in the wild. Google’s swift response and the subsequent patch deployment have mitigated the risk, which underlines the importance of regular updates and proactive security measures.

Patches for Google Kubernetes Engine (GKE) and Anthos Service Mesh (ASM)

Google responded promptly to remediate these vulnerabilities, issuing fixes on December 14, 2023. The following versions of Google Kubernetes Engine (GKE) and Anthos Service Mesh (ASM) incorporate the necessary patches:

GKE Versions:

1.25.16-gke.1020000
1.26.10-gke.1235000
1.27.7-gke.1293000
1.28.4-gke.1083000

ASM Versions:

1.17.8-asm.8
1.18.6-asm.2
1.19.5-asm.4

Google addressed the issues by removing the /var/lib/kubelet/pods volume mount from FluentBit, which prevents access to projected service account tokens. Google also modified ASM’s ClusterRole, restructuring its functionality to eliminate the excessive permissions.
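The patched GKE versions listed above can be compared against a cluster’s reported version. The sketch below is one way to do that, assuming version strings of the form `1.27.7-gke.1293000` and assuming the numeric suffix after `-gke.` can be compared as a tie-breaker within the same patch release. It only knows the four minor release lines listed in this article and returns `None` for any other line rather than guessing. The function names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Fixed GKE versions as listed in this article, one per minor release line.
GKE_FIXED = [
    "1.25.16-gke.1020000",
    "1.26.10-gke.1235000",
    "1.27.7-gke.1293000",
    "1.28.4-gke.1083000",
]

_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)-gke\.(\d+)$")


def parse_gke_version(version: str) -> Tuple[int, ...]:
    """Split a GKE version string into (major, minor, patch, gke_build)."""
    match = _VERSION_RE.match(version.strip())
    if not match:
        raise ValueError(f"unrecognized GKE version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> Optional[bool]:
    """Compare a version with the fixed version of the same minor release line.

    Returns True or False for the release lines listed in GKE_FIXED, and None
    when the release line is not covered by the article's list.
    """
    parsed = parse_gke_version(version)
    for fixed in GKE_FIXED:
        fixed_parsed = parse_gke_version(fixed)
        if fixed_parsed[:2] == parsed[:2]:
            # Tuple comparison orders by patch number first, then by build suffix.
            return parsed >= fixed_parsed
    return None


if __name__ == "__main__":
    for candidate in ["1.27.3-gke.100", "1.27.7-gke.1293000", "1.24.1-gke.1"]:
        print(candidate, "->", is_patched(candidate))
```

The same approach could be applied to the ASM versions by changing the pattern and the list, although the ordering of ASM build suffixes would need to be confirmed separately.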

Proactive Measures and Recommendations to Address Cloud Misconfigurations; How Can SOCRadar Help?

Researchers emphasize the persistent threat of container escape and the need for robust security measures within cloud infrastructures. Even if an attacker gains entry, it is critical to minimize the potential damage. Explore our blog post covering common attack types targeting cloud infrastructures for insights that can strengthen your security posture against future threats.

Misconfigurations in cloud environments such as Google Kubernetes Engine (GKE) are a common source of security vulnerabilities. The following recommendations help reduce the risk of misconfigurations leading to vulnerabilities in cloud environments, including Kubernetes clusters:

Regular Audits and Updates: Conduct frequent security audits for critical services, employing automated tools for continuous checks. Keep all components, including Kubernetes clusters, up to date with the latest security patches.
Follow Best Practices: Adhere to best practices provided by your cloud service provider and specific services. Conduct training programs to bolster teams’ awareness of cloud security best practices.
Limit Permissions and Follow the Principle of Least Privilege: Restrict permissions to the minimum necessary, following the principle of least privilege for user accounts, service accounts, and components. Leverage built-in security features like IAM and security groups.
Implement Network Policies: Utilize network policies to control the flow of traffic between pods within your Kubernetes cluster, minimizing the attack surface.
Monitor and Analyze Logs: Implement robust logging and monitoring solutions for prompt detection and response to unusual activities or security incidents.

In this context, SOCRadar also provides a valuable service through its Cloud Security Module (CSM), designed to safeguard customers’ cloud storage.

By utilizing Attack Surface Management in conjunction with CSM, SOCRadar actively identifies new cloud buckets. CSM allows precise determination of whether these buckets are categorized as “public,” “private,” or “protected.” As part of its proactive strategy, SOCRadar promptly issues real-time alerts about bucket status changes and upon discovering new cloud storage owned by users. This continuous monitoring significantly strengthens security measures, effectively mitigating the risk of vulnerabilities for your critical cloud assets.
