Cryptomining in Kubernetes Namespaces: How a Miner Goes Undetected for 11 Days

[Image: Cryptomining process detection signals in a Kubernetes namespace runtime telemetry view]

I spent 11 days hunting a ghost.

That's how long a cryptomining deployment ran undetected in a misconfigured Kubernetes namespace at my previous employer before anyone knew it was there. Not 11 hours. Eleven days. The first signal wasn't an alert from any security tool. It was a cloud billing spike. By the time we correlated the compute anomaly to a compromised namespace, the attacker had already moved laterally and exfiltrated S3 credentials from a co-located secret. The mining was almost the clean part of the story.

That incident is the reason Kubesentry exists. And in the time since, we've talked to dozens of mid-size SaaS security teams running EKS, GKE, and AKS who tell some version of the same story. The details change. The 11-day mean time to detect (MTTD) doesn't.

Why Billing Spikes Are a Lagging Indicator

Here's the thing about cloud cost anomalies as a detection signal: they're useful, but they're slow by design.

Cloud providers aggregate billing data on cycles that can lag actual usage by 24 to 48 hours or more, depending on the service and the region. Even when you set a billing alert at 20% above your baseline, the alert fires after a meaningful window of unauthorized compute has already run. In our experience, that window is almost always enough for an attacker to establish persistence, move laterally, or extract whatever credential they came for in the first place.

Billing detection also has a threshold problem. A mid-size SaaS team running 30 production namespaces across 3 environments sees natural cost variance that makes a cryptominer's footprint easy to miss. A workload spiking to 300% CPU in a dev namespace during a load test looks identical to xmrig running at steady state. By the time the delta is unambiguous, the attacker has had days.

Eleven days, give or take.

How Cryptominers Actually Get Into Your Cluster

There are three entry paths we see most often. None of them require a novel zero-day.

Misconfigured RBAC and Overpermissioned Service Accounts

This is the most common one. A service account with cluster-admin or a wildcard resource permission ends up bound to a workload that didn't need it. The original intent was "make deployment easier during the sprint." The consequence is that any container running under that service account can create new pods in any namespace. Once an attacker gets code execution in one container, they can schedule a mining workload anywhere in the cluster without touching the host.
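
If you want a quick read on your own exposure, a few lines against the Kubernetes API will surface the worst offenders. This is a minimal sketch using the official Kubernetes Python client; the cluster-admin check is deliberately narrow, and a real audit would also walk ClusterRoles looking for wildcard verbs and resources.

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
rbac = client.RbacAuthorizationV1Api()

RISKY_ROLES = {"cluster-admin"}

for binding in rbac.list_cluster_role_binding().items:
    if binding.role_ref.name not in RISKY_ROLES:
        continue
    for subject in binding.subjects or []:
        # A service account bound cluster-wide is exactly the pattern above.
        if subject.kind == "ServiceAccount":
            print(f"RISK: {binding.metadata.name} binds service account "
                  f"{subject.namespace}/{subject.name} to {binding.role_ref.name}")
```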

We've seen this path in 60% or more of the cryptomining incidents our team has reviewed. The RBAC misconfiguration typically predates the attack by months. It just sits there, waiting.

Compromised or Backdoored Container Images

Supply chain compromise is increasingly where sophisticated attackers invest. A dependency pulled from a public registry, a base image with a pinned-but-vulnerable layer, a CI pipeline that pulls :latest instead of a pinned digest. The malicious payload is already baked into the image before the container starts. Once the pod is scheduled, the miner launches as part of the application's own process tree, making it nearly invisible to anything that looks only at image provenance or configuration state.
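
A cheap first line of defense here is simply knowing which workloads run unpinned images. Here's a minimal sketch, again using the Kubernetes Python client; it only checks for digest pinning and says nothing about whether the pinned layers themselves are clean.

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

for pod in v1.list_pod_for_all_namespaces().items:
    for c in pod.spec.containers:
        # A digest reference contains "@sha256:"; anything else is a mutable tag.
        if "@sha256:" not in c.image:
            print(f"UNPINNED: {pod.metadata.namespace}/{pod.metadata.name} "
                  f"container={c.name} image={c.image}")
```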

CSPM tools see a valid image digest from a trusted registry. Runtime tells a different story.

Exec Abuse After Initial Foothold

The third path is the one that keeps platform security engineers up at night. An attacker establishes a foothold in a running container, through an application-level vulnerability or a stolen credential, and then uses kubectl exec or a shell spawned via exploit to download and launch a miner directly. No new image. No new pod spec. Just a process that wasn't there before, running inside a container that looks entirely normal from the outside.

Configuration scanning sees nothing. Behavioral telemetry sees everything.

What eBPF Catches Before the Billing Spike

The reason eBPF-based runtime detection changes the timeline is simple: it operates at the kernel system-call layer, not at the configuration layer. Every process that executes, every network connection that opens, every file that gets written: those events all happen in the kernel, and you can't hide them from a probe running at that level.

In practice, there are four signal classes that surface a cryptomining deployment within minutes, not days.

xmrig fingerprints. Kubesentry maintains a library of command-line argument patterns specific to xmrig and related miners. These include argument sequences like --donate-level 0 -o stratum+tcp:// and variant spellings that miners use to avoid naive string matching. When we see these in a process exec event from any container, it's a high-severity alert. Full stop.
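
To make that concrete, here's a toy version of the matching logic. The patterns are illustrative assumptions, not Kubesentry's actual fingerprint library, which normalizes argument order, short flags, and deliberate misspellings.

```python
import re

# Illustrative fingerprints only -- a real ruleset is far larger.
MINER_PATTERNS = [
    re.compile(r"--donate-level[=\s]+\d+"),
    re.compile(r"-o\s+stratum\+(tcp|ssl)://"),
    re.compile(r"\bxmrig\b", re.IGNORECASE),
]

def classify_exec(cmdline: str) -> str | None:
    """Return 'high' if an exec event's command line matches a miner fingerprint."""
    for pattern in MINER_PATTERNS:
        if pattern.search(cmdline):
            return "high"
    return None

# Example exec event as it might arrive from a kernel probe:
print(classify_exec("/tmp/.x/xmrig --donate-level 0 -o stratum+tcp://pool.example:3333"))
```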

Stratum protocol signatures. Cryptomining pools speak the stratum protocol, which opens outbound TCP connections to pool addresses on a handful of well-known ports, typically 3333, 4444, or 14444. An eBPF network probe that tracks every new outbound connection can flag a stratum handshake before the first share is submitted. In our data, stratum connection detection is the fastest signal, typically firing within 90 seconds of the miner's first pool connection.
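
The rule itself is almost trivially simple once the telemetry exists. A sketch, assuming a hypothetical event shape from a network probe:

```python
# Ports and the event shape are illustrative assumptions.
STRATUM_PORTS = {3333, 4444, 14444}

def is_stratum_candidate(event: dict) -> bool:
    """Flag new outbound TCP connections to well-known mining-pool ports."""
    return (
        event.get("direction") == "outbound"
        and event.get("protocol") == "tcp"
        and event.get("dst_port") in STRATUM_PORTS
    )

print(is_stratum_candidate(
    {"direction": "outbound", "protocol": "tcp",
     "dst_ip": "203.0.113.7", "dst_port": 3333}
))  # True -> raise the alert before the first share is submitted
```

The hard part isn't the rule; it's having per-connection visibility from every container in the first place.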

CPU-pinning syscall sequences. Miners aggressively pin threads to CPU cores using sched_setaffinity calls. This syscall pattern, combined with a tight loop of compute-bound system activity, is statistically distinctive. A web server pod that suddenly starts issuing sched_setaffinity calls from a new child process is behaving outside its behavioral baseline. That deviation generates an anomaly event.
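
For readers who want to see the raw signal, here's a minimal bcc sketch that counts sched_setaffinity calls per PID over a sampling window. It assumes bcc is installed and root access on the node; attributing PIDs back to pods is left out for brevity.

```python
from bcc import BPF
import time

prog = r"""
BPF_HASH(counts, u32, u64);

TRACEPOINT_PROBE(syscalls, sys_enter_sched_setaffinity) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 zero = 0, *val;
    val = counts.lookup_or_try_init(&pid, &zero);
    if (val) {
        (*val)++;
    }
    return 0;
}
"""

b = BPF(text=prog)
time.sleep(10)  # sampling window
for pid, count in b["counts"].items():
    # A web server issuing these calls from a new child is the anomaly above.
    print(f"pid={pid.value} sched_setaffinity_calls={count.value}")
```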

Unexpected exec events. The exec-abuse path is caught by a simple rule: any shell or binary spawned inside a container that doesn't appear in that workload's behavioral baseline is flagged. If your API server pods have never spawned a bash process in 14 days of baseline learning, and one does now, that's an alert. Not a low-severity informational event. An alert, paged to on-call within 90 seconds.
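
The same bcc toolkit makes the exec signal easy to demonstrate. A minimal sketch, with a hard-coded stand-in for the learned baseline (a real system learns this per workload and maps PIDs to pods):

```python
from bcc import BPF

prog = r"""
TRACEPOINT_PROBE(syscalls, sys_enter_execve) {
    char fname[128];
    // bpf_probe_read_user_str needs kernel 5.5+; on older kernels
    // use bpf_probe_read_str instead.
    bpf_probe_read_user_str(&fname, sizeof(fname), args->filename);
    bpf_trace_printk("exec: %s\n", fname);
    return 0;
}
"""

# Stand-in for a learned per-workload baseline of expected binaries.
BASELINE = {"/usr/local/bin/api-server", "/usr/bin/python3"}

b = BPF(text=prog)
while True:
    _, pid, _, _, _, msg = b.trace_fields()
    path = msg.decode().removeprefix("exec: ")
    if path not in BASELINE:
        print(f"ALERT: unexpected exec {path} (pid={pid})")
```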

The Billing Spike Was Just the Beginning

I want to come back to the incident that started all of this, because the cryptomining wasn't actually the worst part.

Once we confirmed the miner was running in a misconfigured namespace, we started looking at what else had happened in that namespace over the 11-day window. That's when we found the S3 credential exfiltration. A co-located pod with read access to a Kubernetes secret containing AWS credentials had made a series of outbound API calls to an unfamiliar endpoint. Those calls ran over a 48-hour window beginning three days into the mining campaign.

The mining was opportunistic. The credential theft was deliberate. The attacker had used the initial namespace access to survey the cluster, found a high-value secret, and extracted it. By the time we were staring at the billing anomaly, the more serious damage had already been done.

This is why the 11-day MTTD matters so much. It's not just about the compute cost of an unauthorized miner. It's about what else an attacker can do in 11 days once they're inside a Kubernetes namespace with overpermissioned service accounts and co-located secrets.

eBPF telemetry doesn't just catch the miner. It catches the lateral movement. The secret access. The API calls that don't match the workload's baseline. The whole kill chain, not just the noisy end of it.

Deploying Detection That Actually Works for a Two-Person Team

Here's the operational reality for most mid-size SaaS security teams: you have 1 or 2 DevSecOps engineers covering 5 to 50 production namespaces, and you cannot afford to tune a Falco ruleset for 40 hours before it generates signal you trust. That was one of the things that frustrated me most about the state of the tooling before we built Kubesentry.

Raw Falco output is useful if you have a dedicated threat analyst to manage it. Most teams don't. Our read of the Falco GitHub community is that teams abandon self-managed Falco deployments within 90 days, in large part because alert fatigue makes the signal indistinguishable from noise.

What makes eBPF-based detection operationally viable for a small team is the combination of automated behavioral baselining and MITRE ATT&CK tactic classification. When every alert ships with the tactic it maps to (Execution, Credential Access, Lateral Movement) and the full workload context (pod name, namespace, node, triggering syscall sequence), triage time drops. You're not reading raw system calls. You're reading a prioritized, contextualized event that tells you what happened, where, and what kind of threat behavior it represents.
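
As an illustration of what that context looks like, here's a hypothetical alert shape. The field names are our own sketch of the context described above, not Kubesentry's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class RuntimeAlert:
    tactic: str                      # MITRE ATT&CK tactic, e.g. "Execution"
    severity: str
    pod: str
    namespace: str
    node: str
    syscalls: list[str] = field(default_factory=list)  # triggering sequence

alert = RuntimeAlert(
    tactic="Execution", severity="high",
    pod="api-7f9c", namespace="payments", node="ip-10-0-3-17",
    syscalls=["execve", "sched_setaffinity", "connect"],
)
print(f"[{alert.severity}] {alert.tactic}: {alert.namespace}/{alert.pod} on {alert.node}")
```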

Ninety seconds from stratum connection to PagerDuty page. That's the gap between catching a miner on day one and finding out about it on day 11.

We built Kubesentry because that gap shouldn't exist. And in our experience running incident response across mid-size Kubernetes environments, it doesn't have to.