In my previous homelab posts, I focused on getting a real three-node cluster running: Cilium networking, Traefik ingress, Flux GitOps, Vault for secrets, and a pile of self-hosted apps. I also had a proper observability stack: Grafana with Prometheus for metrics and Loki for logs. The infrastructure side was solid. The security side was not.
I had TLS, a subnet router, and a secrets vault, but nothing watching for suspicious behaviour inside my pods. Prometheus could tell me a pod was up; Loki gave me application logs when I went looking. Neither would catch a reverse shell or an unexpected outbound connection. No runtime detection. No admission policies. No automated response when something looked wrong. If a compromised container started spawning shells or phoning home, I would only find out if I happened to be staring at a Grafana dashboard at the right moment.
That gap is what k8s-soar is built to close. The name stands for Kubernetes Security Orchestration, Automation & Response. It is an open-source project I built to provision a complete detect-and-respond security stack on bare-metal Kubernetes, and the plan is to bring it onto my production homelab cluster next.
All the code lives in the k8s-soar repo on GitHub.
The Problem: Security Tools in Isolation
Most Kubernetes security guides treat each tool as a standalone install. Install Falco here. Add Kyverno there. Maybe throw in a NetworkPolicy or two. What you end up with is a collection of components that do not talk to each other.
That is fine for ticking boxes on a compliance checklist. It is not fine when you actually want to know whether detection triggers a meaningful response.
I wanted to answer a specific question: if an attacker gets a shell inside a container, what happens next? Not in theory, but in my cluster, with real alerts, real policies, and real isolation.
k8s-soar is the answer to that question. It wires together four security layers into a single install path, validates them against eight MITRE ATT&CK–mapped attack scenarios, and closes the loop with an automated Detect → Isolate workflow.
What I Built
k8s-soar is a Helm umbrella chart plus Ansible bootstrap that installs a full security stack from scratch on bare-metal Linux. The stack has four distinct phases:
| Phase | Component | What it does |
|---|---|---|
| Prevent | Kyverno | Blocks bad pods at admission time: privileged containers, hostPath mounts, :latest tags |
| Detect | Falco | Watches syscalls via modern eBPF and fires alerts on suspicious runtime behaviour |
| Enforce | Tetragon | Applies kernel-level TracingPolicies that can kill processes or log network connections |
| Respond | SOAR responder | Receives Falco alerts via webhook and quarantines the offending pod |
Under the hood, the stack also includes Cilium as the eBPF CNI (with Hubble for flow observability) and a dedicated security-lab namespace where attack scenarios run in isolation from everything else.
The SOAR Workflow
The “R” in SOAR is not a Splunk integration or a PagerDuty runbook. It is a lightweight Python webhook responder running inside the cluster. The flow looks like this:
Falco detects suspicious activity
│
▼
falcosidekick receives the JSON alert
│
▼
POST to k8s-soar-responder:8080/webhook
│
▼
Responder patches the pod with label security.quarantine=true
│
▼
CiliumNetworkPolicy denies all ingress and egress
The entire loop, from syscall to network isolation, completes in seconds. No human in the loop required.
The responder itself is roughly 130 lines of Python. It parses Falco alert metadata to identify the namespace and pod name, patches the pod label, and lets Cilium’s quarantine policy do the rest. Simple, auditable, and entirely in-cluster.
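As a rough illustration of that parse-and-label step, here is a minimal sketch rather than the actual responder code. It assumes the webhook payload carries Falco's standard output_fields with k8s.ns.name and k8s.pod.name. The function names are made up for this example, and the Kubernetes API call is deliberately left out so the snippet runs with the standard library only.

```python
import json
from typing import Optional, Tuple

QUARANTINE_LABEL = "security.quarantine"  # label name taken from the workflow above


def extract_target(alert: dict) -> Optional[Tuple[str, str]]:
    """Return (namespace, pod) from a Falco alert, or None if either is missing."""
    fields = alert.get("output_fields") or {}
    namespace = fields.get("k8s.ns.name")
    pod = fields.get("k8s.pod.name")
    if not namespace or not pod:
        return None
    return namespace, pod


def quarantine_patch() -> dict:
    """Patch body that adds the quarantine label to a pod."""
    return {"metadata": {"labels": {QUARANTINE_LABEL: "true"}}}


def handle_webhook(body: bytes) -> Optional[dict]:
    """Parse a webhook payload and describe the patch to apply (no API call here)."""
    try:
        alert = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(alert, dict):
        return None
    target = extract_target(alert)
    if target is None:
        return None
    namespace, pod = target
    return {"namespace": namespace, "pod": pod, "patch": quarantine_patch()}
```

In a real responder the returned patch would be handed to the Kubernetes API, for example via patch_namespaced_pod in the official Python client. Returning None for anything malformed keeps the handler from labeling the wrong pod.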
Custom Detection Rules
Falco ships with a solid default ruleset, but I wrote custom rules scoped to the security-lab namespace to keep noise down on a real cluster. The four custom rules cover:
- Shell spawned inside a container: detects bash/sh execution in the lab victim pod
- Sensitive credential access: reads of service account tokens or /etc/shadow
- Reverse shell outbound: outbound connections combined with shell processes
- Crypto miner processes: matches known miner binaries like xmrig and minerd
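To make those four conditions concrete, here is a Python paraphrase of the matching logic. It is not Falco rule syntax and not the rules from the repo: the event keys borrow Falco field names (proc.name, evt.type, fd.name, k8s.ns.name), while the service account path, the rule labels, and the classify function are assumptions for illustration.

```python
from typing import Optional

SHELLS = {"bash", "sh"}
MINERS = {"xmrig", "minerd"}
# Assumed sensitive locations: /etc/shadow plus the default token mount path.
SENSITIVE_PATHS = (
    "/etc/shadow",
    "/var/run/secrets/kubernetes.io/serviceaccount",
)
LAB_NAMESPACE = "security-lab"


def classify(event: dict) -> Optional[str]:
    """Return which of the four detections an event would match, if any."""
    if event.get("k8s.ns.name") != LAB_NAMESPACE:
        return None  # rules are scoped to the lab namespace
    proc = event.get("proc.name", "")
    evt_type = event.get("evt.type", "")
    path = event.get("fd.name", "")

    if proc in MINERS:
        return "crypto_miner"
    if proc in SHELLS and evt_type == "connect":
        return "reverse_shell"
    if evt_type in ("open", "openat") and path.startswith(SENSITIVE_PATHS):
        return "credential_access"
    if proc in SHELLS and evt_type == "execve":
        return "shell_in_container"
    return None
```

The namespace check comes first on purpose: it is what keeps a bash process in a production pod from producing a lab alert.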
Kyverno policies and Tetragon TracingPolicies follow the same pattern: scoped, named, and mapped to specific attack scenarios.
What It Defends Against
Every scenario in k8s-soar is mapped to the MITRE ATT&CK for Containers framework. There are eight core scenarios:
| Scenario | Threat | MITRE Technique | Primary Defense |
|---|---|---|---|
| Shell in container | Attacker execs into a running pod | T1059 Execution | Falco detect → SOAR isolate |
| Privileged pod / hostPath | Container escape to host | T1611 Escape to Host | Kyverno block at admission |
| SA token theft | Reading mounted service account credentials | T1552 Credential Access | Kyverno audit + Falco detect |
| Reverse shell | Outbound callback to attacker C2 | T1059 Execution | Falco detect + Tetragon observe |
| Crypto miner | Resource hijacking for cryptocurrency | T1496 Resource Hijacking | Falco detect → SOAR isolate |
| Missing security context | Pods running as root with :latest tags | Best practice | Kyverno audit |
| Lateral movement | Pod-to-pod communication inside the cluster | T1021 Lateral Movement | Default-deny NetworkPolicy + Hubble |
| Sensitive host path write | Writing to /etc/shadow or /root/ | T1611 Escape to Host | Kyverno block + Tetragon Sigkill |
Each scenario has a runbook (scenarios/NN-name/run.sh) and expected evidence documented in a README. You trigger the attack manually, then verify that the right tool fired at the right layer.
This is deliberately lab-first. The security-lab namespace runs a minimal victim workload behind default-deny network policies. Attack simulations never touch your real apps.
How It Will Be Applied on My Cluster
My homelab cluster already runs several pieces of this stack. The homelab repo manages everything through Flux GitOps on three HP EliteDesk nodes at 192.168.0.10–.12. Cilium 1.19 is already the CNI. Grafana, Prometheus, and Loki live in the monitoring namespace. Vault HA handles secrets via External Secrets Operator.
What is missing is everything above the network layer: no Falco, no Tetragon, no Kyverno, no SOAR responder, no runtime visibility at all.
The deployment plan breaks into four phases.
Phase 1: Add the Security Stack via Flux
Since Cilium is already running, the k8s-soar Helm install skips the CNI and deploys only the security components:
- Falco + falcosidekick in the falco namespace
- Tetragon TracingPolicies in kube-system
- Kyverno admission policies in the kyverno namespace
- SOAR responder in the k8s-soar namespace
This means adding HelmRepository CRs for falcosecurity and kyverno to my Flux infrastructure layer, then creating HelmRelease manifests under apps/base/, the same pattern I already use for Vault, Traefik, and the media stack.
Kyverno policies will ship in Audit mode first. I want to collect a baseline of what would have been blocked before flipping anything to Enforce. Blocking Jellyfin because it uses a :latest tag is not the goal.
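One way to build that baseline is to tally failing results from Kyverno's policy reports. The sketch below assumes the input is the JSON from kubectl get policyreport -A -o json and that each result entry carries policy and result fields. The function name is hypothetical.

```python
from collections import Counter


def would_block(report_list: dict) -> Counter:
    """Count failing results per policy across a list of PolicyReports."""
    counts: Counter = Counter()
    for report in report_list.get("items", []):
        # "results" can be absent or null on an empty report
        for result in report.get("results") or []:
            if result.get("result") == "fail":
                counts[result.get("policy", "<unknown>")] += 1
    return counts
```

Sorting the output with most_common() gives a quick view of which policy would cause the most breakage if flipped to Enforce.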
Phase 2: Deploy the Security Lab
The security-lab namespace gets its own Flux kustomization, sourced from the k8s-soar repo. It includes:
- A minimal victim deployment (busybox:1.36, non-root, dropped capabilities)
- Default-deny CiliumNetworkPolicies with explicit DNS egress only
- The quarantine CNP that triggers when a pod gets labeled security.quarantine=true
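For reference, a quarantine policy of this shape could look roughly like the sketch below. It is an illustration, not necessarily the manifest shipped in the repo: the policy name quarantine is a placeholder, and it assumes Cilium deny rules (ingressDeny and egressDeny) with the all entity. The manifest is built as a Python dict and printed as JSON, which kubectl accepts just like YAML.

```python
import json


def quarantine_policy(namespace: str = "security-lab") -> dict:
    """Build a CiliumNetworkPolicy that denies all traffic for quarantined pods."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        # "quarantine" is a placeholder name for this sketch
        "metadata": {"name": "quarantine", "namespace": namespace},
        "spec": {
            # Select only pods carrying the quarantine label
            "endpointSelector": {"matchLabels": {"security.quarantine": "true"}},
            "ingressDeny": [{"fromEntities": ["all"]}],
            "egressDeny": [{"toEntities": ["all"]}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(quarantine_policy(), indent=2))
```

Deny rules take precedence over allow rules in Cilium, so a policy like this should override the lab's DNS egress exception once the label lands.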
Falco custom rules are scoped exclusively to security-lab. Running attack scenarios against the lab will not generate alerts from my production workloads.
Phase 3: Wire Up the SOAR Pipeline
falcosidekick is configured to POST alerts at WARNING priority or above to the in-cluster responder:
# values.yaml (k8s-soar)
falcosidekick:
  config:
    webhook:
      address: "http://k8s-soar-responder.k8s-soar.svc.cluster.local:8080/webhook"
      minimumpriority: "warning"
When a scenario triggers a Falco alert, the responder labels the pod and Cilium cuts its network access. No external orchestrator required, though I may later route alerts to Grafana dashboards and use Vault for webhook credentials.
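The minimumpriority threshold is easiest to reason about as an ordered list. The sketch below models the "WARNING or above" check using Falco's eight priority levels. It illustrates the filtering logic only and is not falcosidekick's code; meets_minimum is a name made up for this example.

```python
# Falco priorities, lowest to highest severity
PRIORITIES = [
    "debug",
    "informational",
    "notice",
    "warning",
    "error",
    "critical",
    "alert",
    "emergency",
]
_RANK = {name: index for index, name in enumerate(PRIORITIES)}


def meets_minimum(priority: str, minimum: str = "warning") -> bool:
    """True if an alert's priority is at or above the configured minimum."""
    level = _RANK.get(priority.lower())
    floor = _RANK.get(minimum.lower())
    if level is None or floor is None:
        return False  # unknown priority names never pass the filter
    return level >= floor
```

With the threshold at warning, Notice and Informational alerts stay in the Falco logs and never reach the responder, which matters because every forwarded alert can quarantine a pod.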
Phase 4: Validate with Attack Scenarios
Once the stack is live, I run the eight scenario scripts one at a time against security-lab:
./scenarios/01-shell-in-container/run.sh
./scenarios/04-reverse-shell/run.sh
# ... see scenarios/threat-matrix.md
Each run produces evidence I can capture with scripts/capture-scenario-evidence.sh: Falco alert logs, falcosidekick delivery confirmation, responder patch events, and the quarantine label on the target pod.
Pass/fail against the threat matrix becomes the proof that the stack actually works, not just that it installed cleanly.
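A pass/fail check can be as small as the sketch below. It assumes the evidence for a Detect → Isolate scenario is the pod JSON from kubectl get pod -o json, and that a pass means the quarantine label is present. The helper names are hypothetical and not part of the repo's scripts.

```python
from typing import Dict


def is_quarantined(pod: dict) -> bool:
    """True if the pod JSON carries the quarantine label."""
    labels = pod.get("metadata", {}).get("labels") or {}
    return labels.get("security.quarantine") == "true"


def summarize(results: Dict[str, bool]) -> str:
    """Render one PASS/FAIL line per scenario plus a total."""
    lines = [
        f"{'PASS' if ok else 'FAIL'}  {name}"
        for name, ok in sorted(results.items())
    ]
    lines.append(f"{sum(results.values())}/{len(results)} scenarios passed")
    return "\n".join(lines)
```

Scenarios whose primary defense is an admission block or a Tetragon action would need their own checks; the label test only covers the ones that end in isolation.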
How This Improves Cluster Security
Before k8s-soar, my homelab security posture looked like this:
- Network boundary: Tailscale subnet router, no public exposure, private DNS
- Secrets: Vault HA with External Secrets Operator, no encrypted blobs in Git
- TLS: cert-manager with Cloudflare DNS-01
- Runtime: nothing
That last line is the gap. A cluster can have perfect secrets management and locked-down ingress and still be wide open internally. A compromised pod could spawn shells, read mounted credentials, mine cryptocurrency, or pivot to other pods, and nothing would notice.
k8s-soar adds four concrete improvements:
1. Prevention at the gate. Kyverno blocks privileged containers, hostPath volumes, and root pods before they ever schedule. Bad configurations fail at admission, not after they are running.
2. Runtime visibility. Falco watches syscalls from every container via eBPF. Shells, credential reads, outbound reverse shells, and miner processes all generate structured JSON alerts with Kubernetes metadata attached.
3. Kernel-level enforcement. Tetragon goes further than detection. It can Sigkill a process that tries to write to /etc/shadow or log outbound TCP connections from suspicious processes. Detection and enforcement are separate layers on purpose.
4. Automated containment. The SOAR responder closes the loop. Detection without response is just logging. Labeling a pod security.quarantine=true and letting Cilium deny all traffic means a compromised workload is contained in seconds, without waiting for me to notice an alert in Grafana.
Together, these layers turn the cluster from “secure at the perimeter” into “secure in depth”, with a validated, reproducible threat matrix to prove it.
What Comes Next
The k8s-soar repo is installable today on a fresh bare-metal cluster via a single ./ansible/setup.sh command. The homelab integration (Flux HelmReleases, security-lab namespace, scenario validation on production hardware) is the next step.
After baseline validation in Audit mode, the plan is to:
- Flip Kyverno policies to Enforce for cluster-wide hardening
- Extend quarantine CNPs beyond security-lab to cover production namespaces
- Route Falco alerts to Grafana dashboards for ongoing visibility
- Document results in the thesis threat matrix with captured evidence per scenario
If you run a homelab or a small bare-metal Kubernetes cluster and want to go beyond “I installed Falco once”, the full stack is wired together with attack scenarios to prove it works. The repo is open and the install path is documented.
Wrap Up
Building k8s-soar started as a thesis project and turned into something I actually want running on my cluster. The stack is not exotic. Cilium, Falco, Tetragon, and Kyverno are all well-known tools. What is different is wiring them together into a single install, validating them against real attack scenarios, and closing the loop with automated quarantine.
My homelab already has the networking and secrets foundation. k8s-soar adds the runtime security layer on top. Prevent bad pods. Detect suspicious behaviour. Enforce at the kernel. Respond automatically.
That is the stack I wanted. Now it is time to deploy it.