design-proposal: compute plane for untrusted-code workloads #17
Andrei Kvapil (kvaps) wants to merge 2 commits into
Signed-off-by: Andrei Kvapil <andrei.kvapil@aenix.io>
Code Review
This pull request introduces a design proposal for ComputePlane, a Cozystack-managed Kubernetes cluster designed to isolate untrusted-code workloads from the management control plane. The reviewer feedback highlights three key areas for improvement: ensuring that the generated HelmRelease has spec.install.createNamespace: true enabled to prevent installation failures on the remote cluster, clarifying the exact data-path proxying mechanism to guarantee secure network isolation, and specifying how deletion ordering is enforced via a finalizer to prevent Flux finalizers from blocking indefinitely when a tenant is deleted.
| values: { ... } | ||
| ``` | ||
| The routing decision is driven by the `placement` enum on the `ApplicationDefinition` — `ManagementPlane` (default) applies into the tenant namespace on the management cluster as today; `ComputePlane` injects the ComputePlane `kubeConfig.secretRef`. The two values name the two symmetric planes. This keeps the routing policy declarative and out of per-app charts. |
When using spec.kubeConfig to remote-apply a HelmRelease to the ComputePlane, the Helm controller will attempt to install the release into the target namespace (e.g., tenant-<name>). Since the ComputePlane is a freshly provisioned, separate cluster, this namespace will not exist by default. It is important to specify that spec.install.createNamespace: true must be enabled on the generated HelmRelease to prevent installation failures.
| Additionally, because the ComputePlane is a separate cluster, the target namespace (e.g., `tenant-<name>`) will not exist by default. The generated `HelmRelease` must ensure `spec.install.createNamespace: true` is set so Flux can provision the namespace on the remote cluster during installation. |
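A minimal sketch of what the generated `HelmRelease` could look like with both settings in place. The release name, chart, chart source, namespace, and Secret name are hypothetical placeholders, not taken from the proposal; only `spec.kubeConfig.secretRef` and `spec.install.createNamespace` are the fields under discussion.

```yaml
# Sketch only: all names below are hypothetical placeholders.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: jupyter                     # hypothetical app release
  namespace: tenant-example         # tenant namespace on the management cluster
spec:
  interval: 5m
  chart:
    spec:
      chart: jupyter                # hypothetical chart name
      sourceRef:
        kind: HelmRepository
        name: example-apps          # hypothetical chart source
  # Remote apply: Flux talks to the ComputePlane API using this kubeconfig Secret.
  kubeConfig:
    secretRef:
      name: computeplane-kubeconfig # hypothetical Secret holding the kubeconfig
  # Namespace on the remote cluster; it does not exist on a fresh ComputePlane.
  targetNamespace: tenant-example
  install:
    createNamespace: true           # let Helm create targetNamespace on first install
```

The `HelmRelease` object itself stays in the tenant namespace on the management cluster; only the rendered chart resources land on the remote cluster.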
| ### 4. Access is proxied back through the tenant's normal entry point | ||
| Workloads expose themselves on the ComputePlane via standard Ingress/Gateway, and the ComputePlane's ingress is wired back to the tenant's existing entry point so the user reaches the app at a normal hostname. The user never receives ComputePlane credentials; only HTTP(S) app traffic crosses back, through the proxy/ingress path — not the kube-API path. |
To strengthen the security guarantees, it would be beneficial to clarify the exact data-path plumbing for the proxying mechanism. Since the ComputePlane runs on KubeVirt VMs on the management cluster, explicitly stating how the management cluster's ingress routes traffic to the ComputePlane's ingress (e.g., via the VM's internal service/bridge IP) helps ensure there is no accidental reverse network path exposed to the untrusted workloads.
| Workloads expose themselves on the ComputePlane via standard Ingress/Gateway, and the ComputePlane's ingress is wired back to the tenant's existing entry point so the user reaches the app at a normal hostname. The user never receives ComputePlane credentials; only HTTP(S) app traffic crosses back, through the proxy/ingress path — not the kube-API path. Specifically, the management cluster's ingress controller proxies traffic directly to the KubeVirt VM's internal service/bridge IP, ensuring no reverse network path is exposed to the workload. |
| - **ComputePlane kubeconfig Secret missing/rotated** → remote apply fails closed (no fallback to local apply on the management cluster); status reflects the error. Failing closed is the security-correct behavior. | ||
| - **App declares `placement: ComputePlane` but no ComputePlane module set in the tenant chain** → reject at admission / surface a clear status error rather than silently deploying locally (which would re-introduce the risk). Inheritance walk: use the nearest enabling ancestor; if none, reject. | ||
| - **GPU exhaustion** → cluster-autoscaler adds GPU node groups up to `maxReplicas`; beyond that the workload pends, as in any autoscaled cluster. | ||
| - **Tenant deletion** → ComputePlane and its workloads are torn down with the tenant; ordering must delete remote HelmReleases before deprovisioning the ComputePlane to avoid orphaned remote resources. |
Enforcing that remote HelmReleases are deleted before the ComputePlane is deprovisioned is a common challenge with Flux remote apply. If the ComputePlane cluster is destroyed first, Flux's HelmRelease will block indefinitely on its finalizer because it can no longer connect to the target API to perform cleanup. Specifying how this ordering is enforced (e.g., via a finalizer on the ComputePlane custom resource) makes the design robust against this failure mode.
| - **Tenant deletion** → ComputePlane and its workloads are torn down with the tenant; ordering must delete remote HelmReleases before deprovisioning the ComputePlane to avoid orphaned remote resources. This ordering is enforced by a finalizer on the `ComputePlane` custom resource, which blocks its deletion until all associated remote `HelmRelease` resources have been successfully cleaned up. |
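One way to picture the finalizer-based ordering, assuming the ComputePlane is represented by a custom resource as the suggestion proposes. The API group, version, resource names, and finalizer string below are hypothetical and only illustrate the shape.

```yaml
# Sketch only: apiVersion, names, and the finalizer key are assumptions.
apiVersion: example.cozystack.io/v1alpha1   # hypothetical API group/version
kind: ComputePlane
metadata:
  name: compute
  namespace: tenant-example                 # hypothetical tenant namespace
  finalizers:
    # While this entry is present, the API server keeps the object
    # (in a terminating state) instead of removing it.
    - example.cozystack.io/remote-releases-cleanup
```

Under this sketch, a controller that owns the finalizer would wait until every remote `HelmRelease` targeting the ComputePlane has finished uninstalling, and only then remove the finalizer entry so that cluster deprovisioning can proceed. The Flux finalizers on those releases can then still reach the remote API during cleanup.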
Review: ComputePlane design proposal
The core idea is sound and the mechanism choice is right: routing
1. "Hidden from the tenant" is conflated with the security boundary, and it isn't one
The doc leans hard on invisibility as a selling point ("a tenant does not see and does not manage," "hidden cluster," repeated in Overview, Goals, §1, User-facing changes). But two separable properties are being bundled:
(b) does not follow from (a) and is not itself a security guarantee. The isolation argument is entirely about the ComputePlane→management direction; handing the tenant a scoped, read-only view of their own ComputePlane (it's their untrusted workloads, on a cluster that already has no path back to management) weakens nothing in that argument.
And the cost of full opacity is real. When the Jupyter pod CrashLoops, OOMs, or the GPU node won't schedule (the doc's own "GPU exhaustion → pod pends" failure case), the tenant has zero
If hiding is intentional, the justification needs to be stated, and I think the real one is narrower than "hide everything." The legitimate reason to withhold access is tamper-resistance: the platform wants to deploy hardening into the ComputePlane (restricted PSA, egress NetworkPolicies, admission control) that the tenant must not be able to remove, because removing it is what makes untrusted code dangerous. But that argues for withholding admin/write, not visibility. A read-only or namespace-scoped kubeconfig keeps the hardening tamper-proof while preserving debuggability.
Suggested resolution: decouple the two properties explicitly. Keep "no credentials/path back to management" as the security guarantee; demote "tenant cannot see the ComputePlane" to a default, and design a scoped read/observability path so users aren't operating a black box. If full opacity really is intended, say so and give the tamper-resistance argument, but I'd push back on it.
2. The "1-click" contrast doesn't distinguish from the rejected alternative; and the target needn't be a hidden ComputePlane (nit)
Two smaller things.
The 1-click framing. §2 says the ComputePlane module is enabled by the parent tenant at child-creation time. So the real flow is: someone provisions a full Kamaji control plane + KubeVirt node groups first, and only then does the end user get their one click. The provisioning cost isn't eliminated, it's relocated to the parent/admin.
That makes the rejection of "expose a managed Kubernetes and let them install the app" weaker than stated. The ComputePlane is then "little more than a template": pre-baked secure defaults plus routing glue over the same managed-Kubernetes substrate. That's a fine thing to be, but the honest differentiators are (i) secure defaults the tenant can't misconfigure and (ii) placement-routing that auto-targets the right cluster, not the click count. I'd soften the 1-click contrast and lead with those.
The target generalizes. The routing mechanism (
3. Inheriting a ComputePlane to subtenants re-creates the exact problem ComputePlanes exist to solve (objection)
§2, Open questions, and Phase 3 all propose that child tenants may reuse an ancestor's ComputePlane, "matching the existing service-inheritance direction" (ingress/monitoring/etcd). The inheritance walk is encoded concretely: §2 "the chain walks up to the parent," Failure cases "use the nearest enabling ancestor."
I think this is wrong and should be cut from the architecture, not merely "blocked in iteration 1, open later." The premise of the whole proposal (Overview, Security) is that untrusted code must not share an isolation domain with what it could escalate toward. The management cluster is effectively the root ComputePlane: the place managed workloads run, which ComputePlane exists to keep untrusted code off of. Now take parent tenant P with
The service-inheritance analogy is precisely the flaw. ingress/monitoring/etcd are shared infrastructure services that sit at a trust level above their consumers: the tenant trusts its ingress. A ComputePlane is the inverse: a containment vessel for code its owner does not trust, sitting below the owner. You inherit a shared service; you do not share a containment boundary between mutually-distrusting parties (P and C need not trust each other). That's just removing the containment.
Each tenant running untrusted code needs its own ComputePlane, the same way each tenant gets its own isolation from the management cluster.
Suggested resolution: state explicitly that a ComputePlane serves exactly one tenant, full stop; a child that wants untrusted compute provisions its own. Remove the parent-walk from §2 and Failure cases: a
One distinction worth preserving so this doesn't read as anti-efficiency: sharing the physical node pool / capacity across tenants (the "managed RDS runs many DBs on shared instance types" analogy in Open questions) is fine; that's infrastructure-layer resource pooling. Sharing a ComputePlane (a kube cluster / isolation domain) is not. The doc currently conflates these; only the former is safe, and saying so explicitly would strengthen the section.
4. No story for ComputePlane workloads reaching tenant-namespace services, and that's the whole point of the apps (omission)
The doc defers "credential propagation" (how a managed Postgres connection secret reaches a ComputePlane workload) to Open questions / Non-goals. But that's the lesser half of the problem and it's mis-framed as a secrets-plumbing issue. The harder half is network reachability, and it's in direct tension with the security model.
The headline workloads (an LLM, a notebook, an n8n flow) are close to useless in isolation; their entire value is connecting to the tenant's data: "my Jupyter notebook wants to talk to my Postgres." But the tenant's managed Postgres runs in the tenant namespace on the management/infra cluster. So "let my notebook reach my database" means opening a path from the ComputePlane into the management cluster, the exact thing Security §2 forbids ("ComputePlane pods are denied egress to the management/infra kube-apiserver," management→ComputePlane is the only allowed direction). A database connection is a ComputePlane→management flow, and there is no design for it. The apps that justify the feature don't function as specified.
The good news, and the reason this is an omission rather than a dead end, is that the connectivity primitive already exists in the design space and the doc cites it. PR #7 (cross-cluster-tenant-mesh) builds exactly a cross-cluster data-plane path (Kilo
The key distinction the doc should make explicit: the threat is "no access to the management kube-API / no creds to escalate," not "no packets ever." Those are different planes. You can allow ComputePlane → a specific service endpoint (the Postgres pod IP:5432) while still denying ComputePlane → kube-apiserver, and the CiliumNetworkPolicy machinery the doc already cites (§Context,
Suggested resolution: promote this from a one-line Open question to a real "Connectivity to tenant services" section in the Design. It should (a) draw the kube-API-access vs. data-plane-reachability distinction so the isolation guarantee is stated precisely, (b) sketch the brokered path: narrowly-scoped per-service egress (reusing/constraining the PR #7 mesh, or an outbound mirror of the §4 ingress proxy), with who authorizes each endpoint and how the policy is generated, and (c) reconcile it with Security §2, which currently reads as a blanket denial. Every such hole is a path back toward infra, so it needs to be narrow and audited by construction, which is exactly why it deserves design rather than deferral.
Summary
Net: the mechanism is right and the proposal is close. №3 and №4 are the two I'd want resolved before this is implementable: one removes a security regression hiding inside a convenient analogy, the other fills in the connectivity half of the design without which the motivating apps don't work.
Thanks for the thorough review; agree the mechanism is right, and these are the right things to push on. Point by point.
#3 (single-tenant): accepted. You're right that inheriting a ComputePlane to a child re-creates the exact escalation one level down, and that "block now, unblock later" is itself the hole. Making it single-tenant by design: removing the parent-walk from §2 and the Failure-cases inheritance: a
#4 (connectivity): agree it needs a real section; I don't think it needs a mesh. The path physically already exists: tenant-cluster worker nodes are KubeVirt VMs on the management Cilium pod network (
#1 (visibility vs security): decoupling accepted. The load-bearing property is "no creds / no path back to management"; visibility is a separable UX / tamper-resistance concern, not a security guarantee, and I'll stop presenting invisibility as one. Today Kamaji already provisions an admin kubeconfig held by cluster-admins (not tenants), so operator-side debugging exists; a tenant-facing scoped read / observability path is a worthwhile extension and I'll record it as such rather than baking full opacity in.
#2 (placement target): recording as an option. Letting
I'll revise the proposal along these lines (and the cozyllm-specific doc inherits the single-tenant fix). Thanks again.
…ign, connectivity to tenant services, decouple visibility from the security boundary Signed-off-by: Andrei Kvapil <andrei.kvapil@aenix.io>
Pushed a revision (
Also folded in the inline nits: The connectivity-authorization details (#4) and the placement-target choice (#2) are left as Open questions. Re-review welcome — thanks again. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@design-proposals/compute-plane/README.md`:
- Around line 149-160: Section 5 only defines network reachability; it still
leaves the ComputePlane-to-tenant-service credential flow unresolved, which
makes the rollout promise incomplete. Update the proposal around the
“Connectivity to tenant services” section to explicitly define how a workload
gets the managed Postgres secret/connection string, using the same terminology
and objects already named there (ComputePlane workloads, tenant service,
per-service CiliumNetworkPolicy, `exposeMethod: Proxied` if relevant). If that
mechanism is not ready, move the Phase 1 rollout commitment behind the
credential-delivery design instead of presenting it as solved.
📒 Files selected for processing (1)
design-proposals/compute-plane/README.md
| ### 5. Connectivity to tenant services | ||
| The headline workloads are nearly useless in isolation — a notebook, an LLM, an n8n flow exist to reach the tenant's data ("my Jupyter notebook talks to my managed Postgres"). But the tenant's managed Postgres runs as a Service in the tenant namespace **on the management cluster**, so reaching it is a ComputePlane→management flow — the direction Security §2 otherwise restricts. The resolution is to be precise about *which* plane is restricted. | ||
| **The guarantee is "no kube-API access / no creds to escalate," not "no packets ever."** Those are different planes. A database connection (ComputePlane → `postgres-pod:5432`) is a *data-plane* flow and can be allowed while ComputePlane → kube-apiserver stays denied. | ||
| **No mesh is required.** ComputePlane worker nodes are KubeVirt VMs attached to the management cluster's Cilium pod network (`packages/apps/kubernetes/templates/cluster.yaml` → `networks: - name: default; pod: {}`), so L3 adjacency already exists and is gated by `CiliumNetworkPolicy`. Connectivity is therefore a **scoping** problem, expressed with machinery already in the platform: | ||
| - **Egress to a tenant service** is granted by a narrow, per-service `CiliumNetworkPolicy`: allow ComputePlane workloads → a specific endpoint (the tenant's Postgres Service), deny ComputePlane → kube-apiserver. This is the same shape as the existing `policy.cozystack.io/allow-to-apiserver` label policy, pointed at a data-plane endpoint instead of the API. Per-service egress is narrower by construction than a node-to-node mesh, so untrusted workloads never get broad reach into the infra network. | ||
| - **Exposing a ComputePlane workload outward** reuses what the managed `kubernetes` app already ships (Design §4): `exposeMethod: Proxied` and kubevirt-ccm `Service type: LoadBalancer`; persistent storage uses the `kubevirt-csi-driver` path. | ||
| Open: who authorizes each tenant-service endpoint and how the per-service policy is generated (a tenant-scoped allowlist vs. an explicit "expose this service to my ComputePlane" action). The remaining secret-delivery half — getting the Postgres connection string into the workload — is tracked under Open questions; the **network** half is solved by the scoped policy above, and every such opening is narrow and audited by construction. |
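The per-service egress rule described in this section could take roughly the following shape. This is a sketch under assumptions: the policy name, namespace, and both label selectors are hypothetical, and the proposal does not yet say how the ComputePlane worker endpoints or the Postgres endpoints would be labeled.

```yaml
# Sketch only: names and labels are hypothetical placeholders.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: computeplane-to-postgres
  namespace: tenant-example          # tenant namespace on the management cluster
spec:
  # Selects the pods that back the ComputePlane worker VMs (assumed label).
  endpointSelector:
    matchLabels:
      example.cozystack.io/compute-plane: compute
  egress:
    # Allow only the tenant's Postgres endpoints, and only on the Postgres port.
    - toEndpoints:
        - matchLabels:
            example.cozystack.io/service: postgres   # assumed label on Postgres pods
      toPorts:
        - ports:
            - port: "5432"
              protocol: TCP
```

Because Cilium places an endpoint into default-deny for egress once any egress rule selects it, traffic from these endpoints to the kube-apiserver is dropped unless some other policy allows it. Anything else the workers legitimately need, such as DNS, would require its own narrow rule.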
🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift
Define the credential-delivery path, not just the network path.
Section 5 solves reachability, but the proposal still defers how a ComputePlane workload gets the managed-service secret it needs. That makes the motivating workloads incomplete, and Phase 1 is currently promising a capability whose authz/plumbing model is still open. Please either specify that mechanism here or move the rollout commitment behind it.
Also applies to: 216-228
Thanks for the revision (
I want to be clear up front about the verdict: I'm happy to see this built. ComputePlane is an excellent UX / managed-service offering: one-click managed deployment of code-executing apps that go beyond the existing platform catalog, with sane defaults the tenant doesn't have to assemble, on a cluster the operator keeps managed and can bill for. That's a real and worthwhile product.
My remaining objection is narrow but I think it's important: the Security section claims an isolation boundary that ComputePlane does not actually provide. It should be rewritten so the doc doesn't sell security where there isn't any.
The six guarantees are inherited from the managed-Kubernetes substrate, not provided by ComputePlane
From the management plane's point of view, a ComputePlane and a regular managed
None of this is new to ComputePlane. The doc enumerates the managed-Kubernetes trust model and credits it to ComputePlane.
The one ComputePlane-specific control provides no platform protection
The single thing a ComputePlane has that a tenant-run cluster doesn't is tamper-proof hardening: the tenant isn't cluster-admin, so platform-applied PSA / network policy / admission can't be stripped. The doc presents this as protecting the platform from a malicious tenant. It can't, and not just because the VM boundary already contains the escape. It can't in principle, because the hardened venue is optional for the attacker:
A tenant who actually holds a management-hijacking payload doesn't run it in the hardened ComputePlane. They provision a regular managed
The motivating threat ("container escape → host root → management API → every tenant's secrets") is a property of the substrate, which both venues share, so the fork is:
Either way the hardening adds nothing to platform safety. It would only protect the platform if tenants were also forbidden from provisioning their own unhardened managed clusters, which isn't the case and isn't proposed (that would be a far bigger change, and is the actual lever if the threat is believed).
So the hardening's only coherent security scope is intra-cluster: protecting the tenant from their own app's users (JupyterHub students, LLM-generated code), who are confined to whichever venue the tenant deployed. As a platform / multi-tenant boundary it isn't merely redundant, it's circular: it only "works" against an attacker who has agreed to attack from inside the box you hardened.
Suggested edits
This connects to the placement-target open question, which already floats
To be explicit: none of this blocks building it. I just want the design doc to describe the value as UX / managed-service rather than as an isolation guarantee the substrate already provides and the one new control can't enforce.
Adds a design proposal for compute planes.
What
A compute plane is a Cozystack-managed Kubernetes cluster that a tenant does not see or manage, onto which untrusted-code workloads (notebooks, workflow "code" nodes, plugin systems, custom components) are placed instead of into the tenant namespace on the management cluster. The compute plane has no credentials to and no network path to the management/infra control plane; the management cluster applies workloads into it one-way via Flux, and tenant access is proxied back through the normal ingress entry point.
Why
Cozystack's model treats a managed app as a single-purpose barrier the tenant cannot cross (you can't run an arbitrary binary inside your managed Postgres). A growing class of apps breaks that by design — their feature is arbitrary code execution. Co-locating those with the management plane is unsafe; this proposal extends the barrier property to them instead of weakening it. No known exploit — a latent gap to close before such apps reach shared/production clusters.
How it maps to existing primitives
Built entirely on things that already exist: the managed `kubernetes` app (Kamaji control plane + KubeVirt nodes, GPU node groups, autoscaler), tenant modules, and Flux remote apply via `HelmRelease.spec.kubeConfig.secretRef` (already used by the `kubernetes` app for its own addons). Delivered as a tenant module (`computePlane: "<profile>"`, a single-string profile reference: one source of truth, no inline override blob). App routing is a new `placement: { ManagementPlane | ComputePlane }` enum on `ApplicationDefinition` (default `ManagementPlane` = today's behavior).
Related proposals
#4 (tenant-module-overrides), #7 (cross-cluster-tenant-mesh), #8 / #9 (kubernetes-nodes).
Rendered: design-proposals/compute-plane/README.md