docs(operations): tenant Kubernetes cluster OIDC#591

Open
IvanHunters wants to merge 3 commits into main from feat/tenant-kube-oidc

IvanHunters commented Jun 25, 2026


What this PR does

Adds an operator-facing docs page for the new per-tenant Keycloak realm + OIDC-on-tenant-kube-apiserver feature. Companion to cozystack/cozystack#3044, which lands the chart implementation.

The page lives in next/operations/oidc/tenant_clusters.md, under the existing OIDC operations section that already covers the management cluster. It cross-links to the existing enable_oidc.md page so readers can tell the two flows apart.

Sections

  • Overview of the auto-provisioning cascade: apps/tenant lookup → realm + scope → _namespace.oidc-realm → apps/kubernetes per-cluster client / group / kube-apiserver flags → in-cluster ClusterRoleBinding via post-install Job.
  • Prerequisites (platform OIDC must be on; public DNS with valid TLS required).
  • Enable-on-a-Kubernetes-CR walkthrough.
  • Creating tenant-realm users and granting access via the cluster's realm group.
  • Wiring kubectl with kubelogin.
  • The four known limitations: orphan realm because helm-controller does not re-render on Helm lookup result changes, no caBundle / self-signed Keycloak support, hardcoded JWT preferred_username / groups claims, CRB orphan on runtime oidc.enabled toggle.
  • Troubleshooting recipes for 401 (issuer/aud mismatch), 403 (missing ClusterRoleBinding, Job log location), and stuck realm after CR deletion.
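
For orientation, the kubelogin wiring listed above could look roughly like the sketch below. It is not the page's exact recipe. It assumes the kubelogin plugin is installed as `kubectl oidc-login` and that the issuer follows the standard Keycloak layout `https://<keycloak-host>/realms/<realm>`. The helper names, the kubeconfig user name, and every argument value are placeholders introduced here for illustration.

```shell
#!/bin/sh
# Sketch: register a kubelogin-backed user in the local kubeconfig.
# Assumption: kubelogin is installed and reachable as `kubectl oidc-login`.
# Helper names and all argument values are hypothetical placeholders.

# Build the issuer URL for a Keycloak realm (standard Keycloak URL layout).
# $1 = Keycloak host, $2 = realm name
issuer_url() {
  printf 'https://%s/realms/%s\n' "$1" "$2"
}

# Add an exec-credential user that fetches tokens through kubelogin.
# $1 = kubeconfig user name, $2 = issuer URL, $3 = OIDC client ID
configure_oidc_user() {
  kubectl config set-credentials "$1" \
    --exec-api-version=client.authentication.k8s.io/v1beta1 \
    --exec-command=kubectl \
    --exec-arg=oidc-login \
    --exec-arg=get-token \
    --exec-arg="--oidc-issuer-url=$2" \
    --exec-arg="--oidc-client-id=$3"
}

# Example (not executed here; substitute your own values):
#   configure_oidc_user oidc "$(issuer_url <keycloak-host> <realm>)" <client-id>
```

The issuer URL and client ID passed here have to match the `--oidc-issuer-url` and `--oidc-client-id` flags on the tenant kube-apiserver; a mismatch is the 401 case covered in the troubleshooting section.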

Release note

```release-note
docs(operations): document per-tenant Keycloak realm and OIDC on tenant Kubernetes clusters. Covers the auto-provisioning flow (apps/tenant lookup → realm → apps/kubernetes per-cluster client / group / kube-apiserver flags → in-cluster ClusterRoleBinding), kubelogin setup, the four known limitations (Helm lookup non-reactivity, no self-signed Keycloak, hardcoded claims, runtime toggle), and troubleshooting.
```

Summary by CodeRabbit

  • Documentation
    • Added a new guide for enabling and operating OIDC authentication on tenant Kubernetes clusters.
    • Explained the separation of platform vs per-tenant realms and how tenant OIDC provisioning works.
    • Documented required prerequisites, kubectl configuration with kubelogin, and tenant RBAC/group-to-cluster-admin behavior.
    • Covered limitations and edge cases (e.g., fixed claim mappings, disabling behavior) plus troubleshooting for common 401/403 issues.

Operator-facing page covering per-tenant Keycloak realms and OIDC on
tenant kube-apiservers — companion to the existing OIDC docs which
cover the management cluster (`cozy` realm). Pairs with cozystack PR
cozystack/cozystack#3044.

Covers:
- the auto-provisioning flow (apps/tenant lookup → realm + scope →
  _namespace.oidc-realm → apps/kubernetes per-cluster client / group /
  kube-apiserver flags → in-cluster ClusterRoleBinding via Job);
- the enable-on-a-Kubernetes-CR workflow;
- creating users and granting access in the tenant realm;
- wiring kubectl with kubelogin;
- the four known limitations (orphan realm because helm-controller
  does not re-render on Helm `lookup` result changes; no caBundle /
  self-signed Keycloak; hardcoded JWT username/groups claims; CRB
  orphan on runtime oidc.enabled=true→false toggle);
- troubleshooting (401 with valid token, 403 for in-group user,
  stuck realm/scope after CR deletion).

Signed-off-by: IvanHunters <xorokhotnikov@gmail.com>

netlify Bot commented Jun 25, 2026


Deploy Preview for cozystack ready!

Name Link
🔨 Latest commit a542777
🔍 Latest deploy log https://app.netlify.com/projects/cozystack/deploys/6a3d9285b9e07f0008300ed7
😎 Deploy Preview https://deploy-preview-591--cozystack.netlify.app

coderabbitai Bot commented Jun 25, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 66b70ec1-569a-4b9b-b46d-d52b871fa84d

📥 Commits

Reviewing files that changed from the base of the PR and between be7368b and a542777.

📒 Files selected for processing (1)
  • content/en/docs/next/operations/oidc/tenant_clusters.md
✅ Files skipped from review due to trivial changes (1)
  • content/en/docs/next/operations/oidc/tenant_clusters.md

Walkthrough

Adds a new documentation page for tenant-cluster OIDC authentication, covering enablement, access setup, kubelogin configuration, limitations, and troubleshooting for Kamaji-backed tenant Kubernetes clusters.

Changes

Tenant OIDC documentation

| Layer / File(s) | Summary |
| --- | --- |
| Overview and prerequisites: content/en/docs/next/operations/oidc/tenant_clusters.md | Front matter, tenant OIDC page context, the provisioning flow, and platform/TLS prerequisites are added. |
| Enablement and login setup: content/en/docs/next/operations/oidc/tenant_clusters.md | The tenant access model, realm-group grant flow, and kubelogin configuration steps are added. |
| Limitations: content/en/docs/next/operations/oidc/tenant_clusters.md | Self-signed TLS, fixed claim mappings, runtime-disable cleanup, and direct-access token notes are added. |
| Troubleshooting: content/en/docs/next/operations/oidc/tenant_clusters.md | 401 and 403 troubleshooting guidance is added for OIDC flags, claim alignment, RBAC state, and Job logs. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 I hopped through realms and groups today,
With kubeconfig crumbs along the way.
OIDC shines in docs so neat,
Kamaji purrs in steady beat.
A carrot toast to every claim—
Hooray for clusters, by rabbit name!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the new documentation page about tenant Kubernetes cluster OIDC. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


gemini-code-assist Bot left a comment

Code Review

This pull request introduces documentation for configuring OIDC authentication on tenant Kubernetes clusters managed by Cozystack. The feedback suggests correcting a kubectl patch command to use the proper KeycloakClient field name (directAccessGrantsEnabled instead of directAccess) and improving a troubleshooting command by replacing a fragile jsonpath and tr pipeline with a robust go-template format.

Comment on lines +211 to +212
kubectl -n <tenant-namespace> patch keycloakclient kubernetes-<cluster> \
--type=merge --patch '{"spec":{"directAccess":true}}'


medium

There is a discrepancy between the text and the patch command. The text mentions directAccessGrantsEnabled, but the patch command uses directAccess. In standard Keycloak Operators, the field is typically spec.client.directAccessGrantsEnabled or spec.directAccessGrantsEnabled. Please update the patch command to use the correct field name to ensure it works as expected.

Suggested change
- kubectl -n <tenant-namespace> patch keycloakclient kubernetes-<cluster> \
-   --type=merge --patch '{"spec":{"directAccess":true}}'
+ kubectl -n <tenant-namespace> patch keycloakclient kubernetes-<cluster> \
+   --type=merge --patch '{"spec":{"client":{"directAccessGrantsEnabled":true}}}'
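
Because the prose and the patch variants disagree on the field name, it may be worth checking what the installed CRD actually exposes before settling on one. The sketch below is only an illustration: it assumes the `keycloakclient` resource name used in the command above and relies on `kubectl explain` failing for a path the published schema does not define. The helper names are hypothetical, and a CRD that preserves unknown fields could make the check inconclusive.

```shell
#!/bin/sh
# Sketch: ask the API server which spec fields the KeycloakClient CRD defines.
# `kubectl explain` prints the schema published by the installed CRD and
# exits non-zero when the requested field path does not exist.

# Succeeds if the given dotted path resolves in the CRD schema.
# $1 = field path, e.g. keycloakclient.spec.directAccess
crd_has_field() {
  kubectl explain "$1" >/dev/null 2>&1
}

# Print the first candidate path that exists, or fail if none does.
# $@ = candidate field paths, checked in order
first_existing_field() {
  for field in "$@"; do
    if crd_has_field "$field"; then
      printf '%s\n' "$field"
      return 0
    fi
  done
  return 1
}

# Example (not executed here):
#   first_existing_field \
#     keycloakclient.spec.directAccess \
#     keycloakclient.spec.directAccessGrantsEnabled \
#     keycloakclient.spec.client.directAccessGrantsEnabled
```

Whichever path the helper prints is the one the merge patch should target.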

Comment on lines +225 to +228
kubectl --context=mgmt -n <tenant-ns> get pod \
-l kamaji.clastix.io/name=<cluster> \
-o jsonpath='{.items[0].spec.containers[?(@.name=="kube-apiserver")].args}' | \
tr ',' '\n' | grep oidc


medium

Using jsonpath with tr ',' '\n' to parse container arguments is fragile because modern versions of kubectl format array fields as space-separated strings rather than comma-separated JSON-like arrays, which would cause tr to have no effect. Using go-template is a much more robust and built-in way to print container arguments one per line across all kubectl versions.

Suggested change
- kubectl --context=mgmt -n <tenant-ns> get pod \
-   -l kamaji.clastix.io/name=<cluster> \
-   -o jsonpath='{.items[0].spec.containers[?(@.name=="kube-apiserver")].args}' | \
-   tr ',' '\n' | grep oidc
+ kubectl --context=mgmt -n <tenant-ns> get pod \
+   -l kamaji.clastix.io/name=<cluster> \
+   -o go-template='{{range (index .items 0).spec.containers}}{{if eq .name "kube-apiserver"}}{{range .args}}{{.}}{{"\n"}}{{end}}{{end}}{{end}}' | grep oidc
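
Once the arguments are printed one per line, the 401 check (issuer or audience mismatch) can be automated with a small filter. This is a sketch under assumptions: it expects the flags in `--flag=value` form on stdin, uses the standard kube-apiserver flags `--oidc-issuer-url` and `--oidc-client-id`, and the function name `check_oidc_flags` is hypothetical.

```shell
#!/bin/sh
# Sketch: compare kube-apiserver OIDC flags (one per line on stdin)
# against the issuer and client ID the token is expected to carry.
# Assumes flags are written as --flag=value.

# $1 = expected issuer URL (token "iss"), $2 = expected client ID (token "aud")
check_oidc_flags() {
  expected_issuer=$1
  expected_client=$2
  rc=0
  flags=$(cat)   # slurp all apiserver args from stdin
  if ! printf '%s\n' "$flags" | grep -qxF -- "--oidc-issuer-url=$expected_issuer"; then
    echo "issuer mismatch: expected $expected_issuer"
    rc=1
  fi
  if ! printf '%s\n' "$flags" | grep -qxF -- "--oidc-client-id=$expected_client"; then
    echo "client ID mismatch: expected $expected_client"
    rc=1
  fi
  return "$rc"
}

# Example (not executed here): pipe the output of the kubectl command
# above into the check instead of `grep oidc`:
#   kubectl ... -o go-template='...' | check_oidc_flags <issuer-url> <client-id>
```

The function prints nothing and returns 0 when both flags match, so it can gate a script.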

Adversarial pass on the page from the prior commit found ten issues that
would either confuse the reader or break the doc on render. Fixes:

* Drop the `helm get notes` recipe — Cozystack uses Flux helm-controller,
  not a local helm CLI, so the command would not work for most operators.
  Replace with an explicit `kubectl get secret … | base64 -d` recipe that
  extracts the admin kubeconfig and dumps the cluster CA.
* Clarify the cross-link to `self-signed-certificates.md` — the management
  cluster workaround there does NOT apply to tenant apiservers (Kamaji
  owns their machine config, not the operator's Talos / talm flow). The
  prior phrasing implied a tenant workaround existed.
* Replace the optimistic "Within ≤ 5 minutes" with "up to ~10 minutes
  worst case" — the cascade is two sequential reconcile loops, not one,
  so 5 minutes was misleading for cold-start installs. Also document
  the `<release>-awaiting-oidc-realm` ConfigMap beacon between the two
  reconciles.
* Rename the "Runtime toggle of `oidc.enabled` from `true` to `false`"
  heading to "Runtime oidc.enabled toggle does not clean up bindings" —
  Hugo's TOC generator does not always cope with inline `code` in
  headings, and the new wording is more declarative anyway.
* Spell out the admin-kubeconfig extraction in every troubleshooting and
  cleanup recipe instead of leaving an `<admin-kubeconfig>` placeholder.
* Pin one set of placeholders for the whole page (tenant = `acme`,
  cluster = `prod-a`, root host = `acme.example.com`) and use them
  consistently in every example. The earlier draft mixed concrete and
  `<placeholder>` style across sections.
* Spell out the Job name (`kubernetes-prod-a-oidc-rbac`) rather than
  leaving `<release-name>-oidc-rbac` to be derived by the reader.
* Rewrite "no realm group matches against the now-disabled OIDC path"
  to the clearer "no realm group can match it once OIDC is off".
* Add an upfront `Tenant.spec.oidc.enabled` clarification: the field
  stays at its default `false` during normal operation, and the only
  legitimate use is the realm-cleanup workaround in Limitations. The
  prior draft mentioned the flag in two contexts without flagging that
  one of them was a workaround only.
* Add a top-of-page placeholder index so the reader can map the example
  names back to their own deployment.

Signed-off-by: IvanHunters <xorokhotnikov@gmail.com>
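
The naming conventions spelled out in this commit can be captured in a few helpers. This is a minimal sketch, assuming the Helm release for cluster `prod-a` is named `kubernetes-prod-a`, which is inferred from the Job name `kubernetes-prod-a-oidc-rbac` and the `<release-name>-oidc-rbac` pattern above. The function names are hypothetical. Treating the presence of the `<release>-awaiting-oidc-realm` ConfigMap as "still waiting" is also an assumption, since the commit message only calls it a beacon between the two reconciles.

```shell
#!/bin/sh
# Sketch: derive the resource names mentioned in the commit message
# from the cluster name. Assumes the release is named kubernetes-<cluster>.

release_name()  { printf 'kubernetes-%s\n' "$1"; }
rbac_job_name() { printf '%s-oidc-rbac\n' "$(release_name "$1")"; }
beacon_name()   { printf '%s-awaiting-oidc-realm\n' "$(release_name "$1")"; }

# Show the logs of the RBAC Job.
# $1 = namespace that hosts the Job, $2 = cluster name
rbac_job_logs() {
  kubectl -n "$1" logs "job/$(rbac_job_name "$2")"
}

# Succeeds while the beacon ConfigMap exists.
# Assumption: presence means the second reconcile has not run yet.
# $1 = namespace, $2 = cluster name
beacon_present() {
  kubectl -n "$1" get configmap "$(beacon_name "$2")" >/dev/null 2>&1
}

# Example (not executed here):
#   beacon_present <tenant-namespace> prod-a && echo "still waiting for the realm"
#   rbac_job_logs <tenant-namespace> prod-a
```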
@IvanHunters IvanHunters marked this pull request as ready for review June 25, 2026 12:58
@IvanHunters IvanHunters force-pushed the feat/tenant-kube-oidc branch from be7368b to febf81d Compare June 25, 2026 19:30
IvanHunters added a commit to cozystack/cozystack that referenced this pull request Jun 25, 2026
…dant flag, add tests, rewrite docs

Self-review of the recent oidc refactor surfaced six issues. Pre-commit
on the previous push also failed because two regenerated artifacts
weren't refreshed locally before commit. All addressed in one batch.

Pre-commit recovery:

* packages/apps/tenant/README.md — refreshed by cozyvalues-gen to
  match the flattened `oidc: bool` value (was still rendering the old
  struct shape).
* packages/system/tenant-rd/cozyrds/tenant.yaml — keysOrder regenerated
  by hack/update-crd.sh; lists `oidc` after `resourceQuotas` (matches
  the order in api/tenant/types.go after the flatten).

Real bugs the new unittest caught:

* packages/extra/oidc/templates/admin-user.yaml — `index .Values._cluster
  "root-host"` nil-panicked when `_cluster` was absent (helm template
  standalone, helm-unittest fixtures without `_cluster` set). In
  production `_cluster` is always populated by apps/tenant via
  cozystack-values; the panic only surfaces in test environments and
  blocked the new unittest. Guard with `.Values._cluster | default dict`
  before the index.

Cosmetic:

* Drop `keepResource: true` from KeycloakRealmUser — the CRD default
  is already `true`, the explicit field was verbose dead text.

New helm-unittest coverage for packages/extra/oidc/:

* tests/admin_secret_test.yaml — asserts the keycloak-admin Secret
  carries the expected fields, the realm-admin user CR points at the
  right realm + the right passwordSecret + the right realm-management
  client role mapping, and the chart survives missing `_cluster` (the
  regression test that caught the panic above).
* tests/realm_test.yaml — asserts ClusterKeycloakRealm + KeycloakClientScope
  groups render with the right shape, the v1.edp.epam.com capability
  guard suppresses them in bootstrap, session-lifetime overrides
  propagate.
* `make test` was already there; it just had nothing to run.

Documentation rewrite — the previous docs described the old inline
auto-provision architecture and referenced names (`kubernetes-<cluster>`,
`Tenant.spec.oidc.enabled` struct, lookup-based child detection) that
no longer exist:

* docs/oidc-tenant.md — rewritten end-to-end. Explains the
  tenant-module pattern + the realm inheritance behaviour + the
  realm-admin Secret + the new namespace-prefixed Keycloak identifiers.
  Adds a new Limitation section covering "disabling parent OIDC while
  descendant clusters use the inherited realm" (eventually consistent,
  bounded by helm-controller reconcile interval).
* /tmp/website/content/en/docs/next/operations/oidc/tenant_clusters.md
  — same rewrite for the user-facing operator docs on the website
  side (commit lives in cozystack/website#591).

E2E coverage gap intentionally not addressed in this commit — adding
a multi-tenant inheritance test (parent owns realm, child Kubernetes
CR wires against it) would extend the e2e sandbox runtime by ~5-10
min and is better tracked as a follow-up.

Validated:
* apps/tenant helm-unittest: 16/16 PASS
* apps/kubernetes helm-unittest: 140/140 PASS
* extra/oidc helm-unittest: 8/8 PASS (NEW)
* bats parses (12 tests)

Signed-off-by: IvanHunters <xorokhotnikov@gmail.com>
The previous version of this page described the inline-auto-provision
architecture where apps/tenant rendered the realm directly and used
Helm lookup to detect child Kubernetes CRs. That design is gone —
realm provisioning now lives in extra/oidc, gated by a plain
Tenant.spec.oidc bool (same shape as etcd / monitoring / ingress).

Rewrite covers:
* The tenant-module pattern (Tenant.spec.oidc=true → apps/tenant
  renders an `oidc` HR → extra/oidc provisions realm + admin user +
  keycloak-admin Secret).
* The keycloak-admin Secret in the tenant namespace (url + username +
  password + realm), surfaced through the dashboard via
  spec.secrets.include.
* Realm inheritance — descendant tenants inherit the parent's realm
  through _namespace.oidc-realm; realm-wide unique Keycloak identifiers
  (<tenant-namespace>-kubernetes-<cluster>) prevent sibling collisions.
* Identity-admin delegation living with the realm-owning tenant only.
* Limitation: disabling parent OIDC while descendant clusters use the
  inherited realm — eventually consistent within one helm-controller
  reconcile interval.

Pairs with cozystack#3044 commit 2e52384c1.

Signed-off-by: IvanHunters <xorokhotnikov@gmail.com>
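
As an illustration of the rewritten model, the sketch below reads the realm-admin credentials and builds the namespace-prefixed client identifier. It assumes the Secret is literally named `keycloak-admin` and carries the keys `url`, `username`, `password`, and `realm`, as described in the commit message. The helper names are hypothetical, and the exact Secret name should be verified against the installation.

```shell
#!/bin/sh
# Sketch: read the realm-admin Secret and build the realm-wide unique
# Keycloak identifier for a tenant cluster.
# Assumption: the Secret is named keycloak-admin in the tenant namespace.

# Identifier pattern: <tenant-namespace>-kubernetes-<cluster>
# $1 = tenant namespace, $2 = cluster name
keycloak_client_id() {
  printf '%s-kubernetes-%s\n' "$1" "$2"
}

# Print one decoded field of the keycloak-admin Secret.
# $1 = tenant namespace, $2 = key (url | username | password | realm)
keycloak_admin_field() {
  kubectl -n "$1" get secret keycloak-admin -o "jsonpath={.data.$2}" | base64 -d
}

# Example (not executed here):
#   keycloak_admin_field <tenant-namespace> url
#   keycloak_client_id <tenant-namespace> <cluster>
```

Prefixing the identifier with the tenant namespace is what keeps sibling tenants that share an inherited realm from colliding on the same cluster name.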