Design proposal: Coroot eBPF observability option for Cozystack by gecube · Pull Request #22 · cozystack/community

George Gaál (gecube) · 2026-06-24T11:11:06Z

Migrates discussion cozystack/cozystack#3028 into the design-proposal process.

Adds design-proposals/coroot-ebpf-observability/README.md: add Coroot (Apache-2.0, eBPF) as a zero-instrumentation observability option — service maps, tracing, profiling — reusing Cozystack's existing VictoriaMetrics and ClickHouse backends. Includes a low-commitment agents-only entry point.

Source discussion: cozystack/cozystack#3028

Sibling proposal (migrated together): #21

DCO: commit is signed off.

Migrated from discussion cozystack/cozystack#3028 to the design-proposal process for review. Signed-off-by: Gaál György <gb12335@gmail.com>

coderabbitai · 2026-06-24T11:11:15Z

Warning

Review limit reached

@gecube, we couldn't start this review because you've reached your PR review rate limit.


gemini-code-assist

Code Review

This pull request introduces a design proposal to integrate Coroot as a zero-instrumentation (eBPF) observability option for Cozystack, utilizing existing VictoriaMetrics and ClickHouse backends. The review feedback suggests improving the proposal by recommending minimum Linux capabilities (such as CAP_BPF and CAP_PERFMON) instead of full privileges for the daemonset to mitigate security risks, and defining how the agent handles backpressure and resource limits when backend storage is unavailable.


gemini-code-assist · 2026-06-24T11:11:48Z

+
+## Security
+
+eBPF agents run privileged on each node — a new trust boundary that must be evaluated, especially for the tenant-app mode where namespace-scoped tenants would gain access to kernel-level telemetry. The per-tenant eBPF privilege model is an explicit open question; tenant-app mode is deferred to a later phase partly for this reason. SSO and ingress hardening for the UI land in Phase 2.


Running eBPF agents with full privileges (privileged: true) poses a significant security risk. The proposal should state explicitly that the daemonset is to be configured with the minimum necessary Linux capabilities (such as CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_RESOURCE) rather than full privileges. The exact set depends on the minimum supported kernel version.

Suggested change

eBPF agents run privileged on each node — a new trust boundary that must be evaluated, especially for the tenant-app mode where namespace-scoped tenants would gain access to kernel-level telemetry. To mitigate host compromise risks, the daemonset should be configured with the minimum necessary Linux capabilities (e.g., CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN) rather than full container privileges where supported by the kernel. The per-tenant eBPF privilege model is an explicit open question; tenant-app mode is deferred to a later phase partly for this reason. SSO and ingress hardening for the UI land in Phase 2.
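The capability recommendation above could be enforced with a small pre-deployment check. The following is a minimal sketch, not part of the proposal: `ALLOWED_CAPS` and `check_security_context` are hypothetical names, the allow-list simply mirrors the capabilities named in this review, and whether the Coroot agent actually runs with only these capabilities is an assumption that would need testing. Kubernetes writes capability names without the `CAP_` prefix, and `CAP_BPF` and `CAP_PERFMON` exist only on Linux 5.8 and later.

```python
# Hypothetical allow-list taken from the review comment; the real set
# depends on the agent and on the minimum supported kernel version.
ALLOWED_CAPS = {"BPF", "PERFMON", "NET_ADMIN", "SYS_RESOURCE"}


def check_security_context(ctx: dict) -> list[str]:
    """Return findings for one container securityContext (empty list = OK)."""
    findings = []

    # privileged: true bypasses capability restrictions entirely.
    if ctx.get("privileged", False):
        findings.append("privileged: true grants every capability")

    caps = ctx.get("capabilities") or {}

    # Normalise names so both "CAP_BPF" and "BPF" are accepted as input.
    added = {c.upper().removeprefix("CAP_") for c in caps.get("add", [])}
    extra = added - ALLOWED_CAPS
    if extra:
        findings.append("unexpected capabilities: " + ", ".join(sorted(extra)))

    # Without drop: [ALL] the container keeps the runtime's default set.
    if "ALL" not in {c.upper() for c in caps.get("drop", [])}:
        findings.append("drop: [ALL] is missing")

    return findings
```

A check like this would flag a plain `privileged: true` context and pass one that drops all capabilities and adds back only the allow-listed ones.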

gemini-code-assist · 2026-06-24T11:11:48Z

+## Failure and edge cases
+
+- **Coroot UI / control plane down** → agents keep shipping to VictoriaMetrics/ClickHouse; dashboards in Grafana remain usable (agents-only mode is unaffected by UI availability).
+- **eBPF unsupported on a node kernel** → agent should degrade/skip rather than crash-loop; document the minimum kernel.


When the backend storage (VictoriaMetrics or ClickHouse) is unavailable or experiencing high latency, eBPF agents can consume significant memory or CPU while buffering events. The proposal should define how the agent handles backpressure and which resource limits apply under these conditions.

Suggested change

- **eBPF unsupported on a node kernel** → agent should degrade/skip rather than crash-loop; document the minimum kernel.

- **Backend backpressure / unavailability** → if VictoriaMetrics or ClickHouse is down or slow, the eBPF agent must safely drop or buffer telemetry without causing node-level memory exhaustion or high CPU overhead.
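One way to read "safely drop or buffer" in the backpressure item is a fixed-size buffer that evicts the oldest events once full, so memory stays bounded however long the backend is down. The sketch below illustrates that policy only; `BoundedTelemetryBuffer` and the `send` callback are hypothetical, and nothing here describes how the Coroot agent actually behaves.

```python
from collections import deque


class BoundedTelemetryBuffer:
    """Fixed-capacity buffer that drops the oldest event when full."""

    def __init__(self, max_items: int):
        # deque(maxlen=...) evicts from the left on append once full.
        self._items = deque(maxlen=max_items)
        self.dropped = 0  # exposed so drops can be reported as a metric

    def __len__(self) -> int:
        return len(self._items)

    def push(self, item) -> None:
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # the append below evicts the oldest event
        self._items.append(item)

    def flush(self, send) -> int:
        """Send buffered events in order; stop at the first failure."""
        sent = 0
        while self._items:
            try:
                send(self._items[0])
            except ConnectionError:
                break  # backend still unavailable; keep remaining events
            self._items.popleft()
            sent += 1
        return sent
```

Dropping the oldest events favors fresh telemetry over completeness. A design that must not lose data would need a different policy, such as spilling to disk with its own size limit.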

Add design proposal: Coroot eBPF observability option for Cozystack

29692a9


gemini-code-assist Bot reviewed Jun 24, 2026


George Gaál (gecube) mentioned this pull request Jun 24, 2026

Design proposal: self-hosted in-cluster registry for air-gapped Cozystack #21

Open

George Gaál (gecube) wants to merge 1 commit into main from proposal/coroot-ebpf-observability
