feat(sandbox): add Platform network mode for restricted K8s platforms #15
Open
Ladas wants to merge 1 commit into
This was referenced Jun 12, 2026
421a0a7 to
869b3f0
Compare
Ladas added a commit that referenced this pull request Jun 12, 2026
Add kernel-level network syscall interception using SECCOMP_RET_USER_NOTIF for Platform mode. Provides mandatory, syscall-level enforcement without any capabilities.

DnsPinnedAllowlist: resolve domains to IPs at sandbox creation, freeze for session lifetime (DNS rebinding prevention).

BPF filter intercepts: connect, sendto, sendmsg, recvfrom, recvmsg, bind. Validates AUDIT_ARCH to prevent x32/compat ABI bypass.

Linux syscall wrappers:
- notification fd ioctls
- pidfd_open/pidfd_getfd for on-behalf-of operations (TOCTOU-safe)
- read_process_memory with read_exact (no short reads)
- sockaddr parser (correct endianness for sa_family, port, flowinfo)
- verify_socket_fd (mitigates fd-swap race)
- deny/allow_connect response helpers

Code review fixes applied across all PRs:
- PR #15: gateway propagates network_enforcement to DriverSandboxSpec
- PR #15: driver uses typed enum comparison (not magic integer)
- PR #16: saturating_sub prevents underflow in Landlock skipped count
- PR #16: warn!() on TCP port restriction failure (was debug)
- PR #17: BPF arch check, recvfrom/recvmsg/bind interception, verify_socket_fd, read_exact, allow_connect rename, flowinfo endianness, safety comments on all unsafe blocks

8 tests. Compiles, 949 tests pass, clippy clean.

Ref: NVIDIA#899
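The endianness point in the sockaddr parser item is easy to get wrong, so here is a minimal sketch of the idea. It is not the code from this commit: `ParsedAddr` and `parse_sockaddr` are hypothetical names, the buffer is assumed to be the raw bytes already copied out of the target process, and the layout is the Linux `sockaddr_in` / `sockaddr_in6` layout. The family field is read in host byte order, while port and flowinfo are read as big-endian (network order).

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

// Linux address-family values.
const AF_INET: u16 = 2;
const AF_INET6: u16 = 10;

/// Hypothetical parsed form; the type used in the PR is not shown here.
#[derive(Debug, PartialEq, Eq)]
pub struct ParsedAddr {
    pub ip: IpAddr,
    pub port: u16,
    /// Always 0 for IPv4.
    pub flowinfo: u32,
}

/// Parse raw sockaddr bytes. Returns None for short buffers or
/// address families other than AF_INET / AF_INET6.
pub fn parse_sockaddr(buf: &[u8]) -> Option<ParsedAddr> {
    // sa_family is stored in host (native) byte order.
    let family = u16::from_ne_bytes([*buf.first()?, *buf.get(1)?]);
    match family {
        AF_INET => {
            // sockaddr_in: family(2) port(2) addr(4) padding(8) = 16 bytes.
            if buf.len() < 16 {
                return None;
            }
            // Port is in network byte order (big-endian).
            let port = u16::from_be_bytes([buf[2], buf[3]]);
            let ip = Ipv4Addr::new(buf[4], buf[5], buf[6], buf[7]);
            Some(ParsedAddr { ip: IpAddr::V4(ip), port, flowinfo: 0 })
        }
        AF_INET6 => {
            // sockaddr_in6: family(2) port(2) flowinfo(4) addr(16) scope_id(4) = 28 bytes.
            if buf.len() < 28 {
                return None;
            }
            let port = u16::from_be_bytes([buf[2], buf[3]]);
            // flowinfo is treated as big-endian, matching the Linux kernel's __be32 field.
            let flowinfo = u32::from_be_bytes([buf[4], buf[5], buf[6], buf[7]]);
            let mut octets = [0u8; 16];
            octets.copy_from_slice(&buf[8..24]);
            Some(ParsedAddr { ip: IpAddr::V6(Ipv6Addr::from(octets)), port, flowinfo })
        }
        _ => None,
    }
}
```

Rejecting short buffers and unknown families with `None` lets the caller deny the syscall by default instead of guessing at a partially read address.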
869b3f0 to
ec5655e
Compare
7158b3d to
b44b196
Compare
Author: /ok
Author: /ok to test
2b05c2c to
42158f3
Compare
Add NetworkMode::Platform for running the supervisor without elevated capabilities on Kubernetes platforms enforcing the restricted Pod Security Standard (including OpenShift restricted-v2 SCC).

Platform Mode keeps Landlock filesystem isolation, seccomp syscall filtering, OPA policy evaluation, credential injection, and L7 inspection via a loopback CONNECT proxy. It replaces the network namespace (which requires CAP_SYS_ADMIN + CAP_NET_ADMIN) with:
- Loopback proxy binding (127.0.0.1 instead of veth interface)
- K8s driver: zero capabilities, drop ALL, non-root UID
- seccomp: block SOCK_DGRAM (UDP) in Platform mode to match the nftables UDP reject in namespace mode -- the proxy resolves DNS on behalf of the agent, so UDP is not needed
- Landlock scope: restrict abstract Unix sockets and signals (ABI v5+, BestEffort degrades on older kernels)

Security parity with namespace mode:

| Attack                 | Namespace mode         | Platform mode            |
|------------------------|------------------------|--------------------------|
| TCP bypass proxy       | nftables REJECT        | Landlock port 3128 only  |
| UDP exfiltration       | nftables REJECT        | seccomp SOCK_DGRAM block |
| DNS tunneling          | no UDP accept rule     | no SOCK_DGRAM            |
| Abstract Unix sockets  | netns isolation        | Landlock scope           |
| Signals to supervisor  | N/A (same netns)       | Landlock scope           |
| Container escape       | Risk (CAP_SYS_ADMIN)   | Impossible (zero caps)   |

Remaining gap: Landlock NetPort allows port 3128 on any IP (not just loopback). Mitigate with egress NetworkPolicy denying all sandbox pod egress -- loopback traffic is unaffected by NetworkPolicy.

Proto: add NetworkEnforcementMode enum and field to SandboxPolicy and DriverSandboxSpec. Default NAMESPACE (0) preserves existing behavior; PLATFORM (1) activates the new mode.

Signed-off-by: Ladislav Smola <lsmola@redhat.com>
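To illustrate the proto note above together with the "typed enum comparison (not magic integer)" review fix, the following is a rough sketch rather than the generated proto code from this PR. The Rust enum, its `TryFrom<i32>` impl, and the `requires_network_namespace` helper are assumptions made for illustration. Only the names NAMESPACE (0) and PLATFORM (1) and the default come from the commit message.

```rust
use std::convert::TryFrom;

/// Illustrative mirror of the proto enum: NAMESPACE = 0, PLATFORM = 1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetworkEnforcementMode {
    Namespace = 0,
    Platform = 1,
}

impl Default for NetworkEnforcementMode {
    // An unset proto field decodes as 0, so existing behavior is preserved.
    fn default() -> Self {
        NetworkEnforcementMode::Namespace
    }
}

impl TryFrom<i32> for NetworkEnforcementMode {
    type Error = i32;

    // Convert the wire integer once, at the boundary. Unknown values are
    // returned as an error here; how the real driver handles them is not
    // stated in the PR.
    fn try_from(raw: i32) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(NetworkEnforcementMode::Namespace),
            1 => Ok(NetworkEnforcementMode::Platform),
            other => Err(other),
        }
    }
}

/// Hypothetical helper: compares against the typed variant instead of
/// writing `raw == 0` at each call site.
pub fn requires_network_namespace(mode: NetworkEnforcementMode) -> bool {
    mode == NetworkEnforcementMode::Namespace
}
```

Converting once and comparing variants afterwards means a renumbered or newly added proto value surfaces as an explicit conversion error instead of silently matching the wrong branch.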
42158f3 to
5991fec
Compare
Summary
Add `NetworkMode::Platform` for running OpenShell without elevated capabilities on `restricted-v2` SCC (OpenShift) and restricted PSS (Kubernetes).
Keeps Landlock, seccomp, OPA, credential injection, and loopback CONNECT proxy.
Replaces network namespace with K8s NetworkPolicy for L3/L4 enforcement.
Capabilities eliminated: CAP_SYS_ADMIN, CAP_NET_ADMIN, CAP_SYS_PTRACE, CAP_SYSLOG. Also removes the runAsUser: 0 (root) requirement.
9 files, +219/-119 lines. Compiles clean, tests pass, clippy clean.
Ref: NVIDIA#899
Assisted-By: Claude Code