feat(sandbox): add Platform network mode for restricted K8s platforms #15

Open
Ladas wants to merge 1 commit into mvp-v2 from feat/platform-mode

Ladas commented Jun 12, 2026

Summary

Add NetworkMode::Platform for running OpenShell without elevated capabilities
on restricted-v2 SCC (OpenShift) and restricted PSS (Kubernetes).

Keeps Landlock, seccomp, OPA, credential injection, and loopback CONNECT proxy.
Replaces network namespace with K8s NetworkPolicy for L3/L4 enforcement.

Capabilities eliminated: CAP_SYS_ADMIN, CAP_NET_ADMIN, CAP_SYS_PTRACE,
CAP_SYSLOG, runAsUser: 0.

9 files, +219/-119 lines. Compiles clean, tests pass, clippy clean.

Ref: NVIDIA#899

Assisted-By: Claude Code

Ladas force-pushed the feat/platform-mode branch from 421a0a7 to 869b3f0 on June 12, 2026 16:25
Ladas added a commit that referenced this pull request Jun 12, 2026
Add kernel-level network syscall interception using SECCOMP_RET_USER_NOTIF
for Platform mode. Provides mandatory, syscall-level enforcement without
any capabilities.

DnsPinnedAllowlist: resolve domains to IPs at sandbox creation, freeze
for session lifetime (DNS rebinding prevention).
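
The pinning idea described above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the type name `DnsPinnedAllowlist` comes from the commit message, while the `resolve_with`, `resolve`, and `is_allowed` methods and the injected resolver closure are assumptions made so the example stays self-contained.

```rust
use std::collections::{HashMap, HashSet};
use std::io;
use std::net::{IpAddr, ToSocketAddrs};

/// Hypothetical pinned allowlist: each domain is resolved once, at
/// construction, and the resulting IP set is never refreshed.
#[derive(Debug)]
pub struct DnsPinnedAllowlist {
    pinned: HashMap<String, HashSet<IpAddr>>,
}

impl DnsPinnedAllowlist {
    /// Resolve every domain exactly once with the supplied resolver and
    /// freeze the result. The resolver is injected (an assumption of this
    /// sketch) so the logic can be exercised without network access.
    pub fn resolve_with<F>(domains: &[&str], mut resolver: F) -> io::Result<Self>
    where
        F: FnMut(&str) -> io::Result<Vec<IpAddr>>,
    {
        let mut pinned = HashMap::new();
        for &domain in domains {
            let ips: HashSet<IpAddr> = resolver(domain)?.into_iter().collect();
            pinned.insert(domain.to_string(), ips);
        }
        Ok(DnsPinnedAllowlist { pinned })
    }

    /// Convenience constructor that uses the system resolver.
    pub fn resolve(domains: &[&str]) -> io::Result<Self> {
        Self::resolve_with(domains, |d| {
            Ok((d, 0u16).to_socket_addrs()?.map(|sa| sa.ip()).collect())
        })
    }

    /// A destination is allowed only if it was among the addresses seen
    /// at creation time; later DNS answers cannot widen the set.
    pub fn is_allowed(&self, ip: IpAddr) -> bool {
        self.pinned.values().any(|ips| ips.contains(&ip))
    }
}
```

Because lookups never happen after construction, a domain that later rebinds to a different address gains nothing: the new address is simply absent from the frozen set.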

BPF filter intercepts: connect, sendto, sendmsg, recvfrom, recvmsg,
bind. Validates AUDIT_ARCH to prevent x32/compat ABI bypass.
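
For readers unfamiliar with the shape of such a filter, here is a rough sketch of a classic BPF program that checks `AUDIT_ARCH` first and then routes the six listed syscalls to `SECCOMP_RET_USER_NOTIF`. It is x86_64-only, and several choices are assumptions rather than facts from this PR: killing the process on an architecture mismatch, rejecting numbers with the x32 bit set, and the helper names `build_filter` and `simulate`. The small interpreter exists only to check the jump offsets in user space; it does not install anything.

```rust
/// Classic BPF instruction, laid out like the kernel's `struct sock_filter`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SockFilter {
    pub code: u16,
    pub jt: u8,
    pub jf: u8,
    pub k: u32,
}

// Opcode and return-value constants from the Linux UAPI headers.
const BPF_LD_W_ABS: u16 = 0x20; // BPF_LD | BPF_W | BPF_ABS
const BPF_JEQ_K: u16 = 0x15; // BPF_JMP | BPF_JEQ | BPF_K
const BPF_JGE_K: u16 = 0x35; // BPF_JMP | BPF_JGE | BPF_K
const BPF_RET_K: u16 = 0x06; // BPF_RET | BPF_K
const SECCOMP_RET_KILL_PROCESS: u32 = 0x8000_0000;
const SECCOMP_RET_USER_NOTIF: u32 = 0x7fc0_0000;
const SECCOMP_RET_ALLOW: u32 = 0x7fff_0000;
const AUDIT_ARCH_X86_64: u32 = 0xC000_003E;
const X32_SYSCALL_BIT: u32 = 0x4000_0000;
// Offsets into `struct seccomp_data`.
const OFF_NR: u32 = 0;
const OFF_ARCH: u32 = 4;

/// x86_64 numbers for connect, sendto, recvfrom, sendmsg, recvmsg, bind.
pub const INTERCEPTED: [u32; 6] = [42, 44, 45, 46, 47, 49];

fn stmt(code: u16, k: u32) -> SockFilter {
    SockFilter { code, jt: 0, jf: 0, k }
}

fn jump(code: u16, k: u32, jt: u8, jf: u8) -> SockFilter {
    SockFilter { code, jt, jf, k }
}

/// Build the filter: arch check, x32 check, then one JEQ per syscall.
pub fn build_filter() -> Vec<SockFilter> {
    let mut prog = vec![
        stmt(BPF_LD_W_ABS, OFF_ARCH),
        jump(BPF_JEQ_K, AUDIT_ARCH_X86_64, 1, 0), // match: skip the kill
        stmt(BPF_RET_K, SECCOMP_RET_KILL_PROCESS),
        stmt(BPF_LD_W_ABS, OFF_NR),
        jump(BPF_JGE_K, X32_SYSCALL_BIT, 0, 1), // x32 bit set: fall into kill
        stmt(BPF_RET_K, SECCOMP_RET_KILL_PROCESS),
    ];
    let n = INTERCEPTED.len();
    for (j, nr) in INTERCEPTED.iter().enumerate() {
        // On a match, jump over the remaining comparisons and the ALLOW.
        prog.push(jump(BPF_JEQ_K, *nr, (n - j) as u8, 0));
    }
    prog.push(stmt(BPF_RET_K, SECCOMP_RET_ALLOW));
    prog.push(stmt(BPF_RET_K, SECCOMP_RET_USER_NOTIF));
    prog
}

/// Tiny interpreter for the four opcodes used above (testing aid only).
pub fn simulate(prog: &[SockFilter], arch: u32, nr: u32) -> u32 {
    let mut acc = 0u32;
    let mut pc = 0usize;
    loop {
        let ins = prog[pc];
        pc += 1;
        match ins.code {
            BPF_LD_W_ABS => acc = if ins.k == OFF_ARCH { arch } else { nr },
            BPF_JEQ_K => {
                let off = if acc == ins.k { ins.jt } else { ins.jf };
                pc += usize::from(off);
            }
            BPF_JGE_K => {
                let off = if acc >= ins.k { ins.jt } else { ins.jf };
                pc += usize::from(off);
            }
            BPF_RET_K => return ins.k,
            other => panic!("unsupported opcode {:#x}", other),
        }
    }
}
```

Checking the architecture before the syscall number matters because the same number means a different syscall under another ABI; without the check, a process could issue `connect` through a compat entry point that the filter does not recognize.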

Linux syscall wrappers: notification fd ioctls, pidfd_open/pidfd_getfd
for on-behalf-of operations (TOCTOU-safe), read_process_memory with
read_exact (no short reads), sockaddr parser (correct endianness for
sa_family, port, flowinfo), verify_socket_fd (mitigates fd-swap race),
deny/allow_connect response helpers.
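
The endianness point in the sockaddr parser can be illustrated with a short sketch. The function name `parse_sockaddr` and its signature are hypothetical; the layout follows the Linux `sockaddr_in` and `sockaddr_in6` structures, where `sa_family` is in host byte order while the port and flowinfo are in network byte order. The `AF_*` values are the Linux ones.

```rust
use std::net::{Ipv4Addr, Ipv6Addr, SocketAddr, SocketAddrV4, SocketAddrV6};

const AF_INET: u16 = 2; // Linux value
const AF_INET6: u16 = 10; // Linux value

/// Parse raw bytes copied from the target process into a socket address.
/// Returns `None` for unknown families or truncated buffers.
pub fn parse_sockaddr(buf: &[u8]) -> Option<SocketAddr> {
    if buf.len() < 2 {
        return None;
    }
    // sa_family is a host-endian u16.
    let family = u16::from_ne_bytes([buf[0], buf[1]]);
    match family {
        AF_INET if buf.len() >= 16 => {
            // sin_port is big-endian (network order).
            let port = u16::from_be_bytes([buf[2], buf[3]]);
            let ip = Ipv4Addr::new(buf[4], buf[5], buf[6], buf[7]);
            Some(SocketAddr::V4(SocketAddrV4::new(ip, port)))
        }
        AF_INET6 if buf.len() >= 28 => {
            let port = u16::from_be_bytes([buf[2], buf[3]]);
            // sin6_flowinfo is also in network order.
            let flowinfo = u32::from_be_bytes([buf[4], buf[5], buf[6], buf[7]]);
            let mut octets = [0u8; 16];
            octets.copy_from_slice(&buf[8..24]);
            // sin6_scope_id is host-endian.
            let scope_id = u32::from_ne_bytes([buf[24], buf[25], buf[26], buf[27]]);
            let ip = Ipv6Addr::from(octets);
            Some(SocketAddr::V6(SocketAddrV6::new(ip, port, flowinfo, scope_id)))
        }
        _ => None,
    }
}
```

Length checks come before any field access so that a short or hostile buffer from the sandboxed process yields `None` instead of a panic.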

Code review fixes applied across all PRs:
- PR #15: gateway propagates network_enforcement to DriverSandboxSpec
- PR #15: driver uses typed enum comparison (not magic integer)
- PR #16: saturating_sub prevents underflow in Landlock skipped count
- PR #16: warn!() on TCP port restriction failure (was debug)
- PR #17: BPF arch check, recvfrom/recvmsg/bind interception,
  verify_socket_fd, read_exact, allow_connect rename, flowinfo
  endianness, safety comments on all unsafe blocks
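
As a small illustration of the `saturating_sub` fix listed above: plain subtraction on unsigned integers panics in debug builds and wraps in release builds when the right-hand side is larger. The function and parameter names below are hypothetical; only the use of `saturating_sub` comes from the commit message.

```rust
/// Hypothetical counter: how many requested Landlock rules were skipped.
/// If `applied` ever exceeds `requested`, the result clamps to zero
/// instead of underflowing.
pub fn skipped_rules(requested: usize, applied: usize) -> usize {
    requested.saturating_sub(applied)
}
```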

8 tests. Compiles, 949 tests pass, clippy clean.

Ref: NVIDIA#899
Ladas force-pushed the feat/platform-mode branch from 869b3f0 to ec5655e on June 16, 2026 12:43
Ladas force-pushed the feat/platform-mode branch 2 times, most recently from 7158b3d to b44b196, on June 17, 2026 15:05
Ladas commented Jun 19, 2026

/ok

Ladas commented Jun 19, 2026

/ok to test

Ladas force-pushed the feat/platform-mode branch 2 times, most recently from 2b05c2c to 42158f3, on June 24, 2026 15:13
Add NetworkMode::Platform for running the supervisor without elevated
capabilities on Kubernetes platforms enforcing the restricted Pod
Security Standard (including OpenShift restricted-v2 SCC).

Platform mode keeps Landlock filesystem isolation, seccomp syscall
filtering, OPA policy evaluation, credential injection, and L7
inspection via a loopback CONNECT proxy. It replaces the network
namespace (which requires CAP_SYS_ADMIN + CAP_NET_ADMIN) with:

- Loopback proxy binding (127.0.0.1 instead of veth interface)
- K8s driver: zero capabilities, drop ALL, non-root UID
- seccomp: block SOCK_DGRAM (UDP) in Platform mode to match the
  nftables UDP reject in namespace mode -- the proxy resolves
  DNS on behalf of the agent, so UDP is not needed
- Landlock scope: restrict abstract Unix sockets and signals
  (ABI v6+, BestEffort degrades on older kernels)
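
One detail worth spelling out for the `SOCK_DGRAM` block: the `type` argument of `socket(2)` can carry flag bits such as `SOCK_NONBLOCK` and `SOCK_CLOEXEC`, so a check has to mask them off before comparing. The sketch below shows that decision in plain Rust. The function name is hypothetical, the constants are the common Linux values (they differ on a few architectures), and the real filter would express the same test in BPF; whether it also considers the address family is not stated in the commit message.

```rust
const SOCK_DGRAM: u32 = 2;
/// Low bits of the `type` argument that hold the socket type.
const SOCK_TYPE_MASK: u32 = 0xf;
pub const SOCK_NONBLOCK: u32 = 0o4000;
pub const SOCK_CLOEXEC: u32 = 0o2000000;

/// Returns true when the `type` argument to socket(2) requests a datagram
/// socket, regardless of any flag bits OR-ed into it.
pub fn is_datagram_socket(type_arg: u32) -> bool {
    (type_arg & SOCK_TYPE_MASK) == SOCK_DGRAM
}
```

Comparing the raw argument against `SOCK_DGRAM` without masking would let `SOCK_DGRAM | SOCK_CLOEXEC` through, which is how most libraries create UDP sockets.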

Security parity with namespace mode:

| Attack                 | Namespace mode         | Platform mode            |
|------------------------|------------------------|--------------------------|
| TCP bypass proxy       | nftables REJECT        | Landlock port 3128 only  |
| UDP exfiltration       | nftables REJECT        | seccomp SOCK_DGRAM block |
| DNS tunneling          | no UDP accept rule     | no SOCK_DGRAM            |
| Abstract Unix sockets  | netns isolation        | Landlock scope           |
| Signals to supervisor  | N/A (same netns)       | Landlock scope           |
| Container escape       | Risk (CAP_SYS_ADMIN)   | Impossible (zero caps)   |

Remaining gap: Landlock NetPort allows port 3128 on any IP (not just
loopback). Mitigate with egress NetworkPolicy denying all sandbox pod
egress -- loopback traffic is unaffected by NetworkPolicy.
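
The gap and its mitigation can be modelled in a few lines. This is a decision model only, not Landlock or Kubernetes API code; the function names are hypothetical and the port number is the one quoted above.

```rust
use std::net::SocketAddr;

const PROXY_PORT: u16 = 3128;

/// Models a port-only rule: the destination IP is never consulted, so any
/// host listening on the proxy port would be reachable.
pub fn port_rule_allows(dest: SocketAddr) -> bool {
    dest.port() == PROXY_PORT
}

/// Models the combined policy described above: the port rule plus a
/// deny-all egress NetworkPolicy, under which only loopback destinations
/// remain reachable.
pub fn effective_allows(dest: SocketAddr) -> bool {
    port_rule_allows(dest) && dest.ip().is_loopback()
}
```

For example, `10.0.0.5:3128` passes `port_rule_allows` but not `effective_allows`, while `127.0.0.1:3128` passes both.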

Proto: add NetworkEnforcementMode enum and field to SandboxPolicy
and DriverSandboxSpec. Default NAMESPACE (0) preserves existing
behavior; PLATFORM (1) activates the new mode.
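
A hand-written Rust mirror of the proto enum might look like the sketch below. The real type is presumably generated from the `.proto` file and will differ in detail; `from_wire` and `is_platform` are hypothetical helpers, shown only to illustrate the default value and a typed comparison in place of a bare integer.

```rust
/// Hand-written stand-in for the generated proto enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(i32)]
pub enum NetworkEnforcementMode {
    Namespace = 0,
    Platform = 1,
}

impl Default for NetworkEnforcementMode {
    /// An unset proto field decodes as 0, which keeps existing behavior.
    fn default() -> Self {
        NetworkEnforcementMode::Namespace
    }
}

impl NetworkEnforcementMode {
    /// Decode the wire integer; unknown values are reported, not guessed.
    pub fn from_wire(raw: i32) -> Option<Self> {
        match raw {
            0 => Some(NetworkEnforcementMode::Namespace),
            1 => Some(NetworkEnforcementMode::Platform),
            _ => None,
        }
    }

    /// Typed comparison, so call sites never test against a literal `1`.
    pub fn is_platform(self) -> bool {
        self == NetworkEnforcementMode::Platform
    }
}
```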

Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Ladas force-pushed the feat/platform-mode branch from 42158f3 to 5991fec on June 25, 2026 05:37