Agent Diagnostic
Investigation reviewing PR #1914 and the discussion in #1914 (comment).
Skills/tools used: code search across crates/openshell-policy, crates/openshell-server/src/grpc, and crates/openshell-sandbox; git history on the original composition PR (#1037).
Findings:
crates/openshell-policy/src/compose.rs:15-34 defines provider_rule_name() which prefixes provider-derived rules with _provider_. The doc comment at compose.rs:41-42 declares this key namespace reserved for derived data.
crates/openshell-policy/src/compose.rs:62-75 (unique_provider_rule_key) collision-renames provider rules to _provider_<name>_2, _3, ... when a key already exists in the source policy. This is the mechanism by which a stale baked-in _provider_* rule shadows the live one.
crates/openshell-policy/src/merge.rs:400-416 and merge.rs:642-656 already filter _provider_* keys out of the endpoint-overlap fallback, confirming the prefix is treated as reserved internally.
crates/openshell-server/src/grpc/validation.rs:619-628 (validate_policy_safety) does not reject user-authored policy with _provider_* network policy keys. There is no enforcement at the gateway boundary for either SetSandboxPolicy, CreateSandbox, or the supervisor-originated enrichment write-back path.
Net: the prefix is reserved by convention and by the composition layer, but not by any input validation. That asymmetry produces the bug.
Description
Actual behavior: A user-authored or sandbox-enriched policy can contain network_policies keys with the _provider_* prefix. When provider profile composition runs, the composer detects the collision and re-keys the provider rule as _provider_<name>_2. The original (now stale) _provider_<name> from the user policy remains in the effective policy. Non-additive provider profile updates (rename / remove endpoint) silently no-op because the live _2 rule sits behind the stale baseline rule.
This breaks the core update flow that PR #1914 introduces: openshell provider profile update succeeds, but for any sandbox whose persisted policy already contains the prior _provider_* key, the change does not take effect.
Expected behavior: The _provider_* key namespace is reserved for derived data. Any user-supplied policy or sandbox-originated write that attempts to set keys under that namespace is rejected (or filtered) at the gateway boundary, so provider composition always produces a single authoritative _provider_<name> rule per profile.
Reproduction Steps
Path 1 — policy get --full round-trip (user-driven)
openshell provider create --name test --type github --credential FOO=bar
openshell sandbox create --provider test -- true
openshell policy get <sandbox> --full > current-policy.yaml
openshell policy set <sandbox> -f current-policy.yaml
openshell provider profile update -f updated-profile.yaml # remove or rename an endpoint
openshell policy get <sandbox> --full
# Observed: _provider_test (stale, from user policy) AND _provider_test_2 (current)
Path 2 — supervisor enrichment write-back (no user action)
Reproducer from pimlock — happens automatically as part of normal sandbox startup:
cat > custom-policy.yaml <<END
version: 1
filesystem_policy:
include_workdir: true
read_only: []
read_write: []
landlock:
compatibility: best_effort
process:
run_as_user: sandbox
run_as_group: sandbox
network_policies:
sandbox_only:
endpoints:
- host: sandbox.example.com
port: 443
protocol: rest
access: read-only
enforcement: enforce
binaries: []
END
openshell provider create --name test --type github --credential FOO=bar
openshell sandbox create --policy custom-policy.yaml --provider test -- true
openshell policy get --full
# Observed: both _provider_test and _provider_test_2 present
Full script with pauses for live-state exploration: https://gist.github.com/pimlock/d10de2a4ca5611a9a9ae78fce7e0a489?permalink_comment_id=6212304#gistcomment-6212304
Root cause of path 2: the supervisor enriches the effective policy (including provider-injected rules) and syncs it back to the gateway. The gateway treats the synced policy as authoritative user policy, baking the _provider_* keys into the source. On the next provider poll, composition collision-renames the new provider rule to _provider_test_2.
Proposed Fix
- Validation: extend
validate_policy_safety (or add a sibling validator) to reject network_policies keys matching ^_provider_ on every user-facing write path — at minimum SetSandboxPolicy and CreateSandbox. Error message should explain that the prefix is reserved for provider composition and direct users to policy get (without --full) for round-trippable output.
- Supervisor enrichment write-back: strip
_provider_* keys before the supervisor pushes enriched policy back to the gateway. Provider-derived rules are gateway-authoritative; the sandbox should never be the source of truth for them. Belt-and-braces: the gateway's enrichment ingestion path should also drop any _provider_* keys it receives from a sandbox.
- No change to
policy get --full: --full continues to render the effective policy including provider contributions — that's the purpose of the flag. policy get without --full already excludes _provider_* (merge.rs:416), so the documented round-trip workflow stays intact.
- Docs: brief note in
docs/sandboxes/manage-providers.mdx about the reservation.
Beta-stage, hard reject is fine.
Environment
Related
Agent Diagnostic
Investigation reviewing PR #1914 and the discussion in #1914 (comment).
Skills/tools used: code search across
crates/openshell-policy,crates/openshell-server/src/grpc, andcrates/openshell-sandbox; git history on the original composition PR (#1037).Findings:
crates/openshell-policy/src/compose.rs:15-34definesprovider_rule_name()which prefixes provider-derived rules with_provider_. The doc comment atcompose.rs:41-42declares this key namespace reserved for derived data.crates/openshell-policy/src/compose.rs:62-75(unique_provider_rule_key) collision-renames provider rules to_provider_<name>_2,_3, ... when a key already exists in the source policy. This is the mechanism by which a stale baked-in_provider_*rule shadows the live one.crates/openshell-policy/src/merge.rs:400-416andmerge.rs:642-656already filter_provider_*keys out of the endpoint-overlap fallback, confirming the prefix is treated as reserved internally.crates/openshell-server/src/grpc/validation.rs:619-628(validate_policy_safety) does not reject user-authored policy with_provider_*network policy keys. There is no enforcement at the gateway boundary for eitherSetSandboxPolicy,CreateSandbox, or the supervisor-originated enrichment write-back path.Net: the prefix is reserved by convention and by the composition layer, but not by any input validation. That asymmetry produces the bug.
Description
Actual behavior: A user-authored or sandbox-enriched policy can contain
network_policieskeys with the_provider_*prefix. When provider profile composition runs, the composer detects the collision and re-keys the provider rule as_provider_<name>_2. The original (now stale)_provider_<name>from the user policy remains in the effective policy. Non-additive provider profile updates (rename / remove endpoint) silently no-op because the live_2rule sits behind the stale baseline rule.This breaks the core update flow that PR #1914 introduces:
openshell provider profile updatesucceeds, but for any sandbox whose persisted policy already contains the prior_provider_*key, the change does not take effect.Expected behavior: The
_provider_*key namespace is reserved for derived data. Any user-supplied policy or sandbox-originated write that attempts to set keys under that namespace is rejected (or filtered) at the gateway boundary, so provider composition always produces a single authoritative_provider_<name>rule per profile.Reproduction Steps
Path 1 —
policy get --fullround-trip (user-driven)Path 2 — supervisor enrichment write-back (no user action)
Reproducer from pimlock — happens automatically as part of normal sandbox startup:
Full script with pauses for live-state exploration: https://gist.github.com/pimlock/d10de2a4ca5611a9a9ae78fce7e0a489?permalink_comment_id=6212304#gistcomment-6212304
Root cause of path 2: the supervisor enriches the effective policy (including provider-injected rules) and syncs it back to the gateway. The gateway treats the synced policy as authoritative user policy, baking the
_provider_*keys into the source. On the next provider poll, composition collision-renames the new provider rule to_provider_test_2.Proposed Fix
validate_policy_safety(or add a sibling validator) to rejectnetwork_policieskeys matching^_provider_on every user-facing write path — at minimumSetSandboxPolicyandCreateSandbox. Error message should explain that the prefix is reserved for provider composition and direct users topolicy get(without--full) for round-trippable output._provider_*keys before the supervisor pushes enriched policy back to the gateway. Provider-derived rules are gateway-authoritative; the sandbox should never be the source of truth for them. Belt-and-braces: the gateway's enrichment ingestion path should also drop any_provider_*keys it receives from a sandbox.policy get --full:--fullcontinues to render the effective policy including provider contributions — that's the purpose of the flag.policy getwithout--fullalready excludes_provider_*(merge.rs:416), so the documented round-trip workflow stays intact.docs/sandboxes/manage-providers.mdxabout the reservation.Beta-stage, hard reject is fine.
Environment
main/ PR feat(providers): support profile updates #1914 branchopenshell-server), sandbox supervisor (openshell-sandbox), policy engine (openshell-policy).Related
_provider_*composition design.