feat(secrets): ingest env secrets at container runtime instead of fanning into ECS taskdef #5189
TheodoreSpeaks wants to merge 2 commits into
…ning into ECS taskdef

The app/socket ECS taskdefs were ~42KB, ~93% of which was the secrets[] array: 268 pointer entries each restating the full ~78-char secret ARN, marching toward the 64KB taskdef limit and growing ~150 bytes per hosted key added. The secret blob itself is only ~18KB/268 keys.

Move secret delivery to container boot: new @sim/runtime-secrets loadRuntimeSecrets() reads SIM_ENV_SECRET_ID, fetches the combined secret once, and hydrates process.env (no-clobber, no-op when unset, fail-fast). Bootstrap entrypoints for app + realtime await it before importing the real server (env-flags reads env at module load). The app bootstrap is bun-bundled in the Dockerfile builder stage since it runs outside the Next standalone bundle; realtime keeps full node_modules and runs the TS entry.

Backward-compatible: with the current fan-out taskdef the loader no-ops and the app reads the injected env vars unchanged. The matching infra change (empty secrets[] + SIM_ENV_SECRET_ID) ships separately, after this image is live.
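For readers skimming the thread, the no-clobber hydration described above can be sketched roughly as follows. This is a minimal illustration, not the code in @sim/runtime-secrets: the name hydrateEnv, the HydrateResult shape, and the choice to coerce values with String() are all assumptions made for the example.

```typescript
// Hypothetical sketch of no-clobber hydration; names and value coercion are
// assumptions for illustration, not the PR's actual implementation.
type Env = Record<string, string | undefined>;

interface HydrateResult {
  loaded: number;
  skipped: number;
}

function hydrateEnv(secretString: string, env: Env): HydrateResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(secretString);
  } catch {
    // Fail fast: a malformed blob should stop boot, not start a half-configured server.
    throw new Error("runtime secret is not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("runtime secret must be a JSON object");
  }

  let loaded = 0;
  let skipped = 0;
  for (const [key, value] of Object.entries(parsed as Record<string, unknown>)) {
    if (env[key] !== undefined) {
      // No-clobber: a value already present (e.g. explicit taskdef env) wins.
      skipped++;
      continue;
    }
    env[key] = String(value); // assumption: values are stored as strings
    loaded++;
  }
  return { loaded, skipped };
}
```

In a real entrypoint the second argument would presumably be process.env; passing it in keeps the sketch easy to test.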
@greptile review
PR Summary
High Risk
Overview
Reviewed by Cursor Bugbot for commit 77a4298.
Greptile Summary
This PR adds a new
Confidence Score: 5/5
Safe to merge: the change is backward-compatible, the new image no-ops with the current fan-out task definition, and the one-way infra flip is in a separate PR. All four issues from the prior review round have been cleanly resolved: the binary-secret guard is now outside the retry loop, each request is bounded by an AbortSignal timeout, the redundant AWS SDK pin is gone, and the binary-secret test asserts exactly one send call. The retry logic, no-clobber hydration, and Dockerfile bundling strategy are all correct. No files require special attention.
Important Files Changed
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant D as Docker CMD
participant B as bootstrap.ts/js
participant L as loadRuntimeSecrets()
participant SM as AWS Secrets Manager
participant E as process.env
participant S as Server (server.js / index.ts)
D->>B: bun bootstrap.js
B->>L: await loadRuntimeSecrets()
L->>E: read SIM_ENV_SECRET_ID
alt SIM_ENV_SECRET_ID not set
L-->>B: return (no-op)
else SIM_ENV_SECRET_ID set
loop sendWithRetry (up to 3 attempts, 5s timeout each)
L->>SM: GetSecretValueCommand(secretId) + AbortSignal.timeout(5000)
alt Network/timeout error
SM-->>L: throw Error
L->>L: backoffWithJitter + sleep (200-2000ms)
else Successful response
SM-->>L: "{ SecretString? }"
end
end
alt No SecretString (binary secret)
L-->>B: throw immediately (non-retriable)
else SecretString present
L->>L: JSON.parse(SecretString)
L->>L: validate: object, not array/null
loop for each [key, value] in entries
L->>E: "process.env[key] = value (no-clobber)"
end
L-->>B: return (loaded N, skipped M)
end
end
B->>S: "await import('./server.js' or '@/index')"
S-->>D: server running
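The retry loop in the diagram (up to 3 attempts, 5s timeout each, 200-2000ms backoff) could be sketched along these lines. The helper names sendWithRetry and backoffWithJitter come from the diagram, but their bodies here are assumptions, as are the constants' names and the injectable wait parameter; the actual package may compute the delay differently.

```typescript
// Hypothetical sketch: helper names mirror the diagram, bodies are assumptions.
const MAX_ATTEMPTS = 3;
const REQUEST_TIMEOUT_MS = 5_000;
const BASE_DELAY_MS = 200;
const MAX_DELAY_MS = 2_000;

// Exponential ceiling (200, 400, 800, ...) capped at 2000ms, with the actual
// delay drawn at random between the base and that ceiling.
function backoffWithJitter(attempt: number): number {
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
  return BASE_DELAY_MS + Math.random() * (ceiling - BASE_DELAY_MS);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function sendWithRetry<T>(
  send: (signal: AbortSignal) => Promise<T>,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    try {
      // Each attempt gets its own timeout so a stalled response cannot hang boot.
      return await send(AbortSignal.timeout(REQUEST_TIMEOUT_MS));
    } catch (error) {
      lastError = error;
      // No point sleeping after the final attempt; fall through and rethrow.
      if (attempt < MAX_ATTEMPTS - 1) {
        await wait(backoffWithJitter(attempt));
      }
    }
  }
  throw lastError;
}
```

The send callback is where a Secrets Manager call would go, forwarding the signal as the request's abort signal; it is left abstract here so the sketch has no SDK dependency.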
Reviews (3): Last reviewed commit: "fix(runtime-secrets): address review fee..."
Greptile Summary
This PR introduces a new
Confidence Score: 4/5
The change is backward-compatible with the current taskdef: when SIM_ENV_SECRET_ID is absent the loader is a no-op, so rolling the image before the infra flip is safe. The main caution is that a binary-secret misconfiguration burns unnecessary retry delays before failing, and there is no per-request timeout on the Secrets Manager client. The bootstrap ordering, no-clobber hydration, and bundling strategy are all sound. The findings are limited to the retry loop catching its own internal guard error (causing avoidable backoff on certain misconfigurations) and a missing SDK request timeout that could slow crash-detection in network-degraded environments. Neither causes incorrect behavior in the normal path.
packages/runtime-secrets/src/index.ts: the fetchSecretString retry loop and SecretsManagerClient instantiation.
Important Files Changed
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant Docker as Container Runtime
participant Bootstrap as bootstrap.js / bootstrap.ts
participant RSP as loadRuntimeSecrets()
participant SM as AWS Secrets Manager
participant Server as server.js / index.ts
Docker->>Bootstrap: "CMD bun bootstrap.*"
Bootstrap->>RSP: await loadRuntimeSecrets()
RSP->>RSP: read SIM_ENV_SECRET_ID from process.env
alt SIM_ENV_SECRET_ID not set (local/self-hosted)
RSP-->>Bootstrap: no-op return
else SIM_ENV_SECRET_ID is set (ECS)
RSP->>SM: "GetSecretValue({ SecretId })"
SM-->>RSP: SecretString (JSON blob)
RSP->>RSP: JSON.parse → Object.entries
RSP->>RSP: hydrate process.env (no-clobber)
RSP-->>Bootstrap: done (loaded N, skipped M)
end
Bootstrap->>Server: await import(standaloneServer / index)
Server-->>Docker: listening on PORT
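The ordering constraint in this diagram (hydrate first, then import the server, because env is read at module load) can be illustrated with a small sketch. Everything here is hypothetical: bootstrap takes its collaborators as parameters purely for illustration, and the real entrypoints presumably call loadRuntimeSecrets() and a dynamic import() directly.

```typescript
// Hypothetical sketch of the boot ordering; names and signatures are assumptions.
type EnvMap = Record<string, string | undefined>;

async function bootstrap<M>(
  env: EnvMap,
  loadSecrets: (env: EnvMap) => Promise<void>,
  importServer: () => Promise<M>,
): Promise<M> {
  // Must settle before the server module is evaluated: anything the server
  // reads from env at module load is captured once and never re-read.
  // If this rejects, the import below never runs (fail-fast).
  await loadSecrets(env);

  // In a real entrypoint this would be a dynamic import, which defers module
  // evaluation until this point. A static import would be evaluated before
  // any of this function's code runs.
  return importServer();
}
```

The same shape explains why a static import of the server at the top of the bootstrap file would defeat the loader: the module graph would be evaluated before hydration.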
Reviews (2): Last reviewed commit: "feat(secrets): ingest env secrets at con..."
- Move the binary-secret guard outside the retry loop (sendWithRetry) so a missing SecretString throws immediately instead of burning 3 attempts + backoff.
- Bound each Secrets Manager request with AbortSignal.timeout(5s) so a stalled response can't hang boot indefinitely.
- Drop the redundant @aws-sdk/client-secrets-manager pin from apps/realtime; it resolves transitively via @sim/runtime-secrets.
- Add a test for the non-retriable binary-secret path.
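To make the first fix concrete, here is a rough sketch of keeping the binary-secret guard outside the retried region. The names fetchSecretString and SecretString appear in the review; retryTransient, the SecretResponse interface, and the error message are assumptions for illustration, and backoff is omitted to keep the example short.

```typescript
// Hypothetical sketch: only the network call is retried, the guard is not.
interface SecretResponse {
  SecretString?: string;
}

// Minimal retry helper (assumed; no backoff here for brevity).
async function retryTransient<T>(fn: () => Promise<T>, attempts: number): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

async function fetchSecretString(send: () => Promise<SecretResponse>): Promise<string> {
  // Transient failures (network errors, timeouts) are retried here.
  const response = await retryTransient(send, 3);

  // The guard runs after the loop has returned, so a misconfigured binary
  // secret fails on the first successful response. Inside the loop, this
  // throw would be caught and retried like a network error.
  if (typeof response.SecretString !== "string") {
    throw new Error("secret has no SecretString (binary secrets are not supported)");
  }
  return response.SecretString;
}
```

A test for this path can count calls to send and assert that exactly one was made before the error surfaced.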
Addressed all four in 77a4298:
@greptile review
Summary
- The app/socket ECS taskdefs were ~42KB, ~93% of which was the secrets[] array: 268 pointer entries, each restating the full ~78-char secret ARN. That was marching toward the 64KB taskdef limit and growing ~150 bytes per hosted key added. (Confirmed live: the secret blob itself is only 18.3KB / 268 keys; the taskdef is bigger than the data it points to, purely from ARN repetition.)
- New @sim/runtime-secrets package: loadRuntimeSecrets() reads SIM_ENV_SECRET_ID, fetches the combined /{env}/sim/env-vars secret once via the task role, JSON.parses it, and hydrates process.env: no-clobber (explicit taskdef env wins), no-op when unset (local/self-hosted unchanged), fail-fast otherwise, with one bounded retry.
- apps/sim/bootstrap.ts + apps/realtime/src/bootstrap.ts await loadRuntimeSecrets(), then dynamic-import() the real server. Ordering matters because env-flags.ts reads env at module load.
- The app bootstrap is bun build-bundled in the Dockerfile builder stage (it runs outside the Next standalone bundle, so its deps can't resolve from the pruned node_modules); realtime keeps full node_modules and runs the TS entry directly.

Deploy ordering (important)
secrets[] and adds the SIM_ENV_SECRET_ID plaintext env).

Type of Change

Testing
- @sim/runtime-secrets unit tests (6) passing: hydrate, no-clobber, no-op-when-unset, invalid-JSON, non-object, retry-then-throw
- @sim/runtime-secrets, apps/sim, apps/realtime (and the infra repo for the companion change)
- bun run lint, check:api-validation:strict, check:boundaries, check:realtime-prune all clean
- bun build smoke test of the app bootstrap (0.95MB, AWS SDK inlined, dynamic server import preserved) confirmed the boot ordering (hydrate → import server)
- docker build + run against the real secret; cdk diff to confirm secrets: []

Checklist