refactor(self-heal): comp does zero classification - every dynamic failure is pending #3306
…ure is pending
New model: comp is pure plumbing. A dynamic check that doesn't succeed — a real
finding, a customer/transport error, or a thrown execution error — is held as
'inconclusive' ("pending", hidden from the customer) and handed to the self-heal
agent, which is the ONLY thing that decides our-bug (fix) vs real fail (show).
- decideRunStatus: dynamic + non-success → 'inconclusive'; no error-code logic.
- splitFailuresByDisposition: returns all-held (nothing decided on the comp side).
- The error-code classifier is now unreferenced (to be deleted in cleanup).
WIP: reveal-real-fail endpoint + the agent decision rewrite + spec updates still
to come. Not deployed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
…er-side / finding)
The agent calls /reveal when it verdicts a held check as a GENUINE fail (the
customer's creds/config are wrong, or a real compliance finding). Unlike /rerun
(which applies the dynamic hold rule and may re-hold as 'inconclusive'), /reveal
persists the TRUE status: success if it now passes, 'failed' with the real
findings shown (failedCount > 0) otherwise. The customer then sees the red
instead of a silent "pending". Mirrors rerunAndPersistCheck; never holds, never
disables.
tsc clean. Not deployed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
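The difference between /rerun and /reveal described in this commit can be sketched as a small pure function. This is an illustrative sketch only: `statusToPersist`, `Mode`, and the boolean parameters are hypothetical names, not the actual code in internal-integration-debug.service.ts.

```typescript
// Hypothetical types for illustration; the real service may model this differently.
type PersistedStatus = 'success' | 'failed' | 'inconclusive';
type Mode = 'rerun' | 'reveal';

// Decide which status gets persisted after re-executing a check.
// Assumption: `passed` means the check succeeded on this execution,
// and `isDynamic` marks a dynamic check.
function statusToPersist(mode: Mode, passed: boolean, isDynamic: boolean): PersistedStatus {
  if (passed) return 'success';
  // /reveal persists the true status and never holds.
  if (mode === 'reveal') return 'failed';
  // /rerun applies the dynamic hold rule and may re-hold as 'inconclusive'.
  return isDynamic ? 'inconclusive' : 'failed';
}
```

Under these assumptions, the same failing dynamic check is re-held by a rerun but shown as failed by a reveal.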
…decides everything
Per the design: comp must do ZERO judging of customer-vs-us. Removed every bit of
pre-classification so the only place that decides is the self-heal agent.
- Deleted check-failure-classifier.ts (the error-code customer/our_side/finding
judging) + its spec.
- Removed failureSignalsFromEvidence (HTTP-status/error-text extraction),
splitFailuresByDisposition, ClassifiableFailure, FailureDisposition from
task-check-evaluation.ts. decideRunStatus is now just:
success → success; dynamic non-success → 'inconclusive' (pending); else failed.
- Run paths (scheduled + manual) and rerun/reveal now record findings by identity
only ({connectionId, checkId, resourceId}); a dynamic failure is always held
pending for the agent. No signals, no patterns, no guessing.
- Rewrote the run-status spec for the new rule; deleted the split/signals specs.
my changed files compile; task-check-evaluation tests pass (19). Pre-existing
worktree spec failures (sync-gws / variables / credential-vault) are unrelated.
Not deployed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
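For reference, the three-way rule stated for decideRunStatus in this commit can be written out as a minimal sketch. The `RunOutcome` shape and its field names are assumptions made for illustration; the real signature in task-check-evaluation.ts may differ.

```typescript
// Hypothetical input shape; only the three-way rule itself comes from the commit message.
type RunStatus = 'success' | 'failed' | 'inconclusive';

interface RunOutcome {
  succeeded: boolean; // the check ran and passed
  isDynamic: boolean; // the check is a dynamic check
}

// success → success; dynamic non-success → 'inconclusive' (pending); else failed.
// No error codes, signals, or patterns are consulted.
function decideRunStatus(outcome: RunOutcome): RunStatus {
  if (outcome.succeeded) return 'success';
  if (outcome.isDynamic) return 'inconclusive';
  return 'failed';
}
```

The point of the sketch is that nothing about the failure itself (finding, transport error, thrown error) enters the decision; only success and the dynamic flag do.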
2 issues found across 10 files
Confidence score: 2/5
- In `apps/api/src/integration-platform/services/internal-integration-debug.service.ts`, persisting a reveal run without verifying `taskId` belongs to the same organization (and expected check/task) can let a bad internal call write into another task's history/status, creating cross-tenant data integrity risk. Add strict ownership/association validation before saving any reveal run.
- In `apps/api/src/integration-platform/utils/task-check-evaluation.ts`, dynamic non-success runs with no findings being treated as `inconclusive` without blocking completion can allow tasks to move to `done` while error-only runs are still effectively unresolved. Count these inconclusive error-only runs in the held/pending logic (or include pending run count) before merging.
…eld error-runs keep task pending
- P1: reveal/rerun now assert the taskId belongs to the SAME org as the
connection before persisting a run (shared assertTaskBelongsToOrg helper), so a
wrong/forged internal call can't contaminate another tenant's task history.
- P2: count HELD runs (not held findings) toward heldCount. An error-only
dynamic run (inconclusive, no findings) now keeps the task pending instead of
letting it slip to 'done' while unresolved. Both run paths (scheduled + manual).
my files: tsc clean; task-check-evaluation tests pass (19).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
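Both fixes in this commit can be illustrated with a short sketch. Everything here is hypothetical: the `Run`, `TaskRef`, and `ConnectionRef` shapes, the `organizationId` field, and the signature given to `assertTaskBelongsToOrg` are assumptions for illustration, and the real helper presumably loads the task from the database rather than receiving it as an argument.

```typescript
// Hypothetical record shapes for illustration only.
interface Run {
  status: 'success' | 'failed' | 'inconclusive';
  findings: unknown[];
}
interface TaskRef { id: string; organizationId: string }
interface ConnectionRef { id: string; organizationId: string }

// P1 (sketch): refuse to persist a run against a task from another org.
function assertTaskBelongsToOrg(task: TaskRef | undefined, connection: ConnectionRef): void {
  if (!task || task.organizationId !== connection.organizationId) {
    throw new Error('task does not belong to the organization of this connection');
  }
}

// P2 (sketch): count held RUNS, so an error-only run with no findings still counts.
function countHeldRuns(runs: Run[]): number {
  return runs.filter((r) => r.status === 'inconclusive').length;
}

// The approach being replaced, for contrast: counting held FINDINGS
// yields 0 for an error-only run, which let the task slip to 'done'.
function countHeldFindings(runs: Run[]): number {
  return runs
    .filter((r) => r.status === 'inconclusive')
    .reduce((n, r) => n + r.findings.length, 0);
}
```

With a single inconclusive run that has an empty findings list, counting runs gives a non-zero held count while counting findings gives zero, which is the gap the P2 fix closes.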
@cubic-dev-ai review it
@tofikwest I have started the AI code review. It will take a few minutes to complete.
1 issue found across 10 files
Confidence score: 3/5
- In `apps/api/src/integration-platform/services/internal-integration-debug.service.ts`, the reveal path persists the real run but does not update the linked task status, so failed runs can appear in history while the task remains green/pending and operators get a false success signal. This creates a concrete workflow integrity risk. Ensure reveal also resolves/synchronizes the associated task status before merging.
A revealed genuine fail persisted a 'failed' run but left the task
green/pending, which is a false success signal. Now a reveal that resolves to
'failed' sets the linked task to 'failed' too (mirrors the run paths). A reveal
that PASSES does not force 'done' (the task spans other checks; recomputed on
the next scheduled run).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
1 issue found across 1 file (changes from recent commits).
Previously the reveal flipped any non-failed task to failed, which would
resurrect a human-set not_relevant (dismissed) or in_review task. Restrict it to
active workflow statuses (todo / in_progress / done) so a reveal never overrides
a dismissed or under-review task.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
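The resulting task-status rule for a reveal (this commit combined with the previous one) might look like the sketch below. The `TaskStatus` union is assembled from the statuses named in this conversation, and `taskStatusAfterReveal` is a hypothetical name; neither is confirmed as the actual code.

```typescript
// Statuses named in this PR conversation; the real enum may contain others.
type TaskStatus = 'todo' | 'in_progress' | 'done' | 'failed' | 'not_relevant' | 'in_review';

// Active workflow statuses are the only ones a failing reveal may overwrite.
function isActiveWorkflowStatus(status: TaskStatus): boolean {
  return status === 'todo' || status === 'in_progress' || status === 'done';
}

// Hypothetical helper: the task status to store after a reveal.
function taskStatusAfterReveal(current: TaskStatus, revealed: 'success' | 'failed'): TaskStatus {
  // A failing reveal flips an active task to 'failed'.
  if (revealed === 'failed' && isActiveWorkflowStatus(current)) return 'failed';
  // A passing reveal never forces 'done', and a dismissed (not_relevant)
  // or under-review (in_review) task is never overridden.
  return current;
}
```

Returning the current status unchanged on a passing reveal reflects the stated reasoning that the task spans other checks and is recomputed on the next scheduled run.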
@cubic-dev-ai review it
@tofikwest I have started the AI code review. It will take a few minutes to complete.
🎉 This PR is included in version 3.94.0 🎉
The release is available on GitHub release.
Your semantic-release bot 📦🚀
The model
comp becomes pure plumbing: it never judges customer-vs-us. A dynamic check that doesn't succeed (a finding, a customer/transport error, or a thrown error) is held as `inconclusive` ("pending", hidden from the customer) and handed to the self-heal agent, the only decider (see comp-private PR).

Changes
- Deleted the error-code classifier (`check-failure-classifier.ts`) + its spec: the customer/our_side/finding guessing is gone.
- Removed `failureSignalsFromEvidence` (HTTP-status/error-text extraction), `splitFailuresByDisposition`, `ClassifiableFailure`, `FailureDisposition`. No signals, no patterns, no guessing anywhere.
- `decideRunStatus` is now just: success → success; dynamic non-success → `inconclusive`; else `failed`.
- Run paths (scheduled + manual) and rerun/reveal record findings by identity only (`{connectionId, checkId, resourceId}`).
- `/reveal` internal endpoint: persists the real success/failed (never held) so the agent can surface a genuine fail to the customer.

Verification
My changed files compile; `task-check-evaluation` tests pass (19). Pre-existing worktree typecheck failures (`sync-gws` / `variables` / `credential-vault`) come from an earlier main-merge + an unbuilt generated prisma client, not from this PR (CI builds the client).

Pairs with
comp-private PR: the agent that does all the deciding (fix / show / disable+ticket).
🤖 Generated with Claude Code
Summary by cubic
Make dynamic checks classification-free. Any non-success is stored as `inconclusive` (pending, hidden) and sent to the self-heal agent. Added `POST /connections/:connectionId/reveal` so the agent can persist the real result and update the task on genuine failures; static/AWS/GCP/Azure behavior is unchanged.

Refactors
- `decideRunStatus`: success → success; dynamic non-success → `inconclusive`; else `failed`.
- `decideTaskStatus` blocks "done" when any check is held, including error-only runs.
- Reveal/rerun assert that `taskId` belongs to the connection's org.

New Features
- `POST /connections/:connectionId/reveal` to persist the true outcome (never held). A failing reveal sets the linked task to `failed`; a passing reveal does not force `done`.
- The task update is restricted to active workflow statuses (`todo`, `in_progress`, `done`); it never overrides `not_relevant` or `in_review`.

Written for commit cf89c5c. Summary will update on new commits.