refactor(self-heal): comp does zero classification — every dynamic failure is pending by tofikwest · Pull Request #3306 · trycompai/comp

tofikwest · 2026-06-30T18:57:43Z

The model

comp becomes pure plumbing: it never judges customer-vs-us. A dynamic check that doesn't succeed — a finding, a customer/transport error, or a thrown error — is held as inconclusive ("pending", hidden from the customer) and handed to the self-heal agent, the only decider (see comp-private PR).

Changes

Deleted the error-code classifier (check-failure-classifier.ts) + its spec — the customer/our_side/finding guessing is gone.
Removed failureSignalsFromEvidence (HTTP-status/error-text extraction), splitFailuresByDisposition, ClassifiableFailure, FailureDisposition. No signals, no patterns, no guessing anywhere.
decideRunStatus is now just: success → success; dynamic non-success → inconclusive; else failed.
Both run paths (scheduled + manual) record findings by identity only ({connectionId, checkId, resourceId}).
New /reveal internal endpoint — persists the real success/failed (never held) so the agent can surface a genuine fail to the customer.
Rewrote the run-status spec for the new rule; deleted the split/signals specs.
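The simplified rule can be sketched as below. This is a minimal illustration, not the code from task-check-evaluation.ts: only the name decideRunStatus, the three status values, and the identity fields come from this PR. The `RunOutcome` shape and the `toFindingIdentity` helper are hypothetical.

```typescript
// Status values named in the PR description.
type RunStatus = "success" | "inconclusive" | "failed";

// Hypothetical input shape; the real function's parameters may differ.
interface RunOutcome {
  succeeded: boolean; // the check ran and passed
  isDynamic: boolean; // dynamic check, as opposed to static/AWS/GCP/Azure
}

// success -> success; dynamic non-success -> inconclusive; else failed.
// Note there is no error-code or HTTP-status inspection anywhere.
function decideRunStatus(outcome: RunOutcome): RunStatus {
  if (outcome.succeeded) return "success";
  if (outcome.isDynamic) return "inconclusive";
  return "failed";
}

// Findings are recorded by identity only, with no disposition attached.
interface FindingIdentity {
  connectionId: string;
  checkId: string;
  resourceId: string;
}

// Hypothetical helper that builds the identity record.
function toFindingIdentity(
  connectionId: string,
  checkId: string,
  resourceId: string,
): FindingIdentity {
  return { connectionId, checkId, resourceId };
}
```

Under these assumptions a thrown error, a transport error, and a real finding all map to the same 'inconclusive' status for a dynamic check, which is the point of the change.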

Verification

My changed files compile; task-check-evaluation tests pass (19). Pre-existing worktree typecheck failures (sync-gws / variables / credential-vault) come from an earlier main-merge + an unbuilt generated prisma client — not from this PR (CI builds the client).

Pairs with

comp-private PR: the agent that does all the deciding (fix / show / disable+ticket).

🤖 Generated with Claude Code

Summary by cubic

Make dynamic checks classification-free. Any non-success is stored as inconclusive (pending, hidden) and sent to the self-heal agent. Added POST /connections/:connectionId/reveal so the agent can persist the real result and update the task on genuine failures; static/AWS/GCP/Azure behavior is unchanged.

Refactors
- Removed comp-side failure classification, signal extraction, and related specs.
- Simplified decideRunStatus: success → success; dynamic non-success → inconclusive; else failed. decideTaskStatus blocks "done" when any check is held, including error-only runs.
- Scheduled/manual runs store failing findings by identity only.
- Secured rerun/reveal by asserting taskId belongs to the connection’s org.
New Features
- Added internal POST /connections/:connectionId/reveal to persist the true outcome (never held). A failing reveal sets the linked task to failed; a passing reveal does not force done.
- Reveal task sync only updates active statuses (todo, in_progress, done); never overrides not_relevant or in_review.
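The reveal task-sync rules above can be read as a small pure function. This is a sketch under stated assumptions: the status names are taken from the summary, while `taskStatusAfterReveal` and `ACTIVE_STATUSES` are hypothetical names and the real service may structure this differently.

```typescript
// Task statuses mentioned in the summary.
type TaskStatus =
  | "todo"
  | "in_progress"
  | "done"
  | "failed"
  | "not_relevant"
  | "in_review";

// A reveal persists the true outcome and never holds, so only two values.
type RevealedStatus = "success" | "failed";

// Only these statuses may be overwritten by a failing reveal.
const ACTIVE_STATUSES: ReadonlySet<TaskStatus> = new Set<TaskStatus>([
  "todo",
  "in_progress",
  "done",
]);

function taskStatusAfterReveal(
  current: TaskStatus,
  revealed: RevealedStatus,
): TaskStatus {
  // A passing reveal does not force "done": the task may span other checks.
  if (revealed === "success") return current;
  // A failing reveal flips active tasks only; not_relevant and in_review
  // (and an already failed task) are left untouched.
  return ACTIVE_STATUSES.has(current) ? "failed" : current;
}
```

Keeping the decision in a pure function like this makes the "never override a dismissed or under-review task" rule easy to cover with table-style tests.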

Written for commit cf89c5c. Summary will update on new commits.

…ure is pending

New model: comp is pure plumbing. A dynamic check that doesn't succeed (a real finding, a customer/transport error, or a thrown execution error) is held as 'inconclusive' ("pending", hidden from the customer) and handed to the self-heal agent, which is the ONLY thing that decides our-bug (fix) vs real fail (show).

- decideRunStatus: dynamic + non-success → 'inconclusive'; no error-code logic.
- splitFailuresByDisposition: returns all-held (nothing decided on the comp side).
- The error-code classifier is now unreferenced (to be deleted in cleanup).

WIP: reveal-real-fail endpoint + the agent decision rewrite + spec updates still to come. Not deployed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR

…er-side / finding)

The agent calls /reveal when it judges a held check to be a GENUINE fail (the customer's creds/config are wrong, or a real compliance finding). Unlike /rerun (which applies the dynamic hold rule and may re-hold as 'inconclusive'), /reveal persists the TRUE status: success if the check now passes, otherwise 'failed' with the real findings shown (failedCount > 0). The customer then sees the red instead of a silent "pending". Mirrors rerunAndPersistCheck; never holds, never disables.

tsc clean. Not deployed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR

…decides everything

Per the design: comp must do ZERO judging of customer-vs-us. Removed every bit of pre-classification so the only place that decides is the self-heal agent.

- Deleted check-failure-classifier.ts (the error-code customer/our_side/finding judging) + its spec.
- Removed failureSignalsFromEvidence (HTTP-status/error-text extraction), splitFailuresByDisposition, ClassifiableFailure, FailureDisposition from task-check-evaluation.ts. decideRunStatus is now just: success → success; dynamic non-success → 'inconclusive' (pending); else failed.
- Run paths (scheduled + manual) and rerun/reveal now record findings by identity only ({connectionId, checkId, resourceId}); a dynamic failure is always held pending for the agent. No signals, no patterns, no guessing.
- Rewrote the run-status spec for the new rule; deleted the split/signals specs.

My changed files compile; task-check-evaluation tests pass (19). Pre-existing worktree spec failures (sync-gws / variables / credential-vault) are unrelated. Not deployed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR

vercel · 2026-06-30T18:57:47Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
app	Ready	Preview, Comment	Jun 30, 2026 8:00pm
comp-framework-editor	Ready	Preview, Comment	Jun 30, 2026 8:00pm

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
portal	Skipped		Jun 30, 2026 8:00pm

cubic-dev-ai

2 issues found across 10 files

Confidence score: 2/5

In apps/api/src/integration-platform/services/internal-integration-debug.service.ts, persisting a reveal run without verifying taskId belongs to the same organization (and expected check/task) can let a bad internal call write into another task’s history/status, creating cross-tenant data integrity risk—add strict ownership/association validation before saving any reveal run.
In apps/api/src/integration-platform/utils/task-check-evaluation.ts, dynamic non-success runs with no findings being treated as inconclusive without blocking completion can allow tasks to move to done while error-only runs are still effectively unresolved—count these inconclusive error-only runs in the held/pending logic (or include pending run count) before merging.
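The ownership validation requested in the first issue might look roughly like the sketch below. The helper name assertTaskBelongsToOrg is borrowed from the follow-up commit in this PR; its signature, the `Connection` and `Task` shapes, and the `organizationId` field are assumptions made for illustration, with plain objects standing in for database rows.

```typescript
// Hypothetical record shapes; the real models have more fields.
interface Connection {
  id: string;
  organizationId: string;
}

interface Task {
  id: string;
  organizationId: string;
}

// Throws unless the task exists and lives in the same organization as the
// connection. Call this before persisting any rerun or reveal result.
function assertTaskBelongsToOrg(
  task: Task | undefined,
  connection: Connection,
): void {
  if (!task || task.organizationId !== connection.organizationId) {
    throw new Error(
      `Task does not belong to the organization of connection ${connection.id}`,
    );
  }
}
```

Treating a missing task the same as a foreign one avoids leaking whether a task ID exists in another tenant.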

…eld error-runs keep task pending

- P1: reveal/rerun now assert the taskId belongs to the SAME org as the connection before persisting a run (shared assertTaskBelongsToOrg helper), so a wrong/forged internal call can't contaminate another tenant's task history.
- P2: count HELD runs (not held findings) toward heldCount. An error-only dynamic run (inconclusive, no findings) now keeps the task pending instead of letting it slip to 'done' while unresolved. Applies to both run paths (scheduled + manual).

My files: tsc clean; task-check-evaluation tests pass (19).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
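The P2 fix is easiest to see by comparing the two ways of counting. In this sketch the `CheckRun` shape and all three function names are hypothetical; only the idea of counting held runs instead of held findings comes from the commit message.

```typescript
type RunStatus = "success" | "inconclusive" | "failed";

// Hypothetical per-run summary.
interface CheckRun {
  status: RunStatus;
  findingCount: number; // 0 for an error-only run
}

// Old-style count: sums findings on held runs, so an error-only
// inconclusive run contributes nothing.
function countHeldFindings(runs: CheckRun[]): number {
  return runs
    .filter((run) => run.status === "inconclusive")
    .reduce((sum, run) => sum + run.findingCount, 0);
}

// New-style count: every inconclusive run is held, findings or not.
function countHeldRuns(runs: CheckRun[]): number {
  return runs.filter((run) => run.status === "inconclusive").length;
}

// Assumed completion rule: a task may be "done" only when nothing is held
// and nothing has failed.
function canMarkTaskDone(runs: CheckRun[]): boolean {
  const anyFailed = runs.some((run) => run.status === "failed");
  return countHeldRuns(runs) === 0 && !anyFailed;
}
```

For a task with one passing run and one error-only inconclusive run, `countHeldFindings` returns 0 while `countHeldRuns` returns 1, so only the run-based count keeps the task pending.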

tofikwest · 2026-06-30T19:16:10Z

@cubic-dev-ai review it

cubic-dev-ai · 2026-06-30T19:16:23Z

@tofikwest I have started the AI code review. It will take a few minutes to complete.

cubic-dev-ai

1 issue found across 10 files

Confidence score: 3/5

In apps/api/src/integration-platform/services/internal-integration-debug.service.ts, the reveal path persists the real run but does not update the linked task status, so failed runs can appear in history while the task remains green/pending and operators get a false success signal. This creates a concrete workflow integrity risk—ensure reveal also resolves/synchronizes the associated task status before merging.

A revealed genuine fail persisted a 'failed' run but left the task green/pending, a false success signal. Now a reveal that resolves to 'failed' sets the linked task to 'failed' too (mirrors the run paths). A reveal that PASSES does not force 'done' (the task spans other checks; recomputed on the next scheduled run).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

Previously the reveal flipped any non-failed task to failed, which would resurrect a human-set not_relevant (dismissed) or in_review task. Restrict it to active workflow statuses (todo / in_progress / done) so a reveal never overrides a dismissed or under-review task.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR

tofikwest · 2026-06-30T19:56:58Z

@cubic-dev-ai review it

cubic-dev-ai · 2026-06-30T19:57:06Z

@tofikwest I have started the AI code review. It will take a few minutes to complete.

cubic-dev-ai

No issues found across 10 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.

claudfuen · 2026-06-30T20:56:21Z

🎉 This PR is included in version 3.94.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

tofikwest and others added 3 commits June 30, 2026 14:18

cubic-dev-ai Bot reviewed Jun 30, 2026

Comment thread apps/api/src/integration-platform/services/internal-integration-debug.service.ts

Comment thread apps/api/src/integration-platform/utils/task-check-evaluation.ts

Merge branch 'main' into tofik/dynamic-check-versioning

b43f545

cubic-dev-ai Bot reviewed Jun 30, 2026

Comment thread apps/api/src/integration-platform/services/internal-integration-debug.service.ts

cubic-dev-ai Bot reviewed Jun 30, 2026

Comment thread apps/api/src/integration-platform/services/internal-integration-debug.service.ts Outdated

Merge branch 'main' into tofik/dynamic-check-versioning

cf89c5c

cubic-dev-ai Bot reviewed Jun 30, 2026

tofikwest merged commit 33d6ff0 into main Jun 30, 2026
11 checks passed

tofikwest deleted the tofik/dynamic-check-versioning branch June 30, 2026 20:04

claudfuen added the released label Jun 30, 2026
