v0.11.0: hook hardening, commit guard, consistency guards, comment lens, routing eval #5

Merged
KodingDev merged 15 commits into master from feat/v0.11.0 on Jun 25, 2026

@KodingDev

Owner

v0.11.0

Hooks

  • The hooks.json resolver exits cleanly when neither CLAUDE_PLUGIN_ROOT nor PLUGIN_ROOT is set, instead of spawning undefined/hooks/*.mjs and crashing the session. touch() and clear() wrap their filesystem calls so an I/O error degrades to a no-op. The failure-signal match collapses internal whitespace, so an accidental double space ("still  broken") still reroutes to debug.
  • A new Claude-only PreToolUse guard (hooks/pre-tool-use.mjs, matcher Bash) denies git commits carrying AI attribution (Co-Authored-By: Claude, "Generated with Claude", claude.ai/code, or a Claude-Session trailer) and denies staging the gitignored .meridian/ artifacts. Everything else defers to normal permissions.
  • A regression test pins post-compaction orientation re-injection: SessionStart fires with source compact, and the empty matcher must keep catching it.
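
The unset-root guard from the first bullet can be sketched as a pure function over the environment. This is a minimal illustration, not the plugin's actual resolver: `resolvePluginRoot` and `hookScriptPath` are hypothetical names, and giving CLAUDE_PLUGIN_ROOT precedence over PLUGIN_ROOT is an assumption. Only the two variable names and the hooks/*.mjs layout come from the description above.

```javascript
// Hypothetical sketch of the unset-root guard; function names are illustrative.
// Returns the plugin root, or null when neither variable holds a usable value.
// Assumption: CLAUDE_PLUGIN_ROOT takes precedence over PLUGIN_ROOT.
function resolvePluginRoot(env) {
  const root = env.CLAUDE_PLUGIN_ROOT || env.PLUGIN_ROOT;
  return typeof root === "string" && root.trim() !== "" ? root : null;
}

// Builds the hook script path, or returns null so the caller can exit
// cleanly instead of spawning "undefined/hooks/<name>.mjs".
function hookScriptPath(env, name) {
  const root = resolvePluginRoot(env);
  return root === null ? null : `${root}/hooks/${name}.mjs`;
}
```

A caller would treat a null result as "nothing to run" and exit with status 0 rather than attempting the spawn.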

Review & consistency

  • The Craft & Simplicity review lens judges comments by value — flagging chain-of-thought narrated as comments, self-evident restatement, and oversized blocks — rather than by count.
  • A consistency test suite: the three per-host manifest versions must agree (now aligned at 0.11.0), every meridian:<skill>/meridian:<agent> reference must resolve to something that exists, and each skill's frontmatter name must match its directory. The test runner discovers test/*.test.mjs by glob.

Docs

  • The README lists the sketch workflow and the composing meridian/triangulate/auto skills. A CHANGELOG starts at 0.11.0.

Skill-routing eval (dev tooling)

  • eval/ adds a promptfoo-based routing eval (the anthropic:claude-agent-sdk provider with skill-used assertions) that checks prompts route to the correct skill against the real plugin on Sonnet. Run on demand with pnpm eval; an optional workflow_dispatch job runs it in CI. promptfoo and the agent SDK are devDependencies — the plugin still ships no runtime dependencies. Not part of the offline CI gates.

KodingDev added 15 commits on June 25, 2026 at 02:09

The meridian routing skill ended with leftover </content></invoke> tags that
were injected into context whenever the skill loaded.
- Resolver exits cleanly when neither CLAUDE_PLUGIN_ROOT nor PLUGIN_ROOT is
  set, instead of spawning 'undefined/hooks/*.mjs' and crashing the session.
- touch() and clear() wrap their fs calls so a permission/disk error degrades
  to a no-op rather than crashing the hook.
- Failure-signal matching collapses internal whitespace, so an accidental
  double-space ('still  broken') still reroutes to debug.
- Tests cover the unset-env exit, the WSL /mnt fallback at runtime, and the
  double-spaced signal.
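
The whitespace-collapsing match can be sketched as below. The signal list, the function names, and the lowercasing step are assumptions for illustration; the PR only states that internal whitespace is collapsed before matching.

```javascript
// Hypothetical signal list; the plugin's real list is not shown in this PR.
const FAILURE_SIGNALS = ["still broken"];

// Lowercase (an assumption), collapse every run of whitespace to a single
// space, and trim the ends.
function normalize(text) {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// True when the normalized prompt contains any known failure signal.
function hasFailureSignal(prompt) {
  const normalized = normalize(prompt);
  return FAILURE_SIGNALS.some((signal) => normalized.includes(signal));
}
```

Normalizing the prompt once, rather than loosening each signal into a regex, keeps the signal list readable as plain phrases.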
SessionStart fires with source "compact" after auto/manual compaction and is
the only compaction-time event that can inject context (PostCompact output is
ignored). The empty matcher already routes every source through session-start,
so orientation is restored when compaction drops it. Pin that behavior: a test
that the hook re-emits on a compact-source payload, and a guard that the
SessionStart matcher stays empty.
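
A guard of this kind might look like the sketch below. The config shape is assumed from the Claude Code hooks format (a `hooks` object keyed by event name, each entry carrying a `matcher` and a list of commands); the command string and the function name are hypothetical.

```javascript
// Assumed shape of hooks.json; the command string is a placeholder.
const hooksConfig = {
  hooks: {
    SessionStart: [
      { matcher: "", hooks: [{ type: "command", command: "node session-start.mjs" }] },
    ],
  },
};

// True when at least one SessionStart entry exists and every entry has an
// empty (or absent) matcher, so no source, including "compact", is filtered out.
function sessionStartMatchesEverySource(config) {
  const entries = config?.hooks?.SessionStart ?? [];
  return entries.length > 0 && entries.every((entry) => (entry.matcher ?? "") === "");
}
```

If someone later narrows the matcher to a single source, the check returns false and the regression surfaces before compaction silently drops orientation.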
Adds a Claude-only PreToolUse hook (matcher Bash) that denies, with feedback to
the model, two things the output style only asked for in prose:
- git commits carrying AI attribution (Co-Authored-By: Claude, 'Generated with
  Claude', claude.ai/code, or a Claude-Session trailer)
- staging .meridian/ working artifacts (git add/stage of a .meridian path)

Everything else defers to normal permissions. Registered in hooks.json only;
Cursor and Copilot have no PreToolUse event.
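
The attribution half of the guard could be sketched as follows. The four markers come from the commit message; the regexes, the function names, and the deny reason text are assumptions. The payload fields (`tool_name`, `tool_input.command`) and the `hookSpecificOutput` decision object follow the Claude Code PreToolUse hook format as I understand it, so treat the exact shape as an assumption too.

```javascript
// Markers named in the commit message; the exact regexes are assumptions.
const ATTRIBUTION_PATTERNS = [
  /co-authored-by:\s*claude/i,
  /generated with\s+\[?claude/i,
  /claude\.ai\/code/i,
  /claude-session\s*:/i,
];

// Returns a deny reason for a Bash command, or null to defer.
// Limitation of this sketch: forms such as `git -C dir commit` are not detected.
function attributionDenyReason(command) {
  if (!/\bgit\s+commit\b/.test(command)) return null;
  const hit = ATTRIBUTION_PATTERNS.some((pattern) => pattern.test(command));
  return hit ? "Commit message carries AI attribution; remove it and retry." : null;
}

// Maps a PreToolUse payload to a deny decision, or null to defer to
// normal permissions.
function decide(payload) {
  if (payload.tool_name !== "Bash") return null;
  const reason = attributionDenyReason(payload.tool_input?.command ?? "");
  if (reason === null) return null;
  return {
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      permissionDecision: "deny",
      permissionDecisionReason: reason,
    },
  };
}
```

In a real hook script the payload would be parsed from stdin and a non-null decision printed as JSON; returning null corresponds to producing no output so the normal permission flow applies.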
Strengthens the review enforcement layer that durably counters comment slop.
Judges comments added in the diff by whether they capture non-obvious why,
and names the dominant first-draft failure mode — chain-of-thought narrated as
comments — alongside self-evident-signature restatement and oversized blocks.
The Claude and Cursor manifests had drifted (0.10.9 vs 0.10.8) because
'claude plugin validate' only inspects the Claude manifest. Set all three
per-host manifests to 0.11.0 and add a test asserting they agree. The test
runner now globs test/*.test.mjs, so new suites are picked up without editing
package.json or CI.
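
The agreement check reduces to comparing the distinct versions across manifests. In this sketch the manifests are in-memory objects and the host labels are assumptions (the PR names Claude and Cursor explicitly); a real test would read the three JSON files from disk.

```javascript
// Returns the distinct version strings found across the manifests.
function distinctVersions(manifests) {
  return [...new Set(manifests.map((manifest) => manifest.version))];
}

// Hypothetical in-memory manifests; host labels are illustrative.
const manifests = [
  { host: "claude", version: "0.11.0" },
  { host: "cursor", version: "0.11.0" },
  { host: "copilot", version: "0.11.0" },
];

// The manifests agree when exactly one distinct version remains.
const agree = distinctVersions(manifests).length === 1;
```

Reporting the distinct versions, rather than a bare boolean, makes a failure message show which values drifted.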
Two checks against the drift that the routing tables, dispatches, and HARD-GATE
duplication invite: every meridian:<x> reference across skills, agents, hooks
context, output style, and README must resolve to a real skill or agent, and
each skill's frontmatter name must match its directory.
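
The reference check can be sketched as a scan for meridian:<x> tokens compared against the set of known names. The character class for names (lowercase letters, digits, hyphens) and the function name are assumptions; the set of known skills and agents would come from the repository's directories.

```javascript
// Matches meridian:<name>; the allowed name characters are an assumption.
const REFERENCE = /\bmeridian:([a-z0-9][a-z0-9-]*)/g;

// Returns the sorted, de-duplicated names referenced in `text` that are
// not present in the `known` set of skill and agent names.
function unresolvedReferences(text, known) {
  const missing = new Set();
  for (const match of text.matchAll(REFERENCE)) {
    if (!known.has(match[1])) missing.add(match[1]);
  }
  return [...missing].sort();
}
```

For example, with `known` containing `debug` and `sketch`, a document mentioning `meridian:debug` and `meridian:ghost` yields a single unresolved name, `ghost`.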
The skills table omitted sketch (a core workflow) and gave no pointer to the
meta/lens/modifier skills (meridian, triangulate, auto), leaving four of the
shipped skills undiscoverable from the README.
Records releases in Keep a Changelog format instead of relying on commit
subjects as the de-facto changelog.
promptfoo-based eval that checks prompts route to the correct skill against the
real plugin, across Opus/Sonnet/Haiku. Uses the anthropic:claude-agent-sdk
provider with skill-used assertions and a local plugin load; corpus under
eval/scenarios covers one positive per routable skill, trivial negatives, and
the failure-signal reroute. Run on demand via 'pnpm eval' (needs ANTHROPIC_API_KEY);
an optional workflow_dispatch job runs it in CI. Not part of the offline gates.
max_turns:1 made the agent-sdk provider error before returning a result with
skillCalls; 6 lets each scenario conclude (0 errors). Run on Sonnet only — the
routing baseline. The reroute hook fires through the provider (a probe routed
'still broken' to meridian:debug).
Rewrite the research and execute scenarios so the route is determined by the
prompt alone (verify-before-coding against an external API; locked requirements
ready to implement) — both now route correctly. Remove the cold single-turn
reroute scenarios: a first-message 'still broken' has no prior fix to debug, so
the route is only meaningful mid-flow (deferred); the hook firing is covered by
the unit tests.
The PreToolUse guard matched .meridian anywhere in a git add argument, so a file
named e.g. data.meridian-export.json was wrongly denied. Require a path boundary
on both sides so only the .meridian directory matches.
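
A boundary-aware match might look like the sketch below. The exact pattern the fix uses is not shown in the PR, so the boundary characters chosen here (start or end of string, whitespace, quotes, and "/") and the function name are assumptions.

```javascript
// A .meridian path segment: preceded by start of string, whitespace, a quote,
// or "/", and followed by "/", whitespace, a quote, or end of string.
const MERIDIAN_DIR = /(^|[\s"'/])\.meridian(?=$|[\s"'/])/;

// True when a git add/stage command names the .meridian directory or a
// path inside it.
function stagesMeridian(command) {
  if (!/\bgit\s+(add|stage)\b/.test(command)) return false;
  return MERIDIAN_DIR.test(command);
}
```

With this pattern, `git add .meridian/plan.md` matches, while `git add data.meridian-export.json` does not, because the character before `.meridian` is a letter rather than a path boundary.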
…artifact

The routing eval workflow writes the pass/fail matrix to the GitHub job summary
and uploads the promptfoo HTML report as an artifact. Scenario failures no longer
fail the run (continue-on-error) — routing is non-deterministic and the job is
informational, not a gate. Still workflow_dispatch only, no schedule.
KodingDev merged commit a4d9cf2 into master on Jun 25, 2026
5 checks passed
KodingDev deleted the feat/v0.11.0 branch on June 25, 2026 at 03:28