v0.11.0: hook hardening, commit guard, consistency guards, comment lens, routing eval by KodingDev · Pull Request #5 · KodingDev/meridian

KodingDev · 2026-06-25T01:51:07Z

v0.11.0

Hooks

The hooks.json resolver exits cleanly when neither CLAUDE_PLUGIN_ROOT nor PLUGIN_ROOT is set, instead of spawning undefined/hooks/*.mjs and crashing the session. touch() and clear() wrap their filesystem calls so an I/O error degrades to a no-op. The failure-signal match collapses internal whitespace, so an accidental double-space ("still  broken") still reroutes to debug.
A new Claude-only PreToolUse guard (hooks/pre-tool-use.mjs, matcher Bash) denies git commits carrying AI attribution (Co-Authored-By: Claude, "Generated with Claude", claude.ai/code, or a Claude-Session trailer) and denies staging the gitignored .meridian/ artifacts. Everything else defers to normal permissions.
A regression test pins post-compaction orientation re-injection: SessionStart fires with source compact, and the empty matcher must keep catching it.
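A minimal sketch of the resolver guard described in the first Hooks item, assuming the resolver prefers CLAUDE_PLUGIN_ROOT and falls back to PLUGIN_ROOT. The function names `resolvePluginRoot` and `hookScriptPath` are hypothetical and the real hooks.json resolver may be structured differently.

```javascript
// Hypothetical helpers: names and return shape are illustrative only.
function resolvePluginRoot(env) {
  // Assumed precedence: CLAUDE_PLUGIN_ROOT first, then PLUGIN_ROOT.
  const root = env.CLAUDE_PLUGIN_ROOT || env.PLUGIN_ROOT;
  // A missing or blank value means there is nothing safe to spawn.
  return typeof root === "string" && root.trim() !== "" ? root : null;
}

function hookScriptPath(env, hookName) {
  const root = resolvePluginRoot(env);
  // Returning null lets the caller exit with status 0 instead of
  // building a path that starts with the string "undefined".
  if (root === null) return null;
  return `${root}/hooks/${hookName}.mjs`;
}
```

A caller would check for `null` and exit quietly, which is the "exits cleanly" behavior rather than a crash on a nonexistent script.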

Review & consistency

The Craft & Simplicity review lens judges comments by value — flagging chain-of-thought narrated as comments, self-evident restatement, and oversized blocks — rather than by count.
A consistency test suite: the three per-host manifest versions must agree (now aligned at 0.11.0), every meridian:<skill>/meridian:<agent> reference must resolve to something that exists, and each skill's frontmatter name must match its directory. The test runner discovers test/*.test.mjs by glob.

Docs

The README lists the sketch workflow and the composing meridian/triangulate/auto skills. A CHANGELOG starts at 0.11.0.

Skill-routing eval (dev tooling)

eval/ adds a promptfoo-based routing eval (the anthropic:claude-agent-sdk provider with skill-used assertions) that checks prompts route to the correct skill against the real plugin on Sonnet. Run on demand with pnpm eval; an optional workflow_dispatch job runs it in CI. promptfoo and the agent SDK are devDependencies — the plugin still ships no runtime dependencies. Not part of the offline CI gates.

The meridian routing skill ended with leftover </content></invoke> tags that were injected into context whenever the skill loaded.

- Resolver exits cleanly when neither CLAUDE_PLUGIN_ROOT nor PLUGIN_ROOT is set, instead of spawning 'undefined/hooks/*.mjs' and crashing the session.
- touch() and clear() wrap their fs calls so a permission/disk error degrades to a no-op rather than crashing the hook.
- Failure-signal matching collapses internal whitespace, so an accidental double-space ('still  broken') still reroutes to debug.
- Tests cover the unset-env exit, the WSL /mnt fallback at runtime, and the double-spaced signal.
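The whitespace-collapsing match can be sketched roughly as follows. The signal list, the lowercasing step, and the names `normalize` and `hasFailureSignal` are assumptions for illustration; the PR does not show the hook's actual list or matcher.

```javascript
// Hypothetical signal list; only 'still broken' is named in this PR.
const FAILURE_SIGNALS = ["still broken"];

function normalize(text) {
  // Lowercase (an assumption), collapse every whitespace run to one space, trim.
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

function hasFailureSignal(prompt) {
  const normalized = normalize(prompt);
  return FAILURE_SIGNALS.some((signal) => normalized.includes(signal));
}
```

Normalizing the prompt once, before comparing, keeps the signal list readable: each entry is written with single spaces and still matches input containing doubled spaces or tabs.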

SessionStart fires with source "compact" after auto/manual compaction and is the only compaction-time event that can inject context (PostCompact output is ignored). The empty matcher already routes every source through session-start, so orientation is restored when compaction drops it. Pin that behavior: a test that the hook re-emits on a compact-source payload, and a guard that the SessionStart matcher stays empty.
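The matcher guard might look something like this sketch. It assumes hooks.json has the shape `{ hooks: { SessionStart: [{ matcher, hooks: [...] }] } }` and that an omitted matcher counts as empty; `sessionStartMatchersAreEmpty` is a hypothetical name, not the test in the repository.

```javascript
// Assumed hooks.json shape: { hooks: { SessionStart: [{ matcher, hooks: [...] }] } }.
function sessionStartMatchersAreEmpty(config) {
  const entries = (config.hooks && config.hooks.SessionStart) || [];
  // No SessionStart entry at all means nothing fires, so fail the guard.
  if (entries.length === 0) return false;
  // Every entry must leave the matcher empty (or omit it) so that the
  // "compact" source is routed the same way as any other source.
  return entries.every(
    (entry) => entry.matcher === undefined || entry.matcher === ""
  );
}
```

A matcher narrowed to a single source would make this return `false`, which is the regression the guard is meant to catch before compaction silently stops restoring orientation.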

Adds a Claude-only PreToolUse hook (matcher Bash) that denies, with feedback to the model, two things the output style only asked for in prose:
- git commits carrying AI attribution (Co-Authored-By: Claude, 'Generated with Claude', claude.ai/code, or a Claude-Session trailer)
- staging .meridian/ working artifacts (git add/stage of a .meridian path)

Everything else defers to normal permissions. Registered in hooks.json only; Cursor and Copilot have no PreToolUse event.
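As a rough illustration of the attribution check, the sketch below classifies a Bash command string. The regexes, the `{ decision, reason }` return shape, and the names `isGitCommit` and `checkCommit` are assumptions; the real hooks/pre-tool-use.mjs may parse the command differently and reports its decision in whatever format the host expects.

```javascript
// Patterns mirror the four markers named above; the exact regexes are assumptions.
const ATTRIBUTION_PATTERNS = [
  /co-authored-by:\s*claude/i,
  /generated with \[?claude/i,
  /claude\.ai\/code/i,
  /claude-session/i,
];

function isGitCommit(command) {
  // Loose check: "git" followed by "commit" within the same shell segment.
  return /\bgit\b[^|;&]*\bcommit\b/.test(command);
}

function checkCommit(command) {
  // Anything that is not a git commit is left to normal permissions.
  if (!isGitCommit(command)) return { decision: "defer" };
  const hit = ATTRIBUTION_PATTERNS.find((pattern) => pattern.test(command));
  return hit
    ? { decision: "deny", reason: `AI attribution matched ${hit}` }
    : { decision: "defer" };
}
```

Returning a reason alongside the denial is what gives the model feedback it can act on, for example by retrying the commit without the trailer.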

Strengthens the review enforcement layer that durably counters comment slop. Judges comments added in the diff by whether they capture non-obvious why, and names the dominant first-draft failure mode — chain-of-thought narrated as comments — alongside self-evident-signature restatement and oversized blocks.

The Claude and Cursor manifests had drifted (0.10.9 vs 0.10.8) because 'claude plugin validate' only inspects the Claude manifest. Set all three per-host manifests to 0.11.0 and add a test asserting they agree. The test runner now globs test/*.test.mjs, so new suites are picked up without editing package.json or CI.
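The agreement check reduces to comparing the set of distinct versions. This is a sketch under the assumption that each manifest's version has already been read into a plain object keyed by host; `versionsAgree` and `describeDrift` are hypothetical names.

```javascript
function versionsAgree(versionsByHost) {
  // All hosts agree when exactly one distinct version string is present.
  return new Set(Object.values(versionsByHost)).size === 1;
}

function describeDrift(versionsByHost) {
  // Produce a readable failure message such as "claude=0.10.9, cursor=0.10.8".
  return Object.entries(versionsByHost)
    .map(([host, version]) => `${host}=${version}`)
    .join(", ");
}
```

A test built on this would fail with the `describeDrift` output in its message, so the drifting manifest is visible without opening each file.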

Two checks against the drift the routing tables, dispatches, and HARD-GATE duplication invite: every meridian:<x> reference across skills, agents, hooks context, output style, and README must resolve to a real skill or agent, and each skill's frontmatter name must match its directory.
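The reference check can be sketched as a scan plus a set lookup. The naming rule in the regex (lowercase letters, digits, hyphens) and the function names are assumptions; the actual suite may collect names from the skill and agent directories in its own way.

```javascript
function findReferences(text) {
  // Matches meridian:<name>; the allowed characters are an assumed naming rule.
  return [...text.matchAll(/\bmeridian:([a-z0-9][a-z0-9-]*)/g)].map(
    (match) => match[1]
  );
}

function unresolvedReferences(text, knownNames) {
  // Deduplicate, then keep only names with no matching skill or agent.
  return [...new Set(findReferences(text))].filter(
    (name) => !knownNames.has(name)
  );
}
```

In a test, `knownNames` would be built from the skill and agent directories, and the assertion is simply that `unresolvedReferences` returns an empty array for every scanned file.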

The skills table omitted sketch (a core workflow) and gave no pointer to the meta/lens/modifier skills (meridian, triangulate, auto), leaving four of the shipped skills undiscoverable from the README.

Records releases in Keep a Changelog format instead of relying on commit subjects as the de-facto changelog.

promptfoo-based eval that checks prompts route to the correct skill against the real plugin, across Opus/Sonnet/Haiku. Uses the anthropic:claude-agent-sdk provider with skill-used assertions and a local plugin load; corpus under eval/scenarios covers one positive per routable skill, trivial negatives, and the failure-signal reroute. Run on demand via 'pnpm eval' (needs ANTHROPIC_API_KEY); an optional workflow_dispatch job runs it in CI. Not part of the offline gates.

max_turns:1 made the agent-sdk provider error before returning a result with skillCalls; raising it to 6 lets each scenario conclude (0 errors). Run on Sonnet only, as the routing baseline. The reroute hook fires through the provider (a probe routed 'still broken' to meridian:debug).

Rewrite the research and execute scenarios so the route is determined by the prompt alone (verify-before-coding against an external API; locked requirements ready to implement) — both now route correctly. Remove the cold single-turn reroute scenarios: a first-message 'still broken' has no prior fix to debug, so the route is only meaningful mid-flow (deferred); the hook firing is covered by the unit tests.

The PreToolUse guard matched .meridian anywhere in a git add argument, so a file named e.g. data.meridian-export.json was wrongly denied. Require a path boundary on both sides so only the .meridian directory matches.
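One way to express the two-sided path boundary is a regex anchored on a slash or the start or end of the argument. This is a sketch, not the guard's actual pattern, and `stagesMeridianDir` is a hypothetical name.

```javascript
// Boundary on both sides: start of argument or "/", then ".meridian",
// then "/" or end of argument (assumed definition of a path boundary).
const MERIDIAN_DIR = /(^|\/)\.meridian(\/|$)/;

function stagesMeridianDir(gitAddArgs) {
  return gitAddArgs.some((arg) => MERIDIAN_DIR.test(arg));
}
```

With this pattern a file such as data.meridian-export.json no longer matches, because `.meridian` is preceded by a letter and followed by a hyphen rather than a path separator.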

…artifact

The routing eval workflow writes the pass/fail matrix to the GitHub job summary and uploads the promptfoo HTML report as an artifact. Scenario failures no longer fail the run (continue-on-error), because routing is non-deterministic and the job is informational, not a gate. Still workflow_dispatch only, no schedule.

KodingDev added 15 commits June 25, 2026 02:09

fix(meridian): remove stray template markup from entry-point skill

53f3b64

The meridian routing skill ended with leftover </content></invoke> tags that were injected into context whenever the skill loaded.

docs(meridian): list sketch and the composing skills in the README

f7e259b

The skills table omitted sketch (a core workflow) and gave no pointer to the meta/lens/modifier skills (meridian, triangulate, auto), leaving four of the shipped skills undiscoverable from the README.

docs(meridian): add CHANGELOG starting at 0.11.0

df629f1

Records releases in Keep a Changelog format instead of relying on commit subjects as the de-facto changelog.

docs(meridian): note the eval harness in the changelog

da0e853

KodingDev merged commit a4d9cf2 into master Jun 25, 2026
5 checks passed

KodingDev deleted the feat/v0.11.0 branch June 25, 2026 03:28
