
feat(schema): non-bare identifiers in lineage + autocomplete (closes #44) #45

Merged: BorisTyshkevich merged 2 commits into main from feat/issue-44-dotted-names on Jun 26, 2026

Conversation

@BorisTyshkevich

Collaborator

Closes #44 — the deeper dotted/backticked-name handling left as follow-up from #43. Verified against fixtures captured from Docker ClickHouse 26.5.1.

Lineage (src/core/schema-graph.js)

parseMvTarget couldn't read a backticked TO target. ClickHouse backtick-quotes non-bare names in create_table_query:

CREATE MATERIALIZED VIEW target_all.`mv-1` TO target_all.`agg.out.parquet` (…) AS SELECT …

The old regex captured only target_all (the db), which produced a phantom target_all.target_all node and a wrong or missing writes edge. parseMvTarget now parses a backtick-quoted or dotted compound identifier and returns {db?, table} with backticks stripped; the parts are then joined via joinId.
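A minimal sketch of the parsing described above, not the code in src/core/schema-graph.js. The names readIdentPart and parseMvTargetSketch are hypothetical, the escape handling assumes backslash escapes only, and the scan is limited to the text before the first '(' as the review follow-up describes.

```javascript
// Hypothetical helper: read one identifier part (bare or backtick-quoted)
// starting at index i. Returns { text, end } or null if nothing parses.
function readIdentPart(s, i) {
  if (s[i] === '`') {
    let out = '';
    let j = i + 1;
    while (j < s.length) {
      if (s[j] === '\\' && j + 1 < s.length) { // assumed escape form: \x
        out += s[j + 1];
        j += 2;
        continue;
      }
      if (s[j] === '`') return { text: out, end: j + 1 }; // closing backtick
      out += s[j];
      j += 1;
    }
    return null; // unterminated quote
  }
  const m = /^[A-Za-z_][A-Za-z0-9_]*/.exec(s.slice(i));
  return m ? { text: m[0], end: i + m[0].length } : null;
}

// Sketch: extract the TO target as { db?, table } with backticks stripped.
function parseMvTargetSketch(createQuery) {
  // Only look at the header before the column list, so a " TO " inside
  // a column comment or the SELECT body is never considered.
  const paren = createQuery.indexOf('(');
  const head = paren === -1 ? createQuery : createQuery.slice(0, paren);
  const m = /\sTO\s+/i.exec(head);
  if (!m) return null; // implicit MV: no explicit target
  const first = readIdentPart(head, m.index + m[0].length);
  if (!first) return null;
  if (head[first.end] === '.') {
    const second = readIdentPart(head, first.end + 1);
    if (second) return { db: first.text, table: second.text };
  }
  return { table: first.text };
}

console.log(parseMvTargetSketch(
  'CREATE MATERIALIZED VIEW target_all.`mv-1` TO target_all.`agg.out.parquet` (`x` UInt8) AS SELECT 1 AS x'
));
```

Treating a dot as a separator only between parts, never inside a quoted part, is what keeps agg.out.parquet intact as a single table name.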

EXPLAIN-AST source resolution. Ground truth: EXPLAIN AST prints names unquoted, qualified or bare (TableIdentifier target_all.part-0.snappy.parquet, or just part-0.snappy.parquet for a default-db ref). The old qualify dot-heuristic mislabeled a bare dotted name as already-qualified and dropped the edge. Now sources resolve against the known ids both ways (as-is, then db-qualified) — so dotted names link correctly and CTE/alias names still drop. The ambiguous qualify helper is removed entirely.
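The two-way lookup can be sketched as follows. This illustrates the idea under stated assumptions and is not the resolver in schema-graph.js: resolveSourceSketch and joinIdSketch are hypothetical names, and the known ids are assumed to be "db.table" strings held in a Set.

```javascript
// Hypothetical stand-in for the project's joinId helper.
function joinIdSketch(db, table) {
  return db + '.' + table;
}

// Sketch: resolve a name printed by EXPLAIN AST against the known node ids.
// Try the name as-is (already qualified), then qualified with the default db.
// Anything that matches neither (CTE, alias, unknown) yields no edge.
function resolveSourceSketch(name, defaultDb, knownIds) {
  if (knownIds.has(name)) return name;
  const qualified = joinIdSketch(defaultDb, name);
  if (knownIds.has(qualified)) return qualified;
  return null;
}

const knownIds = new Set([
  'target_all.part-0.snappy.parquet',
  'target_all.events',
]);

// Qualified dotted name: matches as-is.
console.log(resolveSourceSketch('target_all.part-0.snappy.parquet', 'target_all', knownIds));
// Bare dotted name: matches only after db-qualification.
console.log(resolveSourceSketch('part-0.snappy.parquet', 'target_all', knownIds));
// CTE or alias name: no known id, so no edge.
console.log(resolveSourceSketch('recent_rows', 'target_all', knownIds));
```

Because the lookup is driven by known ids rather than by counting dots, a dotted table name cannot be mistaken for a db.table pair. The same design means a CTE that shares its name with a real table in the default db resolves to that table.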

Autocomplete (src/core/completions.js)

completionContext is now backtick-aware:

  • An open backtick starts the word, and the context's from offset points at the backtick, so an accepted candidate (`part-0.snappy.parquet`) replaces everything from the backtick onward: no doubled backtick, no mid-token splice.
  • A backtick-quoted table before a dot is recognized as the qualified parent (unquoted), so `weird.tbl`.<col> column completion works.
  • Backticks before the caret are counted (odd ⇒ inside an open quote) so a closed `my tbl` run isn't mistaken for an open one.
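To illustrate why from has to point at the backtick rather than at the first character after it, here is a small sketch of candidate acceptance. acceptCandidateSketch is a hypothetical helper, not part of completions.js, and it assumes the editor replaces the range [from, caret) with the already-quoted candidate.

```javascript
// Hypothetical: replace the range [from, caret) with the accepted candidate.
function acceptCandidateSketch(text, from, caret, candidate) {
  return text.slice(0, from) + candidate + text.slice(caret);
}

const text = 'SELECT * FROM `part-0';
const caret = text.length;
const backtickAt = text.lastIndexOf('`'); // start of the open quoted word
const candidate = '`part-0.snappy.parquet`';

// from at the backtick: the typed prefix, quote included, is replaced cleanly.
console.log(acceptCandidateSketch(text, backtickAt, caret, candidate));
// → SELECT * FROM `part-0.snappy.parquet`

// from one past the backtick: the opening backtick is doubled.
console.log(acceptCandidateSketch(text, backtickAt + 1, caret, candidate));
// → SELECT * FROM ``part-0.snappy.parquet`
```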

rankCompletions is unchanged — it matches on the now-unquoted word/parent.

Tests

  • parseMvTarget: {db, table} return shape, backticked/dotted target, unqualified target.
  • Whole-graph fixtures (from Docker): backticked TO target + qualified and unqualified dotted AST sources, plus a no-phantom-node assertion.
  • completionContext: open-backtick word, closed-run not-open, backticked parent, open-backtick column under backticked parent.

876 tests pass; format.js / completions.js / schema-graph.js per-file gates hold. Built + deployed to otel / antalya / github.demo (sha 209e6f8).

Deferred (still in #44, intentionally not here)

  • A guard test/lint chokepoint (a source-grep test is brittle).
  • Bare dashed typing without a leading backtick (part-00…) — inherently ambiguous with subtraction; the supported path is to start the identifier with a backtick.

🤖 Generated with Claude Code

BorisTyshkevich and others added 2 commits June 25, 2026 23:28

Closes the deeper dotted/backticked-name gaps left by #43, verified against
fixtures captured from Docker ClickHouse 26.5.1.

Lineage (src/core/schema-graph.js):
- parseMvTarget now parses a backtick-quoted/dotted TO target (CH backtick-quotes
  non-bare names in create_table_query, e.g. TO target_all.`agg.out.parquet`) and
  returns {db?, table} with backticks stripped. The old regex captured only the db
  part → a phantom `db.db` node and a wrong/absent writes edge.
- EXPLAIN-AST source resolution no longer uses the ambiguous dot heuristic: AST
  prints names unquoted and qualified-or-bare, so resolve against the known ids
  both ways (as-is, then db-qualified). Removes the `qualify` helper entirely.

Autocomplete (src/core/completions.js):
- completionContext is backtick-aware: an open backtick starts the word (so an
  accepted candidate replaces from the backtick, no doubling), and a backtick-
  quoted table before a dot is recognized as the qualified parent (unquoted) so
  `weird.tbl`.<col> column completion works. Counting backticks before the caret
  avoids mistaking a closed run for an open one.

Tests: parseMvTarget {db,table} + backticked/unqualified cases; whole-graph
fixtures for a backticked TO target + qualified/unqualified dotted AST sources
(incl. no-phantom-node assertion); completionContext open-backtick / closed-run /
backticked-parent cases. 876 tests pass; format/completions/schema-graph gates hold.

Deferred (noted in #44): a guard test/lint chokepoint, and bare dashed typing
without a leading backtick (inherently ambiguous with subtraction).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QGBS74oUsXarGkCRQKEFLu
Correctness:
- completionContext no longer counts raw backticks across the whole buffer to
  decide "inside an open backtick" — a backtick in a string literal, a comment,
  an earlier statement, or an escaped `\`` flipped the parity and put the rest of
  the document into a bogus open-quote mode (word = a giant multi-token slice →
  autocomplete silently died, and accepting could delete text via the stale
  `from`). Now detects an open backtick identifier via the SQL tokenizer
  (openBacktickStart), which keeps string/comment backticks inside their own
  tokens and handles escapes. The bare-identifier path is unchanged.
- parseMvTarget only scans up to the first '(' (and before AS SELECT), so a stray
  " TO " inside a column comment or the SELECT body is no longer mistaken for the
  MV target — which previously produced a phantom target node AND suppressed the
  real `.inner` writes edge on implicit MVs.
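The difference between parity counting and token-aware detection can be sketched with a small scanner. This is an assumption-laden illustration, not the project's SQL tokenizer or its openBacktickStart: it handles only single-quoted strings, -- and /* */ comments, and backslash escapes.

```javascript
// Sketch: return the index of a backtick identifier that is still open at
// `caret`, or -1 if the caret is not inside one. Backticks inside strings
// and comments are consumed as part of those tokens and never counted.
function openBacktickStartSketch(sql, caret) {
  let i = 0;
  while (i < caret) {
    const c = sql[i];
    if (c === "'") {
      // String literal: skip to the closing quote, honoring backslash escapes.
      i += 1;
      while (i < caret && sql[i] !== "'") i += sql[i] === '\\' ? 2 : 1;
      i += 1;
    } else if (c === '-' && sql[i + 1] === '-') {
      // Line comment: skip to end of line.
      while (i < caret && sql[i] !== '\n') i += 1;
    } else if (c === '/' && sql[i + 1] === '*') {
      // Block comment: skip past the terminator, or to the caret if unclosed.
      const end = sql.indexOf('*/', i + 2);
      i = end === -1 || end >= caret ? caret : end + 2;
    } else if (c === '`') {
      // Backtick identifier: open only if no closing backtick before the caret.
      const start = i;
      i += 1;
      while (i < caret && sql[i] !== '`') i += sql[i] === '\\' ? 2 : 1;
      if (i >= caret) return start;
      i += 1;
    } else {
      i += 1;
    }
  }
  return -1;
}

const sql = "SELECT 'a`b' FROM `par";
// A raw count sees two backticks (even parity) and would miss the open quote;
// the scanner skips the one inside the string and reports the real one.
console.log(openBacktickStartSketch(sql, sql.length) === sql.lastIndexOf('`'));
```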

Cleanup (also from the review):
- Consolidate the backtick-unescape into format.js `unquoteIdent` (was duplicated
  in schema-graph.js unBacktick and completions.js identBefore); both now call it.
- Hoist the doubly-computed joinId in the AST source resolver.
- Correct the AST-resolution comment: a CTE that shadows a real same-db table
  resolves to that table (can't be told apart from the name alone).
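For reference, a shared unquote helper of the kind described in the cleanup could look roughly like this. unquoteIdentSketch is a hypothetical stand-in; the real unquoteIdent in format.js may treat escapes differently, and only backslash escapes are assumed here.

```javascript
// Sketch: strip surrounding backticks and undo backslash escapes.
// Input that is not backtick-quoted is returned unchanged.
function unquoteIdentSketch(ident) {
  const quoted =
    ident.length >= 2 && ident[0] === '`' && ident[ident.length - 1] === '`';
  if (!quoted) return ident;
  return ident.slice(1, -1).replace(/\\(.)/g, '$1');
}

console.log(unquoteIdentSketch('`weird.tbl`')); // weird.tbl
console.log(unquoteIdentSketch('plain'));       // plain
```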

Tests: backtick-in-string / -comment / closed-run-earlier completion cases;
stray-TO-in-column-comment parseMvTarget case. 880 tests pass; gates hold.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QGBS74oUsXarGkCRQKEFLu
@BorisTyshkevich BorisTyshkevich merged commit 96ec880 into main Jun 26, 2026
4 checks passed
@BorisTyshkevich BorisTyshkevich deleted the feat/issue-44-dotted-names branch June 26, 2026 05:25

Development

Successfully merging this pull request may close these issues.

Non-bare identifiers: deepen autocomplete matching + EXPLAIN-AST/MV-target lineage
