feat(schema): non-bare identifiers in lineage + autocomplete (closes #44) #45
Merged
Closes the deeper dotted/backticked-name gaps left by #43, verified against
fixtures captured from Docker ClickHouse 26.5.1.
Lineage (src/core/schema-graph.js):
- parseMvTarget now parses a backtick-quoted/dotted TO target (CH backtick-quotes
non-bare names in create_table_query, e.g. TO target_all.`agg.out.parquet`) and
returns {db?, table} with backticks stripped. The old regex captured only the db
part → a phantom `db.db` node and a wrong/absent writes edge.
- EXPLAIN-AST source resolution no longer uses the ambiguous dot heuristic: AST
prints names unquoted and qualified-or-bare, so resolve against the known ids
both ways (as-is, then db-qualified). Removes the `qualify` helper entirely.
Autocomplete (src/core/completions.js):
- completionContext is backtick-aware: an open backtick starts the word (so an
accepted candidate replaces from the backtick, no doubling), and a
backtick-quoted table before a dot is recognized as the qualified parent
(unquoted) so `weird.tbl`.<col> column completion works. Counting backticks
before the caret avoids mistaking a closed run for an open one.
Tests: parseMvTarget {db,table} + backticked/unqualified cases; whole-graph
fixtures for a backticked TO target + qualified/unqualified dotted AST sources
(incl. no-phantom-node assertion); completionContext open-backtick / closed-run /
backticked-parent cases. 876 tests pass; format/completions/schema-graph gates hold.
Deferred (noted in #44): a guard test/lint chokepoint, and bare dashed typing
without a leading backtick (inherently ambiguous with subtraction).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QGBS74oUsXarGkCRQKEFLu
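The two-way source resolution from the Lineage notes (as-is, then db-qualified) can be sketched as a plain lookup. This is a minimal illustration, not the code in schema-graph.js: the name `resolveSource`, the `knownIds` set, and the plain `db.table` id format are assumptions made for the example.

```javascript
// Hypothetical sketch: resolve a table name printed by EXPLAIN AST against
// the set of known node ids. Assumes ids are plain "db.table" strings; the
// project's real joinId format may differ.
function resolveSource(name, defaultDb, knownIds) {
  // 1. As-is: the AST printed an already-qualified name.
  if (knownIds.has(name)) return name;
  // 2. Db-qualified: the AST printed a bare name for a default-db table.
  const qualified = `${defaultDb}.${name}`;
  if (knownIds.has(qualified)) return qualified;
  // 3. No match either way: treat it as a CTE or alias and drop it.
  return null;
}

const knownIds = new Set([
  "target_all.part-0.snappy.parquet",
  "target_all.events",
]);
```

With these ids, `resolveSource("target_all.part-0.snappy.parquet", "target_all", knownIds)` matches as-is, `resolveSource("part-0.snappy.parquet", "target_all", knownIds)` matches after qualification, and an unknown name such as a CTE resolves to `null`. Because no dot counting is involved, a bare dotted name is never mistaken for a qualified one.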
Correctness:
- completionContext no longer counts raw backticks across the whole buffer to
decide "inside an open backtick" — a backtick in a string literal, a comment,
an earlier statement, or an escaped `\`` flipped the parity and put the rest of
the document into a bogus open-quote mode (word = a giant multi-token slice →
autocomplete silently died, and accepting could delete text via the stale
`from`). Now detects an open backtick identifier via the SQL tokenizer
(openBacktickStart), which keeps string/comment backticks inside their own
tokens and handles escapes. The bare-identifier path is unchanged.
- parseMvTarget only scans up to the first '(' (and before AS SELECT), so a stray
" TO " inside a column comment or the SELECT body is no longer mistaken for the
MV target — which previously produced a phantom target node AND suppressed the
real `.inner` writes edge on implicit MVs.
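The head-only scan from the parseMvTarget item above, combined with backtick-aware target parsing, might look roughly like this. It is a sketch under assumptions, not the schema-graph.js implementation: the regex, the `stripBackticks` helper, and the `null` return for implicit MVs are all hypothetical, and it does not handle a '(' inside a backtick-quoted name.

```javascript
// Hypothetical sketch of a head-only MV target parse.
// One identifier part: a backtick-quoted run (backslash escapes allowed)
// or a bare word.
const PART = "(?:`(?:[^`\\\\]|\\\\.)*`|[A-Za-z_][A-Za-z0-9_]*)";
const TO_TARGET = new RegExp(`\\sTO\\s+(${PART})(?:\\.(${PART}))?`, "i");

// Remove surrounding backticks and undo backslash escapes; bare parts pass through.
function stripBackticks(part) {
  return part.startsWith("`") ? part.slice(1, -1).replace(/\\(.)/g, "$1") : part;
}

function parseMvTarget(createQuery) {
  // Only the statement head can hold the TO clause: cut at the first '('
  // (column list) or at AS SELECT, whichever comes first.
  const cuts = [createQuery.indexOf("("), createQuery.search(/\sAS\s+SELECT\b/i)]
    .filter((i) => i >= 0);
  const head = cuts.length ? createQuery.slice(0, Math.min(...cuts)) : createQuery;
  const m = TO_TARGET.exec(head);
  if (!m) return null; // no explicit target: an implicit MV
  return m[2] === undefined
    ? { table: stripBackticks(m[1]) }
    : { db: stripBackticks(m[1]), table: stripBackticks(m[2]) };
}
```

Because the search never reaches the column list or the SELECT body, a column comment such as `COMMENT 'copy TO archive'` cannot produce a target, and an implicit MV with such a comment still yields `null`.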
Cleanup (also from the review):
- Consolidate the backtick-unescape into format.js `unquoteIdent` (was duplicated
in schema-graph.js unBacktick and completions.js identBefore); both now call it.
- Hoist the doubly-computed joinId in the AST source resolver.
- Correct the AST-resolution comment: a CTE that shadows a real same-db table
resolves to that table (can't be told apart from the name alone).
Tests: backtick-in-string / -comment / closed-run-earlier completion cases;
stray-TO-in-column-comment parseMvTarget case. 880 tests pass; gates hold.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01QGBS74oUsXarGkCRQKEFLu
Closes #44 — the deeper dotted/backticked-name handling left as follow-up from #43. Verified against fixtures captured from Docker ClickHouse 26.5.1.
Lineage (src/core/schema-graph.js)
- parseMvTarget couldn't read a backticked TO target. CH backtick-quotes non-bare names in create_table_query, e.g. TO target_all.`agg.out.parquet`. The old regex captured only target_all (the db) → a phantom target_all.target_all node and a wrong/absent writes edge. Now it parses a backtick-quoted/dotted compound identifier and returns {db?, table} with backticks stripped, joined via joinId.
- EXPLAIN-AST source resolution. Ground truth: EXPLAIN AST prints names unquoted, qualified or bare (TableIdentifier target_all.part-0.snappy.parquet, or just part-0.snappy.parquet for a default-db ref). The old qualify dot-heuristic mislabeled a bare dotted name as already-qualified and dropped the edge. Now sources resolve against the known ids both ways (as-is, then db-qualified), so dotted names link correctly and CTE/alias names still drop. The ambiguous qualify helper is removed entirely.
Autocomplete (src/core/completions.js)
completionContext is now backtick-aware:
- An open backtick starts the word; from points at the backtick so an accepted candidate (`part-0.snappy.parquet`) replaces from the backtick: no doubling, no mid-token splice.
- A backtick-quoted table before a dot is recognized as the qualified parent (unquoted), so `weird.tbl`.<col> column completion works.
- Backticks before the caret are counted, so a closed `my tbl` run isn't mistaken for an open one.
rankCompletions is unchanged; it matches on the now-unquoted word/parent.
Tests
- parseMvTarget → {db,table}, backticked/dotted target, unqualified target.
- Whole-graph fixtures for a backticked TO target + qualified and unqualified dotted AST sources, plus a no-phantom-node assertion.
- completionContext: open-backtick word, closed-run not-open, backticked parent, open-backtick column under backticked parent.
876 tests pass; format.js / completions.js / schema-graph.js per-file gates hold. Built + deployed to otel / antalya / github.demo (sha 209e6f8).
Deferred (still in #44, intentionally not here)
- A guard test/lint chokepoint.
- Bare dashed typing without a leading backtick (part-00…): inherently ambiguous with subtraction; the supported path is to start the identifier with a backtick.
🤖 Generated with Claude Code
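For the supported path of starting an identifier with a backtick, the open-backtick detection behind completionContext could be sketched as a small scanner, in the tokenizer-based form the follow-up commit describes. This is an illustration only: the project uses its own SQL tokenizer, and the hand-rolled scanner below, its signature, and its -1 return value are assumptions.

```javascript
// Hypothetical sketch of tokenizer-style open-backtick detection.
// Returns the offset of the unclosed backtick before `caret`, or -1 if the
// caret is not inside a backtick-quoted identifier.
function openBacktickStart(sql, caret) {
  let i = 0;
  while (i < caret) {
    const ch = sql[i];
    if (ch === "'" || ch === '"' || ch === "`") {
      // Quoted token: scan to the matching quote, skipping backslash escapes.
      const start = i;
      i += 1;
      while (i < caret && sql[i] !== ch) i += sql[i] === "\\" ? 2 : 1;
      // Reached the caret without a closing quote: the caret is inside this token.
      if (i >= caret) return ch === "`" ? start : -1;
      i += 1; // consume the closing quote
    } else if (sql.startsWith("--", i)) {
      // Line comment: runs to the end of the line.
      const nl = sql.indexOf("\n", i);
      i = nl === -1 ? caret : nl + 1;
    } else if (sql.startsWith("/*", i)) {
      // Block comment: runs to the closing */.
      const end = sql.indexOf("*/", i + 2);
      i = end === -1 ? caret : end + 2;
    } else {
      i += 1;
    }
  }
  return -1;
}
```

A caller could then take `from` as the returned offset and the typed word as `sql.slice(from + 1, caret)`. Backticks inside string literals, comments, or closed identifiers are consumed with their own tokens, so they cannot flip the result the way a raw parity count over the buffer can.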