Metastore static#1704
Conversation
WalkthroughThe PR adds a shared metastore singleton during startup, extends ChangesShared metastore access
Sequence Diagram(s)sequenceDiagram
participant PARSEABLE
participant METASTORE
participant Cli
participant ObjectStoreMetastore
participant ParseableNew as Parseable::new
PARSEABLE->>METASTORE: get_or_init(shared Arc<dyn Metastore>)
METASTORE->>Cli: try_parse()
METASTORE->>ObjectStoreMetastore: construct shared metastore
PARSEABLE->>ParseableNew: metastore.clone()
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/metastore/metastores/object_store_metastore.rs (1)
868-887: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winWire the migration flag or remove it.
src/migration/mod.rsLine 310 now passestrue, but this implementation names the parameter_is_migrationand fetches the same path regardless. That makes the new migration flag a no-op.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/metastore/metastores/object_store_metastore.rs` around lines 868 - 887, The migration flag is currently ignored in get_stream_json, so the true value passed from the migration flow has no effect. Update get_stream_json to use the is_migration state when choosing the object path, or remove the parameter if the same path is always correct; make sure the behavior matches the call site in migration/mod.rs and that the path logic in get_base/stream_json_path reflects the intended migration-specific lookup.src/parseable/mod.rs (1)
140-183: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winParse CLI once before initializing the metastore.
prompt_missing_envs,Cli::try_parse, andsave_collected_envsrun insideMETASTORE.get_or_initand then again immediately after. This duplicates startup side effects and risks constructing the shared metastore from a different parse snapshot than theParseableinstance.Proposed shape
pub static PARSEABLE: Lazy<Parseable> = Lazy::new(|| { - let metastore = METASTORE.get_or_init(|| { - let collected_envs = crate::interactive::prompt_missing_envs(); - - let cli = match Cli::try_parse() { - Ok(cli) => { - crate::interactive::save_collected_envs(&collected_envs); - cli - } - Err(e) => { - e.exit(); - } - }; - let storage = match &cli.storage { - StorageOptions::Local(args) => args.storage.construct_client(), - StorageOptions::S3(args) => args.storage.construct_client(), - StorageOptions::Blob(args) => args.storage.construct_client(), - StorageOptions::Gcs(args) => args.storage.construct_client(), - }; - tracing::warn!("creating objectstoremetastore"); - let metastore = ObjectStoreMetastore { storage }; - Arc::new(metastore) - }); - let collected_envs = crate::interactive::prompt_missing_envs(); let cli = match Cli::try_parse() { Ok(cli) => { crate::interactive::save_collected_envs(&collected_envs); @@ e.exit(); } }; + + let metastore = METASTORE.get_or_init(|| { + let storage = match &cli.storage { + StorageOptions::Local(args) => args.storage.construct_client(), + StorageOptions::S3(args) => args.storage.construct_client(), + StorageOptions::Blob(args) => args.storage.construct_client(), + StorageOptions::Gcs(args) => args.storage.construct_client(), + }; + Arc::new(ObjectStoreMetastore { storage }) as Arc<dyn Metastore> + }); + match cli.storage {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/parseable/mod.rs` around lines 140 - 183, Parse the CLI and prompt/save env vars only once before calling METASTORE.get_or_init, since `prompt_missing_envs`, `Cli::try_parse`, and `save_collected_envs` are currently duplicated and can produce inconsistent startup state. Move that shared setup out of the `get_or_init` closure, then pass the already-parsed `cli` into the metastore construction path that creates `ObjectStoreMetastore` and `storage.construct_client()`.
🧹 Nitpick comments (2)
src/metastore/metastore_traits.rs (1)
222-233: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winDocument the new migration contract.
The trait docs still only define
get_base; callers and implementations have no contract for whatis_migrationchanges. Add a short doc sentence here so the object-store implementation and migration call sites stay aligned.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/metastore/metastore_traits.rs` around lines 222 - 233, The trait docs for get_stream_json only describe get_base and do not explain the new is_migration contract. Update the doc comment on MetastoreTraits::get_stream_json to briefly state what is_migration changes for callers and implementations, so object-store behavior and migration call sites stay aligned. Keep the wording near the existing get_base explanation and make sure the intent is clear without changing the method signature.src/parseable/mod.rs (1)
102-103: 🗄️ Data Integrity & Integration | 🔵 TrivialKeep
METASTOREprivate.parseable::parseable::METASTOREis public, so any caller can seed theOnceCellbeforePARSEABLEinitializes and force reuse of the wrong metastore. Expose a crate-local initializer or read-only accessor instead.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/parseable/mod.rs` around lines 102 - 103, The METASTORE singleton is publicly exposed, which allows external callers to seed it before PARSEABLE initializes and lock in the wrong metastore. Make METASTORE private in parseable::mod and keep the initialization path inside the module, then provide a crate-local initializer or a read-only accessor tied to PARSEABLE so other code can only retrieve the metastore, not set it.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/metastore/metastores/object_store_metastore.rs`:
- Around line 868-887: The migration flag is currently ignored in
get_stream_json, so the true value passed from the migration flow has no effect.
Update get_stream_json to use the is_migration state when choosing the object
path, or remove the parameter if the same path is always correct; make sure the
behavior matches the call site in migration/mod.rs and that the path logic in
get_base/stream_json_path reflects the intended migration-specific lookup.
In `@src/parseable/mod.rs`:
- Around line 140-183: Parse the CLI and prompt/save env vars only once before
calling METASTORE.get_or_init, since `prompt_missing_envs`, `Cli::try_parse`,
and `save_collected_envs` are currently duplicated and can produce inconsistent
startup state. Move that shared setup out of the `get_or_init` closure, then
pass the already-parsed `cli` into the metastore construction path that creates
`ObjectStoreMetastore` and `storage.construct_client()`.
---
Nitpick comments:
In `@src/metastore/metastore_traits.rs`:
- Around line 222-233: The trait docs for get_stream_json only describe get_base
and do not explain the new is_migration contract. Update the doc comment on
MetastoreTraits::get_stream_json to briefly state what is_migration changes for
callers and implementations, so object-store behavior and migration call sites
stay aligned. Keep the wording near the existing get_base explanation and make
sure the intent is clear without changing the method signature.
In `@src/parseable/mod.rs`:
- Around line 102-103: The METASTORE singleton is publicly exposed, which allows
external callers to seed it before PARSEABLE initializes and lock in the wrong
metastore. Make METASTORE private in parseable::mod and keep the initialization
path inside the module, then provide a crate-local initializer or a read-only
accessor tied to PARSEABLE so other code can only retrieve the metastore, not
set it.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: da92fdf1-d62a-4b50-88dc-7bd72422f577
📒 Files selected for processing (11)
src/catalog/mod.rssrc/enterprise/utils.rssrc/handlers/http/logstream.rssrc/lib.rssrc/metastore/metastore_traits.rssrc/metastore/metastores/object_store_metastore.rssrc/migration/mod.rssrc/parseable/mod.rssrc/query/mod.rssrc/query/stream_schema_provider.rssrc/storage/object_storage.rs
Modifications to introduce a new static variable which holds the Metastore implementation at runtime
Description
This PR has:
Summary by CodeRabbit