AIR CLI Integration: Adding support for air run configuration by riddhibhagwat-db · Pull Request #5657 · databricks/cli

riddhibhagwat-db · 2026-06-18T18:42:42Z

Changes

Ports the air run YAML config schema and its structural validation from the Python CLI (cli/sdk/config.py) to Go, under experimental/air/cmd/.

Schema (runconfig.go): the top-level runConfig plus the nested environment (with docker_image), code_source/snapshot/git, and permission blocks. Reuses the compute model from the parent branch. Includes custom YAML unmarshalers for the three polymorphic fields that don't map to a single Go type: environment.dependencies (string path or inline list), environment.version (string or int), and git.remote (bool or remote-name string).
Loader (runconfig_load.go): loadRunConfig decodes a YAML file with KnownFields(true) — mirroring pydantic's extra="forbid" so unknown keys are rejected — then runs the validation pass.
Validation: every structural rule from the Python schema — required fields, the experiment_name/mlflow_run_name task-key regex and length caps, secret-ref scope/key format, the environment docker-image/dependencies/version exclusivity rules, git branch-xor-commit and remote-requires-branch rules, code_source snapshot requirements, and include_paths relative/no-traversal checks.

Two deliberate divergences from the Python schema, both following from the training-service-only port:

The compute.node_pool_id / compute.pool_name fields were already dropped on the parent branch.
The top-level priority field is dropped here: it's a node-pool queue-ordering knob (it requires a pool in Python) with no meaning for serverless workloads.

Why

"Structural" validation (types, required fields, format/cross-field rules) needs no workspace access, so it's a self-contained, fully unit-testable unit that's worth landing on its own ahead of the launch logic. Splitting it out keeps the upcoming handle_run PR focused on orchestration rather than mixing in ~900 lines of schema.

The extra="forbid" / KnownFields behavior is load-bearing: it's what turns a typo'd or stale config key into an actionable error instead of a silently-ignored field, so it's preserved faithfully. This is stacked on air-integration-m2-1 (the compute model).

Tests

New unit tests in runconfig_test.go (62 subtests, table-driven), covering:

Loading a minimal config and a full-featured config (all blocks populated).
Each polymorphic union decoding both of its forms (dependencies string vs list, git.remote bool vs string, default-unset).
Unknown-field rejection at top level and nested — including explicit cases asserting the dropped priority field and the not-yet-ported bases key surface as errors.
Every validation rule's failure mode, plus file-level errors (missing file, empty file).

go test ./experimental/air/... passes; ./task lint-q reports 0 issues.

github-actions · 2026-06-18T18:44:32Z

Waiting for approval

Could not determine reviewers from git history.
Round-robin suggestion: @apeforest

Eligible reviewers: @apeforest, @ben-hansen-db, @bfontain, @lu-wang-dl, @maggiewang-db, @panchalhp-db, @pardis-beikzadeh-db, @vinchenzo-db

_{Suggestions based on git history. See OWNERS for ownership rules.}

eng-dev-ecosystem-bot · 2026-06-18T19:13:09Z

Integration test report

Commit: 226d41a

Run: 28137603360

	Env	💚RECOVERED	🙈SKIP	✅pass	🙈skip	Time
💚	aws linux	8	13	263	1015	5:41
💚	aws windows	8	13	265	1013	6:29
💚	aws-ucws linux	8	13	359	929	6:10
💚	aws-ucws windows	8	13	361	927	7:39
💚	azure linux	2	15	266	1013	4:58
💚	azure windows	2	15	268	1011	5:54
💚	azure-ucws linux	2	15	364	925	6:20
💚	azure-ucws windows	2	15	366	923	6:28
💚	gcp linux	2	15	262	1016	5:27
💚	gcp windows	2	15	264	1014	5:46

21 interesting tests: 13 SKIP, 8 RECOVERED

	Test Name	aws linux	aws windows	aws-ucws linux	aws-ucws windows	azure linux	azure windows	azure-ucws linux	azure-ucws windows	gcp linux	gcp windows
💚	TestAccept	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R
🙈	TestAccept/bundle/invariant/no_drift	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/permissions	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
💚	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions	💚R	💚R	💚R	💚R	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
💚	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct	💚R	💚R	💚R	💚R
💚	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform	💚R	💚R	💚R	💚R
💚	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions	💚R	💚R	💚R	💚R	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
💚	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct	💚R	💚R	💚R	💚R
💚	TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform	💚R	💚R	💚R	💚R
🙈	TestAccept/bundle/resources/postgres_branches/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/recreate	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/replace_existing	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/update_protected	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_branches/without_branch_id	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_endpoints/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/postgres_projects/update_display_name	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/synced_database_tables/basic	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/bundle/resources/vector_search_indexes/recreate/embedding_dimension	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
🙈	TestAccept/ssh/connection	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S	🙈S
💚	TestFetchRepositoryInfoAPI_FromRepo	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R	💚R

Top 20 slowest tests (at least 2 minutes):

duration	env	testname
4:29	gcp windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:14	gcp linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:04	gcp linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:00	gcp windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:51	aws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:29	aws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:27	azure-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:06	aws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:02	aws-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:01	azure linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:57	aws-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:47	azure-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:45	aws-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:43	azure windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:43	azure windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:37	azure-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:35	azure-ucws windows	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:33	aws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:29	azure linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:29	aws-ucws linux	TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct

Port the run YAML schema and its structural validation from the Python CLI's sdk/config.py: the top-level runConfig plus the environment, docker_image, code_source/snapshot/git, and permission blocks. loadRunConfig decodes a YAML file with KnownFields (mirroring pydantic extra="forbid") and runs the validation pass. "Structural" covers types, required fields, and format/cross-field rules that need no workspace access. Online checks (compute pool resolution, GPU availability), git/filesystem checks, _bases_ composition, and CLI --override handling are deferred to later milestones. Two deliberate divergences from the Python schema, both following from the training-service-only port: the compute pool fields were already dropped, and the top-level priority field is dropped here since it is a node-pool queue-ordering knob with no meaning for serverless workloads. Co-authored-by: Isaac

riddhibhagwat-db · 2026-06-25T00:21:25Z

+
+// validate runs structural validation over the whole config, returning the first
+// failure. Fields are checked in declaration order to keep error output stable.
+func (c *runConfig) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:21:44Z

+	DockerImage  *dockerImageConfig `yaml:"docker_image"`
+}
+
+func (e *environmentConfig) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:21:55Z

+	URL string `yaml:"url"`
+}
+
+func (d *dockerImageConfig) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:22:05Z

+	Snapshot *snapshotSourceConfig `yaml:"snapshot"`
+}
+
+func (c *codeSourceConfig) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:22:16Z

+	IncludePaths []string `yaml:"include_paths"`
+}
+
+func (s *snapshotSourceConfig) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:22:25Z

+	Remote gitRemote `yaml:"remote"`
+}
+
+func (g *gitRef) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:22:35Z

+	Level string `yaml:"level"`
+}
+
+func (p *permission) validate() error {


@ben-hansen-db @vinchenzo-db

riddhibhagwat-db · 2026-06-25T00:23:33Z

+	"go.yaml.in/yaml/v3"
+)
+
+// loadRunConfig reads a run YAML config file, decodes it into the schema, and


@vinchenzo-db @ben-hansen-db

riddhibhagwat-db · 2026-06-25T00:23:57Z

+	assert.Equal(t, 1, cfg.Compute.NumAccelerators)
+}
+
+func TestLoadRunConfig_FullFeatured(t *testing.T) {


@vinchenzo-db @ben-hansen-db

vinchenzo-db

maybe a question for @maggiewang-db or @ben-hansen-db, is there a way to make sure this run config doesn't diverge from the run config in the python cli?

vinchenzo-db · 2026-06-25T01:29:56Z

+//
+// The `_bases_` composition feature and CLI `--override` handling are not yet
+// ported; a config using `_bases_` is currently rejected as an unknown field.
+func loadRunConfig(path string) (*runConfig, error) {


I would split the decode step and the validate step here. Technically, decodeRunConfig + validateRunConfig makes more sense here (AIP-wise)

vinchenzo-db · 2026-06-25T01:39:35Z

+type gitRef struct {
+	Branch *string   `yaml:"branch"`
+	Commit *string   `yaml:"commit"`
+	Remote gitRemote `yaml:"remote"`


I'm not very good at Go but is the gitRemote flag used anywhere outside of tests?

vinchenzo-db · 2026-06-25T01:41:52Z

+	EnvVariables   map[string]string  `yaml:"env_variables"`
+	Secrets        map[string]string  `yaml:"secrets"`


Make sure you check that these aren't pointing to the same env variable.... and if so I imagine EnvVariable should take precedence, but ask @maggiewang-db

Reasoning is that if user accidentally sets env var and secret, and doesn't know which one, if secret takes precedence and they try to read it they make accidentally leak their secret.

vinchenzo-db · 2026-06-25T01:42:38Z

+	MaxRetries                *int           `yaml:"max_retries"`
+	TimeoutMinutes            *int           `yaml:"timeout_minutes"`
+	IdempotencyToken          *string        `yaml:"idempotency_token"`
+	Parameters                map[string]any `yaml:"parameters"`


Are you sure map[string]any is best here? Maybe some validation to make sure people aren't injecting bad inputs.

vinchenzo-db · 2026-06-25T01:43:52Z

+// validateExperimentName enforces the Databricks Jobs API task_key constraints:
+// the experiment_name becomes a task key, which caps at 100 characters and allows
+// only alphanumerics, hyphens, and underscores.


do we need to do something similar with mlflow_run_name?

riddhibhagwat-db temporarily deployed to test-trigger-is June 18, 2026 18:48 — with GitHub Actions Inactive

riddhibhagwat-db requested review from ben-hansen-db and maggiewang-db June 18, 2026 21:10

riddhibhagwat-db self-assigned this Jun 18, 2026

riddhibhagwat-db force-pushed the air-integration-m2-2 branch from 73088e7 to 373988d Compare June 23, 2026 17:10

riddhibhagwat-db mentioned this pull request Jun 24, 2026

AIR CLI Integration: air run end to end command #5710

Open

riddhibhagwat-db force-pushed the air-integration-m2-2 branch from 373988d to 226d41a Compare June 25, 2026 00:01

riddhibhagwat-db temporarily deployed to test-trigger-is June 25, 2026 00:02 — with GitHub Actions Inactive

riddhibhagwat-db commented Jun 25, 2026

View reviewed changes

vinchenzo-db reviewed Jun 25, 2026

View reviewed changes

		EnvVariables map[string]string `yaml:"env_variables"`
		Secrets map[string]string `yaml:"secrets"`

Uh oh!

Conversation

riddhibhagwat-db commented Jun 18, 2026

Changes

Why

Tests

Uh oh!

github-actions Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Waiting for approval

Uh oh!

eng-dev-ecosystem-bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Integration test report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

vinchenzo-db left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions Bot commented Jun 18, 2026 •

edited

Loading

eng-dev-ecosystem-bot commented Jun 18, 2026 •

edited

Loading