[DRAFT] MAINT: Production schema migration guard + add deliberate migration script for release#2028
[DRAFT] MAINT: Production schema migration guard + add deliberate migration script for release#2028jsong468 wants to merge 2 commits into
Conversation
| **Run the migration** (reads `AZURE_SQL_DB_CONNECTION_STRING_PROD` from `~/.pyrit/.env`): | ||
|
|
||
| ```bash | ||
| python build_scripts/migrate_prod_memory_schema.py --target-revision <revision_id> |
There was a problem hiding this comment.
I think we always want to migrate to the latest revision available in the release. so we can probaly merge these two together and not require someone to print the revision Id, just to paste it to another command)
(or just use alembic's UPGRADE command)
There was a problem hiding this comment.
I was thinking this could safeguard against someone accidentally migrating to a version later than what we want for release but happy to merge these together!
| @@ -0,0 +1,317 @@ | |||
| # Copyright (c) Microsoft Corporation. | |||
There was a problem hiding this comment.
I'm wondering if adding a 300 line script is a bit of overhead, with a lot of overlap with other migration code (i.e. the run_migrations functions there which upgrades a db given its engine.
I think this should be a thin wrapper that constructs an AzureSQLMemory with skip=true and the prod connection string, then calls its _run_schema_migration ... no?
Description
PyRIT's
AzureSQLMemoryrunsalembic upgrade headautomatically on construction, meaning any user connecting to a shared Azure SQL database triggers a schema upgrade using whatever migration files are on their machine. This PR prevents accidental schema modifications to production and adds a deliberate release-time migration path.Runtime Guard (
AzureSQLMemory)AZURE_SQL_DB_CONNECTION_STRING_PROD, the constructor now runs check-only mode instead of full migrationskip_schema_migration=Truebypasses both guard and migration entirely_check_schema_migrationonMemoryInterface_run_schema_migrationto keep both code paths at the same abstraction level_run_schema_migration: upgrade + verify (unchanged)_check_schema_migration: verify only (new)_runprevents callers from accidentally skipping verificationRelease Migration Script (
build_scripts/migrate_prod_memory_schema.py)--target-revision(explicit revision ID, nothead) so migration is deterministic and tied to the exact releasereleases/branch, clean working tree inpyrit/memory/, no.devversion suffixAZURE_SQL_DB_CONNECTION_STRING_PRODfrom~/.pyrit/.envautomatically (same files asinitialize_pyrit_async), loaded withoverride=Falseso explicit env vars take precedencecheck_schema_migrationsafter upgrade to confirm models matchengine.begin()(transactional DDL)--skip-environment-checkflag for CI pipelines where Git state may differmemory_migrations.py headSubcommandgenerateandchecksubcommandsDesign Decisions
main(with unreleased schema changes) connects to prod, the guard detects the mismatch but only logs a warning — it does not block startup. This preserves the primary goal (no accidental schema modification) while keeping prod usable for querying data. The previous iteration raisedAutogenerateDiffsDetected, which blocked developers from using prod at all after any schema PR merged tomain. Question for reviewers: should the mismatch behavior be configurable (e.g., warn vs. raise)? should an error always be raised? or is warn-only the right permanent default?RuntimeErrorunconditionally when connecting to prod, but this would force every user to passskip_schema_migration=True, which also skips schema verification. Check-only lets developers connect to prod normally while ensuring the schema is never modified outside of a release.head: Using--target-revisioninstead ofupgrade headensures the migration is deterministic. If unreleased migration files exist locally,headwould apply them; an explicit revision tied to the release prevents this._check_schema_migrationas a method onMemoryInterface: Rather than importingcheck_schema_migrationsdirectly frommigration.pyinAzureSQLMemory, we wrapped it in a method on the base class. This keeps both migration paths (_runand_check) at the same abstraction level, makes it reusable forSQLiteMemoryif needed, and simplifies testing (patch.objectinstead of module-path patching).Tests and Documentation
Unit Tests Added
5 tests in
test_azure_sql_memory.py(prod guard):skip_schema_migration=Truebypasses guard and migration4 tests in
test_migration.py:_check_schema_migrationdelegates tocheck_schema_migrations_check_schema_migrationlogs a warning on mismatch instead of raising_check_schema_migrationraisesRuntimeErrorwhen engine is Nonememory_migrations.py headoutputs a valid hex revision IDManual E2E Verification
.envloading works correctly (override=False)Documentation
doc/contributing/10_release_process.md