Commit 6ec357b

feat(cli): add retain command for gen2-migration (#14843)
* chore: remove dead code (stateful resources validation)
* chore: remove stateful resource validation tests
* chore: remove unimplemented validations
* chore: scaffold retain.ts
* chore: wire retain in gen2-migration.ts
* chore: add stack walker
* chore: add buildRetainOperation()
* chore: call the added functions
* chore: add validation helper
* chore: wire validator
* chore: update implications
* chore: rollback
* chore: add tests
* chore: update tests
* chore: minor UX improvements
* chore: minor UX improvements
* chore: minor UX improvements
* chore: minor UX improvements
* chore: minor UX improvements
* chore: minor UX improvements
* chore: minor UX improvements
* chore: minor UX improvements
* chore: minor UX improvements
* chore: minor UX improvements
* chore: fix retain validator rejecting nested stack re-evals
* chore: update tests
* chore: major UX changes
* chore: walk stack hierarchy root-first
* chore: wire retain into gen2-migration e2e
* chore: remove flaky formatting test
* chore: remove marker, skip nested-stack refs, add unlock step
* chore: lazy changeset creation to avoid OBSOLETE transitions
* chore: document retain subcommand
* chore: skip createChangeSet when retain already applied
* chore: port walker and broadened validator into lock
* chore: add resource-classification map in lock
* chore: add retain-everything ops to lock.forward
* chore: implement buildRetainOperation in lock
* chore: inline classifier and label-push helpers in lock
* chore: retire old per-resource retain loop from lock
* chore: update lock tests for retain-everything
* chore: remove retain subcommand
* chore: add 3-layer hierarchy test for lock retain walk
* chore: broaden lock rollback drift whitelist for retain-everything
* chore: add retain2 step for gen2-migration
* chore: rename retain2 to retain
* docs: document retain step for gen2-migration
* docs: tighten retain doc and classifyStacks jsdoc
* chore: revert lock.ts to origin/dev
* docs: resolve readme conflict by taking dev credential refresh section
* chore: address pr feedback on retain
1 parent 6f12bec commit 6ec357b

12 files changed

Lines changed: 822 additions & 785 deletions

.kiro/skills/gen2-migration/SKILL.md

Lines changed: 2 additions & 0 deletions
@@ -60,6 +60,8 @@ The E2E runs the following phases in order:
 17. Sandbox redeploy
 18. Gen1 tests + Gen2 tests (final)
 19. Shared data tests
+20. Retain
+21. Gen1 tests + Gen2 tests (post-retain)

 #### App directory

docs/packages/amplify-cli/src/commands/gen2-migration.md

Lines changed: 15 additions & 15 deletions
@@ -5,10 +5,10 @@ the migration of Gen1 applications to Gen2. It exposes a step-based CLI workflow
 through the complete migration process:

 1. Assessing migration readiness,
-2. Locking the Gen1 environment,
+2. Locking the Gen1 environment (retains every resource in every Gen1 stack as part of this step),
 3. Generating Gen2 code,
 4. Refactoring CloudFormation stacks to move stateful resources,
-5. Decommissioning the Gen1 environment.
+5. Retaining every resource below root so the user can safely delete the Gen1 root stack.

 The `assess` subcommand is handled separately from the step lifecycle: it is read-only and does not follow the `validate → execute → rollback` pattern. All other steps return a `Plan` object that drives a unified `describe → validate → execute` lifecycle. The `Plan` encapsulates operations and renders validation reports, operations summaries, and implications, so the top-level dispatcher orchestrates all steps uniformly without knowing their internals.

@@ -54,6 +54,7 @@ Detailed documentation for subcommands is available in:
 - [assess.md](./gen2-migration/assess.md) - Migration readiness assessment
 - [generate.md](./gen2-migration/generate.md) - Code generation pipeline for transforming Gen1 configs to Gen2 TypeScript
 - [refactor.md](./gen2-migration/refactor.md) - CloudFormation stack refactoring for moving stateful resources
+- [retain.md](./gen2-migration/retain.md) - Apply retain policies below root so Gen1 can be deleted safely

 ## Architecture

@@ -133,16 +134,16 @@ amplify gen2-migration <step> [options]

 ### Subcommands

-| Subcommand | Description | Implementation | Status |
-| -------------- | --------------------------------------------------------------------- | ------------------------------------------------------- | --------------- |
-| `assess` | Assess migration readiness for the Gen1 environment | `assess.ts` → `AmplifyMigrationAssessor` | Implemented |
-| `clone` | Clone environment for migration | `clone.ts` → `AmplifyMigrationCloneStep` | NOT IMPLEMENTED |
-| `lock` | Lock environment and enable deletion protection on stateful resources | `lock.ts` → `AmplifyMigrationLockStep` | Implemented |
-| `generate` | Generate Gen2 backend code from Gen1 configuration | `generate.ts` → `AmplifyMigrationGenerateStep` | Implemented |
-| `refactor` | Move stateful resources from Gen1 to Gen2 stacks | `refactor/refactor.ts` → `AmplifyMigrationRefactorStep` | Implemented |
-| `shift` | Shift traffic to Gen2 | `shift.ts` → `AmplifyMigrationShiftStep` | NOT IMPLEMENTED |
-| `decommission` | Delete Gen1 environment after migration | `decommission.ts` → `AmplifyMigrationDecommissionStep` | Implemented |
-| `cleanup` | Clean up migration artifacts | `cleanup.ts` → `AmplifyMigrationCleanupStep` | NOT IMPLEMENTED |
+| Subcommand | Description | Implementation | Status |
+| ---------- | ------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------- | --------------- |
+| `assess` | Assess migration readiness for the Gen1 environment | `assess.ts` → `AmplifyMigrationAssessor` | Implemented |
+| `clone` | Clone environment for migration | `clone.ts` → `AmplifyMigrationCloneStep` | NOT IMPLEMENTED |
+| `lock` | Lock environment, apply `DeletionPolicy: Retain` to every resource in every Gen1 stack, and enable DynamoDB deletion protection | `lock.ts` → `AmplifyMigrationLockStep` | Implemented |
+| `generate` | Generate Gen2 backend code from Gen1 configuration | `generate.ts` → `AmplifyMigrationGenerateStep` | Implemented |
+| `refactor` | Move stateful resources from Gen1 to Gen2 stacks | `refactor/refactor.ts` → `AmplifyMigrationRefactorStep` | Implemented |
+| `retain` | Apply retain policies to every resource in every Gen1 stack below root | `retain.ts` → `AmplifyMigrationRetainStep` | Implemented |
+| `shift` | Shift traffic to Gen2 | `shift.ts` → `AmplifyMigrationShiftStep` | NOT IMPLEMENTED |
+| `cleanup` | Clean up migration artifacts | `cleanup.ts` → `AmplifyMigrationCleanupStep` | NOT IMPLEMENTED |

 ### Global Options

@@ -157,7 +158,7 @@ amplify gen2-migration <step> [options]

 **Important considerations:**

-- The step execution order matters: lock → generate → refactor → decommission. Each step validates prerequisites from previous steps.
+- The step execution order matters: lock → generate → refactor → retain. Each step validates prerequisites from previous steps.
 - The `clone`, `shift`, and `cleanup` steps are NOT IMPLEMENTED; they throw 'Method not implemented' errors.
 - The `GEN2_MIGRATION_ENVIRONMENT_NAME` environment variable on the Amplify app tracks which environment is being migrated and prevents concurrent migrations.
 - Stateful resources (defined in `STATEFUL_RESOURCES` set) require special handling: the module prevents their deletion and enables deletion protection.
@@ -179,6 +180,5 @@ amplify gen2-migration <step> [options]
 - The `--skip-validations` flag bypasses safety checks; use with extreme caution in production.
 - Environment mismatch between local and migration target will throw an error; ensure consistency.
 - Rollback implementations are incomplete for most steps (throw 'Not Implemented' errors); manual intervention may be needed on failure.
-- The decommission step creates a changeset to analyze resources; this can time out for large stacks.
 - Cannot specify both `--rollback` and `--no-rollback` flags simultaneously.
-- The lock step's rollback does not disable deletion protection on DynamoDB tables (preserves safety).
+- The lock step's rollback removes the deny stack policy but does not undo retain policies or DynamoDB deletion protection (preserves safety).

docs/packages/amplify-cli/src/commands/gen2-migration/retain.md

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
# retain

The retain subcommand applies `DeletionPolicy: Retain` and `UpdateReplacePolicy: Retain` to every resource in every Gen1 CloudFormation stack below the root. Once applied, the user can manually delete the Gen1 root stack and every underlying AWS resource (DynamoDB tables, S3 buckets, Cognito pools, AppSync APIs, Lambdas) survives as an orphan.

## Key Responsibilities

- Walks the Gen1 stack hierarchy pre-order (parent before children) starting from the root's children; root is excluded.
- For each stack, fetches the template lazily at execute time (not at plan time) and skips the CFN round-trip entirely when every resource already has retain.
- Applies retain to every resource **except** `AWS::CloudFormation::Stack` references. Leaving nested-stack references untouched keeps the parent changeset strictly additive on non-stack attributes and avoids forcing CFN to rewrite child `Properties`.
- Creates a CloudFormation changeset per stack and validates it against an allow list before executing.
- Rollback is not supported (`NotImplementedFault`). To undo, edit the CloudFormation templates directly.

## Architecture

```mermaid
flowchart TD
    CLI["amplify gen2-migration retain"] --> STEP["AmplifyMigrationRetainStep"]
    STEP --> WALK["walkStackHierarchy(rootStackId)"]
    WALK -->|"pre-order DFS, excluding root"| IDS["stackIds[]"]
    STEP --> CLASSIFY["classifyStacks()"]
    CLASSIFY -->|"Map<stackId, DiscoveredResource>"| CTX["context"]
    IDS --> BUILDOP["buildRetainOperation(stackId, resource)"]
    CTX --> BUILDOP
    BUILDOP -->|"per-stack AmplifyMigrationOperation"| PLAN["Plan"]
    PLAN -->|"execute()"| EXEC["For each stack: fetchTemplate → mutate → createChangeSet → validate → executeChangeSet"]
```

### `AmplifyMigrationRetainStep`

[`src/commands/gen2-migration/retain.ts`](../../../../packages/amplify-cli/src/commands/gen2-migration/retain.ts)

Implements the standard step lifecycle. `forward()` builds a `Plan` of per-stack operations. `rollback()` throws `NotImplementedFault`.

### `walkStackHierarchy`

Recursive pre-order DFS over `AWS::CloudFormation::Stack` resource entries. Returns every stack in the tree except the root. Pre-order is required so parents are processed before children: any parent update triggers CFN's Automatic/Dynamic re-evaluation of nested stack references, which is benign when no `Properties` actually changed but clobbers children if retain is applied in a different order.

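The traversal order described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `StackTree` map in place of the CloudFormation calls the real walker makes, and the function signature here is illustrative rather than the one in `retain.ts`.

```typescript
// Hypothetical stand-in for CloudFormation: stack ID -> IDs of its nested
// AWS::CloudFormation::Stack children, in template order.
type StackTree = ReadonlyMap<string, readonly string[]>;

// Pre-order DFS: returns every stack below the root, parents before children.
function walkStackHierarchy(tree: StackTree, rootStackId: string): string[] {
  const ordered: string[] = [];
  const visit = (stackId: string): void => {
    for (const childId of tree.get(stackId) ?? []) {
      ordered.push(childId); // record the parent first
      visit(childId); // then descend into its own nested stacks
    }
  };
  visit(rootStackId); // the root itself is never pushed
  return ordered;
}

const tree: StackTree = new Map<string, readonly string[]>([
  ['root', ['api', 'auth']],
  ['api', ['Todo', 'ConnectionStack']],
  ['Todo', []],
  ['ConnectionStack', []],
  ['auth', []],
]);

const order = walkStackHierarchy(tree, 'root');
console.log(order); // [ 'api', 'Todo', 'ConnectionStack', 'auth' ]
```

In this toy tree `api` is listed before its per-model children, and `root` does not appear in the result.
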
### `classifyStacks`

Builds `Map<stackId, DiscoveredResource>` that associates each nested stack with its Amplify `DiscoveredResource`. Used purely for UX: `Plan.describe` groups operations under `Resource: <category>/<name> (<service>)` headers, and the execute-time spinner carries matching labels.

For AppSync, the api-stack and every one of its nested children (per-model stacks, ConnectionStack, FunctionDirectiveStack, CustomResourcesjson) share the same api resource.

Stacks not classified fall through to the default `Project` group with stack-name-only labels.

### `buildRetainOperation`

Returns one `AmplifyMigrationOperation` per stack. The operation's `execute()` is lazy: it fetches the template, filters to non-`AWS::CloudFormation::Stack` resources, mutates their `DeletionPolicy` / `UpdateReplacePolicy` to `Retain`, creates the changeset, validates it via `isAllowedRetainChangeset`, and executes it.

Idempotent on reruns: if every target resource already has retain, the whole CFN round-trip is skipped. A second short-circuit handles the case where CFN's own "no changes" detection elides an edit (`Custom::*` resources with empty Properties; see [cloudformation-coverage-roadmap#1543](https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/issues/1543)).

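The template mutation and the first idempotency short-circuit might look roughly like the sketch below. The `Template` shape, `applyRetain`, and `hasRetain` are hypothetical names introduced for illustration; the real step also creates, validates, and executes a changeset, which is omitted here.

```typescript
// Simplified, assumed template shape; real templates carry many more fields.
interface TemplateResource {
  Type: string;
  DeletionPolicy?: string;
  UpdateReplacePolicy?: string;
  Properties?: Record<string, unknown>;
}
interface Template {
  Resources: Record<string, TemplateResource>;
}

const NESTED_STACK_TYPE = 'AWS::CloudFormation::Stack';

const hasRetain = (r: TemplateResource): boolean =>
  r.DeletionPolicy === 'Retain' && r.UpdateReplacePolicy === 'Retain';

// Returns a retained copy of the template, or undefined when every target
// resource already has retain (the caller can then skip the CFN round-trip).
function applyRetain(template: Template): Template | undefined {
  const targets = Object.entries(template.Resources).filter(([, r]) => r.Type !== NESTED_STACK_TYPE);
  if (targets.every(([, r]) => hasRetain(r))) return undefined;

  const resources: Record<string, TemplateResource> = { ...template.Resources };
  for (const [logicalId, resource] of targets) {
    resources[logicalId] = { ...resource, DeletionPolicy: 'Retain', UpdateReplacePolicy: 'Retain' };
  }
  return { ...template, Resources: resources };
}

const sample: Template = {
  Resources: {
    Table: { Type: 'AWS::DynamoDB::Table' },
    Nested: { Type: 'AWS::CloudFormation::Stack' },
  },
};

const retained = applyRetain(sample); // Table gains both policies, Nested is untouched
const rerun = retained ? applyRetain(retained) : undefined; // undefined: nothing left to do
```

Returning `undefined` instead of an unchanged template is one way to make the "skip the round-trip" decision explicit at the call site.
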
### `isAllowedRetainChangeset`

Allow-lists a retain-only changeset. Accepts two kinds of changes:

- Direct `DeletionPolicy` or `UpdateReplacePolicy` edits targeting `Retain`.
- CFN's own no-op Automatic/Dynamic re-evaluations on `AWS::CloudFormation::Stack` references, emitted on every parent update. These are bookkeeping: they don't trigger child reconciliation because no `Properties` value actually changed.

Any other change (Properties edits on non-stack resources, Add, Remove, Replacement=True) throws `MigrationError`.

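The two accepted change kinds can be sketched as a predicate over a simplified change record. The interfaces below are an assumed subset of the fields CloudFormation reports when describing a changeset, and `isAllowedRetainChangesetSketch` is a hypothetical name: unlike the real validator it returns a boolean instead of throwing `MigrationError`, and it checks only which attribute changed, not the target value.

```typescript
// Assumed subset of the change fields reported for a changeset.
interface ChangeDetail {
  Target?: { Attribute?: string };
  Evaluation?: string; // 'Static' | 'Dynamic'
  ChangeSource?: string; // e.g. 'DirectModification' | 'Automatic'
}
interface ResourceChange {
  Action?: string; // 'Add' | 'Modify' | 'Remove' | ...
  ResourceType?: string;
  Replacement?: string; // 'True' | 'False' | 'Conditional'
  Details?: ChangeDetail[];
}

const RETAIN_ATTRIBUTES = new Set(['DeletionPolicy', 'UpdateReplacePolicy']);

function isAllowedDetail(change: ResourceChange, detail: ChangeDetail): boolean {
  // Kind 1: an edit to one of the two retain-related attributes.
  if (RETAIN_ATTRIBUTES.has(detail.Target?.Attribute ?? '')) return true;
  // Kind 2: CFN's Automatic/Dynamic re-evaluation of a nested stack reference.
  return (
    change.ResourceType === 'AWS::CloudFormation::Stack' &&
    detail.ChangeSource === 'Automatic' &&
    detail.Evaluation === 'Dynamic'
  );
}

function isAllowedRetainChangesetSketch(changes: ResourceChange[]): boolean {
  return changes.every(
    (change) =>
      change.Action === 'Modify' &&
      change.Replacement !== 'True' &&
      (change.Details ?? []).every((detail) => isAllowedDetail(change, detail)),
  );
}

const retainOnly: ResourceChange = {
  Action: 'Modify',
  ResourceType: 'AWS::DynamoDB::Table',
  Replacement: 'False',
  Details: [{ Target: { Attribute: 'DeletionPolicy' }, Evaluation: 'Static', ChangeSource: 'DirectModification' }],
};
const nestedStackReeval: ResourceChange = {
  Action: 'Modify',
  ResourceType: 'AWS::CloudFormation::Stack',
  Replacement: 'False',
  Details: [{ Target: { Attribute: 'Properties' }, Evaluation: 'Dynamic', ChangeSource: 'Automatic' }],
};
const propertyEdit: ResourceChange = {
  Action: 'Modify',
  ResourceType: 'AWS::Lambda::Function',
  Replacement: 'False',
  Details: [{ Target: { Attribute: 'Properties' }, Evaluation: 'Static', ChangeSource: 'DirectModification' }],
};

console.log(isAllowedRetainChangesetSketch([retainOnly, nestedStackReeval])); // true
console.log(isAllowedRetainChangesetSketch([retainOnly, propertyEdit])); // false
```
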
## Design Notes

### Why skip the root stack

Updating root post-refactor risks cascading TemplateURL reconciliation through the entire tree. The root is left alone; the user manually deletes it after retain completes, and CFN cascades through the already-retained children.

### Why skip `AWS::CloudFormation::Stack` entries

Adding retain to a nested stack reference would be a no-op for child protection: the child's own retain state is what matters when the cascade delete hits it. Leaving the reference entry untouched keeps the parent changeset narrow (only non-stack attributes change) and keeps Plan output readable.

### Why lazy over eager

**Eager:** create all N changesets up front during `forward()` / plan time, then execute them sequentially. The problem is that when any parent's changeset is executed, every child stack's pre-created changeset goes `OBSOLETE`, because CFN marks pending changesets stale on any stack update. You end up re-creating most of the changesets anyway.

**Lazy:** defer changeset creation until each operation's `execute()` runs. Each stack's round-trip is `fetchTemplate → createChangeSet → executeChangeSet`, back-to-back, with no gap for the changeset to go stale. The template and parameters reflect the current deployed state at the moment we create the changeset.

Lazy wins because it avoids the OBSOLETE churn and keeps each operation self-contained.

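A minimal sketch of the lazy shape, assuming a hypothetical synchronous `StackClient` in place of the real asynchronous CloudFormation client: building the operations makes no calls, and each `execute()` runs its three calls back-to-back.

```typescript
// Hypothetical, synchronous stand-in for the CloudFormation calls the step makes.
interface StackClient {
  fetchTemplate(stackId: string): string;
  createChangeSet(stackId: string, template: string): string; // returns a changeset name
  executeChangeSet(stackId: string, changeSetName: string): void;
}
interface Operation {
  describe(): string;
  execute(): void;
}

// Plan time only captures the stack ID; nothing touches the client until execute().
function buildLazyOperation(client: StackClient, stackId: string): Operation {
  return {
    describe: () => `Apply retain to ${stackId}`,
    execute: () => {
      const template = client.fetchTemplate(stackId);
      const changeSetName = client.createChangeSet(stackId, template);
      client.executeChangeSet(stackId, changeSetName);
    },
  };
}

const calls: string[] = [];
const recordingClient: StackClient = {
  fetchTemplate: (id) => {
    calls.push(`fetch:${id}`);
    return '{}';
  },
  createChangeSet: (id) => {
    calls.push(`create:${id}`);
    return `retain-${id}`;
  },
  executeChangeSet: (id) => {
    calls.push(`execute:${id}`);
  },
};

const operations = ['parent', 'child'].map((id) => buildLazyOperation(recordingClient, id));
const callsAtPlanTime = calls.length; // 0: building the plan made no calls
operations.forEach((op) => op.execute());
console.log(calls);
```

The recorded calls show the child's changeset being created only after the parent's has executed, which is the property that avoids `OBSOLETE` changesets.
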
### Why pre-order

A parent update must land before its children's updates. If a child is retained first and the parent is updated next, CFN emits an Automatic/Dynamic re-evaluation on that child's reference. The re-evaluation is structurally benign (Target.Attribute=Properties, no actual value diff), but ordering matters for operator confidence: pre-order ensures the changeset inventory at each step is understandable.

## AI Development Notes

- The step runs after `lock`, `generate`, `refactor`, and user-side Gen2 sandbox validation in the e2e flow. At that point Gen1 stacks have drifted from their S3 `TemplateURL` (refactor moved resources without re-uploading child templates). Updating intermediate stacks or the root post-refactor is unsafe in general; skipping `AWS::CloudFormation::Stack` entries in parent templates is what makes updating intermediates safe here.
- Resources gated by a false `Condition` (for example the AppSync-generated `CustomResourcesjson` stack's `EmptyResource`, which has `Condition: AlwaysFalse`) are never deployed. A retain-only edit on such a resource produces an empty changeset: CFN returns "didn't contain changes" and the `!changeset` branch treats it as a no-op. The resource doesn't actually exist in the running stack, so there's nothing to retain. Purely cosmetic.
- To undo retain, redeploy the retained templates through CloudFormation manually with the `DeletionPolicy`/`UpdateReplacePolicy` attributes removed. The step itself has no rollback path.
- When adding a new resource type to `KNOWN_RESOURCE_KEYS`, update `classifyStacks`; the exhaustive switch will force the compiler to flag it.
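
For the manual undo mentioned in the notes above, the template edit could be scripted along these lines. This is a hedged sketch with a hypothetical `stripRetain` helper and an assumed template shape; it is not part of the step, and it also removes any `Retain` policy that existed before the step ran, so review the result before redeploying.

```typescript
// Simplified, assumed template shape.
interface TemplateResource {
  Type: string;
  DeletionPolicy?: string;
  UpdateReplacePolicy?: string;
  [key: string]: unknown;
}
interface Template {
  Resources: Record<string, TemplateResource>;
}

// Returns a copy of the template with Retain policies removed from every
// resource except nested stack references, which the retain step never edits.
function stripRetain(template: Template): Template {
  const resources: Record<string, TemplateResource> = {};
  for (const [logicalId, resource] of Object.entries(template.Resources)) {
    const copy: TemplateResource = { ...resource };
    if (copy.Type !== 'AWS::CloudFormation::Stack') {
      if (copy.DeletionPolicy === 'Retain') delete copy.DeletionPolicy;
      if (copy.UpdateReplacePolicy === 'Retain') delete copy.UpdateReplacePolicy;
    }
    resources[logicalId] = copy;
  }
  return { ...template, Resources: resources };
}

const retainedTemplate: Template = {
  Resources: {
    Bucket: { Type: 'AWS::S3::Bucket', DeletionPolicy: 'Retain', UpdateReplacePolicy: 'Retain' },
  },
};

const stripped = stripRetain(retainedTemplate);
console.log(JSON.stringify(stripped)); // {"Resources":{"Bucket":{"Type":"AWS::S3::Bucket"}}}
```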

packages/amplify-cli/package.json

Lines changed: 0 additions & 1 deletion
@@ -87,7 +87,6 @@
     "amplify-nodejs-function-runtime-provider": "2.5.33",
     "amplify-python-function-runtime-provider": "2.4.55",
     "aws-cdk-lib": "~2.189.1",
-    "bottleneck": "2.19.5",
     "cdk-from-cfn": "^0.291.0",
     "chalk": "^4.1.1",
     "ci-info": "^3.8.0",
