Control Tower updates happen several times a year, and each one brings new features and security improvements. But I’ve seen plenty of teams hit a wall when they try to update their Landing Zone. You click “Update landing zone” in the console, the update starts, and then partway through it fails. The console shows “UPDATE_FAILED” and you’re left wondering what went wrong. In this post, I’ll show you exactly how to diagnose and recover from these failures.

The Problem

When Control Tower updates a Landing Zone, it deploys new baseline CloudFormation StackSets to the Log Archive and Audit accounts, and then rolls out updates to all member accounts. If any of these deployments fails, the update as a whole fails, and the Landing Zone can be left partially updated, with some accounts on the new baseline and others still on the old one.

Error Type                Description
UPDATE_FAILED             CloudFormation stack in a failed state blocking the update
Drift detected            Manual changes detected in the Log Archive or Audit account
Stack operation timeout   S3 bucket conflicts or drift repairs taking too long
Permission denied         AWSControlTowerAdmin role missing required permissions
Stack creation failed     CloudFormation template validation or resource limit issue

Why Does This Happen?

  • Account has a CloudFormation stack in a failed state: If any member account or shared account (Log Archive or Audit) has a stack left over from a previous failed Control Tower update, the new StackSet update cannot be applied until that stack is remediated.
  • Manual drift in the Log Archive or Audit account: Control Tower expects these accounts to be in a known state. If you manually modified CloudTrail, Config, or SNS resources, Control Tower detects the drift and blocks the update until you repair it.
  • S3 bucket name conflicts: Control Tower bucket names include account IDs and regions. If you’ve created buckets whose names collide with them, or Control Tower buckets from previous updates still exist, the new update might fail on bucket creation.
  • Insufficient role permissions: The AWSControlTowerAdmin role (a service role, not a service-linked role) needs permissions to deploy and update CloudFormation stacks. If the role or its policies were modified, the update fails.
  • Concurrent manual CloudFormation operations: If you’re deploying custom stacks to a member account while Control Tower is updating it, the StackSet operation might fail due to resource locks.

The Fix

Start by checking for drift and failed stacks. Then determine if you need to repair drift or manually fix a failed stack.

Step 1: Check for Drift

In the Control Tower console:

Control Tower → Landing Zone → Check for drift

This scan identifies accounts or OUs that have drifted from Control Tower’s baseline configuration. Wait for the scan to complete (usually 5–15 minutes).
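
If you prefer the CLI, the landing zone's drift status and version can also be read through the Control Tower API. Here is a minimal sketch, assuming AWS CLI v2 with the controltower commands and credentials for the management account. The function name landing_zone_summary is my own, not something AWS ships.

```shell
# Sketch: summarize the landing zone from the management account.
# landing_zone_summary is a hypothetical helper name used for this post.
landing_zone_summary() {
  local arn
  # list-landing-zones returns the ARN of the landing zone in this account
  arn=$(aws controltower list-landing-zones \
    --query 'landingZones[0].arn' --output text) || return 1
  # Prints one tab-separated line:
  # deployed version, latest available version, drift status, landing zone status
  aws controltower get-landing-zone \
    --landing-zone-identifier "$arn" \
    --query 'landingZone.[version, latestAvailableVersion, driftStatus.status, status]' \
    --output text
}
```

Call landing_zone_summary with no arguments. A drift status of DRIFTED means you should work through the next steps before retrying the update.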

Step 2: Review the Drift Report

The console shows which accounts or resources have drifted. For each drifted account or OU:

Control Tower → Organization → [Select drifted account/OU] → View drift details

Common drift causes: manually deleted CloudTrail trails, modified Config recorder settings, or modified SNS topics.

Step 3: Repair Drift

To repair drift, select the account or OU and click Repair:

Control Tower → Organization → [Select account/OU] → Repair

Control Tower re-applies the baseline configuration. This is safe in the sense that it only restores what the baseline says should be there, but be aware that it also overwrites the manual changes that caused the drift. Repair typically takes 10–20 minutes per account.
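
For drift at the landing zone level, the CLI has a reset operation. I am treating it here as the scripted counterpart of the console repair, which is my assumption, so read the API documentation before running it against production. The two function names below are hypothetical helpers, and the ARN argument is whatever list-landing-zones returns for your account.

```shell
# Sketch: start a landing zone reset and check on it later.
# Both function names are hypothetical helpers for this post.

# $1: landing zone ARN. Prints the operation identifier.
start_landing_zone_reset() {
  aws controltower reset-landing-zone \
    --landing-zone-identifier "$1" \
    --query 'operationIdentifier' --output text
}

# $1: operation identifier. Prints IN_PROGRESS, SUCCEEDED, or FAILED.
landing_zone_operation_status() {
  aws controltower get-landing-zone-operation \
    --operation-identifier "$1" \
    --query 'operationDetails.status' --output text
}
```

A typical session would capture the identifier with op=$(start_landing_zone_reset "$arn") and then run landing_zone_operation_status "$op" every few minutes.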

Step 4: Check for Failed CloudFormation Stacks

If drift repair doesn’t resolve the issue, check for failed stacks in member accounts:

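# Run with credentials for the member account you are checking
aws cloudformation list-stacks \
  --stack-status-filter CREATE_FAILED UPDATE_FAILED ROLLBACK_FAILED UPDATE_ROLLBACK_FAILED DELETE_FAILED \
  --region us-east-1 \
  --output table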
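
Logging in to every member account gets tedious, so it can help to look at the StackSet instances from the management account first. The sketch below assumes the Control Tower StackSets in your account start with the prefix AWSControlTower (check this with list-stack-sets) and that you run it in the landing zone's home Region. non_current_instances is a name I picked for illustration.

```shell
# Sketch: list Control Tower StackSet instances that are not CURRENT.
# $1: home Region of the landing zone, for example us-east-1.
# Assumption: Control Tower StackSet names start with "AWSControlTower".
non_current_instances() {
  local region="$1" name
  for name in $(aws cloudformation list-stack-sets --status ACTIVE --region "$region" \
      --query "Summaries[?starts_with(StackSetName, 'AWSControlTower')].StackSetName" \
      --output text); do
    # Instance status is CURRENT, OUTDATED, or INOPERABLE
    aws cloudformation list-stack-instances --stack-set-name "$name" --region "$region" \
      --query "Summaries[?Status != 'CURRENT'].[Account, Region, Status, StatusReason]" \
      --output text |
      while IFS= read -r line; do
        # Prefix each row with the StackSet it belongs to
        printf '%s\t%s\n' "$name" "$line"
      done
  done
}
```

Each output row names the StackSet, the account, the Region, and the reason CloudFormation recorded, which tells you which member account to inspect with list-stacks.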

If you find a failed stack, delete it (ensure it’s safe to delete first):

aws cloudformation delete-stack \
  --stack-name StackName \
  --region us-east-1
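
Because deleting the wrong stack is the one genuinely risky step here, I like to wrap the delete in a small guard. This is a sketch, assuming that stacks deployed by Control Tower StackSets are named with the prefix StackSet-AWSControlTower. Confirm that against the stack names in your own accounts, since both the prefix check and the function names are my own choices.

```shell
# Sketch: only delete stacks whose names look like Control Tower StackSet instances.
# Assumption: such stacks are named "StackSet-AWSControlTower...".
is_control_tower_stack() {
  case "$1" in
    StackSet-AWSControlTower*) return 0 ;;
    *) return 1 ;;
  esac
}

# $1: stack name, $2: Region
delete_control_tower_stack() {
  if ! is_control_tower_stack "$1"; then
    echo "Refusing to delete '$1': name does not start with StackSet-AWSControlTower" >&2
    return 1
  fi
  # Delete, then block until CloudFormation reports the deletion finished
  aws cloudformation delete-stack --stack-name "$1" --region "$2" &&
    aws cloudformation wait stack-delete-complete --stack-name "$1" --region "$2"
}
```

The guard is deliberately strict: a custom stack never matches the prefix, so the function returns an error instead of calling delete-stack.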

Step 5: Verify the AWSControlTowerAdmin Role

Check that the AWSControlTowerAdmin role still exists and has not been modified:

aws iam get-role \
  --role-name AWSControlTowerAdmin

If the role is missing or was modified, you may need to contact AWS Support to restore it. Control Tower creates this role during Landing Zone setup, and it should not be modified.
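
Note that get-role returns the role's metadata and trust policy, not its permissions. To see what is attached, you also need the policy listings. Here is a sketch with a hypothetical helper name. On an unmodified landing zone I would expect controltower.amazonaws.com as the trusted service and a managed policy named AWSControlTowerServiceRolePolicy, but treat both as assumptions and compare against the current AWS documentation.

```shell
# Sketch: show who can assume the role and which policies it carries.
# $1: role name (defaults to AWSControlTowerAdmin)
describe_admin_role() {
  local role="${1:-AWSControlTowerAdmin}"
  echo "Trusted services:"
  aws iam get-role --role-name "$role" \
    --query 'Role.AssumeRolePolicyDocument.Statement[].Principal.Service' --output text
  echo "Attached managed policies:"
  aws iam list-attached-role-policies --role-name "$role" \
    --query 'AttachedPolicies[].PolicyName' --output text
  echo "Inline policies:"
  aws iam list-role-policies --role-name "$role" \
    --query 'PolicyNames' --output text
}
```

If a policy is missing or an unexpected one appears, that is the detail to include when you open the support case.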

How to Run This

  1. Open the Control Tower console in your management account.
  2. Go to Landing Zone → Check for drift and wait for the scan to complete.
  3. Review the drift details for any drifted accounts or OUs.
  4. Select each drifted resource and click Repair.
  5. Wait for repairs to complete (check the CloudFormation console to monitor StackSet operations).
  6. Run the CloudFormation list-stacks command to check for failed stacks in member accounts.
  7. Delete any failed stacks you find (after confirming it’s safe).
  8. Return to Control Tower → Landing Zone → Update landing zone and click “Update” again.
  9. Monitor the update progress — it should complete successfully this time.
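
For the last step, you can poll the landing zone status from a terminal instead of refreshing the console. This is a sketch, assuming the status values ACTIVE, PROCESSING, and FAILED returned by get-landing-zone. The function name and the default interval and poll count are my own choices.

```shell
# Sketch: poll the landing zone until it leaves the PROCESSING state.
# $1: landing zone ARN
# $2: seconds between polls (default 60)
# $3: maximum number of polls (default 120)
wait_for_landing_zone() {
  local arn="$1" interval="${2:-60}" max="${3:-120}" i=0 status
  while [ "$i" -lt "$max" ]; do
    status=$(aws controltower get-landing-zone --landing-zone-identifier "$arn" \
      --query 'landingZone.status' --output text) || return 2
    case "$status" in
      ACTIVE) echo "ACTIVE"; return 0 ;;
      FAILED) echo "FAILED"; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  echo "TIMED_OUT"
  return 3
}
```

Start it once the console shows the update in progress. If you start it before the update begins, it reports the status left over from the previous attempt.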

Is This Safe?

Yes. Repairing drift is safe because Control Tower only restores the expected baseline configuration. Deleting a failed CloudFormation stack is safe if the stack is from a previous Control Tower update, because the StackSet deployment is idempotent and recreates the stack on the next update. However, never delete custom stacks you’ve deployed yourself.

Key Takeaway

Landing Zone update failures are usually caused by drift in the Log Archive or Audit account or by failed CloudFormation stacks. Use the drift detection feature, repair drifted resources, and delete failed stacks. Then retry the update.

