Troubleshoot Landing Zone StackSets Failing Across OUs

I was helping a customer troubleshoot why their Landing Zone StackSet was failing for some accounts but not others. They had 40 accounts spread across multiple OUs, and randomly 5-6 of them would show “OUTDATED” status. After checking IAM execution roles and CloudFormation stack status across all accounts, we discovered three things: two accounts had their execution roles deleted, one account was in a suspended state in AWS Organizations, and one had an SCP that was blocking CloudFormation. In this post, I’ll walk through exactly what causes StackSet failures across OUs and how to fix them.

The Problem

Your Landing Zone StackSets deploy successfully to most accounts but fail for some. In the StackSet console you see stack instances marked “OUTDATED” and operations marked “FAILED”. Some accounts have the baseline stack, others don’t. The issue is inconsistent: it’s not every account in a specific OU, just seemingly random ones.

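StackSet Instance Status	Meaning
CURRENT	Stack is deployed and up to date with the StackSet
OUTDATED	Stack is not up to date with the StackSet, usually because the last create or update operation failed or was stopped for this account and Region
FAILED	Detailed status: the last operation failed for this instance, with the reason in StatusReason
RUNNING	Detailed status: an operation is currently running for this instance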

Why Does This Happen?

CloudFormation StackSets deploy stacks to multiple accounts and regions in parallel. If any account has permissions issues, resource conflicts, or is in an invalid state, the deployment fails for that account while others succeed:

Missing execution role: The AWSCloudFormationStackSetExecutionRole doesn’t exist in some accounts. This can happen if an account was provisioned before the role was added to your baseline, or if the role was accidentally deleted.
Execution role doesn’t trust management account: The role exists but the trust policy doesn’t allow the management account to assume it. This breaks cross-account deployment.
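Account is suspended or closed: AWS Organizations reports an account as “SUSPENDED” once it has been closed or has been suspended by AWS (for example, for payment issues or a terms violation). StackSets can’t deploy to these accounts and may report them with the detailed status SKIPPED_SUSPENDED_ACCOUNT.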
SCP blocks CloudFormation: If an account has a Service Control Policy that denies cloudformation:* or specific CF actions, the StackSet can’t deploy.
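Concurrent StackSet operations limit: By default a StackSet runs one operation at a time, and CloudFormation also caps how many stack instance operations can run concurrently in the administrator account. If you’re deploying multiple StackSets in parallel to the same accounts, operations can be queued or throttled. Check the CloudFormation quotas page for the current limits.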
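Mix of self-managed and service-managed permissions: A StackSet uses either self-managed permissions (you create the administration and execution roles) or service-managed permissions (AWS Organizations creates the roles for you). A self-managed StackSet fails in any account that only has the service-managed roles, so accounts onboarded under different models deploy inconsistently.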

The Fix

Step 1: List All StackSet Instances and Find Failures

Start by identifying which accounts are failing:

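# list-stack-instances has no STATUS filter, so filter on the client side with --query
aws cloudformation list-stack-instances \
  --stack-set-name AWS-Landing-Zone-Baseline \
  --query "Summaries[?Status!='CURRENT'].[Account,Region,Status,StatusReason]" \
  --output table \
  --region us-east-1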

This shows all stack instances that are not in “CURRENT” status. Note the account IDs.

Get more details on a specific instance:

aws cloudformation describe-stack-instance \
  --stack-set-name AWS-Landing-Zone-Baseline \
  --stack-instance-account 123456789012 \
  --stack-instance-region us-east-1 \
  --region us-east-1

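The output includes StackInstance.StatusReason with the specific error when one was recorded, and StackInstance.StackInstanceStatus.DetailedStatus with the result of the last operation (for example FAILED or SKIPPED_SUSPENDED_ACCOUNT).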
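If more than a handful of instances are failing, it helps to boil Step 1 down to a plain list of affected account IDs. The sketch below is one way to do that. The function names are my own, and it assumes the AWS CLI is configured for the account that administers the StackSet.

```shell
# Print "account region status reason" for every stack instance that is not CURRENT.
# Arguments: $1 = StackSet name, $2 = Region where the StackSet is administered.
list_unhealthy_instances() {
  aws cloudformation list-stack-instances \
    --stack-set-name "$1" \
    --region "$2" \
    --query "Summaries[?Status!='CURRENT'].[Account,Region,Status,StatusReason]" \
    --output text
}

# Reduce that output to a sorted, de-duplicated list of account IDs.
unhealthy_accounts() {
  list_unhealthy_instances "$1" "$2" | awk '{ print $1 }' | sort -u
}

# Example (not run here):
#   unhealthy_accounts AWS-Landing-Zone-Baseline us-east-1
```

An account that fails in several Regions shows up once, which makes the list easy to feed into the per-account checks in the following steps.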

Step 2: Check the Execution Role in Each Failed Account

For each account that’s failing, verify the execution role exists:

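# Assume a role in the failed account and export the temporary credentials
# (assume-role only prints credentials; later commands ignore them unless exported)
read -r AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN <<< "$(aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/OrganizationAccountAccessRole \
  --role-session-name stackset-check \
  --duration-seconds 3600 \
  --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
  --output text)"
export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN

# Then check if the execution role exists
aws iam get-role \
  --role-name AWSCloudFormationStackSetExecutionRole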

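If the role doesn’t exist, you need to create it. Get the trust policy from the AWS Landing Zone documentation and create the role. The role also needs a permissions policy: a role with only a trust policy can be assumed but can’t create any resources.

# Create the execution role in the target account
aws iam create-role \
  --role-name AWSCloudFormationStackSetExecutionRole \
  --assume-role-policy-document file://trust-policy.json
# Attach the permissions the baseline template needs (AdministratorAccess shown; scope down if you can)
aws iam attach-role-policy --role-name AWSCloudFormationStackSetExecutionRole \
  --policy-arn arn:aws:iam::aws:policy/AdministratorAccess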
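Repeating the role check from this step by hand gets tedious with more than a couple of accounts. Here is a rough sketch of how it could be looped. The helper names run_in_account and check_execution_roles are hypothetical, and it assumes OrganizationAccountAccessRole exists in every member account, which is not guaranteed for accounts that were invited rather than created in the organization.

```shell
# Run a command with temporary credentials for a role in another account.
# Arguments: $1 = account ID, $2 = role to assume, remaining = command to run.
# The credentials are exported inside a subshell only, so the caller's
# environment is left untouched.
run_in_account() {
  ria_account="$1"
  ria_role="$2"
  shift 2
  (
    ria_creds="$(aws sts assume-role \
      --role-arn "arn:aws:iam::${ria_account}:role/${ria_role}" \
      --role-session-name stackset-check \
      --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
      --output text)" || exit 1
    # --output text returns the three values separated by tabs
    AWS_ACCESS_KEY_ID="$(printf '%s\n' "$ria_creds" | cut -f1)"
    AWS_SECRET_ACCESS_KEY="$(printf '%s\n' "$ria_creds" | cut -f2)"
    AWS_SESSION_TOKEN="$(printf '%s\n' "$ria_creds" | cut -f3)"
    export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
    "$@"
  )
}

# Report, for each account ID given, whether the execution role can be read.
check_execution_roles() {
  for account in "$@"; do
    if run_in_account "$account" OrganizationAccountAccessRole \
        aws iam get-role --role-name AWSCloudFormationStackSetExecutionRole \
        >/dev/null 2>&1; then
      echo "$account present"
    else
      echo "$account missing-or-inaccessible"
    fi
  done
}

# Example (not run here):
#   check_execution_roles 123456789012 210987654321
```

A "missing-or-inaccessible" result covers both a deleted role and a failed assume-role call, so rerun the failing command without the output redirection to see which one it was.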

Step 3: Verify the Trust Relationship

The execution role must have a trust relationship that allows the management account to assume it. Check the trust policy:

# In the target account, get the trust policy
aws iam get-role \
  --role-name AWSCloudFormationStackSetExecutionRole \
  --query 'Role.AssumeRolePolicyDocument'

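The policy should allow the role that administers the StackSet in the management account to assume the execution role. With default self-managed permissions that role is AWSCloudFormationStackSetAdministrationRole; this example uses aws-accelerator-role, so substitute whatever your StackSet is configured with. The CloudFormation service principal is trusted by the administration role, not by the execution role: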

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::MANAGEMENT-ACCOUNT-ID:role/aws-accelerator-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

If the trust policy is wrong, update it:

aws iam update-assume-role-policy \
  --role-name AWSCloudFormationStackSetExecutionRole \
  --policy-document file://trust-policy.json
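If you don’t have a trust-policy.json handy, something like the following could generate one in the shape shown above. Treat it as a sketch: write_trust_policy is a name I made up, and the account ID and role name are whatever your own StackSet administration setup uses.

```shell
# Print a trust policy that lets one role in the administrator account
# assume the execution role.
# Arguments: $1 = 12-digit ID of the account that administers the StackSet,
#            $2 = name of the administration role in that account.
write_trust_policy() {
  if ! printf '%s' "$1" | grep -Eq '^[0-9]{12}$'; then
    echo "account ID must be 12 digits: $1" >&2
    return 1
  fi
  if [ -z "$2" ]; then
    echo "administration role name is required" >&2
    return 1
  fi
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::$1:role/$2"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Example (not run here; the account ID is a placeholder):
#   write_trust_policy 111111111111 AWSCloudFormationStackSetAdministrationRole > trust-policy.json
```

Rejecting anything that isn’t a 12-digit ID catches the common mistake of leaving a placeholder such as MANAGEMENT-ACCOUNT-ID in the file.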

Step 4: Check Account Status in Organizations

Verify the account isn’t suspended or closed:

aws organizations describe-account \
  --account-id 123456789012

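Look for Status: ACTIVE. If it’s SUSPENDED or PENDING_CLOSURE, StackSets can’t deploy to the account. Contact AWS Support if the account should be active. If it was closed on purpose, remove its stack instances from the StackSet so it stops showing up as a failure.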
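Rather than describing accounts one at a time, you could pull every non-active account once and compare it with your list of failing account IDs. This is a minimal sketch with hypothetical function names, and it assumes you run it from the management account or a delegated administrator.

```shell
# Print "id status name" for every account Organizations does not report as ACTIVE.
inactive_accounts() {
  aws organizations list-accounts \
    --query "Accounts[?Status!='ACTIVE'].[Id,Status,Name]" \
    --output text
}

# Read account IDs on stdin (one per line) and print "id status" for each
# one that appears in the non-active list.
flag_inactive() {
  inactive_list="$(inactive_accounts)"
  while read -r acct_id; do
    # Appending "" forces a string comparison so leading zeros are preserved
    printf '%s\n' "$inactive_list" |
      awk -v id="$acct_id" '($1 "") == (id "") { print $1, $2 }'
  done
}

# Example (not run here):
#   printf '123456789012\n210987654321\n' | flag_inactive
```

Any ID it prints can be dropped from the role and SCP checks, since the account status alone explains the failure.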

Step 5: Check for Blocking SCPs

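List the SCPs attached directly to the account. SCPs attached to its parent OUs and to the root also apply, so repeat the command with each parent’s ID as the target: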

aws organizations list-policies-for-target \
  --target-id 123456789012 \
  --filter SERVICE_CONTROL_POLICY

For each SCP, check if it denies CloudFormation actions:

aws organizations describe-policy \
  --policy-id p-xxxxxxxxxx
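Because SCPs are inherited, the policy that blocks CloudFormation is often attached to an OU rather than to the account itself. One way to collect everything that applies is to walk up the hierarchy with list-parents. The function names below are hypothetical, and the sketch assumes it runs from the management account or a delegated administrator.

```shell
# Print the account ID followed by each ancestor (parent OUs, then the root).
# Argument: $1 = account ID.
list_ancestors() {
  la_target="$1"
  while [ -n "$la_target" ] && [ "$la_target" != "None" ]; do
    echo "$la_target"
    # Root IDs start with "r-" and have no parent, so stop there
    case "$la_target" in r-*) break ;; esac
    la_target="$(aws organizations list-parents \
      --child-id "$la_target" \
      --query 'Parents[0].Id' --output text)" || return 1
  done
}

# Print "target<TAB>policy-id<TAB>policy-name" for every SCP attached to
# the account or to any of its ancestors.
list_effective_scps() {
  list_ancestors "$1" | while read -r les_target; do
    aws organizations list-policies-for-target \
      --target-id "$les_target" \
      --filter SERVICE_CONTROL_POLICY \
      --query 'Policies[].[Id,Name]' --output text </dev/null |
      awk -v t="$les_target" '{ print t "\t" $0 }'
  done
}

# Example (not run here):
#   list_effective_scps 123456789012
```

Each policy ID in the output can then be passed to describe-policy to look for a Deny on CloudFormation actions.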

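If an SCP denies cloudformation:* or cloudformation:CreateStack/UpdateStack/DeleteStack, that’s your problem. Pick one of these fixes:

Detach the SCP from that account (or move the account out of the OU the SCP is attached to)
Modify the SCP so it no longer denies the CloudFormation actions the StackSet needs
Add an exemption for the StackSet execution role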
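For the exemption option, the usual pattern is a condition on aws:PrincipalARN that excludes the execution role from the Deny. The sketch below prints an illustrative policy of that shape. The function name and Sid are made up, the blanket Deny on cloudformation:* stands in for whatever your real SCP denies, and service-managed StackSets use differently named roles (as far as I know they start with stacksets-exec-), so adjust the ARN pattern to match your setup.

```shell
# Print an example SCP that denies CloudFormation to everyone except the
# self-managed StackSet execution role. Illustrative only: adapt the Action
# list and the role ARN pattern before attaching anything like it.
print_scp_with_exemption() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyCloudFormationExceptStackSets",
      "Effect": "Deny",
      "Action": "cloudformation:*",
      "Resource": "*",
      "Condition": {
        "ArnNotLike": {
          "aws:PrincipalARN": "arn:aws:iam::*:role/AWSCloudFormationStackSetExecutionRole"
        }
      }
    }
  ]
}
EOF
}

# Example (not run here):
#   print_scp_with_exemption > scp-with-exemption.json
```

The wildcard in the account position lets one SCP cover every member account it is attached to.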

Step 6: Check for CloudFormation Stack Issues

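If the stack instance is “OUTDATED”, describe the actual stack in the target account. StackSets names that stack StackSet-<stack-set-name>-<uuid> rather than after the StackSet itself, so look up its ID first:

# The stack in the target account is named StackSet-<stack-set-name>-<uuid>,
# so look up its ID from the management account first
STACK_ID=$(aws cloudformation describe-stack-instance \
  --stack-set-name AWS-Landing-Zone-Baseline \
  --stack-instance-account 123456789012 --stack-instance-region us-east-1 \
  --region us-east-1 --query 'StackInstance.StackId' --output text)

# Assume role into target account and export the temporary credentials
read -r AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN <<< "$(aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/OrganizationAccountAccessRole \
  --role-session-name cf-check \
  --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' --output text)"
export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN

# Then check the stack
aws cloudformation describe-stacks \
  --stack-name "$STACK_ID" \
  --region us-east-1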

If the stack is in “CREATE_FAILED” or “UPDATE_FAILED” state, describe the stack events to see what failed:

# STACK_ID is the StackId that describe-stack-instance reports for this account and Region
aws cloudformation describe-stack-events \
  --stack-name "$STACK_ID" \
  --region us-east-1 \
  --query 'StackEvents[?ResourceStatus==`CREATE_FAILED` || ResourceStatus==`UPDATE_FAILED`]'
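A failed stack usually produces several failure events, and the later ones are often just knock-on cancellations. Since describe-stack-events returns the newest events first, sorting by timestamp and taking the first line is a quick way to get at the likely root cause. This is a sketch with a hypothetical function name. It expects the stack ID from describe-stack-instance and credentials for the target account.

```shell
# Print the earliest failure event for a stack as
# "timestamp logical-id status reason" (tab-separated).
# Arguments: $1 = stack name or stack ID, $2 = Region.
first_failure() {
  aws cloudformation describe-stack-events \
    --stack-name "$1" \
    --region "$2" \
    --query "StackEvents[?ends_with(ResourceStatus, '_FAILED')].[Timestamp,LogicalResourceId,ResourceStatus,ResourceStatusReason]" \
    --output text |
    sort | head -n 1
}

# Example (not run here; STACK_ID comes from describe-stack-instance):
#   first_failure "$STACK_ID" us-east-1
```

The plain sort works because the CLI prints timestamps in a fixed ISO 8601 format, which orders correctly as text.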

Step 7: Retry StackSet Deployment

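Once you’ve fixed the underlying issues, retry the StackSet operation. Reusing the current template and limiting the run to the repaired accounts avoids touching instances that are already CURRENT. If your StackSet uses parameters or IAM capabilities, pass them again with --parameters and --capabilities (the --accounts option applies to self-managed StackSets):

aws cloudformation update-stack-set \
  --stack-set-name AWS-Landing-Zone-Baseline \
  --use-previous-template \
  --accounts 123456789012 \
  --regions us-east-1 \
  --region us-east-1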

Or create new stack instances for accounts that don’t have them:

aws cloudformation create-stack-instances \
  --stack-set-name AWS-Landing-Zone-Baseline \
  --accounts 123456789012 \
  --regions us-east-1 us-west-2 \
  --operation-preferences FailureToleranceCount=0,MaxConcurrentCount=1 \
  --region us-east-1

Monitor the operation:

aws cloudformation describe-stack-instance \
  --stack-set-name AWS-Landing-Zone-Baseline \
  --stack-instance-account 123456789012 \
  --stack-instance-region us-east-1 \
  --region us-east-1 \
  --query 'StackInstance.Status'
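Both update-stack-set and create-stack-instances return an OperationId, so another option is to poll the operation itself until it finishes. Here is a minimal sketch. wait_for_operation is a name I chose, and the 15-second default interval is arbitrary.

```shell
# Poll a StackSet operation until it leaves the QUEUED/RUNNING/STOPPING states.
# Arguments: $1 = StackSet name, $2 = operation ID, $3 = Region,
#            $4 = seconds between polls (optional, default 15).
# Prints the final status. Returns 0 on SUCCEEDED, 1 on FAILED or STOPPED,
# and 2 if the status could not be read.
wait_for_operation() {
  while :; do
    wfo_status="$(aws cloudformation describe-stack-set-operation \
      --stack-set-name "$1" \
      --operation-id "$2" \
      --region "$3" \
      --query 'StackSetOperation.Status' --output text)" || return 2
    case "$wfo_status" in
      SUCCEEDED) echo "$wfo_status"; return 0 ;;
      FAILED|STOPPED) echo "$wfo_status"; return 1 ;;
      *) sleep "${4:-15}" ;;
    esac
  done
}

# Example (not run here; OPERATION_ID is the value returned by the retry command):
#   wait_for_operation AWS-Landing-Zone-Baseline "$OPERATION_ID" us-east-1
```

The non-zero return code on FAILED or STOPPED makes it easy to chain this into a script that then reruns the per-instance checks.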

Is This Safe?

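Checking role status and SCPs is read-only and always safe. Creating a missing execution role doesn’t touch existing resources, but it grants the management account whatever permissions you attach to it, so treat it like any other privileged role. Retrying an operation with an unchanged template only reapplies it. Updating the StackSet with a changed template modifies resources in every targeted account, and detaching or editing an SCP loosens a guardrail, so review both before you run them.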

Key Takeaway

StackSet failures across OUs usually stem from missing or misconfigured execution roles, account suspension, or blocking SCPs. Always verify the execution role exists and trusts the management account, check account status in Organizations, and ensure no SCPs block CloudFormation before investigating further.

Have questions, or did you run into a different Landing Zone issue? Connect with me on LinkedIn or X.