Automate Terraform Statefiles Movement with GitHub Actions

NIRAV SHAH
9 min read · Aug 29, 2023

Managing Terraform statefiles efficiently is crucial for infrastructure as code (IaC) operations. GitHub Actions can be utilized to streamline the process of moving Terraform statefiles from one environment to another. This blog post will guide you through the steps to achieve this using a GitHub Actions workflow.

Overview

The goal is to create a GitHub Actions workflow that automates the process of moving Terraform statefiles between environments while ensuring the accuracy of the transition.

Background

We use IaC for our entire infrastructure. Initially, only a small DevOps team worked in the repository, so concurrent PRs were rare. Later, we opened the repository to developers as well. This created concurrency issues: the testing cycle grew longer because each team had to wait until their branch could be pointed at the environment. The developers came up with an interesting workaround: they started creating merge branches that combine one, two, or more changes in a single branch. Although this works, it requires additional effort. We therefore prepared a plan to split the folders by resource grouping. All the steps were originally planned for manual execution, but one of the team members insisted on automating them, so we prepared the GitHub Actions workflow below.

Prerequisites

  1. A properly configured GitHub repository with Terraform configurations.
  2. Knowledge of GitHub Actions and its YAML syntax.

Folder structure

.
├── README.md
├── network
│   └── spoke
│       ├── ReadMe.md
│       ├── backend
│       │   ├── beta-xxxxxxxxxxx.us-east-1.hcl
│       │   ├── latest-xxxxxxxxxxx.us-east-1.hcl
│       │   ├── prod-xxxxxxxxxxx.us-east-1.hcl
│       │   └── test-xxxxxxxxxxx.us-east-1.hcl
│       ├── main.tf
│       ├── output.tf
│       ├── security_groups.tf
│       ├── tfvars
│       │   ├── beta-xxxxxxxxxxx.us-east-1.tfvars
│       │   ├── latest-xxxxxxxxxxx.us-east-1.tfvars
│       │   ├── prod-xxxxxxxxxxx.us-east-1.tfvars
│       │   └── test-xxxxxxxxxxx.us-east-1.tfvars
│       ├── variables.tf
│       ├── version.tf
│       └── vpc.tf
└── platform
    ├── spoke
    │   ├── ReadMe.md
    │   ├── backend
    │   │   ├── beta-xxxxxxxxxxx.us-east-1.hcl
    │   │   ├── latest-xxxxxxxxxxx.us-east-1.hcl
    │   │   ├── prod-xxxxxxxxxxx.us-east-1.hcl
    │   │   └── test-xxxxxxxxxxx.us-east-1.hcl
    │   ├── dns.tf
    │   ├── istio.tf
    │   ├── karpenter.tf
    │   ├── kubernetes-addon.tf
    │   ├── kubernetes.tf
    │   ├── locals.tf
    │   ├── main.tf
    │   ├── monitoring.tf
    │   ├── nginx.tf
    │   ├── output.tf
    │   ├── templates
    │   │   ├── aws-load-balancer-controller-values.yaml
    │   │   ├── grafana-values.yaml
    │   │   └── prometheus-values.yaml
    │   ├── tfvars
    │   │   ├── beta-xxxxxxxxxxx.us-east-1.tfvars
    │   │   ├── latest-xxxxxxxxxxx.us-east-1.tfvars
    │   │   ├── prod-xxxxxxxxxxx.us-east-1.tfvars
    │   │   └── test-xxxxxxxxxxx.us-east-1.tfvars
    │   ├── variables.tf
    │   └── version.tf
    └── spoke-observability
        ├── Readme.md
        ├── backend
        │   ├── beta-xxxxxxxxxxx.us-east-1.hcl
        │   ├── latest-xxxxxxxxxxx.us-east-1.hcl
        │   ├── prod-xxxxxxxxxxx.us-east-1.hcl
        │   └── test-xxxxxxxxxxx.us-east-1.hcl
        ├── locals.tf
        ├── logging.tf
        ├── main.tf
        ├── ssm-parameter.tf
        ├── templates
        │   └── fluent-bit-values.yaml
        ├── tfvars
        │   ├── beta-xxxxxxxxxxx.us-east-1.tfvars
        │   ├── latest-xxxxxxxxxxx.us-east-1.tfvars
        │   ├── prod-xxxxxxxxxxx.us-east-1.tfvars
        │   └── test-xxxxxxxxxxx.us-east-1.tfvars
        ├── variables.tf
        └── version.tf

Our configuration

Since we already have two folders, adding another one simply follows the same standards. The workspace name is “${{ env.SERVICE }}-${{ env.stage }}-${{ env.REGION }}”. Similarly, the naming convention for tfvars files is “${{ env.stage }}-${{ env.ACCOUNT }}.${{ env.REGION }}.tfvars”. If your structure is different, change the code accordingly.
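To make the naming conventions concrete, here is a minimal shell sketch. It is not part of the workflow itself; the helper names `workspace_name`, `tfvars_path`, and `backend_path` are hypothetical, and the backend file pattern is taken from the `terraform init` call in the workflow.

```shell
# Hypothetical helpers that mirror the naming conventions used in this post.

# Workspace name: <service>-<stage>-<region>
workspace_name() {
  local service="$1" stage="$2" region="$3"
  printf '%s-%s-%s\n' "$service" "$stage" "$region"
}

# tfvars file: tfvars/<stage>-<account>.<region>.tfvars
tfvars_path() {
  local stage="$1" account="$2" region="$3"
  printf 'tfvars/%s-%s.%s.tfvars\n' "$stage" "$account" "$region"
}

# Backend config: backend/<stage>-<account>.<region>.hcl
backend_path() {
  local stage="$1" account="$2" region="$3"
  printf 'backend/%s-%s.%s.hcl\n' "$stage" "$account" "$region"
}
```

For example, `workspace_name observability test us-east-1` prints `observability-test-us-east-1`. If your repository uses a different layout, these three patterns are the ones to adjust.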

Movement

In this example, we move one file, logging.tf, from the spoke folder to the spoke-observability folder. This requires creating the relevant tfvars, variables.tf, backend files, main.tf, and a few additional files in the spoke-observability folder. We create a PR for these changes, and running the workflow against that branch performs the migration.

Workflow Steps

Step 0: Assumptions

The code assumes:

  1. The source and destination folders are in the same repository.
  2. The team already uses Terraform workspaces to split resources.
  3. The destination folder can be empty before the activity.
  4. If the destination already contains resources, the script adds the moved resources to it.
  5. The team knows that moving Terraform state is a critical operation and pauses all other work while the activity is carried out.

Step 1: Prestep

Ensure that both the source and destination working directories are updated with the latest changes from the main branch before beginning the state move activity.
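One way to check this prestep from the command line is to ask Git whether the main branch is already contained in the working branch. The sketch below is only an illustration; the function name `branch_contains` is hypothetical, and the usage comment assumes a remote called `origin` with a `main` branch.

```shell
# Returns 0 when base_ref (for example origin/main) is already contained in
# head_ref, i.e. the branch has been updated with the latest main.
branch_contains() {
  local base_ref="$1" head_ref="${2:-HEAD}"
  git merge-base --is-ancestor "$base_ref" "$head_ref"
}

# Example usage (assumes a remote named origin with a main branch):
#   git fetch origin main
#   branch_contains origin/main HEAD || {
#     echo "Merge or rebase main into this branch first" >&2
#     exit 1
#   }
```

`git merge-base --is-ancestor` exits with 0 only when the first commit is an ancestor of the second, so a non-zero result means the branch is missing commits from main.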

Step 2: Execute Workflow Inputs

Configure the GitHub Actions workflow to take inputs such as source working directory, destination working directory, stage, account, region, and a dry run flag from the workflow dispatch.
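Because the source and destination are chosen from the same list of options, it is easy to select the same folder twice. A small pre-flight check like the sketch below could catch that before any Terraform command runs. The function `validate_inputs` is hypothetical and not part of the workflow shown later; the allowed stages and regions mirror the workflow options, and the 12-digit rule assumes the account input is an AWS account ID.

```shell
# Hypothetical pre-flight check for the workflow inputs.
# Prints a message to stderr and returns non-zero when an input looks wrong.
validate_inputs() {
  local source_dir="$1" destination_dir="$2" stage="$3"
  local account="$4" region="$5" dryrun="$6"

  if [ "$source_dir" = "$destination_dir" ]; then
    echo "source and destination directories must differ" >&2
    return 1
  fi

  case "$stage" in
    latest|test|beta|prod) ;;
    *) echo "unknown stage: $stage" >&2; return 1 ;;
  esac

  # Assumption: the account input is a 12-digit AWS account ID.
  case "$account" in
    ''|*[!0-9]*) echo "account must contain digits only" >&2; return 1 ;;
  esac
  if [ "${#account}" -ne 12 ]; then
    echo "account must be 12 digits long" >&2
    return 1
  fi

  case "$region" in
    eu-west-1|us-east-1) ;;
    *) echo "unknown region: $region" >&2; return 1 ;;
  esac

  case "$dryrun" in
    true|false) ;;
    *) echo "dryrun must be true or false" >&2; return 1 ;;
  esac
}
```

A call such as `validate_inputs platform/spoke platform/spoke-observability test 111111111111 us-east-1 true` returns 0, while passing the same folder for both directories returns 1.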

Step 3: Check Resources

Check whether the resources removed from the source working directory match the resources added to the destination working directory. If they match exactly, proceed with the next steps. Otherwise, exit the workflow.
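The workflow compares the two lists with a plain `diff`, which is sensitive to line order. If you would rather treat the lists as sets, a sketch like the following sorts both sides first and reports what is missing on each side. The function name `compare_address_lists` is hypothetical, and ignoring order is an assumption about what you want from the check.

```shell
# Compare two resource address lists irrespective of order.
# Returns 0 when both files hold the same set of lines; otherwise prints the
# addresses found on only one side and returns 1.
compare_address_lists() {
  local source_list="$1" destination_list="$2"
  local src_sorted dst_sorted status=0

  src_sorted="$(mktemp)"
  dst_sorted="$(mktemp)"
  sort -u "$source_list" > "$src_sorted"
  sort -u "$destination_list" > "$dst_sorted"

  if ! cmp -s "$src_sorted" "$dst_sorted"; then
    echo "Only in source:"
    comm -23 "$src_sorted" "$dst_sorted"
    echo "Only in destination:"
    comm -13 "$src_sorted" "$dst_sorted"
    status=1
  fi

  rm -f "$src_sorted" "$dst_sorted"
  return "$status"
}
```

Used in place of the `diff -q` check, this would let the move proceed when both plans list the same addresses in a different order, and it shows exactly which addresses are unmatched when they do not.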

Step 4: Approve

This step asks for manual approval before executing the state move process. It is crucial to ensure accuracy before proceeding.

Step 5: Backup

This step backs up the Terraform statefiles and uploads them as a workflow artifact.
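If you also want a local, timestamped copy of a pulled statefile, for example when rehearsing the move outside GitHub Actions, a sketch along these lines may help. The function `backup_state_file` and the backup directory are hypothetical and not part of the workflow.

```shell
# Copy a pulled state file into a backup directory with a UTC timestamp in
# the name, then confirm the copy is byte-identical to the original.
# Prints the path of the backup on success.
backup_state_file() {
  local state_file="$1" backup_dir="$2"
  local stamp target

  if [ ! -s "$state_file" ]; then
    echo "state file is missing or empty: $state_file" >&2
    return 1
  fi

  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  target="$backup_dir/$(basename "$state_file").$stamp.bak"

  mkdir -p "$backup_dir" &&
    cp "$state_file" "$target" &&
    cmp -s "$state_file" "$target" &&
    printf '%s\n' "$target"
}

# Example usage from a Terraform working directory:
#   terraform state pull > observability.statefs
#   backup_state_file observability.statefs ../../state-backups
```

A backup taken this way can later be restored with `terraform state push`, the same command the workflow uses to publish the destination statefile.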

Step 6: Move Resources

The resources removed from the source working directory are moved to the destination statefile. A script is generated to handle this move.
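The workflow builds this script with a one-line `awk` command. The same idea can be sketched as a shell function that is a little easier to read. The name `generate_move_script` is hypothetical, and the sketch assumes that resource addresses never contain a single quote.

```shell
# Emit one `terraform state mv` command per resource address in list_file.
# state_out is the local path of the destination state file.
# Assumption: addresses never contain a single quote.
generate_move_script() {
  local list_file="$1" state_out="$2"
  local address

  echo '#!/usr/bin/env bash'
  echo 'set -euo pipefail'
  while IFS= read -r address || [ -n "$address" ]; do
    [ -n "$address" ] || continue
    # Single quotes keep characters such as [ ] and " in addresses literal.
    printf "terraform state mv -state-out='%s' '%s' '%s'\n" \
      "$state_out" "$address" "$address"
  done < "$list_file"
}

# Example usage:
#   generate_move_script terraformlist.txt \
#     ../../platform/spoke-observability/observability.statefs \
#     > terraform_move_script.sh
```

Each address appears twice because the resource keeps the same address in the destination state. The `set -euo pipefail` line in the generated script is a design choice: the script stops at the first failed move instead of continuing with a partially moved state.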

Step 7: Verify

Finally, the plan is checked to confirm that no resource changes remain on either the source or the destination side.
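In the workflow, this check is done by reading the plan output. If you prefer a machine-checkable result, `terraform plan` supports the `-detailed-exitcode` flag, which returns 0 for an empty plan, 1 for an error, and 2 when changes are pending. The sketch below shows one way to use it; `plan_verdict` and `verify_no_changes` are hypothetical helpers, not part of the workflow.

```shell
# Map the exit code of `terraform plan -detailed-exitcode` to a verdict.
#   0 = plan succeeded with no changes
#   1 = plan failed
#   2 = plan succeeded and changes are pending
plan_verdict() {
  case "$1" in
    0) echo "clean" ;;
    2) echo "changes" ;;
    *) echo "error" ;;
  esac
}

# Hypothetical wrapper: run it from a Terraform working directory after
# init and workspace selection. Returns 0 only when the plan is empty.
verify_no_changes() {
  local var_file="$1" code=0
  terraform plan -no-color -input=false -detailed-exitcode \
    -var-file="$var_file" > /dev/null || code=$?
  [ "$(plan_verdict "$code")" = "clean" ]
}
```

Note that a dry run leaves the statefiles untouched, so a strict check like this would still report pending changes after a dry run.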

GitHub Actions Workflow YAML

Here’s the YAML representation of the GitHub Actions workflow:

# How to use:
# Execute: choose the source directory, destination directory, stage, account, and region when running this workflow on the branch created for the state move

# Detailed workflow
# 1. Prestep: make sure source and destination are updated with the main branch before starting the activity
# 2. Execute: choose the source directory, destination directory, stage, account, and region when running this workflow on the branch created for the state move
# 3. Check: if the branch is prepared correctly, the resources removed from source match the resources added to destination
# 4. Approve: only then does the workflow ask for manual approval
# 5. Backup: both statefiles are pulled and uploaded as an artifact
# 6. Move: the listed resources are moved to the destination statefile
# 7. Verify: plan shows no resource changes on either source or destination

name: Terraform-state-move
on:
  workflow_dispatch:
    inputs:
      source-working-directory:
        required: true
        type: choice
        description: terraform workflow source directory
        options:
          - "network/spoke"
          - "platform/spoke"
          - "platform/spoke-observability"
      destination-working-directory:
        required: true
        type: choice
        description: terraform workflow destination directory
        options:
          - "network/spoke"
          - "platform/spoke"
          - "platform/spoke-observability"
      stage:
        type: choice
        description: Target stage to test against (used by deploy-platform only)
        required: true
        options:
          - "latest"
          - "test"
          - "beta"
          - "prod"
      account:
        type: string
        description: Choose any of the known account for stage latest/test xxxxxxxx, beta/prod yyyyyy
        required: true
      region:
        type: choice
        description: Choose us-east-1 for prod, rest are eu-west-1 region
        required: true
        options:
          - eu-west-1
          - us-east-1
      dryrun:
        type: choice
        description: dry run=true/false no statefile changes would be done [ Manual approval step needed ]
        required: true
        options:
          - "true"
          - "false"
jobs:
  terraform:
    name: "Move Terraform statefiles"
    runs-on: deploy-platform
    env:
      source_working_directory: ${{ inputs.source-working-directory }}
      stage: ${{ inputs.stage }}
      account: ${{ inputs.account }}
      region: ${{ inputs.region }}
      destination_working_directory: ${{ inputs.destination-working-directory }}
      dryrun: ${{ inputs.dryrun }}
      ACCOUNT: ${{ inputs.account }}
      REGION: ${{ inputs.region }}
    steps:
      # Derive SOURCESERVICE from the chosen source directory
      - name: Map source working directory to service name
        uses: kanga333/variable-mapper@v0.3.0
        with:
          key: ${{ env.source_working_directory }}
          map: |
            {
              "network/spoke": {
                "SOURCESERVICE": "network"
              },
              "platform/spoke-observability": {
                "SOURCESERVICE": "observability"
              },
              "platform/spoke": {
                "SOURCESERVICE": "platform"
              }
            }

      # Derive DESTINATIONSERVICE from the chosen destination directory
      - name: Map destination working directory to service name
        uses: kanga333/variable-mapper@v0.3.0
        with:
          key: ${{ env.destination_working_directory }}
          map: |
            {
              "network/spoke": {
                "DESTINATIONSERVICE": "network"
              },
              "platform/spoke-observability": {
                "DESTINATIONSERVICE": "observability"
              },
              "platform/spoke": {
                "DESTINATIONSERVICE": "platform"
              }
            }

      - name: Assume terraform deployment role
        continue-on-error: true
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::${{ env.ACCOUNT }}:role/terraform
          role-duration-seconds: 3600
          aws-region: ${{ env.REGION }}
          role-session-name: TerraformSession

      - name: Checkout
        uses: actions/checkout@v3

      - uses: actions/setup-node@v3
        with:
          node-version: "16"

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2.0.3
        with:
          terraform_version: 1.2.3
          terraform_wrapper: false

      # Plan the source and list the resources the plan wants to delete
      - name: Terraform Source Steps
        run: |
          # terraform fmt -check --recursive
          terraform init -backend-config=backend/${{ env.stage }}-${{ env.ACCOUNT }}.${{ env.REGION }}.hcl -upgrade
          # terraform validate -no-color
          terraform workspace select ${{ env.SERVICE }}-${{ env.stage }}-${{ env.REGION }} || terraform workspace new ${{ env.SERVICE }}-${{ env.stage }}-${{ env.REGION }}
          set -o pipefail
          terraform plan -no-color -input=false -out=tf.plan -var-file="tfvars/${{ env.stage }}-${{ env.ACCOUNT }}.${{ env.REGION }}.tfvars" | grep -v "Refreshing state...\|Reading...\|Read complete after"
          terraform show -json tf.plan | jq -r '.resource_changes[] | select(.change.actions[0]=="delete") | .address' > terraformlist.txt
        env:
          SERVICE: ${{ env.SOURCESERVICE }}
        working-directory: ${{ env.source_working_directory }}

      # Plan the destination and list the resources the plan wants to create
      - name: Terraform Destination Steps
        run: |
          # terraform fmt -check --recursive
          terraform init -backend-config=backend/${{ env.stage }}-${{ env.ACCOUNT }}.${{ env.REGION }}.hcl -upgrade
          # terraform validate -no-color
          terraform workspace select ${{ env.SERVICE }}-${{ env.stage }}-${{ env.REGION }} || terraform workspace new ${{ env.SERVICE }}-${{ env.stage }}-${{ env.REGION }}
          set -o pipefail
          terraform plan -no-color -input=false -out=tf.plan -var-file="tfvars/${{ env.stage }}-${{ env.ACCOUNT }}.${{ env.REGION }}.tfvars" | grep -v "Refreshing state...\|Reading...\|Read complete after"
          terraform show -json tf.plan | jq -r '.resource_changes[] | select(.change.actions[0]=="create") | .address' > terraformlist.txt
        env:
          SERVICE: ${{ env.DESTINATIONSERVICE }}
        working-directory: ${{ env.destination_working_directory }}

      # Both lists must be non-empty and identical before the move script is generated
      - name: Check file diff
        run: |
          echo "================================================="
          echo "=====================Source======================"
          echo "================================================="
          cat ${{ env.source_working_directory }}/terraformlist.txt
          echo "================================================="
          echo "==================Destination===================="
          echo "================================================="
          cat ${{ env.destination_working_directory }}/terraformlist.txt
          echo "================================================="

          if [[ ! -s ${{ env.source_working_directory }}/terraformlist.txt ]]; then
            echo "No resources to be removed. No further steps needed"
            exit 1
          fi
          if [[ ! -s ${{ env.destination_working_directory }}/terraformlist.txt ]]; then
            echo "No resources to be added in destination. No further steps needed"
            exit 1
          fi
          diff ${{ env.destination_working_directory }}/terraformlist.txt ${{ env.source_working_directory }}/terraformlist.txt
          if diff -q ${{ env.destination_working_directory }}/terraformlist.txt ${{ env.source_working_directory }}/terraformlist.txt; then
            echo "Outputs are identical. Proceeding to next steps."
            awk '{ print "terraform state mv -state-out=../../${{ env.destination_working_directory }}/${{ env.DESTINATIONSERVICE }}.statefs '\''"$0"'\'' '\''"$0"'\''"}' ${{ env.destination_working_directory }}/terraformlist.txt > terraform_move_script.sh
          else
            echo "Error: Outputs differ."
            exit 1
          fi

      - name: Generate token
        id: generate_token
        uses: tibdex/github-app-token@v1
        with:
          app_id: ${{ secrets.MANUAL_APPROVAL_APP_ID }}
          private_key: ${{ secrets.MANUAL_APPROVAL_APP_PRIVATE_KEY }}

      - name: Manual approval
        uses: trstringer/manual-approval@v1.8.0 # Limitation: unable to modify issue body
        timeout-minutes: 30
        with:
          secret: ${{ steps.generate_token.outputs.token }}
          approvers: devops
          minimum-approvals: 1
          # exclude-workflow-initiator-as-approver: true # Exclude executer for beta & prod environment
          issue-title: Terraform state move from ${{ env.SOURCESERVICE }} to ${{ env.DESTINATIONSERVICE }} for stage ${{ env.stage }} at account ${{ env.ACCOUNT }} with region ${{ env.REGION }}

      - name: Terraform pull statefile locally for source
        run: |
          terraform state pull > ${{ env.SOURCESERVICE }}.statefs
        working-directory: ${{ env.source_working_directory }}

      - name: Terraform pull statefile locally for destination
        run: |
          terraform state pull > ${{ env.DESTINATIONSERVICE }}.statefs
        working-directory: ${{ env.destination_working_directory }}

      - name: Backup Terraform statefile [ Use in case of rollback needed ]
        uses: actions/upload-artifact@v3
        with:
          name: terraform-statefiles
          path: |
            ${{ env.destination_working_directory }}/${{ env.DESTINATIONSERVICE }}.statefs
            ${{ env.source_working_directory }}/${{ env.SOURCESERVICE }}.statefs

      - name: Terraform state move from source
        run: |
          cp ../../terraform_move_script.sh .
          cat terraform_move_script.sh
          if [[ "${{ env.dryrun }}" == "true" ]]; then
            echo " dry run on.. Script execution skipped."
          else
            chmod 775 terraform_move_script.sh
            ./terraform_move_script.sh
            echo "Script executed"
          fi
        working-directory: ${{ env.source_working_directory }}

      - name: Terraform state push to destination
        run: |
          if [[ "${{ env.dryrun }}" == "true" ]]; then
            echo " dry run on.. Script execution skipped."
          else
            terraform state push ${{ env.DESTINATIONSERVICE }}.statefs
            echo "Script executed"
          fi
          terraform state list
        working-directory: ${{ env.destination_working_directory }}

      - name: Terraform Source Verify
        run: |
          set -o pipefail
          terraform plan -no-color -input=false -out=tf.plan -var-file="tfvars/${{ env.stage }}-${{ env.ACCOUNT }}.${{ env.REGION }}.tfvars" | grep -v "Refreshing state...\|Reading...\|Read complete after"
        env:
          SERVICE: ${{ env.SOURCESERVICE }}
        working-directory: ${{ env.source_working_directory }}

      - name: Terraform Destination Verify
        run: |
          set -o pipefail
          terraform plan -no-color -input=false -out=tf.plan -var-file="tfvars/${{ env.stage }}-${{ env.ACCOUNT }}.${{ env.REGION }}.tfvars" | grep -v "Refreshing state...\|Reading...\|Read complete after"
        env:
          SERVICE: ${{ env.DESTINATIONSERVICE }}
        working-directory: ${{ env.destination_working_directory }}

Replace placeholders such as the masked account IDs (xxxxxxxxxxx) with actual values as needed.

Conclusion

By using GitHub Actions to automate the process of moving Terraform statefiles, you can ensure the consistency and accuracy of your infrastructure management. This workflow can save time and reduce the chances of manual errors during statefile transitions between different environments. Remember to customize the workflow to match your project’s requirements and environment.

Disclaimer

This blog post is meant for educational and informational purposes only. Always review and adapt the code and instructions to your specific project’s needs and best practices.

Remember that GitHub Actions workflows can evolve over time, so keep an eye on any updates to the GitHub Actions platform and adapt your workflows accordingly.

