terraform-state-manager

安装量: 43
排名: #17088

安装

npx skills add https://github.com/dengineproblem/agents-monorepo --skill terraform-state-manager

Terraform State Manager Эксперт по управлению Terraform state файлами, remote backends, state operations и troubleshooting. Core Principles State File Security state_security_principles : - principle : "Never commit state to version control" reason : "State files contain sensitive information including secrets" - principle : "Use remote backends for team environments" reason : "Enables collaboration and prevents state corruption" - principle : "Enable encryption at rest and in transit" reason : "Protects sensitive data in state files" - principle : "Implement state locking" reason : "Prevents concurrent modifications and corruption" - principle : "Regular backups with retention policy" reason : "Enables recovery from accidental deletions or corruption" Backend Configuration AWS S3 Backend (Recommended)

backend.tf

terraform { backend "s3" { bucket = "mycompany-terraform-state" key = "environments/prod/infrastructure/terraform.tfstate" region = "us-east-1" encrypt = true kms_key_id = "alias/terraform-state-key" dynamodb_table = "terraform-state-lock"

Optional: Assume role for cross-account access

role_arn

"arn:aws:iam::123456789012:role/TerraformStateAccess"

Optional: Workspace-based key prefix

workspace_key_prefix

"workspaces" } } S3 Backend Infrastructure Setup

state-backend/main.tf - Run this first with local backend

terraform { required_version = ">= 1.5.0" required_providers { aws = { source = "hashicorp/aws" version = "~> 5.0" } } } provider "aws" { region = var.region }

S3 Bucket for State

resource "aws_s3_bucket" "terraform_state" { bucket = " $ { var . company } -terraform-state- $ { var . region } " tags = { Name = "Terraform State" Environment = "shared" ManagedBy = "terraform" } lifecycle { prevent_destroy = true } } resource "aws_s3_bucket_versioning" "terraform_state" { bucket = aws_s3_bucket.terraform_state.id versioning_configuration { status = "Enabled" } } resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" { bucket = aws_s3_bucket.terraform_state.id rule { apply_server_side_encryption_by_default { sse_algorithm = "aws:kms" kms_master_key_id = aws_kms_key.terraform_state.arn } bucket_key_enabled = true } } resource "aws_s3_bucket_public_access_block" "terraform_state" { bucket = aws_s3_bucket.terraform_state.id block_public_acls = true block_public_policy = true ignore_public_acls = true restrict_public_buckets = true } resource "aws_s3_bucket_lifecycle_configuration" "terraform_state" { bucket = aws_s3_bucket.terraform_state.id rule { id = "state-versions" status = "Enabled" noncurrent_version_expiration { noncurrent_days = 90 } noncurrent_version_transition { noncurrent_days = 30 storage_class = "STANDARD_IA" } } }

DynamoDB Table for State Locking

resource "aws_dynamodb_table" "terraform_lock" { name = "terraform-state-lock" billing_mode = "PAY_PER_REQUEST" hash_key = "LockID" attribute { name = "LockID" type = "S" } server_side_encryption { enabled = true } point_in_time_recovery { enabled = true } tags = { Name = "Terraform State Lock" Environment = "shared" ManagedBy = "terraform" } }

KMS Key for State Encryption

resource "aws_kms_key" "terraform_state" { description = "KMS key for Terraform state encryption" deletion_window_in_days = 30 enable_key_rotation = true policy = jsonencode( { Version = "2012-10-17" Statement = [ { Sid = "Enable IAM User Permissions" Effect = "Allow" Principal = { AWS = "arn:aws:iam:: $ { data . aws_caller_identity . current . account_id } :root" } Action = "kms:" Resource = "" } , { Sid = "Allow Terraform Role" Effect = "Allow" Principal = { AWS = var.terraform_role_arn } Action = [ "kms:Encrypt" , "kms:Decrypt" , "kms:GenerateDataKey" ] Resource = "*" } ] } ) tags = { Name = "Terraform State Key" } } resource "aws_kms_alias" "terraform_state" { name = "alias/terraform-state-key" target_key_id = aws_kms_key.terraform_state.key_id } data "aws_caller_identity" "current" { }

Outputs

output "state_bucket_name" { value = aws_s3_bucket.terraform_state.id } output "lock_table_name" { value = aws_dynamodb_table.terraform_lock.name } output "kms_key_arn" { value = aws_kms_key.terraform_state.arn } Azure Backend terraform { backend "azurerm" { resource_group_name = "terraform-state-rg" storage_account_name = "mycompanytfstate" container_name = "tfstate" key = "prod/infrastructure.tfstate"

Enable encryption

use_azuread_auth

true } } Google Cloud Backend terraform { backend "gcs" { bucket = "mycompany-terraform-state" prefix = "terraform/state"

Enable encryption

encryption_key

"projects/myproject/locations/global/keyRings/terraform/cryptoKeys/state-key" } } State Operations Essential State Commands

View all resources in state

terraform state list

View specific resource details

terraform state show aws_instance.web

View state as JSON

terraform show -json | jq '.values.root_module.resources'

Pull remote state to local file

terraform state pull

terraform.tfstate.backup

Push local state to remote (use with caution!)

terraform state push terraform.tfstate

Get outputs from state

terraform output -json Resource Refactoring (terraform state mv)

Rename a resource

terraform state mv aws_instance.web aws_instance.app_server

Move resource to a module

terraform state mv aws_instance.web module.compute.aws_instance.web

Move resource from module to root

terraform state mv module.compute.aws_instance.web aws_instance.web

Move resource between modules

terraform state mv module.old.aws_instance.web module.new.aws_instance.web

Move entire module

terraform state mv module.old module.new

Move resource to different state file

terraform state mv -state-out = other.tfstate aws_instance.web aws_instance.web Import Existing Resources

Basic import

terraform import aws_instance.web i-1234567890abcdef0

Import with provider alias

terraform import -provider = aws.west aws_instance.web i-1234567890abcdef0

Import into module

terraform import module.vpc.aws_vpc.main vpc-12345678

Import with for_each

terraform import 'aws_instance.servers["web"]' i-1234567890abcdef0

Generate import blocks (Terraform 1.5+)

terraform plan -generate-config-out

generated.tf Import Block (Terraform 1.5+)

imports.tf

import { to = aws_instance.web id = "i-1234567890abcdef0" } import { to = aws_vpc.main id = "vpc-12345678" } import { to = module.rds.aws_db_instance.main id = "mydb-instance" }

With for_each

import { for_each = var.existing_buckets to = aws_s3_bucket.imported [ each.key ] id = each.value } Remove Resources from State

Remove single resource (doesn't destroy actual resource)

terraform state rm aws_instance.web

Remove resource in module

terraform state rm module.compute.aws_instance.web

Remove entire module

terraform state rm module.old_module

Remove with for_each

terraform state rm 'aws_instance.servers["web"]'

Dry run - show what would be removed

terraform state rm -dry-run aws_instance.web Replace/Recreate Resources

Force replacement of resource (Terraform 0.15.2+)

terraform apply -replace = "aws_instance.web"

Taint resource (legacy, use -replace instead)

terraform taint aws_instance.web

Untaint resource

terraform untaint aws_instance.web Workspace Management Workspace Commands

List all workspaces

terraform workspace list

Create new workspace

terraform workspace new staging

Select workspace

terraform workspace select prod

Show current workspace

terraform workspace show

Delete workspace (must not be current)

terraform workspace delete staging Workspace-Aware Configuration

locals.tf

locals { environment = terraform.workspace

Environment-specific configurations

config

{ dev = { instance_type = "t3.small" min_size = 1 max_size = 2 multi_az = false } staging = { instance_type = "t3.medium" min_size = 2 max_size = 4 multi_az = true } prod = { instance_type = "t3.large" min_size = 3 max_size = 10 multi_az = true } } env_config = local.config [ local.environment ] }

Use in resources

resource "aws_instance" "app" { instance_type = local.env_config.instance_type tags = { Environment = local.environment } } Backend with Workspace Prefix terraform { backend "s3" { bucket = "terraform-state" key = "app/terraform.tfstate" region = "us-east-1" workspace_key_prefix = "workspaces" dynamodb_table = "terraform-lock" encrypt = true } }

Results in state paths:

- workspaces/dev/app/terraform.tfstate

- workspaces/staging/app/terraform.tfstate

- workspaces/prod/app/terraform.tfstate

State Locking DynamoDB Lock Table Schema resource "aws_dynamodb_table" "terraform_lock" { name = "terraform-state-lock" billing_mode = "PAY_PER_REQUEST" hash_key = "LockID" attribute { name = "LockID" type = "S" }

Enable encryption

server_side_encryption { enabled = true }

Enable point-in-time recovery

point_in_time_recovery { enabled = true } tags = { Name = "Terraform State Lock Table" } } Force Unlock

Force unlock (use only when you're sure no operation is running)

terraform force-unlock LOCK_ID

The lock ID is shown in the error message when locked:

"Error: Error locking state: Error acquiring the state lock: ConditionalCheckFailedException..."

Lock ID: 12345678-1234-1234-1234-123456789012

Lock Troubleshooting

Check for existing locks in DynamoDB

aws dynamodb scan \ --table-name terraform-state-lock \ --projection-expression "LockID, Info" \ --output table

Delete stale lock manually (last resort)

aws dynamodb delete-item \ --table-name terraform-state-lock \ --key '{"LockID": {"S": "terraform-state/path/to/terraform.tfstate"}}' State Migration Migrate from Local to Remote Backend

Step 1: Add backend configuration to your Terraform files

backend.tf (see S3 backend example above)

Step 2: Initialize with migration

terraform init -migrate-state

Terraform will prompt:

Do you want to copy existing state to the new backend?

Enter "yes"

Step 3: Verify migration

terraform state list terraform plan

Should show no changes

Migrate Between Remote Backends

Step 1: Pull current state

terraform state pull

terraform.tfstate.backup

Step 2: Update backend configuration

Change backend.tf to new backend

Step 3: Reinitialize with migration

terraform init -migrate-state -force-copy

Step 4: Verify

terraform state list terraform plan Split State into Multiple States

Scenario: Split monolithic state into separate states for each environment

Step 1: Create backup

terraform state pull

full-state.backup.json

Step 2: Create new state file for specific resources

terraform state mv -state-out = env/prod/terraform.tfstate \ module.prod_vpc module.prod_vpc terraform state mv -state-out = env/prod/terraform.tfstate \ module.prod_app module.prod_app

Step 3: Initialize new state directories with appropriate backends

cd env/prod terraform init terraform state push terraform.tfstate Recovery Procedures Recover from Corrupted State

Step 1: Check S3 bucket versioning for previous versions

aws s3api list-object-versions \ --bucket terraform-state \ --prefix "path/to/terraform.tfstate" \ --max-keys 10

Step 2: Download previous version

aws s3api get-object \ --bucket terraform-state \ --key "path/to/terraform.tfstate" \ --version-id "versionId123" \ recovered-state.json

Step 3: Validate the recovered state

terraform show -json recovered-state.json | jq '.values'

Step 4: Push recovered state

terraform state push recovered-state.json Recover from Deleted State

Option 1: Recover from S3 versioning

aws s3api list-object-versions \ --bucket terraform-state \ --prefix "path/to/terraform.tfstate"

Look for DeleteMarker and recover the version before it

Option 2: Recover from DynamoDB backup (if PITR enabled)

aws dynamodb restore-table-to-point-in-time \ --source-table-name terraform-state-lock \ --target-table-name terraform-state-lock-recovered \ --restore-date-time 2024 -01-15T10:00:00Z

Option 3: Reimport all resources

Create import blocks for each resource and run terraform apply

Rebuild State from Scratch

imports.tf - Generate these by examining your infrastructure

Use AWS CLI to discover resources

aws ec2 describe-instances --query 'Reservations[].Instances[].[InstanceId, Tags]'

import { to = aws_vpc.main id = "vpc-12345678" } import { to = aws_subnet.public [ "a" ] id = "subnet-aaaaaaaa" } import { to = aws_subnet.public [ "b" ] id = "subnet-bbbbbbbb" } import { to = aws_instance.web id = "i-1234567890abcdef0" }

Then run:

terraform plan -generate-config-out=generated_resources.tf

Review and merge generated_resources.tf into your configuration

terraform apply

Drift Detection Detect Configuration Drift

Standard drift detection

terraform plan -detailed-exitcode

Exit codes:

0 = No changes

1 = Error

2 = Changes detected (drift)

Machine-readable drift detection

terraform plan -json | jq '.resource_changes[] | select(.change.actions != ["no-op"])'

Generate drift report

terraform plan -json

plan.json jq '[.resource_changes[] | select(.change.actions != ["no-op"]) | { address: .address, actions: .change.actions, before: .change.before, after: .change.after }]' plan.json

drift-report.json Refresh State (Sync with Reality)

Refresh state without applying changes

terraform refresh

Or use plan with refresh-only (Terraform 0.15.4+)

terraform apply -refresh-only

This updates state to match actual infrastructure

without making any changes to infrastructure

Automated Drift Detection Script

!/bin/bash

drift-check.sh

set -e WORKSPACES = ( "dev" "staging" "prod" ) DRIFT_FOUND = false for ws in " ${WORKSPACES [ @ ] } " ; do echo "Checking drift in workspace: $ws " terraform workspace select " $ws "

Run plan with detailed exit code

set +e terraform plan -detailed-exitcode -out = plan- $ws .tfplan

/dev/null 2

&1 EXIT_CODE = $? set -e if [ $EXIT_CODE -eq 2 ] ; then echo "⚠️ DRIFT DETECTED in $ws " terraform show -json plan- $ws .tfplan | jq '.resource_changes[] | select(.change.actions != ["no-op"]) | .address' DRIFT_FOUND = true elif [ $EXIT_CODE -eq 0 ] ; then echo "✅ No drift in $ws " else echo "❌ Error checking $ws " fi rm -f plan- $ws .tfplan done if [ " $DRIFT_FOUND " = true ] ; then exit 2 fi Security Best Practices IAM Policy for State Access { "Version" : "2012-10-17" , "Statement" : [ { "Sid" : "TerraformStateAccess" , "Effect" : "Allow" , "Action" : [ "s3:GetObject" , "s3:PutObject" , "s3:DeleteObject" , "s3:ListBucket" ] , "Resource" : [ "arn:aws:s3:::terraform-state" , "arn:aws:s3:::terraform-state/" ] , "Condition" : { "StringEquals" : { "aws:PrincipalTag/Team" : "${aws:ResourceTag/Team}" } } } , { "Sid" : "TerraformStateLock" , "Effect" : "Allow" , "Action" : [ "dynamodb:GetItem" , "dynamodb:PutItem" , "dynamodb:DeleteItem" ] , "Resource" : "arn:aws:dynamodb:::table/terraform-state-lock" } , { "Sid" : "TerraformStateEncryption" , "Effect" : "Allow" , "Action" : [ "kms:Encrypt" , "kms:Decrypt" , "kms:GenerateDataKey" ] , "Resource" : "arn:aws:kms::*:key/terraform-state-key-id" } ] } State File Inspection

Check for sensitive data in state

terraform state pull | jq '.resources[].instances[].attributes | to_entries[] | select(.key | test("password|secret|key|token"; "i")) | {resource: input.address, sensitive_field: .key}'

List all sensitive values

terraform state pull | jq '[ .resources[] | .instances[] | .attributes | to_entries[] | select(.value != null and (.key | test("password|secret|key|token|credential"; "i"))) ] | length' CI/CD Integration GitHub Actions State Management

.github/workflows/terraform.yml

name : Terraform on : push : branches : [ main ] pull_request : branches : [ main ] env : TF_VAR_environment : $ { { github.ref == 'refs/heads/main' && 'prod' | | 'staging' } } jobs : terraform : runs-on : ubuntu - latest permissions : id-token : write contents : read pull-requests : write steps : - uses : actions/checkout@v4 - name : Configure AWS Credentials uses : aws - actions/configure - aws - credentials@v4 with : role-to-assume : arn : aws : iam : : 123456789012 : role/GitHubActionsRole aws-region : us - east - 1 - name : Setup Terraform uses : hashicorp/setup - terraform@v3 with : terraform_version : 1.6.0 - name : Terraform Init run : terraform init - backend - config="key=envs/$ { { env.TF_VAR_environment } } /terraform.tfstate" - name : Terraform Plan id : plan run : terraform plan - no - color - out=tfplan continue-on-error : true - name : Comment PR with Plan if : github.event_name == 'pull_request' uses : actions/github - script@v7 with : script : | const output = #### Terraform Plan 📖 \`` ${{ steps.plan.outputs.stdout }} ``` `; github.rest.issues.createComment({ issue_number: context.issue.number, owner: context.repo.owner, repo: context.repo.repo, body: output }); - name : Terraform Apply if : github.ref == 'refs/heads/main' && github.event_name == 'push' run : terraform apply - auto - approve tfplan Лучшие практики Всегда используй remote backend — local state только для экспериментов Включай versioning для state bucket — возможность recovery Используй state locking — предотвращает коррупцию state Шифруй state at rest и in transit — содержит sensitive data Регулярно тестируй recovery procedures — до того как понадобится Автоматизируй drift detection — в CI/CD pipeline Разделяй state по окружениям — workspaces или отдельные backends Ограничивай доступ к state — IAM policies с минимальными правами

返回排行榜