Lesson 5 · 55 min · Terraform · IaC · Azure · AWS

Infrastructure as Code

Manage cloud infrastructure declaratively with Terraform — modules, remote state, workspaces, drift detection, and testing patterns for production IaC workflows.

What You Will Learn

  • Understand the Terraform core workflow and state model
  • Structure reusable modules for team and organisation-wide use
  • Manage remote state with locking in team environments
  • Use workspaces and variable files for multi-environment deployments
  • Detect and remediate infrastructure drift
  • Test Terraform modules before production

1. The Terraform Core Workflow

Write                 Plan                    Apply
──────                ──────                  ──────
.tf files   →   terraform plan     →   terraform apply
              (shows what will change)  (makes changes)

Plan and apply both work from state, Terraform's record of the resources it manages and how they map to real infrastructure:

# Initialise — download providers, configure backend
terraform init

# Preview changes without making them
terraform plan -out=tfplan

# Apply the saved plan (no interactive prompt)
terraform apply tfplan

# See what's currently in state
terraform state list
terraform state show aws_s3_bucket.artifacts
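
The address aws_s3_bucket.artifacts passed to terraform state show is the resource type plus the local name from a resource block. A minimal sketch of a configuration that would produce that address, assuming the AWS provider; the bucket name is a placeholder and not from the lesson:

```hcl
# main.tf (illustrative; the bucket name is a placeholder)
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

provider "aws" {
  region = "eu-west-1" # same region the lesson uses for its state bucket
}

# Tracked in state under the address aws_s3_bucket.artifacts
resource "aws_s3_bucket" "artifacts" {
  bucket = "mycompany-build-artifacts"
}
```

After terraform apply, terraform state list should include aws_s3_bucket.artifacts.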

2. Module Structure

A well-structured Terraform module is reusable, self-documented, and testable.

modules/
└── kubernetes-cluster/
    ├── main.tf          # Resources
    ├── variables.tf     # Input variables
    ├── outputs.tf       # Output values
    ├── versions.tf      # Required providers + Terraform version
    └── README.md        # Auto-generated by terraform-docs
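
The tree lists versions.tf, which pins the Terraform and provider versions the module was written against. A minimal sketch, assuming the module only needs the AWS provider; the provider constraint is an assumption, not a value from the lesson:

```hcl
# modules/kubernetes-cluster/versions.tf
terraform {
  required_version = ">= 1.6"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # assumed constraint; pin to the range you have tested
    }
  }
}
```

Reusable modules usually declare minimum versions like this and leave exact pinning to the root configuration that calls them.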

Example: EKS cluster module

# modules/kubernetes-cluster/variables.tf
variable "cluster_name" {
  type        = string
  description = "Name of the EKS cluster"
}

variable "node_count" {
  type        = number
  description = "Number of worker nodes"
  default     = 3

  validation {
    condition     = var.node_count >= 2
    error_message = "Production clusters need at least 2 nodes for HA."
  }
}

variable "instance_type" {
  type        = string
  description = "EC2 instance type for the worker nodes"
  default     = "t3.medium"
}

# modules/kubernetes-cluster/outputs.tf
output "cluster_endpoint" {
  description = "API server endpoint URL"
  value       = aws_eks_cluster.main.endpoint
}

output "cluster_ca_certificate" {
  description = "Base64-encoded cluster CA certificate"
  value       = aws_eks_cluster.main.certificate_authority[0].data
  sensitive   = true
}
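
To show how the inputs and outputs above fit together, here is a sketch of a root configuration calling the module. The file path and the cluster name are hypothetical; the variable and output names are the ones defined above.

```hcl
# environments/staging/main.tf (hypothetical path)
module "cluster" {
  source = "../../modules/kubernetes-cluster"

  cluster_name  = "staging-eks" # illustrative value
  node_count    = 3             # a value below 2 is rejected by the validation block
  instance_type = "t3.medium"
}

output "cluster_endpoint" {
  value = module.cluster.cluster_endpoint
}

output "cluster_ca_certificate" {
  value     = module.cluster.cluster_ca_certificate
  sensitive = true # required because the module output is marked sensitive
}
```

Setting node_count = 1 here should make terraform plan fail with the error message from the validation block, before any resource is touched.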

3. Remote State & Locking

Storing state locally is only safe for solo projects. Teams need remote state with locking.

# versions.tf — configure the S3 backend
terraform {
  required_version = ">= 1.6"

  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "platform/eks/terraform.tfstate"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"  # prevents concurrent applies
  }
}
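💡 State locking is critical. Two engineers running terraform apply at the same time without a lock can corrupt state. Always use a backend that locks: S3 with a DynamoDB lock table on AWS, or GCS on GCP, which locks natively (turn on object versioning as well so earlier state can be recovered).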
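
Newer Terraform releases can also lock S3 state without a DynamoDB table. The sketch below assumes Terraform 1.10 or later, where, to my knowledge, the S3 backend gained the use_lockfile argument; check the backend documentation for your version before relying on it. Bucket, key, and region are the same values as in the backend block above.

```hcl
# versions.tf: S3 backend with S3-native locking (assumes Terraform >= 1.10)
terraform {
  required_version = ">= 1.10"

  backend "s3" {
    bucket       = "mycompany-terraform-state"
    key          = "platform/eks/terraform.tfstate"
    region       = "eu-west-1"
    encrypt      = true
    use_lockfile = true # lock file stored next to the state object, no DynamoDB table
  }
}
```

Changing backend settings requires re-running terraform init so the new configuration is picked up.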

Workspaces for environments

# Create and switch between environments
terraform workspace new staging
terraform workspace new production
terraform workspace select staging

# Reference workspace in code
resource "aws_instance" "app" {
  ami           = var.ami_id # required argument; variable assumed to be declared elsewhere
  instance_type = terraform.workspace == "production" ? "m5.xlarge" : "t3.medium"
}
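
Variable files are the other half of multi-environment setups: instead of branching on terraform.workspace inside the resource, each environment supplies its own values. A sketch, assuming one .tfvars file per environment; the file names are illustrative, and the three parts below live in three separate files:

```hcl
# --- variables.tf ---
variable "instance_type" {
  type        = string
  description = "EC2 instance type for the app server"
  default     = "t3.medium"
}

# --- staging.tfvars ---
instance_type = "t3.medium"

# --- production.tfvars ---
instance_type = "m5.xlarge"
```

The resource then uses instance_type = var.instance_type, and the environment is chosen at plan time, for example terraform workspace select production followed by terraform plan -var-file=production.tfvars -out=tfplan.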

4. Drift Detection

Drift = reality diverged from your Terraform state (someone applied a hotfix manually, or a cloud event changed a resource).

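# Detect drift: plan refreshes state and compares it with real infrastructure
terraform plan -detailed-exitcode
# exit code 0 = no changes, 1 = error, 2 = changes present
# (2 means drift only if the configuration itself is already fully applied)

# In CI: alert when drift is detected
# Capture the exit code so a shell running with `set -e` does not abort on code 2
exit_code=0
terraform plan -detailed-exitcode -input=false || exit_code=$?
if [ "$exit_code" -eq 2 ]; then
  echo "DRIFT DETECTED: infrastructure diverged from state"
  # Send alert to Slack / PagerDuty
elif [ "$exit_code" -ne 0 ]; then
  echo "terraform plan failed" >&2
  exit "$exit_code"
fi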

Schedule drift detection to run every 4 hours in CI to catch manual changes early.
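
Once drift is detected there are roughly three ways to remediate it: run terraform apply to revert the manual change, update the configuration to adopt it, or tell Terraform to stop tracking an attribute that is intentionally managed elsewhere. A sketch of the third option, assuming tags are owned by an external tagging tool; var.ami_id is a hypothetical variable:

```hcl
variable "ami_id" {
  type        = string
  description = "AMI for the app server (hypothetical input)"
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.medium"

  lifecycle {
    # Assumption: tags are written by a separate tool, so differences
    # in them should not be reported as drift or reverted on apply.
    ignore_changes = [tags]
  }
}
```

Use ignore_changes sparingly: every ignored attribute is one that scheduled drift detection can no longer see.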


5. Testing Terraform

Tool                    What it tests
────                    ─────────────
terraform validate      Syntax and type checking
terraform fmt -check    Code formatting
tflint                  Best practice linting (AWS/Azure/GCP rules)
checkov                 Security policy scanning (CIS benchmarks)
terratest               Integration tests: real infra, real assertions

# .github/workflows/terraform.yml
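- name: Init
  run: terraform init -input=false

- name: Format check
  run: terraform fmt -check -recursive

- name: Validate
  run: terraform validate

- name: Lint
  run: tflint --recursive

- name: Security scan
  run: checkov -d . --framework terraform --soft-fail  # reports findings without failing the build

- name: Plan
  run: terraform plan -input=false -out=tfplan

# On merge to main only:
- name: Apply
  if: github.ref == 'refs/heads/main'
  run: terraform apply -input=false tfplan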
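
Between linting and terratest there is also the built-in terraform test command (Terraform 1.6 and later), which runs plan-level checks written in HCL. The sketch below exercises the node_count validation from the module in section 2. The file path is hypothetical, and mock_provider assumes Terraform 1.7 or later, which is newer than the lesson's >= 1.6 constraint.

```hcl
# modules/kubernetes-cluster/tests/validation.tftest.hcl (hypothetical path)

# Mocked provider so the plan needs no AWS credentials (assumes Terraform >= 1.7)
mock_provider "aws" {}

run "rejects_single_node_cluster" {
  command = plan

  variables {
    cluster_name = "test"
    node_count   = 1
  }

  # The run passes only if the validation block on node_count fails
  expect_failures = [
    var.node_count,
  ]
}

run "uses_default_node_count" {
  command = plan

  variables {
    cluster_name = "test"
  }

  assert {
    condition     = var.node_count == 3
    error_message = "Default node_count should be 3."
  }
}
```

Running terraform test from the module directory picks up files in tests/ by default, so it can be added as one more CI step after Validate.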

6. Hands-on Exercise

  1. Write a Terraform module for an S3 bucket (or Azure Storage Account) with versioning and encryption enabled
  2. Add validation blocks on the bucket name and region variables
  3. Configure an S3/GCS remote backend with state locking
  4. Create staging and production workspaces with different instance sizes
  5. Add a GitHub Actions workflow that runs terraform plan on PRs and terraform apply on merge

Summary

Concept            Key takeaway
───────            ────────────
Core workflow      Init → Plan → Apply; always preview before applying
Modules            Input variables + outputs + validation = reusable, safe abstractions
Remote state       Use a remote backend with locking (S3 + DynamoDB, or GCS) in team environments
Workspaces         One set of configs, multiple environments via terraform.workspace
Drift detection    Schedule terraform plan in CI to catch manual changes early
