Module 5: Terraform CI/CD Environments and Production Workflows on Azure
Overview
Module 5 completes the Azure Terraform: From Zero to Production series by shifting focus from writing Terraform to operating Terraform safely in real-world Azure environments.
While HashiCorp Configuration Language (HCL) enables infrastructure definition, production readiness requires far more:
- Strong separation between environments
- Automated CI/CD delivery pipelines
- Secure authentication
- Change validation and policy enforcement
- Resilience against failure and human error
By the end of this module, you move from:
“I can write Terraform”
to
“I can run Terraform in production with confidence.”
Learning Objectives
After completing this module, you will be able to:
- Design automated Terraform CI/CD pipelines for Azure
- Implement environment isolation (dev, staging, production)
- Use Azure remote state backends securely and correctly
- Promote infrastructure changes safely between environments
- Enforce production guardrails using policy-as-code
- Detect and manage configuration drift
- Apply Terraform using zero-downtime and resilient workflows
1. Designing Automated Terraform CI/CD Pipelines
Automation is the foundation of modern DevOps. In production Azure environments, manual Terraform runs are a liability, not a convenience.
Terraform CI/CD pipelines replace error-prone manual runs and portal “ClickOps” with repeatable, auditable workflows built on tools such as GitHub Actions or Azure Pipelines.
Continuous Integration (CI)
Every push or pull request should trigger automated validation:
- terraform fmt – Enforces consistent formatting
- terraform validate – Verifies syntax and internal consistency
- tflint – Enforces provider and Azure best practices
- Security scanning:
- Checkov
- tfsec
These checks prevent misconfigurations and vulnerabilities before infrastructure is modified.
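As a rough illustration of the tflint step, a .tflint.hcl file might enable the Azure ruleset alongside the built-in Terraform rules. This is only a sketch: the file location and the plugin version are assumptions, and the version should be replaced with a published release of the ruleset.

```hcl
# .tflint.hcl (assumed location: repository root)

# Built-in Terraform language rules.
plugin "terraform" {
  enabled = true
  preset  = "recommended"
}

# Azure-specific rules from the tflint-ruleset-azurerm plugin.
# The version below is a placeholder; pin it to a published release.
plugin "azurerm" {
  enabled = true
  version = "0.27.0"
  source  = "github.com/terraform-linters/tflint-ruleset-azurerm"
}
```

Running tflint --init before the lint step downloads the configured plugins.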
The Planning Phase
In production workflows, terraform plan is not optional.
Best practice is to generate a saved plan artifact:
terraform plan -out=tfplan.out
Benefits:
- Produces an immutable record of intended changes
- Allows teams to review exactly what will be applied
- Prevents drift between plan and apply stages
In GitHub Actions, tools like tf-summarize can post a readable plan summary directly into the pull request.
Continuous Deployment (CD)
Once a plan is approved and merged:
terraform apply tfplan.out
Using the previously generated plan ensures:
- No unreviewed changes
- No surprise diffs
- Full traceability between code, plan, and apply
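A saved plan is tied to the Terraform and provider versions that produced it, so the plan and apply jobs need to run with matching versions. One way to encourage that is to pin both in configuration and commit the generated .terraform.lock.hcl file. The constraints below are illustrative assumptions, not recommendations.

```hcl
terraform {
  # Illustrative constraint; use the release your team has validated.
  required_version = "~> 1.9"

  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      # Illustrative constraint; the lock file records the exact version selected.
      version = "~> 4.0"
    }
  }
}
```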
Secure Authentication
Never store Azure credentials in source control.
Recommended options:
- OpenID Connect (OIDC) with federated credentials (best practice)
- Azure service principal with a client secret or certificate (fallback)
OIDC allows GitHub Actions to authenticate dynamically with Azure, eliminating long-lived secrets entirely.
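On the Terraform side, OIDC can be switched on in the azurerm provider block. The following is a minimal sketch; the variable names are hypothetical, and it assumes a federated credential has already been configured for the pipeline's identity in Azure.

```hcl
# Hypothetical input variables; none of these values are secrets.
variable "client_id" {
  type        = string
  description = "Application (client) ID of the identity the pipeline uses."
}

variable "tenant_id" {
  type        = string
  description = "Azure AD tenant ID."
}

variable "subscription_id" {
  type        = string
  description = "Target Azure subscription ID."
}

provider "azurerm" {
  features {}

  # Exchange the CI job's OIDC token for an Azure token
  # instead of authenticating with a stored client secret.
  use_oidc        = true
  client_id       = var.client_id
  tenant_id       = var.tenant_id
  subscription_id = var.subscription_id
}
```

The same settings can be supplied through the ARM_USE_OIDC, ARM_CLIENT_ID, ARM_TENANT_ID, and ARM_SUBSCRIPTION_ID environment variables. In GitHub Actions, the workflow also needs the id-token: write permission so that the job can request a token.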
2. Environment Separation and Isolation
Sharing a Terraform state file between environments is a high-risk anti-pattern.
A staging mistake should never be able to destroy production resources.
Directory-Based Environment Isolation
The most robust approach in Azure is directory-level separation:
/environments
  /dev
  /staging
  /prod
Each environment should have:
- Its own backend configuration
- Its own state file
- Ideally, its own Azure subscription or resource group boundary
This creates hard blast-radius containment.
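As a sketch of what a per-environment backend could look like, the production directory might carry a backend block like the one below. The file path, resource group, storage account, and key names are all hypothetical.

```hcl
# environments/prod/backend.tf (hypothetical path and names)
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate-prod"        # hypothetical
    storage_account_name = "sttfstateprod"          # hypothetical; must be globally unique
    container_name       = "tfstate"                # hypothetical
    key                  = "prod.terraform.tfstate" # one state blob per environment

    # Authenticate to the blob with Azure AD rather than a storage access key.
    use_azuread_auth = true
  }
}
```

The dev and staging directories would each hold their own copy with different values. Backend blocks cannot reference variables, so the values are either written literally or supplied at terraform init time with -backend-config.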
Terraform Workspaces (and Their Limits)
Terraform workspaces provide lightweight isolation, but they:
- Share a single backend and its credentials
- Are easy to misuse (selecting the wrong workspace targets the wrong environment)
- Do not enforce access boundaries
Workspaces are suitable for experiments, not for production Azure estates.
Promoting Changes Between Environments
Production workflows should promote versioned modules, not raw code changes.
Typical flow:
- Develop and test module in Dev
- Promote tagged module version to Staging
- Validate manually or via automated tests
- Update module version reference in Production
This mirrors application release pipelines and dramatically reduces risk.
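The promotion flow above can be sketched as a module call pinned to a tagged release. The repository URL, module name, tag, and input below are hypothetical.

```hcl
# environments/staging/main.tf (hypothetical path)
module "network" {
  # Hypothetical repository; the ref pins a tagged module release.
  source = "git::https://example.com/platform/terraform-azurerm-network.git?ref=v1.4.0"

  # Hypothetical module input.
  environment = "staging"
}
```

Under this approach, production keeps pointing at the previous tag until staging has been validated. Promotion is then a small, reviewable change to the ref in the production directory.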
3. Production-Grade Terraform State Management on Azure
The Terraform state file is a sensitive operational artifact, not just metadata.
In production, state must be:
- Remote
- Secure
- Locked
- Auditable
Azure Blob Storage Backend
Azure Blob Storage is the recommended backend for Terraform on Azure.
Key features:
- Centralised storage
- Native state locking (implemented with blob leases)
- Integration with Azure RBAC
- Encryption at rest
State Locking
State locking prevents concurrent operations that write to state, such as two simultaneous terraform apply runs.
Think of it as a digital handshake:
- Only one pipeline or operator can modify state at a time
- Prevents corruption and race conditions
Encryption and Security
Terraform state often contains secrets in plaintext.
Ensure:
- Server-side encryption (SSE) is in place (Azure Storage encrypts data at rest by default)
- Customer-managed keys (CMK) are used for sensitive environments
- Access is restricted using Azure RBAC
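A minimal sketch of the RBAC point follows, assuming the state storage account is itself managed with Terraform in a separate bootstrap configuration. All names, the region, and the pipeline_principal_id variable are hypothetical, and customer-managed keys are left out for brevity.

```hcl
variable "pipeline_principal_id" {
  type        = string
  description = "Object ID of the identity the pipeline runs as (hypothetical input)."
}

resource "azurerm_resource_group" "tfstate" {
  name     = "rg-tfstate-prod" # hypothetical
  location = "uksouth"         # hypothetical
}

resource "azurerm_storage_account" "tfstate" {
  name                     = "sttfstateprod" # hypothetical; must be globally unique
  resource_group_name      = azurerm_resource_group.tfstate.name
  location                 = azurerm_resource_group.tfstate.location
  account_tier             = "Standard"
  account_replication_type = "GRS"
  min_tls_version          = "TLS1_2"
}

# Grant the pipeline identity data-plane access to blobs in this account only.
resource "azurerm_role_assignment" "pipeline_state_access" {
  scope                = azurerm_storage_account.tfstate.id
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = var.pipeline_principal_id
}
```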
Understanding Lineage and Serial
- Lineage: Unique identifier for the lifetime of a state file
- Serial: Incrementing version number for each state change
Knowing these fields helps with recovery and forensic troubleshooting.
4. Advanced Production Workflows and Resilience
Production Terraform is not just about creating resources—it’s about changing them safely.
Zero-Downtime Deployments with lifecycle
Use lifecycle rules to avoid outages:
lifecycle {
  create_before_destroy = true
}
When a change forces replacement, Terraform will:
- Create the replacement resource
- Update dependent resources to point at it
- Destroy the old resource only after the replacement succeeds
This is essential for:
- App Services
- Load balancers
- Critical network components
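In context, the lifecycle block sits inside a resource. The sketch below uses a network security group as an example of a network component. The variable names and naming scheme are hypothetical. Because Azure will not allow two resources with the same name in one resource group, the name includes a value that changes with each replacement, so the old and new resources can coexist briefly.

```hcl
variable "location" {
  type = string
}

variable "resource_group_name" {
  type = string
}

variable "revision" {
  type        = string
  description = "Hypothetical suffix that changes whenever a replacement is intended."
}

resource "azurerm_network_security_group" "web" {
  # Changing the name forces replacement; the suffix avoids a name clash
  # while both the old and new resources exist.
  name                = "nsg-web-${var.revision}"
  location            = var.location
  resource_group_name = var.resource_group_name

  lifecycle {
    create_before_destroy = true
  }
}
```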
Policy as Code (Guardrails)
Policy enforcement prevents dangerous changes before they reach Azure.
Common approaches:
- Open Policy Agent (OPA)
- HCP Terraform (formerly Terraform Cloud) with Sentinel
- Azure Policy integration
Example policy rules:
- Deny storage accounts that allow non-HTTPS traffic
- Enforce mandatory cost tags
- Restrict regions or VM SKUs
Violations should fail the pipeline automatically.
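Terraform's built-in variable validation is not one of the tools listed above, and it does not replace them. It can still express simple versions of the region and tagging rules directly in HCL. The following sketch uses hypothetical region names and tag keys.

```hcl
variable "location" {
  type        = string
  description = "Azure region for all resources."

  validation {
    # Hypothetical allow-list of approved regions.
    condition     = contains(["uksouth", "ukwest"], var.location)
    error_message = "Location must be one of the approved regions: uksouth, ukwest."
  }
}

variable "tags" {
  type        = map(string)
  description = "Tags applied to every resource."

  validation {
    # Hypothetical mandatory tag keys.
    condition     = alltrue([for k in ["cost_centre", "owner"] : contains(keys(var.tags), k)])
    error_message = "Tags must include the cost_centre and owner keys."
  }
}
```

A violated condition causes terraform plan to fail with the given message, which in turn fails the pipeline.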
Refactoring Safely with moved Blocks
When restructuring Terraform code:
moved {
  from = azurerm_resource_group.old
  to   = azurerm_resource_group.new
}
This allows you to rename or reorganise resources without destroying live infrastructure.
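The same mechanism covers moving a resource into a module. This sketch assumes a hypothetical local module named core that declares azurerm_resource_group.main. Note that moved blocks require Terraform 1.1 or later.

```hcl
# Hypothetical module that now declares azurerm_resource_group.main.
module "core" {
  source = "./modules/core"
}

# Tell Terraform that the existing resource now lives inside the module,
# so the state entry is re-addressed rather than destroyed and recreated.
moved {
  from = azurerm_resource_group.main
  to   = module.core.azurerm_resource_group.main
}
```

The next terraform plan should report the resource as moved rather than as a destroy followed by a create.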
Drift Detection
Drift occurs when changes are made outside Terraform (e.g. Azure Portal).
Detect drift with scheduled jobs:
terraform plan -refresh-only
This identifies differences without applying changes and keeps infrastructure honest.
Analogy: The Production Flight Plan
Running Terraform in production is like flying an aircraft:
- Terraform code is the flight route
- CI/CD pipelines are the pre-flight checks
- Terraform plan is the flight plan preview
- Apply is flying the aircraft while passengers are onboard
Environment isolation acts as bulkheads—ensuring a failure in one system doesn’t bring down the entire fleet.
Final Outcome
After Module 5, you will have:
- A secure, automated Terraform delivery pipeline
- Clear environment boundaries
- Confidence operating Terraform in production Azure environments
- The skills required for enterprise and regulated cloud platforms
This module completes the journey from learning Terraform to running Terraform professionally.
