Cloud Engineering

Provisioning an EKS Cluster with Terraform from Scratch

Intermediate · 60 min to complete · 14 min read

Build a production-ready EKS cluster with Terraform: VPC with private subnets, managed node groups, IRSA for pod IAM, and OIDC provider — all in reproducible, reviewable infrastructure code.

Before you begin

  • Terraform >= 1.6 installed
  • AWS CLI configured with sufficient IAM permissions
  • kubectl installed
  • Basic Terraform knowledge (init, plan, apply)

AWS · EKS · Terraform · Kubernetes · Infrastructure as Code

Creating an EKS cluster through the AWS console is fine for learning. Doing it with Terraform means you can reproduce it, review changes before applying, and destroy it cleanly. This tutorial builds a cluster you'd actually run in production.

What You'll Build

VPC
├── 3 public subnets (one per AZ) — load balancers
├── 3 private subnets (one per AZ) — EKS nodes
└── NAT Gateways (one per AZ) — outbound internet from private subnets

EKS Cluster (Kubernetes 1.30)
├── Managed node group — 2–10 nodes, t3.medium
├── OIDC provider — enables IRSA for pods
├── aws-vpc-cni add-on — pod networking
├── coredns add-on — cluster DNS
└── kube-proxy add-on — service networking

Step 1: Project Structure

bash
mkdir eks-cluster && cd eks-cluster

# Create files
touch main.tf vpc.tf eks.tf outputs.tf variables.tf versions.tf

Step 2: versions.tf — Provider Pins

hcl
# versions.tf
terraform {
  required_version = ">= 1.6"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.25"
    }
    # Used by the tls_certificate data source (OIDC thumbprint) in eks.tf
    tls = {
      source  = "hashicorp/tls"
      version = "~> 4.0"
    }
  }

  # Remote state — replace with your bucket
  backend "s3" {
    bucket = "my-terraform-state-bucket"
    key    = "eks/terraform.tfstate"
    region = "ap-south-1"
  }
}
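
Note that the state bucket has to exist before terraform init will succeed. If you don't have one yet, a minimal one-time bootstrap with the AWS CLI looks like this (the bucket name is a placeholder; S3 names are globally unique, so pick your own):

bash
# One-time bootstrap: create the state bucket outside Terraform
aws s3api create-bucket \
  --bucket my-terraform-state-bucket \
  --region ap-south-1 \
  --create-bucket-configuration LocationConstraint=ap-south-1

# Versioning lets you roll back to an earlier state file if something goes wrong
aws s3api put-bucket-versioning \
  --bucket my-terraform-state-bucket \
  --versioning-configuration Status=Enabled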

Step 3: variables.tf

hcl
# variables.tf
variable "cluster_name" {
  description = "EKS cluster name"
  type        = string
  default     = "my-cluster"
}

variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "ap-south-1"
}

variable "kubernetes_version" {
  description = "Kubernetes version"
  type        = string
  default     = "1.30"
}

variable "node_instance_type" {
  description = "EC2 instance type for worker nodes"
  type        = string
  default     = "t3.medium"
}

variable "node_min_size" {
  type    = number
  default = 2
}

variable "node_max_size" {
  type    = number
  default = 10
}

variable "node_desired_size" {
  type    = number
  default = 3
}
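
The defaults above work for a first cluster. If you want different values without editing variables.tf, put them in a terraform.tfvars file next to it; the values below are purely illustrative:

hcl
# terraform.tfvars: example overrides (values are illustrative)
cluster_name       = "payments-prod"
node_instance_type = "t3.large"
node_desired_size  = 4
node_max_size      = 12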

Step 4: main.tf — AWS Provider

hcl
# main.tf
provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      ManagedBy   = "terraform"
      Cluster     = var.cluster_name
      Environment = "production"
    }
  }
}

# Data sources
data "aws_availability_zones" "available" {
  state = "available"
}

data "aws_caller_identity" "current" {}

Step 5: vpc.tf — Network Foundation

hcl
# vpc.tf
locals {
  azs           = slice(data.aws_availability_zones.available.names, 0, 3)
  vpc_cidr      = "10.0.0.0/16"
  public_cidrs  = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  private_cidrs = ["10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24"]
}

# VPC
resource "aws_vpc" "main" {
  cidr_block           = local.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "${var.cluster_name}-vpc"
    # Required for EKS to discover the VPC
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
  }
}

# Internet Gateway
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
  tags   = { Name = "${var.cluster_name}-igw" }
}

# Public subnets — for load balancers
resource "aws_subnet" "public" {
  count             = 3
  vpc_id            = aws_vpc.main.id
  cidr_block        = local.public_cidrs[count.index]
  availability_zone = local.azs[count.index]

  map_public_ip_on_launch = true

  tags = {
    Name                     = "${var.cluster_name}-public-${count.index + 1}"
    "kubernetes.io/role/elb" = "1" # Required for AWS Load Balancer Controller
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
  }
}

# Private subnets — for EKS nodes
resource "aws_subnet" "private" {
  count             = 3
  vpc_id            = aws_vpc.main.id
  cidr_block        = local.private_cidrs[count.index]
  availability_zone = local.azs[count.index]

  tags = {
    Name                              = "${var.cluster_name}-private-${count.index + 1}"
    "kubernetes.io/role/internal-elb" = "1" # For internal load balancers
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
  }
}

# Elastic IPs for NAT Gateways
resource "aws_eip" "nat" {
  count  = 3
  domain = "vpc"
  tags   = { Name = "${var.cluster_name}-nat-eip-${count.index + 1}" }
}

# NAT Gateways — one per AZ for HA
resource "aws_nat_gateway" "main" {
  count         = 3
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
  tags          = { Name = "${var.cluster_name}-nat-${count.index + 1}" }
  depends_on    = [aws_internet_gateway.main]
}

# Route tables
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = { Name = "${var.cluster_name}-public-rt" }
}

resource "aws_route_table" "private" {
  count  = 3
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }

  tags = { Name = "${var.cluster_name}-private-rt-${count.index + 1}" }
}

# Route table associations
resource "aws_route_table_association" "public" {
  count          = 3
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = 3
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}

Step 6: eks.tf — Cluster and Node Groups

hcl
# eks.tf

# IAM role for the EKS control plane
resource "aws_iam_role" "eks_cluster" {
  name = "${var.cluster_name}-cluster-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "eks.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "eks_cluster_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.eks_cluster.name
}

# Security group for the cluster API endpoint
resource "aws_security_group" "cluster" {
  name        = "${var.cluster_name}-cluster-sg"
  description = "EKS cluster security group"
  vpc_id      = aws_vpc.main.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = { Name = "${var.cluster_name}-cluster-sg" }
}

# EKS Cluster
resource "aws_eks_cluster" "main" {
  name     = var.cluster_name
  role_arn = aws_iam_role.eks_cluster.arn
  version  = var.kubernetes_version

  vpc_config {
    subnet_ids              = concat(aws_subnet.private[*].id, aws_subnet.public[*].id)
    security_group_ids      = [aws_security_group.cluster.id]
    endpoint_private_access = true
    endpoint_public_access  = true            # Set to false and use VPN for production
    public_access_cidrs     = ["0.0.0.0/0"]   # Restrict to your office IP for production
  }

  enabled_cluster_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"]

  depends_on = [aws_iam_role_policy_attachment.eks_cluster_policy]
}

# OIDC Provider — enables IRSA (IAM Roles for Service Accounts)
data "tls_certificate" "eks" {
  url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}

resource "aws_iam_openid_connect_provider" "eks" {
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.eks.certificates[0].sha1_fingerprint]
  url             = aws_eks_cluster.main.identity[0].oidc[0].issuer
}

# IAM role for worker nodes
resource "aws_iam_role" "node_group" {
  name = "${var.cluster_name}-node-group-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "node_group_policies" {
  for_each = toset([
    "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy",
    "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy",
    "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
  ])

  policy_arn = each.value
  role       = aws_iam_role.node_group.name
}

# Managed Node Group
resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "${var.cluster_name}-main"
  node_role_arn   = aws_iam_role.node_group.arn
  subnet_ids      = aws_subnet.private[*].id

  instance_types = [var.node_instance_type]
  ami_type       = "AL2_x86_64"
  capacity_type  = "ON_DEMAND"

  scaling_config {
    desired_size = var.node_desired_size
    min_size     = var.node_min_size
    max_size     = var.node_max_size
  }

  update_config {
    max_unavailable = 1
  }

  labels = {
    role = "general"
  }

  depends_on = [aws_iam_role_policy_attachment.node_group_policies]

  lifecycle {
    ignore_changes = [scaling_config[0].desired_size] # Let Cluster Autoscaler manage this
  }
}

# Core EKS Add-ons
resource "aws_eks_addon" "vpc_cni" {
  cluster_name                = aws_eks_cluster.main.name
  addon_name                  = "vpc-cni"
  resolve_conflicts_on_update = "PRESERVE"
  depends_on                  = [aws_eks_node_group.main]
}

resource "aws_eks_addon" "coredns" {
  cluster_name                = aws_eks_cluster.main.name
  addon_name                  = "coredns"
  resolve_conflicts_on_update = "PRESERVE"
  depends_on                  = [aws_eks_node_group.main]
}

resource "aws_eks_addon" "kube_proxy" {
  cluster_name                = aws_eks_cluster.main.name
  addon_name                  = "kube-proxy"
  resolve_conflicts_on_update = "PRESERVE"
  depends_on                  = [aws_eks_node_group.main]
}

Step 7: outputs.tf

hcl
# outputs.tf
output "cluster_name" {
  value = aws_eks_cluster.main.name
}

output "cluster_endpoint" {
  value = aws_eks_cluster.main.endpoint
}

output "cluster_certificate_authority" {
  value     = aws_eks_cluster.main.certificate_authority[0].data
  sensitive = true
}

output "oidc_provider_arn" {
  value = aws_iam_openid_connect_provider.eks.arn
}

output "oidc_provider_url" {
  value = replace(aws_eks_cluster.main.identity[0].oidc[0].issuer, "https://", "")
}

output "configure_kubectl" {
  value = "aws eks update-kubeconfig --region ${var.aws_region} --name ${var.cluster_name}"
}

Step 8: Apply

bash
terraform init

# Check the plan before applying
terraform plan -out=tfplan

# Review: expect roughly 35–40 resources to create
# Apply
terraform apply tfplan

The apply takes 12–20 minutes; most of that time is the EKS control plane coming up.
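
While you wait, you can poll the control plane state from another terminal; the status flips from CREATING to ACTIVE once the endpoint is up. The region and cluster name below assume the defaults from variables.tf:

bash
# Returns CREATING while provisioning, ACTIVE once the control plane is ready
aws eks describe-cluster \
  --region ap-south-1 \
  --name my-cluster \
  --query "cluster.status" \
  --output text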

Step 9: Connect kubectl

bash
$(terraform output -raw configure_kubectl)

kubectl get nodes
# NAME                          STATUS   ROLES    AGE   VERSION
# ip-10-0-11-xxx.ap-south-1.compute.internal   Ready    <none>   5m    v1.30.x
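
It's also worth checking that the add-ons from Step 6 came up:

bash
kubectl get pods -n kube-system
# Expect aws-node (vpc-cni), coredns and kube-proxy pods in Running state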

Step 10: Use the OIDC Provider for IRSA

The OIDC provider ARN is in terraform output oidc_provider_arn. Use it to create IAM roles for pods (see the AWS IRSA tutorial for the full workflow).

bash
OIDC_PROVIDER_ARN=$(terraform output -raw oidc_provider_arn)
OIDC_PROVIDER_URL=$(terraform output -raw oidc_provider_url)
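
If you'd rather define such a role directly in this Terraform config instead of wiring it up externally, a minimal sketch of the trust policy is below; the role name, the default namespace, and the app service account name are placeholders to replace with your workload's details.

hcl
# Example IRSA role: only the "app" service account in the "default"
# namespace can assume it (both names are placeholders)
locals {
  oidc_url = replace(aws_eks_cluster.main.identity[0].oidc[0].issuer, "https://", "")
}

resource "aws_iam_role" "app_irsa" {
  name = "${var.cluster_name}-app-irsa"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRoleWithWebIdentity"
      Principal = { Federated = aws_iam_openid_connect_provider.eks.arn }
      Condition = {
        StringEquals = {
          "${local.oidc_url}:sub" = "system:serviceaccount:default:app"
          "${local.oidc_url}:aud" = "sts.amazonaws.com"
        }
      }
    }]
  })
}

Attach whatever permissions the workload needs to this role, then annotate its Kubernetes service account with eks.amazonaws.com/role-arn pointing at the role's ARN.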

Tear Down

bash
# Scale down node group first (faster)
terraform destroy -target=aws_eks_node_group.main

# Then destroy everything else
terraform destroy

NAT Gateways are expensive — make sure you destroy the cluster when you're done with it.
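
A quick sanity check once the destroy finishes, assuming the default region; both commands should report nothing left running:

bash
aws eks list-clusters --region ap-south-1

# NAT Gateways keep billing until they are fully deleted
aws ec2 describe-nat-gateways \
  --region ap-south-1 \
  --query "NatGateways[?State=='available'].NatGatewayId"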

We built Podscape to simplify Kubernetes workflows like this — logs, events, and cluster state in one interface, without switching tools.
