Kubernetes with Kubeadm: Fully Automated Installation with Terraform
Kubeadm is a Kubernetes distribution that lets you customize every major aspect of your cluster: container runtime, container network interface, cluster storage, and ingress. This flexibility comes at a price: you have to understand the individual options and their setup as well. For a complete overview of this remarkable distribution, see my previous article.
This article is a tutorial on creating a 3-node Kubernetes cluster. One node will be the control plane node, and the 2 others will be worker nodes. The particular Kubernetes components are etcd for the cluster state, containerd as the CRI, and Calico as the CNI. Let’s get started.
The complete project source code is also available at my GitHub repository kubernetes-kubeadm-terraform.
Prerequisites
To follow along, you will need a Hetzner Cloud account. Then, get an access token following the official documentation.
Also, choose appropriate server types for your setup. For this demonstration, I will use the following:
- 1x CX11 node (1 Intel CPU, 2GB RAM, 40GB SSD)
- 2x CX21 nodes (2 Intel CPU, 4GB RAM, 40GB SSD)
You also need to install Terraform: head over to terraform.io and grab the binary suitable for your computer.
Project Structure
With the following command, you will create a file structure that conforms with Terraform best practices.
TF_PROJECT_NAME=kops_github
mkdir $TF_PROJECT_NAME
tffiles=('main' 'variables' 'providers' 'versions' 'outputs'); for file in "${tffiles[@]}" ; do touch "$TF_PROJECT_NAME/$file".tf; done
Provider Configuration
We start by configuring the required providers inside versions.tf. Enter the following content, using the most up-to-date versions available:
//versions.tf
terraform {
required_providers {
hcloud = {
source = "hetznercloud/hcloud"
version = ">=1.35.2"
}
tls = {
source = "hashicorp/tls"
version = ">=4.0.3"
}
}
}
The provider configuration is slim: We just need to define a variable that will keep the Hetzner cloud access token...
//providers.tf
provider "hcloud" {
// The token is marked as sensitive in its variable declaration, not here
token = var.hcloud_token
}
... and the variable itself is defined in the same-named file.
//variables.tf
variable "hcloud_token" {
sensitive = true
}
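Secrets should never be hard-coded in Terraform files, because they would end up as plaintext in version control. Instead, pass them as environment variables to the terraform commands: a variable named TF_VAR_hcloud_token populates var.hcloud_token. Keep in mind that Terraform can still write sensitive values, such as the generated SSH private key, to the state file in plaintext, so protect the state file as well.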
Resources
We will create two different kinds of resources: An SSH key for accessing the servers, and servers with the role of either controller or worker.
SSH Key
To access the machines, we will use a dynamically created SSH key. The resource configuration is this:
//resources.tf
resource "tls_private_key" "generic-ssh-key" {
algorithm = "RSA"
rsa_bits = 4096
}
To use this key for SSH connections to the servers, we will use a local-exec provisioner, which runs a command on the machine that executes Terraform. We will grab the private and public key attributes from the resource and store them in files.
//resources.tf
resource "tls_private_key" "generic-ssh-key" {
//...
provisioner "local-exec" {
command = <<EOF
mkdir -p .ssh
cat <<< "${self.private_key_openssh}" > .ssh/id_rsa.key
cat <<< "${self.public_key_openssh}" > .ssh/id_rsa.pub
chmod 400 .ssh/id_rsa.key
chmod 400 .ssh/id_rsa.pub
EOF
}
}
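The server resources further down reference hcloud_ssh_key.primary-ssh-key, which registers the public key with Hetzner Cloud so that it is installed on new servers. Its definition is not shown in this article; a minimal sketch could look like the following, where the value of the name argument is an assumption:

```hcl
// resources.tf
// Sketch: uploads the generated public key to Hetzner Cloud.
// The key name "primary-ssh-key" is an assumed value, not taken from the project.
resource "hcloud_ssh_key" "primary-ssh-key" {
  name       = "primary-ssh-key"
  public_key = tls_private_key.generic-ssh-key.public_key_openssh
}
```

Because the public key is read from the tls_private_key resource, Terraform creates the key pair before it registers the key.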
Server Variables
In Terraform, you can write each resource definition out explicitly, or use meta-arguments such as for_each that iterate through a list of items and use their values. My project uses the latter to be more flexible and to handle any number of controller or worker nodes.
We need variables that define the lists of controllers and workers, and we need an object that holds the servers' configuration, like server size and OS image.
// variables.tf
variable "server_config" {
default = ({
controller_server_type = "cx21"
worker_server_type = "cpx21"
image = "debian-11"
k8s_controller_instances = ["controller"]
k8s_worker_instances = ["worker1", "worker2"]
})
}
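The variable above has no type constraint, so a misspelled key only surfaces when a resource tries to read it. As an optional variation, here is a sketch of the same variable with an explicit object type; the type block is my addition, and the default values are the ones shown above:

```hcl
// variables.tf
// Sketch: same variable with an explicit type constraint (an addition, not from the project).
variable "server_config" {
  type = object({
    controller_server_type   = string
    worker_server_type       = string
    image                    = string
    k8s_controller_instances = list(string)
    k8s_worker_instances     = list(string)
  })
  default = {
    controller_server_type   = "cx21"
    worker_server_type       = "cpx21"
    image                    = "debian-11"
    k8s_controller_instances = ["controller"]
    k8s_worker_instances     = ["worker1", "worker2"]
  }
}
```

With the type in place, a value that lacks one of these attributes is rejected when Terraform validates the variable.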
Server Resources
I like to think of the following resource definitions as generators: They consume variables iteratively.
The controller is this:
resource "hcloud_server" "controller" {
for_each = toset(var.server_config.k8s_controller_instances)
name = each.key
server_type = var.server_config.controller_server_type
image = var.server_config.image
location = "nbg1"
ssh_keys = [hcloud_ssh_key.primary-ssh-key.name]
}
And for the workers:
resource "hcloud_server" "worker" {
for_each = toset(var.server_config.k8s_worker_instances)
name = each.key
server_type = var.server_config.worker_server_type
image = var.server_config.image
location = "fsn1"
ssh_keys = [hcloud_ssh_key.primary-ssh-key.name]
}
Kubeadm Automation: Controller
Up to this point, the above declarations will merely provision the infrastructure. In order to install kubeadm on them, we need to run shell scripts.
In Terraform, a provisioner is additional logic that is invoked during the creation of resources. We will use a remote-exec provisioner to copy installation scripts to the servers and run them. This provisioner will use the SSH key that we created earlier.
The configuration inside the controller is this:
resource "hcloud_server" "controller" {
//...
connection {
type = "ssh"
user = "root"
private_key = tls_private_key.generic-ssh-key.private_key_openssh
host = self.ipv4_address
}
provisioner "remote-exec" {
scripts = [
"./bin/01_install.sh",
"./bin/02_kubeadm_init.sh"
]
}
}
The two scripts will install all necessary packages and then start the kubeadm cluster initialization. The content of these scripts is derived from the manual installation tutorial, which will be published later.
When the controller is provisioned, we need to get the kubeadm join command from it. This includes dynamically created join tokens. With the local-exec provisioner, we create a shell script inside this project, then connect to the controller via SSH, grab the join command, and append it to the shell script. Here is how:
resource "hcloud_server" "controller" {
//...
provisioner "local-exec" {
command = <<EOF
rm -rvf ./bin/03_kubeadm_join.sh
echo "echo 1 > /proc/sys/net/ipv4/ip_forward" > ./bin/03_kubeadm_join.sh
ssh root@${self.ipv4_address} -o StrictHostKeyChecking=no -i .ssh/id_rsa.key "kubeadm token create --print-join-command" >> ./bin/03_kubeadm_join.sh
EOF
}
}
Kubeadm Automation: Worker
The workers are provisioned in a very similar way: we copy and execute the installation script and the dynamically created kubeadm join shell script. Furthermore, since the workers need to wait for the controller to be fully provisioned, we add a depends_on relationship.
The complete configuration is this:
resource "hcloud_server" "worker" {
//...
depends_on = [
hcloud_server.controller
]
connection {
type = "ssh"
user = "root"
private_key = tls_private_key.generic-ssh-key.private_key_openssh
host = self.ipv4_address
}
provisioner "remote-exec" {
scripts = [
"./bin/01_install.sh",
"./bin/03_kubeadm_join.sh"
]
}
}
Cluster Provisioning
All Terraform resources are ready. Now we can start the provisioning.
export TF_VAR_hcloud_token=SECRET
terraform apply
#...
Terraform will perform the following actions:
# hcloud_server.controller["controller"] will be created
+ resource "hcloud_server" "controller" {
# hcloud_server.worker["worker1"] will be created
+ resource "hcloud_server" "worker" {
#...
Do you want to perform these actions in workspace "staging"?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
tls_private_key.generic-ssh-key: Creating...
tls_private_key.generic-ssh-key: Provisioning with 'local-exec'...
tls_private_key.generic-ssh-key (local-exec): (output suppressed due to sensitive value in config)
tls_private_key.generic-ssh-key: Creation complete after 2s [id=763061a58dc3b7c5b9c09b76e4ddbf4cdb52c3ab]
hcloud_ssh_key.primary-ssh-key: Creating...
hcloud_ssh_key.primary-ssh-key: Creation complete after 2s [id=8482290]
hcloud_server.controller["controller"]: Creating...
hcloud_server.controller["controller"]: Still creating... [10s elapsed]
hcloud_server.controller["controller"]: Provisioning with 'remote-exec'...
hcloud_server.controller["controller"] (remote-exec): Connecting to remote host via SSH...
#...
Run terraform show to see the controller's IP address, then grab the kubeconfig file via SSH:
ssh root@${CONTROLLER_IP} -o StrictHostKeyChecking=no -i .ssh/id_rsa.key "cat /root/.kube/config"
And with this, you can operate the cluster with kubectl:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
controller-staging Ready control-plane,master 22m v1.23.11
worker1-staging Ready <none> 49s v1.23.11
worker2-staging Ready <none> 5m51s v1.23.11
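Looking up the controller's IP address in the terraform show output works, but the outputs.tf file created at the start could expose the addresses directly. A minimal sketch, assuming the output names controller_ipv4 and worker_ipv4 (both hypothetical, not part of the original project):

```hcl
// outputs.tf
// Sketch: maps each server name to its public IPv4 address.
// The output names are assumptions chosen for illustration.
output "controller_ipv4" {
  value = { for name, server in hcloud_server.controller : name => server.ipv4_address }
}

output "worker_ipv4" {
  value = { for name, server in hcloud_server.worker : name => server.ipv4_address }
}
```

After terraform apply, running terraform output controller_ipv4 prints the map of controller names to addresses.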
Conclusion
This article showed how to use Terraform for provisioning a kubeadm cluster. We started with the configuration of the Terraform providers, then created a variable that holds the cluster configuration, and continued with the resource definitions for the controller and workers. When this Terraform project runs, it creates the controller first, grabs its kubeadm join command, provisions the workers, and joins them into a cluster. After about 2 minutes, your cluster is fully available. Copy the kubeconfig to your machine and start working with your cluster.