Creating a Lokomotive cluster in a bare metal environment using Tinkerbell
Continuing our journey with Tinkerbell
In our previous blog post we described an example workflow of how Flatcar Container Linux can be provisioned using Tinkerbell in bare metal environments.
Today, we will take this idea further and explain how you can create a Kubernetes cluster in a bare metal environment using Lokomotive and Tinkerbell.
This guide uses libvirt to create virtual machines which will run a Tinkerbell server and the Kubernetes cluster nodes that Tinkerbell will provision. Using libvirt allows easy, isolated local testing without any cloud accounts or costs involved.
Why Lokomotive?
To support the growing popularity of Tinkerbell, we decided to make it one of the supported platforms in Lokomotive, giving Tinkerbell users a well-integrated option for production-grade Kubernetes cluster management.
Lokomotive is a self-hosted, upstream Kubernetes distribution with strong security defaults and frictionless updates.
Steps towards Tinkerbell support in Lokomotive
In order to bring Tinkerbell integration to Lokomotive, we had to do some preparatory work, which will also benefit the Tinkerbell community.
Terraform provider for Tinkerbell
Lokomotive uses Terraform underneath to manage cluster infrastructure, so one of the requirements was a Terraform provider for Tinkerbell which would allow us to declaratively manage hardware entries, provisioning templates and workflows.
As there is currently no official Terraform provider for Tinkerbell and we couldn't find any existing implementation to contribute to, we decided to implement one ourselves.
The provider code currently lives in the kinvolk/terraform-provider-tinkerbell repository on GitHub, and we are working closely with the Tinkerbell community to align the project with their best practices and to transfer it to the Tinkerbell organization, so it can become an official project.
Support for automated testing in sandbox environment
To ensure that Tinkerbell support in Lokomotive remains fully functional, we had to implement automated tests for it.
Currently, the recommended way of installing Tinkerbell is via the tinkerbell/sandbox project. However, it still requires some manual steps during setup, so we had to modify it a bit to support full automation: after running terraform apply, you get a fully functional Tinkerbell stack.
We plan to contribute our modifications upstream.
Contributions to Tinkerbell
While working on Tinkerbell, we found some issues and opportunities for improvement across the Tinkerbell codebase and documentation, which we reported upstream so that all users can benefit from them.
Creating a local Lokomotive cluster using the experimental Tinkerbell sandbox and libvirt
To make trying out Lokomotive with Tinkerbell easier, we will be using the Tinkerbell sandbox, which will run on libvirt virtual machines.
We created this setup for the Lokomotive continuous integration process, so we can ensure that our integration with Tinkerbell remains functional.
Cluster creation will be done using lokoctl, the Lokomotive CLI tool for managing clusters and components.
Prerequisites
Before we get started, we need to install some dependencies, described in the next sections.
Local development tools
As Tinkerbell support in Lokomotive has not yet been released, we need to build the lokoctl binary ourselves. To do that, the following essential development tools must be installed on your machine:
git
make
go
wget
bunzip2
Please refer to your OS documentation to learn how to install and configure those.
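To quickly confirm that all of them are available, you can run a small check like the one below; it is only a minimal sketch, and the actual installation commands depend on your distribution:

for tool in git make go wget bunzip2; do command -v "$tool" >/dev/null || echo "missing: $tool"; done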
Lokoctl binary
To build the lokoctl binary, first clone the kinvolk/lokomotive repository using the following command:
git clone --branch invidian/tinkerbell-platform https://github.com/kinvolk/lokomotive.git && cd lokomotive
Then, build lokoctl using the following command:
make
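Assuming the Makefile places the resulting lokoctl binary in the repository root, you can do a quick sanity check with:

./lokoctl help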
Terraform binary
lokoctl depends on the terraform binary, which must be installed beforehand.
Please see Terraform documentation to learn how to install it in your system.
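Once installed, a quick way to confirm that the binary is available in your PATH is:

terraform version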
libvirt
As we will be using libvirt to create virtual machines, it must be installed and configured on your system.
Please refer to your OS documentation to learn how to install it.
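To confirm that libvirt is running and that your user can talk to the system daemon, you can list the defined virtual machines; an empty list is expected at this point:

virsh --connect qemu:///system list --all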
Terraform provider for libvirt
As terraform-provider-libvirt, which we use for managing libvirt resources, has not yet been published to the Terraform Registry, we must download it manually as well. This can be done using the following commands:
export LIBVIRT_VER="0.6.2" LIBVIRT_VER_FULL="0.6.2+git.1585292411.8cbe9ad0" && \
wget https://github.com/dmacvicar/terraform-provider-libvirt/releases/download/v"$LIBVIRT_VER"/terraform-provider-libvirt-"$LIBVIRT_VER_FULL".Ubuntu_18.04.amd64.tar.gz && \
tar xzf terraform-provider-libvirt-"$LIBVIRT_VER_FULL".Ubuntu_18.04.amd64.tar.gz && \
mkdir -p ~/.local/share/terraform/plugins/registry.terraform.io/dmacvicar/libvirt/"$LIBVIRT_VER"/linux_amd64/ && \
mv terraform-provider-libvirt ~/.local/share/terraform/plugins/registry.terraform.io/dmacvicar/libvirt/"$LIBVIRT_VER"/linux_amd64/terraform-provider-libvirt && \
rm terraform-provider-libvirt-"$LIBVIRT_VER_FULL".Ubuntu_18.04.amd64.tar.gz
This will download the required version of the provider, unpack it and place it in a directory where Terraform can discover it.
Terraform provider for Tinkerbell
As the Tinkerbell Terraform provider has not been officially released yet either, we need to compile it from source as well.
First, clone its source using the following command:
git clone https://github.com/kinvolk/terraform-provider-tinkerbell.git && cd terraform-provider-tinkerbell
Then we build it:
make build
And we make it available for Terraform:
mkdir -p ~/.local/share/terraform/plugins/registry.terraform.io/tinkerbell/tinkerbell/0.1.0/linux_amd64/ && \
mv terraform-provider-tinkerbell ~/.local/share/terraform/plugins/registry.terraform.io/tinkerbell/tinkerbell/0.1.0/linux_amd64/
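Optionally, you can verify that both providers ended up in the local plugin directories that Terraform scans:

ls ~/.local/share/terraform/plugins/registry.terraform.io/dmacvicar/libvirt/*/linux_amd64/ ~/.local/share/terraform/plugins/registry.terraform.io/tinkerbell/tinkerbell/*/linux_amd64/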
Flatcar Container Linux operating system image
To use the Tinkerbell sandbox, we need to download and unpack a Flatcar Container Linux image. This way, we can skip installing the OS on the provisioner machine.
This can be done using the following commands:
wget https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_qemu_image.img.bz2
bunzip2 flatcar_production_qemu_image.img.bz2
After you download the image, get its absolute path using the following command:
realpath flatcar_production_qemu_image.img
It will be needed in the configuration process.
Cluster configuration
Once we have all dependencies in place, we can configure our cluster.
Lokomotive uses the HashiCorp Configuration Language (HCL) for defining cluster configuration in a declarative way, which can be version controlled and is easy to customize.
Basic configuration
To create a basic configuration for a Tinkerbell cluster, create a file named cluster.lokocfg with the following contents:
cluster "tinkerbell" {
asset_dir = "./assets"
name = "demo"
dns_zone = "example.com"
ssh_public_keys = ["ssh-rsa AAAA..."]
controller_ip_addresses = ["10.17.3.4"]
experimental_sandbox {
pool_path = "/opt/pool"
flatcar_image_path = "/opt/flatcar_production_qemu_image.img"
hosts_cidr = "10.17.3.0/24"
}
worker_pool "pool1" {
ip_addresses = ["10.17.3.5"]
}
}
Now, replace the parameters above using the following information:
- ssh_public_keys - A list of strings with the contents of the public SSH keys which should be authorized on cluster controller nodes. If you don't have an SSH key yet, see this guide to learn how to generate one, or use the example commands shown after this list.
- pool_path - The absolute path where VM disk images will be stored. It can be set to the output of the echo $(pwd)/pool command. Around 25 GB of free disk space will be required in total for the cluster.
- flatcar_image_path - This should be set to the value obtained from the previous step.
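If you don't have an SSH key or a pool directory yet, the following commands are one hedged way to produce both values; the key type and file locations are only examples, so adjust them to your preferences:

# Generate a new SSH key pair (example type and path).
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519
# Print the public key to paste into ssh_public_keys.
cat ~/.ssh/id_ed25519.pub
# Create the pool directory and print its absolute path for pool_path.
mkdir -p pool && echo $(pwd)/pool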
Closer look at configuration options
To briefly explain what each line of the configuration is doing:
cluster "tinkerbell" {
This declares a cluster using the tinkerbell platform. You can find out more about the platforms supported by Lokomotive in the documentation.
asset_dir = "./assets"
The asset directory defines where cluster artifacts like certificates and Terraform code will be stored. This directory is disposable if you configure a remote backend for Terraform state; otherwise, the Terraform state is stored in the asset directory as well and must be preserved to allow cluster updates and cleanup.
name = "demo"
The name defines a common identifier for the cluster, which will be used in the cluster's DNS entries and in node names.
dns_zone = "example.com"
The DNS zone is a domain which will be used for cluster communication. With the Tinkerbell sandbox, DNS entries are created locally.
If you create a cluster with an existing instance of Tinkerbell, the following DNS entries must be created:
- <cluster name>.<dns zone> pointing to the IP addresses defined in the controller_ip_addresses configuration option.

  For example, with the following configuration:

  name = "demo"
  dns_zone = "example.com"
  controller_ip_addresses = ["10.17.3.4", "10.17.3.5"]

  the following DNS records should be created:

  demo.example.com. IN A 10.17.3.4
  demo.example.com. IN A 10.17.3.5

- <cluster name>-etcd0.<dns zone>, <cluster name>-etcd1.<dns zone>, etc., each pointing to the corresponding controller IP address. With the same example configuration, the following DNS records should be created:

  demo-etcd0.example.com IN A 10.17.3.4
  demo-etcd1.example.com IN A 10.17.3.5
The DNS entries must be created before cluster creation, otherwise creation will fail.
In the future, we plan to automate DNS entry creation, as is already done for our other supported platforms.
ssh_public_keys = ["ssh-rsa AAAA..."]
This list of SSH public keys will be added as authorized keys for the core user on all controller nodes.
At least one key must be defined here, and that key must be loaded into your SSH agent at cluster creation time so that lokoctl can perform cluster bootstrapping.
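For example, assuming the key generated earlier lives at ~/.ssh/id_ed25519, it can be loaded into the agent like this:

# Start an agent for the current shell, add the key and list the loaded keys.
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
ssh-add -l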
controller_ip_addresses = ["10.17.3.4"]
This list of IP addresses will be used to select the right hardware from Tinkerbell. If you use the experimental_sandbox feature, a virtual machine will be created for each IP address and a hardware entry will be automatically added to Tinkerbell.
experimental_sandbox {
The experimental_sandbox block enables the creation of an extra virtual machine, which will run the Tinkerbell server. If this block is defined, DNS entries for the cluster will also be created automatically in the libvirt DNS server, and cluster nodes will be automatically configured to use this DNS server.
This setting is recommended only for testing, not for production usage.
pool_path = "/opt/pool"
The pool path defines where libvirt will store the created virtual machine disk images.
flatcar_image_path = "/opt/flatcar_production_qemu_image.img"
The Flatcar image path should be an absolute path in the local file system, pointing to the previously downloaded Flatcar image.
hosts_cidr = "10.17.3.0/24"
The hosts CIDR is required by libvirt to configure an isolated network environment for the virtual machines. This range must cover all IP addresses listed in the controller_ip_addresses parameter and all worker pool IP addresses.
worker_pool "pool1" {
This defines a worker pool named pool1. Each worker pool can, for example, have different node labels and taints configured, or use different hardware or a different OS version, depending on the user's needs.
ip_addresses = ["10.17.3.5"]
This worker pool option defines which hardware from Tinkerbell should be used for creating the worker nodes of the cluster. With the experimental_sandbox option enabled, those machines will be created automatically.
To see all available configuration options for the Tinkerbell platform, see the Tinkerbell configuration reference.
Additionally, you can also add extra configuration for the backend or components.
Creating cluster
With the configuration in place, we can finally trigger the creation of our cluster. To do that, execute the following command:
lokoctl cluster apply --verbose
This step will take about 15 minutes, depending on your machine's performance and internet connection speed.
When it is finished, you should see output similar to this:
Your configurations are stored in ./assets
Now checking health and readiness of the cluster nodes ...
Node Ready Reason Message
demo-controller-0 True KubeletReady kubelet is posting ready status
demo-worker-pool1-0 True KubeletReady kubelet is posting ready status
Success - cluster is healthy and nodes are ready!
This means our cluster is ready!
Inspecting Tinkerbell activity
With the cluster created, we can log in to the provisioner machine to get access to the Tinkerbell CLI tool tink, so we can see what Tinkerbell resources have been created for us.
To log in, SSH to the provisioner machine as the core user.
Then, execute this command to open a shell with the tink binary available:
source tink/.env && docker-compose -f tink/deploy/docker-compose.yml exec tink-cli sh
Hardware
Now, run the tink hardware list command to see the hardware registered in Tinkerbell:
# tink hardware list
+--------------------------------------+-------------------+------------+----------+
| ID | MAC ADDRESS | IP ADDRESS | HOSTNAME |
+--------------------------------------+-------------------+------------+----------+
| 75ecb8b9-e24b-cdda-75cd-6a0d486d81e7 | 52:9b:b9:95:3c:bb | 10.17.3.5 | |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | 52:b2:e9:0e:16:56 | 10.17.3.4 | |
+--------------------------------------+-------------------+------------+----------+
Here, you should see the IP addresses of the nodes we configured previously.
Templates
With the tink template list command, you can inspect which workflow templates have been created to configure the Lokomotive cluster.
# tink template list
+--------------------------------------+---------------------+-------------------------------+-------------------------------+
| TEMPLATE ID | TEMPLATE NAME | CREATED AT | UPDATED AT |
+--------------------------------------+---------------------+-------------------------------+-------------------------------+
| 33cbf5d0-09d2-4ea5-92a2-e23a6a0a1e0f | demo-worker-pool1-0 | 2020-10-15 17:08:06 +0000 UTC | 2020-10-15 17:08:06 +0000 UTC |
| 7d5d9b5a-7455-496f-a553-669d78b48e7d | demo-controller-0 | 2020-10-15 17:08:06 +0000 UTC | 2020-10-15 17:08:06 +0000 UTC |
+--------------------------------------+---------------------+-------------------------------+-------------------------------+
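To look at the content of one of these templates, you can pass a template ID from the table above to tink template get (assuming this subcommand is available in your version of the tink CLI):

# tink template get 7d5d9b5a-7455-496f-a553-669d78b48e7d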
Workflows
Finally, with tink workflow list we can see the workflows which have been executed to create the cluster.
# tink workflow list
+--------------------------------------+--------------------------------------+---------------------------+-------------------------------+-------------------------------+
| WORKFLOW ID | TEMPLATE ID | HARDWARE DEVICE | CREATED AT | UPDATED AT |
+--------------------------------------+--------------------------------------+---------------------------+-------------------------------+-------------------------------+
| ff622ae2-280c-4767-bb19-cf08851fa673 | 7d5d9b5a-7455-496f-a553-669d78b48e7d | {"device_1": "10.17.3.4"} | 2020-10-15 17:08:06 +0000 UTC | 2020-10-15 17:08:06 +0000 UTC |
| 732eb53e-f55a-4f69-bd02-7788cc4f348e | 33cbf5d0-09d2-4ea5-92a2-e23a6a0a1e0f | {"device_1": "10.17.3.5"} | 2020-10-15 17:08:06 +0000 UTC | 2020-10-15 17:08:06 +0000 UTC |
+--------------------------------------+--------------------------------------+---------------------------+-------------------------------+-------------------------------+
Using the tink workflow events command, we can see which actions have been performed and how much time they took:
# tink workflow events ff622ae2-280c-4767-bb19-cf08851fa673
+--------------------------------------+-----------------+-----------------+----------------+---------------------------------+--------------------+
| WORKER ID | TASK NAME | ACTION NAME | EXECUTION TIME | MESSAGE | ACTION STATUS |
+--------------------------------------+-----------------+-----------------+----------------+---------------------------------+--------------------+
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | dump-ignition | 0 | Started execution | ACTION_IN_PROGRESS |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | dump-ignition | 3 | Finished Execution Successfully | ACTION_SUCCESS |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | flatcar-install | 0 | Started execution | ACTION_IN_PROGRESS |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | flatcar-install | 80 | Finished Execution Successfully | ACTION_SUCCESS |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | reboot | 0 | Started execution | ACTION_IN_PROGRESS |
+--------------------------------------+-----------------+-----------------+----------------+---------------------------------+--------------------+
Now, disconnect from the tink shell before proceeding to the next steps by running the exit command twice.
Verifying cluster functionality
To verify that the cluster is functional, use the generated kubeconfig file to access it. This can be done using the following commands:
export KUBECONFIG=$(pwd)/assets/cluster-assets/auth/kubeconfig
kubectl get nodes
If everything went well, you should see a list of 2 nodes, similar to this:
NAME STATUS ROLES AGE VERSION
demo-controller-0 Ready <none> 15m v1.18.8
demo-worker-pool1-0 Ready <none> 15m v1.18.8
To test creating some workloads on the cluster, add the following block at the bottom of the previously created cluster.lokocfg file:
# A demo application.
component "httpbin" {
ingress_host = "httpbin.example.com"
}
It will configure the httpbin component on the cluster, which is our example workload.
Now, run the following command to apply the new configuration:
lokoctl component apply
Once finished, run the following command to see if the pods have been created:
kubectl -n httpbin get pods
You should see output similar to this:
NAME READY STATUS RESTARTS AGE
httpbin-64ff5b4d5-27pbz 1/1 Running 0 15m
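To see how the application is exposed, you can also inspect the Service and Ingress objects created by the component; the exact object names are defined by the component, so treat this as a quick check rather than a reference:

kubectl -n httpbin get svc,ingress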
Now you can either continue using the cluster or go to the next step explaining how to shut things down.
Cleaning up
To destroy the cluster, run the following command:
lokoctl cluster destroy --verbose
You will be asked to confirm destroying the cluster.
If you use the experimental_sandbox option, the virtual machines for both the cluster and the provisioner will be destroyed.
If you run a standalone Tinkerbell instance, you must wipe the disks on the nodes and reboot them manually.
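As a hedged example, assuming Flatcar was installed to /dev/sda on a node, wiping and rebooting it could look like this; double-check the device name for your hardware before running it:

# Remove all filesystem and partition signatures from the install disk, then reboot.
sudo wipefs -a /dev/sda
sudo reboot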
Summary and next steps
This blog post showed how to set up a Lokomotive cluster locally with Tinkerbell.
Join the discussion in the #tinkerbell channel on the Equinix Metal Community Slack.
Reach out to the Flatcar Container Linux community by checking out its community channels. For discussions around Lokomotive, join the community Slack channel.