Creating a Lokomotive cluster in a bare metal environment using Tinkerbell
Continuing our journey with Tinkerbell
In our previous blog post we described an example workflow of how Flatcar Container Linux can be provisioned using Tinkerbell in bare metal environments.
Today, we will take this idea further and explain how you can create a Kubernetes cluster in a bare metal environment using Lokomotive and Tinkerbell.
This guide uses libvirt to create virtual machines which will run a Tinkerbell server and the Kubernetes cluster nodes that Tinkerbell will provision. Using libvirt allows easy, isolated local testing without any cloud accounts or costs involved.
Why Lokomotive?
To support the growing popularity of Tinkerbell, we decided to make it one of the supported platforms in Lokomotive, giving Tinkerbell users a well-integrated option for production-grade Kubernetes cluster management.
Lokomotive is a self-hosted, upstream Kubernetes distribution with strong security defaults and frictionless updates.
Steps towards Tinkerbell support in Lokomotive
In order to bring Tinkerbell integration to Lokomotive, we had to do some preparatory work, which will also benefit the Tinkerbell community.
Terraform provider for Tinkerbell
Lokomotive uses Terraform underneath to manage cluster infrastructure, so one of the requirements was a Terraform provider for Tinkerbell which would allow us to declaratively manage hardware entries, provisioning templates and workflows.
As there is currently no official Terraform provider for Tinkerbell and we couldn't find any existing implementation to contribute to, we decided to implement one ourselves.
The provider code currently lives in the kinvolk/terraform-provider-tinkerbell repository on GitHub, and we are working closely with the Tinkerbell community to align the project with their best practices and to transfer it to the Tinkerbell organization, so it can become an official project.
Support for automated testing in sandbox environment
To ensure that Tinkerbell support in Lokomotive remains fully functional, we had to implement automated tests for it.
Currently, the recommended way of installing Tinkerbell is via the tinkerbell/sandbox project. However, it still requires some manual steps during setup, so we had to modify it a bit to support full automation: after running terraform apply, you get a fully functional Tinkerbell stack.
We plan to contribute our modifications upstream.
Contributions to Tinkerbell
While working on Tinkerbell, we found some issues and opportunities for improvement across the Tinkerbell codebase and documentation, which we reported upstream so that all users can benefit from them.
Creating a local Lokomotive cluster using the experimental Tinkerbell sandbox and libvirt
To make trying out Lokomotive with Tinkerbell easier, we will be using the Tinkerbell sandbox, which will run on libvirt virtual machines.
We created this setup for the Lokomotive continuous integration process, so we can ensure that our integration with Tinkerbell remains functional.
Cluster creation will be done using lokoctl, the Lokomotive CLI tool for managing clusters and components.
Prerequisites
Before we get started, we need to install some dependencies, described in the next sections.
Local development tools
As Tinkerbell support in Lokomotive has not yet been released, we need to build the lokoctl binary ourselves. To do that, the following essential development tools must be installed on your machine:
git
make
go
wget
bunzip2
Please refer to your OS documentation to learn how to install and configure those.
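To quickly confirm that all of them are available, you can run a small check like the one below; it is only a minimal sketch, and the actual installation commands depend on your distribution:

for tool in git make go wget bunzip2; do command -v "$tool" >/dev/null || echo "missing: $tool"; done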
Lokoctl binary
To build the lokoctl binary, first clone the kinvolk/lokomotive repository using the following command:
git clone --branch invidian/tinkerbell-platform https://github.com/kinvolk/lokomotive.git && cd lokomotive
Then, build lokoctl using the following command:
make
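Assuming the Makefile places the resulting lokoctl binary in the repository root, you can do a quick sanity check with:

./lokoctl help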
Terraform binary
lokoctl depends on the terraform binary, which must be installed beforehand.
Please see Terraform documentation to learn how to install it in your system.
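Once installed, a quick way to confirm that the binary is available in your PATH is:

terraform version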
libvirt
As we will be using libvirt to create virtual machines, it must be installed and configured on your system.
Please refer to your OS documentation to learn how to install it.
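To confirm that libvirt is running and that your user can talk to the system daemon, you can list the defined virtual machines; an empty list is expected at this point:

virsh --connect qemu:///system list --all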
Terraform provider for libvirt
As terraform-provider-libvirt, which we use for managing libvirt resources, has not yet been published to the Terraform Registry, we must download it manually as well. This can be done using the following commands:
export LIBVIRT_VER="0.6.2" LIBVIRT_VER_FULL="0.6.2+git.1585292411.8cbe9ad0" && \
wget https://github.com/dmacvicar/terraform-provider-libvirt/releases/download/v"$LIBVIRT_VER"/terraform-provider-libvirt-"$LIBVIRT_VER_FULL".Ubuntu_18.04.amd64.tar.gz && \
tar xzf terraform-provider-libvirt-"$LIBVIRT_VER_FULL".Ubuntu_18.04.amd64.tar.gz && \
mkdir -p ~/.local/share/terraform/plugins/registry.terraform.io/dmacvicar/libvirt/"$LIBVIRT_VER"/linux_amd64/ && \
mv terraform-provider-libvirt ~/.local/share/terraform/plugins/registry.terraform.io/dmacvicar/libvirt/"$LIBVIRT_VER"/linux_amd64/terraform-provider-libvirt && \
rm terraform-provider-libvirt-"$LIBVIRT_VER_FULL".Ubuntu_18.04.amd64.tar.gz
This will download the required version of the provider, unpack it and place it in a directory where Terraform can discover it.
Terraform provider for Tinkerbell
As the Tinkerbell Terraform provider has not been officially released yet either, we need to compile it from source as well.
First, clone its source using the following command:
git clone https://github.com/kinvolk/terraform-provider-tinkerbell.git && cd terraform-provider-tinkerbell
Then we build it:
make build
And we make it available for Terraform:
mkdir -p ~/.local/share/terraform/plugins/registry.terraform.io/tinkerbell/tinkerbell/0.1.0/linux_amd64/ && \
mv terraform-provider-tinkerbell ~/.local/share/terraform/plugins/registry.terraform.io/tinkerbell/tinkerbell/0.1.0/linux_amd64/
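Optionally, you can verify that both providers ended up in the local plugin directories that Terraform scans:

ls ~/.local/share/terraform/plugins/registry.terraform.io/dmacvicar/libvirt/*/linux_amd64/ ~/.local/share/terraform/plugins/registry.terraform.io/tinkerbell/tinkerbell/*/linux_amd64/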
Flatcar Container Linux operating system image
To use the Tinkerbell sandbox, we need to download and unpack a Flatcar Container Linux image. This way, we can skip installing the OS on the provisioner machine.
This can be done using the following commands:
wget https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_qemu_image.img.bz2
bunzip2 flatcar_production_qemu_image.img.bz2
After you download the image, get its absolute path using the following command:
realpath flatcar_production_qemu_image.img
It will be needed in the configuration process.
Cluster configuration
Once we have all dependencies in place, we can configure our cluster.
Lokomotive uses the HashiCorp Configuration Language (HCL) for defining cluster configuration in a declarative way, which can be version controlled and is easy to customize.
Basic configuration
To create a basic configuration for a Tinkerbell cluster, create a file named cluster.lokocfg with the following contents:
cluster "tinkerbell" {
asset_dir = "./assets"
name = "demo"
dns_zone = "example.com"
ssh_public_keys = ["ssh-rsa AAAA..."]
controller_ip_addresses = ["10.17.3.4"]
experimental_sandbox {
pool_path = "/opt/pool"
flatcar_image_path = "/opt/flatcar_production_qemu_image.img"
hosts_cidr = "10.17.3.0/24"
}
worker_pool "pool1" {
ip_addresses = ["10.17.3.5"]
}
}
Now, replace the parameters above using the following information:
- ssh_public_keys - A list of strings with the contents of the public SSH keys which should be authorized on cluster controller nodes. If you don't have an SSH key yet, see this guide to learn how to generate one, or use the example commands shown after this list.
- pool_path - The absolute path where VM disk images will be stored. It can be set to the output of the echo $(pwd)/pool command. Around 25 GB of free disk space will be required in total for the cluster.
- flatcar_image_path - This should be set to the value obtained from the previous step.
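If you don't have an SSH key or a pool directory yet, the following commands are one hedged way to produce both values; the key type and file locations are only examples, so adjust them to your preferences:

# Generate a new SSH key pair (example type and path).
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519
# Print the public key to paste into ssh_public_keys.
cat ~/.ssh/id_ed25519.pub
# Create the pool directory and print its absolute path for pool_path.
mkdir -p pool && echo $(pwd)/pool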
Closer look at configuration options
To briefly explain what each line of the configuration is doing:
cluster "tinkerbell" {
This declares a cluster using the tinkerbell platform. You can find out more about the platforms supported by Lokomotive in the documentation.
asset_dir = "./assets"
The asset directory defines where cluster artifacts like certificates and Terraform code will be stored. This directory is disposable if you configure a remote backend for Terraform state; otherwise, the Terraform state is stored in the asset directory as well and must be preserved to allow cluster updates and cleanup.
name = "demo"
The name defines a common identifier for the cluster, which will be used in the cluster's DNS entries and in node names.
dns_zone = "example.com"
The DNS zone is a domain which will be used for cluster communication. With the Tinkerbell sandbox, DNS entries are created locally.
If you create a cluster with an existing instance of Tinkerbell, the following DNS entries must be created:
- <cluster name>.<dns zone> pointing to the IP addresses defined in the controller_ip_addresses configuration option.

  For example, with the following configuration:

  name = "demo"
  dns_zone = "example.com"
  controller_ip_addresses = ["10.17.3.4", "10.17.3.5"]

  the following DNS records should be created:

  demo.example.com. IN A 10.17.3.4
  demo.example.com. IN A 10.17.3.5

- <cluster name>-etcd0.<dns zone>, <cluster name>-etcd1.<dns zone>, etc., each pointing to the corresponding controller IP address. With the same example configuration, the following DNS records should be created:

  demo-etcd0.example.com IN A 10.17.3.4
  demo-etcd1.example.com IN A 10.17.3.5
The DNS entries must be created before cluster creation, otherwise creation will fail.
In the future, we plan to automate DNS entry creation, as is already done for our other supported platforms.
ssh_public_keys = ["ssh-rsa AAAA..."]
This list of SSH public keys will be added as authorized keys for the core user on all controller nodes.
At least one key must be defined here, and that key must be loaded into your SSH agent at cluster creation time so that lokoctl can perform cluster bootstrapping.
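For example, assuming the key generated earlier lives at ~/.ssh/id_ed25519, it can be loaded into the agent like this:

# Start an agent for the current shell, add the key and list the loaded keys.
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
ssh-add -l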
controller_ip_addresses = ["10.17.3.4"]
This list of IP addresses will be used to select the right hardware from Tinkerbell. If you use the experimental_sandbox feature, a virtual machine will be created for each IP address and a hardware entry will be automatically added to Tinkerbell.
experimental_sandbox {
The experimental_sandbox block enables the creation of an extra virtual machine, which will run the Tinkerbell server. If this block is defined, DNS entries for the cluster will also be created automatically in the libvirt DNS server, and cluster nodes will be automatically configured to use this DNS server.
This setting is recommended only for testing, not for production usage.
pool_path = "/opt/pool"
The pool path defines where libvirt will store the created virtual machine disk images.
flatcar_image_path = "/opt/flatcar_production_qemu_image.img"
The Flatcar image path should be an absolute path in the local file system, pointing to the previously downloaded Flatcar image.
hosts_cidr = "10.17.3.0/24"
The hosts CIDR is required by libvirt to configure an isolated network environment for the virtual machines. This range must cover all IP addresses listed in the controller_ip_addresses parameter and all worker pool IP addresses.
worker_pool "pool1" {
This defines a worker pool named pool1. Each worker pool can, for example, have different node labels and taints configured, or use different hardware or a different OS version, depending on the user's needs.
ip_addresses = ["10.17.3.5"]
This worker pool option defines which hardware from Tinkerbell should be used for creating the worker nodes of the cluster. With the experimental_sandbox option enabled, those machines will be created automatically.
To see all available configuration options for the Tinkerbell platform, see the Tinkerbell configuration reference.
Additionally, you can also add extra configuration for the backend or components.
Creating cluster
With the configuration in place, we can finally trigger the creation of our cluster. To do that, execute the following command:
lokoctl cluster apply --verbose
This step will take about 15 minutes, depending on your machine's performance and internet connection speed.
When it is finished, you should see output similar to this:
Your configurations are stored in ./assets
Now checking health and readiness of the cluster nodes ...
Node Ready Reason Message
demo-controller-0 True KubeletReady kubelet is posting ready status
demo-worker-pool1-0 True KubeletReady kubelet is posting ready status
Success - cluster is healthy and nodes are ready!
This means our cluster is ready!
Inspecting Tinkerbell activity
With the cluster created, we can log in to the provisioner machine to get access to the Tinkerbell CLI tool tink, so we can see what Tinkerbell resources have been created for us.
To log in, SSH to the provisioner machine as the core user.
Then, execute this command to open a shell with the tink binary available:
source tink/.env && docker-compose -f tink/deploy/docker-compose.yml exec tink-cli sh
Hardware
Now, run the tink hardware list command to see the hardware registered in Tinkerbell:
# tink hardware list
+--------------------------------------+-------------------+------------+----------+
| ID | MAC ADDRESS | IP ADDRESS | HOSTNAME |
+--------------------------------------+-------------------+------------+----------+
| 75ecb8b9-e24b-cdda-75cd-6a0d486d81e7 | 52:9b:b9:95:3c:bb | 10.17.3.5 | |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | 52:b2:e9:0e:16:56 | 10.17.3.4 | |
+--------------------------------------+-------------------+------------+----------+
Here, you should see the IP addresses of the nodes we configured previously.
Templates
With the tink template list command, you can inspect which workflow templates have been created to configure the Lokomotive cluster.
# tink template list
+--------------------------------------+---------------------+-------------------------------+-------------------------------+
| TEMPLATE ID | TEMPLATE NAME | CREATED AT | UPDATED AT |
+--------------------------------------+---------------------+-------------------------------+-------------------------------+
| 33cbf5d0-09d2-4ea5-92a2-e23a6a0a1e0f | demo-worker-pool1-0 | 2020-10-15 17:08:06 +0000 UTC | 2020-10-15 17:08:06 +0000 UTC |
| 7d5d9b5a-7455-496f-a553-669d78b48e7d | demo-controller-0 | 2020-10-15 17:08:06 +0000 UTC | 2020-10-15 17:08:06 +0000 UTC |
+--------------------------------------+---------------------+-------------------------------+-------------------------------+
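To look at the content of one of these templates, you can pass a template ID from the table above to tink template get (assuming this subcommand is available in your version of the tink CLI):

# tink template get 7d5d9b5a-7455-496f-a553-669d78b48e7d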
Workflows
Finally, with tink workflow list we can see the workflows which have been executed to create the cluster.
# tink workflow list
+--------------------------------------+--------------------------------------+---------------------------+-------------------------------+-------------------------------+
| WORKFLOW ID | TEMPLATE ID | HARDWARE DEVICE | CREATED AT | UPDATED AT |
+--------------------------------------+--------------------------------------+---------------------------+-------------------------------+-------------------------------+
| ff622ae2-280c-4767-bb19-cf08851fa673 | 7d5d9b5a-7455-496f-a553-669d78b48e7d | {"device_1": "10.17.3.4"} | 2020-10-15 17:08:06 +0000 UTC | 2020-10-15 17:08:06 +0000 UTC |
| 732eb53e-f55a-4f69-bd02-7788cc4f348e | 33cbf5d0-09d2-4ea5-92a2-e23a6a0a1e0f | {"device_1": "10.17.3.5"} | 2020-10-15 17:08:06 +0000 UTC | 2020-10-15 17:08:06 +0000 UTC |
+--------------------------------------+--------------------------------------+---------------------------+-------------------------------+-------------------------------+
Using the tink workflow events command, we can see which actions have been performed and how much time they took:
# tink workflow events ff622ae2-280c-4767-bb19-cf08851fa673
+--------------------------------------+-----------------+-----------------+----------------+---------------------------------+--------------------+
| WORKER ID | TASK NAME | ACTION NAME | EXECUTION TIME | MESSAGE | ACTION STATUS |
+--------------------------------------+-----------------+-----------------+----------------+---------------------------------+--------------------+
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | dump-ignition | 0 | Started execution | ACTION_IN_PROGRESS |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | dump-ignition | 3 | Finished Execution Successfully | ACTION_SUCCESS |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | flatcar-install | 0 | Started execution | ACTION_IN_PROGRESS |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | flatcar-install | 80 | Finished Execution Successfully | ACTION_SUCCESS |
| 576a52db-9f26-f3c1-5457-0ca305c6ccba | flatcar-install | reboot | 0 | Started execution | ACTION_IN_PROGRESS |
+--------------------------------------+-----------------+-----------------+----------------+---------------------------------+--------------------+
Now, disconnect from the tink shell before proceeding to the next steps by running the exit command twice.
Verifying cluster functionality
To verify that the cluster is functional, use the generated kubeconfig file to access it. This can be done using the following commands:
export KUBECONFIG=$(pwd)/assets/cluster-assets/auth/kubeconfig
kubectl get nodes
If everything went well, you should see a list of 2 nodes, similar to this:
NAME STATUS ROLES AGE VERSION
demo-controller-0 Ready <none> 15m v1.18.8
demo-worker-pool1-0 Ready <none> 15m v1.18.8
To test creating some workloads on the cluster, add the following block at the bottom of the previously created cluster.lokocfg file:
# A demo application.
component "httpbin" {
ingress_host = "httpbin.example.com"
}
It will configure the httpbin component on the cluster, which is our example workload.
Now, run the following command to apply the new configuration:
lokoctl component apply
Once finished, run the following command to see if the pods have been created:
kubectl -n httpbin get pods
You should see output similar to this:
NAME READY STATUS RESTARTS AGE
httpbin-64ff5b4d5-27pbz 1/1 Running 0 15m
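To see how the application is exposed, you can also inspect the Service and Ingress objects created by the component; the exact object names are defined by the component, so treat this as a quick check rather than a reference:

kubectl -n httpbin get svc,ingress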
Now you can either continue using the cluster or go to the next step explaining how to shut things down.
Cleaning up
To destroy the cluster, run the following command:
lokoctl cluster destroy --verbose
You will be asked to confirm destroying the cluster.
If you use the experimental_sandbox option, the virtual machines for both the cluster and the provisioner will be destroyed.
If you run a standalone Tinkerbell instance, you must wipe the disks on the nodes and reboot them manually.
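As a hedged example, assuming Flatcar was installed to /dev/sda on a node, wiping and rebooting it could look like this; double-check the device name for your hardware before running it:

# Remove all filesystem and partition signatures from the install disk, then reboot.
sudo wipefs -a /dev/sda
sudo reboot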
Summary and next steps
This blog post showed how to set up a Lokomotive cluster locally with Tinkerbell.
Join the discussion in the #tinkerbell channel on the Equinix Metal Community Slack.
Reach out to the Flatcar Container Linux community by checking out its community channels. For discussions around Lokomotive, join the community Slack channel.