OpenShift brain dump

This is a collection of notes on the installation of OpenShift, which at some point I will turn into a proper design for an OpenShift host on KVM.

URLs

Setup

OpenShift runs on virtual machines, called nodes. There are several types of node.

OpenShift setup

  • bootstrap - First node in the network, to be disposed of later.
  • control plane - Configuration and control nodes (formerly called masters).
  • compute - Worker nodes that run the actual workload.

These are some of the variables required. We want to support more clusters in the future, so I am designing the variables to accommodate this. The first OKD cluster will be named "okd" and will use the "okd.nerdhole.me.uk" DNS domain, which will be served by the main server. The okd part in okd.clusters.okd is the name of this first cluster.

Variable                           Meaning                                       Value
okd_type                           For each node, the node type                  bootstrap, master, worker
okd                                Parent dictionary for global OKD variables    See below
okd.os_image_location              Location for fast OS disks                    /var/lib/libvirt/images/
okd.clusters                       Collection of OKD clusters                    Indexed by cluster name
okd.clusters.okd.netname           Name of the OpenShift network                 shiftnet
okd.clusters.okd.basedomain        OpenShift 4 cluster base domain               okd.nerdhole.me.uk
okd.clusters.okd.clustername       OpenShift 4 cluster name                      okd
okd.clusters.okd.bridge            OpenShift KVM network bridge                  shiftnet0
okd.clusters.okd.networkblock      OpenShift network block                       10.12.2.0/24
okd.clusters.okd.gateway           OpenShift network gateway address             10.12.2.1
okd.clusters.okd.helper_fqdn       Bastion/helper node FQDN                      okdlb.okd.nerdhole.me.uk
okd.clusters.okd.helper_ip         IP address of the helper node                 10.12.2.100
okd.clusters.okd.node_rootvg_size  Size of an OpenShift node's rootvg            120G
okd.clusters.okd.node_os_variant   OS variant to pass to virt-install            fedora-coreos-stable
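
As a sketch, the okd dictionary for the first cluster could look like this in group_vars (the exact file layout and the per-node placement of okd_type are assumptions; the values come from the table above):

okd_type: worker                      # set per node, for example in host_vars
okd:
  os_image_location: /var/lib/libvirt/images/
  clusters:
    okd:
      netname: shiftnet
      basedomain: okd.nerdhole.me.uk
      clustername: okd
      bridge: shiftnet0
      networkblock: 10.12.2.0/24
      gateway: 10.12.2.1
      helper_fqdn: okdlb.okd.nerdhole.me.uk
      helper_ip: 10.12.2.100
      node_rootvg_size: 120G
      node_os_variant: fedora-coreos-stable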

OKD also needs secrets, such as the root passwords for the nodes. We will store these in an Ansible vault, group_vars/all/02-openshift-vault.yml, which will contain the following variables:

Variable                 Meaning
okd_vault.root_password  Root password for OpenShift nodes
okd_vault.pull_secret    Pull secret downloaded from RedHat
okd_vault.ssh_pubkey     Administrator's public SSH key
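
The vault itself is created and maintained with the standard ansible-vault commands; a minimal sketch (vault password handling is whatever the rest of the repository already uses):

# Create the vault with the three okd_vault.* variables, and edit it later as needed.
ansible-vault create group_vars/all/02-openshift-vault.yml
ansible-vault edit group_vars/all/02-openshift-vault.yml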

Boot/Install server

In our case, we are installing the OpenShift cluster from the main server, which is also the boot/install server. We download two files from RedHat: openshift-client-linux.tar.gz, which contains the oc and kubectl commands, and openshift-install-linux.tar.gz, which contains the installation utility. Extracting both of these into an empty directory gives us the following files:

  • kubectl - The Kubernetes client, used to operate the cluster from the command line.
  • oc - The OpenShift client; oc and kubectl are the same program, one being a link to the other.
  • openshift-install - The installation command that generates ignition files and does other things.
  • README.md - A short README. If you extract both tarballs, the second README overwrites the first, but we don't care.

Downloading and extracting these files will probably remain a manual action, as they are updated fairly frequently. OpenShift suggests putting the binaries in /usr/local/bin, where normal users can reach them; root does not have /usr/local/bin in its path, so we also symlink them into /usr/local/sbin - both on the load balancer. On the boot/install server we extract the two tarballs into a temporary directory, tar the contents into a single file named openshift_client_install_4.17.XXX.tar.gz, and symlink that to openshift_client_install_latest.tar.gz.
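
A sketch of that packaging step, assuming the two tarballs were downloaded to /tmp and VERSION is set to the release being packaged (the /local/bis/okd target directory is also an assumption):

# Extract both RedHat tarballs into a scratch directory, repackage them as one
# file, and point the "latest" symlink at it.
mkdir -p /tmp/okd-extract
cd /tmp/okd-extract
tar xzf /tmp/openshift-client-linux.tar.gz
tar xzf /tmp/openshift-install-linux.tar.gz
tar czf /local/bis/okd/openshift_client_install_${VERSION}.tar.gz oc kubectl openshift-install README.md
ln -sf openshift_client_install_${VERSION}.tar.gz /local/bis/okd/openshift_client_install_latest.tar.gz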

Load balancer

The load balancer will be a multi-homed virtual host with one connection to Frontnet and one to Shiftnet. We define it as a normal NSCHOOL host running CentOS 9, a member of the main server's authentication domain. It will serve as the main entry point to the OpenShift cluster, and it will be running:

  • The OpenShift client (oc) and its friends
  • A load balancer that distributes jobs to the OpenShift nodes

As the main entry point into the OpenShift cluster, the load balancer is also the machine that runs the installation and the OpenShift client commands. The customer entrance to the cluster will be on its 10.12.2.100/24 Shiftnet interface. Opening that up to the Net at large would involve routing the 10.12.2.0/24 network from every other machine, which I may do at a later time. For now, we access the cluster from Algernon, which has direct access to Shiftnet.

KVM host

We will be installing our OpenShift cluster on the KVM host Algernon, which was in fact purchased for this purpose. Packages to install:

  • guestfs-tools - Tools for automatic installations.
  • butane - Turns Butane YAML configs into JSON Ignition files.

On previous attempts I have found the OpenShift nodes to be very I/O intensive - too demanding even for my striped hard drives - so I may have to put them on the internal M.2 SSD. Since I allocate 100% of that drive to the root partition, I will have to put the OpenShift nodes' qcow2 files in the /var/lib/libvirt directory.

Shiftnet

We will create a separate virtual network named shiftnet. Originally I wanted it completely isolated, but since this setup is so Internet-dependent I will make it a NAT network, as suggested. The instructions referenced above specify a PXE installation run from the load balancer, but I already have a PXE setup, so I will try to add this installation to that instead. I may even take a copy of the qcow file that is created at some point. Shiftnet can be created using the virt-manager GUI, but we will use the virsh net-define command, feeding it an XML file similar to this:

<network>
  <name>shiftnet</name>
  <forward mode='nat'/>
  <bridge name='virbr1' stp='on' delay='0'/>
  <mac address='52:54:51:f7:01:01'/>
  <ip address='10.12.2.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='10.12.2.200' end='10.12.2.254'/>
      <!-- MAC and IP addresses of the OKD nodes -->
      <host mac='52:54:51:f7:01:64' ip='10.12.2.100'/>
      <host mac='52:54:51:f7:01:65' ip='10.12.2.101'/>
      <host mac='52:54:51:f7:01:66' ip='10.12.2.102'/>
      <host mac='52:54:51:f7:01:67' ip='10.12.2.103'/>
      <host mac='52:54:51:f7:01:68' ip='10.12.2.104'/>
      <host mac='52:54:51:f7:01:69' ip='10.12.2.105'/>
      <host mac='52:54:51:f7:01:6a' ip='10.12.2.106'/>
    </dhcp>
  </ip>
</network>

The 52:54:51:f7 part of the MAC address identifies Shiftnet, the 01 is kept in reserve in case I ever want more than one Shiftnet, and the final hex number (64) corresponds to the last byte of the machine's IP address (100). We will generate this file from a template that loops through all the OKD nodes' information. The DHCP section is needed because Shiftnet is not in broadcast range of Frontnet. Since there is no way of specifying a boot server in the KVM network definition, this precludes using PXE to install the machines.
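
Once the XML has been generated (the path /local/kvm/xml/shiftnet.xml below is an assumption), the network is defined, set to autostart, and started with virsh, matching the network steps listed later on:

virsh net-define /local/kvm/xml/shiftnet.xml
virsh net-autostart shiftnet
virsh net-start shiftnet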

IP address management

The OKD load balancer is the first machine in the Nerdhole to be multi-homed: one leg in Frontnet, the other in Shiftnet - two MAC addresses, two IP addresses. In addition, the load balancer needs a wildcard entry in DNS so that all of the application (*.apps.okd.nerdhole.me.uk) names point at the load balancer. We will add okdlb.okd.nerdhole.me.uk to the inventory, giving okdlb two entries; this configures the machine into DNS correctly. We will add the load balancer to the Shiftnet XML separately, with its own entry outside the loop over the nodes.

We add a special alias to the mkdnsserver utility: OPENSHIFT_WILDCARD. When this alias is added to a machine, it adds the *.apps.okd.nerdhole.me.uk wildcard CNAME to the appropriate BIND database.
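
The resulting record in the okd.nerdhole.me.uk zone would be a single wildcard CNAME along these lines (a sketch, not the actual mkdnsserver output):

*.apps  IN  CNAME  okdlb.okd.nerdhole.me.uk.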

Network interface

The load balancer needs an extra virtual network interface into Shiftnet. We will do this with a shell stanza running virsh attach-device hostname --file /local/kvm/xml/hostname-shiftnet-nic.xml on the KVM host. To avoid attaching the same interface more than once, we first search for the MAC address in the output of virsh dumpxml hostname; if the interface is already there, we remove it with virsh detach-device before re-attaching it.
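
A sketch of that check as shell, using the load balancer's Shiftnet MAC address from the network XML above (the okdlb domain name and the XML path follow the naming used elsewhere in these notes; --persistent, which keeps the NIC across reboots, is my addition):

# Re-attach the Shiftnet NIC cleanly: detach it first if its MAC is already present.
MAC=52:54:51:f7:01:64
if virsh dumpxml okdlb | grep -qi "$MAC"; then
  virsh detach-device okdlb --file /local/kvm/xml/okdlb-shiftnet-nic.xml --persistent
fi
virsh attach-device okdlb --file /local/kvm/xml/okdlb-shiftnet-nic.xml --persistent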

Software installation

The load balancer needs the following software:

  • haproxy - The load balancer that lets Frontnet machines into the OKD cluster.
  • butane - A utility to turn Butane YAML files into machine-readable Ignition files.
  • oc installer and client - The oc command lets you manage your cluster, and the openshift-install command sets it up.

HAProxy takes one configuration file, /etc/haproxy/haproxy.cfg. This file is generated from a template with all the nodes in their proper places.
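
A sketch of the relevant parts of that file, assuming the standard OpenShift bare-metal ports (6443 for the API, 22623 for the machine config server, 80 and 443 for application ingress) and assuming the node numbers map onto the Shiftnet addresses as in the DHCP entries above:

frontend okd-api
    bind *:6443
    mode tcp
    default_backend okd-api
backend okd-api
    mode tcp
    balance roundrobin
    # The bootstrap node is only needed until the control plane is up.
    server okd101 10.12.2.101:6443 check
    server okd102 10.12.2.102:6443 check
    server okd103 10.12.2.103:6443 check
    server okd104 10.12.2.104:6443 check
# Similar frontend/backend pairs cover port 22623 (bootstrap plus control plane)
# and ports 80/443 (the worker nodes okd105 and okd106).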

Butane is not part of OpenShift, but it creates standards-compliant Ignition files and so is useful at least for building test Ignition files for Fedora CoreOS.
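
A minimal usage sketch, assuming a test Butane config named test.bu:

# Turn a Butane (YAML) config into an Ignition (JSON) file.
# --strict treats warnings as errors, --pretty formats the JSON for reading.
butane --strict --pretty test.bu > test.ign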

The oc installer and client consist of three commands: oc, kubectl, and openshift-install (oc and kubectl being links to the same program). Since normal users have /usr/local/bin in their path, I will put the executables there. Root has /usr/local/sbin in its path, so I'll put symlinks to ../bin there. At the time of writing, the version is 4.17.9.
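
A sketch of that install step on the load balancer, assuming the combined tarball from the boot/install server is already present in the current directory:

# Unpack the three commands into /usr/local/bin, then give root access to them
# via symlinks in /usr/local/sbin.
tar xzf openshift_client_install_latest.tar.gz -C /usr/local/bin oc kubectl openshift-install
for cmd in oc kubectl openshift-install; do
  ln -sf ../bin/$cmd /usr/local/sbin/$cmd
done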

We can download both from RedHat: the oc client as openshift-client-linux.tar.gz and the installer as openshift-install-linux.tar.gz.

You will also need to download your pull secret from there and add it to the configuration file.

Installation of OpenShift 4

Installing OpenShift is a matter of provisioning the nodes type by type, specifying their role in the cluster at boot/install time. The order is:

  1. Bootstrap node - There is only one of these, and it can be disposed of once the other nodes are up.
  2. Control plane nodes - Also known as "masters". These are cluster management nodes.
  3. Worker nodes - The nodes running the actual OpenShift workload.

These nodes' configuration (Ignition) files have to be accessible from a web server - we will use the BIS structure and add an "ignition" directory to the bis part: https://bis.nerdhole.me.uk/bis/ignition/, where the nodes can find their configuration, similar to Kickstart.

The cluster is defined in a file named install-config.yaml. This is an example of a working one from a RedHat blog post:

apiVersion: v1
baseDomain: nerdhole.me.uk
compute:
- hyperthreading: Enabled
  name: worker
  platform: {}
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  platform: {}
  replicas: 3
metadata:
  creationTimestamp: null
  name: okd
networking:
  machineNetwork:
  - cidr: 10.253.0.0/16
  clusterNetwork:
  - cidr: 10.254.0.0/16
    hostPrefix: 16
  networkType: OpenShiftSDN
  serviceNetwork:
  - 10.255.0.0/16
platform:
  none: {}
pullSecret: '<INSERT_YOUR_PULL_SECRET_HERE>'
sshKey: '<INSERT_YOUR_PUBLIC_SSH_KEY_HERE>'

Comments:

  • The apiVersion tells the installer whether this is an old config file that should be shunned. The base domain is our home domain, to which the installer will prefix the cluster name (okd).
  • We have only one type of worker node, so we need one entry. The number of replicas is 0, because we do not want the installer to try and install new nodes - we have already done that ourselves.
  • Any cluster only has one type of control plane. The number of replicas is the same as the number of master nodes, which in our case is three. I will calculate that value at some point.
  • The metadata's important value is "name", which puts our cluster in okd.nerdhole.me.uk.
  • Under "Networking", we assign a bunch of IP addresses. As long as we don't use the range anywhere else, we can use the same one for every cluster we build:
    • Clusternetwork: whenever we create a pod, it gets an IP address from this range. A pod is more or less the same as a process, so we want to have plenty of space.
    • ServiceNetwork: The IP address pool for service IP addresses.
  • Platform: None. This is meant to be used for external cloud providers. We make our own cloud.

Our playbooks will generate this file onto the load balancer as /root/install-config.yaml.
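
A sketch of the generation step on the load balancer; /root/okd as the assets directory is an assumption. Note that openshift-install consumes (and removes) the install-config.yaml it finds in that directory, so we copy the file in rather than moving it:

mkdir -p /root/okd
cp /root/install-config.yaml /root/okd/
openshift-install create ignition-configs --dir /root/okd
ls /root/okd/*.ign        # bootstrap.ign, master.ign, worker.ign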

Nodes

These are the nodes used:

Nodename                   Type           Memory  Disk
okdlb.okd.nerdhole.me.uk   Load balancer  4G      50G
okd101.okd.nerdhole.me.uk  bootstrap      16G     120G
okd102.okd.nerdhole.me.uk  master         16G     120G
okd103.okd.nerdhole.me.uk  master         16G     120G
okd104.okd.nerdhole.me.uk  master         16G     120G
okd105.okd.nerdhole.me.uk  worker         16G     120G
okd106.okd.nerdhole.me.uk  worker         16G     120G

Libguestfs documentation is here

RedHat CoreOS image

Since the instructions I have now found specify Red Hat CoreOS, I will download it from the OpenShift mirror and use that instead. The OpenShift host play will copy out the image and resize it to the needed size.
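
A sketch of that copy-and-resize step on the KVM host, using the paths from the next section and the 120G rootvg size from the variable table (the rhcos.qcow2 file name matches the bootstrap steps at the end of these notes; ssh access to the boot/install server is assumed):

# Stage the CoreOS image on the KVM host and grow it to the node root disk size.
scp bis.nerdhole.me.uk:/local/bis/qcow2/rhcos.qcow2 /local/kvm/images/rhcos.qcow2
qemu-img resize /local/kvm/images/rhcos.qcow2 120G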

Node installation

We will after all not be building the OpenShift nodes with virt-builder. What we will do is download the Fedora CoreOS image from the Fedora website and put it on the boot/install server in /local/bis/qcow2. From there, we will copy it to the KVM host into /local/kvm/images. The installation will then consist of copying the image to the virtual machine's rootvg file.

Rather than the usual settings XML file, we will use the virt-install command to build our VMs like so:

virt-install \
--name hostname \
--ram 4096 \
--vcpus 2 \
--disk path=/local/kvm/storage/hostname-os-00.qcow \
--network network=shiftnet,model=virtio,mac=00:11:22:33:44:55 \
--os-type linux \
--os-variant rhel8.0 \
--graphics none \
--serial pty \
--console pty \
--boot hd \
--import \
--noautoconsole \
--qemu-commandline='-fw_cfg name=opt/com.coreos/config,file=/local/kvm/xml/okd101.ign'

The --import parameter tells virt-install not to run an installer but to boot the OS already present on the virtual hard disk. The node gets an IP address from DHCP, and --noautoconsole stops the command from attaching to the console, making it suitable for unattended installs. The --qemu-commandline parameter passes the named ignition file to the machine on its first boot via QEMU's firmware configuration (fw_cfg) interface; this is how the virtual machines receive their OpenShift configuration.

Procedural organisation

This diagram shows the flow of information between the machines involved in building the OpenShift cluster:

OpenShift information flow

The steps in building the OpenShift cluster are:

KVM host activities for OpenShift

  1. Install needed RPMs: butane and guestfs-tools
  2. Create the OpenShift network
    • Generate the OpenShift network XML
    • Define the OpenShift network using the "define" command
    • Autostart it
  3. Copy the latest Redhat CoreOS image into /local/kvm/images
    • Copy out the qcow2 file
    • Use qemu-img resize to increase the rootvg size as needed

Load balancer activities

  1. Build the load balancer
    • Provision the virtual machine (CentOS 9 - standard)
    • Install needed RPMs: HAProxy and Butane
    • Add a network interface connected to the OpenShift network
  2. Configure the HA proxy
    • Generate haproxy.cfg
    • Start and enable haproxy
  3. Install the Openshift software
    • Install the OpenShift installer and client into /usr/local/bin
  4. Generate the cluster configuration
    • Generate the install-config.yaml onto the load balancer
    • Generate the ignition files using openshift-install create ignition-configs
    • Fetch the ignition files onto the Ansible run host

Bootstrap node activities

  1. Provision the bootstrap node
    • Copy all ignition files to the KVM host
    • Copy the rhcos.qcow2 file to the node's image on high speed SSD.
    • Use the virt-install command to build the node
    • Use the bootstrap.ign