Author: danberk

Ender 3 Pro Safety PSA

I have had an Ender 3 Pro for over 3 years. A year and a half ago I replaced the main controller board with the v4.2.7 silent board, and I was surprised how much that lowered the noise from the 3D printer. I have also added more parts like the auto-leveling bed probe; doing that meant compiling my own firmware to make sure all the add-ons worked.

Everything was great until recently, when I noticed a bad smell coming from the printer. I had gotten an enclosure over the holidays and thought I just wasn’t used to the smell being so concentrated. It was much worse than the normal PLA smell, though, and after a few more prints I started searching to see if anyone else had this happen. After searching the Ender 3 subreddits, I found posts talking about how the terminals can melt.

I opened up the controller compartment and was shocked to see how badly the terminals had been melting! Apparently the wires are tinned, and over time the movement of the printer works them loose from the terminal. This leads to the power arcing and melting the plastic. The suggested fix is to replace the terminals (or the board), and then install “ferrules” on the ends of the cleaned cables.

I contacted Creality for support; they redirected me to Amazon, which opened an email chain that Amazon never responded to. Since the v4.2.7 board was only $30, I bought a new one rather than deal with all of that. After I installed the ferrules and the new board, the printer came right back up, and inserting my SD card, which still had the last firmware I compiled on it, brought it 100% back.

Quick Game Review: Firewatch

Firewatch is a game I have had on my virtual shelf for a long time. I recently got a Steam Deck and figured I would give it a try; the game is even “Verified” for the Steam Deck. I went in not knowing anything about the story, gameplay, or art style. This is a story-driven game that, through a quick opening scene, puts you into the shoes of the main character, who is spending a summer in an old fire lookout tower.

There are moments where the game feels action-y, like you have to get somewhere quickly, but in the end it is similar to other games in the “walking simulator” genre. Most of the game you are going from point to point while the story unfolds as you go. I went through the whole game in a little over 3 hours and HIGHLY enjoyed it. The story was great and takes your actions and conversations into account, changing the outcome as you go. There is also a way to play through the game with creator commentary on: you play again like normal, but around the map you find stands that mark commentary points. Shortly after finishing the game for the first time, I started another playthrough with this turned on. You know a game has to be good when, years after it was released, it still has an active subreddit.

I would recommend this game, but parts of the story are sad, so be prepared for that. At its standard price of $20 USD I think it is worth it, and it also goes on sale regularly. I was excited to see that the developer was planning another game, set in Egypt; then it turned out the studio got bought by Valve and the team is now on different projects.

(Art from fans on the subreddit)

Adding Content Security Policy (CSP) Support to Embedded Tomcat 10

Continuing the series on hardening embedded Tomcat in Java to pass Nessus security scans, I am back with an example of adding a Content Security Policy to your app. There are ways to provide CSP headers in a standalone Tomcat server, but with an embedded server it can be more difficult.

I have used an embedded Tomcat server for years to build applications. The following example uses Tomcat 10, but the principle is the same for Tomcat 9. The main difference in the Tomcat 9 to 10 transition is moving from the javax namespace to jakarta. With more and more libraries, such as jOOQ, moving to modern Java versions, and with newer Java versions offering good performance improvements out of the box, it may be time for everyone to move to the Jakarta namespace (even if that means leaving some libraries, such as Google's OAuth client, behind).

In my recent example project going over how to use Pac4J for OAuth with Tomcat 10, I have added an example here of what the FilterBase class would look like; a rough sketch of the same idea is below. You then need to initialize the filter where you are starting the Tomcat thread. That will add the needed header to all the web requests your application processes.
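
The sketch below shows the general shape of that, implementing jakarta.servlet.Filter directly and registering it with Tomcat’s own FilterDef/FilterMap classes. The class name, policy string, and helper method here are placeholders of mine, not the exact code from the repo; adjust the policy to whatever your app and your scanner need.

import java.io.IOException;

import jakarta.servlet.Filter;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.ServletRequest;
import jakarta.servlet.ServletResponse;
import jakarta.servlet.http.HttpServletResponse;

import org.apache.catalina.Context;
import org.apache.tomcat.util.descriptor.web.FilterDef;
import org.apache.tomcat.util.descriptor.web.FilterMap;

// Stamps a Content-Security-Policy header on every response the app serves.
public class CspFilter implements Filter {

    // Example policy only; tighten or loosen to match what your app actually loads.
    private static final String POLICY =
            "default-src 'self'; frame-ancestors 'self'; form-action 'self'";

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // Add the header, then continue down the filter chain as normal.
        ((HttpServletResponse) response).setHeader("Content-Security-Policy", POLICY);
        chain.doFilter(request, response);
    }

    // Call this where you build the embedded Tomcat Context, before tomcat.start().
    public static void register(Context ctx) {
        FilterDef def = new FilterDef();
        def.setFilterName("cspFilter");
        def.setFilter(new CspFilter());
        ctx.addFilterDef(def);

        FilterMap map = new FilterMap();
        map.setFilterName("cspFilter");
        map.addURLPattern("/*");
        ctx.addFilterMap(map);
    }
}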

Pac4J Integration with Embedded Tomcat 10 using Generic OAuth via Keycloak

(I will ramble for a bit; if you just want the guide, jump below.) I want to start a series more around programming than the other articles I have put up here. I know everyone here knows me as the good-looking hardware-hacking guy, but most of my time at work is spent on programming and systems automation. I haven’t used the programming tag on this blog in a while, so I want to begin this new series by discussing upgrading to Tomcat 10.

I have for years been using an embedded Tomcat + Servlet backend with a jQuery frontend for the different small webapps I have made. I know that sounds very old to anyone who learned webdev in the recent past. I am using Dropwizard and React more and more these days, but that does leave my legacy projects on this old framework. While not the newest or flashiest thing, it performs well, with some systems handling hundreds to thousands of calls a second. (My hope is to get approval to open source some of them soon.) With the changes in the Java universe (after Oracle bought Sun, decided to ruin everyone’s fun, and caused splintering), I had to start moving from the traditional javax servlet namespace that Tomcat 9 and earlier used to Tomcat 10’s Jakarta namespace. Migrating the servlets themselves was not so bad. The first big issue arose around Google’s OAuth library. I have used this library for a long time; it provides the easy-ish ability to connect to any OAuth server you want (I have specific ones at work I use) for authentication. Recently Google, doing what they do, marked this library as maintenance mode only, stating they would only do emergency security fixes; overall, it’s abandoned. Not what you want to hear from your authentication library. They are also not planning to move it over to the Jakarta namespace, leaving me stuck on Tomcat 9 for as long as it has support. That should be a long time, since many big companies and projects are right where I was, and the plan is for 9 and 10 to be developed in parallel for a while. It does mean that I cannot use the newer features of Java, though. From this I knew I had to start looking at other options.

Every time I have to work on auth systems, it is a maze, and once I get it working, I want it to stay working for a while so I hopefully don’t have to touch it. There are not a ton of OAuth libraries for Java backend systems, and I wanted one that I knew had community support and would last. That brought me to the popular PAC4J project. A lot of the guides I found for it were built around javax and/or used PAC4J to integrate with Google Auth, Facebook Auth, or other specific providers. I want to be able to use the systems at work, or a more generic provider such as Keycloak. I spent a good amount of time pulling different bits of information together to get a fully working PAC4J 5.7 + Tomcat 10 + Servlets setup. I posted it on Github, and below is a guide on how to configure Keycloak in Docker to demonstrate it. This took a fair amount of time to put together, and I hope it helps you out there; if it does, please give the post a like or star the repo, it pushes me to keep doing these tutorials.

Keycloak Setup

To start at the beginning, Keycloak is a webserver that works as an Identity Provider (IDP). It has its own database for users and groups, or it can link into many other systems such as Google, Facebook, and more. When hooking up to it, you can decide to use SAML or OpenID. I tend to use plain OAuth, which OpenID Connect is built on top of (I think that’s the order). This guide will use Keycloak as our IDP, then connect to it with PAC4J. I do believe PAC4J has a more native and easier-to-use Keycloak integration, but where is the fun in that.

Keycloak out of the box has a Docker image for development. You can optionally attach persistent storage so users and groups survive deleting and rebuilding the container. Using this container with local storage is not a good setup for production; you should back the container with a database like Postgres instead. But for our home testing this setup is fine and will work well. I use persistent storage so I can have a local Keycloak that can be updated without recreating everything. More information about the Keycloak Docker container can be found here.

Setting up Keycloak with persistent storage

docker volume create keycloak
# We need to set the keycloak user to be the owner of that folder
docker run --rm --entrypoint chown -v keycloak:/keycloak alpine -R 1000:0 /keycloak
docker run -p 8081:8080 -e KEYCLOAK_ADMIN=admin -e KEYCLOAK_ADMIN_PASSWORD=admin -v keycloak:/opt/keycloak/data/h2 quay.io/keycloak/keycloak:20.0.2 start-dev --http-relative-path /auth

Setting up Keycloak WITHOUT persistent storage

docker run -p 8081:8080 -e KEYCLOAK_ADMIN=admin -e KEYCLOAK_ADMIN_PASSWORD=admin quay.io/keycloak/keycloak:20.0.2 start-dev --http-relative-path /auth

After a minute you should be able to browse to “http://127.0.0.1:8081/auth” and get:

Keycloak can use OpenID Connect as its connector, which is more of a superset of OAuth itself. For this guide I want to go over using plain OAuth, so we will tweak some of the Keycloak configs to make it work how we want, as a generic OAuth provider.

  • Log in with the “admin” username and password that we set when creating the container
  • Click the dropdown in the top left where it says “master” and create a new realm; we will call it “example”
  • Now that you are in your new realm, go to “Users” on the left
    • Create a new user; let’s name this new user “Jon” (or whatever your heart desires)
    • Once created, click the “Credentials” tab and set a password. I disable “temporary password” because this is not a production auth system
  • Click “Client Scopes” in the top left
    • Create client scope
    • Name “openid”
    • Type “Default”
    • Save
    • “Mappers” tab
    • Add predefined mapper
    • I did “full name”, “email”, “username”, “groups”
    • Note: I found the add screen here buggy; if you page to the next list of mappers, saving doesn’t work, so add them one page at a time
  • Click “Clients” in the top left
    • Create client
    • Client ID “example-client”; I like to enable “Always display in console”; then “Next”
    • Enable “Client Authentication”, then save.
      • “Client Authentication” enables the “confidential access type”; this is classic OAuth, where you need a client secret to access the server
    • For this demo we need to add “http://127.0.0.1:8080/oauth/redirect*” to the “Valid redirect URIs”, and save
    • Go up to “Client scopes” and add the “openid” scope as a “Default” type
    • Note: I have found with PAC4J + Keycloak specifically that if you do not add the openid scope, you will get through auth, but when PAC4J goes to get user information you will get an error of WARN org.keycloak.events type=USER_INFO_REQUEST_ERROR, realmId=59d04435-daf4-4ca7-8623-195769911c0e, clientId=null, userId=null, ipAddress=172.17.0.1, error=access_denied, auth_method=validate_access_token. If your client is not requesting the openid scope, you may also see WARN org.keycloak.services KC-SERVICES0091: Request is missing scope 'openid' so it's not treated as OIDC, but just pure OAuth2 request.
    • Go to the “Credentials” tab, view the “Client secret” and save that for later

At this point going to http://127.0.0.1:8081/auth/realms/example/.well-known/openid-configuration will give you the OAuth endpoints we need.

Using The Example Code

Clone down the repository over at my Github. The README.md should contain what you need to get going! After adding your client secret from above, if you followed this guide, that demo should work for you with http://127.0.0.1:8080/ being the web app, and http://127.0.0.1:8081/auth being the auth server.
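
For reference, the heart of the PAC4J wiring ends up looking something like the sketch below: a GenericOAuth20Client from pac4j-oauth pointed at the endpoints the .well-known document above lists for the “example” realm. Treat it as an outline under those assumptions, not a copy of the repo’s exact code; the factory class is a placeholder of mine, and the secret is the one you saved from the Credentials tab.

import org.pac4j.core.config.Config;
import org.pac4j.oauth.client.GenericOAuth20Client;

public final class AuthConfigFactory {

    // Builds a PAC4J Config for the Keycloak "example" realm set up in this guide.
    public static Config build(String clientSecret) {
        GenericOAuth20Client client = new GenericOAuth20Client();
        client.setKey("example-client");   // client id created in the "Clients" step
        client.setSecret(clientSecret);    // client secret from the Credentials tab
        client.setAuthUrl("http://127.0.0.1:8081/auth/realms/example/protocol/openid-connect/auth");
        client.setTokenUrl("http://127.0.0.1:8081/auth/realms/example/protocol/openid-connect/token");
        client.setProfileUrl("http://127.0.0.1:8081/auth/realms/example/protocol/openid-connect/userinfo");
        client.setScope("openid");         // matches the client scope we added above

        // The callback URL must match the "Valid redirect URIs" entry configured in Keycloak.
        return new Config("http://127.0.0.1:8080/oauth/redirect", client);
    }
}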

In the end you get a simple screen like this one that lets you play around with the functionality and experiment with the code.

Bitbucket: Convert From Standalone ElasticSearch to Embedded OpenSearch

At work I maintain random stacks of software, and sometimes help people with other stacks that they maintain. Recently I was asked to help bring an Atlassian Bitbucket stack up to date. In the past, Atlassian always included a built-in ElasticSearch (ES) server with Bitbucket. It was used to index code in Bitbucket and allow searching. It’s not a hard requirement for the server to function, but it is important for the user experience.

When an environment moves from Bitbucket Server to Bitbucket Data Center, you are supposed to move to a standalone ES instance over the embedded one for performance. I don’t know if people elsewhere commonly do this, but the stacks I have seen have just continued to use the embedded version. Admittedly, these are smaller instances; at scale I would understand it. That worked until recently, when, due to a licensing change, Atlassian could no longer embed an up-to-date ElasticSearch. For a while they decided the best way forward was to keep bundling the last version from before the licensing change (I think 7.10).

That works until your infosec team runs Nessus and finds you have an out-of-date ES sitting around while 7.16 or the 8.0 branch are out. Because of all that, this one stack had moved to a standalone ES cluster. We also had to install the Atlassian security plugin into ES; this was not a simple task, and the plugin only supports a few versions of ES, none of which were current. At least then we were in a BETTER spot with security.

Now fast forward a few months of this mess, and Atlassian moved Bitbucket from ElasticSearch to OpenSearch. OpenSearch is Amazon’s fork of ElasticSearch at version 7.10.2, created to get around the new licensing terms. Normally, if you were still using the embedded version of ES, your next Bitbucket upgrade would move you to OpenSearch. Because this stack had already moved to a standalone instance, it did not migrate over. We were now in the worst of both worlds: off the supported path and unable to get back on it. If you search the Atlassian documentation there are guides on how to move to a standalone search instance, but not back. A big catch I found is that the embedded version uses default passwords that are not easy to find, which makes migrating back harder.

Migrating Back

Below are some notes I have on migrating back. Hopefully they help someone.

There are two main folders we will work in. One is your Atlassian Bitbucket installation folder for the current version, which I will call %atlassian-install%. The other is your Bitbucket data folder that carries over between versions as you upgrade, which I will call %bitbucket-home%. (Note: I did all this on Linux, but I am naming the variables that way because it is easy.)

Default %atlassian-install% is /opt/atlassian/bitbucket/7.21.7, or your current version. Default %bitbucket-home% is /var/atlassian/application-data/bitbucket, but I tend to move that to /opt.

Under %atlassian-install%/opensearch/plugins/opensearch-security/securityconfig/internal_user.yml are the details Bitbucket needs to connect to this OpenSearch instance. The default password is “bitbucket-changeit”. To create a new hash of a password, use %atlassian-install%/opensearch/plugins/opensearch-security/tools/hash.sh; that file needs to be given execute permission first, which it does not come with on Linux.
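
Something along these lines worked for me; the -p flag is the one the OpenSearch security plugin’s hash tool documents, but double-check it against the version bundled with your Bitbucket.

# Give the bundled hash tool execute permission, then hash the new password
chmod +x %atlassian-install%/opensearch/plugins/opensearch-security/tools/hash.sh
%atlassian-install%/opensearch/plugins/opensearch-security/tools/hash.sh -p 'MyNewSearchPassword'
# Note: the script needs Java available (set JAVA_HOME if it complains), and the
# resulting hash goes over the "bitbucket" user's hash in the internal user file above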

Go into %bitbucket-home%/shared/bitbucket.properties if you have one (this file is created as you migrate between versions or databases) and remove any legacy Elasticsearch username/password/URL settings, for example plugin.search.elasticsearch.baseurl or plugin.search.config.baseurl as shown in the documentation; see the example below. The properties file overrides settings stored in the instance/database. You may also have a systemd service file that automatically starts Bitbucket; if it launches start-bitbucket.sh with -ns or --no-search to run against a standalone search instance, remove the no-search option.
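
As a hypothetical example of what to delete (key names per the Atlassian search configuration docs, older installs may use the plugin.search.elasticsearch.* spelling instead; the values here are obviously placeholders):

# Legacy standalone-search settings to remove from %bitbucket-home%/shared/bitbucket.properties
plugin.search.config.baseurl=http://search.example.com:9200
plugin.search.config.username=bitbucket
plugin.search.config.password=changeme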

Now start Bitbucket and go to Administration -> Troubleshooting and support tools -> System Information; you will see that search failed to connect. Go to Administration -> Server settings and enter your new search information there. If you just removed ElasticSearch and started OpenSearch with the server, all you have to do is make sure the port is right (7992 by default, I believe), the username is “bitbucket”, and the password is “bitbucket-changeit”. If you get a connection error, you may have to set up a TLS trust between Bitbucket and OpenSearch, but that is outside the scope of this guide.

Below is the default %bitbucket-home%/shared/search/config/opensearch.yml

cluster.name: bitbucket_search
node:
  name: bitbucket_bundled

network.host: _local_
discovery.type: single-node

path:
  logs: ${BITBUCKET_HOME}/log/search
  data: ${BITBUCKET_HOME}/shared/search/data

action.auto_create_index: false

http.port: 7992
transport.tcp.port: 7993

# The OpenSearch security plugin stores its configuration in an index in the cluster itself. On startup if the
# security index doesn't exist yet, setting this to true will cause the security plugin to read the yml files and
# configure the index using the contents of the files.
plugins.security.allow_default_init_securityindex: true

# Using the yml files with default initialisation, we create a bitbucket user and give it the all_access in-built role.
# However, access to the REST API is disabled by default even for the all_access role so we need to explicitly give
# it permission here so that the bitbucket user can access the OpenSearch REST API.
plugins.security.restapi.roles_enabled: ["all_access"]

# Mandatory TLS setup for transport layer
plugins.security.authcz.admin_dn:
  - CN=BITBUCKET
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.transport.pemcert_filepath: bitbucket.pem
plugins.security.ssl.transport.pemkey_filepath: bitbucket-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem

# Logs audit events to bitbucket_search_server.json
plugins.security.audit.type: log4j
plugins.security.audit.config.log4j.logger_name: audit
plugins.security.audit.config.log4j.level: INFO
Optiplex 5050 Back view

Dell Optiplex 5050 Micro Windows Server Installation

Recently I was able to pick up some Dell Optiplex 5050 Micros for $60 on eBay. These are tiny machines, with an Intel i5-7500T (4 core/4 thread) CPU, 8GB of RAM, and a 256GB SSD. For $60 they needed a power supply, but those are easy to come by. My plan was to replace my aging Intel NUC that runs the core domain services for the house (AD, RADIUS, CA), and perhaps the aging firewall if I can figure out how to get a second NIC into the system; more on that later.

My philosophy when running a standalone network (even one with internet access) is to have at least one of your Domain Controllers (DCs) be a physical box at all times. An alternative is a dedicated hypervisor with local disks, but anyone who has tried to start a VM manually on VMware knows how painful it can be with no interface to the system other than the command line. These days it’s easy to make all the DCs virtual, but if you ever have to cold boot your environment you run into not having DNS; without DNS, things like vCenter and vSAN can’t come up cleanly, and the knock-on effects keep chaining. Having a physical machine allows you to bring DNS and core services up first, then start all the other services that rely on your domain.

The first task was to get one of the Optiplex 5050s ready for Windows Server. I started by upgrading the RAM to 16GB, because I had it lying around. After that, since this was an eBay purchase, I updated the firmware/BIOS and ran diagnostics before it touched the home network. The seller was nice enough to install Windows 10 Pro on the machine, which has a license in the BIOS, but I formatted the drive before ever booting that install. People are generally nice, but who knows what was in that image. After getting Windows Server 2022 installed I hit my first issue: Server 2022 does not have a driver for the Intel I219-V NIC that is in this chassis.

I tried getting the drivers from the Dell site, but Windows refused to use them because they were for Windows 10 and not the Server edition. My fix was to go into the driver update dialog, choose “Browse my computer for drivers”, let me pick from a list, and then manually select the “Intel” “Intel(R) Ethernet Connection (2) I219-V” driver. A USB ethernet dongle got me online long enough to be able to see that driver at all. Now the box is happily online. The main issue with this technique is that I keep getting an “Optional” Windows Update for an updated driver that never seems to install; I think that is because I installed the Dell driver, and the update never applies correctly.

Another thing I try to do with most systems, especially the ones in charge of security, is get Virtualization Based Security running. This is a newer Windows feature where core elements that need to hold secrets run in tiny Hyper-V containers. The user never sees it, but it gives the system added protection. If you run “msinfo32”, you can see its status. Most of the time you need to enable virtualization support in the firmware, then add the “Host Guardian Hyper-V Support” feature. On older systems (Windows Server 2019) and desktops, I think it’s just called “Hyper-V”; enable that and you get these features.

On paper this machine is 78% faster than the old Intel i5-3427U, and that makes a world of difference. The old system took a while to boot and a while to back up, which is what spurred me to upgrade. This machine feels amazingly fast for $60. Keep in mind that it cost less than a Raspberry Pi 4, runs an Intel CPU, and didn’t require the years-long wait Raspberry Pis take right now!

I have the main DC run domain services, DNS, Network Policy Server (RADIUS), and certificate services. For the first two, I just installed Domain Services and DNS and the system started acting in that role. For NPS I exported the config from the old DC, then installed the service on the new one and imported the config. As a reminder, Domain Services has to be installed first; if you already have NPS or Certificate Services installed and then try to add Domain Services, it will tell you it can’t install. For Certificate Services, I added a new CA, stopped the old one’s service, and removed it as an enrollment agent in ADSI. My 802.1x and other certs handed out by GPO are short-lived, around 90 days, so I will wait for the old ones to expire and let systems naturally pick up newer certs.

With the second system I got, I thought I would try some hardware hacking. My hope was to repurpose it as a firewall, replacing my aging Dell Optiplex 990 from 2011. To do this I would want to add at least one more NIC to the system. I ordered a 1Gb ethernet NIC that goes in the WLAN slot. At first it did not show up in Linux and I was worried; it turns out the system BIOS had “WLAN” disabled, and enabling that turned the PCIe lane back on so the card showed up. Mounting the ethernet port in the extra serial blank this system has made it look very clean. I had to tuck the wire away, since it ran from the front of the unit to the back and had the SATA drive sitting on it. After playing with it a good amount (removing the card, reseating it, putting electrical tape under it) I was able to get the link up, but not reliably at 1Gb/s; it tended to drop down to 100Mb/s a lot when coming up. Things like loosening the screw holding it down and putting electrical tape under it helped, but the system was not reliable enough for me to feel comfortable using it in homelab production. I also shaved down the connectors at the end of the card, since they were large enough that the screw couldn’t easily fit between them; that did not help much.

In the end I am enjoying the one system as a new DC, and I will eventually figure out what to do with the other one. With an NVMe slot and internal SATA, plus the ability to go up to 32GB of RAM on a low power budget, they are very capable little machines.

Stable Diffusion Image, "computer with redhat logo on screen, in a field with mountains and a dinosaur in the background"

Clean Tenable Nessus Scans for RHEL 7 with Podman

There can be an alert misfire for Tenable Nessus plugins 137561, 138032, 142002 based on your YUM repo configuration. This leads to 3 medium alerts that should not be there.

Plugins:

  • RHEL 7 : OpenShift Container Platform 4.3.25 containernetworking-plugins (RHSA-2020:2443) (Plugin 137561)
  • RHEL 7 : OpenShift Container Platform 4.2.36 containernetworking-plugins (RHSA-2020:2592) (Plugin 138032)
  • RHEL 7 / 8 : OpenShift Container Platform 4.6.1 package (RHSA-2020:4297) (Plugin: 142002)

If you have a stack that uses Podman on RHEL 7 and does not have the default redhat.repo file, then you have packages installed that have newer versions in the OpenShift repos. Normally this would be fine: the Nessus scanner is supposed to check whether you have the OpenShift repos enabled, and if not, stop and report that the latest versions from the RHEL 7 OS repo are good. But that check fails if you are missing the RHEL 7 OS repo definition; the OS repo HAS to be enabled too, or the check will show as failing. This situation can easily happen on an air-gapped system, or a system on Satellite where you are not using the default repo in redhat.repo. Luckily the baseurl does not matter; as long as you set the repo ID to “rhel-7-server-rpms” (I put the name= line in there for good measure), the check will come back clean.

/etc/yum.repos.d/redhat.repo

[rhel-7-server-rpms]
name = Red Hat Enterprise Linux 7 Server (RPMs)
baseurl=file:///opt/rhel_7_x86_64_os/
enabled = 1
gpgcheck = 0
gpgkey = file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release

Before (or after) setting that, you will need to disable the YUM Red Hat Subscription Manager plugin, or the next time you run “yum” it will wipe your redhat.repo and reload it from subscription-manager. To do this, edit /etc/yum/pluginconf.d/subscription-manager.conf and set “enabled=0”, and also run subscription-manager config --rhsm.manage_repos=0; both steps are sketched below.
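
For example, on a typical RHEL 7 box:

# Stop the subscription-manager YUM plugin from rewriting redhat.repo
sed -i 's/^enabled=.*/enabled=0/' /etc/yum/pluginconf.d/subscription-manager.conf
# And tell subscription-manager itself to leave repo management alone
subscription-manager config --rhsm.manage_repos=0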

Below are examples of the errors you can see from Nessus.

Remote package installed : containernetworking-plugins-0.8.3-3.el7_8
Should be                : containernetworking-plugins-0.8.6-1.rhaos4.2.el7
OR
Remote package installed : runc-1.0.0-69.rc10.el7_9
Should be                : runc-1.0.0-81.rhaos4.6.git5b757d4.el7

The default thinking may be: it says I need to update to the OpenShift packages, so it makes sense to install the OpenShift repos. And if you grab a Red Hat developer account to debug this, you will see the OpenShift repos there; that is because the developer account gives you a lot of entitlements, including OpenShift. If you add the OpenShift repos to a bunch of systems, you may become liable for OpenShift licenses, or get errors because those systems do not have the entitlements. The key is that the installed packages say “.el7_8”/“.el7_9” instead of “.rhaos4.2”. This is a plugin misclassification, not a need for updates.

Note: The image is a random AI generated one: Stable Diffusion Image, “computer with redhat logo on screen, in a field with mountains and a dinosaur in the background”. I think they are fun.

THE µKENBAK-1

Back again with another retro computer kit from the same creator as THE ALTAIR-DUINO: a small, quick build in the µKENBAK-1. The µ in front denotes one of the earlier versions of the kit, which is smaller than the original computer; the creator now also offers a full-sized replica and a nanoKENBAK-1. This is a small and simple kit. Running off an Atmel processor (the same family as the Arduino), this little recreation offers a fun, simple front panel and relatively quick assembly.

Compared to some of the other kits that have been posted here, this one is straightforward to put together. While you have the classic soldering, the kit is all through-hole components and is a pleasant hour or so of assembly. Most of my time actually went into getting the PCB, with its stand-offs, into the case and lined up with the back holes. This proved to be a difficult and time-consuming process: you need to pre-tighten the stand-offs on the front panel, which then slides into the case, and line them up with the holes in the back. Between the stand-offs being plastic and wanting to strip, and their wanting to wiggle all over, most of my time went into this instead of soldering. In the end I got 5 of 6 in place and called it a day.

Evil Stand-Offs

The creator of this kit shows his experience in the little details, which make it a nice experience. One example is the USB extension cable that gives you an easy connection out the back; it is the perfect length to do the job without being in your way. Another is the instruction booklet, which includes a bunch of examples of how to use the computer right after the assembly instructions, all in a nice spiral-bound book included in the box.

The creator’s website, https://adwaterandstir.com/kenbak/, also goes into detail about the creator of the original machine in the 1970s and its history.

This is one of the easier kits I have done, but enjoyable in how smoothly it goes together. I would recommend it to someone who is looking to get started with these kinds of kits.

Missing Email Alerts from LibreNMS

I realized recently that I haven’t gotten any alerts from LibreNMS in a while, including when I rebooted devices for patching. After going to the “Alert Transport” page and attempting to send a test message, I got “SMTP Error: Could not authenticate.” Others seem to have hit this recently as well. (Link)

It turns out that after May 31st (although for me it seemed more like June 6th, 2022) Google disabled simple password logins for Gmail accounts. You need to enable two-factor auth, then create an app-specific password for LibreNMS. This was a good quick guide on how to do that. Since LibreNMS sends alerts when something is wrong but has no alert that alerting itself is working, it may be worth going and checking if you use LibreNMS with Gmail.

Computer Vision for Datacenter Auditing

I am going to start a series of posts about random ideas I have had but not had time to fully implement. The first in this series is an idea I worked on about three years ago (November 2019) for auditing a datacenter and mapping systems’ physical locations to their logical ones on the network.

The core of the idea is to use cameras in a datacenter (these could be security cameras) to see the servers in the racks, then use that data to map out the datacenter and save administrators from having to perform these audits manually. The process begins by training a machine vision model on what a server looks like. Most of the time at work I am working with Dell servers, so I thought that was a good starting point. To keep the model generic enough, I was just attempting to train it on what a 1U server looks like versus what a 2U server looks like.

At this point I needed A LOT of photos of servers with different lighting and angles. I took a bunch myself of different racks I had as a seed set, then turned to the web. Where could I get a large assortment of photos of Dell servers in different configurations and lighting? The homelab section on Reddit! People post their home setups and what they have all the time. I went through and downloaded several hundred photos of different people’s setups. Another place to get photos was eBay, where a lot of sellers put up photos of servers in different settings; the downside is that many people reuse the same photos again and again. I don’t know if the internet has figured out yet what the copyright rules are for using photos from online to train a model.

I researched a bunch of different techniques and played around with OpenCV, but then found a tutorial that seemed to be in line with what I was looking to do. (This one is also good, with very similar material.) I researched different image processing models and played around with several.

Now that I had the photos, I downloaded tzutalin/labelImg from GitHub, a graphical image annotation tool for labeling object bounding boxes in images. With it you go through each photo, select the item you are trying to learn, and label it. This took a while; it is fully manual work. A lot of the photos from the web had multiple servers in them, and each one needed to be selected. This proved to be one of the more time-consuming parts of the project. I also had to manipulate some photos so the rectangular bounding boxes could fit the servers, even when the photos were taken at weird angles.

I had to pick some of the photos to be the training set and others to be the testing set. With everything marked and that metadata ready, I converted the final metadata from XML to CSV using the xml_to_csv.py provided in the example repos above, and that was then fed into TensorFlow. The only system I had to start with, other than a laptop, was a CentOS 7 server; this proved to be very annoying because some dependencies, such as protobuf, were not available at new enough versions and had to be custom compiled.

It was time to let the model train for a while and see what it could learn. I learned several important things in this process. First, if you have GPUs, make sure you have a TensorFlow build that is compiled and ready to use them; the speed difference with and without them is kind of crazy. More RAM and GPU also help speed up the process a lot. At first I was playing with this on just a laptop, which didn’t have the GPU drivers for CUDA, and it was taking DAYS to work on the model. Later I switched to using GPUs I had in a server, and this greatly increased the iteration cycle speed.

Off the bat it was able to recognize a decent percentage of the servers in the photos I presented it! I do think a lot of the photos I tested it on were taken in fairly ideal conditions, with good lighting and camera angles, which may make it look better than it would perform in the real world. To improve the model I can always find more photos and train it on more images. At this point I was able to get the model to recognize about 80% of the servers in the racks I showed it. Another factor that could help in the future is the evolution of cameras: a lot of places are replacing 720p/1080p cameras with 4K cameras, and the more resolution the system has to work with, the better.

The next step was to start matching physical location to logical. The idea is that I can find the regions in a photo or video where servers are, and each server, through its iDRAC/IPMI, lets me blink the front chassis light. So, one host at a time, automation sends the command to blink the front chassis light (and perhaps some HDD lights), then scans for which region in the photo has started to blink; a sketch of that command is below.
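
As a sketch of that step, the standard IPMI identify command that most BMCs (including iDRAC) support looks like this; the address and credentials here are placeholders.

# Blink the chassis identify LED on one host for 30 seconds via its BMC/iDRAC,
# then watch the camera feed for which detected bounding box starts blinking
ipmitool -I lanplus -H 10.0.0.50 -U root -P 'calvin' chassis identify 30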

This is the idea I have slowly worked on for the last little while; I have prototypes of most of it working, but have not had a lot of time to put into it. The hope is that we could use existing cameras to get the footage needed to map the datacenters we already have, and perhaps in the future port the system to something like HoloLens or an Apple/Meta AR headset. Once we have that mapping, we can start drawing out the physical servers and their locations in the racks on a webpage, making it easier for people working in a datacenter to find the boxes they need. Hopefully one day it would let people click a server on a webpage and connect to its controller, without a human painstakingly going to each box and doing the mapping by hand. Of course, all of this could be solved by a team labeling each server, but where is the fun in that.