Windows Server DNSSEC Error 9110

TL;DR; Check that your Domain Controllers are in the correct OU and that Microsoft Key Distribution Service is running

I ran into an issue recently when DNSSEC signing a dns zone where Windows Server 2019 gave a very vague error, and would only display that error after 10 minutes of timeout. This made iterating on it very slow since every change I made was a 10 minute wait. Every guide to setup DNSSEC mentioned right clicking the zone, then clicking sign and as long as you select the default it should just work. On another domain, that happened for me and it just worked; except the one original one that kept timing out.

In setting a custom DNSSEC signing policy I noticed that there were different keystores each of which gave a different error. This made me think it was something to do with the specific one I was using. It was time to troubleshoot the service itself not DNSSEC.

I got a list of the services from a known good, and signing, domain controller; then compared that to the bad one to see what was different. Part way down the list I noticed that Microsoft Key Distribution Service was failing to start, and if I tried to start it, there was an error.

Group Key Distribution Service cannot connect to the domain controller on local host Status 0x80070020.

Checking the Event Log showed an issue in finding the Domain Controllers on the network (error above), which was weird because it is a Domain Controller… In looking at where this system was placed in the domain tree, I saw it had been moved from the original OU for domain controllers to another place. I dragged it back, after applying all the GPOs that were on that other folder to the original Domain Controller folder. Then held my breath, hit start on the Key Distribution Service and it started right away.

After that DNSSEC signed with no issues. Long story short, dont move your DCs it’ll only end in pain. And to the one other person on the internet who has seen this problem and never solved it, 5+ years ago there is your answer!

US Patent US10530642B1

One of the projects I currently work on at work, and have for the last few years is how to go from a blank stack of servers to a fully configured cluster with my companies software running on it. While some projects were starting and getting going in the open source field when I started this project 5+ years ago, a lot of them kept rewriting their API every minor version rev. That started my down a path that has now become a decently large internal network booting infrastructure, and managing interconnects to our inventory system as well as other systems such as Tenable Nessus. I recently was awarded my first patent! This one is specifically about how my system interacts with the inventory to dynamically assign systems as they come online to clusters.

My part of the code was all written in Java and continues to evolve as a platform, I hope to open source a good amount of it down the road. I started the project by reading the RFCs for DHCP/PXE and then writing code. I have grown to enjoy writing libraries and some project this way of adhering to the standard (more on that some other time). The general platform can handle ProxyDHCP PXE booting, and then uses iPXE to create menus and boot systems. I spent many hours debugging different vendors PXE code and BIOS vs UEFI to get all the systems to work. The platform now supports plugins for many different aspects of server configuration.

I could write page about small details I have learned a long the way; one issue that has been driving me crazy recently, if you want to ProxyDHCP instead of using your main DHCP stacks these days is Secure Boot. iPXE does not have a Secure Boot signed image, I have tried to get Microsoft to sign it but they will not unless you are selling a product using that the sign iPXE. I am not I just wanted it for internal use. That means you may want to use grub2 as your loader, but there is a bug that has been outstanding for over 6 years and makes ProxyDHCP with grub basically impossible, which is sad.