Quick post: I had a HA pair of ISE boxes in a lab the other day have the certificates that I made with a Windows Certificate Authority expire the other day and I ran into some odd behavior. To be clear, in this scenario, the certificates had a valid chain of trust, but it was past its expiration date.
I logged in after realizing this and had odd behavior, node-A could not read node-Bs certificates. Both nodes said they were no longer on domain, even though the domain disagreed and I logged in with domain credentials that were recently changed. Then when I went to make a Certificate Signing Request (CSR), I was able to make it, but when I went to download it I got a generic message of “Cannot connect to node-a”. At the same time all these issues were going on, under “Node Status” on the dashboard, both nodes were sharing health data.
In the end, ISE gets weird when the cert date has expired. I generated a new self signed cert for node-A. Then deleted the expired certs because the system didnt want me to make a CSR for the same thing it thought it had a cert for already. This allowed me to then properly make a CSR and export it. That gave me “ciscoisenodea.pem”, I brought that over to my setup Windows CA, and with a admin command prompt ran certreq -submit -attrib "CertificateTemplate:WebServer" ciscoisenodea.pem . Saved that to my local desktop, and went into ISE to Bind it to the CSR. Node-A then rebooted. All of a sudden things like the domain pairing, started showing they were working again. Then the second node, I did the same process, and all of a sudden everything was happy again. Note: make sure you have a your admin backup password, one of the nodes DID refuse to talk to AD and I had to use that, while the other one said it wasn’t on the domain, but did work…
A while ago I purchased a Ruckus ICX 7150-c12p off eBay to use at home. It gives 14x1gb/s ports, and 2 SFP+ ports. The SFP+ ports are limited to 1gb/s by default, and there is a honor system license for upgrading them to 10gb/s. These switches go for $600 – $1200 depending on where you get them and which license you get with it (1gb/s vs 10gb/s). The switch is also POE, and can do 4 POE+ (30 watt) ports. I had one of these switches and it worked great. I wanted to get a second one to replace the WiFi link I was using across my apartment with a fiber link.
Instead of paying ~$250, which was their going rate on eBay; I saw a forum post about replacing this models power supply, and thought I would give that a shot. I got a broken switch for $45, and then a PSU for $50. The PSU I used was a SL Power LB130S56K 56V 2.32 130W. Armed with someone’s photos of doing this repair it ended up going fine. The hardest part of the whole operation is that the pins going onto the main board are reversed from what the power supply comes with, so you need to flip them. I have been running the unit for almost 2 years now without issue.
This model of switch is great because of its features and is fanless. The fanless-ness part of it is nice for homelabs near your desk, because the switches are silent. Because they are fanless, they cant have anything put on top of them, and need some room to breath. I think a lot of the ones you see online dead are because someone didn’t give it enough air, and the PSU died. Note when looking for a similar dead switch on eBay, you really want the seller saying “when plugged in nothing happens”, not “it periodically blinks” because that could be bad ram and its in a boot loop.
Having run two of these switches for over a year, I can give some feedback. I really like them. I have the two I have in a stack, I login once and manage both. When it comes time for firmware updates you SCP the file to the management IP, and it downloads the file to both, and then flashes and reloads. I came from using Cisco gear usually, or sometimes Arista; the CLI is a bit different, and Ruckus handles VLAN setup a bit weird, but once you get used to it, it makes sense. They are solid switches, with POE, that you can set and forget for a while.
Broken guyEasy to open screwsde-lid-ed switchOld PSUThere is a little plastic cover that folds over itThermal pad on bottom of PSUThis little caddy holds the PSUHad to switch the PSU Pins, look at the forum post I am not 100% sure this is before or afterOld homelab about 2 years ago, I went a bit crazy getting broken ones to repair, gave one to my brother, other had bad memory
One technology I have played around with a little at work but wanted to get a better handle on is 802.1x. I have taken and passed the Cisco ISE cert a few years back and have used that with other services at work, but for the home setup I mostly wanted to be able to put different wireless devices onto different vlans based on device and user. Windows Server natively makes this possible with Network Policy Server (NPS).
An example of me playing with Network Policy Server
NPS is a Radius server in Windows at the end of the day. It gives you the conditions and a rules system to respond to different Radius calls, as well as a way to setup Accounting. It is fairly simple compared to something like ISE that can also do Posture and Profiling for devices; but for a quick free solution works well for home. You can say of a client is attempting to authenticate over a system like wireless, then accept x methods, vs if its wired or a switch login then accept other forms. Instead of going point by point of how to set it up, which you can find elsewhere online, I want to give some high level edge cases you may run into. First NPS need Windows Server with the Desktop experience, if you are running member servers or domain controllers as Server Core to simplify the environment then it will not work. NPS also does not easily HA. You can run multiple servers with it running, and export the config from one, then import it in another, but there is not a good system for dynamically syncing these (less you call random peoples PowerShell scripts a good system).
Some good reasons to use NPS is the simple AD integration, you can have users use their domain auth and easily get access. Or do as I do, really too much for home, or possibly anywhere, setup a domain CA, have a GPO that creates certs for each machine, then use cert based auth via 802.1x deployed via GPO. If anyone has questions about this I am happy to answer, but there are many places online that will talk about each of those configs and how to do them. Another place to integrate Radius other than 802.1x for wired and wireless is network device login. I use Radius for the stack of Ruckus switches (2 is considered a stack (like when you run k3s as a “cluster” of 1)) I have at home.
This is one of those Windows Services that works well, but also has not been touched in YEARS by Microsoft; like WSUS, or any other service that is useful. To backup this point, I installed several old versions of Windows Server in ESXi that I had laying around. Lesson 1 that I learned, the web console doesn’t work well with some of the legacy mouse support systems. Second you may need legacy VMware tools iso VMware Tools support for Windows 2000, Windows XP, and Windows Server 2003 (81466). The internet seems to say it came out in Server 2008.
I had to password recover a Cisco ISR 4451, and kept having issues getting into the ROMMON prompt. Every guide mentioned sending a BREAK character during startup, but I could not get that to work. I was using the mini-USB port in the front, and as far as I knew did not have password recovery disabled. It turns out there is a problem with the mini-USB port and the Mac driver, I switched to using a traditional serial cable with a DB-9 connector/RJ45 serial port and suddenly I could get into ROMMON. I wanted to post incase anyone else runs into this.
Below is the startup process, at the end there you should be able to send a BREAK character.
Initializing Hardware ...
System integrity status: 00000610
Rom image verified correctly
System Bootstrap, Version 15.3(3r)S1, RELEASE SOFTWARE
Copyright (c) 1994-2013 by cisco Systems, Inc.
Current image running: Boot ROM0
Last reset cause: PowerOn
Cisco ISR4451-X/K9 platform with 4194304 Kbytes of main memory
Warning: filesystem is not clean
File size is 0x1d482044
Located isr4400-universalk9.03.16.04b.S.155-3.S4b-ext.SPA.bin
<SEND BREAK HERE>
As a younger person in my career I got a few Cisco certs, the study material was available to me, and I thought it would be an interesting thing to learn. At this point, I have had a CCNP for almost 10 years and I still enjoy messing with networking even if it is not my day to day job. While I historically have used Cisco a lot, there are many other brands out there these days that have good gear, some even low power enough that I can run at home and not worry about the power bill. Below is my current home setup, it has changed a lot over time, and this is more of a snapshot than a proper design document. That is what homelabs are for right? Messing around with things.
Firewall
The firewall I am running is one I have mentioned on here before. The system itself is an OLD Dell Optiplex 990, released in early 2011 and soon to have its 10th birthday! Idling at ~30 watts, it works well for what I need it to with a gen 2 i7, and 8gb of ram. I added a 4 port Intel gigabit ethernet card to it, which allows for more ports and hardware offloading of a lot of IP tasks.
I looked around at different firewall OS options. Pfsense is the obvious one, but I found its interface lacking. (I use Palo Alto Networks firewalls at work and that interface/flow is more what I am used to) Opnsense is a bit better, but still leaves something to be desired on the UI side. Then I tried a Home License of Sophos XG. It is free as long as you stay at 4 cores or less, with 6 or less GB of ram used; you are given an “evaluation license” until 2099-12-31, if it runs out I will ask for an extension. For more than a year I have been enjoying it, the interface is slick, and you get the enterprise auto patching built in. In the time I have run it, I have had 1 zero-day attack on the product and it was immediately patched without me having to login. I use it as my home firewall between vlans, a DHCP server, and I also have IPSec and SSL VPNs for when I am away from home. The system does DNS for the house (on the less secure vlans, AD does those) and allows for block lists to be used. This is like a pihole but built into the product.
There are a few things it does a little odd, but I enjoy not having to go and write weird config files on the backend of some Linux/BSD to have my firewall work. I have it hooked into AD for auth, and that way I can login with a domain admin, and allow users who have domain accounts to VPN into home. It has been VERY stable, and usually only reboots when I tell it to do an update, or that one time the ~10 year old PC blew a power supply.
Cross Room Link
At the start of the year, I was running a Ubiquiti Wi-Fi mesh at home, it got decent speeds, and allowed me to not run wires over the apartment. The access points used were these models, link. They were only 2×2 802.11AC Wave 1; got decent speeds (around 400mbps), but being in a New York City apartment, I would get interference sometimes, even on 5ghz. The interferences would cause issues when playing games or transferring files. The bigger issue was my desk with a bunch of computers, and the firewall were on different sides of this link, meaning any data that was on a different vlan had to go over and back on this Wi-Fi link. On top of that, I will mention I basically HAVE to use 5ghz, I did a site survey with one of the APs and the LOWEST used 2.4ghz band near me was 79% utilized…
Anyway I started looking around for what I would replace it with, I always thought fiber could be a way to go since its small and if I could get white jackets on it, then it would blend in with the wall. I spend a few weeks emailing and calling different vendors trying to find someone who would do a single cable run of white jacketed fiber. Keep in mind this is early 2020 with Covid starting up. Lots of places could not do orders of 1, or their website would say they could and later they would say they couldn’t and refund me. Finally I found blackbox.com, I have no affiliation with them they just did the job quickly and I appreciate that. I got a 50 meter or so run, and was able to install that with the switches below.
Switches
Now that I had the fiber I needed some small switches I could run at home. After looking at what others have on reddit and www.servethehome.com I found the Ruckus ICX 7150-C12P. A 14 1gb/s ethernet, switch with 2 1/10gb SFP+ ports. The switch is compact, fan-less, and has 150 watts of POE! I can run access points, and cameras off of it without other power supplies. I have learned to look for before buying this sort of gear off eBay to try to get the newest firmware. With Cisco and HPe they love to put it behind a wall that requires an active support contract. Not only does Ruckus NOT do that, they have firmware available for their APs that allows it to run without a controller, more on that later.
I ended up buying 1 of the Ruckus switches “used” but it came sealed in box. Then getting another one broken, after seeing some people online mention they sometimes over heat if it was somewhere without proper ventilation and that can kill their power supply. The unit is fan-less, but the tradeoff there is nothing can sit on top of it, because it needs to vent. I was able to get one for around $40, then a new power supply for $30, all in I spend $70 for a layer 3 switch with 10gb ports! Now I have these 2 units on opposite sides of the room, in a switch stack. This way they act as one and I only need to manage “one switch”.
With the Ubiquiti gear no longer acting as a Wi-Fi link, which I have written about before, I only had one of the APs running. As mentioned before the access point was only 2×2 antennas and 802.11AC Wave 1. I was pondering getting a new Wi-Fi 6 access point, while looking around someone on reddit, again, suggested looking at Ruckus access points. Their antenna design is very good, and with their “Unleashed” firmware you get similar features to running a Ubiquiti controller. After looking at the prices I had to decide if I wanted to go Ubiquiti with Wi-Fi 6, and wait for their access points to come out, or get something equally priced but more enterprise level like a 802.11AC Wave 2 access point (like a Ruckus R510 or R610 off eBay).
I recently had a bad experience with some Ubiquiti firmware, then all of a sudden they killed Ubiquiti Video with very little warning, and some the more advanced functions I would want to do are either minimally or not documented with Ubiquiti. One could argue that I am used to enterprise gear, and Ubiquiti is more “pro-sumer” than enterprise; thus, I should not be upset at the lack of enterprise features. That made me decide to try something new. I ended up getting a Ruckus R610 off eBay and loading the “Unleashed” firmware on it. I can say the speeds and coverage is much better than the older access point. It is 3×3 802.11AC Wave 2, and with most of my devices still being 802.11AC I figured that was a good call.
One feature of the Unleashed firmware is it can manage all your Ruckus hardware. The web management portal has a place to attach your switches as well, and do some management of them there. I have been scared to do this, and coming from a traditional CLI switch management background have yet to do so.
Unleashed Home Screen
I was able to POE boot the AP just like I did with Ubiquiti, converting the firmware was easy, and there are many guides on Youtube for it. The UI does not have the same polish that Ubiquiti does, but the controller is in the AP itself which is very nice. There is a mobile app, but it is fairly simplistic. The web interface allows for auto updating, and can natively connect to Active Directory making it very easy to manage authentication.
There are 3 wireless networks in the home, 1 is the main one for guests, with their 6 year old unpatched android phones, that has a legacy name and meh password, that way I don’t have to reset some smart light switches Wi-Fi settings. This is where all the IoT junk lives. There is one with a better password that connects to the same vlan I am slowly moving things in the house over to, at least the key is more secure. Then there is the X wireless network, this one is not broadcast and has 802.1X on it. When a user authenticates with their domain creds, depending on the user and device I send them to a different vlan. This is mostly used for trusted devices like our laptops, and iPad when I want to do management things. This network for my domain account allows me on the management network.
10gb/s Upgrade!
The latest upgrade I have embarked on was 10gb/s. I moved my active VM storage off of the NAS to Storage Spaces Direct for perf. While the NAS has worked well for years, the 7 – 3 TB disks do not give fantastic IOPS when different VMs are doing a lot of transactions. After lots of thought and trials I went with Storage Spaces Direct and will write about it later. The main concern was that it allows all the hypervisors to have shared storage and keep it in sync, and to do that they need good interconnects. This setup is the definition of, lab-do-not-do-in-prod, with 3 nodes each with 3 SSDs over USB 3. I knew with USB 3 my theoretical bottle neck was 5gb/s, which is much better than the 1gb/s I had, that also had to be shared with all server and other traffic.
First I had to decide how I would layout the 10gb/s network, while the ICX 7150 has 2 – 10gb/s ports, 1 is in use to go between the switches. After looking around and comparing my needs/wants/power/loudness-the-significant-other-would-put-up-with I got a MikroTik CRS309-1G-8S+IN. I wasn’t super excited to use them, since their security history is not fantastic, but I didn’t want to pay a ton or have a loud switch. I run the switch with the layer 2 firmware, and then put its management interface on a cut off vlan, that way it is very limited on what it can do.
After that I got a HP 10gb/s server cards, and tried a Solarflare S7120. Each had their ups and downs, the HPs are long and would not fit into some of the slim desktops I had. But when the would work, like in the Dells, they would work right away without issues. The Solarflare are shorter cards which is nice, but most of them ship with a firmware that will not work on some motherboards or newer operating systems. For these you need to find a system they work in, boot to Windows (perhaps an older version) then flash them with a tool off their website. After that they work great. I upgraded the 3 main hypervisors, and the NAS. I have seen the hypervisors hit 6.1gb/s when syncing Storage Spaces. With memory caching I can get over the rated disk speed.
That is the general layout of the network at this point. I am using direct attached cables for most of the systems. I did order some “genuine” Cisco 10gb/s SFP+ off Amazon for ~$20, I didn’t believe they would be real, but I had someone I know who works at Cisco look them up and they are real. Old stock shipped to Microsoft in 2012 or so, but genuine parts. The Ruckus switches and these NICs do not care which brand the SFPs are, so I figured I would get one I knew. The newer Intel NICs will not work with non Intel SFPs so look out.
To summarize, the everything comes in from my ISP to the Sophos XG box, then that connects to a port on one of the Ruckus switches. Those two Ruckus switches have a fiber link between them. Then one of the SFP+ ports on the Ruckus switch goes to a SFP+ port in the MikroTik switch. All the hypervisors hang off that MikroTik switch with SFP+ DACs. Desktops, video game consoles, and APs all attach to 1gb/s ethernet ports on the Ruckus switches. I have tried my best to label all the ports as best I can to make managing everything easier. I’m sure this will evolve more with time, but for my apartment now 10gb/s networking with a Ruckus R610 AP has been working very well.
I am starting a series about my homelab and how it is all laid out. I have written this article a few times, with months in between. Each time the setup changes, but we seem to be at a stable-ish point where I will start this series. Since I wrote this whole article and now a while later am editing it, I will mark with italics and underline when present me is filling in. I think it will give a neat split of growth in the last year or so I have been working on this. Or it will make it illegible, we will see. My home setup gives me a good chance to test out different operating systems and configs in a domain environment before using that tech elsewhere like at work.
Hypervisor
Starting off with virtualization technology, I settled a while ago on Microsoft Hyper-V instead of ESXi, the main reason behind it is I already had Windows Server, and Hyper-V allows for Dynamic memory, and allocating a range of memory for a VM. When something like an AD controller is idling, it doesn’t need much memory; when it starts it may, Dynamic memory allows me to take that into account. I will say one place that has bit me later is file storage, but that will be a later post.
Original post layout
Updated Diagram
The setup is technically “router on a stick”, where the Sophos XG firewall functions as the router, and the rest of the devices hang off of that. The Sophos XG machine is a old Dell Optiplex 990 (almost 10 years old!) with an Intel quad NIC in it. That way it can do hardware offloading for most of the traffic. I intend to do posts for networking, hypervisors, file storage, domain, and more; thus I will not get too in the weeds right now on the particulars.
The file storage is a FreeNAS box recently updated to 7, 3TB HDDS. I have had this box for about over 6 years (I just looked it up in November 2020, one of the drives has 55257 hours or 6.3 years of run time on it); it is older but has worked well for me so far.
The network backbone is a new switch I really like that I was able to get 2 of off eBay; they were broken but I was able to repair them, more on that later as well. They are Brocade, now Ruckus, ICX7150-12P; 12 1GB/s POE ports, 2 additional 1GB/s uplink port, and 2, 1/10GB/s SFP/SFP+ ports. These switches can run at layer 3, but I have the layer 2 firmware on them currently. They have a fiber connection between them, before that I was using 2 Unifi APs in a bridge, that didn’t work fantastic however because A. I am in NY, B. they were only 2×2 802.11AC Wave 1, and C. I am in NY. I custom ordered (so the significant other would not get mad) a white 50m fiber cable to go around the wall of the apartment.
With SSDs in the hypervisor boxes (I call them HV# for short) and iSCSI storage for VMs as well, which VMs are on which host doesn’t particularly matter. Flash forward 6 months or so, since that first sentence was written, I now still use the NAS for backups, but the hypervisors are running Storage Spaces Directed and doing shared storage now. This allows the hypervisors to move move VMs around during patching or pause during a system update if they are less critical. The Intel NUC and small Dell Inspiron are much under powered compared to the mid tower hypervisors, so they run usually only 1 or 2 things. The NUC runs the primary older domain controller, and that is it. It is an older NUC that I got about 7 years ago, so its not that fast. The “servers” in the hypervisor failover cluster are a Lenovo and 2 Dell Optiplex 5050s. I like these Dells because they go for about $200 on eBay, while having a Intel 7600 i5, can support 64GB of ram, and have expansion slots for things like 10gb SFP+ cards. These machines also idle at about 30 watts, which makes the power bill more reasonable.
Some of the services I run include:
2 Domain Controllers (Server 2016, and 2019)
Including Routing and Access service for RADIUS and 802.1x on wifi on wired
Windows Admin Center Server (Windows Server 2019)
Windows Bastion (This box does Windows Management) (Server 2019)
Veeam Server (Server 2019)
Unifi Controller/Unifi Video for security camera (Ubuntu)
3 Elastic Search boxes for ELK (CentOS 8)
Linux Bastion (CentOS 8)
Foreman Server (CentOS 8)
LibreNMS (This I grew to really like) (CentOS 8)
Nessus Server (CentOS 8)
Jira Server (CentOS 8)
That is the general overview, I will spend the next while diving into each bit and discussing how it is configured and what I learned in doing that.
Having a small home lab I wanted to be able to setup internal services, and then on the go be able to access them. While I could setup a L2TP or SSL VPN and connect whenever I wanted to use these services, I thought I would give On-Demand VPN via a iOS/macOS configuration a try. Little did I know the world of hurt I was entering. I will start with the settings you need to get it working, since a lot of people just want that. Then I will talk about the crazy and painful road I went down before finding 1, just 1, set of settings that seem to work. If you have any questions, thoughts, or success stories please comment below!
Fun fact: I will be calling the protocol IPsec here. That is what the original RFC called it, what the original working group was called, and the capitalization they used. Sophos agrees and uses that capitalization, while Cisco and depending on which web page you are on for Microsoft may call it IPSEC or IPSec or IPsec.
On-Demand VPN gives you the ability to set certain websites or IPs, and when your phone or laptop attempts to connect, the machine silently brings a IPsec tunnel online and uses it for that traffic. This allows you to run services at home, and to users (your mom or cat or whomever) it looks like just another website. Apple has 1 big requirement for them, you have to use certificate based auth. You can not use a pre-shared key/password. Also up front, to save you a few days of trying things. iOS and macOS will NOT check your certificate store for your VPN endpoint (Sophos XG) certificate, it HAS to ship with the firmware or you will get the fantastic and descriptive “Could not validate the server certificate.” Also believe it or not, that is one of the most descriptive errors you will get here. There are some posts on the Apple support forums from Apple engineers saying the root CA has to be in already on the device. If anyone gets it to work with your own let me know.
Sophos XG Setup
I am using Sophos XG v18 with a Home license, backed by AD running on a Dell Optiplex for this guide (dont worry it as a cool Intel Nic in it). To setup the IPsec server in Sophos XG first we need to make 2 certificates. Login to the admin portal, then on the bottom left select “Certificates”. You need 2 certificates; 1 is our “local certificate” (we will call it Cert-A) this is a cert that is used for the server (Sophos) end. As previously mentioned, this has to be a real signed cert. I ended up forwarding a subdomain on my site to the firewall, and then using Let’s Encrypt to create a cert for that URL. I used this site, https://hometechhacker.com/letsencrypt-certificate-dns-verification-noip/ to guide me in creating the cert on my laptop, then I uploaded that to the Sophos firewall. This will require you to have access to your domains DNS settings or be able to host a web file.
The second cert (Cert-B) is for the client, Sophos will call it “The Remote Cert”; this is to auth to the firewall, that can just be a locally generated cert. All devices will share this cert. The devices will use their username and password combination to identify the user. I used email as the cert ID, note this email does not have to exist, I just made one up on my domain so I will know what this cert is. Once created, go back to the main Certificates page and download the client/remote certificate, I suggest putting an encryption password on it since the Apple tools seem to freak out if that is missing. But ALSO the password for this cert will be in clear text in your config, so don’t make it a password you care about. These certs all need to be rotated at least once a year, with the newer requirements; Let’s Encrypt is every 90 days and I intend on automating that on one of the Linux machines I have.
Now that we have our 2 certificates, lets go over to “VPN” on the left hand navigation. I have tried many settings in the main “IPsec Connections,” and none of them have worked for me. I get fun and generic errors from the Mac of “received IKE message with invalid SPI (759004) from other side” or “PeerInvalidSyntax: Failed to process IKE SA Init packet (connect)”.
Click the “Sophos Connect Client” tab, the back end of this client is just a well setup IPsec connection. Fill in the form, from the external interface you want to use, to selecting “Digital certificate” as your auth method, followed by the “Local certificate” which is the Let’s Encrypt one (Cert-A). “Remote certificate” is the one we will load on your device (Cert-B).
Now you select which users you want to have access to use this. I have Active Directory backing my system, so I can select the AD users who have logged in before to the User Portal. This is a trick to Sophos XG you may need, if you use AD and a user doesn’t show up, that means they need to login to the User Portal first.
Select an IP range to give these clients, I suggest something outside any of your normal ranges, then you can set the firewall rules and know no other systems are getting caught in them. Once you are happy, or fill in other settings you want like DNS servers, click “Apply”. After a second it will activate, you can download the Windows and Mac client here, or follow along to make a profile.
Apple Configuration
To create a configuration file you need to download Apple Configurator 2, https://apps.apple.com/us/app/apple-configurator-2/id1037126344 onto a Mac. I know what you are thinking, 2.1 Stars, Apple must love enterprises. Download that from the store and open it up. If you do not have a Mac I attached a templatethat you can edit as a text document down below. This profile needs a Name, as well as an identifier. The identifier is used to track this config uniquely, if you update the profile, then your device will override old configs instead of merging. You will see on the left there are LOTS of options you can set, the only 2 week need are “Certificates” and “VPN”.
Starting with Certificates, click into that section, then hit the Plus in the top right. Upload the cert we exported from Sophos (Cert-B) earlier for the end device, and enter the password for it. Again note, this password is in plain text in the config file.
Now for the VPN Section. Click the Plus in the top right again to make a new profile, name the connection anything. Set the Connection Type to “IPsec”. IKEv2 is IPsec but a newer version, I will get into some of this later after our config is done and I can rant. Server is your Sophos XG URL. Account and password can be entered here to ease setup, or you can leave one or both blank to make the user enter it when they import the config. You can leave the user/password fields blank (it will give you a yellow triangle but that is fine) and then give it out widely and not have your creds in it… For “Machine Authentication” you want “Certificate”; you will see in selecting “Certificate” all of a sudden the On-Demand area appears. For “Identity Certificate” select the one we uploaded before. Finally we can enable “Enable VPN On Demand” and select the IPs or URLs you want to trigger the VPN.
Once that is done, save the profile and open it on a Mac or you can use this configuration tool to upload it to an iOS device. That should be it! Your devices should be able to start the connection if you ask it, and if you go to the website should auto vpn. Make sure you have firewall rules in Sophos XG for this new IP range, or that can block you from being able to access things.
A small note, from my tinkerings with the On Demand profile if you go to Safari on a iOS device, it will connect when you visit a website that is in the configuration. If you use a random app, such as an SSH application, I didn’t find it always bringing the tunnel up, and at times it had to manually be started. Something to lookout for, a nice part of the the IPsec tunnel is that it starts quickly.
Now that the config is done, I want to mention some of the other things I have learned in tinkering with this for several days. The only way I got it to work is using that Sophos Connect area, and the other big not documented thing is you have to use a publicly trusted cert for the Sophos end. I found 1 Apple engineer mention this on their forum, and a TON of people talking about how they couldn’t get the tunnel to work with their private CA. I have tried uploading a CA, and injecting it different places with different privileges for the Mac and never could get it to work. The Let’s Encrypt cert imminently worked.
For IPsec v1, aka IKEv1, Apple uses the BSD program racoon on the backend to manage the connection. Using the “Console” app you can find the logs of this. For IKEv2 it seems Apple wrote their own client around 2016-2018, there are a lot of reports online that it just doesnt work at all with cert based auth. All the guides about it working stop around 2016. You can find earlier ones, or people using pre-shared keys, but selecting pre-shared keys doesnt allow us to do a On Demand VPN. The bug has been reported for a while, https://github.com/lionheart/openradar-mirror/issues/6082. If you try to do this, you can expect A LOT of “An unexpected error has occurred” from the VPN client. Even looking at the Wireshark traffic didn’t lend any help on tuning Sophos to give the IKEv2 client something it would accept. If someone figures out how to get that to work in this setup please let me now.
Now that everything is setup you can host things yourself. I give the auto connecting VPN less rights than when I do a full tunnel on my laptop, but it allows for things like Jira to be hosted, then mobile clients to easily connect.
Template
For your cert to work in the template it needs to be converted. Sophos will give you a .p12 file for your cert, use the following command to get the version that needs to be in the .mobileconfig file. You’ll at minimum want to edit the cert area and put yours in there, set the password for the cert, and any URLs you need.
Different network gear I have has had many problems trying to get email alerts working. I thought I would document them. All of these systems use a service gmail address I made on free/public gmail to send alerts to me.
Sophos, and LibreNMS gave me no problems; if you have issues with them drop a comment below and I can post my settings.
Ruckus AP
The trick to getting Ruckus Unleashed, I used “smtp.gmail.com” and port 587. The issue I ran into is the service email I use to send emails had a long password. Ruckus Unleashed v200.8 supports a maximum of 32 character passwords. I would also mention it dumps the password raw into the logs, so make an account you dont care much about.
Unifi Controller
After digging through logs and getting lots of “There was an error sending the test email to x@gmail.com. Failed to send email for unknown reasons.”, I found one post that mentioned a fix for the console log of “fail to send email: api.err.SmtpSendFailed”. You need to once again use smtp.gmail.com, and port 587, but since its TLS, you need to counter intuitively UNCHECK “Enable SSL”.
Setting up AD auth in the product is straight forward, set your domain search as wide as you are comfortable with, because next you import groups that are under that search. Next, make sure to hit the little icon that imports all the AD groups you want, it is easy to overlook.
Import groups button
Now go to the Services tab, and include your new AD servers in your group for Admin Authentication methods. The guides say to make AD first, and in testing I just put one of the servers above local; but this shouldn’t matter too much, local auth still works.
Admin Authentication Methods
Now here is the trick that got me. TO HAVE THE USER SHOW UP IN THE USER AREA OF AUTHENTICATION, YOU MUST HAVE THEM LOGIN TO THE USER PORTAL FIRST. Thus the User Portal needs to also be setup to allow AD auth. After that, the user will appear like below, and you can click in to edit them.
User admin panel
Clicking into the user you can make them an Admin, and set their group. You have to provide a email at this point for the user. BEWARE, MAKING THE USER AN ADMIN IS NOT REVERSIBLE! IF YOU WANT TO MAKE THEM A NORMAL ACCOUNT AGAIN YOU NEED TO DELETE THE USER, AND IF THIS USER IS USED IN ANY FIREWALL RULE OR SETTINGS THIS WILL BE BLOCKED UNTIL THEY ARE REMOVED FROM ALL OF THEM. One fix for this is to make them part of a Admin group that has no rights to anything, but that doesn’t feel like the proper way.
User panel making a user an adminError if you try to delete a user tied to policies
Then you should be good to go!
Troubleshooting
Some troubleshooting techniques I used while fixing this: if you don’t have the user imported into Sophos XG, and attempt to login to the Admin panel, you will get “Wrong username/password” and looking at the logs in Sophos you will see “Wrong credentials entered for x@domain”. This is not exactly true and can throw you off. If you login to AD and look at your servers Security logs, it says “User login successful”. That is a good indicator that at least your login is working correctly, don’t get fooled by AD saying success, while Sophos says wrong; the user just needs to login to the User panel first to link the accounts.
In my apartment I needed to get wired networking with VLANs across the apartment. I didn’t want to run a wire since I thought my roommate would not appreciate that. I wanted to have a switch near my desk, that allowed different devices I have like file server, desktop, and a few other things to have a wired link; then, connect to the modem/firewall and rest of the networking gear across the apartment.
Long story short, I ended up using a trick I didn’t know would work till I tried it. I have 2 x UAP-AC-M, they work decently well, topping out at 867Mbps and 2×2 MIMO; as well as being able to get them on sale in a 2 pack for a decent price made them a great deal. I have run 1 of them for 4 years as my main access point. Then when I wanted to get this wire connection in a new room configuration I tried to do a wireless uplink to the second one. This makes it mesh with the first access point. Now the important item I don’t seem written anywhere but works well (caveats below):
Ubiquiti access points in wireless uplink/mesh will bridge that network to the wired port on the device
This means if you have a trunk port going into your original/base mesh AP, you will have the same trunk port coming out the other end. This also means anyone who is running mesh points, and hasn’t secured the wired port may want to think about doing so. I am will skip over HOW to set this up, Ubiquiti has a good guide https://help.ui.com/hc/en-us/articles/115002262328 to walk you through it, and most APs can do wireless uplink at this point; this is more about saying it can be done, and works well from my experience to anyone thinking about implementing this or wants a solution for their home/apartment that is not powerline networking. The APs I have are 2×2 802.11AC, I’m sure with a 4×4 AP like the AC-Pro as your base you may see better performance on higher trafficked lines.
This setup has worked well for me for over 6 months now, I can easily hit the 300Mbps I get from my internet connection on a desktop plugged into this meshed AP’s port; I also get 6ms pings to servers while playing games. You get the benefit of real commercial grade antennas and radios in the APs you are using instead of a tiny wifi chip in a laptop, desktop, or device. This also lowers the number of wireless devices (since all the wired devices would have been wireless instead). I also disabled the secondary AP from hosting any of the SSIDs I have in the apartment, so it just works as a wireless uplink. My apartment is not big enough for 2 AP’s for devices.
Caveats
I am looking to move away from this setup for a few reasons. It has worked well and if you are in a pinch I would recommend this setup much more than powerline networking which I have also tried and used several times. I am hoping to move to 10gb/s networking at home with my growing homelab setup; thus, no more wireless link. The other limitation that 99% of people probably would not care about is that you can not do jumbo packets over wireless, so that means it can not be done from all I have read over a wireless link of this type.
Network Topology
The first caveat is that this configuration slightly confuses the access point when it first starts up. The first 60 seconds or so when the access point is online it will think the wired connection is its uplink and attempt to ping out over it. After that it realizes it cant hit anything and will go to wireless uplinking. Sometimes everything just works then, sometimes I have had my switch be confused about where traffic should go and had to power cycle it; in this case it was just a Netgear Prosafe switch with VLANs, not especially smart, but not the dumbest switch. This is similar to a enterprise networks re-converge time when a link is downed. Overall it is rarely a problem and these APs are solid and can go months between restarts, but this is something to lookout for.
Remember that if a Ubiquiti AP cant get an IP, then it doesn’t broadcast SSIDs; this is important since if the base AP boots (like after a power outage) and doesn’t get a DHCP address quick enough, it wont broadcast, then the mesh side will never find an uplink to connect to.
Management
With the earlier mentioned topology issues you can run into, that can make management difficult. You need to make sure the base side of the network is stable. You can get into a position where you did a bad config push or a setting is wrong on the secondary/mesh side and the only way to fix the config is bringing that AP back to the original wired network and pushing a config to it, before the secondary AP can go back into wireless uplink mode.