Episodic Genius


occurring occasionally and at irregular intervals


Work quickly, keep the tools unlocked, work whenever.

A few years back, I made a lateral move within the company where I worked, into a different industry from the one I had been in for a while. In my new position, we had a small data center filled with hardware prototypes and other hardware that we used as part of development. On my first day, a colleague I had worked with previously gave me a virtual tour of the network.

Working with the machines in the data center was like working in a very large maze. Depending on where you wanted to be at any given time, you needed to know a different set of directions to get there. There were a number of machines on the border between the company network and the data center. Some of them were Linux machines to which you’d open an SSH session. Others were Windows machines that allowed RDP. I found that I spent a great deal of my time navigating the maze and had less left to give to the real effort of developing the products. No one seemed to know any better.

I had long had a knack for networking and quickly set out to untangle the mess so that we could start getting some work done. I demonstrated a number of methods for getting nearly seamless access to the data center. I set up OpenVPN on one of the border machines with strong authentication based on an SSL certificate infrastructure that the company already had in place. For developers close to the data center, who were able to physically connect to the same subnet as some of the border machines, I wrote documentation showing how to set static routes to the data center. I did what I could to make sure that everyone in the organization who was authorized could access the data center as easily and transparently as possible.

About a year later, even with all of the transparency in the network, we had accumulated racks and racks of prototypes, each with its own isolated private network. The private network was the only way to contact some of the products’ internal components. The worst of it was that every prototype used the same class C (24-bit netmask) network address space. Most would have given up here, but I had a strong gut feeling that there was a way to easily give developers access to these private networks.

I started out by imagining a server with many, many network cards installed and Ethernet wires running out the back to each prototype’s internal switch. With that in mind, I devised a way to connect this server to any number of networks, all with the same address space. Each internal network was mapped (NETMAP) to its own unique address space on the outside developers’ network. Network requests from the developers’ network would be marked so that we could later route them to the correct interface, and then mapped to the common private address space. I set up proxy ARP entries on the server to respond to those requests and used NAT to make each request appear as if it were coming from the server’s IP address on the internal network.
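To make the mapping idea concrete, here is a minimal Python sketch of the arithmetic behind a NETMAP-style one-to-one translation: the network bits are swapped and the host bits are kept. It is only an illustration, not the configuration I actually ran. The function name netmap and the address ranges (10.20.3.0/24 for one prototype as seen by developers, 192.168.1.0/24 for the shared private space) are made up for the example.

```python
import ipaddress

def netmap(addr, from_net, to_net):
    """Translate addr from from_net into to_net, keeping the host bits.

    Hypothetical helper: it mirrors the 1:1 idea behind NETMAP,
    but it is not the firewall rule itself.
    """
    addr = ipaddress.ip_address(addr)
    from_net = ipaddress.ip_network(from_net)
    to_net = ipaddress.ip_network(to_net)
    if from_net.prefixlen != to_net.prefixlen:
        raise ValueError("both networks must use the same prefix length")
    if addr not in from_net:
        raise ValueError(f"{addr} is not inside {from_net}")
    # Keep only the host part of the address (the low 8 bits for a /24).
    host_bits = int(addr) & int(from_net.hostmask)
    # Graft the host part onto the destination network.
    return ipaddress.ip_address(int(to_net.network_address) | host_bits)

# Assumed ranges: developers see prototype 3 as 10.20.3.0/24, while every
# prototype internally uses 192.168.1.0/24.
print(netmap("10.20.3.17", "10.20.3.0/24", "192.168.1.0/24"))
```

The same function run in the other direction gives the developer-facing address for any internal component, which is why every prototype can keep the same private range without collisions on the outside.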

I never actually needed all the network cards. I found that I could set up VLAN interfaces on the cards and then configure the switches to connect them up. The real trick was connecting to many homogeneous networks and keeping them all straight. To make it just a little trickier, each rack had up to five prototypes and five VLANs. Two or more prototypes could be moved to the same VLAN to make a larger version of the product, so the server could never reliably know where to find any particular component. To solve this, I created five virtual interfaces for each rack and bridged them together. Then, I added Shorewall rules and an arptables rule to ensure that the server could communicate freely with any component on any VLAN but traffic originating on one of the networks could not bleed over to another.

The last piece of the puzzle was to script up the creation of all of the necessary network configuration files so that others could maintain it. Then, I set up some DNS entries for all of the components of all of the prototypes so that developers didn’t have to remember how to map IP address ranges to prototypes. It all worked great and most people forgot all about the complexity behind the system.
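For a sense of what that kind of generation script looks like, here is a small Python sketch that turns a table of prototypes into name-to-address records. It is a stand-in under stated assumptions, not the script I wrote: the function host_records, the component offsets, the prototype names, the address ranges, and the lab.example domain are all invented for illustration.

```python
import ipaddress

def host_records(prototype_nets, component_offsets, domain="lab.example"):
    """Build hosts-file style lines, one per component per prototype.

    prototype_nets maps a prototype name to its developer-facing network.
    component_offsets maps a component name to its host number, which is
    the same inside every prototype. All names here are hypothetical.
    """
    records = []
    for proto, cidr in sorted(prototype_nets.items()):
        net = ipaddress.ip_network(cidr)
        for component, offset in sorted(component_offsets.items()):
            # Reject the network and broadcast addresses and anything outside.
            if not 0 < offset < net.num_addresses - 1:
                raise ValueError(f"offset {offset} does not fit in {net}")
            address = net.network_address + offset
            records.append(f"{address}\t{component}.{proto}.{domain}")
    return records

# Assumed inventory: two prototypes, each with the same two components.
nets = {"proto3": "10.20.3.0/24", "proto4": "10.20.4.0/24"}
offsets = {"controller": 10, "switch": 11}
for line in host_records(nets, offsets):
    print(line)
```

Because every prototype shares the same internal layout, a single table of component offsets is enough, and adding a prototype is one more entry in the network table.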

Every once in a while, when I had to work on a test system outside of all of this, I used an alternate method. I built a custom RPM of tinyproxy and installed it on one of the servers. Then, I’d change the proxy settings for my web browser and SSH to point at the specific prototype that I needed to work with. This worked really well for me, but few others wanted to deal with installing the tinyproxy RPM and frequently changing proxies.