Top gotchas: Creating virtual machines under KVM with virt-install


January 20, 2016

After a stint building Stacy’s new blog whilst contributing to Drupal 8, I’ve been directing my free development time to more ops-like concerns for the past month or so. I moved to a cheaper parking facility at work and plan to direct the savings to bankroll the expansion of my own little hosting platform. And, I’m going virtualized.

This is a big shift for me. I’ve gotten many years of trusty service without much trouble from the “single dedicated server with OS on bare metal” model – only a handful of minutes of downtime a month for major security updates – all while watching several different employers flail around keeping all the pieces of their more complex virtualized environments operational.

On paper, the platform I’ll be moving to should be more resilient than a single server – the cost savings of virtualization plus infusion of extra cash will enable me to run two redundant webserver instances behind a third load balancing VM, with another (fourth) load balancing VM that can take over the public IP addresses from the first load balancer at a moment’s notice (thanks to the “floating IP” or “reserved IP” features that the better VPS providers have been rolling out recently). The provider I’m going with, Vultr, has great customer service and assures me that my instances will not share any single points of failure. Thus, I should have full redundancy even for load balancing, and I’ll be able to take instances down one at a time to perform updates and maintenance with zero downtime.

Alas, even on VMs, full redundancy is expensive. I could have just bought sufficiently small instances to host everything at Vultr and stay within my budget, but I’m betting I’ll get better performance overall by directing all traffic to a beefed-up instance at Vultr (mostly superior to the old hardware I’m currently on) under nominal conditions, and having the load balancer fail over to a backend server on hardware in my home office for the short periods when my main Vultr instance is unavailable. My home office isn’t the Tier 4 DuPont Fabros facility in Piscataway, NJ that the rest of the instances will reside in, but it does have server-grade components with a battery backup that hasn’t seen a power failure yet, and the probability of a simultaneous outage of both webservers seems very low.

The two backend webservers need to be essentially identical for sanity’s sake. However, as hinted in other posts such as the one where I installed an Intel Atom SoC board, I’m a bit obsessive about power efficiency, and there was no way in hell I was running a separate machine at home to serve up a few minutes of hosting a month. So, I needed to figure out KVM.

As near as I can ascertain, having now been through the experience, there are a couple of defaults you always need to override if you want a VM that can actually write to its block device and communicate on a network.

  1. You probably need to create an ethernet bridge on the host system, onto which you attach your guests’ virtualized NICs. Examples abound for configuring the bridge itself in various linux distros, but none seem to mention with clarity that the host kernel needs to be told not to pump every ethernet packet traversing the (layer 2!) bridge through the (layer 3!) iptables. I’m puzzled whose idea this was, but someone made this thing called bridge-nf and apparently talked their way into it being default ‘on’ in the Linux kernel. Maybe that would be nice if you wanted to run suricata on a machine separate from your main router, or something, but otherwise I’d recommend turning this off:

    cd /proc/sys/net/bridge
    for f in bridge-nf-call-*; do echo 0 > $f; done

    …and if the bridge should pass tagged vlan traffic, snag bridge-nf-filter-vlan-tagged too. With bridge-nf left on, strict iptables rules on your host will leave your guests with no network connectivity at all – and on the flipside, while troubleshooting that very problem, I inadvertently caused packets from the theoretically isolated bridge to start appearing on a totally different network segment when I tested a blanket-accept rule on the host iptables’ FORWARD chain. Since the whole point is to isolate my hosting-related vlan and subnet from my plain-old home Internet connection sharing vlan and subnet, that result was exactly what I didn’t want. Bridge-nf is scary stuff; turn it off.

  2. If you opt to use lvm volumes as the backing for your VM’s disks, virt-install provides a handy syntax to cause it to make and manage the logical volume along with the VM:

    virt-install --disk pool=vm-guest-vg,size=80,bus=virtio ...

    where vm-guest-vg is a volume group you’ve set aside for VM disks. This says: make me a new 80G logical volume in the vm-guest-vg volume group, associate it with this instance, and attach it as a disk on the guest through the optimized virtio bus. If you stop there, however, your VM will experience many I/O errors. You must also add “sparse=false”:

    virt-install --disk pool=vm-guest-vg,size=80,bus=virtio,sparse=false ...

    and then, finally, your VM hosting environment will be reasonably usable to a guest operating system.
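One follow-up on the bridge-nf toggles from item 1: echoing zeroes into /proc/sys only lasts until the next reboot. The same settings can be made persistent through sysctl configuration – a sketch, assuming a distro that reads /etc/sysctl.d (the filename is an example), and noting that on kernels where bridge-nf lives in the separate br_netfilter module, these keys don’t exist until that module is loaded:

```
# /etc/sysctl.d/99-bridge-nf.conf  (example path)
# Keep layer-2 bridge traffic out of iptables/ip6tables/arptables.
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-arptables = 0
# Only relevant if the bridge passes tagged vlan traffic:
net.bridge.bridge-nf-filter-vlan-tagged = 0
```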

For completeness (and my own reference later on), here’s an entire virt-install command:

    virt-install -n 'the-instances-name' \
        --virt-type kvm --hvm --autostart \
        -r 4096 --vcpus=4 \
        --os-type=linux --os-variant=ubuntutrusty \
        --cpu host \
        --disk pool=vm-guest-vg,size=80,bus=virtio,sparse=false \
        --network bridge=hostingbr,model=virtio \
        --graphics vnc \
        -c "/data/public/Program Installers/ubuntu-14.04.3-server-amd64.iso"
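Since I’ll need an essentially identical invocation for the second webserver, it’s worth parameterizing – a sketch in plain sh, using the same example values as above (name, pool, bridge, and ISO path are all placeholders to substitute). It only echoes the assembled command so it can be reviewed before running:

```shell
#!/bin/sh
# Assemble a virt-install invocation from variables so it can be
# reused per-guest. All values below are the examples from this post.
NAME='the-instances-name'
RAM_MB=4096
VCPUS=4
DISK='pool=vm-guest-vg,size=80,bus=virtio,sparse=false'
NET='bridge=hostingbr,model=virtio'
ISO='/data/public/Program Installers/ubuntu-14.04.3-server-amd64.iso'

CMD="virt-install -n $NAME --virt-type kvm --hvm --autostart \
-r $RAM_MB --vcpus=$VCPUS --os-type=linux --os-variant=ubuntutrusty \
--cpu host --disk $DISK --network $NET --graphics vnc -c '$ISO'"

# Echo instead of executing, so the command can be inspected first;
# run it with eval "$CMD" (or call virt-install directly) when satisfied.
echo "$CMD"
```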

There’s still much work ahead of me to build this entire infrastructure out – some custom scripting will likely be needed to pull off automated IP address transfers between the load balancers, and I’m shooting to use Apache Traffic Server as the HTTPS terminator/caching proxy, because it seems to have been doing the HTTP/2 thing for longer than alternatives like nginx, and HTTP/2 is cool. I’ll do a follow-up post after it’s all been up and running for a while, and we’ll see if going virtualized was even a good idea…

And, stay tuned, once I’m done with that, I have plans to further reduce my home infrastructure’s power consumption by losing the bulky full ATX PSU in my main server, which burns about 10 watts minimum, and go to a little laptop-sized DC transformer. If I put the hot data on an SSD and move the big PSU and spinning disks (which are the only things that need that much power, and only when spinning up) to a second machine that just acts as an iscsi target with wake-on-mac, I think I can keep the big disks and their PSU in standby the vast majority of the time. Fun times, fun times.