eworldproblems

Reset connection rate limit in pfSense



Note to self for the next time this bites me:

When using the “Max. src. conn. Rate” advanced option in a pfSense firewall rule, if desirable traffic ends up exceeding that rate, it’s really hard to let the traffic through again. pfSense adds the offending source address to a firewall table, “virusprot”, that’s not listed in the web UI, and everything from that address stays blocked for a really long time. So adjusting the rate limit, clearing the state table, etc. still won’t let the traffic through.

After ~30 minutes of reading, the following command at the shell is what does the trick:

pfctl -t virusprot -T flush

…which translates to something like “packet filter control: operate on the table named virusprot and flush (empty) its contents.”
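
If you’d rather inspect the table first, or unblock a single source without emptying the whole thing, pfctl’s other table subcommands can do that too. A quick sketch (the address is just an example):

pfctl -t virusprot -T show
pfctl -t virusprot -T delete 203.0.113.10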

Posted in Uncategorized

Connecting to University of Minnesota VPN with Ubuntu / NetworkManager native client



I had a devilish time this morning connecting to the UofM VPN on my Ubuntu 16.04 system. The official Cisco AnyConnect client available for download from OIT’s webpage is out of date and insists on updating itself upon first connection to the server; the update process failed, leaving the client unusable.

With some poking around and experimentation, I was able to get the native VPN client available through Ubuntu 16.04’s NetworkManager to connect up and work just fine. Here is the configuration.

  1. Create a new connection of type Cisco Compatible VPN. [Screenshot: creating a new Cisco-compatible connection]
  2. Enter the following settings in the window that appears and click Save (the defaults under Advanced seem to all just work). [Screenshot: UofM VPN settings]
  3. Connect to this new VPN connection by selecting it from the network connections menu in the system tray. [Screenshot: network connections menu showing the new VPN connection]
  4. A dialog box will appear asking for your password and the “Group Password.” Provide your Internet ID password in the Password box, and enter S3cur1ty! in the Group Password box. (from https://it.umn.edu/downloads-guides-install-ipsec-native)
  5. Click OK, and your VPN connection should be established!
Posted in Uncategorized

Running nodes against multiple puppetmasters as an upgrade strategy



At work, we’re way out of date in our devops, having not upgraded Puppet since version 3.8. As of this writing, version 5 is available.

This has finally created sufficiently many problems that I’ve been helping prep for an upgrade to puppet 5 — but with some 3,200 .pp puppet manifest files in our existing puppet deployment, and a puppet language that doesn’t retain backwards compatibility, the prospect of upgrading is incredibly onerous.

Instead of trying to convert everything in one massive action, a strategy that people will hate me for but that I’m finding really helps to get the job done is to write new puppetization against a puppet 5 server, slowly removing equivalent declarations / resources / etc from the old puppet server, and running your puppetized nodes against both masters during this period. As long as you ensure the puppet masters don’t both try to set up the same service / file / resource in different ways, there’s no real reason you can’t do this.

This turns out to be fairly easy, because Puppet’s developers threw us a bone and made sure the latest 5.x Puppet server can drive very old (3.8) puppet agents, so you don’t need more than one puppet binary installed on the puppetized nodes. All the shiniest puppet 5 features are available for use in your new puppet code if it is compiled by a puppet 5 server, and the resulting state can be set by agents all the way back to 3.8 (maybe even slightly older.) Also, it’s really helpful that the puppet agent can be told at invocation to use a nonstandard config file.

There are some potential gotchas around getting the agent to trust both masters’ self-signed certs, around pluginsync, and around puppet masters that enforce a particular puppet agent configuration. Here’s a setup that avoids all of that.

  1. Leave your legacy puppet configuration alone.
    We’ll do puppet runs against the new server via a foreground puppet run in a cronjob.
  2. Make a copy of puppet.conf.
    I’ll call the copy puppet5.conf, but you’ll just be referencing this new file in a command-line argument, so may name it as you like.
  3. Edit puppet5.conf:
    • Change the server line to your new puppet 5 server, of course.
    • Change vardir, logdir, and rundir to new locations. This is key, as it makes puppet agent runs against your new server completely isolated from puppet agent runs against your old server with respect to SSL trust and pluginsync.
    • Unrelated to a multi-master setup, but I also found that most modern puppet modules on the forge assume you’ve set stringify_facts = false.

    Here’s my complete puppet5.conf, for reference:

    [main]
        server = puppet5-experimental.msi.umn.edu
        vardir = /var/lib/puppet5
        logdir = /var/log/puppet5
        rundir = /var/run/puppet5
        ssldir = $vardir/ssl
        pluginsync = true
        factpath = $vardir/lib/facter
        always_retry_plugins = false
        stringify_facts = false
    
    [agent]
        # run once an hour, stagger runs                                                             
        runinterval = 3600
        splay = true
        configtimeout = 360
        report = true
    
  4. Do a test run manually:
    # puppet agent --config /etc/puppet/puppet5.conf -t

    This should perform like a first-time puppet run. A new client certificate will be generated, the agent will retrieve and in future trust the server’s certificate and CRL, and depending on your server’s configuration you’ll likely need to puppet cert sign mynode.mydomain on the master.

  5. Do a standard test run against your legacy server manually.
    # puppet agent -t

    Watch it proceed happily, as confirmation that your existing puppet infrastructure is unaffected.

  6. If desired, create a cronjob to cause periodic convergence runs against your new puppet server; a sample crontab entry follows below.
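
A minimal sketch of such a cronjob, assuming a cron.d-style crontab and the config file location used above (the file name, schedule, and log destination are just examples):

# /etc/cron.d/puppet5-agent
# One-shot foreground run against the Puppet 5 master, once an hour
17 * * * * root /usr/bin/puppet agent --config /etc/puppet/puppet5.conf --onetime --no-daemonize --logdest syslog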

Now you’re free to start using puppet 5 features, and porting legacy puppet code, to your heart’s content.

Posted in dev, devops, Linux

The easiest way to (re)start MySQL replication



I’ve run mirrored MySQL instances with asynchronous replication from time to time for almost 15 years. The fundamentals haven’t changed much over that time, but one thing that has drastically improved is mysqldump’s support for properly initializing slaves.

In order for your replication slave to have a complete and uncorrupted database, you need to arrange for it to use exactly the same data as was on the master at some particular instant in time as a starting point. In practice, this means taking out some kind of lock or transaction on the master while a complete copy of the database is made. Then, you need to tell the slave from what point in the master’s binary log to start applying incremental updates.

It used to be that doing all this required a lot of refreshing one’s memory by reading the MySQL manual in order to issue a variety of queries manually. But of course, there’s no reason the steps can’t be scripted, and I was pleased to discover this automation is now nicely packaged as part of mysqldump.

By including --master-data in the following command (to be run on the master), mysqldump will take out locks as necessary to ensure a consistent dump is generated, and automatically append a CHANGE MASTER TO statement to the dump file with the associated master binary log coordinates:

$ mysqldump --add-drop-database --master-data -u root -p --databases your_db_1 your_db_2 > /tmp/master-resync.sql
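
As an optional sanity check, you can confirm the dump really did embed the master’s binary log coordinates before shipping it to the slave; you should see a single line containing MASTER_LOG_FILE and MASTER_LOG_POS values specific to your server:

$ grep -m1 'CHANGE MASTER TO' /tmp/master-resync.sql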

That way, you can simply apply this dump file to a slave server whose replication has broken (for example, due to an extended loss of connectivity) and restart the slave process to be back in business. On the slave:

$ cat /tmp/master-resync.sql | mysql -u root -p
$ mysql -u root -p
mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)

Tadaa! Clean restart of MySQL replication without any FLUSH TABLES WITH READ LOCKs or manual copying of binlog coordinates in sight!

Posted in Linux

Keeping up on one’s OpenSSL cipher configurations without being a fulltime sysadmin



As you probably already know if you’re the type to be reading my blog, https is able to stay secure over time because it is not reliant on a single encryption scheme. A negotiation process takes place between the two parties at the start of any TLS-encrypted TCP session in which the parties figure out which cipher suites each are willing and able to use. So, as cipher suites fall out of favor, alternative ones can be seamlessly put to use instead.

Of course, this requires that as a server operator, you keep your systems in the know about the latest and greatest trends in that arena. And unfortunately, in order to do that, the reality is that you have to keep yourself in the know as well. It pretty much comes down to plugging “the right” value into a parameter or two used by the OpenSSL library, but those parameters are long and obtuse, and there’s a balance to be struck between optimal security and support for visitors with older web browsers.

It’s a nuisance I’d been aware of for years, but had been letting sit on the back burner because frankly I didn’t have any solutions that were sufficiently easy for me to actually bother keeping up with it over time. This post by Hynek Schlawack, for example, professes to be among the more concise explanations for a quality OpenSSL configuration, but it still weighs in at 11 printed pages. More than I am a systems operator, I’m a developer with many active interests to pursue. The reality is I’m not going to be rereading something like that periodically as the post suggests.

Recently, with the help of a link Jeff Geerling dropped on his excellent blog, I found out that CloudFlare, one of the major CDN providers, makes their current SSL configuration available publicly on github -> cloudflare/sslconfig. As a commercial entity that serves a huge volume of content to a diverse client base, they have the resources and motivation to figure all this stuff out, and they’re providing a valuable public service by keeping their findings updated and public.

Checking their github repo periodically is probably an improvement over diff’ing an 11-page blog post, but I would still need to remember to do it. I wanted proactive, automated notifications when I needed to update my SSL configuration. Maybe I missed something obvious, but I didn’t find any option on github to be notified of new commits in a repository I’m not a member of, at least not one that wouldn’t also spam me with every comment on every issue.

So, project! The github API is easy to poll for new commits on a repository, so I coded up this little script to do that, and email me when it sees a change. I have it cronned to watch only cloudflare/sslconfig for now, but you can configure it to watch any repository(ies) you desire. You can also configure the email recipients/subject/message easily.
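
For a sense of the idea, here’s a minimal sketch of the same kind of poll (not my script itself); it naively parses the pretty-printed JSON the GitHub API returns, and the recipient address, state file location, and reliance on a local MTA for mail delivery are all assumptions:

#!/bin/sh
# Fetch the SHA of the newest commit on cloudflare/sslconfig
LATEST=$(curl -s 'https://api.github.com/repos/cloudflare/sslconfig/commits?per_page=1' \
  | grep -m1 '"sha"' | cut -d'"' -f4)

STATE_FILE="$HOME/.sslconfig-last-commit"

# If it differs from the SHA we saw last time, send a notification and remember it
if [ -n "$LATEST" ] && [ "$LATEST" != "$(cat "$STATE_FILE" 2>/dev/null)" ]; then
  echo "New commits in cloudflare/sslconfig; review your SSL configuration." \
    | mail -s 'sslconfig watch' you@example.com
  echo "$LATEST" > "$STATE_FILE"
fi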

Grab my script and give it a try if this is a problem you can relate to!

Posted in devops, Linux

Introducing Prophusion: Test complex applications in any version of PHP



Putting together testing infrastructure for Curator has been an interesting project unto itself. On my wishlist:

  • Support for a wide range of PHP interpreter versions, at least 5.4 – current
  • In addition to the unit test suite, be able to run full integration testing including reads/writes to actual FTP servers.
  • Keep the test environment easy enough to replicate that it is feasible for developers to run all tests locally, before submitting a pull request.

By building on Docker and phpenv, I was able to meet these requirements, and create something with more general applicability. I call it Prophusion, because it provides ready access to over 140 PHP releases.

For a quick introduction to Prophusion, including a YouTube video of it in action, check out this slide deck.

I’ve since fully integrated Prophusion into the testing pipeline for Curator, where it happily performs my unit and in-depth system tests in the cloud, but I also make a habit of running it on my development laptop and workstation at home as I develop. You can even run xdebug from within the Prophusion container to debug surprise test failures from an xdebug client on your docker host system. I’m currently doing that by setting the right environment variables when docker starts up in my curator-specific test environment; I’ll port those back to the prophusion-base entrypoint in the next release of the prophusion-base docker image.
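
Purely as an illustration of the environment-variable approach (the image name, host IP, mount path, and command below are placeholders, and this assumes xdebug 2.x’s XDEBUG_CONFIG mechanism with the extension already enabled in the container’s PHP build):

# Illustrative only: image name, host IP, and command are placeholders.
# xdebug.remote_enable may also need to be switched on in the container's php.ini.
docker run --rm \
  -e XDEBUG_CONFIG="remote_host=172.17.0.1 remote_port=9000" \
  -v "$(pwd)":/app \
  prophusion-base \
  php /app/vendor/bin/phpunit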

Prophusion includes a base image for testing in the CLI (or FPM), and one with Apache integrated for your in-browser testing.

Posted in devops, PHP

Automatic security updates and a support lifecycle: the missing links to Backdrop affordability



I saw Nate Haug and Jen Lampton give their Backdrop CMS intro talk last weekend at the Twin Cities Drupal Camp. They have already done much to identify and remedy some big functionality and UX issues that cause organizations to leave Drupal for WordPress or other platforms, but you can read more about that elsewhere or experience it yourself on Pantheon.

Their stated target audience is organizations who need more from their website than WordPress’s primary use case provides, but still don’t need all the capability that Drupal 8’s software engineering may theoretically enable — not everyone needs a CMS that supports BigPipe or compatibility with multiple heterogeneous data backends.

That positions them squarely in the same target market as Squarespace — but whereas Squarespace is a for-profit business, Backdrop is open-source software, and affordability to the site owner is so important to the project that it gets a mention in Backdrop’s single-sentence mission statement.

Simplified site building for less-than-enterprise organizations is a crowded space already. The Backdrop philosophy identifies a small area of this market where a better solution is conceivable, but I think a big reason so much emphasis is given to goals and philosophy on the Backdrop website and in their conference session is that Jen and Nate recognize their late entry in this crowded market leaves little room for Backdrop to miss in achieving its goals. Backdrop’s decision makers definitely need to keep a clear view of the principles the project was founded for in order for Backdrop to positively differentiate itself.

Affordability is one potential differentiator, and I am personally happy to see it’s one already embodied by Backdrop’s promises of backwards compatibility and low server requirements. But, frankly, WordPress has got those things already. An objective evaluation of Backdrop’s affordability in its current form would put it on par with, but not appreciably better than a more established competitor. But there is hope, because:

Current CMSs don’t give site owners what’s best for them

To make converts, Backdrop could truly differentiate itself with a pledge to offer two new things:

  1. Security backports for previous releases, with a clearly published end-of-support date on each release, so site owners can address security issues with confidence that nothing else about their site will change or stop working.
  2. A reference implementation for fully automated updates, so that installed sites stay secure with zero effort.

Here’s why.

Let’s assume there’s a certain total effort required to create and maintain that highly customized website described by Backdrop’s mission statement. Who exerts that effort is more or less transferable between the site builder and the CMS contributors. On one extreme, the site owner could eschew a CMS entirely and take on all the effort themselves. Nobody does this, because of the colossal duplication of exertions that would result from all the sites re-solving many of the same problems. Whenever a CMS takes over even a small task from site builders, a mind-boggling savings in total human hours results.

Consider this fact in the context of CMS security. From a site owner’s perspective, here’s the simplest-case current model in Drupal, Backdrop, WordPress, and other popular open-source CMS’s for keeping the sites they run secured over time. This model assumes a rather optimistic site owner who isn’t worried enough about the risk of code changes breaking their site to maintain a parallel development environment; organizations choosing to do this incur even more steps and costs.

  1. A responsible individual keeps an eye out for security updates to be released. There are myriad ways to be notified of them, but somebody has to be there to receive the notifications.
  2. When the individual and/or organization is ready, an action is performed to apply the latest version of the code, containing the security update as well as non-security changes the developers have made since this site was last updated. Sometimes this action is as simple as clicking a button, but it cannot be responsibly automated beyond this point due to the need to promptly perform step 3.
  3. The site owner browses around on their site, checking key pages and functions, to find any unintended consequences of the update before their customers do. (For non-enterprise, smaller sites, let’s assume an automated testing suite providing coverage of the highly customized features does not exist. Alternatively, if it did exist, there would be a cost to creating it that needs to be duplicated for each instance of the CMS in use.)
  4. In practice, the developers usually did their job correctly, and the update does not have unintended consequences. But sometimes, something unexpected happens, and a cost is incurred to return the site to its past level of operation, assuming funds are available.

Once a site has been developed, unsolicited non-security changes to the code rarely add business value to the organization operating the site. In the current model, however, changes are forced on organizations anyway as a necessary component of maintaining a secure site, merely because they are packaged along with the security update. In my opinion, the boldface observation above ought to be recognized as one of the principles guiding Backdrop’s philosophy. In the classic model, the CMS avoids a small task of backporting the security fix to past releases and the work is transferred to site owners in the form of the above steps. That’s expense for the site owner, and in total it is multiplied by each site the CMS runs — a much larger figure than offering a backport would have amounted to.

This is a clear shortcoming of current offerings, and Backdrop’s focus on affordability makes it a ripe candidate for breaking this mold. Not to mention the value proposition it would create for organizations evaluating their CMS options. Heck, make a badge and stick it on backdropcms.org/releases:

[Badge mockup: Supported 3 years / Stable]

Backdrop could guarantee security updates will not disrupt the sites they run; the competition could only say “A security update is available. You should update immediately. It is not advisable to update your production site without testing the update first, because we introduced lots of other moving parts and can’t make any guarantees. Good luck.”

That’s something I think developers and non-technical decision makers alike can appreciate, and that would make Backdrop look more attractive. Don’t want to pay monthly dues to Squarespace? Don’t want to pay periodic, possibly unpredictable support fees to a developer? Now you don’t have to, on Backdrop.

The above case that a software support lifecycle would make site maintenance more affordable to site owners does not even begin to take into consideration the reality that many sites simply are not updated in a timely fashion because the updates aren’t automated. If you are not an enterprise with in-house IT staff, and you are not paying monthly premiums to an outfit like Squarespace or a high-end web host with custom application-layer firewalls, history shows a bot is pretty much guaranteed to own that site well before you get around to fixing it. Exploited sites are in turn used to spread malware to their visitors, so adding automated updates to the CMS that can be safely applied, rapidly, without intervention would have a far-reaching overall impact on Internet security.

But how achievable is this?

Isn’t extended support boring to developers volunteering their time?

Yes, probably. Top contributors might not be too psyched to do backports. But just as in a for-profit development firm, developers with a range of abilities are all involved in creating open-source software. Have top contributors or members of the security team write the fix and corresponding tests against the latest release, and let others merge them back. The number of patches written for Drupal core which have never been merged has I think demonstrated that developer hours are eminently available, even when the chance of those hours having any impact is low. Propose to someone that their efforts will be reflected on hundreds or thousands of sites across the Internet in a matter of days, and you’ll get some volunteers. Novice contributors show up weekly in #drupal-contribute happy to learn how to reroll patches as it is. Security issues might be slightly more tricky in that care needs to be taken to limit their exposure to individuals whose trust has been earned, but this is totally doable. Given the frequency of core security releases in Drupal 7, a smaller pool of individuals known personally by more established community members could be maintained on a simple invite-only basis.

Update, April 2017
I discussed how achievable backports within major versions of Drupal could be in a core conversation at DrupalCon Baltimore. The focus was especially on the possibility of extending the model where official vendors have access to confidential security information in order to support Drupal 6 LTS. Participants included many members of the security team and a few release managers; the YouTube video is here.

Some interesting possibilities exist around automating attempts to auto-merge fixes through the past releases and invoke the test suite, but rigging up this infrastructure wouldn’t even be an immediate necessity.

Also, other projects in the wider FOSS world, and even in the PHP FOSS world, show us it can be done. Ubuntu made software lifecycles with overlapping supported versions famous for an entire Linux distribution (though almost all of the other distros also pull it off with less fanfare, chalking it up as an implicit necessity), and it’s even been embraced by Symfony, the PHP framework deeply integrated into Drupal 8. While Drupal adopted Symfony’s software design patterns and frequently cites this as one of Drupal 8’s strengths, it didn’t adopt Symfony’s software lifecycle practices. In this regard, Drupal is increasingly finding itself “on the island.” Hey, if Backdrop started doing it, maybe Drupal would give in eventually too.

What about contributed modules?

I would argue that a primary strategy to handle the issue of contrib code should be to reduce the amount of contrib code.  This fits well with a Backdrop initiative to put the things most people want in the core distribution of the product.  A significant number of sites — importantly, the simplest ones that are probably least inclined to think about ongoing security — would receive complete update coverage were backports provided only for core. The fact that contrib is there in CMS ecosystems is sometimes cited as a reason security support lifecycles would not be possible, but it’s no excuse not to tackle it in the CMS’s core.

Contrib will always be a major part of any CMS’s ecosystem though, and shouldn’t be left out of the opportunity to participate in support lifecycles.  I would propose that infrastructure be provided for contrib developers to clearly publish their own support intentions on their project pages. Then, when a security issue is disclosed in a contrib module, the developer would identify, by means of simple checkboxes, the versions that are affected. There would be no obligation to actually produce a patched version of old releases identified as vulnerable, however, regardless of previously published intentions. This would have two effects: A) the developer would be reminded that they committed to do something, and therefore might be more likely to do it, and B) sufficient data would be available to inform site owners of a security issue requiring their attention if the contrib module chose not to provide a backported fix. Eventually, the data might also be used as a statistic on project pages to aid site builders in selecting modules with a good support track record.

Aren’t automated updates inherently insecure?

No, although some web developers may conflate performing an automatic update with the risks of allowing code to modify other code when it can be invoked by untrusted users. A reference implementation of an automatic updater would be a separate component from the CMS, capable of running with a different set of permissions from the CMS itself.

Brief case study: “Drupalgeddon”

Here’s the patch, including its test coverage, that fixed what probably proved to be the most impactful security vulnerability in Drupal 7’s history to date:

[Screenshot: the Drupalgeddon patch]

The fix itself is a one-line change in database.inc. Security patches are, as in this case, often very small and only have any impact on a site’s behavior in the face of malicious inputs. That’s why there’s value in packaging them separately.

Drupal 7.32, the version that fixed this vulnerability, was released in October 2014. A

git apply -3 drupalgeddon.patch

is able to automatically apply both the database.inc and database_test.test changes all the way back to Drupal 7.0, which was released almost four years earlier in January 2011. Had the infrastructure been in place, this fix could have been automatically generated for all earlier Drupal versions, automatically verified with the test suite, and automatically distributed to every Drupal 7 website with no real added effort on the part of the CMS, and no effort on the part of site owners. Instead, in the aftermath, tremendous time, energy, and money was expended by the site owners that were affected or compromised by it, with those that didn’t patch in a matter of hours facing the highest expenses to forensically determine if they were compromised and rectify all resulting damages.
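
If you want to reproduce that claim yourself, a rough sketch in a local Drupal core git clone might look like the following, assuming a full clone (so the blobs referenced in the patch are available for the 3-way merge) and the patch file name used above:

git checkout 7.0
git apply -3 drupalgeddon.patch && echo "patch applies cleanly against 7.0"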

You better believe botnet operators maintain databases of sites by CMS, and are poised to use them to effectively launch automated exploits against the bulk of the sites running any given CMS within hours of the next major disclosure. So, unless the CMSs catch up, it is not a matter of if this will happen again, but when.

The only way to beat the automation employed by hackers is for the good guys to employ some automation of their own to get all those sites patched. And look how easy it is. Why are we not doing this?

Final thoughts

A CMS that offered an extended support lifecycle on their releases would make site ownership more affordable and simpler, and would improve overall Internet security. Besides being the right thing to do, if it made these promises to its users, Backdrop would be able to boast of a measurable affordability and simplicity advantage. And advantages are critical for the new CMS in town vying for market share in a crowded and established space.

Posted in dev, Drupal, PHP

HAProxy “backup” and “option redispatch”



I’m testing out my new infrastructure-in-progress for redundant web hosting, described in my last post. The quick outline of the components again, is a total of four VMs:

  • Two small & inexpensive instances in a tier 4 datacenter running Apache Traffic Server + HAProxy.
  • One bigger, rather spendy backend VM in a tier 4 datacenter running Apache, webapp code, databases, etc. Call it hosting1.
  • One VM roughly equivalent in specs to the backend VM above, but running on a host machine in my home office so its marginal cost to me is $0. Call it hosting2.

The thinking is that hosting1 is going to be available 99%+ of the time, so “free” is a nice price for the backup VM relative to the bandwidth / latency hit I’ll very occasionally take by serving requests over my home Internet connection. Apache Traffic Server will return cached copies of lots of the big static media anyway. But this plan requires getting HAProxy to get with the program – don’t EVER proxy to hosting2 unless hosting1 is down.

HAProxy does have a configuration for that – you simply mark the server(s) you want to use as backups with, sensibly enough, “backup,” and they’ll only see traffic if all other servers are failed. However, when testing this, at least under HAProxy 1.4.24, there’s a little problem: option redispatch doesn’t work. Option redispatch is supposed to smooth things over for requests that happen to come in during the interval after the backend has failed but before the health checks have declared it down, by reissuing the proxied request to another server in the pool. Instead, when you lose the last (and for me, only) backend in the proper load balance pool, requests received during this interval wait until the health checks establish the non-backup backend as down, and then return a 503, “No servers are available to handle your request.”

I did a quick test and reconfigured hosting1 and hosting2 to both be regular load balance backends. With this more typical configuration, option redispatch worked as advertised, but I now ran the risk of traffic being directed at hosting2 when it didn’t need to be.

Through some experimentation, I’ve come up with a configuration that my testing indicates gives the effect of “backup” but puts both servers in the regular load balance pool, so option redispatch works. The secret, I think, is a “stick match” based on destination port — so matching all incoming requests — combined with a very imbalanced weight favoring hosting1.

Here’s a complete config that is working out for me:

global
  chroot  /var/lib/haproxy
  daemon
  group  haproxy
  log  10.2.3.4 local0
  maxconn  4000
  pidfile  /var/run/haproxy.pid
  stats  socket /var/lib/haproxy/stats
  user  haproxy

defaults
  log  global
  maxconn  8000
  option  redispatch
  retries  3
  stats  enable
  timeout  http-request 10s
  timeout  queue 1m
  timeout  connect 10s
  timeout  client 1m
  timeout  server 1m
  timeout  check 10s

listen puppet00
  bind 127.0.0.1:8100
  mode http
  balance static-rr
  option redispatch
  retries 2
  stick match dst_port
  stick-table type integer size 100 expire 96h
  server hosting1 10.0.1.100:80 check weight 100
  server hosting2 10.0.2.2:80 check weight 1

I tested this by putting different index.html’s on my backend servers and hitting HAProxy with cURL every 0.5 seconds for about an hour, using the `watch` command to highlight any difference in the data returned:

watch -x --differences=permanent -n 0.5 curl -H 'Host: www.mysite.com' http://104.156.201.58 --stderr /dev/null

It didn’t return the index.html from hosting2 a single time, until I shut down apache on hosting1 after about an hour.

There’s a small amount of magic here that I unfortunately don’t understand, but will resign myself to being happy that it works as desired. Once failed over to hosting2, I would kind of expect the stick table to update and cause everything to stay with hosting2 even after hosting1 comes back. In actuality, it returns to hosting1. So cool, I guess.

Posted in devops

Top gotchas: Creating virtual machines under KVM with virt-install



After a stint building Stacy’s new blog whilst contributing to Drupal 8, I’ve been directing my free development time to more ops-like concerns for the past month or so. I moved to a cheaper parking facility at work and plan to direct the savings to bankroll the expansion of my own little hosting platform. And, I’m going virtualized.

This is a big shift for me, as I’ve gotten many years of trusty service without much trouble from the “single dedicated server with OS on bare metal” model, only experiencing a handful of minutes of downtime a month for major security updates, while watching several different employers flail around keeping all the pieces of their more complex virtualized environments operational.

On paper, the platform I’ll be moving to should be more resilient than a single server — the cost savings of virtualization plus infusion of extra cash will enable me to run two redundant webserver instances behind a third load balancing VM, with another (fourth) load balancing VM that can take over the public IP addresses from the first load balancer at a moment’s notice (thanks to the “floating IP” or “reserved IP” features that the better VPS providers have been rolling out recently). The provider I’m going with, Vultr, has great customer service and assures me that my instances will not share any single points of failure. Thus, I should have full redundancy even for load balancing, and I’ll be able to take instances down one at a time to perform updates and maintenance with zero downtime.

Alas, even on VMs, full redundancy is expensive. I could have just bought sufficiently small instances to host everything at Vultr and stay within my budget, but I’m betting I’ll get better performance overall by directing all traffic to a beefed-up instance at Vultr (mostly superior to the old hardware I’m currently on) under nominal conditions, and have the load balancer fail over to a backend server on hardware in my home office for the short periods where my main Vultr instance is unavailable. My home office isn’t the Tier 4 DuPont Fabros facility in Piscataway, NJ that the rest of the instances will reside in, but it is server grade components with a battery backup that hasn’t seen a power failure yet, and the probability of a simultaneous outage of both webservers seems very low.

The two backend webservers need to be essentially identical for sanity’s sake. However, as hinted in other posts such as the one where I installed an Intel Atom SoC board, I’m a bit obsessive about power efficiency, and there was no way in hell I was running a separate machine at home to serve up a few minutes of hosting a month. So, I needed to figure out KVM.

As near as I can ascertain having completed this experience, there are one or two defaults you always need to override if you want a VM that can write to its block device and communicate on a network.

  1. You probably need to create an ethernet bridge on the host system, onto which you attach your guests’ virtualized NICs. Examples abound for configuring the bridge itself in various linux distros, but none seem to mention with clarity that the host kernel needs to be told not to pump all the ethernet packets traversing the (layer 2!) bridge through the (layer 3!) IPTables. I’m puzzled whose idea this was, but someone made this thing called bridge-nf and apparently talked their way into it being default ‘on’ in the Linux kernel. Maybe that would be nice if you wanted to run suricata on a machine separate from your main router, or something, but yeah otherwise I’d recommend turning this off:
    cd /proc/sys/net/bridge
    for f in bridge-nf-call-*; do echo 0 > $f; done
    

    …and if the bridge should pass tagged vlan traffic, snag bridge-nf-filter-vlan-tagged too. See http://unix.stackexchange.com/questions/136918/why-does-my-firewall-iptables-interfere-in-my-bridge-brctl/148082#comment-218527 for more detail. If you have strict iptables rules on your host, your guests won’t get any network connectivity — and on the flipside, while troubleshooting that problem I inadvertently made packets on the theoretically isolated bridge start appearing on a totally different network segment when I tested a blanket-accept rule on the host iptables’ FORWARD chain. I’m trying to isolate my hosting-related vlan and subnet from my plain-old home Internet connection sharing vlan and subnet, so getting this result was just wrong. Bridge-nf is scary stuff; turn it off (a sketch for making that setting persistent follows after this list).

  2. If you opt to use lvm volumes as the backing for your VM’s disks, virt-install provides a handy syntax to cause it to make and manage the logical volume along with the VM:
    virt-install --disk pool=vm-guest-vg,size=80,bus=virtio ...

    where vm-guest-vg is a volume group you’ve set aside for VM disks. This says, make me a new 80G logical volume in the vm-guest-vg volume group, associate it to this instance, and attach it as a disk on the guest through the optimized virtio bus. If you stop there, though, your VM will experience many I/O errors. You must also add “sparse=false”:

    virt-install --disk pool=vm-guest-vg,size=80,bus=virtio,sparse=false ...

    and then, finally, your VM hosting environment will be reasonably usable to a guest operating system.
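
As promised in step 1, here’s a sketch for making the bridge-nf override persist across reboots. The file name is arbitrary, and on newer kernels these keys only exist once the br_netfilter module is loaded:

cat > /etc/sysctl.d/90-no-bridge-nf.conf <<'EOF'
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-arptables = 0
EOF
sysctl --system   # apply immediately as well as at boot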

For completeness/my own reference later on, an entire virt-install command:

virt-install -n 'the-instances-name' --virt-type kvm --hvm --autostart -r 4096 --vcpus=4 --os-type=linux --os-variant=ubuntutrusty --cpu host --disk pool=vm-guest-vg,size=80,bus=virtio,sparse=false --network bridge=hostingbr,model=virtio --graphics vnc -c "/data/public/Program Installers/ubuntu-14.04.3-server-amd64.iso"

There’s still much work ahead of me to build this entire infrastructure out — some custom scripting will likely be needed to pull off automated IP address transfers between the load balancers, and I’m shooting to use Apache Traffic Server as the HTTPS terminator/caching proxy, because it seems to have been doing the HTTP/2 thing for longer than alternatives like nginx, and HTTP/2 is cool. I’ll do a follow-up post after it’s all been up and running for awhile and we’ll see if going virtualized was even a good idea…

And, stay tuned, once I’m done with that, I have plans to further reduce my home infrastructure’s power consumption by losing the bulky full ATX PSU in my main server, which burns about 10 watts minimum, and go to a little laptop-sized DC transformer. If I put the hot data on an SSD and move the big PSU and spinning disks (which are the only things that need that much power, and only when spinning up) to a second machine that just acts as an iscsi target with wake-on-mac, I think I can keep the big disks and their PSU in standby the vast majority of the time. Fun times, fun times.

Posted in Linux

Pinterest hover buttons and picturefill



Pinterest hover buttons, those little “Pin It” buttons that appear when mousing over images on Pinterest-integrated sites, don’t work if your site is using HTML5 responsive images (the picture element and srcset attribute) in the way recommended when using the picturefill polyfill for legacy browser compatibility.

The reason is that the JavaScript Pinterest asks you to include on your site, “pinit.js,” assumes any image that should receive a hover button has a nonempty src attribute, while your markup written for optimal use of picturefill excludes the src attribute, relying solely on the srcset attribute instead. This way, legacy browsers don’t unnecessarily load a version of the image that may turn out not to be needed.

Of note, Drupal 8’s Responsive Images module, as themed by default, issues an srcset but no src attribute for this reason. Voila, no Pinterest joy.

Turns out a tiny modification to pinit.js (actually, pinit_main.js; pinit.js is merely a loader) fixes this issue, so long as you’re also comfortable specifying which particular version of the image you want pinned in your img tags using Pinterest’s data-pin-media custom data attribute. I’ve recently opened a pull request on github to try and get this into their hosted version of the JavaScript, but meanwhile you can grab my repository from github and host it yourself as well. Update: the above PR was merged; the stock pinterest script now supports the below example:

Once running this JavaScript, your markup can be

<img srcset="/my-pic.png" data-pin-media="http://mysite.com/my-pic.png" />

…and you should be good to go!

I’ve also got a really quick module for Drupal 8 that adds data-pin-media to all responsive images at https://github.com/mbaynton/pinterest_responsive, which I’ll see about making as a drupal.org project if/when my JS gets merged.
Update: Released as pinterest_hover for Drupal 8.x

Posted in Uncategorized