systemd: Using ExecStop to depool nodes for fun and profit

Preface: There is a tonne of drama about systemd on the internets; it won’t take you long to find it, if you’re curious. Despite that, I’m largely a fan and focusing on all the cool stuff I can finally do as an ops person without basically re-writing crappy bash scripts for a living (cough sys-v init)

Process Supervision

Without going into the basics about systemd too much (I quite enjoy this post as an intro), you tell systemd to run your executable using the “ExecStart” part of the config, and it will go and run that command and make sure it keeps running. Wonderful! In this case, we wanted to keep HHVM running all the time, so we told systemd to do it, in 3 lines. Waaaay easier than sys-v init.

ExecStop

By default when you tell systemd to stop a process, and you haven’t told it how to stop the process, it’s just going to gracefully kill the process and any other processes it spawned.

However, there is also the ExecStop configuration option that will be executed before systemd kills your processes, adding a new “deactivating” step to the process. It takes any executable name (or many) as an argument, so you can abuse this to do literally anything as cleanup before your processes get killed.

Systemd will also continue to do it’s regular killing of processes if by the end of running your ExecStop script the processes are not all dead.

Load balancer health checks

We have a load balancer that uses a bunch of health checks to ensure that the node that it’s asking to do work can actually still do work before it sends it there.

One of these is hitting an HTTP endpoint we set up, let’s call it “status.php” which just contains the text “Status:OK”. This way, if the server dies, or PHP breaks, or Apache breaks, that node will be automatically depooled and we don’t serve garbage to the user. Yay!

Example: automatic depooling using ExecStop

Armed with my new ExecStop super power, I realised we were able to let the load balancer know this node was no longer available before killing the process.

I wrote a simple bash script that:

Moves the status.php file to status.php.disabled
Starts pinging the built in HHVM “load” endpoint (which tells you how many requests are in flight in HHVM) to see if the load has hit 0
if the curl to the “load” endpoint fails, we try again after 1 second
If we hit 30 seconds and the load isn’t 0 or we still can’t reach the endpoint, we just carry on anyway; something is wrong.
Once the load is “0”, we can continue
use `pidof` to kill the HHVM process
Move status.php.disabled back to status.php

And now, i can reference this in our HHVM systemd unit file:

[Unit]
Description=HHVM HipHop Virtual Machine (FCGI)

[Service]
Restart=always
ExecStart=/usr/bin/hhvm -c <snip>
ExecStop=/usr/local/bin/hhvm_stop.sh

Now when I call service hhvm stop, it takes 6-10 seconds for the stop to complete, because the traffic is gracefully removed.

Logging

Another thing I personally love about systemd, is the increase visibility the operator gets about what’s going on. In sys-v, if you’re lucky, someone put a “status” action in their bash script and it might tell you if the pid exists.

In systemd, you get a tonne of information about what’s going on; the processes that have been launched (including child processes), the PIDs, logs associated with that process, and in the case of something like Apache, the process can report information back:

Apache systemd status output showing requests per second

In this case, our ExecStop script output gets shown when you look at the status output of systemd:

[root@hhvm01 ~]# systemctl status hhvm -l
hhvm.service - HHVM HipHop Virtual Machine (FCGI)
   Loaded: loaded (/usr/lib/systemd/system/hhvm.service; enabled)
   Active: inactive (dead) since Tue 2015-02-17 22:00:52 UTC; 48s ago
  Process: 23889 ExecStop=/usr/local/bin/hhvm_stop.sh (code=exited, status=0/SUCCESS)
  Process: 37601 ExecStart=/usr/bin/hhvm <snip> (code=killed, signal=TERM)
  Main PID: 37601 (code=killed, signal=TERM)

Feb 17 22:00:45 hhvm01 hhvm_stop.sh[23889]: Moving status.php to status.php.disabled
Feb 17 22:00:47 hhvm01 hvm_stop.sh[23889]: Waiting another second (currently up to 8) because the load is still 16
Feb 17 22:00:48 hhvm01 hhvm_stop.sh[23889]: Waiting another second (currently up to 9) because the load is still 10
Feb 17 22:00:49 hhvm01 hhvm_stop.sh[23889]: Waiting another second (currently up to 10) because the load is still 10
Feb 17 22:00:50 hhvm01 hhvm_stop.sh[23889]: Load was 0 after 11 seconds, now we can kill HHVM.
Feb 17 22:00:50 hhvm01 hhvm_stop.sh[23889]: Killing HHVM
Feb 17 22:00:52 hhvm01 hhvm_stop.sh[23889]: Flipping status.php.disabled to status.php
Feb 17 22:00:52 hhvm01 systemd[1]: Stopped HHVM HipHop Virtual Machine (FCGI).

Now all the information about what happened during the ExecStop process is captured for debugging later! No more having no idea what happened during the shut down.

When the script is in the process of running, the systemd status output will show as “deactivating” so you know it’s still ongoing.

Summary

This is just one example of how you might use/abuse the ExecStop to do work before killing processes. Whilst this was technically possible before, IMO the ease of use and the added introspection means this is actually feasible for production systems.

I’ve gisted a copy of the script here, if you want to steal it and modify it for your own use.

2 thoughts on “systemd: Using ExecStop to depool nodes for fun and profit”

a nonny mouse says:

February 23, 2015 at 6:50 pm

So, to avoid having to write bash scripts with system-v you write bash scripts with systemd.

Not sure this is the best argument 🙂

1. Laurie Denness says:
  
  February 23, 2015 at 6:55 pm
  
  A bash script to do something that wasn’t possible before versus a bash script to do process supervision which is taken care for me now… Pretty good argument I think 🙂