Just Enough Systemd

2022-12-10

Objective

The goal of this article is to help backend developers create Systemd service files for their applications.

Introduction

Systemd is responsible for managing services, a.k.a. daemon processes or "background processes". It also runs as the init process (PID 1), which is the ancestor of all other processes. It is used by most mainstream Linux distributions, including Debian, Ubuntu, Red Hat, and Fedora, along with their derivatives.

Systemd not only manages services, but also makes it easy for you to turn your own application into a daemon process. It does so by taking care of low level concerns such as:

These capabilities, along with the fact that it is pretty much standard on almost all Linux servers today, make it indispensable for the backend software developer to know how to write Systemd services.

Background and Comparison to Other Systems

Of course, there are other service management systems in use today, both by other Linux distros and, notably, by all BSD-based operating systems. In particular, older "System V style" init scripts remain popular amongst old-school Unix geeks because they are simple (and are still supported by Systemd).

The main difference between other service management systems and Systemd is that Systemd does a lot more than just manage processes. It can manage network connections, devices, the bootloader, time synchronization, and a host of other functions. Many people, including me, think that Systemd is rather un-Unix-like in its scope, especially since many of its functions are redundant with existing services (for example, ntpd for time synchronization).

Nevertheless, you do not need to know or care about these other capabilities in order to take advantage of Systemd's service management features.

Let's get started!

Your First Systemd Service

Before we write our first service, here is the very simple application you can use in these examples (a simple HTTP server in Bash, using the nc (netcat) command):

#!/bin/bash
while true
do 
    echo -e "HTTP/1.1 200 OK\n\nhello world!" | nc -l -p 8080 -q 1 -k
done

In order to turn this script into a Service managed by Systemd, create a file called /etc/systemd/system/my-http-server.service:

[Unit]
Description=Write something clever here

[Service]
# 'simple' is the default. see the 'Service Types' section
Type=simple 
# point ExecStart to wherever you saved the above script. remember to chmod it executable.
ExecStart=/usr/local/bin/my-http-server.bash 

[Install]
WantedBy=multi-user.target

ExecStart should be self-explanatory. There is also ExecStop for the command to stop the service, ExecStartPre and ExecStartPost for running commands before or after starting, and so on.
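As a minimal sketch of those extra directives, the snippet below writes a variant of the unit into a scratch directory. The `test` pre-check, the `logger` message, and the scratch directory are illustrative assumptions of mine, not part of the original service; adjust the binary paths to your distribution.

```shell
#!/bin/sh
# Write the example into a scratch directory; copy it to
# /etc/systemd/system/ yourself once you are happy with it.
DEMO_DIR="${DEMO_DIR:-$(mktemp -d)}"

cat > "$DEMO_DIR/my-http-server.service" <<'EOF'
[Unit]
Description=Write something clever here

[Service]
Type=simple
# Runs before ExecStart; if it fails, the service is not started.
ExecStartPre=/usr/bin/test -x /usr/local/bin/my-http-server.bash
ExecStart=/usr/local/bin/my-http-server.bash
# Runs after the main process has been started.
ExecStartPost=/usr/bin/logger "my-http-server started"
# Runs when the service is stopped; $MAINPID is filled in by Systemd.
ExecStop=/bin/kill -TERM $MAINPID

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $DEMO_DIR/my-http-server.service"
```

If ExecStop is omitted, Systemd stops the service by sending it a signal itself, which is usually all a simple service needs.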

WantedBy=multi-user.target might look cryptic, but it basically means "when your machine is booted and ready for action". multi-user refers to "multi-user mode," in contrast to "single-user mode" which you will have used if you ever booted from a recovery disk. You can also specify other Services and states as targets -- we'll talk more about that later.

Simply creating this file isn't enough: we must tell Systemd that there is a new Service in town with daemon-reload:

sudo systemctl daemon-reload

Afterwards, assuming the service file did not have any errors, we should be able to query its status:

sudo systemctl status my-http-server # the ".service" part is default and therefore optional.

○ my-http-server.service - Write something clever here
     Loaded: loaded (/etc/systemd/system/my-http-server.service; disabled; vendor preset: enabled)
     Active: inactive (dead)

Now we can start and stop the service. Note that starting and stopping is different from 'enabling' and 'disabling' services:
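A minimal sketch of the difference, wrapped in hypothetical helper functions (the `svc_*` names are mine; the systemctl subcommands are standard). Starting and stopping act on the running process right now, while enabling and disabling only decide whether the unit is started at boot.

```shell
#!/bin/sh
UNIT="${UNIT:-my-http-server}"

# Act on the running service immediately; nothing changes for the next boot.
svc_start()   { sudo systemctl start "$UNIT"; }
svc_stop()    { sudo systemctl stop "$UNIT"; }
svc_restart() { sudo systemctl restart "$UNIT"; }

# Change what happens at boot (this is what the [Install] section is for);
# the currently running service is left untouched.
svc_enable()  { sudo systemctl enable "$UNIT"; }
svc_disable() { sudo systemctl disable "$UNIT"; }

# Convenience: enable for future boots and start right away.
svc_enable_now() { sudo systemctl enable --now "$UNIT"; }
```

After sourcing this file, a typical first deployment would be `svc_enable_now` followed by `systemctl status my-http-server` to check the result.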

What is [Unit]?

You may be wondering why there is a [Unit] section separate from the [Service] section in the service file above.

A unit is the basic building block of Systemd, and a Service is a type of Unit. There are many other Unit types, including devices, mountpoints, and so on, but the only two we are going to be covering here are Services and Timers (later in this article). You can think of Units as the abstract base class of all other Systemd entities. Configuration common to Services, Timers, and all other types of Units are therefore configured in the [Unit] section of the config file.

The configuration options for the [Unit] section are found in the systemd.unit(5) man page.

Where To Put Service Files

System-level unit files are found in many different places on your file system, but the two main ones to know about are:

More locations are listed in systemd.unit(5) man page.
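When the same unit name exists in more than one location, the copy in the higher-priority directory wins. The sketch below mimics that lookup for the system directories documented in systemd.unit(5); `find_unit` is a hypothetical helper of mine, and on a live system `systemctl cat some.service` is the authoritative way to see which file is in effect.

```shell
#!/bin/sh
# Directories searched for system units, highest priority first.
# /etc is for local configuration, /run for runtime units, and
# /usr/lib (or /lib on Debian-style systems) for packaged units.
UNIT_DIRS="${UNIT_DIRS:-/etc/systemd/system /run/systemd/system /usr/lib/systemd/system /lib/systemd/system}"

# Print the first file that would define the given unit, or return 1.
find_unit() {
    for dir in $UNIT_DIRS; do
        if [ -f "$dir/$1" ]; then
            echo "$dir/$1"
            return 0
        fi
    done
    return 1
}
```

For example, `find_unit nginx.service` prints the path of the file that takes precedence, if there is one.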

Per-User Services

When Service (and other Unit) files are placed in the directories listed above, you need to run commands like systemctl and journalctl with root privileges. (Read-only operations like systemctl status are an exception.) During testing and development, and maybe even deployment, this may not be desirable.

As an alternative, Systemd provides a per-user services directory located at ~/.config/systemd/user/. You can run systemctl and journalctl (covered later, in the "Logging" section) with the --user option to access service and other files in this directory.

For example, if you create a service file called ~/.config/systemd/user/my-personal-service.service, you can run systemctl --user status my-personal-service, systemctl --user start my-personal-service, journalctl --user -u my-personal-service, and so on.
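Here is a sketch of creating such a per-user unit from the shell. The description and the script location under `%h/bin` are assumptions for illustration. One detail worth knowing: the per-user manager has no multi-user.target, so user units are normally installed into default.target instead.

```shell
#!/bin/sh
# Per-user unit directory; Systemd honours XDG_CONFIG_HOME if it is set.
USER_UNIT_DIR="${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user"
mkdir -p "$USER_UNIT_DIR"

cat > "$USER_UNIT_DIR/my-personal-service.service" <<'EOF'
[Unit]
Description=My personal HTTP server

[Service]
# %h expands to the user's home directory (script location is an assumption).
ExecStart=%h/bin/my-http-server.bash

[Install]
# default.target plays the role of multi-user.target for user services.
WantedBy=default.target
EOF

echo "wrote $USER_UNIT_DIR/my-personal-service.service"
```

Follow this with `systemctl --user daemon-reload` and `systemctl --user enable --now my-personal-service`; no sudo is needed for either.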

Requirement and Ordering Dependencies

Systemd has two concepts that sound similar but are distinct: requirement dependencies and ordering dependencies.

Systemd allows you to set up requirements and orderings between Services, Timers, and other Units. They are orthogonal: between two Services, you can have both a requirement and an ordering.

Requirement Dependencies

Basically, requirement dependencies define which other units need to be running for a unit to function. Dependencies are controlled with the Wants=, Requires=, WantedBy=, and RequiredBy= directives.

"Wants" are soft dependencies, and "Requires" are hard dependencies:

If ServiceA Requires ServiceB, and ServiceB is explicitly stopped, ServiceA is stopped as well. However, if ServiceA only Wants ServiceB, stopping ServiceB will have no effect on ServiceA.

RequiredBy= and WantedBy= are the same thing as Requires= and Wants=, in the reverse direction: if ServiceA specifies that it is RequiredBy ServiceB, then when Systemd starts ServiceB, it will also start ServiceA. Putting Requires=ServiceB in ServiceA is equivalent to putting RequiredBy=ServiceA in ServiceB. You do not need to specify both.

Generally, it's a good idea to stick with Wants and Requires instead of the reverse-direction directives. In most situations I've seen, you "build" dependencies from the ground up.

Ordering Dependencies

Requirement dependencies do not specify the temporal order in which services are started. That's what Ordering Dependencies do.

Orderings are controlled with two directives:

Putting After=ServiceB in ServiceA is equivalent to putting Before=ServiceA in ServiceB. You do not need to specify both.

There's a caveat: After and Before only affect the order in which services are started (or shut down) when the services are started (or stopped) together. This can happen if there is also a Wants or Requires dependency between the two. Services are also started together during boot (and stopped together during shutdown). That means that merely specifying an ordering dependency does NOT mean that starting one service will cause the other to start on its own.

One common idiom in Systemd is to combine After= and Requires=. See the Ordering Example below.
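To make the caveat concrete, here is a sketch that writes two variants of a hypothetical app.service into a scratch directory. The unit names, db.service, and /usr/local/bin/app are placeholders, not real services.

```shell
#!/bin/sh
DEMO_DIR="${DEMO_DIR:-$(mktemp -d)}"

# Ordering only: "systemctl start app-ordered" does NOT start db.service.
# The After= line matters only if something else starts both together.
cat > "$DEMO_DIR/app-ordered.service" <<'EOF'
[Unit]
Description=App with an ordering dependency only
After=db.service

[Service]
ExecStart=/usr/local/bin/app
EOF

# Requirement plus ordering: starting this unit also starts db.service,
# and Systemd starts db.service first.
cat > "$DEMO_DIR/app-required.service" <<'EOF'
[Unit]
Description=App that requires and starts after the database
Requires=db.service
After=db.service

[Service]
ExecStart=/usr/local/bin/app
EOF

echo "wrote examples to $DEMO_DIR"
```

Dropping After= from the second variant would still start both services, but in parallel, so the app could come up before the database does.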

Warning

Dependencies and Ordering are a complex part of Systemd, and I personally find it rather confusing.

My Advice: Keep It Simple.

You could set up an arbitrarily complex web of dependencies for dozens of services. You could use the many other capabilities of Systemd that I don't cover in this article, such as "Conflicts" to prevent two different services from running at once. Don't.

If your services do have logical dependencies, handle that at the application level: for example, create a mechanism that allows a service to check whether other services are alive. This will make your application more robust, and portable to other systems like Kubernetes, since it will not depend on Systemd for internal logic.

Remember that Systemd was designed and (over-)engineered to do a lot more than manage services. You do not need most of its features to run a simple web application. The three examples I cover below have been all I've needed for almost every service file I've ever written.

Requirement Dependency Example 1: Start at boot

First of all, let's get the most common and simplest case out of the way. The following means something like "when entering multi-user mode, try to start this service", or more simply "start it on boot":

[Install]
WantedBy=multi-user.target

WantedBy means "start this service when this other service/unit is starting". By specifying multi-user.target, when the computer tries to enter this mode during boot, it will start your service.

Note that there is also a RequiredBy, which is rarely used. RequiredBy additionally says "stop the other unit if this one is stopped or fails to start". That is usually not what you want: just because PostgreSQL or some other service fails to start during boot does NOT usually mean the entire boot sequence should be stopped.

(WantedBy and RequiredBy are placed in an [Install] section, which is separate from the [Unit] section. There are not many other options that are placed in the [Install] section and I have no idea why it isn't in [Unit] with every other dependency-related option.)

Requirement Dependency Example 2: Start after network is up

This means "run this current unit/service after networking is up":

[Unit]
Description=Description here
After=network.target

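network.target is a "special" unit, a kind of pseudo-unit that marks a point in the boot sequence rather than a real process. Strictly speaking, it only signals that the network management stack has been started, not that an address has been configured. If your service needs a working connection at startup, order it after network-online.target instead and add Wants=network-online.target, as the WireGuard unit later in this article does.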

Ordering Dependency Example: Start a prerequisite service before another

Often, you will have a service that depends upon another. The "other" service could be a database server, a message broker like RabbitMQ, or other "infrastructural" services.

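In these cases, consider combining Requires= and After=. As you will recall, Requires= makes sure the other service is started along with yours, and After= makes sure yours only starts once the other one has.

Below is a service that requires PostgreSQL, starts after it, and is stopped before PostgreSQL shuts down: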

[Unit]
Description=Write something clever here
Requires=postgresql.service
After=postgresql.service

[Service]
ExecStart=/home/tato/user-systemd/my-http-server

Service Types

There are multiple types of Service based on how the ExecStart= command behaves. In most cases, if you have not done anything specific to make an application run as a persistent background process (old unix-geeks call it "daemonization"), Type=simple is the right call.

Simple services

As a rule of thumb: if you have a command that runs 'in the foreground', i.e., it takes control of your shell, use simple. This is the default, and you can omit Type from your [Service] section for this case.

Notify Services

For simple services, Systemd has no insights into the status of your service beyond the fact that it is (or is not) running. It doesn't know, for example, if your process has started but is still initializing (by connecting to other services, checking for updates, etc). Or, more likely, if it is stuck in some intermediate state and not actually ready to start processing requests.

A Type=notify Service is essentially a simple service with built-in logic to coordinate with Systemd using the sd_notify() function. Usage of the C-level API is out of scope for this article. Hopefully, there is a package for your language (or, preferably, your web framework) which automatically reports these lifecycle events for you.
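For shell scripts, Systemd ships a command-line wrapper around sd_notify() called systemd-notify. The following is a sketch only: the file names, the five-second sleep standing in for initialisation, and the status text are assumptions, and the wrapper script hands over to the server script from the first example.

```shell
#!/bin/sh
DEMO_DIR="${DEMO_DIR:-$(mktemp -d)}"

# Wrapper script: do the slow start-up work, report readiness, then
# become the actual server.
cat > "$DEMO_DIR/my-notify-server.bash" <<'EOF'
#!/bin/bash
# Stand-in for real initialisation work (migrations, cache warm-up, ...).
sleep 5
# NOTIFY_SOCKET is set by Systemd for Type=notify services.
if [ -n "$NOTIFY_SOCKET" ]; then
    systemd-notify --ready --status="Accepting connections"
fi
# Hand over to the server; exec keeps the same PID.
exec /usr/local/bin/my-http-server.bash
EOF
chmod +x "$DEMO_DIR/my-notify-server.bash"

cat > "$DEMO_DIR/my-notify-server.service" <<'EOF'
[Unit]
Description=HTTP server that reports readiness

[Service]
Type=notify
# systemd-notify runs as a child of the script, so accept messages from
# any process in the service, not only the main one.
NotifyAccess=all
ExecStart=/usr/local/bin/my-notify-server.bash
EOF

echo "wrote examples to $DEMO_DIR"
```

With this in place, `systemctl start` does not return until the script has reported that it is ready, so units ordered After= it wait for real readiness rather than for the process merely existing.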

Oneshot services

Simple services have one limitation: there needs to be an actual process to control. If the process unexpectedly dies, Systemd will notice and mark the Service accordingly.

Often, however, you don't have a single process which represents the service. This could be because the Service actually runs externally (maybe you have an IoT lightbulb that you turn on and off with a command) or because the application comes with its own complicated set of startup and shutdown scripts.

Below is a Service file for UFW (Uncomplicated Firewall) that illustrates this. Underneath the hood, UFW is started and stopped by adding and removing iptables rules in the Linux kernel. There is no "UFW" process to control. However, there is a controlling script called ufw-init. Therefore, we can use a oneshot Service with ExecStart and ExecStop commands:

[Unit]
Description=Uncomplicated firewall
Documentation=man:ufw(8)
DefaultDependencies=no
Before=network-pre.target
Wants=network-pre.target local-fs.target
After=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/lib/ufw/ufw-init start quiet
ExecStop=/lib/ufw/ufw-init stop

[Install]
WantedBy=multi-user.target

RemainAfterExit=yes is something you'll typically want for a oneshot service. It says that after ExecStart runs, mark the Service as "Active", even though the original process is not running. Otherwise it will be marked as "Inactive".

Forking services

If you have an existing service that does its own 'daemonization', where the main parent process forks a child process and exits, you need to use Type=forking. This is typical for older projects that existed before Systemd.

For example, below is the Service file for nginx.service on my machine. The ExecStart is a command that, if you ran it on the CLI directly, would fork itself and exit immediately. Note also the presence of a PIDFile option, which tells Systemd where nginx writes the PID of its main process, so that Systemd can keep track of it after the fork.

[Unit]
Description=A high performance web server and a reverse proxy server
Documentation=man:nginx(8)
After=network.target nss-lookup.target

[Service]
Type=forking
PIDFile=/run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t -q -g 'daemon on; master_process on;'
ExecStart=/usr/sbin/nginx -g 'daemon on; master_process on;'
ExecReload=/usr/sbin/nginx -g 'daemon on; master_process on;' -s reload
ExecStop=-/sbin/start-stop-daemon --quiet --stop --retry QUIT/5 --pidfile /run/nginx.pid
TimeoutStopSec=5
KillMode=mixed

[Install]
WantedBy=multi-user.target

Also note the various other forms of Exec options being used. These should be mostly self-explanatory.

The nss-lookup.target in After roughly says "run nginx after DNS resolution is up".

Logging

Output from Systemd units, including services and timers, is logged by journald. The logs are stored by default in /var/log/journal/[RandomHexString]/.

To show the logs for a particular service or timer, use -u to specify its unit name:

journalctl -u your-service.service

For user-specific services, use the --user option:

journalctl --user -u your-personal-service

systemctl status also shows the logs for the most recent invocation of the service or timer (more on timers later):

systemctl status your-service

journalctl has several useful options. Two that I use often are --follow, which acts like the same flag for tail, and --since, which limits the output to recent messages.
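A few common combinations, sketched as hypothetical helper functions (the `logs_*` names are mine; the journalctl flags are standard):

```shell
#!/bin/sh
UNIT="${UNIT:-your-service.service}"

# Stream new log lines as they arrive, like "tail -f".
logs_follow() { journalctl -u "$UNIT" --follow; }

# Everything logged in the last hour. --since also accepts "today",
# "yesterday", or a timestamp such as "2022-12-10 08:00".
logs_last_hour() { journalctl -u "$UNIT" --since "1 hour ago"; }

# The 50 newest entries of priority "err" or worse from the current boot.
logs_errors() { journalctl -u "$UNIT" -b -p err -n 50; }
```

Add `--user` to any of these when the unit lives in the per-user directory.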

One of the most annoying things about journald is that it logs in binary 🤮, not text. That's not that bad, though, since the journalctl command can be pointed at any of these binary files using the --file option:

journalctl --file /var/log/journal/deadbeef9a943d890d5ba9c04cb0c9f/system.journal

Journald by default limits itself to 10% of the file system, capped at 4GB, and tries to leave 15% of it free (see SystemMaxUse= and SystemKeepFree= in journald.conf's manual page). A simple way to archive logs would be to archive the /var/log/journal directory.
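If the defaults do not suit your server, the limit can be changed with a drop-in file. In the sketch below, the 500M figure, the file name size.conf, and the helper function names are arbitrary examples; SystemMaxUse=, --disk-usage, and --vacuum-size= are standard journald and journalctl options.

```shell
#!/bin/sh
DEMO_DIR="${DEMO_DIR:-$(mktemp -d)}"
mkdir -p "$DEMO_DIR/journald.conf.d"

# Drop-in that caps journal storage on disk. Install it as
# /etc/systemd/journald.conf.d/size.conf and restart systemd-journald
# for it to take effect.
cat > "$DEMO_DIR/journald.conf.d/size.conf" <<'EOF'
[Journal]
SystemMaxUse=500M
EOF

# Helpers for checking and trimming usage on a live system.
journal_usage() { journalctl --disk-usage; }
journal_trim()  { sudo journalctl --vacuum-size=500M; }

echo "wrote $DEMO_DIR/journald.conf.d/size.conf"
```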

Journald also integrates well with syslog, as well as logging solutions such as ElasticSearch (TODO: write this part).

Using Templates

Sometimes, you need to create multiple instances of a service, each with a slightly different configuration. This is where templating comes in.

For example, most distros use Systemd for managing Wireguard. Wireguard allows you to specify multiple configurations in its configuration directory, called something like /etc/wireguard/wg0.conf, /etc/wireguard/wg1.conf, /etc/wireguard/wg2.conf, and so on. Then, each connection is controlled with commands such as:

sudo systemctl enable wg-quick@wg0.service
sudo systemctl stop wg-quick@wg1.service
# ... and so on ...

The service file for a templated service must have an @ at the end of the service name, so the WireGuard Service file is named wg-quick@.service. Its contents are shown below:

[Unit]
Description=WireGuard via wg-quick(8) for %I
After=network-online.target nss-lookup.target
Wants=network-online.target nss-lookup.target
PartOf=wg-quick.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/wg-quick up %i
ExecStop=/usr/bin/wg-quick down %i
ExecReload=/bin/bash -c 'exec /usr/bin/wg syncconf %i <(exec /usr/bin/wg-quick strip %i)'

[Install]
WantedBy=multi-user.target

The only difference between this file and any other is the %i and %I specifiers. %i gets replaced with the part after the @ (so wg0 or wg1 above), in its escaped form, and %I is the same thing with the escaping undone. (Why anyone would put non-standard characters in a Service name is beyond me.)
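You can apply the same technique to your own services. As a sketch, the template below runs one HTTP server per port; it assumes the script from the first example has been changed to take the port as its first argument, which the original script does not do.

```shell
#!/bin/sh
DEMO_DIR="${DEMO_DIR:-$(mktemp -d)}"

# Hypothetical template: note the "@" before ".service" in the file name.
cat > "$DEMO_DIR/my-http-server@.service" <<'EOF'
[Unit]
Description=HTTP server on port %i

[Service]
# %i becomes the instance name, e.g. 8080 for my-http-server@8080.
ExecStart=/usr/local/bin/my-http-server.bash %i

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $DEMO_DIR/my-http-server@.service"
```

Once the file is installed, `sudo systemctl start my-http-server@8080 my-http-server@8081` starts two independent instances from the one file.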

Timers

Systemd Timers are a way to schedule jobs to run periodically. They are another type of Systemd Unit. However, you will see that every Timer has a companion oneshot Service.

Why?

Yes, this functionality is mostly redundant with cron. Why would you want to use Systemd timers, then?

A few reasons:

See a more complete list of advantages here. Personally, I use Systemd timers to install scheduled tasks that are bundled with packages (like certbot, described below), but tend to prefer cron for simpler, one-line jobs.

Configuring a Timer Task

Anyway, let's learn how Timers are configured by looking at how Let's Encrypt's certbot program renews certificates. If you've configured a web server on Linux recently, you'll know that the easiest way to get a valid SSL certificate for a domain is to run Certbot, which verifies your domain with the people at Let's Encrypt and retrieves a certificate. However, certificates from Let's Encrypt have relatively short expiration periods and need to be renewed periodically.

In order to do this, Certbot configures a Systemd Timer unit by installing two files:

As you can see, Timers are defined using two files, one for the Timer and another for a oneshot Service. They need to have the same file basename (the certbot part of certbot.service and certbot.timer). The timer defines when the task runs, and the service defines what to run.

In order for the job to actually run, the Timer needs to be enabled with systemctl enable certbot.timer as well as started. Note that you should be explicit with the .timer part (recall that .service is assumed by default if there is no unit type). The service itself should not be enabled. Technically, though, you could run it once with start.

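The contents of /lib/systemd/system/certbot.service should look much like the oneshot Service seen above, but without RemainAfterExit: a Service that stays "active" after exiting would not be started again the next time the Timer fires.

[Unit]
Description=Certbot

[Service]
Type=oneshot
# no RemainAfterExit here, so the timer can start the service again
ExecStart=/usr/bin/certbot -q renew
PrivateTmp=true

# include the section below if you want to be able to 'enable'/'start' this service
[Install]
WantedBy=timers.target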

/lib/systemd/system/certbot.timer contains a cron-like line to define the job's schedule:

[Unit]
Description=Run certbot twice daily

[Timer]
OnCalendar=*-*-* 00,12:00:00
RandomizedDelaySec=43200
Persistent=true

[Install]
WantedBy=timers.target

OnCalendar is, clearly, the schedule. Unfortunately, its syntax is different from cron's. Roughly speaking, the first part defines the date and the second part the time, with commas separating alternative values within a field. Thus, the expression above means "every day at midnight and noon":

*-*-* 00,12:00:00

The Date can be omitted, so this means every fifteen minutes (notice the cron-style / syntax):

*:0/15 

See systemd.time(7) for the full OnCalendar syntax. The general form is YYYY-MM-DD HH:MM:SS.
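Because the syntax is easy to get wrong, it helps to let Systemd parse an expression before you put it in a timer file. The sketch below wraps the standard `systemd-analyze calendar` command in a hypothetical helper named `check_calendars`:

```shell
#!/bin/sh
# Ask Systemd to parse each calendar expression and report its
# normalized form and next elapse time; flag the ones it rejects.
check_calendars() {
    for expr in "$@"; do
        systemd-analyze calendar "$expr" || echo "invalid: $expr" >&2
    done
}
```

For example, `check_calendars '*-*-* 00,12:00:00' '*:0/15'` checks both expressions used in this section.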

RandomizedDelaySec adds a random offset to the scheduled time. This is neat because you can define a whole bunch of different timers to all run at midnight and there is no risk of accidentally overloading your system when the clock strikes 12:00 AM. Of course, this would not be appropriate if something needs to actually run at midnight, so be careful not to go overboard with it.

As well as the expected journalctl commands, you can see a list of timers on your system with list-timers:

systemctl list-timers
journalctl -u certbot.timer
journalctl -u certbot.service

Running Containers as Systemd Services

Coming Soon!

Running Systemd inside Containers

Coming Soon!

Resources