SERVFAIL: the first 100 days
Co-Authored-By: famfo
Our little project has turned 100 days old today!
For the unaware: SERVFAIL is our (sdomi, famfo, Merlin, TheresNoTime) authoritative nameserver project: we host nameservers and provide a web interface for managing zones (adding/modifying/deleting records, etc). Users can then delegate their domains to our NSes. Our goal is to make DNS seem a bit less scary, and provide viable alternatives to commercial servers.
where are we now
- 8 nameservers (4 stable, 3 silly, 1 secondary) on 4 continents
- 55 zones delegated (over 10% owned by Merlin)
- thus far, 0 major outages (WIP)
- HTTP.sh backend, exposing an API proxy and a custom-built web interface, with no JS whatsoever (HTML + lots of CSS)
Picture 2 - current version of the zone editing UI
Picture 3 - some low-res statistics for the past ~90 days
Before SERVFAIL
The following section was written by famfo from their recollection of the initial events.
I set up my own DNS servers for my own domain after Merlin told me about ns-global.zone, an anycasted secondary DNS service.
It consists of two PowerDNS servers, one in Falkenstein, Germany (ns1.famfo.xyz) and one in Tokyo, Japan (ns2.famfo.xyz). The server in Germany acts as the primary, notifying ns2.famfo.xyz and ns-global.zone, which then AXFR the zone.
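Roughly, a primary/secondary split like this boils down to a handful of pdns.conf options; the sketch below is purely illustrative (option names as spelled in recent PowerDNS releases - older ones call them master/slave), with placeholders instead of real addresses:
# on ns1.famfo.xyz (the primary)
primary=yes
also-notify=<ns2 address>, <ns-global address>
allow-axfr-ips=<ns2 address>, <ns-global address>

# on ns2.famfo.xyz (the secondary)
secondary=yes
allow-notify-from=<ns1 address>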
At this point I want to thank @kwf@social.afront.org for 1. providing ns-global.zone in the first place and 2. being super helpful when setting this up for the first time.
I previously had my DNS hosted with Cloudflare, which was super unhelpful to migrate away from: AXFR is only available to enterprise customers, and there is no way to trivially download the entire zone, e.g. as a BIND zonefile. At this point I also didn't think about gluing together the PowerDNS API.
Anyways, after a few days I got this mail from Cloudflare:
Lol, thanks for reminding me that my infrastructure works. So I went and just deleted my account:
At that point, Merlin and sdomi expressed some interest in joining in on the "fun". A yet-to-be-named Matrix room appeared to see how far we could take our attention spans.
Fully Automated Globally Distributed Luxury Gay Name Server Network
This section (and all later ones) was co-written by sdomi and famfo.
Picture 6 - i wrote /hj! it was a joke!
From my (sdomi's) perspective, SERVFAIL started on one fateful evening in July. After my /hj post, famfo added me and Merlin into a group chat for hacking on something DNS; initially, we didn't really have a specific goal in mind as to what we wanted to create.
Both famfo and I had similar goals: get away from Cloudflare, because it kinda sorta SUCKS. Their UI is "acceptable" (albeit rough in a few places), their defaults encourage a Man-in-the-Middle approach for all of your traffic, and they're known for trying to be "not political" (which, of course, just equates to not blocking righties).
At the time, I was experiencing a period of creative block with other projects, so I decided that I might as well play around with DNS for a change. And while my pre-SERVFAIL DNS plans mostly included setting up a few redundant servers for myself and my projects, working in a group gave me motivation and energy to aim for something more - what if we made our own web UI, and made it a public project?
Tech stack (or lack thereof)
As long-time readers of my weblog may guess, my (sdomi's) default tech stack for writing webapps is... Bash, HTML, optionally some JS sprinkled to taste. I'm aware that it's not at all a common choice, and I didn't even think about subjecting others to it. Unfortunately, I'm not immediately comfortable with any other backend technology - so I decided to use this for a "quick-and-dirty" version for experiments. With projects like this one, it's important to keep the momentum going, because if you don't have anything by the end of the day, there may not be enough motivation left to continue the next afternoon.
... but the temporary solutions are the most permanent, so our Bash backend stuck.
The lack of JS in the frontend is a result of a bet between me and famfo. It limited us to doing everything with HTML, CSS and server-side rendering, but this approach has some pros: not only is everything blazingly fast, it's also easier to debug. Thus far, we haven't added a single line of JS into our frontend code.
Humble beginnings
Simultaneously with developing the "temporary" version of our webapp, we started playing with PowerDNS deployments and replication between servers. We were really pleasantly surprised by how smooth and easy PDNS is to set up - within a few hours, we had three servers up and running: the previously existing ns1.famfo.xyz, plus the new miyuki.sakamoto.pl and ns1.homecloud.lol.
We ended up using several different distros (famfo and I are on Alpine, while Merlin went for sehr enterprise RHEL). The same goes for the database backends: famfo picked Postgres, Merlin went with MariaDB, and I chose SQLite. It all just worked! Big point for PDNS.
The personalized approach to all of our nameservers has somewhat set the tone for the future of the project - and, arguably, it was a saving grace from it being boring as hell. 
Gluing together multiple DNS servers
Picture 7 - Merlin in his natural habitat (src)
Distributed applications are usually really scary to set up (and even scarier when they break). For replication purposes, PowerDNS has a feature called autoprimaries (or supermasters) and autosecondaries (or superslaves). With autoprimaries configured properly, when the primary learns of a zone change, it notifies all other servers in the "network", and each of them pulls the updated zone via an AXFR transfer. Within seconds, all the servers have a consistent state.
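Roughly, the autoprimary dance looks like this on each secondary - an illustrative sketch for recent PowerDNS versions, where the address, hostname and account label are made up:
# pdns.conf on the secondary: accept zones pushed by configured autoprimaries
autosecondary=yes

# register the primary the secondary should trust (IP, its NS name, an account label)
pdnsutil add-autoprimary 192.0.2.10 ns1.example.org servfail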
Outside of AXFR, PowerDNS exposes a REST HTTP API. This means that for most things, PowerDNS can be managed remotely from one central point:
                                                        +-----------+
+------------+            +---------------+ -- notify ->| secondary |
|            |            |               |             +-+---------+-+
|    Web     | -- pdns -->| PowerDNS zone | -- notify --->| secondary |
|   server   |  HTTP API  |   (primary)   |               +-+---------+-+
|            |            |               | -- notify ----->| secondary |
+------------+            +---------------+                 +-----------+
Figure 1 - our rough initial idea of a network structure
API connectivity was one of the first things we got working, and it became the core of how we designed the service. The API exposes almost everything you need to administer the server, with the exception of:
- changing the main INI config file
- managing autoprimaries (more on those later)
- restarting/reloading the server
- reporting the server version (for making sure we're all up to date)
While there are reasons why those options can't be managed through the HTTP API, this became a major pain point in the system design - something we'll get back to in a later part of this post.
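To give a taste of what the API does cover, here's roughly the kind of call a backend like ours ends up making - replacing an A record on the primary over REST (the zone name, record content and API key below are placeholders):
curl -sS -X PATCH \
    -H "X-API-Key: $PDNS_API_KEY" \
    http://127.0.0.1:8081/api/v1/servers/localhost/zones/example.org. \
    -d '{
      "rrsets": [{
        "name": "www.example.org.",
        "type": "A",
        "ttl": 300,
        "changetype": "REPLACE",
        "records": [{ "content": "192.0.2.1", "disabled": false }]
      }]
    }'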
Scrapped ideas: shared databases
Very briefly, when deciding about the general infrastructure, we experimented with replicating using a shared database. We had three ideas on how to run it:
- One central server periodically rsyncs a sqlite file onto all the other servers
- One central server modifies the database, while all the other ones have it mounted read-only through NFS
- One shared, replicated database (by any means necessary)
We abandoned this idea due to a variety of reasons:
- it encourages a single-point-of-failure design (the 3rd option is almost impossible to set up in reality)
- poor support from PDNS (even with sqlite, you have to manually restart the server process)
- lots of moving parts
- potential chokepoint if our database gets sufficiently big
- shared databases are very hard; PDNS already has built-in replication for just the DNS zones, which is exactly what we need
If autoprimary support in PowerDNS hadn't proved itself as reliable as it did, we would likely have ended up hacking on its source code to get something a tad better. Thankfully, after some back-and-forth with the docs, it just worked - another point for PDNS.
Building a basic user interface
Developing a complex application on top of your own homegrown web framework is often equivalent to laying the rails just beneath your moving locomotive. This project, unfortunately, was no different: the last 3 months have been some of the most development-active for HTTP.sh since its inception 4 years ago. Many times we embarked on long journeys to refactor a whole chunk of code upstream, just to avoid an ugly hack downstream.
This approach has pros and cons - it would likely be faster to use an existing technology, but thanks to building something from scratch we have a deeper understanding of how everything works together. It has also been mighty fun, so it's not a net loss :3
With SERVFAIL, we leverage HTTP.sh's templating system. The template engine is currently kind of a mess, but it allows us to do some neat things:
{{#app/templates/head.htm}}
{{#app/templates/nav.htm}}
<h2>Zone management</h2>
+ <a href="/zones/new">New zone</a>
<h2>Your DNS zones</h2>
<table>
	<tr>
		<th>Zone</th>
		<th>Share</th>
		<th>Delete</th>
	</tr>
	{{start _list}}
	<tr>
		<td><a href="/zones/{{.ns}}/{{.domain}}/">{{.domain}}</a></td>
		<td><a href="/zones/{{.ns}}/{{.domain}}/share">-></a></td>
		<td><a class="danger" href="/zones/{{.ns}}/{{.domain}}/del">Del</a></td>
	</tr>
	{{end _list}}
	{{start ?shared}}
	<tr>
		<td>---</td>
		<td>---</td>
		<td>---</td>
	</tr>
	{{end ?shared}}
	{{start _shared}}
	<tr>
		<td><a href="/zones/{{.ns}}/{{.domain}}/">{{.domain}}</a></td>
		<td>shared with you</td>
		<td>-</td>
	</tr>
	{{end _shared}}
</table>
{{#app/templates/bottom.htm}}
Figure 2 - HTTP.sh template for the zone listing
- {{#path}} includes another template file, and transparently evaluates it
- {{.tag}} populates the page with a dynamic string
- {{start _tag}} and {{end _tag}} are loops
- {{start ?tag}} and {{end ?tag}} are conditional statements (with an optional {{else ?tag}} in between)
Then, the actual Bash code is remarkably simple:
1. Using notORM, the data (de)serializer I wrote, we access the data store storage/zones.dat, filter the entries by "$username" and execute the callback x on each entry, which prefills a data structure we'll use later:
declare -A elem
nested_declare list
x() {
	elem[ns]="${data[2]}"
	elem[domain]="${data[1]}"
	nested_add list elem
}
data_iter storage/zones.dat "$username" x
unset elem
2. For static strings, it's enough to just fill them out in an array:
str[title]="SERVFAIL :: zone list"
3. Conditionals equate to false when unset and true when set:
str[?shared]=_
4. Loops work by reference-passing a previously prepared array:
str[_list]=list
str[_shared]=shared_list
5. Finally, we finish the preparations and render the result:
render str app/templates/zone/list.htm
The code rendering the listing is 32 lines in total, including whitespace. The result looks something like this:
Picture 8 - rendered zone list
I (sdomi) am quite proud of how simple and uncluttered the downstream code ends up being. It's usually a big surprise to outsiders how much can be done in plain Bash with some clever abstractions.
Still, there are ways to improve on our current UI. One of the bigger pain points I've encountered is keyboard navigation, where some of our pages misbehave wildly when tabbing through. The code itself is WIP in general - suggestions and help are absolutely welcome.
Missing pieces: servfail-sync
As of today (2024-10-23), we have most of the owl already drawn - with one notable exception: managing the servers.
By design, our network is hosted by multiple entities, with nobody truly having full access to all its parts. We're also joined by a couple folks from outside the SERVFAIL core team who wanted to host their own servers with us. Most functionality can be managed through the API, with the exception of autoprimaries: as our network grew bigger, every new server meant pestering everyone else to change their main INI config and manually run pdnsutil add-autoprimary.
servfail-sync is a client-server application which aims to address those issues:
- Clients run locally on specific nodes, alongside PowerDNS
- Every n seconds, clients connect to a central server to ask for changes in state, based on a local "height"
- The server checks a key header and returns a JSON-encoded list of events. At present, an event can either modify or delete an NS
- Clients apply the changes, restart the PDNS service and notify the server with an ACK. If they encounter an error (either with the received message or with the local state), the client replies with a NACK and terminates the service to prevent undefined behavior (see the sketch below)
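To make that concrete, here's a very rough sketch of what a client's poll loop could look like - this is not the real servfail-sync code, and the URL, header and endpoint names are made up for illustration:
#!/usr/bin/env bash
# NOT the real servfail-sync client - just a sketch of the poll/ACK cycle
state=/var/lib/servfail-sync/height

while sleep 30; do
	height=$(cat "$state" 2>/dev/null || echo 0)
	events=$(curl -sf -H "X-Sync-Key: $SYNC_KEY" \
		"$SYNC_URL/events?since=$height") || continue
	[[ "$events" == "[]" ]] && continue

	# ... apply each modify/delete event here (edit pdns.conf, pdnsutil, etc.)
	# and compute $new_height from the last applied event ...

	if systemctl restart pdns; then
		curl -sf -H "X-Sync-Key: $SYNC_KEY" "$SYNC_URL/ack?height=$new_height"
		echo "$new_height" > "$state"
	else
		curl -sf -H "X-Sync-Key: $SYNC_KEY" "$SYNC_URL/nack?height=$height"
		exit 1   # bail out instead of running in an undefined state
	fi
done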
Picture 9 - a debug log from a servfail-sync client
$ cat app/webroot/log.txt
2024-10-22 17:09:13: received NOK from suzuha_.test.server.: Failed to restart PDNS
2024-10-22 17:09:27: received OK from suzuha_.test.server.: (no message)
2024-10-23 15:46:54: received OK from suzuha_.test.server.: (no message)
2024-10-23 15:47:08: received NOK from suzuha_.test.server.: Failed to restart PDNS
2024-10-23 15:48:31: received OK from zork.test.server.: (no message)
2024-10-23 15:48:34: received NOK from zork.test.server.: Failed to restart PDNS
Figure 3 - a sample log from the management server, showing both successes and failures
servfail-sync is resilient against a temporary outage of any NS, while ensuring a minimal amount of restarts for all the servers in the network, thus minimizing downtime. We can also leverage it to send back arbitrary metadata (like the previously mentioned PowerDNS version). Adding new servers is as easy as setting up PDNS, setting up the servfail-sync client and letting it go through the event list up until this point.
The tool is designed to be agnostic of servfail-web, so other entities could conceivably deploy it without everything else we provide. It's not yet fully finished, but we plan to do a test deployment of it on our network sometime in the coming days.
Missing pieces: multidig
With a lot of servers, there's often a need to ask all of them for a record and quickly compare the results. But dig only supports asking one server at a time, and the response isn't formatted in a friendly way - what do?
#!/usr/bin/env bash
ns=(
	miyuki.sakamoto.pl
	sakamoto.pl
	ns1.famfo.xyz
	ns2.famfo.xyz
	ns1.homecloud.lol
#	ns7.kytta.dev
	ns1.fops.at
	ns1.rackspace.moe
)
for i in ${ns[@]}; do
	dig "@$i" "$@" |
		grep -v '^;' |
		grep -v '^$' |
		sed 's/^/'"$i"'\t/g' |
		awk '{ printf "%-20s |", $1; $1=""; printf "%s\n", $0 }' &
		# not safe but trusted input
done
wait
Figure 4 - multidig.sh
The snippet above iterates over all hardcoded servers and pretty-prints a response from each of them. So far we've mostly used it internally, and it has been an invaluable help with checking whether ns2.famfo.xyz has broken replication again.
$ multidig CH TXT version.bind
ns1.fops.at | version.bind. 5 CH TXT "im a fox, powered by SERVFAIL networks"
ns1.homecloud.lol | version.bind. 5 CH TXT "meow - powered by SERVFAIL networks"
ns1.famfo.xyz | version.bind. 5 CH TXT "nyaaaa :3"
sakamoto.pl | version.bind. 5 CH TXT "Meow!"
ns1.rackspace.moe | version.bind. 5 CH TXT "uwu"
miyuki.sakamoto.pl | version.bind. 5 CH TXT "meow"
ns2.famfo.xyz | version.bind. 5 CH TXT "nyaaaa :3"
Figure 5 - multidig result
The future
To casual observers of the #SERVFAIL tag on fedi, it may seem that progress has slowed down considerably since early September or so. Part of it was our attention spans temporarily running out, but life and work have also interfered quite significantly for parts of our team.
Nevertheless, our rough plans for the future (in no significant order):
- Formalize and create a SERVFAIL e.V. (a non-profit org)
- Finish up infra monitoring (and publish some cool usage graphs)
- Introduce a dedicated DDNS API endpoint
- Write. More. Docs!
- Zone rollback feature, keeping state from last 10-20 changes (useful if your change broke prod)
- Optional record validator (parsing some common record types, checking syntax of SPF/DKIM/DMARC, pinging A/AAAA record IPs to check for typos, etc)
- Finally, coming out of beta
If you'd like to help - we don't bite (...unless :3c), and we appreciate any kind of help. We hang out at #servfail @ irc.hackint.org (that's #servfail:hackint.org for those in the Matrix), feel free to come and say hi!
Addendum: Fun things we learned so far
(stolen from Nichijou, edited caption)
CAA records
TLS, the backbone of today's secure communications, relies on an established chain of trust to prevent eavesdropping. To cut the story short, you need a "trusted" 3rd party to issue you a certificate, through which you prove that a domain is in your possession. Clients use this to prevent eavesdropping (through MitM attacks, et al.).
But before a Certificate Authority can issue a certificate, it first needs to verify ownership of the domain. This can be done in a variety of ways (the most common ones include: sending a verification link to an administrative e-mail listed in the WHOIS / checking whether you can set a specific DNS record / checking whether you can put a file under a well-known HTTP path). Still, you have to trust all CAs to do a proper job with these verifications.
CAA records limit which CAs can issue a certificate for a domain. While it may not sound useful for most creatures, it doesn't cost a lot of time to set up, and helps mitigate some attack scenarios.
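For example, limiting issuance to a single CA takes just a couple of records (example.org and the chosen CA are placeholders here):
; only Let's Encrypt may issue certificates for example.org...
example.org.  3600  IN  CAA  0 issue "letsencrypt.org"
; ...and nobody may issue wildcard certificates
example.org.  3600  IN  CAA  0 issuewild ";"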
Verdict: COOL! When DNSSEC actually works properly :p
DNSSEC, NSEC and NSEC3
Ahh, DNSSEC. The bolted-on solution for signing resource records. The technique everyone laughs at when it's brought up. Yet one of us (famfo) keeps evangelizing it.
With DNSSEC, DNS records can be signed to prevent tampering with responses in transit. This is achieved by bolting on some new DNS record types (DNSKEY, RRSIG, DS). A hash of the public key for a zone is stored at the delegation (tl;dr: one zone above in the hierarchy), and using that, the DNSKEY and RRSIGs of a zone can be verified. We're currently writing a more in-depth set of (easy to digest) docs for it; for now, all you need to know is that the DS record is a hash of the DNSKEY (aka the public key) and is stored at the delegation (e.g. for famfo.xyz., that would be the xyz. zone).
NSEC/NSEC3 is basically a signed NXDOMAIN, which proves that a domain doesn't exist.
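If you want to poke at the chain yourself, plain dig is enough for a rough check - famfo.xyz is used as the example below, and the validating resolver is whichever one you trust:
# the DS record lives at the delegation (the xyz. zone)...
dig +short DS famfo.xyz
# ...and should match a hash of the DNSKEY served by the zone itself
dig +short DNSKEY famfo.xyz @ns1.famfo.xyz
# a validating resolver will set the "ad" flag on a correctly signed answer
dig +dnssec A famfo.xyz @9.9.9.9 | grep flags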
Delegations. Delegations are a pain to deal with. There are multiple DNSSEC algorithms: RSA, ecdsa256, ecdsa384, ed25519 and ed448 - among others. All a delegation has to do is sign the DS record with its own private key. Unfortunately, some delegations filter certain DNSKEY algorithms, especially ed25519 and ed448. There's no real reason not to support specific algorithms, and yet in reality it's hit or miss whether your delegation will support anything better than ecdsa256 (or, in worse cases, RSA).
Verdict: P A I N
HTTPS records
One of the most requested non-existent records we see is the HTTPS record. It is used to indicate how to access a site over HTTPS. It can be used to tell a browser which HTTP versions a server supports, or on which port to establish a QUIC connection. All served over DNS...
A neat thing HTTPS records can be used for is serving Encrypted Client Hello (ECH) public keys.
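As an illustration, an HTTPS record advertising HTTP/2, HTTP/3 and an ECH key could look roughly like this - example.org and the key blob are placeholders:
example.org.  300  IN  HTTPS  1 . alpn="h2,h3" port=443 ech="AEX+DQ..."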
Verdict: cursed
Video-over-DNS
To break from talking about actual DNS features, check out this little snippet instead:
dig +short TXT {0..92}.vid.demo.servfail.network | sed 's/[" ]*//g' | base64 -d | mpv -
Requires bind-tools and mpv. If it doesn't work, try adding @8.8.8.8 just after dig, or replace mpv with ffplay.
By complete accident, when toying with the idea of "let's store a lot of dumb shit in TXT records!", we encountered a PowerDNS bug: at some point, the zone would stop being AXFR-ed out to other servers, which led to a desync.
We opened the funniest GitHub issue of all time (have you ever sent a meme as a test case?). Within 10 minutes, we heard back with a workaround! Apparently it's a known bug, and the current resolution is to append workaround-11804=yes to the INI config.
This was also a great opportunity to stress-test our web interface. We found a few issues, but it still worked somewhat OK. With over 3.8MB in TXT records, HTTP.sh took around 20 seconds to spit out a response - not great, but better than suspected. After a few refactors, it's a bit better - currently it takes around 8 seconds to render the page that's full of very long TXT records. We'll try to improve this number in the future.
Verdict: Accidentally found a bug!
Thanks a lot to (in no significant order):
- everyone from the betatesting team, for their time, support and interest
- Shebang, for being our shitpost marketing manager
- the core SERVFAIL team, for putting up with my bullshit daily for over 3 months (and counting)
- Multi for helping with some of our logo designs
- April, Nikita and Adrian for liking the project enough to join with their own silly NS
- kleines Filmröllchen, mei, Linus, April and ari for proofreading this post

Comments:
Cool, but could you please fix this typo: > HTTPS records > One of the most reuested [sic] ...
ptrdns at 02.12.2024, 17:02:46
I like your API proxy approach, I built one myself but I have chosen to split the API proxy and the web application to manage domains. I currently reuse the PowerDNS API regression tests from their own Github to make sure that my proxy stays as close as possible to the original. Being the only one on the project, I could choose a uniform infrastructure and I went with a MariaDB Galera setup at first, and then standard MariaDB replication when the Galera setup proved itself unreliable over WAN. I would never have imagined to have multiple autoprimaries in a single network!