NH:STA S01E04 systemd posted Wednesday, July 23, 2025 by The Neighbourhoodie Team Announcement News

This post is part of our series on our work for the Sovereign Tech Agency (STA), formerly the Sovereign Tech Fund. Our introduction post explains why and how we are contributing to various Open Source projects.

In this episode, we take a closer look at DNS lookups in Linux, discover how the team fixed some parser bugs, and talk to Neighbourhoodie’s James Coglan about the impact of extending test coverage.

About the project

systemd is a critical component of most modern Linux distributions. At its core it is an init system, the component that starts up, monitors and manages all the other processes that run as the system boots up. Using a declarative configuration language, services can register themselves to be started when some other system event occurs, such as when multi-user functionality or networking becomes available, or when some other service starts or stops.

As well as the core init system, systemd includes a suite of other tools that provide functionality needed by many applications, such as log management and various network services. One such service is resolved, a DNS resolver. Applications use this whenever they need to look up information about a domain name, for example to translate a name like example.com into an IP address like 23.192.228.84, or look up where the mail servers for a certain email address are. Needless to say, a computer connected to the internet needs to do these things very frequently, and so it is critical that the DNS resolver is robust, performant, and secure. The DNS protocol involves talking to arbitrary other servers on the internet, and processing input that may not be trustworthy, so it’s important that a malicious DNS server cannot exploit resolved by, for example, sending it malformed responses.

Just like most open source projects, systemd is never “finished” and there is always more that can be done to improve things. Though the project already maintains a high quality bar and is very robust, the maintainers wanted some help adding extra test coverage and finding edge cases, particularly in the resolved component. The project uses the Meson build system, which can be configured to output code coverage reports using LCOV. This made it easy for us to identify source code modules and functions that lacked any test coverage, so we could focus our effort on those code paths where adding tests would have the biggest impact.

Our contributions

In network protocols, the code that deals with parsing and generating messages is often a source of security vulnerabilities, especially in C codebases where a parsing mistake can result in invalid memory access that an attacker can exploit. We quickly homed in on the DNS message parser as a critical component and set about adding extensive unit tests for most of the DNS message types that resolved knows how to handle. Checking that malformed messages are rejected is just as important as checking that valid messages are parsed correctly. We added plenty of checks for different ways that DNS messages and record types can be malformed, to make sure the parser reports errors for all of them and doesn’t mistakenly accept broken inputs.
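To give a flavour of what "accept valid input, reject malformed input" looks like in practice, here is a minimal sketch in Python. It is not the resolved parser (which is written in C) and does not reflect its API: `parse_header` and `MalformedMessage` are names made up for this illustration. It only reads the fixed 12-byte DNS header described in RFC 1035 and applies two simple sanity checks.

```python
import struct

# Illustrative only: these names are hypothetical and are not part of systemd.
class MalformedMessage(ValueError):
    """Raised when a buffer cannot be a valid DNS message."""

# id, flags, qdcount, ancount, nscount, arcount (RFC 1035, section 4.1.1)
HEADER = struct.Struct("!HHHHHH")

MIN_QUESTION = 5   # 1-byte root name + 2-byte type + 2-byte class
MIN_RECORD = 11    # 1-byte root name + type + class + 4-byte TTL + 2-byte rdlength

def parse_header(buf: bytes) -> dict:
    """Parse the fixed DNS header, rejecting obviously broken input."""
    if len(buf) < HEADER.size:
        raise MalformedMessage(f"need {HEADER.size} header bytes, got {len(buf)}")
    ident, flags, qd, an, ns, ar = HEADER.unpack_from(buf)

    # The header must not promise more entries than the body could possibly hold.
    body = len(buf) - HEADER.size
    if body < qd * MIN_QUESTION + (an + ns + ar) * MIN_RECORD:
        raise MalformedMessage("section counts exceed message size")

    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),  # QR bit
        "rcode": flags & 0x000F,              # response code
        "qdcount": qd,
        "ancount": an,
        "nscount": ns,
        "arcount": ar,
    }
```

A test for a function like this feeds it one well-formed buffer and several deliberately broken ones (truncated, or with counts that lie about the content) and checks that only the first is accepted.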

Having thoroughly tested the parser, we then added tests for the serialiser that encodes DNS record structures into network messages. This makes sure that all the output produced by resolved is valid, and that it never emits a message that it itself could not parse. During this work, we identified and fixed a small number of bugs in DNS message handling:

Fixed a couple of parsing edge cases relating to how resolved interprets the content of OPT records, which are used to convey metadata such as whether a client or server supports DNSSEC.
Identified a scenario where resolved could generate invalid DNS messages, if it was somehow asked to send information about a domain name with segments longer than the 63-byte limit. It was determined there was no known way to exploit this, and the behaviour was corrected a few months later.
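The 63-byte limit comes from the DNS wire format: each segment (label) of a name is prefixed by a single length byte whose top two bits are reserved, leaving six bits for the length. The sketch below illustrates that rule and the "never emit what you could not parse" property with a round trip. It is a simplified Python illustration, not the resolved implementation; `encode_name` and `decode_name` are hypothetical helpers, and name compression is deliberately not handled.

```python
MAX_LABEL = 63   # a label length must fit in 6 bits (RFC 1035)
MAX_NAME = 255   # limit for the whole encoded name, length bytes included

def encode_name(name: str) -> bytes:
    """Encode a dotted name into uncompressed DNS wire format."""
    stripped = name.rstrip(".")
    if not stripped:
        return b"\x00"  # the root name
    out = bytearray()
    for label in stripped.split("."):
        raw = label.encode("ascii")
        if not raw:
            raise ValueError("empty label")
        if len(raw) > MAX_LABEL:
            raise ValueError(f"label is {len(raw)} bytes, limit is {MAX_LABEL}")
        out.append(len(raw))
        out += raw
    out.append(0)  # terminating root label
    if len(out) > MAX_NAME:
        raise ValueError("encoded name exceeds 255 bytes")
    return bytes(out)

def decode_name(buf: bytes, offset: int = 0) -> tuple[str, int]:
    """Decode an uncompressed name; return it with the offset just past it."""
    start = offset
    labels = []
    while True:
        if offset >= len(buf):
            raise ValueError("truncated name")
        length = buf[offset]
        offset += 1
        if length == 0:
            break
        if length > MAX_LABEL:
            # Top bits set: a compression pointer or reserved label type.
            raise ValueError("unsupported label type")
        end = offset + length
        if end > len(buf):
            raise ValueError("label runs past end of buffer")
        labels.append(buf[offset:end].decode("ascii"))
        offset = end
    if offset - start > MAX_NAME:
        raise ValueError("encoded name exceeds 255 bytes")
    return ".".join(labels), offset
```

With helpers like these, a serialiser test can assert that `decode_name(encode_name(n))` gives back `n` for valid names, and that an over-long label is refused at encoding time instead of producing an invalid message.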

After this, we moved on to add test coverage for many of the key data structures and functions used for DNS logic, for example the functions that compare every DNS record type for equality, or that apply CNAME/DNAME redirects to IP address lookups, or that manage the system’s DNS cache. Over the course of a few weeks, we made a significant impact on the test coverage of resolved:

We added over 10,000 lines of unit test code to the project.
We increased the proportion of lines of code executed during tests from 16% to 52%.
We increased the proportion of functions invoked during tests from 21% to 65%.

Reflections from the team

Here’s a short interview with Neighbourhoodie developer James Coglan, who worked on the systemd project, reflecting on how testing makes it easier for new maintainers to approach a project:

What was the most surprising thing working on this project?

James: Projects written in C, especially system software, have a bit of a reputation for being impenetrable and difficult for newcomers to get started with. However, I found the systemd codebase remarkably easy to pick up — it has a clearly documented build and test process and detailed guides for contributing to the project, what their code style conventions are, etc.

The implementation of the DNS functionality is also pretty clear and unsurprising; once you know how DNS works it’s relatively easy to figure out what the code is doing and where to find particular bits of functionality. Much of it is written in a way that I could add plenty of tests without needing to modify the source code at all, which is not often the case when code is written without tests.

What was especially challenging about this project?

Going into the project, I had a basic understanding of what DNS is and what it does, but not a detailed understanding of the protocol. It has evolved a great deal over time and now encompasses an awful lot of different types of information, and I needed to learn how all of these are represented in DNS messages.

This involved reading many of the dozens of standards documents that define the protocol, all the different types of DNS messages, what the format and validation rules for each one are, etc. Also, a lot of internet protocols end up working slightly differently in actual implementations compared to what the spec says, and so programs like resolved have to deviate from the spec in places to be compatible with the bugs in various companies’ routers.

Fortunately, the basic DNS message format is fairly simple and it’s quite easy to just send messages to existing DNS servers on the internet and see how they respond. This way, you can get an idea of how it looks in the real world and validate your understanding of the specs, and I relied a lot on this sort of experimentation to check various things.
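As a rough idea of how little is needed for that kind of experiment, the Python sketch below builds a plain DNS query by hand and sends it over UDP. It is an illustration under stated assumptions, not the tooling used on the project: `build_query` and `send_query` are hypothetical names, the query ID is arbitrary, and the server address in the usage comment is a documentation placeholder that you would replace with a real resolver.

```python
import socket
import struct

def build_query(name: str, qtype: int = 1, ident: int = 0x1234) -> bytes:
    """Build a one-question DNS query (qtype 1 = A record, class 1 = IN)."""
    # Header: id, flags (only "recursion desired" set), 1 question, no records.
    header = struct.pack("!HHHHHH", ident, 0x0100, 1, 0, 0, 0)
    qname = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        qname.append(len(raw))
        qname += raw
    qname.append(0)  # terminating root label
    return header + bytes(qname) + struct.pack("!HH", qtype, 1)

def send_query(message: bytes, server: str, port: int = 53,
               timeout: float = 3.0) -> bytes:
    """Send a query over UDP and return the raw reply bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(message, (server, port))
        data, _ = sock.recvfrom(4096)
        return data

# Usage (needs network access; 192.0.2.53 is a placeholder address):
# reply = send_query(build_query("example.com"), "192.0.2.53")
# print(reply.hex())
```

Looking at the raw reply bytes next to the relevant specification is a quick way to confirm how a record type is laid out on the wire.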

Did you learn anything on this project that could be helpful for other open source teams building critical Linux components?

Making it possible for newcomers to contribute to the project, and to do so safely without fear of breaking things, is critical to a project’s resilience. The easier you can make it to write and run tests, the easier it becomes to avoid introducing bugs into critical code paths. People new to the project, who don’t yet know all its details, are much more free to experiment and make changes when they know they can’t accidentally break existing functionality.

Even if you don’t have a lot of testing in place, it pays to write code in a way that makes it easy to do later. A lot of the resolved codebase consists of simple functions that you can call with some input and check what they return. You don’t need to spin up a whole DNS server or configure any network interfaces, you can just invoke the message parser with a buffer and check how it responds.

Making it simple to test specific functions without a lot of complex setup has many benefits for maintainability beyond testing, but writing tests is a good way to make sure more of your code follows this principle.

Conclusion

Good test coverage is one of those things that when it’s done right, people don’t notice it — everything just keeps working as expected. It’s only when things are not adequately tested that it becomes noticeable as random things break more easily.

By improving the test coverage in the resolved service, we’ve helped the project’s maintainers continue to update the software with confidence, and made it easier for new maintainers to join the project. Everyone can keep working efficiently without being slowed down by uncertainty or manual testing, keeping the project healthy and end users happy.

You can read more about our work with the Sovereign Tech Agency and the projects we’ve worked on with their support:

Thanks for joining us for another instalment. See you in the next one!
