Warning:
This is only a proposal at this time, and not approved for project-wide application. This should not be applied before being officially turned into a GNOME Goal.

GNOME Goal: Installed Tests

For a long time, the GNOME release process has encouraged individual component maintainers to write and run Automake-style "make check" tests (sometimes as part of "distcheck" before uploading tarballs). There have been some efforts to automate this, and it is not unusual for downstream build systems to run "make check" inside the build environment.

While this page outlines an alternative testing model geared towards improving continuous integration, many consuming build systems still support only "make check". That model remains relevant when it is important or desirable to run tests before allowing "make install" to write into your install prefix. You likely want to support both, and the glib.mk Makefile fragment will help you do so.

Issues with "make check"

Environment: The most difficult problem with "make check" is the lack of a rigorous definition of the expected environment. Many "make check" tests as found in GNOME assume the presence of facilities that happen to be on developer laptops, such as an active X server. But most production build systems use restricted containers, and will not be building inside a logged-in session. To work around this, a number of GNOME components use "Xvfb" to create a virtual X server; the tradeoff is that the tests then run against a different X server configuration than users will have. Related to this, in the "make check" model the tests only exercise binaries in the source tree, not those in the actual final "make install" location.

Testing requires building: With "make check", tests can only be run from a module's build tree, so running them means building that module. Many system builders want the ability to do reverse dependency testing, even after changing just one module. For example, after changing the kernel, we'd like to be able to run the GLib test suite; it's a heavy user of kernel interfaces such as eventfd and threads. Similarly, after changing GLib, the gjs and pygobject test suites are excellent candidates.

Granularity: "make check" offers no granularity to testers: one has the choice of "all" or "none". An intelligent build and integration system would like to be able to schedule first the tests which are relevant to a source code change. For example, if the GLib networking code changes, it'd be possible to notice that libsoup makes calls to the modified functions, and schedule the libsoup tests before we rerun the tests for the GLib keyfile parser.

Test everything: Related to the above, a use case is to simply run all of the tests (possibly multiple times) on release candidate builds, without changing the source code at all.

Design

.test files

Installed tests are described by ".test" files, placed in /usr/share/installed-tests/componentname-apiversion/testname.test. The apiversion is added for parallel installability. Subdirectories of componentname-apiversion may be used. A ".test" file looks like this:

[Test]
Description=All gjs tests
Exec=/usr/libexec/installed-tests/gjs/gjs-all
Type=session
Output=TAP

The Exec key is mandatory; it has the same semantics as described in the Desktop Entry specification. See below for where to put test binaries.

The Type key is mandatory. There are two valid values: session and session-exclusive. See below for their semantics.

The key Output (if specified) can only take one value at present, which is TAP. This specifies that the test outputs Test Anything Protocol.

The key Description is optional; it is a freeform string describing the test.
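
The key rules above can be checked mechanically. The following is a minimal sketch, not part of any existing runner: parse_test_file is a hypothetical helper, and Python's configparser is assumed to be a close enough approximation of the key file syntax for simple files like the example.

```python
import configparser

# Values of the Type key that this page defines.
VALID_TYPES = {"session", "session-exclusive"}


def parse_test_file(text):
    """Parse the contents of a .test file and return the [Test] keys as a dict.

    Hypothetical helper: raises ValueError when the rules described on this
    page are violated.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive (Exec, Type, ...)
    parser.read_string(text)
    if not parser.has_section("Test"):
        raise ValueError("missing [Test] group")
    keys = dict(parser["Test"])
    if not keys.get("Exec"):
        raise ValueError("the Exec key is mandatory")
    if keys.get("Type") not in VALID_TYPES:
        raise ValueError("Type must be session or session-exclusive")
    if "Output" in keys and keys["Output"] != "TAP":
        raise ValueError("the only valid Output value is TAP")
    return keys


EXAMPLE = """\
[Test]
Description=All gjs tests
Exec=/usr/libexec/installed-tests/gjs/gjs-all
Type=session
Output=TAP
"""

meta = parse_test_file(EXAMPLE)
print(meta["Exec"], meta["Type"])
```

Note that the sketch returns Exec as a plain string; splitting it into arguments according to the Desktop Entry specification is left out here.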

Test execution semantics

session type tests are guaranteed to be run as non-root, in an "active desktop session". The precise guarantees of "active desktop session" are somewhat undefined, but for now, take this to mean an X server and a DBus session bus. This could be provided using virtualization, or a regular bare metal user login. These tests should not modify $HOME, and they should be robust against concurrently executed tests. For example, a test cannot assume that a newly created X window has focus.

session-exclusive tests are the same as above, except they are guaranteed to be executed without any other concurrent tests. This is useful for tests which manipulate the system UI shell, for example.
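
One way a runner could honor the two types is to run session tests in parallel and then run session-exclusive tests one at a time once nothing else is active. This is only a sketch under that assumption: run_all, run_one and max_workers are hypothetical names, and this page does not require session tests to run concurrently, only that they tolerate it.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(tests, run_one, max_workers=4):
    """Run tests according to their Type and return {name: outcome}.

    tests is a list of (name, type) pairs; run_one is a callable that
    executes a single test by name and returns its outcome.
    """
    concurrent = [name for name, kind in tests if kind == "session"]
    exclusive = [name for name, kind in tests if kind == "session-exclusive"]
    results = {}

    # session tests may overlap with each other.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name, outcome in zip(concurrent, pool.map(run_one, concurrent)):
            results[name] = outcome

    # The pool has been shut down, so nothing else is running while each
    # session-exclusive test executes.
    for name in exclusive:
        results[name] = run_one(name)
    return results


if __name__ == "__main__":
    demo = [("a", "session"), ("b", "session"), ("shell", "session-exclusive")]
    print(run_all(demo, run_one=lambda name: "ran " + name))
```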

Tests will be run in an empty temporary directory that will be automatically cleaned up upon termination. If tests need access to any data files, they must locate them under the configured prefix.

Tests should exit with code 0 on success. The special exit code 77 indicates the test should be skipped (this is taken from Automake). Any other exit code indicates failure. Tests should log to stdout/stderr; where those are directed is up to the framework. Standard input will be /dev/null.
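
The execution rules above can be sketched as follows. This is an illustration rather than the behavior of any particular framework: run_installed_test is a hypothetical helper, it takes an already split argument list instead of a raw Exec line, and it simply lets the test inherit the caller's stdout and stderr.

```python
import subprocess
import sys
import tempfile

SKIP_EXIT_CODE = 77  # taken from Automake, as described above


def run_installed_test(argv):
    """Run one test and return "pass", "skip" or "fail"."""
    # Empty temporary working directory, removed when the block exits.
    with tempfile.TemporaryDirectory() as workdir:
        proc = subprocess.run(
            argv,
            cwd=workdir,
            stdin=subprocess.DEVNULL,  # standard input is /dev/null
        )
    if proc.returncode == 0:
        return "pass"
    if proc.returncode == SKIP_EXIT_CODE:
        return "skip"
    # Any other exit code, including death by signal, is a failure.
    return "fail"


if __name__ == "__main__":
    # Stand-in for a real test binary: a child Python process exiting with 77.
    print(run_installed_test([sys.executable, "-c", "import sys; sys.exit(77)"]))
```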

Source code

Components can opt to supply installed tests in one of three ways. The most common is:

A configure option --enable-installed-tests controls whether the tests are built from the primary source code

Two less commonly used options are:

A subdirectory "installed-tests" in the source code that has its own independent build system (configure/Makefile)
As a separate git repository

The advantage of the "installed-tests" subdirectory model (and the separate git repository model) is that they allow separate build and runtime dependencies for the tests. For example, the current dbus test suite makes use of glib (and the glib test suite optionally makes use of dbus), but neither requires the other at runtime or build time, only for the tests.

Test "packaging"

While the Exec= line may reference binaries installed anywhere, it is highly encouraged for components to place test binaries and data in $(libexecdir)/installed-tests/$(PACKAGE)-$(PACKAGE_API_VERSION). This allows for build systems to easily identify and separate installed tests from other parts.

Implementation Status

https://gitlab.gnome.org/GNOME/gnome-desktop-testing is a basic runner for the tests; it can be used with JHBuild, like this:

$ jhbuild buildone glib
$ gnome-desktop-testing-runner glib

The gnome-desktop-testing runner is also used by the Projects/GnomeContinuous continuous integration system. It outputs log messages tagged with MESSAGE_IDs that can be seen from the host system.
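
To illustrate how a runner could locate tests for a component, here is a minimal sketch based on the .test layout described in the Design section. It is not the actual gnome-desktop-testing implementation: find_test_files is a hypothetical helper, and matching directories by name prefix is an assumption (it means "glib" would also match a directory such as glib-networking).

```python
from pathlib import Path


def find_test_files(component, root="/usr/share/installed-tests"):
    """Return the .test files for component, including those in subdirectories.

    Directories are expected to be named componentname-apiversion, so a
    simple name-prefix match is used here (an assumption of this sketch).
    """
    base = Path(root)
    if not base.is_dir():
        return []
    found = []
    for component_dir in sorted(base.iterdir()):
        if component_dir.is_dir() and component_dir.name.startswith(component):
            found.extend(sorted(component_dir.rglob("*.test")))
    return found


if __name__ == "__main__":
    for path in find_test_files("glib"):
        print(path)
```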

Several GNOME modules have presently been modified to install tests:

glib: https://bugzilla.gnome.org/show_bug.cgi?id=699079
gtk+: https://bugzilla.gnome.org/show_bug.cgi?id=699601
gjs: https://bugzilla.gnome.org/show_bug.cgi?id=698935
json-glib: https://git.gnome.org/browse/json-glib/commit/?id=3e9858cb9c34f492ad0859bd262c8c4691260b41
gdk-pixbuf
pango
clutter
gtksourceview
glib-networking
evolution-data-server
gnome-software
gspell

Further work

It should be possible to teach downstream packaging systems about installed tests. How exactly they should be handled will be up to the downstream; some may choose to make "-tests" subpackages, for example. Others may choose to run the tests as part of the build, by simulating an installation inside the build chroot and using Xvfb and dbus-launch.

Related Art

Comments

This conflicts somewhat with Initiatives/GnomeGoals/DistCheck.
The files are installed in .../installed-tests/$(PACKAGE). For a library, shouldn't it be .../installed-tests/$(PACKAGE)-$(PACKAGE_API_VERSION) instead, for parallel installability? Edit: yes, above paths adapted.