Remote execution testing

We intend to produce a test for BuildStream's remote execution system using docker images where possible. This will consist of:

  • BuildStream (the system under test) running on a host system as normal

  • BuildStream's artifact cache running as a docker image

  • BuildGrid running as a docker image

  • buildbox running as a docker image - to be arranged.

This is a BuildStream-centric test, which means BuildStream is the thing being tested. The versions of BuildGrid, buildbox and bst-artifact-server will be pinned, while our CI tests the latest version of BuildStream against them. This may still reveal faults in BuildGrid or buildbox, but I want to avoid updating all components at arbitrary intervals, because that makes it harder to compare a test run against previous results.

In the future, we may try to get other projects (e.g. BuildGrid) to run similar tests, in which case the current master of BuildGrid would be tested against fixed versions of BuildStream and buildbox, hopefully reusing as much code as possible from our test suite.

Minimum viable test

  • Create TLS certificates (server and client) for communication with the artifact cache.
  • Start the BuildStream artifact cache container, passing a local folder containing the generated certificates as a volume.
  • Start the BuildGrid server docker container.

The final piece, starting a bot, is slightly more tricky. We need to build buildbox ourselves and place it in a container that also contains BuildGrid.

I expect this can be done by starting a second BuildGrid container with the entrypoint overridden, using that container to download and build buildbox, and then using it as the build bot. If that container can be duplicated after building buildbox and used to run multiple bots, all the better.

There is a Dockerfile present in buildbox, which means there have at least been attempts to make a buildbox docker image in the past. However, according to Jürg, this will only provide buildbox itself, not the BuildGrid buildbox bot.

What currently works

  • The bst-artifact-server docker image works with current BuildStream, excepting issues #22 (which has a workaround) and #32. [1]

  • It is possible to set up both BuildGrid and buildbox as separate docker containers running on the same virtual network, as below.

Configuring BuildGrid and buildbox

These tests were run on the current master version of the BuildGrid repository (revision 33158b7f2baf28a89b5af6526fbadbebaf15e5ac).

Apply the following diff to the Dockerfile in the BuildGrid repository to build buildbox along with BuildGrid.

diff --git a/Dockerfile b/Dockerfile
index d533f68..5373604 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,4 +1,8 @@
-FROM python:3.5-stretch
+FROM debian:buster
+
+RUN apt-get update && apt-get install -y \
+    python3 \
+    python3-pip
 
 # Point the path to where buildgrid gets installed
 ENV PATH=$PATH:/root/.local/bin/
@@ -13,10 +17,32 @@ WORKDIR /app
 COPY . .
 
 # Install BuildGrid
-RUN pip install --user --editable .
+RUN pip3 install --user --editable .
+
+WORKDIR /app
+
+RUN apt-get update && apt-get install -y \
+    gcc \
+    g++ \
+    git  \
+    grpc++ \
+    libfuse3-dev \
+    libssl-dev \
+    meson \
+    pkg-config \
+    uuid-dev \
+    && apt-get clean
+
+RUN git clone https://gitlab.com/BuildGrid/buildbox/buildbox-fuse.git /buildbox
+
+WORKDIR /buildbox
+
+RUN mkdir build && cd build && meson .. && ninja && ninja install && buildbox --help
+
+WORKDIR /app
 
 # Entry Point of the image (should get an additional argument from CMD, the path to the config file)
 ENTRYPOINT ["bgd", "server", "start", "-vv"]
 
 # Default config file (used if no CMD specified when running)
-CMD ["buildgrid/_app/settings/default.yml"]
+CMD ["tls.yml"]

Now add tls.yml to the root of the BuildGrid repository, so that it is copied into the image alongside the source. This alternative configuration file enables TLS:

server:
  - !channel
    port: 50051
    insecure-mode: false
    credentials:
      tls-server-key: /certs/server.key
      tls-server-cert: /certs/server.crt
      tls-client-certs: /certs/client.crt

description: |
  A single default instance.

instances:
  - name: ''
    description: |
      The main server

    storages:
      - !disk-storage &main-storage
        path: !expand-path $HOME/cas

    services:
      - !action-cache &main-action
        storage: *main-storage
        max-cached-refs: 256
        allow-updates: true

      - !execution
        storage: *main-storage
        action-cache: *main-action

      - !cas
        storage: *main-storage

      - !bytestream
        storage: *main-storage

Now build the image from the root of the BuildGrid repository:

docker build --tag buildgrid_server .

Now we can start our instances. First, create the TLS certificates with openssl:

mkdir certs; cd certs
openssl req -new -newkey rsa:4096 -x509 -sha256 -days 3650 -nodes -batch -subj "/CN=bgd_server" -out "server.crt" -keyout "server.key"
openssl req -new -newkey rsa:4096 -x509 -sha256 -days 3650 -nodes -batch -subj "/CN=bgd_client" -out "client.crt" -keyout "client.key"
cd ..
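
Before mounting the directory into a container, it may be worth confirming that each certificate really belongs to its key. The following is a minimal sketch, assuming the openssl command-line tool is on the PATH; check_cert_pair is a hypothetical helper name and is not part of BuildGrid or BuildStream.

```shell
#!/bin/sh
# Hypothetical helper (not part of BuildGrid or BuildStream).
# Succeeds only if the public key embedded in certificate $1 is identical
# to the public key derived from private key $2.
check_cert_pair() {
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 1
    key_pub=$(openssl pkey -in "$2" -pubout) || return 1
    [ "$cert_pub" = "$key_pub" ]
}
```

For the files created above, this could be used as: check_cert_pair certs/server.crt certs/server.key && check_cert_pair certs/client.crt certs/client.key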

Start the network:

docker network create buildgrid-net

Start the BuildGrid server, mounting the 'certs' directory we created earlier (replace the host path with the absolute path to your own 'certs' directory):

docker run -i -p 50051:50051 --network buildgrid-net --volume /home/jimmacarthur/bs/buildgrid/certs:/certs --name bgd_server buildgrid_server

Start a buildbox bot using the same image, but with a different entry point:

docker run --entrypoint "bgd" --network buildgrid-net --volume /home/jimmacarthur/bs/buildgrid/certs:/certs buildgrid_server bot --remote https://bgd_server --client-key /certs/client.key --client-cert /certs/client.crt --server-cert /certs/server.crt --remote-cas https://bgd_server:50051 --cas-client-key /certs/client.key --cas-client-cert /certs/client.crt --cas-server-cert /certs/server.crt buildbox

This should produce a working remote execution service, accessible outside docker as https://localhost:50051. You'll need the certificates created earlier for your client configuration.
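
The start-up commands above hard-code one user's home directory. As a minimal sketch, the same sequence can be wrapped in shell functions that take the certificate directory from a variable. The function names and the CERTS_DIR and DOCKER variables are illustrative assumptions; the image name, network name and bgd arguments are copied from the commands above. One deliberate difference is that the containers are started detached (-d) rather than interactively (-i), so that a script does not block on the server.

```shell
#!/bin/sh
# Illustrative wrappers around the commands shown above.
# CERTS_DIR and DOCKER are assumptions of this sketch, not BuildGrid settings.
CERTS_DIR=${CERTS_DIR:-$PWD/certs}   # host directory holding the .crt/.key files
DOCKER=${DOCKER:-docker}             # set DOCKER=echo to print commands only

start_network() {
    "$DOCKER" network create buildgrid-net
}

start_server() {
    # -d (detached) replaces the -i used above so the caller is not blocked.
    "$DOCKER" run -d -p 50051:50051 --network buildgrid-net \
        --volume "$CERTS_DIR:/certs" --name bgd_server buildgrid_server
}

start_bot() {
    # Same image as the server, with the entry point overridden to run a bot.
    "$DOCKER" run -d --entrypoint bgd --network buildgrid-net \
        --volume "$CERTS_DIR:/certs" buildgrid_server \
        bot --remote https://bgd_server \
        --client-key /certs/client.key --client-cert /certs/client.crt \
        --server-cert /certs/server.crt \
        --remote-cas https://bgd_server:50051 \
        --cas-client-key /certs/client.key --cas-client-cert /certs/client.crt \
        --cas-server-cert /certs/server.crt \
        buildbox
}
```

Calling start_network, start_server and start_bot in that order reproduces the manual steps; running with DOCKER=echo first shows the exact commands without touching docker.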

What needs doing

  • Find out whether we can run docker on the runners used by BuildStream's CI system, and if not, arrange other runners to do it.

    • GitLab's standard runners are capable of starting a BuildGrid container as a service, as long as the images are hosted on a Docker repository. The standard hub.docker.com can be used for this. However, there is no apparent way to mount volumes for a container, which is our preferred way of sharing TLS keys between BuildGrid, bot and BuildStream. We can add a fixed certificate to the repository for BuildGrid, but we must be careful not to assume we have any security as a result of using TLS if we do this.

  • Find a cleaner way to merge a buildbox docker image into the BuildGrid image. The current solution requires cloning and building buildbox inside the BuildGrid Dockerfile, which duplicates some work.

    • It is possible to use two separate Dockerfiles for BuildGrid, one which builds the normal base image and one which uses this image as its source and then downloads and builds buildbox. However, buildbox requires the libfuse3-dev Debian package which is not present in Debian stretch, the base image used by BuildGrid. We would need to upgrade the BuildGrid base image to Debian testing in order to deduplicate the steps in the Dockerfiles.

Things that would be nice to have

  • Start more than one buildbox bot.
  • Use servers and bots other than BuildGrid and buildbox, when they become available.
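
For the first item, one untested possibility is to run the bot command from earlier several times against the same image, since each docker run creates a separate container. The sketch below assumes this is sufficient; whether BuildGrid distributes work sensibly across bots started this way has not been verified here. The start_bots function, the CERTS_DIR and DOCKER variables, and the bgd_bot_N container names are all hypothetical.

```shell
#!/bin/sh
# Untested sketch: start several bot containers from the buildgrid_server image.
# CERTS_DIR, DOCKER and the bgd_bot_N names are assumptions of this example.
CERTS_DIR=${CERTS_DIR:-$PWD/certs}
DOCKER=${DOCKER:-docker}             # set DOCKER=echo to print commands only

# Start $1 bots (default 2), named bgd_bot_1, bgd_bot_2, ...
start_bots() {
    count=${1:-2}
    i=1
    while [ "$i" -le "$count" ]; do
        "$DOCKER" run -d --name "bgd_bot_$i" --entrypoint bgd \
            --network buildgrid-net --volume "$CERTS_DIR:/certs" \
            buildgrid_server \
            bot --remote https://bgd_server \
            --client-key /certs/client.key --client-cert /certs/client.crt \
            --server-cert /certs/server.crt \
            --remote-cas https://bgd_server:50051 \
            --cas-client-key /certs/client.key \
            --cas-client-cert /certs/client.crt \
            --cas-server-cert /certs/server.crt \
            buildbox || return 1
        i=$((i + 1))
    done
}
```

For example, start_bots 3 would attempt to start three bots on the buildgrid-net network, stopping at the first failure.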

References:

[1] https://gitlab.com/BuildStream/buildstream-docker-images/issues/32

Projects/BuildStream/RemoteExecutionTesting (last edited 2019-01-14 09:01:37 by JMacArthur)