Remote Persistent Workers

Steps to Enable and Use Remote Persistent Workers

What are Remote Persistent Workers?

Persistent workers are a mechanism by which build systems such as Bazel keep a local worker process (the persistent worker) running for an extended period of time and send actions to it over a local inter-process communication (IPC) protocol. The primary advantage of this approach is that the worker process can cache data in memory to speed up action execution.

For example, consider the Javac persistent worker. Keeping Javac running avoids the JVM startup overhead, allows Javac to cache the standard library in memory, and allows Javac (which itself is written in Java) to cache its own compiled machine code.

EngFlow RE has experimental support for running persistent worker processes in a cluster, i.e., remote from the client, hence the name “remote persistent workers”. It uses the same IPC protocol as Bazel, which is documented at https://docs.bazel.build/versions/master/persistent-workers.html.
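
To make the request/response cycle concrete, here is a minimal sketch of a worker loop in Python. It assumes the newline-delimited JSON variant of Bazel's worker protocol (field names arguments, requestId, exitCode, and output); Bazel's default encoding is length-delimited protobuf, and this page does not state which encodings EngFlow RE accepts. The count_calls action is hypothetical and exists only to show state surviving between requests.

```python
import io
import json


def serve(requests, responses, run_action):
    """Answer one WorkRequest per input line until the request stream closes."""
    cache = {}  # lives across requests; this is what a persistent worker buys you
    for line in requests:
        line = line.strip()
        if not line:
            continue
        request = json.loads(line)
        exit_code, output = run_action(request.get("arguments", []), cache)
        response = {
            "exitCode": exit_code,
            "output": output,
            # Fields with default values may be omitted, so fall back to 0.
            "requestId": request.get("requestId", 0),
        }
        responses.write(json.dumps(response) + "\n")
        responses.flush()


def count_calls(arguments, cache):
    # Hypothetical action: counts how often each argument list was seen.
    key = tuple(arguments)
    cache[key] = cache.get(key, 0) + 1
    return 0, f"seen {cache[key]} time(s)"
```

A real worker would call serve(sys.stdin, sys.stdout, ...) and keep anything else off stdout, since stdout carries the responses.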

As of 2020-10-28, multiplex workers are not supported.

Security Considerations

Because remote persistent workers have state, they can introduce non-determinism and non-hermeticity into the build process. In the worst case, they can introduce a vector for bad actors to undermine the trustworthiness of the build outputs. We therefore recommend:

  1. Disable remote persistent workers for release builds. Action cache entries generated by remote persistent workers are not reused for release builds if the platform options differ. Disabling remote persistent workers in the client through a platform option (see below) therefore also keeps release builds from reusing those action cache entries.

  2. Use different values for the cache-silo-key platform option for different subsets of clients to avoid reusing persistent workers between those subsets (e.g., for teams working on different products).

  3. Establish strict review and modification guidelines for remote persistent worker source or executable code, e.g., by requiring mandatory code reviews and restricting the set of users who are allowed to check in changes.
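
Recommendation 2 can be sketched on the client side as follows. This assumes a Bazel client and uses Bazel's --remote_default_exec_properties flag, which only applies to actions whose execution platform does not set its own exec_properties. The team names are hypothetical.

```python
def silo_flag(team: str) -> str:
    """Return a Bazel flag that places this team's actions in their own silo."""
    if not team or "=" in team or any(c.isspace() for c in team):
        raise ValueError(f"unsuitable silo key: {team!r}")
    # cache-silo-key is the platform option named in recommendation 2.
    return f"--remote_default_exec_properties=cache-silo-key={team}"


# Hypothetical team names; any stable, distinct strings would do.
for team in ("payments", "search"):
    print(silo_flag(team))
```

Each team would put its own flag into its .bazelrc so that its actions, and therefore its persistent workers, are kept apart from those of other teams.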

Enabling Remote Persistent Workers

You have to enable remote persistent workers both on the server side (EngFlow RE Service) and on the client side.

Server

Remote persistent workers can work with the Docker (--allow_docker) and local (--allow_local) execution strategies.

To use persistent workers with Docker execution, the Docker container must be kept running with the persistent worker process inside. This requires the following configuration flags to be set:

--allow_docker=true
--docker_allow_reuse=true
--experimental_persistent_worker_and_docker=true

To use persistent workers with local execution, you need to set the following configuration flags:

--allow_local=true
--experimental_persistent_worker=true

Client

Even with the server-side flags enabled, remote persistent workers additionally require client opt-in on a per-action basis. This requires the following client settings:

  1. The platform option persistentWorkerKey must be set to a non-empty value
  2. If using Docker execution, i.e., if the platform option container-image is non-empty, then the client must also set the platform option dockerReuse=True
  3. The persistent worker inputs in the input tree must be marked with the property bazel_tool_input (the value can be empty)
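
The three conditions above can be checked mechanically before an action is sent. The sketch below is not part of EngFlow RE or Bazel; the function name and the dict-based representation of platform options and input properties are assumptions made for illustration.

```python
def opt_in_problems(platform, worker_inputs, input_properties):
    """List the client-side conditions from the text that an action fails to meet.

    platform: dict of platform option name -> value
    worker_inputs: paths of the persistent worker's own files
    input_properties: dict of input path -> dict of properties on that input
    """
    problems = []
    if not platform.get("persistentWorkerKey"):
        problems.append("persistentWorkerKey must be non-empty")
    # The spelling "True" is taken from the text; whether the server accepts
    # other casings is not stated there.
    if platform.get("container-image") and platform.get("dockerReuse") != "True":
        problems.append("dockerReuse=True is required with container-image")
    for path in sorted(worker_inputs):
        # The property only has to be present; its value may be empty.
        if "bazel_tool_input" not in input_properties.get(path, {}):
            problems.append(f"{path} is not marked with bazel_tool_input")
    return problems


# Hypothetical action: Docker execution requested, but dockerReuse is missing.
platform = {"persistentWorkerKey": "javac-v1", "container-image": "docker://example/image"}
inputs = {"tools/javac_worker.jar": {"bazel_tool_input": ""}}
print(opt_in_problems(platform, {"tools/javac_worker.jar"}, inputs))
```

An empty result means the action satisfies all three client-side conditions; the server-side flags still have to be enabled separately.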

Using Remote Persistent Workers with Bazel

We have created a patch for Bazel that adds support for the persistentWorkerKey platform option as well as marking the worker inputs with bazel_tool_input.

As of 2020-10-28, this patch is not in upstream Bazel. Instead, it’s available at https://github.com/EngFlow/bazel/tree/remote-persistent-worker. In addition to the patch, you also have to add --experimental_remote_mark_tool_inputs to your Bazel invocation.
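
As a rough illustration, an invocation of the patched Bazel could be assembled like this. The binary path, the endpoint, and the target are hypothetical; --remote_executor is Bazel's standard flag for selecting a remote execution endpoint, and the other flag is the one named above.

```python
import shlex


def bazel_command(targets, remote_executor, bazel="bazel"):
    """Build an argument list for a remote build with tool inputs marked."""
    return [
        bazel,
        "build",
        f"--remote_executor={remote_executor}",
        # Flag named in the text above.
        "--experimental_remote_mark_tool_inputs",
        *targets,
    ]


# Hypothetical binary path, endpoint, and target.
cmd = bazel_command(["//src:app"], "grpcs://cluster.example.com", bazel="./bazel-patched")
print(shlex.join(cmd))
```

The list form can be passed to subprocess.run directly; the shlex.join call is only there to print a copy-pasteable command line.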

Last modified 2020-11-19