Remote Persistent Workers

Steps to Enable and Use Remote Persistent Workers

What are Persistent Workers?

“Persistent workers” is a feature that can speed up builds.

Normally the build system (e.g. Bazel) starts a new compiler process for every compilation action. For some languages like C++, this is the only way to compile source files.

For some languages like Java, it’s possible to keep the compiler process alive and reuse it to compile more files. This is faster than starting a new compiler for every file.

A persistent worker is a long-lived, reused process, like the Java compiler in the example above. A build system like Bazel keeps a local process (the persistent worker process) running for an extended period and uses some local inter-process communication (IPC) protocol to send actions to the worker process. The worker process can cache data in memory to speed up action execution.
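
To make the idea concrete, here is a minimal sketch of a worker loop. It is not Bazel's or EngFlow's implementation: the newline-delimited JSON framing and the field names (arguments, requestId, exitCode, output) are assumptions loosely modeled on the JSON flavor of Bazel's worker protocol, and ToyWorker with its upper-casing "compile" step is purely hypothetical.

```python
import io
import json


class ToyWorker:
    """Stand-in for a compiler that stays alive between actions."""

    def __init__(self):
        self.cache = {}      # in-memory state that outlives a single action
        self.cache_hits = 0

    def handle(self, request):
        results = []
        for arg in request.get("arguments", []):
            if arg in self.cache:
                self.cache_hits += 1
            else:
                self.cache[arg] = arg.upper()  # placeholder for real work
            results.append(self.cache[arg])
        return {
            "requestId": request.get("requestId", 0),
            "exitCode": 0,
            "output": " ".join(results),
        }


def serve(worker, requests, responses):
    """Answer one JSON request per input line until the stream ends."""
    for line in requests:
        if not line.strip():
            continue
        response = worker.handle(json.loads(line))
        responses.write(json.dumps(response) + "\n")
        responses.flush()


# Two actions sent to the same process; the second reuses cached work.
worker = ToyWorker()
out = io.StringIO()
serve(
    worker,
    io.StringIO('{"arguments": ["a.java"]}\n{"arguments": ["a.java", "b.java"]}\n'),
    out,
)
```

A real worker would read from standard input and write to standard output; the in-memory streams here only keep the sketch self-contained. The cache is also what makes workers stateful, which matters for the security considerations further down.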

There can be workers for other languages (e.g. TypeScript, Scala, Kotlin) and for other types of actions (e.g. JUnit testing). See also Bazel’s definition of persistent workers.

What are Remote Persistent Workers?

Remote persistent workers are persistent workers on a Remote Execution cluster, not on the client machine.

EngFlow Remote Execution supports this feature, using the same IPC protocol as Bazel.

As of 2021-06-18, Bazel’s multiplex workers are not supported.

Security Considerations

Because remote persistent workers have state, they can introduce non-determinism and non-hermeticity into the build process. In the worst case, they can introduce a vector for bad actors to undermine the trustworthiness of the build outputs. We therefore recommend:

  1. Disable remote persistent workers for release builds. Note that action cache entries generated from remote persistent workers are not reused for release builds if the platform options differ. Therefore, simply disabling remote persistent workers in the client (using a platform option; see below) is enough for release builds to ignore those action cache entries.

  2. Use different values for the cache-silo-key platform option for different subsets of clients to avoid reusing persistent workers between those subsets (e.g., for teams working on different products).

  3. Establish strict review and modification guidelines for remote persistent worker source or executable code, e.g., by requiring mandatory code reviews and restricting the set of users who are allowed to check in changes.
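
The first two recommendations can be sketched as a small helper that picks platform options per build. This is only an illustration: the function, the team names, and the worker key value "javac-worker" are hypothetical, and it assumes that leaving out persistentWorkerKey (described under "If you use other build tools") is how a client disables remote persistent workers.

```python
def platform_options(team, release_build, worker_key="javac-worker"):
    """Return platform options for one build (hypothetical helper)."""
    # A distinct silo key per team keeps workers from being shared across teams.
    options = {"cache-silo-key": team}
    if not release_build:
        # Assumption: only non-release builds opt in to remote persistent workers.
        options["persistentWorkerKey"] = worker_key
    return options


dev = platform_options("team-a", release_build=False)
release = platform_options("team-a", release_build=True)
other_team = platform_options("team-b", release_build=False)
```

Because dev and release end up with different platform options, release builds would not reuse action cache entries produced by persistent workers, which matches the note in the first recommendation.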

How to use this feature?

To use remote persistent workers, enable the feature on both the server side (EngFlow RE Service) and the client side.

Server-side requirements

Remote persistent workers can work with the docker (--allow_docker) and local (--allow_local) execution strategies.

With docker execution, the docker container must be kept running with the persistent worker process inside. Set the following configuration flags:

--allow_docker=true
--docker_allow_reuse=true
--experimental_persistent_worker_and_docker=true

With local execution, you need to set the following configuration flags:

--allow_local=true
--experimental_persistent_worker=true
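
If the server configuration is generated by a script, the two flag sets above could be selected as in the following sketch. The helper name and the "docker"/"local" strategy labels are hypothetical; only the flags themselves come from this page.

```python
# Flags copied from the lists above, keyed by execution strategy.
SERVER_FLAGS = {
    "docker": [
        "--allow_docker=true",
        "--docker_allow_reuse=true",
        "--experimental_persistent_worker_and_docker=true",
    ],
    "local": [
        "--allow_local=true",
        "--experimental_persistent_worker=true",
    ],
}


def server_flags(strategy):
    """Return the server flags for one execution strategy (hypothetical helper)."""
    if strategy not in SERVER_FLAGS:
        raise ValueError(f"unknown execution strategy: {strategy!r}")
    return list(SERVER_FLAGS[strategy])


docker_flags = server_flags("docker")
local_flags = server_flags("local")
```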

Client-side requirements

If you use Bazel:

You’ll need to build a custom Bazel binary to support remote persistent workers.

  1. Check out our fork of:

    Build your Bazel with bazel build -c opt //src:bazel, then use the resulting binary (bazel-bin/src/bazel) instead of the released Bazel.

    (As of 2021-06-18, these patches are not in upstream Bazel yet.)

  2. Add the --experimental_remote_mark_tool_inputs flag to your .bazelrc.

  3. Optional: explicitly enable persistent workers

    If you build Java, this step is unnecessary. Bazel uses persistent Java workers by default.

    For other languages, you may need to enable workers.

If you use other build tools:

  1. Set the platform option persistentWorkerKey to a non-empty value.
  2. If using Docker execution, i.e., if the platform option container-image is non-empty, also set the platform option dockerReuse=True.
  3. Mark the persistent worker inputs in the input tree with the property bazel_tool_input (the value can be empty).
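
The three requirements above can be sketched with plain dictionaries standing in for the messages a client would actually send. How a given build tool encodes platform options and input properties is not covered here, so the helper names, the dictionary layout, and the example paths and image name are all hypothetical.

```python
def worker_platform(worker_key, container_image=""):
    """Build platform options that request a remote persistent worker."""
    if not worker_key:
        raise ValueError("persistentWorkerKey must be non-empty")
    options = {"persistentWorkerKey": worker_key}
    if container_image:
        # Docker execution: the container must be reused across actions.
        options["container-image"] = container_image
        options["dockerReuse"] = "True"
    return options


def mark_tool_inputs(input_paths, tool_paths):
    """Attach the bazel_tool_input property (empty value) to worker inputs."""
    tools = set(tool_paths)
    return [
        {"path": path, "properties": {"bazel_tool_input": ""} if path in tools else {}}
        for path in input_paths
    ]


platform = worker_platform("my-worker", container_image="example-image")
inputs = mark_tool_inputs(["tools/worker.jar", "src/Main.java"], ["tools/worker.jar"])
```

Only the worker's own files are marked; ordinary source inputs such as src/Main.java carry no property.
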
2021-09-21