Remote Persistent Workers

Steps to enable and use remote persistent workers.

What are Persistent Workers?

"Persistent workers" is a feature that can speed up builds.

Normally the build system (e.g. Bazel) starts a new compiler process for every compilation action. For some languages like C++, this is the only way to compile source files.

For other languages, such as Java, the compiler process can be kept alive and reused to compile more files. Reusing the process is faster than starting a new compiler for every file.

A persistent worker is a long-lived, reused process, like the Java compiler in the example above. A build system like Bazel keeps a local process (the persistent worker process) running for an extended period and uses some local inter-process communication (IPC) protocol to send actions to the worker process. The worker process can cache data in memory to speed up action execution.
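To make the request/response cycle concrete, here is a minimal sketch of a worker loop. It is an illustration, not Bazel's or EngFlow's implementation: the field names (`arguments`, `requestId`, `exitCode`, `output`) are modeled on Bazel's JSON worker protocol, the one-JSON-object-per-line framing is a simplifying assumption, and `handle_request` and `serve` are hypothetical names.

```python
import json
import sys


def handle_request(request, cache):
    """Process one work request and return a response dict.

    `cache` lives as long as the worker process, so a repeated request
    skips the (pretend) expensive work.
    """
    key = tuple(request.get("arguments", []))
    if key not in cache:
        # Stand-in for real work such as compiling a source file.
        cache[key] = "processed " + " ".join(key)
    return {
        "exitCode": 0,
        "output": cache[key],
        "requestId": request.get("requestId", 0),
    }


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON request per line and answer until stdin closes."""
    cache = {}  # in-memory state that survives across requests
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        response = handle_request(json.loads(line), cache)
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()  # the build system waits for each response
```

Calling `serve()` with no arguments reads requests from standard input. The point of the sketch is the `cache` dictionary: because the process is not restarted between actions, later requests can reuse whatever earlier requests loaded.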

There can be workers for other languages (e.g. TypeScript, Scala, Kotlin) and for other types of actions (e.g. JUnit testing). See also Bazel's definition of persistent workers.

What are Remote Persistent Workers?

Remote persistent workers are persistent workers that run on a Remote Execution cluster rather than on the client machine.

EngFlow Remote Execution supports this feature, using the same IPC protocol as Bazel.

As of 2021-06-18, Bazel's multiplex workers are not supported.

Security Considerations

Because remote persistent workers have state, they can introduce non-determinism and non-hermeticity into the build process. In the worst case, they can introduce a vector for bad actors to undermine the trustworthiness of the build outputs. We therefore recommend:

Disable remote persistent workers for release builds. Action cache entries generated by remote persistent workers are not reused for release builds if the platform options differ. Disabling remote persistent workers in the client (using a platform option; see below) is therefore enough to ensure that those action cache entries are ignored.
Use different values for the cache-silo-key platform option for different subsets of clients to avoid reusing persistent workers between those subsets (e.g., for teams working on different products).
Establish strict review and modification guidelines for remote persistent worker source or executable code, e.g., by requiring mandatory code reviews and restricting the set of users who are allowed to check in changes.

How to use this feature?

To use remote persistent workers, enable them on both the server side (EngFlow RE Service) and the client side.

Server-side requirements

Remote persistent workers can work with the docker (--allow_docker) and local (--allow_local) execution strategies.

With docker execution, the docker container must be kept running with the persistent worker process inside. Verify the following remote execution flags are set:

--allow_docker=true
--docker_allow_reuse=true
--experimental_persistent_worker_and_docker=true

These flags are enabled by default.

With local execution, verify the following remote execution flags are set:

--allow_local=true
--experimental_persistent_worker=true

To view the current flag settings, visit https://<cluster_name>.cluster.engflow.com/restatus (replace <cluster_name> with the name of your cluster) and expand the Options section.
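If you copy the flag settings out of the Options section, a small script can compare them against the requirements above. The sketch below assumes the flags are available as `--name=value` lines and that a bare `--name` means true; the format of the status page itself is not specified here, and `parse_flags` and `missing_flags` are hypothetical helper names.

```python
# Flags that must be true for each execution strategy, as listed above.
REQUIRED_FLAGS = {
    "docker": (
        "--allow_docker",
        "--docker_allow_reuse",
        "--experimental_persistent_worker_and_docker",
    ),
    "local": (
        "--allow_local",
        "--experimental_persistent_worker",
    ),
}


def parse_flags(lines):
    """Turn '--name=value' lines into a dict; a bare '--name' counts as true."""
    flags = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("--"):
            continue  # skip anything that is not a flag
        name, _, value = line.partition("=")
        flags[name] = value.lower() if value else "true"
    return flags


def missing_flags(strategy, configured_lines):
    """Return the required flags that are absent or not set to true."""
    configured = parse_flags(configured_lines)
    return sorted(
        name for name in REQUIRED_FLAGS[strategy]
        if configured.get(name) != "true"
    )
```

For example, `missing_flags("local", ["--allow_local=true"])` reports that `--experimental_persistent_worker` still needs to be set.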

Client-side requirements

If you use Bazel:

Bazel 6 and newer support remote persistent workers. For older Bazel versions, you'll need to build a custom Bazel binary that supports them. Check out our forks:
Bazel 5.3.1
Bazel 5.3.0
Bazel 5.1.1
Bazel 5.0
Bazel 4.1
Bazel 4.0
Bazel 3.7

Build the fork with bazel build -c opt //src:bazel, then use the resulting binary (bazel-bin/src/bazel) instead of the released Bazel.

Add the --experimental_remote_mark_tool_inputs flag to your .bazelrc.
Optional: explicitly enable persistent workers

If you build Java, this step is unnecessary. Bazel uses persistent Java workers by default.

For other languages, you may need to enable workers.

If you use other build tools:

The platform option persistentWorkerKey must be set to a non-empty value.
If using Docker execution (that is, if the platform option container-image is non-empty), the client must also set the platform option dockerReuse=True.
The persistent worker inputs in the input tree must be marked with the property bazel_tool_input (the value can be empty).
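The three requirements above can be sketched with plain dictionaries. This is an illustration under stated assumptions, not a client implementation: the dictionary layout is loosely modeled on the Remote Execution API's Platform and FileNode messages, and the helper names, the worker key, and the image name are hypothetical.

```python
def worker_platform(worker_key, container_image=""):
    """Build the platform options for an action that uses a remote persistent worker."""
    if not worker_key:
        raise ValueError("persistentWorkerKey must be non-empty")
    props = {"persistentWorkerKey": worker_key}
    if container_image:
        # Docker execution: the container must be reused between actions.
        props["container-image"] = container_image
        props["dockerReuse"] = "True"
    # Emit the properties sorted by name so the result is deterministic.
    return [{"name": n, "value": v} for n, v in sorted(props.items())]


def mark_tool_inputs(file_nodes, tool_names):
    """Return copies of the file nodes with worker inputs marked as tool inputs."""
    marked = []
    for node in file_nodes:
        node = dict(node)  # leave the caller's data untouched
        if node["name"] in tool_names:
            node["node_properties"] = {
                "properties": [{"name": "bazel_tool_input", "value": ""}]
            }
        marked.append(node)
    return marked


platform = worker_platform("javac-worker", "docker://example/image")
inputs = mark_tool_inputs(
    [{"name": "compiler.jar"}, {"name": "Main.java"}],
    tool_names={"compiler.jar"},
)
```

Here only `compiler.jar` receives the `bazel_tool_input` property, because it belongs to the worker itself; `Main.java` is an ordinary per-action input and stays unmarked.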