Remote Persistent Workers¶
Steps to enable and use remote persistent workers.
What are Persistent Workers?¶
"Persistent workers" is a feature that can speed up builds.
Normally the build system (e.g. Bazel) starts a new compiler process for every compilation action. For some languages like C++, this is the only way to compile source files.
For other languages, such as Java, the compiler process can be kept alive and reused to compile more files. This is faster than starting a new compiler for every file.
A persistent worker is a long-lived, reused process, like the Java compiler in the example above. A build system like Bazel keeps a local process (the persistent worker process) running for an extended period and uses some local inter-process communication (IPC) protocol to send actions to the worker process. The worker process can cache data in memory to speed up action execution.
There can be workers for other languages (e.g. TypeScript, Scala, Kotlin) and for other types of actions (e.g. JUnit testing). See also Bazel's definition of persistent workers.
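The reuse pattern described above can be sketched as a small request loop. This is an illustration only, not Bazel's or EngFlow's implementation. It assumes the JSON flavor of Bazel's worker protocol with one request per line and the field names arguments, requestId, exitCode, and output. The names handle_request and serve, and the fake "compile" step, are hypothetical stand-ins.

```python
import json


def handle_request(request, cache):
    """Handle one work request, reusing results kept in memory.

    `cache` outlives individual requests, which is what makes a
    persistent worker faster than a fresh process per action.
    """
    compiled, reused = 0, 0
    for source in request.get("arguments", []):
        if source in cache:
            reused += 1
        else:
            # Stand-in for expensive work such as parsing or compiling.
            cache[source] = f"object-code-for-{source}"
            compiled += 1
    return {
        "exitCode": 0,
        "output": f"compiled={compiled} reused={reused}",
        "requestId": request.get("requestId", 0),
    }


def serve(lines, cache=None):
    """Yield one JSON response line per JSON request line.

    Assumption: requests arrive as one JSON object per line.
    """
    cache = {} if cache is None else cache
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between requests
        yield json.dumps(handle_request(json.loads(line), cache))
```

A real worker process would wire this to its standard streams, for example by iterating over `serve(sys.stdin)` and printing each reply with `flush=True`, so that the build system receives each response as soon as the action finishes.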
What are Remote Persistent Workers?¶
Remote persistent workers are persistent workers on a Remote Execution cluster, not on the client machine.
EngFlow Remote Execution supports this feature, using the same IPC protocol as Bazel.
As of 2024-09-05, Bazel's multiplex workers are not supported.
Security Considerations¶
Because remote persistent workers have state, they can introduce non-determinism and non-hermeticity into the build process. In the worst case, they can introduce a vector for bad actors to undermine the trustworthiness of the build outputs. We therefore recommend:
- Disable remote persistent workers for release builds. Action cache entries generated by remote persistent workers are not reused for release builds if the platform options differ. Disabling remote persistent workers in the client (using a platform option; see below) is therefore enough to keep release builds from using those action cache entries.
- Use different values for the cache-silo-key platform option for different subsets of clients (e.g., teams working on different products) to avoid reusing persistent workers between those subsets.
- Establish strict review and modification guidelines for remote persistent worker source or executable code, e.g., by requiring mandatory code reviews and restricting the set of users who are allowed to check in changes.
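The cache-silo-key recommendation can be sketched as a small helper that derives a stable key per client group. This is a hypothetical illustration: the function name silo_flag and the normalization rule are made up for this example, and passing the platform option through Bazel's --remote_default_exec_properties flag is an assumption about how a given client supplies platform options.

```python
import re


def silo_flag(client_group: str) -> str:
    """Build a Bazel flag that puts one client group in its own silo.

    Assumption: the client sets platform options via
    --remote_default_exec_properties; other setups may set them on the
    execution platform instead.
    """
    # Normalize the group name so the same team always gets the same key.
    key = re.sub(r"[^a-z0-9]+", "-", client_group.lower()).strip("-")
    if not key:
        raise ValueError("client group must contain a letter or digit")
    return f"--remote_default_exec_properties=cache-silo-key={key}"
```

Two groups with different names get different keys, so under the recommendation above their builds would not share persistent workers.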
How to use this feature?¶
To use Remote Persistent Workers, enable it both on the server side (EngFlow RE Service) and on the client side.
Server-side requirements¶
Remote persistent workers can work with the docker (--allow_docker) and local (--allow_local) execution strategies.
To view the current flag settings, visit https://<cluster_name>.cluster.engflow.com/restatus, replacing <cluster_name> with the name of your cluster, and expand the Options section.
Local Execution¶
Local execution is not enabled by default. Therefore, the local execution strategy must be explicitly enabled to use remote persistent workers with this strategy.
Verify that the --allow_local remote execution flag is set on the relevant workers.
Docker Execution¶
Docker execution is enabled by default on Linux and Windows.
To ensure remote persistent workers using the docker execution strategy are available, verify that the docker execution flags, including --allow_docker, are set on the relevant workers.
Windows¶
Remote persistent workers are currently not available on Windows. If you are interested in this feature, please reach out to your EngFlow support engineer.
Client-side requirements¶
If you use Bazel:
- Bazel 6 and newer support remote persistent workers. For older Bazel versions, you'll need to build a custom Bazel binary to support remote persistent workers. Check out our forks:
  - Bazel 5.3.0
  - Bazel 5.1.1
  - Bazel 5.0
  - Bazel 4.1
  - Bazel 4.0
  - Bazel 3.7
  Build your Bazel with bazel build -c opt //src:bazel, then use the resulting binary (bazel-bin/src/bazel) instead of the released Bazel.
- Add the --experimental_remote_mark_tool_inputs flag to your .bazelrc.
- Optional: explicitly enable persistent workers.
  If you build Java, this step is unnecessary; Bazel uses persistent Java workers by default. For other languages, you may need to enable workers.
If you use other build tools:
- The platform option persistentWorkerKey must be set to a non-empty value.
- If using Docker execution, i.e., if the platform option container-image is non-empty, then the client must also set the platform option dockerReuse=True.
- The persistent worker inputs in the input tree must be marked with the property bazel_tool_input (the value can be empty).
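For clients other than Bazel, the three requirements above can be checked before an action is submitted. The following is a minimal sketch under stated assumptions: check_worker_action is a hypothetical helper, platform options are modeled as a plain dict of strings, and the worker's tool inputs are modeled as a mapping from input path to that input's properties. How a real client represents these depends on its remote execution library.

```python
def check_worker_action(platform, tool_inputs):
    """Return a list of problems that would prevent worker reuse.

    platform:    dict of platform option name -> value
    tool_inputs: dict of worker input path -> dict of node properties
    Both shapes are assumptions made for this sketch.
    """
    problems = []

    # Requirement 1: persistentWorkerKey must be non-empty.
    if not platform.get("persistentWorkerKey"):
        problems.append("platform option persistentWorkerKey must be non-empty")

    # Requirement 2: Docker execution also needs dockerReuse=True.
    if platform.get("container-image") and platform.get("dockerReuse") != "True":
        problems.append("platform option dockerReuse=True is required with container-image")

    # Requirement 3: worker inputs carry bazel_tool_input (value may be empty).
    for path, properties in sorted(tool_inputs.items()):
        if "bazel_tool_input" not in properties:
            problems.append(f"input {path} is missing the bazel_tool_input property")

    return problems
```

An empty result means the action meets the listed client-side requirements; it does not confirm that the server side has the corresponding execution strategy enabled.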