
What is remote execution?

To explain remote execution, it's helpful to first talk about software builds and build tools.

Software builds and build tools

A software build is the process where source code is turned into software that can be run. More concretely, a build consists of a sequence of commands that need to be run in a particular order to generate files (also called "artifacts") – the targets of the build. This is often more than just running a compiler on source code; as a repository of code grows in size and complexity, build processes can become quite involved.

Build tools automate this process and take care of related tasks such as configuring builds, maintaining debug and release versions, determining what artifacts need to be rebuilt when not starting from scratch, and running tests.

Software developers run the build tools on their local machines, and typically, they run the build and test commands locally as well.

Running locally has several drawbacks. The local development environment usually has limited compute resources, so builds may end up taking a long time; and some commands, such as ones that require specialized hardware, may not run at all.

More importantly, software engineers' machines could differ in ways that prevent code built on one machine from working with code built on another. This makes it hard to share artifacts to save time and resources. Even test results may be inconsistent across machines.

With remote execution, the commands that constitute the build are distributed across a cluster of machines. The cluster can be scaled up as needed to support large builds, and the machines can be managed in a way that maximizes consistency between users and builds, allowing robust caching of build and test results.

Remote execution and Bazel

While Bazel supports local execution, it also implements the open-source Remote Execution API, which allows it to work with third-party remote execution services.

To see how Bazel works in more detail, consider the three logically distinct phases of a Bazel build:

  1. Loading
  2. Analysis
  3. Execution

Loading and Analysis run locally in preparation for Execution, which can run with either local or remote execution.

Note that in practice Bazel may overlap the phases, e.g. some targets may still be loading while others are already being analyzed. This is always done in a way that is guaranteed to produce the same results as if the phases were executed serially, so we can think of them as distinct phases.

Loading

A Bazel build consists of targets. Source files are a simple kind of target. Other targets are defined through rules, which define the relationship between a target's inputs (sources and dependencies) and outputs. A target's outputs can, in turn, be some other target's inputs.

When running a Bazel command, e.g. bazel build //app, Bazel reads the BUILD files of the requested targets and their dependencies to compute the target graph: a directed acyclic graph whose nodes are targets and whose edges represent their inter-dependencies.

```mermaid
graph
  app["//app:app (cc_binary)"] -- sources --> main.cc["//app:main.cc"]
  app -- dependencies --> utils["//library:utils (cc_library)"]
  utils -- sources --> utils.cc["//library:utils.cc"]
```

(Note that the convention in Bazel is that the arrow points from depender to dependee.)
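The target graph above can be modeled as a plain adjacency mapping. The following is an illustrative sketch, not Bazel's internal representation; the labels match the example targets, and `transitive_deps` is a hypothetical helper for this sketch:

```python
# Sketch of a target graph as a DAG (not Bazel's internal data structure).
# Edges point from depender to dependee, matching Bazel's convention.
target_graph = {
    "//app:app": ["//app:main.cc", "//library:utils"],
    "//library:utils": ["//library:utils.cc"],
    "//app:main.cc": [],        # source files have no dependencies
    "//library:utils.cc": [],
}

def transitive_deps(graph, target):
    """Collect every target reachable from `target` (its transitive closure)."""
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for dep in graph[node]:
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

print(sorted(transitive_deps(target_graph, "//app:app")))
```

Walking the graph like this is how a build tool determines which targets a requested target depends on, directly or transitively.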

Analysis

Given the target graph and the rules that define each target, Bazel proceeds to plan what commands it will run during the build. To do so, it computes the action graph – a bipartite graph whose nodes are the actions (roughly speaking, commands to be run) and artifacts (files that are inputs or outputs for actions).

The action graph includes all intermediate actions and artifacts. Targets do not have a one-to-one correspondence with actions; some targets may need more than one action to build, and others may need none.

For example, for the targets above (compiling a C++ program), the graph would contain actions for compiling the sources and linking the final executable, as well as intermediate object files and the final C++ program:

```mermaid
flowchart
  app[/app/app/]
   --> linkApp["CppLink (//app:app)"]
   --> main.o[/app/_objs/main/main.o/]
   --> compileMain["CppCompile (//app:app)"]
   --> main.cc[/app/main.cc/]
  linkApp
   --> utils.o[/library/_objs/utils/utils.o/]
   --> compileUtils["CppCompile (//library:utils)"]
   --> utils.cc[/library/utils.cc/]
```

This analysis phase is done in preparation for actually running the build, but nothing is run yet. The action graph is passed as input to the execution phase.
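The bipartite action graph above can be sketched as actions with input and output artifacts, from which a valid execution order follows. This is a simplified illustration under assumed names, not Bazel's scheduler:

```python
# Sketch of a bipartite action graph for the C++ example (not Bazel's
# internal data structures). Each action lists the artifacts it reads
# and the artifacts it produces.
actions = {
    "CppCompile //app:app":       {"inputs": ["app/main.cc"],
                                   "outputs": ["app/_objs/main/main.o"]},
    "CppCompile //library:utils": {"inputs": ["library/utils.cc"],
                                   "outputs": ["library/_objs/utils/utils.o"]},
    "CppLink //app:app":          {"inputs": ["app/_objs/main/main.o",
                                              "library/_objs/utils/utils.o"],
                                   "outputs": ["app/app"]},
}

def execution_order(actions):
    """Return a topological order: an action runs once all its inputs exist."""
    # Source artifacts are the inputs that no action produces.
    available = ({a for act in actions.values() for a in act["inputs"]}
                 - {a for act in actions.values() for a in act["outputs"]})
    order, pending = [], dict(actions)
    while pending:  # terminates because the action graph is acyclic
        runnable = [n for n, a in pending.items()
                    if all(i in available for i in a["inputs"])]
        for name in runnable:
            order.append(name)
            available.update(pending.pop(name)["outputs"])
    return order

print(execution_order(actions))
```

Note how the two compile actions become runnable immediately (their inputs are source files), while the link action must wait for both object files.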

Execution

Once Bazel has determined the action graph, it can start executing the actions in the right order.

Each action consists of its inputs, command line and environment, execution properties and a list of output files. This results in actions that are well-defined and self-contained, so Bazel can do two things:

  • Store and look up the results of the precise action in a cache. As long as none of the action's inputs or properties have changed, the outputs should (ideally) be the same, and Bazel may be able to just fetch them from the cache.

    Bazel supports delegating this to a remote caching service. Such services consist of an action cache, mapping actions to metadata on their results, and a content-addressable store, which stores actions' inputs and outputs.

  • Determine which of several strategies to use for executing each action.

These execution strategies can be local or remote. Examples of strategies:

  • The standalone strategy executes actions' command lines as local subprocesses.
  • Sandbox strategies set up a local environment with the action's inputs, allowing actions to run with some degree of isolation from the local setup and enabling more deterministic builds.
  • The remote strategy sends the action along with its inputs and metadata for execution on a remote execution service.

Bazel implements the Remote Execution API to support storing and looking up action results in a remote cache (remote caching) and sending actions for execution on a remote cluster (remote execution).
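The caching described above relies on actions being well-defined: a cache key can be derived from everything that determines an action's result. The sketch below only illustrates the content-addressing idea; the actual Remote Execution API defines `Action` and `Command` protobuf messages and typically uses SHA-256 digests, not this ad-hoc JSON encoding:

```python
# Hedged sketch of deriving an action cache key (illustrative only; the
# real Remote Execution API hashes canonical protobuf messages).
import hashlib
import json

def digest(data: bytes) -> str:
    """Content-addressable key for a blob."""
    return hashlib.sha256(data).hexdigest()

def action_key(input_digests, command_line, env, output_paths):
    """Derive a cache key from everything that determines an action's result."""
    canonical = json.dumps({
        "inputs": sorted(input_digests),
        "args": list(command_line),
        "env": sorted(env.items()),
        "outputs": sorted(output_paths),
    }, sort_keys=True).encode()
    return digest(canonical)

key1 = action_key([digest(b"int main() {}")],
                  ["gcc", "-c", "main.cc"], {"PATH": "/usr/bin"}, ["main.o"])
key2 = action_key([digest(b"int main() { return 1; }")],  # changed input
                  ["gcc", "-c", "main.cc"], {"PATH": "/usr/bin"}, ["main.o"])
assert key1 != key2  # any change to an input yields a different key
```

Because the key covers inputs, command line, environment, and declared outputs, a cache hit means the stored outputs can be reused safely (assuming the action is deterministic).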

Remote execution and CMake

CMake does not support remote execution directly; however, thanks to tipi.build, it is possible to use it without changing your CMake builds.

For CMake builds, remote compilation and linking actions behave differently from other actions.

Compilation and linking actions can be run directly on the remote execution service, because Goma infers the inputs and outputs of each action. In this case, the remote execution protocol used is the same as the one used by Bazel.

All other actions are executed on a single remote instance using a separate protocol that has less stringent constraints than Bazel's. This instance lives in the same cluster as the other instances, to minimize the load on the network.

```mermaid
flowchart
  workstation[User's workstation or CI] -->|Run CMake build| digital-twin[Single instance RE]
  digital-twin -->|Compile and link| engflow-re[Horizontally scalable RE]
  engflow-re -->|Cache inputs and outputs| S3[(S3)]
```

Remote execution on EngFlow

EngFlow's remote execution service is an implementation of the open-source Remote Execution API. Bazel, or any other client that faithfully implements this protocol, can run builds on EngFlow's remote execution service.