CI Runners for Bazel Builds

Speed up your Bazel builds with EngFlow CI runners, which execute jobs in containers that keep a warm Bazel instance as well as a warm Bazel cache. When you opt into this feature, jobs from the supported CI providers can execute in containers running on EngFlow workers in your existing cluster. With warm Bazel, the initialization time for most jobs is significantly reduced because the container running the job has a cached file system and in-memory state from earlier CI jobs, even if the worker running it was newly instantiated.

We use official runner agents, and the workers in the cluster function like self-hosted, ephemeral CI runner machines. Using these workers provides efficiency gains from being co-located with your remote execution and caching instances.

EngFlow CI runners are currently available for the following CI providers:

  • GitHub Actions
  • Buildkite

What is a Bento?

As part of the initial setup to use EngFlow CI runners, you must create a Bento. A Bento describes everything needed to run a particular CI job: the Git repository to be cloned, the host architecture (x86-64 or arm64), and the container image that has all the tools (e.g. Bazel) required to execute the CI job. After creating a Bento, you can reference it from GitHub Actions or Buildkite job labels.

Example: Suppose you have two repositories hosted on GitHub:

  • MyCompany/dev running jobs on x86-64 architecture
  • MyCompany/prod running CI jobs on x86-64 and arm64 architectures.

To run CI jobs for these repositories on EngFlow CI runners, you would define Bentos containing the following information:

  • name="dev_x64", repo="github.com/MyCompany/dev", image="my.registry/x64/ci_dev@sha256:123(...)"
  • name="prod_x64", repo="github.com/MyCompany/prod", image="my.registry/x64/ci_prod@sha256:456(...)"
  • name="prod_arm64", repo="github.com/MyCompany/prod", image="my.registry/arm64/ci_prod@sha256:789(...)"

Speeding up builds using warm Bazel

When a CI job passes for the first time on an EngFlow CI runner, we create a snapshot of the container. This snapshot includes all changes made to the file system, as well as the process running on the CI runner, i.e., the Bazel server. We store this snapshot in EngFlow's Content-Addressable Storage (CAS). When a subsequent CI job is triggered, the CI runner that picks up the job (even if it is a newly instantiated worker) fetches and installs the snapshot from CAS instead of starting from a cold state. The runner is then immediately ready to execute the job: a warm Bazel server is up and running, and the file system is readily available.
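
The flow amounts to a restore-or-cold-start decision keyed by the Bento. The sketch below is a conceptual model only, not EngFlow's implementation; every name in it is hypothetical, and a plain dict stands in for the CAS.

    # Conceptual model only, not EngFlow's implementation; all names are
    # hypothetical. A dict stands in for the CAS, keyed by Bento name.

    def restore_snapshot(snapshot: str) -> None:
        """Stand-in for restoring the file system and the warm Bazel server."""
        print(f"restored {snapshot}; Bazel server already warm")

    def run_job_cold_and_snapshot() -> str:
        """Stand-in for cloning, cold-starting Bazel, and snapshotting on success."""
        print("cold start; snapshot stored for future jobs")
        return "snapshot-digest-abc"

    def prepare_runner(cas: dict, bento: str) -> str:
        """Bring up a runner for `bento`, preferring a warm snapshot from the CAS."""
        snapshot = cas.get(bento)
        if snapshot is not None:
            restore_snapshot(snapshot)                # warm path
            return "warm"
        cas[bento] = run_job_cold_and_snapshot()      # cold path seeds the CAS
        return "cold"

    cas = {}
    assert prepare_runner(cas, "dev_x64") == "cold"   # first job: cold start
    assert prepare_runner(cas, "dev_x64") == "warm"   # later jobs: warm Bazel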

We can reuse the snapshot any number of times on any worker.

How fast is warm Bazel?

Cold-starting Bazel builds wastes significant time in CI pipelines. Each new CI runner must perform a full and expensive initialization, including parsing the build graph and fetching dependencies.

For Buildkite jobs running on EngFlow CI runners with warm Bazel, our benchmarks show a reduction in startup time from around 3 minutes to around 30 seconds [1]:

  • Cold CI runner machine: 20s to run git clone and 2m 30s to initialize Bazel
  • EngFlow CI runner with warm Bazel: ~30s total time to first action.

For GitHub Actions jobs running on EngFlow CI runners with warm Bazel, our benchmarks show a reduction in startup time from around 3m 10s to around 40 seconds [2]:

  • Cold CI runner machine: 20s to run git clone, 10s internal buffering time for GitHub servers, and 2m 30s to initialize Bazel
  • EngFlow CI runner with warm Bazel: ~40s total time to first action.

Factors influencing startup time

The end-to-end time involves the following variables:

  • Snapshot size (RAM): Reviving the process snapshot currently proceeds at about 0.5 GB/s of Bazel heap size. We incur this cost on each warm run (see the worked example after this list).
  • Machine acquisition: This depends on AWS/GCP inventory.
  • Machine boot: Per current benchmarks, this takes ~12s (on GCP clusters) and ~25s (on AWS clusters).
  • Downloading/installing snapshot: This is limited by network/disk bandwidth. In practice, disk bandwidth is the limiting factor. We recommend using instances with local SSD.
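
As a back-of-the-envelope illustration of how these variables combine, the sketch below estimates time to first action for a hypothetical 5 GB Bazel heap on a GCP cluster. The acquisition and download figures are placeholder assumptions, not benchmark results.

    # Rough estimate combining the variables above. The 5 GB heap and the
    # acquisition/download figures are assumptions for illustration only.

    REVIVAL_RATE_GB_PER_S = 0.5       # snapshot revival throughput (per the list above)
    BOOT_S = {"gcp": 12, "aws": 25}   # machine boot time per current benchmarks

    def warm_start_estimate_s(heap_gb: float, cloud: str,
                              acquisition_s: float, download_s: float) -> float:
        """Approximate time to first action for a warm run on a fresh worker."""
        revival_s = heap_gb / REVIVAL_RATE_GB_PER_S
        return acquisition_s + BOOT_S[cloud] + download_s + revival_s

    # Example: 5 GB Bazel heap on a GCP cluster, assuming ~5 s machine
    # acquisition and ~5 s snapshot download.
    print(warm_start_estimate_s(5, "gcp", acquisition_s=5, download_s=5))  # 32.0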

Benefits of using EngFlow CI runners with warm Bazel

Reduced compute costs: The runner pool auto-scales with the EngFlow cluster. Auto-scaling helps eliminate costs from idle VMs while also guaranteeing that runners will be available when there is a surge in queued jobs.

Eliminate maintenance overhead: We provision and manage the runner fleet, freeing your team from maintaining self-hosted infrastructure.

Lower latency: Since our CI runners are EngFlow worker VMs in the cluster, Bazel runs as close to the cluster as possible. This ensures minimal latency to the remote cache.

Supported cloud providers

CI runners with warm Bazel are available on EngFlow clusters hosted on GCP and AWS.

Supported host platforms (CPU and OS)

Warm Bazel relies on container snapshotting technology that is Linux-only. As such, we only support CI jobs running on Linux hosts, on x86-64 and arm64 architectures.

Known limitations

As of November 2025:

  • Multi-tenancy is not supported.
  • We don't support Docker-in-Docker for local actions. If your build or test actions run sibling containers, these actions must run remotely (e.g. on a worker pool using sysbox) and cannot run locally on the worker that executes the CI agent. For example, if your tests start a database in a Docker container, you have to run these tests using remote execution (see the sketch after this list).
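
For the Docker-in-Docker limitation, one option is to pin container-starting tests to a remote worker pool that supports sibling containers. The BUILD-file sketch below (Starlark) is illustrative only: the "Pool" execution property and the "sysbox" pool name are assumptions that depend on how your cluster's worker pools are configured.

    # Illustrative BUILD-file sketch. The "Pool" execution property and the
    # "sysbox" pool name are assumptions; adjust them to match your cluster's
    # worker-pool configuration.

    sh_test(
        name = "db_integration_test",
        srcs = ["db_integration_test.sh"],
        # Route this test to a remote worker pool that can run sibling
        # containers instead of letting it run locally on the CI runner.
        exec_properties = {"Pool": "sysbox"},
    )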

  [1] Benchmarking data as of August 2025, for an EngFlow repository using Buildkite and the Bazel 9 rolling release.

  [2] Benchmarking data as of August 2025, for an EngFlow repository using GitHub Actions and the Bazel 9 rolling release.