Adding Suggestions to the Bazel Invocation Analyzer
This guide walks you through the steps to add new suggestions to the Bazel Invocation Analyzer by way of an example. This example covers most of the component types in the Bazel Invocation Architecture, with the notable exception of a provider that consumes external data (such as the Bazel Profile), which is an advanced topic. If you intend to create a provider that consumes a new data source, please study the Bazel Profile as a template.
You should be familiar with the Bazel Invocation Architecture. This will give you the big picture of how the various components fit together.
Note that this walk-through shows only the major aspects of the code. Some of the more mundane (but important) details, such as error handling, are left out for clarity. You can see these details in the actual classes the analyzer uses, which are linked in the appropriate sections below.
The first step to adding a suggestion is to identify a pattern in the data contained in Bazel profiles that indicates an opportunity for improvement. In this example, we will use the garbage collection data contained in Bazel profiles to provide a suggestion if the garbage collection events are taking longer than expected. The Bazel profile contains data about each garbage collection event, including timing:
In this guide, we will create all the components necessary to extract the data and report on excessive Java garbage collection in Bazel.
For our analysis, we need the total duration of major garbage collection events in the BazelProfile. We need to define a Datum to represent this value. Datums implement the Datum interface and simply hold the data they represent:
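The analyzer's own listing is not reproduced here. As a minimal sketch, assuming a simplified stand-in for the Datum interface (the two methods shown are illustrative and may not match the real interface), a Datum holding the total major garbage collection duration could look like this:

```java
import java.time.Duration;
import java.util.Objects;

// Hypothetical stand-in for the analyzer's Datum interface. The real
// interface may declare different methods.
interface Datum {
    String getDescription();

    String getSummary();
}

// Immutable holder for the total duration of major garbage collection.
final class GarbageCollectionStats implements Datum {
    private final Duration majorGarbageCollectionDuration;

    GarbageCollectionStats(Duration majorGarbageCollectionDuration) {
        this.majorGarbageCollectionDuration =
            Objects.requireNonNull(majorGarbageCollectionDuration);
    }

    Duration getMajorGarbageCollectionDuration() {
        return majorGarbageCollectionDuration;
    }

    @Override
    public String getDescription() {
        return "Total duration of major garbage collection events.";
    }

    @Override
    public String getSummary() {
        return "Major GC: " + majorGarbageCollectionDuration.toMillis() + " ms";
    }
}
```

The class does no computation of its own: it only carries a value that some other component has already extracted.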
Next, we'll need a Data Provider which extracts our value from the Bazel profile and makes our Datum available. Data Providers extend the DataProvider abstract base class and implement the getSuppliers() method to register the Datum they provide:
Here we're using a builder to create the DatumSupplierSpecification based on the Datum class we're providing, and passing a method of the provider class that returns the actual data. We're wrapping that method in the memoized helper, which caches the resulting Datum so we don't recalculate it every time another component requests it. Memoization is optional: if, for example, the Datum were a large stream of data, we might not want it memoized.
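To illustrate what memoization buys us, here is a small sketch. The `Memoizer.memoized` helper and the `GarbageCollectionStatsProviderSketch` class are hypothetical stand-ins written for this guide, not the analyzer's own helper or provider; the fixed duration stands in for real profile parsing.

```java
import java.time.Duration;
import java.util.function.Supplier;

final class Memoizer {
    private Memoizer() {}

    // Wraps a supplier so that the underlying computation runs at most once.
    // Later calls return the cached value.
    static <T> Supplier<T> memoized(Supplier<T> delegate) {
        return new Supplier<T>() {
            private T value;
            private boolean computed;

            @Override
            public synchronized T get() {
                if (!computed) {
                    value = delegate.get();
                    computed = true;
                }
                return value;
            }
        };
    }
}

// Hypothetical provider that counts how often the expensive step runs.
final class GarbageCollectionStatsProviderSketch {
    private int computations = 0;
    private final Supplier<Duration> majorGcDuration =
        Memoizer.memoized(this::computeMajorGcDuration);

    Duration getMajorGcDuration() {
        return majorGcDuration.get();
    }

    int getComputations() {
        return computations;
    }

    // Placeholder for the real extraction from the profile.
    private Duration computeMajorGcDuration() {
        computations++;
        return Duration.ofMillis(1200);
    }
}
```

No matter how many components call `getMajorGcDuration()`, the extraction step runs once. For a Datum that wraps a large stream, holding on to the cached value may cost more than recomputing it, which is why memoization is left as a choice.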
To extract the data we're providing, we need to retrieve the Bazel profile from the Data Manager and extract the data we need from it:
Here we're retrieving the events from the garbage collection thread in the profile, filtering them down to just the major garbage collections (by event name and category), and summing the durations. (See the actual implementation for complete details with error handling.)
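The filter-and-sum step can be sketched as follows. `GcEvent` is a hypothetical, simplified event type, and the name and category strings are assumptions about how major garbage collections are labelled in the profile; check the actual implementation for the values the analyzer uses.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical, simplified view of one event on the garbage collection thread.
final class GcEvent {
    final String name;
    final String category;
    final Duration duration;

    GcEvent(String name, String category, Duration duration) {
        this.name = name;
        this.category = category;
        this.duration = duration;
    }
}

final class MajorGcTotals {
    // Assumed labels for major garbage collection events.
    static final String MAJOR_GC_NAME = "major GC";
    static final String GC_CATEGORY = "gc notification";

    private MajorGcTotals() {}

    // Keeps only the major garbage collections and sums their durations.
    static Duration totalMajorGc(List<GcEvent> events) {
        return events.stream()
            .filter(e -> MAJOR_GC_NAME.equals(e.name) && GC_CATEGORY.equals(e.category))
            .map(e -> e.duration)
            .reduce(Duration.ZERO, Duration::plus);
    }
}
```

An empty event list yields `Duration.ZERO`, so callers do not need a special case for profiles without garbage collection events.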
Finally, we need to add our new Data Provider in DataProviderUtil.getAllDataProviders so that it automatically gets registered with the Data Manager.
Now that we have the total major garbage collection time available, we can add the actual suggestion in a Suggestion Provider. Suggestion Providers extend the abstract SuggestionProviderBase class and implement the getSuggestions method.
First, we'll retrieve the Datum provided by the GarbageCollectionStatsDataProvider we created above from the Data Manager, in the same way we retrieved the BazelProfile. We then check the data to see if it warrants a suggestion and, if so, create the suggestion and return it. If the suggestion is not relevant, we simply return an empty list of suggestions:
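The control flow can be sketched like this. The criterion used here (any non-zero major garbage collection time) and the plain strings standing in for suggestion objects are assumptions made for illustration; the analyzer's actual criterion may differ.

```java
import java.time.Duration;
import java.util.List;

final class GcSuggestionCheck {
    private GcSuggestionCheck() {}

    // Returns one suggestion when major garbage collection time was observed,
    // and an empty list when there is nothing to suggest.
    static List<String> getSuggestions(Duration majorGcDuration) {
        if (majorGcDuration.isZero() || majorGcDuration.isNegative()) {
            // Not relevant: report no suggestions rather than an error.
            return List.of();
        }
        return List.of(
            "Increase the Java heap size available to Bazel (major GC took "
                + majorGcDuration.toMillis() + " ms)");
    }
}
```

Returning an empty list keeps the caller simple: it can concatenate the results of all Suggestion Providers without checking for missing values.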
We have a Suggestion Provider Utility class that helps us build suggestions and all the related fields. The createSuggestion helper takes the following elements of a suggestion:
- Title - short title for the suggestion
- Recommendation - this is the body of the recommendation itself, including details
- Potential Improvement (optional) - how much faster could this invocation be if this suggestion is implemented
- Rationale (optional) - why this suggestion is being made based on the profile analyzed
- Caveats (optional) - any stipulations about why this suggestion was made, or other information that could have been useful in validating or improving the suggestion
For this example, we'll simply suggest increasing the Java heap size to give Bazel more memory to work with before garbage collection is necessary. We want to give as much detail as possible, including the Bazel flag (--host_jvm_args) that is used to adjust this size. We also want to give the rationale for why we're making this suggestion.
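To make the elements listed above concrete, here is a sketch of a suggestion with the required and optional parts. `SuggestionSketch` and `HeapSizeSuggestion` are hypothetical types, and the parameter order of this `createSuggestion` is an assumption; the real helper in the Suggestion Provider Utility class may have a different signature.

```java
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical value type mirroring the elements of a suggestion.
final class SuggestionSketch {
    final String title;
    final String recommendation;
    final Optional<String> potentialImprovement;
    final List<String> rationale;
    final List<String> caveats;

    private SuggestionSketch(String title, String recommendation,
            Optional<String> potentialImprovement,
            List<String> rationale, List<String> caveats) {
        this.title = Objects.requireNonNull(title);
        this.recommendation = Objects.requireNonNull(recommendation);
        this.potentialImprovement = Objects.requireNonNull(potentialImprovement);
        this.rationale = List.copyOf(rationale);
        this.caveats = List.copyOf(caveats);
    }

    // Title and recommendation are required; the rest may be empty.
    static SuggestionSketch createSuggestion(String title, String recommendation,
            Optional<String> potentialImprovement,
            List<String> rationale, List<String> caveats) {
        return new SuggestionSketch(
            title, recommendation, potentialImprovement, rationale, caveats);
    }
}

final class HeapSizeSuggestion {
    private HeapSizeSuggestion() {}

    static SuggestionSketch create(String formattedMajorGcTime) {
        return SuggestionSketch.createSuggestion(
            "Increase the Java heap size available to Bazel",
            "Give the Bazel server JVM a larger maximum heap, for example with "
                + "the startup option --host_jvm_args=-Xmx<size>.",
            Optional.empty(),
            List.of("Major garbage collection took " + formattedMajorGcTime
                + " in total."),
            List.of());
    }
}
```

The rationale ties the recommendation back to what was observed in the profile, so a reader can judge whether the advice applies to their invocation.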
To provide information about the potential improvement, we want to compare the time spent in garbage collection to the total duration of the invocation to put it in perspective. Luckily, there's already a Datum that contains this value: TotalDuration. We don't need to know which Data Provider provides it; we just need to retrieve it from the Data Manager:
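The reason a consumer never needs to know the provider is that lookups are keyed by the Datum type. The sketch below shows that idea with a hypothetical `DataManagerSketch` and a stand-in `TotalDuration` class; the method names `register` and `getDatum` are assumptions for this example, not a description of the real Data Manager.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the TotalDuration Datum.
final class TotalDuration {
    private final Duration value;

    TotalDuration(Duration value) {
        this.value = value;
    }

    Duration getValue() {
        return value;
    }
}

// Hypothetical registry that maps a Datum type to whatever supplies it.
final class DataManagerSketch {
    private final Map<Class<?>, Supplier<?>> suppliers = new HashMap<>();

    // Called on behalf of a provider for each Datum type it offers.
    <T> void register(Class<T> type, Supplier<? extends T> supplier) {
        suppliers.put(type, supplier);
    }

    // Consumers ask by type only; the provider stays hidden.
    <T> T getDatum(Class<T> type) {
        Supplier<?> supplier = suppliers.get(type);
        if (supplier == null) {
            throw new IllegalStateException(
                "No provider registered for " + type.getSimpleName());
        }
        return type.cast(supplier.get());
    }
}
```

With this shape, swapping the provider behind `TotalDuration` does not affect any Suggestion Provider that consumes it.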
We can use another method in the Suggestion Provider Utility class to help us create the potential improvement. createPotentialImprovement takes the message we want to present as well as the potential reduction percentage. We'll also use methods in the Duration Utility class both to format the times we have and to calculate the reduction percentage.
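As a worked example of the arithmetic, the sketch below computes the share of the invocation spent in major garbage collection and formats a message. The helper names in `DurationSketch` are hypothetical and are not the Duration Utility API. It also assumes the reduction percentage is an upper bound, reached only if all major garbage collection time were eliminated.

```java
import java.time.Duration;
import java.util.Locale;

final class DurationSketch {
    private DurationSketch() {}

    // Percentage of `whole` taken up by `part`; 0 if `whole` is not positive.
    static double getPercentageOf(Duration part, Duration whole) {
        long wholeMillis = whole.toMillis();
        if (wholeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * part.toMillis() / wholeMillis;
    }

    // Formats a duration as minutes and seconds, e.g. "1m 30s".
    static String formatDuration(Duration duration) {
        long seconds = duration.getSeconds();
        return String.format(Locale.ROOT, "%dm %02ds", seconds / 60, seconds % 60);
    }

    // Builds the message shown alongside the reduction percentage.
    static String potentialImprovementMessage(Duration majorGc, Duration total) {
        return String.format(
            Locale.ROOT,
            "The invocation spent %s of %s (%.1f%%) in major garbage collection.",
            formatDuration(majorGc),
            formatDuration(total),
            getPercentageOf(majorGc, total));
    }
}
```

For 30 seconds of major garbage collection in a 5 minute invocation, the percentage is 100 × 30 / 300 = 10.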
You can see this all come together, along with error handling and an additional suggestion about reducing memory usage, in the actual implementation.
Finally, we need to add our new Suggestion Provider in SuggestionProviderUtil.getAllSuggestionProviders so that it automatically gets applied to profile analysis requests:
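The effect of that registration can be sketched as a central list that every analysis walks through. All types here are hypothetical stand-ins with plain strings as suggestions; the real utility returns the analyzer's own provider types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, minimal provider contract.
interface SuggestionProviderSketch {
    List<String> getSuggestions();
}

final class SuggestionProviderRegistrySketch {
    private SuggestionProviderRegistrySketch() {}

    // Central list of providers; a new provider is added here once.
    static List<SuggestionProviderSketch> getAllSuggestionProviders() {
        List<SuggestionProviderSketch> providers = new ArrayList<>();
        providers.add(() -> List.of("An existing suggestion"));
        // Our new provider joins the list and is picked up automatically.
        providers.add(() -> List.of("Increase the Java heap size"));
        return providers;
    }

    // Applies every registered provider and gathers the results.
    static List<String> collectSuggestions(List<SuggestionProviderSketch> providers) {
        List<String> all = new ArrayList<>();
        for (SuggestionProviderSketch provider : providers) {
            all.addAll(provider.getSuggestions());
        }
        return all;
    }
}
```

Because callers only iterate over the list, nothing else has to change when a provider is added.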
We now have a working Suggestion Provider utilizing the data from our Data Provider!
Running the profile analyzer on a Bazel profile with major garbage collection events will produce output similar to:
If you still have questions, see potential improvements, or would like to provide feedback, you can get in touch with us by:
- Emailing us at email@example.com
- Filing an issue on GitHub
- Using the contact form on www.engflow.com