Remote Output Service: place bazel-out/ on a FUSE file system
Builds that yield large output files can consume a lot of network
bandwidth when remote execution is used. To combat this, we have flags
such as --remote_download_minimal to disable downloading of output
files. Unfortunately, this makes it hard to perform ad hoc exploration
of build outputs.

In an attempt to solve this, this change adds an option to Bazel named
--remote_output_service. When enabled, Bazel effectively gives up the
responsibility of managing a bazel-out/ directory. Instead, it calls
into a gRPC service to request a directory and creates a symlink that
points to it.
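
As a rough illustration of the symlink step described above, here is a minimal sketch. The `OutputService` interface and all names in it are hypothetical stand-ins for the gRPC call, not Bazel's actual classes; the writability check reflects the requirement that the returned output path be writable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: none of these names come from Bazel itself.
final class OutputPathLinker {
  /** Stand-in for the gRPC call that asks the service for a directory. */
  interface OutputService {
    Path requestOutputDirectory(String workspaceName) throws IOException;
  }

  /** Points execRoot/bazel-out at the directory handed out by the service. */
  static Path linkBazelOut(Path execRoot, String workspaceName, OutputService service)
      throws IOException {
    Path target = service.requestOutputDirectory(workspaceName);
    if (!Files.isWritable(target)) {
      throw new IOException("output directory is not writable: " + target);
    }
    Path link = execRoot.resolve("bazel-out");
    // Only a stale symlink is replaced. If a real directory is in the way,
    // createSymbolicLink below fails with FileAlreadyExistsException.
    if (Files.isSymbolicLink(link)) {
      Files.delete(link);
    }
    Files.createSymbolicLink(link, target);
    return link;
  }
}
```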

Smart implementations of this gRPC service may use things like FUSE to
let this replacement bazel-out/ directory be lazy-loading, thereby
reducing network bandwidth significantly. In order to create
lazy-loading files and directories, Bazel can call into a BatchCreate()
RPC that takes a list of Output{File,Directory,Symlink} messages,
similar to REv2's ActionResult. This call is also used to create
runfiles directories by providing fictive instances of OutputSymlink.
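
To make the runfiles case more concrete, the sketch below turns a runfiles mapping into a list of symlink entries that could be sent in one batch. `SymlinkEntry` only mimics the shape of REv2's OutputSymlink (a path and a target); it and `toSymlinks` are assumptions for illustration, not the messages defined by this change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: these types only mimic the shape of an OutputSymlink.
final class RunfilesBatch {
  /** Minimal stand-in for a symlink message: a path and the target it points to. */
  static final class SymlinkEntry {
    final String path;
    final String target;

    SymlinkEntry(String path, String target) {
      this.path = path;
      this.target = target;
    }
  }

  /** Builds one symlink entry per runfile so the whole tree fits in one request. */
  static List<SymlinkEntry> toSymlinks(String runfilesDir, Map<String, String> runfiles) {
    List<SymlinkEntry> entries = new ArrayList<>();
    // Iterate in sorted path order so the resulting request is deterministic.
    for (Map.Entry<String, String> e : new TreeMap<>(runfiles).entrySet()) {
      entries.add(new SymlinkEntry(runfilesDir + "/" + e.getKey(), e.getValue()));
    }
    return entries;
  }
}
```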

To prevent Bazel from reading the contents of files stored in the FUSE
file system (which would cause network I/O), the protocol offers a
BatchStat() call that can return information such as the REv2 digest.
Though this is redundant with --unix_digest_hash_attribute_name, there
are a few reasons why I added this feature:

1. For non-Linux operating systems, it may make more sense to use NFSv4
   instead of FUSE (i.e., running a virtual NFS daemon on localhost).
   Even though RFC 8276 adds support for extended attributes to NFSv4,
   not all operating systems implement it.

2. It addresses the security/hermeticity concerns that people raised
   when that flag was added. There is no way to add extended
   attributes to files that can't be tampered with (as a non-root user),
   while using gRPC solves that problem inherently.

3. Callers of Bazel's BatchStat.batchStat() may generate many system
   calls successively. This causes a large number of context switches
   between Bazel and the FUSE daemon. Using gRPC seems to be cheaper.
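
The batching argument in point 3 can be sketched as follows. The `StatService` interface is a hypothetical stand-in for the BatchStat() RPC, with digests simplified to strings; the point is only that many paths are resolved in one round trip rather than one system call per file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: this interface is not the protocol's real API.
final class BatchStatSketch {
  /** Stand-in for the service: one call returns a digest per path, in order. */
  interface StatService {
    List<String> batchStat(List<String> paths);
  }

  /** Resolves digests for all paths with a single service call. */
  static Map<String, String> digests(StatService service, List<String> paths) {
    List<String> results = service.batchStat(new ArrayList<>(paths));
    if (results.size() != paths.size()) {
      throw new IllegalStateException("expected one result per requested path");
    }
    // Preserve request order when pairing paths with their digests.
    Map<String, String> byPath = new LinkedHashMap<>();
    for (int i = 0; i < paths.size(); i++) {
      byPath.put(paths.get(i), results.get(i));
    }
    return byPath;
  }
}
```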

By requiring that the output path returned by the gRPC service is
writable, no-remote actions can still run as before, both with
sandboxing enabled and disabled. The only difference is that they will use
space on the gRPC service side, as opposed to the user's home directory
(though the gRPC service may continue to store data in the user's home
directory).

I have a server implementation written in Go on top of Buildbarn's
storage and file system layer. My plan is to release the code for this
service as soon as I've got a 'thumbs up' on the overall approach.
EdSchouten authored and moroten committed Oct 24, 2023
1 parent 2cb3159 commit 34b8811
Showing 14 changed files with 1,193 additions and 25 deletions.
@@ -0,0 +1,8 @@
package com.google.devtools.build.lib.remote;

import build.bazel.remote.execution.v2.ActionResult;
import com.google.common.util.concurrent.ListenableFuture;

public interface ActionResultDownloader {
  public ListenableFuture<Void> downloadActionResult(ActionResult actionResult);
}
2 changes: 2 additions & 0 deletions src/main/java/com/google/devtools/build/lib/remote/BUILD
@@ -127,6 +127,8 @@ java_library(
"//src/main/java/com/google/devtools/build/skyframe:skyframe-objects",
"//src/main/java/com/google/devtools/common/options",
"//src/main/protobuf:failure_details_java_proto",
"//src/main/protobuf:remote_output_service_java_grpc",
"//src/main/protobuf:remote_output_service_java_proto",
"//src/main/protobuf:spawn_java_proto",
"//third_party:auth",
"//third_party:caffeine",