add option to mangle AC keys with (non-empty) instance names
In some client setups, untracked local files can be used by an action
without being included in the Action message, which causes action cache
collisions: bazelbuild/bazel#4558

Ideally this should be fixed on the client side (either in the client,
or in the build configuration), but it is not always easy to do in
practice.

As a workaround, this patch adds a setting to mangle ActionCache keys
with the instance name provided by the client (if it is not empty), to
produce a new ActionCache key. Clients are then able to specify a
different instance name whenever a change is made that could affect
these untracked inputs. The instance name value could be something like
the hash of the compiler version. This allows multiple ActionCache items
to exist in the cache, without requiring a change to the on-disk storage
format.

This feature is disabled by default, since it would cause cache
invalidations for existing users.

Fixes buchgr#15.
mostynb committed Aug 25, 2020
1 parent 8ac5616 commit d056313
Showing 10 changed files with 224 additions and 115 deletions.
77 changes: 45 additions & 32 deletions README.md
@@ -15,13 +15,21 @@ commodity hardware and AWS servers. Outgoing bandwidth can exceed 15 Gbit/s on t

Cache entries are set and retrieved by key, and there are two types of keys that can be used:
1. Content addressed storage (CAS), where the key is the lowercase SHA256 hash of the stored value.
- The REST API for these entries is: `/cas/<key>` or with an optional but ignored cache pool name: `/<pool>/cas/<key>`.
+ The REST API for these entries is: `/cas/<key>` or with an optional but ignored instance name:
+ `/<instance>/cas/<key>`.
2. Action cache, where the key is an arbitrary 64 character lowercase hexadecimal string.
Bazel uses the SHA256 hash of an action as the key, to store the metadata created by the action.
- The REST API for these entries is: `/ac/<key>` or with an optional cache pool name: `/<pool>/ac/<key>`.
+ The REST API for these entries is: `/ac/<key>` or with an optional instance name: `/<instance>/ac/<key>`.

- Values are stored via HTTP PUT requests, and retrieved via GET requests. HEAD requests can be used to confirm
- whether a key exists or not.
+ Values are stored via HTTP PUT requests, and retrieved via GET requests.
+ HEAD requests can be used to confirm whether a key exists or not.
+
+ If the `--enable_ac_key_instance_mangling` flag is specified and the instance
+ name is not empty, then action cache keys are hashed along with the instance
+ name to produce the action cache lookup key. Since the URL path is processed
+ with Go's [path.Clean](https://golang.org/pkg/path/#Clean) function before
+ extracting the instance name, clients should avoid using repeated slashes,
+ `./` and `../` in the URL.
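
To illustrate why such paths are risky, here is a small Go sketch of how `path.Clean` can change the instance name that a server would see. The helper `instanceFromPath` is hypothetical: it assumes the instance name is whatever precedes the last `/ac/` segment of the cleaned path, which may differ from how bazel-remote actually extracts it.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// instanceFromPath is a hypothetical helper: it cleans the URL path and
// treats everything before the last "/ac/" segment as the instance name.
func instanceFromPath(urlPath string) (string, bool) {
	cleaned := path.Clean(urlPath)
	idx := strings.LastIndex(cleaned, "/ac/")
	if idx < 0 {
		return "", false // not an action cache URL
	}
	return strings.TrimPrefix(cleaned[:idx], "/"), true
}

func main() {
	// Placeholder paths; "k" stands in for a real 64 character key.
	for _, p := range []string{
		"/ac/k",            // no instance name
		"/foo/ac/k",        // instance "foo"
		"/foo//bar/ac/k",   // repeated slash collapses to "foo/bar"
		"/foo/../bar/ac/k", // "../" drops "foo", leaving "bar"
		"/./foo/ac/k",      // "./" is removed, leaving "foo"
	} {
		instance, ok := instanceFromPath(p)
		fmt.Printf("%-18s -> instance %q (ac url: %v)\n", p, instance, ok)
	}
}
```

Two URLs that look different to the client can therefore end up with the same instance name, and hence the same mangled key, once the path has been cleaned.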

Values stored in the action cache are validated as an ActionResult protobuf message as per the
[Bazel Remote Execution API v2](https://github.com/bazelbuild/remote-apis/blob/master/build/bazel/remote/execution/v2/remote_execution.proto)
@@ -67,6 +75,10 @@ bazel-remote also supports the ActionCache, ContentAddressableStorage and Capabi
[Bazel Remote Execution API v2](https://github.com/bazelbuild/remote-apis/blob/master/build/bazel/remote/execution/v2/remote_execution.proto),
and the corresponding parts of the [Byte Stream API](https://github.com/googleapis/googleapis/blob/master/google/bytestream/bytestream.proto).

+ When using the `--enable_ac_key_instance_mangling` feature, clients are
+ advised to avoid repeated slashes, `../` and `./` strings in the instance
+ name, for consistency with the HTTP interface.

### Prometheus Metrics

To query endpoint metrics see [github.com/grpc-ecosystem/go-grpc-prometheus's metrics documentation](https://github.com/grpc-ecosystem/go-grpc-prometheus#metrics).
@@ -102,34 +114,35 @@ DESCRIPTION:
A remote build cache for Bazel.
GLOBAL OPTIONS:
- --config_file value Path to a YAML configuration file. If this flag is specified then all other flags are ignored. [$BAZEL_REMOTE_CONFIG_FILE]
- --dir value Directory path where to store the cache contents. This flag is required. [$BAZEL_REMOTE_DIR]
- --max_size value The maximum size of the remote cache in GiB. This flag is required. (default: -1) [$BAZEL_REMOTE_MAX_SIZE]
- --host value Address to listen on. Listens on all network interfaces by default. [$BAZEL_REMOTE_HOST]
- --port value The port the HTTP server listens on. (default: 8080) [$BAZEL_REMOTE_PORT]
- --grpc_port value The port the gRPC server listens on. Set to 0 to disable. (default: 9092) [$BAZEL_REMOTE_GRPC_PORT]
- --profile_host value A host address to listen on for profiling, if enabled by a valid --profile_port setting. (default: "127.0.0.1") [$BAZEL_REMOTE_PROFILE_HOST]
- --profile_port value If a positive integer, serve /debug/pprof/* URLs from http://profile_host:profile_port. (default: 0, ie profiling disabled) [$BAZEL_REMOTE_PROFILE_PORT]
- --http_read_timeout value The HTTP read timeout for a client request in seconds (does not apply to the proxy backends or the profiling endpoint) (default: 0s, ie disabled) [$BAZEL_REMOTE_HTTP_READ_TIMEOUT]
- --http_write_timeout value The HTTP write timeout for a server response in seconds (does not apply to the proxy backends or the profiling endpoint) (default: 0s, ie disabled) [$BAZEL_REMOTE_HTTP_WRITE_TIMEOUT]
- --htpasswd_file value Path to a .htpasswd file. This flag is optional. Please read https://httpd.apache.org/docs/2.4/programs/htpasswd.html. [$BAZEL_REMOTE_HTPASSWD_FILE]
- --tls_enabled This flag has been deprecated. Specify tls_cert_file and tls_key_file instead. (default: false) [$BAZEL_REMOTE_TLS_ENABLED]
- --tls_cert_file value Path to a pem encoded certificate file. [$BAZEL_REMOTE_TLS_CERT_FILE]
- --tls_key_file value Path to a pem encoded key file. [$BAZEL_REMOTE_TLS_KEY_FILE]
- --idle_timeout value The maximum period of having received no request after which the server will shut itself down. (default: 0s, ie disabled) [$BAZEL_REMOTE_IDLE_TIMEOUT]
- --s3.endpoint value The S3/minio endpoint to use when using S3 cache backend. [$BAZEL_REMOTE_S3_ENDPOINT]
- --s3.bucket value The S3/minio bucket to use when using S3 cache backend. [$BAZEL_REMOTE_S3_BUCKET]
- --s3.prefix value The S3/minio object prefix to use when using S3 cache backend. [$BAZEL_REMOTE_S3_PREFIX]
- --s3.access_key_id value The S3/minio access key to use when using S3 cache backend. [$BAZEL_REMOTE_S3_ACCESS_KEY_ID]
- --s3.secret_access_key value The S3/minio secret access key to use when using S3 cache backend. [$BAZEL_REMOTE_S3_SECRET_ACCESS_KEY]
- --s3.disable_ssl Whether to disable TLS/SSL when using the S3 cache backend. (default: false, ie enable TLS/SSL) [$BAZEL_REMOTE_S3_DISABLE_SSL]
- --s3.iam_role_endpoint value Endpoint for using IAM security credentials. By default it will look for credentials in the standard locations for the AWS platform. [$BAZEL_REMOTE_S3_IAM_ROLE_ENDPOINT]
- --s3.region value The AWS region. Required when not specifying S3/minio access keys. [$BAZEL_REMOTE_S3_REGION]
- --disable_http_ac_validation Whether to disable ActionResult validation for HTTP requests. (default: false, ie enable validation) [$BAZEL_REMOTE_DISABLE_HTTP_AC_VALIDATION]
- --disable_grpc_ac_deps_check Whether to disable ActionResult dependency checks for gRPC GetActionResult requests. (default: false, ie enable ActionCache dependency checks) [$BAZEL_REMOTE_DISABLE_GRPS_AC_DEPS_CHECK]
- --enable_endpoint_metrics Whether to enable metrics for each HTTP/gRPC endpoint. (default: false, ie disable metrics) [$BAZEL_REMOTE_ENABLE_ENDPOINT_METRICS]
- --experimental_remote_asset_api Whether to enable the experimental remote asset API implementation. (default: false, ie disable remote asset API) [$BAZEL_REMOTE_EXPERIMENTAL_REMOTE_ASSET_API]
- --help, -h show help (default: false)
+ --config_file value Path to a YAML configuration file. If this flag is specified then all other flags are ignored. [$BAZEL_REMOTE_CONFIG_FILE]
+ --dir value Directory path where to store the cache contents. This flag is required. [$BAZEL_REMOTE_DIR]
+ --max_size value The maximum size of the remote cache in GiB. This flag is required. (default: -1) [$BAZEL_REMOTE_MAX_SIZE]
+ --host value Address to listen on. Listens on all network interfaces by default. [$BAZEL_REMOTE_HOST]
+ --port value The port the HTTP server listens on. (default: 8080) [$BAZEL_REMOTE_PORT]
+ --grpc_port value The port the gRPC server listens on. Set to 0 to disable. (default: 9092) [$BAZEL_REMOTE_GRPC_PORT]
+ --profile_host value A host address to listen on for profiling, if enabled by a valid --profile_port setting. (default: "127.0.0.1") [$BAZEL_REMOTE_PROFILE_HOST]
+ --profile_port value If a positive integer, serve /debug/pprof/* URLs from http://profile_host:profile_port. (default: 0, ie profiling disabled) [$BAZEL_REMOTE_PROFILE_PORT]
+ --http_read_timeout value The HTTP read timeout for a client request in seconds (does not apply to the proxy backends or the profiling endpoint) (default: 0s, ie disabled) [$BAZEL_REMOTE_HTTP_READ_TIMEOUT]
+ --http_write_timeout value The HTTP write timeout for a server response in seconds (does not apply to the proxy backends or the profiling endpoint) (default: 0s, ie disabled) [$BAZEL_REMOTE_HTTP_WRITE_TIMEOUT]
+ --htpasswd_file value Path to a .htpasswd file. This flag is optional. Please read https://httpd.apache.org/docs/2.4/programs/htpasswd.html. [$BAZEL_REMOTE_HTPASSWD_FILE]
+ --tls_enabled This flag has been deprecated. Specify tls_cert_file and tls_key_file instead. (default: false) [$BAZEL_REMOTE_TLS_ENABLED]
+ --tls_cert_file value Path to a pem encoded certificate file. [$BAZEL_REMOTE_TLS_CERT_FILE]
+ --tls_key_file value Path to a pem encoded key file. [$BAZEL_REMOTE_TLS_KEY_FILE]
+ --idle_timeout value The maximum period of having received no request after which the server will shut itself down. (default: 0s, ie disabled) [$BAZEL_REMOTE_IDLE_TIMEOUT]
+ --s3.endpoint value The S3/minio endpoint to use when using S3 cache backend. [$BAZEL_REMOTE_S3_ENDPOINT]
+ --s3.bucket value The S3/minio bucket to use when using S3 cache backend. [$BAZEL_REMOTE_S3_BUCKET]
+ --s3.prefix value The S3/minio object prefix to use when using S3 cache backend. [$BAZEL_REMOTE_S3_PREFIX]
+ --s3.access_key_id value The S3/minio access key to use when using S3 cache backend. [$BAZEL_REMOTE_S3_ACCESS_KEY_ID]
+ --s3.secret_access_key value The S3/minio secret access key to use when using S3 cache backend. [$BAZEL_REMOTE_S3_SECRET_ACCESS_KEY]
+ --s3.disable_ssl Whether to disable TLS/SSL when using the S3 cache backend. (default: false, ie enable TLS/SSL) [$BAZEL_REMOTE_S3_DISABLE_SSL]
+ --s3.iam_role_endpoint value Endpoint for using IAM security credentials. By default it will look for credentials in the standard locations for the AWS platform. [$BAZEL_REMOTE_S3_IAM_ROLE_ENDPOINT]
+ --s3.region value The AWS region. Required when not specifying S3/minio access keys. [$BAZEL_REMOTE_S3_REGION]
+ --disable_http_ac_validation Whether to disable ActionResult validation for HTTP requests. (default: false, ie enable validation) [$BAZEL_REMOTE_DISABLE_HTTP_AC_VALIDATION]
+ --disable_grpc_ac_deps_check Whether to disable ActionResult dependency checks for gRPC GetActionResult requests. (default: false, ie enable ActionCache dependency checks) [$BAZEL_REMOTE_DISABLE_GRPS_AC_DEPS_CHECK]
+ --enable_ac_key_instance_mangling Whether to enable mangling ActionCache keys with non-empty instance names. (default: false, ie disable mangling) [$BAZEL_REMOTE_ENABLE_AC_KEY_INSTANCE_MANGLING]
+ --enable_endpoint_metrics Whether to enable metrics for each HTTP/gRPC endpoint. (default: false, ie disable metrics) [$BAZEL_REMOTE_ENABLE_ENDPOINT_METRICS]
+ --experimental_remote_asset_api Whether to enable the experimental remote asset API implementation. (default: false, ie disable remote asset API) [$BAZEL_REMOTE_EXPERIMENTAL_REMOTE_ASSET_API]
+ --help, -h show help (default: false)
```

### Example configuration file
21 changes: 21 additions & 0 deletions cache/cache.go
@@ -1,6 +1,8 @@
package cache

import (
+	"crypto/sha256"
+	"encoding/hex"
	"io"
)

@@ -64,3 +66,22 @@ type Proxy interface {
// unknown).
Contains(kind EntryKind, hash string) (bool, int64)
}

+// TransformActionCacheKey takes an ActionCache key and an instance name
+// and returns a new ActionCache key to use instead. If the instance name
+// is empty, then the original key is returned unchanged.
+func TransformActionCacheKey(key, instance string, logger Logger) string {
+	if instance == "" {
+		return key
+	}
+
+	h := sha256.New()
+	h.Write([]byte(key))
+	h.Write([]byte(instance))
+	b := h.Sum(nil)
+	newKey := hex.EncodeToString(b[:])
+
+	logger.Printf("REMAP AC HASH %s : %s => %s", key, instance, newKey)
+
+	return newKey
+}
99 changes: 54 additions & 45 deletions config/config.go
@@ -36,58 +36,67 @@ type HTTPBackendConfig struct {

// Config holds the top-level configuration for bazel-remote.
type Config struct {
-	Host string `yaml:"host"`
-	Port int `yaml:"port"`
-	GRPCPort int `yaml:"grpc_port"`
-	ProfileHost string `yaml:"profile_host"`
-	ProfilePort int `yaml:"profile_port"`
-	Dir string `yaml:"dir"`
-	MaxSize int `yaml:"max_size"`
-	HtpasswdFile string `yaml:"htpasswd_file"`
-	TLSCertFile string `yaml:"tls_cert_file"`
-	TLSKeyFile string `yaml:"tls_key_file"`
-	S3CloudStorage *S3CloudStorageConfig `yaml:"s3_proxy"`
-	GoogleCloudStorage *GoogleCloudStorageConfig `yaml:"gcs_proxy"`
-	HTTPBackend *HTTPBackendConfig `yaml:"http_proxy"`
-	IdleTimeout time.Duration `yaml:"idle_timeout"`
-	DisableHTTPACValidation bool `yaml:"disable_http_ac_validation"`
-	DisableGRPCACDepsCheck bool `yaml:"disable_grpc_ac_deps_check"`
-	EnableEndpointMetrics bool `yaml:"enable_endpoint_metrics"`
-	ExperimentalRemoteAssetAPI bool `yaml:"experimental_remote_asset_api"`
-	HTTPReadTimeout time.Duration `yaml:"http_read_timeout"`
-	HTTPWriteTimeout time.Duration `yaml:"http_write_timeout"`
+	Host string `yaml:"host"`
+	Port int `yaml:"port"`
+	GRPCPort int `yaml:"grpc_port"`
+	ProfileHost string `yaml:"profile_host"`
+	ProfilePort int `yaml:"profile_port"`
+	Dir string `yaml:"dir"`
+	MaxSize int `yaml:"max_size"`
+	HtpasswdFile string `yaml:"htpasswd_file"`
+	TLSCertFile string `yaml:"tls_cert_file"`
+	TLSKeyFile string `yaml:"tls_key_file"`
+	S3CloudStorage *S3CloudStorageConfig `yaml:"s3_proxy"`
+	GoogleCloudStorage *GoogleCloudStorageConfig `yaml:"gcs_proxy"`
+	HTTPBackend *HTTPBackendConfig `yaml:"http_proxy"`
+	IdleTimeout time.Duration `yaml:"idle_timeout"`
+	DisableHTTPACValidation bool `yaml:"disable_http_ac_validation"`
+	DisableGRPCACDepsCheck bool `yaml:"disable_grpc_ac_deps_check"`
+	EnableACKeyInstanceMangling bool `yaml:"enable_ac_key_instance_mangling"`
+	EnableEndpointMetrics bool `yaml:"enable_endpoint_metrics"`
+	ExperimentalRemoteAssetAPI bool `yaml:"experimental_remote_asset_api"`
+	HTTPReadTimeout time.Duration `yaml:"http_read_timeout"`
+	HTTPWriteTimeout time.Duration `yaml:"http_write_timeout"`
}

// New returns a validated Config with the specified values, and an error
// if there were any problems with the validation.
func New(dir string, maxSize int, host string, port int, grpcPort int,
-	profileHost string, profilePort int, htpasswdFile string,
-	tlsCertFile string, tlsKeyFile string, idleTimeout time.Duration,
-	s3 *S3CloudStorageConfig, disableHTTPACValidation bool,
-	disableGRPCACDepsCheck bool, enableEndpointMetrics bool,
+	profileHost string, profilePort int,
+	htpasswdFile string,
+	tlsCertFile string,
+	tlsKeyFile string,
+	idleTimeout time.Duration,
+	s3 *S3CloudStorageConfig,
+	disableHTTPACValidation bool,
+	disableGRPCACDepsCheck bool,
+	enableACKeyInstanceMangling bool,
+	enableEndpointMetrics bool,
	experimentalRemoteAssetAPI bool,
-	httpReadTimeout time.Duration, httpWriteTimeout time.Duration) (*Config, error) {
+	httpReadTimeout time.Duration,
+	httpWriteTimeout time.Duration) (*Config, error) {
c := Config{
-		Host: host,
-		Port: port,
-		GRPCPort: grpcPort,
-		ProfileHost: profileHost,
-		ProfilePort: profilePort,
-		Dir: dir,
-		MaxSize: maxSize,
-		HtpasswdFile: htpasswdFile,
-		TLSCertFile: tlsCertFile,
-		TLSKeyFile: tlsKeyFile,
-		S3CloudStorage: s3,
-		GoogleCloudStorage: nil,
-		HTTPBackend: nil,
-		IdleTimeout: idleTimeout,
-		DisableHTTPACValidation: disableHTTPACValidation,
-		DisableGRPCACDepsCheck: disableGRPCACDepsCheck,
-		EnableEndpointMetrics: enableEndpointMetrics,
-		ExperimentalRemoteAssetAPI: experimentalRemoteAssetAPI,
-		HTTPReadTimeout: httpReadTimeout,
-		HTTPWriteTimeout: httpWriteTimeout,
+		Host: host,
+		Port: port,
+		GRPCPort: grpcPort,
+		ProfileHost: profileHost,
+		ProfilePort: profilePort,
+		Dir: dir,
+		MaxSize: maxSize,
+		HtpasswdFile: htpasswdFile,
+		TLSCertFile: tlsCertFile,
+		TLSKeyFile: tlsKeyFile,
+		S3CloudStorage: s3,
+		GoogleCloudStorage: nil,
+		HTTPBackend: nil,
+		IdleTimeout: idleTimeout,
+		DisableHTTPACValidation: disableHTTPACValidation,
+		DisableGRPCACDepsCheck: disableGRPCACDepsCheck,
+		EnableACKeyInstanceMangling: enableACKeyInstanceMangling,
+		EnableEndpointMetrics: enableEndpointMetrics,
+		ExperimentalRemoteAssetAPI: experimentalRemoteAssetAPI,
+		HTTPReadTimeout: httpReadTimeout,
+		HTTPWriteTimeout: httpWriteTimeout,
}

err := validateConfig(&c)
28 changes: 15 additions & 13 deletions config/config_test.go
@@ -19,6 +19,7 @@ htpasswd_file: /opt/.htpasswd
tls_cert_file: /opt/tls.cert
tls_key_file: /opt/tls.key
disable_http_ac_validation: true
+enable_ac_key_instance_mangling: true
enable_endpoint_metrics: true
experimental_remote_asset_api: true
http_read_timeout: 5s
@@ -31,19 +32,20 @@ http_write_timeout: 10s
}

expectedConfig := &Config{
-		Host: "localhost",
-		Port: 8080,
-		GRPCPort: 9092,
-		Dir: "/opt/cache-dir",
-		MaxSize: 100,
-		HtpasswdFile: "/opt/.htpasswd",
-		TLSCertFile: "/opt/tls.cert",
-		TLSKeyFile: "/opt/tls.key",
-		DisableHTTPACValidation: true,
-		EnableEndpointMetrics: true,
-		ExperimentalRemoteAssetAPI: true,
-		HTTPReadTimeout: 5 * time.Second,
-		HTTPWriteTimeout: 10 * time.Second,
+		Host: "localhost",
+		Port: 8080,
+		GRPCPort: 9092,
+		Dir: "/opt/cache-dir",
+		MaxSize: 100,
+		HtpasswdFile: "/opt/.htpasswd",
+		TLSCertFile: "/opt/tls.cert",
+		TLSKeyFile: "/opt/tls.key",
+		DisableHTTPACValidation: true,
+		EnableACKeyInstanceMangling: true,
+		EnableEndpointMetrics: true,
+		ExperimentalRemoteAssetAPI: true,
+		HTTPReadTimeout: 5 * time.Second,
+		HTTPWriteTimeout: 10 * time.Second,
}

if !reflect.DeepEqual(config, expectedConfig) {