Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade bazel-ios-fork to v2.9.0 #8

Draft
wants to merge 218 commits into
base: bazel-ios-fork
Choose a base branch
from

Conversation

chenj-hub
Copy link

No description provided.

werkt and others added 30 commits June 2, 2023 20:37
Throwable indicates that the response to getMessage() may be null.
Removed short circuit for executeWorkers which should never inspire
publish
Added memoized recentExecuteWorkers which will be delayed by currently
const workerSetMaxAge
Cleaned up getStorageWorkers, used grpc Deadline as premium, though
already expired is awkward

Storage removal is overzealous and will remove matching execute workers.
Received worker changes continue to only affect storage.
Adjustments will make use of options. Ensure that they have the
specified values from the commandline, and they have no checked
exception throws.
FMBs over a specified size limit (4MB in practice from grpc limits) will
be split log2n until under the limit to make requests.

Tests added to verify this. Coverage of split behavior confirmed.

Fixes buildfarm#1375
A directory which is missing during the course of validation should not
be identified as missing. This prevents a NPE and inspires validation to
emit one MISSING violation per path to a missing directory for
preconditions.

Fixes buildfarm#1374
Directories reevaluated only under the enumeration hierarchy must still
be guarded against empty child directories in their checks, and must
handle child directories missing in the index safely, with precondition
failures matching their outputs. Order is not guaranteed in precondition
output, but tests now guard this case.

Fixes buildfarm#1299
Include Action Mnemonic, Target Id, and Configuration Id in the bf-cat
output suite for RequestMetadata.
An invocation of app.run which fails for any reason in the spring
framework will exit silently. Ensure that errors are presented before
exiting the application.
* Enable custom latency buckets

* run formatter

* Remove unused load from build

* Run buildifier

* update example config with infinity bucket

---------

Co-authored-by: Trevor Hickey <[email protected]>
Avoid fileStore recalculation for entire trees to be deleted, instead
expect the callers to provide a fileStore, and that the entire tree
exists within it.
ExceutionExceptions wrap the actual exceptions of futures experienced
during putDirectory, and add no tracing capacity. Unwrap these when
thrown from failed futures.
Bleed grpc exposure into a retrier for copyExternalInput invocations,
and ensure that enough bytes have been provided from the requested blob
before returning.
The only presence of arbitrary symlinks in the CAS Filesystem is under
directories. Symlinks are explicitly identified as
non-readonly-executables. Prevent a dead symlink from throwing NSFE due
to the readonly check.
Files are delivered via readdir in utf8 encoding (on linux for xfs at
least), assume that posix will mandate this.
)

This reverts mistakenly added checks for non-null deadlines in StubInstance.
* Guard against fetchBlobFromWorker orphanization

Any exception thrown by fetchBlobFromWorker will leave the blobObserver
hanging. Ensure that the observer sees a failure and does not hang.

* Restore FMB functionality included in revert.
Handle ExecDirExceptions in InputFetcher such that the client can
observe a FAILED_PRECONDITION with PreconditionFailures that include
Violations with VIOLATION_TYPE_MISSING to inspire virtuous loop
reestablishment. A ViolationException acts as a container for this, and
can be extended to include the components of a putDirectory. Interpret
PutDirectoryExceptions with their imposed violations. Missing inputs at
the time of execute, whether linked as immediate files or through the
directory cache, will be interpreted as VIOLATION_TYPE_MISSING.
### Problem 
When workers die, their stored references are not removed from the backplane. This creates the possibility that new workers may come up with the same IP address or use an IP address previously used by another terminated host. As a result, the backplane becomes unreliable, requiring us to query each worker individually to find missing blobs. Clearly, this approach is not scalable since any problems encountered by a single worker can significantly impact the performance of the buildfarm.

### Past Work
We made code modifications for the `findMissingBlobs` function to exclusively query the backplane, prs: buildfarm#1310, buildfarm#1333, and buildfarm#1342. This update implemented the `findMissingViaBackplane` flag. However, the above issues made the `findMissingViaBackplane` flag ineffective.

### Solution
To address the issue of imposter workers, updated code to compare the start time of each worker (first_registered_at) with the insertion time of the digest. Any worker whose start time is later than the digest insertion time is considered an imposter worker. Also, the code removes imposter workers associated with the digest in the same function call.

**first_registered_at**: Added new field first_registered_at to the worker data type. This field stores the initial start time of the worker. Worker  informs the backplane about its start time, which is the same as the creation time of the cache directory (where all digests are stored) on the worker's disk.

**digest insert time**: The digest insertion time is calculated using the Time to Live (TTL) of the digest and the casExpire time. The formula for determining the digest insertion time is now() - configured casExpire + remaining ttl. In the current implementation, each worker updates the TTL of the digest upon completing the write operation. This means that the cas insert time in the backplane corresponds to the time when the last worker finished writing the digest on its disk.

### Testing
Deployed the change to our buildfarm staging, and ran full monorepo build. To make sure that the code change solve terminated worker problem, terminated bunch of workers in the middle of build. This caused temporary not_found `error`, which eventually faded away (fmb call autocorrect blob location).
<img width="1385" alt="Screenshot 2023-06-21 at 12 36 47 PM" src="https://github.com/bazelbuild/bazel-buildfarm/assets/119983081/62fcf8e0-847a-4632-b49e-cef2c17321dc">
In the above graph terminated workers during first build.


### Future Improvement
The above solution might not work if user updates `cas_expire` time between two deployments as algorithm to calculate `digest_insert_time` depends to `cas_expire` time. 

closes buildfarm#1371
When a server cannot acquire a transform token for an extended period of
time, assume that it is malfunctioning and initiate a shutdown.
Worker loss can signal cascading failure and shutdown for execute-only
peers. Ensure that a ReportResultStage seeing an SRE does not close the
stage, and that there are basic retries for remote uploads.
Works with current redis-py-cluster 2.1.3
werkt and others added 23 commits January 3, 2024 10:50
* Guard against writeObserver null race

* Avoid cancellation log for StubWriteOutputStream

Cancels will happen for all server->worker uploads on context cancels
initiated by clients, and are normal behaviors.

* Guarantee null write response for onCompleted

Avoid a complaint by gRPC that a client-streaming request was
completed without a response

* Reset remote CAS write on initial

Prevents the StubWriteOutputStream from issuing an unnecessary initial
queryWriteStatus.
These instances of `format(...)` do not have placeholders and there's
nothing to format.
* chore: Update proto file styling

* fix path

* move protp styling to file

* add new line at eof
…dfarm#1549)

* fix: Periodically Refresh Active Storage Workers With startTime
Node name strings provided via `cluster slots` will be byte arrays
that require decoding. Use the inbuilt SafeEncoder from jedis which is
used to decode all other strings.
* build: start adopting bzlmod

Only four bazel dependencies were found in the existing bzlmod registry (https://registry.bazel.build/)

* build: swap io_bazel_rules_go for bzlmod

* build: swap gazelle for bzlmod

* tests: support bzlmod

I don't know why. But it works.

* build: leave breadcrumbs for bzlmod migration

* build(chore): MODULE.bazel.lock

* build: swap buildtools for buildifier_prebuilt

There are conflicts with go tooling between buildtools and protobuf.
The awk was pulling out the denominator of the metric, not the percent.
Adjust the field.

Before:
```
current line coverage:  1625%
current function coverage:  340%
```

after:
```
current line coverage:  42%
current function coverage:  51%
```
… entire repo with tags: `helm/X.Y.Z-b1` (buildfarm#1602)

* add helm chart linting stage, bundle chart on helm/* tags

* fix triggers

* try try again

* always lint before bundling

* extract the helm chart version from the git tag ref

* fix name

* tickle the beast

* too much stack overflow

* better names

* that's output

* gussy up readme

* simplify

* stage 0.2.2

* put extraVolumeMounts under .Values.shardWorker

* omit defaults

* leave out emacs gitignore

* wrong example
…ildfarm#1606)

* Publish storage worker and execute worker pool size in prometheus

* run formatter

* add doc

---------

Co-authored-by: Yuriy Belenitsky <[email protected]>
Avoid stripping the existingHash, used to calculate the slot with suffix
modifiers for queue balancing. Client presentation may also move to
presenting each balanced queue name, removing the coordinated
calculation on the client.

The name could only have been used by a client prior for name
presentation or determination of queue names in redis from slot
arrangement.
* set expire and drop invocationId

* fix format

* patch configuration doc

* fix typo

* re-run ci use luxe's patch

* set expire when every active operation insert

* fix return code

* fix ci if statement

---------

Co-authored-by: wangpengfei.pfwang <[email protected]>
Co-authored-by: George Gensure <[email protected]>
* feat: add OSSF Scorecards workflow

Adding OSSF Scorecards workflow as github action. It runs periodially.  Output goes into Code Scanning alerts of GitHub.

* docs(README): add some badges

- OSSF Scorecard
- License (from GitHub)
- latest Release (from GitHub)
Remove rules_oss_audit and accompanying documentation
* fix template bugs with with

* stage chart v0.2.3
* rework for bitnami redis helm chart

* still doesn't work tho less specified

* disable redis pwd -- not for production use

* chart/bitnami-redis:18.14.2

* and later

* stage chart v0.3.0
@chenj-hub chenj-hub force-pushed the jackies/upgrade-bazel-buildfarm-to-v2.9.0 branch 3 times, most recently from 1141041 to 3977b90 Compare August 14, 2024 18:02
@chenj-hub chenj-hub changed the title Jackies/upgrade bazel buildfarm to v2.9.0 Upgrade bazel-ios-fork to v2.9.0 Aug 14, 2024
@chenj-hub
Copy link
Author

chenj-hub commented Aug 14, 2024

Resolved git conflict between v.2.9.0 and current bazel-ios-fork from running: git show d6c1859 --remerge-diff:

commit d6c1859fc818f183cc87a175791205cced4e787d
Merge: 0090ef07 14810a42
Author: Jackie Springstead-Chen <[email protected]>
Date:   Wed Aug 14 12:58:16 2024 -0400

    Merge commit '14810a42ec8ea1fd65009d1183dd37a8ace0a0e3' into jackies/upgrade-bazel-buildfarm-to-v2.9.0

diff --git a/defs.bzl b/defs.bzl
remerge CONFLICT (content): Merge conflict in defs.bzl
index 88bbd2e1..a115ad28 100644
--- a/defs.bzl
+++ b/defs.bzl
@@ -11,11 +11,7 @@ load(
 load("@io_grpc_grpc_java//:repositories.bzl", "IO_GRPC_GRPC_JAVA_OVERRIDE_TARGETS", "grpc_java_repositories")
 load("@com_google_protobuf//:protobuf_deps.bzl", "protobuf_deps")
 load("@com_grail_bazel_toolchain//toolchain:rules.bzl", "llvm_toolchain")
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-load("@io_bazel_rules_k8s//k8s:k8s.bzl", "k8s_repositories")
 load("@rules_pkg//:deps.bzl", "rules_pkg_dependencies")
-=======
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
 
 IO_NETTY_MODULES = [
     "buffer",
@@ -71,17 +67,6 @@ def buildfarm_init(name = "buildfarm"):
                         "com.github.jnr:jnr-ffi:2.2.14",
                         "com.github.jnr:jnr-posix:3.1.17",
                         "com.github.pcj:google-options:1.0.0",
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-                        "com.github.serceman:jnr-fuse:0.5.5",
-                        "com.github.luben:zstd-jni:1.5.5-7",
-                        "com.github.oshi:oshi-core:6.4.0",
-                        "com.google.auth:google-auth-library-credentials:0.9.1",
-                        "com.google.auth:google-auth-library-oauth2-http:0.9.1",
-                        "com.google.code.findbugs:jsr305:3.0.1",
-                        "com.google.code.gson:gson:2.9.0",
-                        "com.google.errorprone:error_prone_annotations:2.9.0",
-                        "com.google.errorprone:error_prone_core:0.92",
-=======
                         "com.github.serceman:jnr-fuse:0.5.7",
                         "com.github.luben:zstd-jni:1.5.5-7",
                         "com.github.oshi:oshi-core:6.4.5",
@@ -91,7 +76,6 @@ def buildfarm_init(name = "buildfarm"):
                         "com.google.code.gson:gson:2.10.1",
                         "com.google.errorprone:error_prone_annotations:2.22.0",
                         "com.google.errorprone:error_prone_core:2.22.0",
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
                         "com.google.guava:failureaccess:1.0.1",
                         "com.google.guava:guava:32.1.1-jre",
                         "com.google.j2objc:j2objc-annotations:2.8",
@@ -149,13 +133,8 @@ def buildfarm_init(name = "buildfarm"):
 
     grpc_java_repositories()
 
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-    k8s_repositories()
-
     rules_pkg_dependencies()
 
-=======
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     native.bind(
         name = "jar/redis/clients/jedis",
         actual = "@jedis//jar",
diff --git a/deps.bzl b/deps.bzl
remerge CONFLICT (content): Merge conflict in deps.bzl
index 1c5d80e8..9844cb88 100644
--- a/deps.bzl
+++ b/deps.bzl
@@ -7,37 +7,12 @@ load("@bazel_tools//tools/build_defs/repo:utils.bzl", "maybe")
 
 def archive_dependencies(third_party):
     return [
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-        {
-            "name": "platforms",
-            "urls": [
-                "https://mirror.bazel.build/github.com/bazelbuild/platforms/releases/download/0.0.6/platforms-0.0.6.tar.gz",
-                "https://github.com/bazelbuild/platforms/releases/download/0.0.6/platforms-0.0.6.tar.gz",
-            ],
-            "sha256": "5308fc1d8865406a49427ba24a9ab53087f17f5266a7aabbfc28823f3916e1ca",
-        },
-        {
-            "name": "rules_jvm_external",
-            "strip_prefix": "rules_jvm_external-%s" % RULES_JVM_EXTERNAL_TAG,
-            "sha256": RULES_JVM_EXTERNAL_SHA,
-            "url": "https://github.com/bazelbuild/rules_jvm_external/archive/%s.zip" % RULES_JVM_EXTERNAL_TAG,
-        },
         {
             "name": "rules_pkg",
             "sha256": "8a298e832762eda1830597d64fe7db58178aa84cd5926d76d5b744d6558941c2",
             "url": "https://mirror.bazel.build/github.com/bazelbuild/rules_pkg/releases/download/0.7.0/rules_pkg-0.7.0.tar.gz",
         },
 
-        # Kubernetes rules.  Useful for local development with tilt.
-        {
-            "name": "io_bazel_rules_k8s",
-            "strip_prefix": "rules_k8s-0.7",
-            "url": "https://github.com/bazelbuild/rules_k8s/archive/refs/tags/v0.7.tar.gz",
-            "sha256": "ce5b9bc0926681e2e7f2147b49096f143e6cbc783e71bc1d4f36ca76b00e6f4a",
-        },
-
-=======
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
         # Needed for "well-known protos" and @com_google_protobuf//:protoc.
         {
             "name": "com_google_protobuf",
@@ -88,11 +63,7 @@ def archive_dependencies(third_party):
             "name": "com_grail_bazel_toolchain",
             "sha256": "b2d168315dd0785f170b2b306b86e577c36e812b8f8b05568f9403141f2c24dd",
             "strip_prefix": "toolchains_llvm-0.9",
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-            "url": "https://github.com/grailbio/bazel-toolchain/archive/refs/tags/0.9.tar.gz",
-=======
             "url": "https://github.com/bazel-contrib/toolchains_llvm/archive/refs/tags/0.9.tar.gz",
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
             "patch_args": ["-p1"],
             "patches": ["%s:clang_toolchain.patch" % third_party],
         },
diff --git a/src/main/java/build/buildfarm/cas/cfc/CASFileCache.java b/src/main/java/build/buildfarm/cas/cfc/CASFileCache.java
remerge CONFLICT (content): Merge conflict in src/main/java/build/buildfarm/cas/cfc/CASFileCache.java
index e066a698..3f4d9b91 100644
--- a/src/main/java/build/buildfarm/cas/cfc/CASFileCache.java
+++ b/src/main/java/build/buildfarm/cas/cfc/CASFileCache.java
@@ -363,25 +363,7 @@ public abstract class CASFileCache implements ContentAddressableStorage {
     this.delegate = delegate;
     this.delegateSkipLoad = delegateSkipLoad;
     this.directoriesIndexDbName = directoriesIndexDbName;
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
     this.keyReferences = Maps.newConcurrentMap();
-    if (publishTtlMetric) {
-      casTtl =
-          Histogram.build()
-              .name("cas_ttl_s")
-              .buckets(
-                  3600, // 1 hour
-                  21600, // 6 hours
-                  86400, // 1 day
-                  345600, // 4 days
-                  604800, // 1 week
-                  1210000 // 2 weeks
-                  )
-              .help("The amount of time CAS entries live on L1 storage before expiration (seconds)")
-              .register();
-    }
-=======
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
 
     entryPathStrategy = new HexBucketEntryPathStrategy(root, hexBucketLevels);
 
@@ -1970,7 +1952,6 @@ public abstract class CASFileCache implements ContentAddressableStorage {
         || e instanceof ClosedByInterruptException;
   }
 
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
   private int getLockedReferenceCount(Entry e) {
     synchronized (this) {
       Integer keyCt = keyReferences.get(e.key);
@@ -1982,7 +1963,9 @@ public abstract class CASFileCache implements ContentAddressableStorage {
         // we don't want to subtract from this value
         return keyCt + Math.min(Math.max(refCt, 0), 0);
       }
-=======
+    }
+  }
+
   private Entry safeStorageInsertion(String key, Entry entry) {
     Lock lock;
     try {
@@ -2029,7 +2012,6 @@ public abstract class CASFileCache implements ContentAddressableStorage {
         }
       }
       lock.unlock();
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     }
   }
 
@@ -2224,7 +2206,6 @@ public abstract class CASFileCache implements ContentAddressableStorage {
   }
 
   private void removeFilePath(Path path) throws IOException {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
     if (!Files.exists(path)) {
       return;
     }
@@ -2235,19 +2216,10 @@ public abstract class CASFileCache implements ContentAddressableStorage {
     }
 
     if (Files.isDirectory(temp)) {
-      log.log(Level.INFO, "removing existing directory " + path + " for fetch");
-      Directories.remove(temp);
+      log.log(Level.FINER, "removing existing directory " + path + " for fetch");
+      Directories.remove(temp, fileStore);
     } else {
       Files.delete(temp);
-=======
-    if (Files.exists(path)) {
-      if (Files.isDirectory(path)) {
-        log.log(Level.FINER, "removing existing directory " + path + " for fetch");
-        Directories.remove(path, fileStore);
-      } else {
-        Files.delete(path);
-      }
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     }
   }
 
@@ -2266,7 +2238,6 @@ public abstract class CASFileCache implements ContentAddressableStorage {
     return directory;
   }
 
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
   // Unlocks keys
   public void unlockKeys(Iterable<String> keys) throws IOException {
     synchronized (this) {
@@ -2382,10 +2353,7 @@ public abstract class CASFileCache implements ContentAddressableStorage {
     }
   }
 
-  public ListenableFuture<Path> putDirectory(
-=======
   public ListenableFuture<PathResult> putDirectory(
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
       Digest digest, Map<Digest, Directory> directoriesIndex, ExecutorService service) {
     // Claim lock.
     // Claim the directory path so no other threads try to create/delete it.
@@ -2733,11 +2701,7 @@ public abstract class CASFileCache implements ContentAddressableStorage {
   private void copyExternalInput(Digest digest, CancellableOutputStream out)
       throws IOException, InterruptedException {
     Retrier retrier = new Retrier(Backoff.sequential(5), Retrier.DEFAULT_IS_RETRIABLE);
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-    log.log(Level.FINE, format("downloading %s", DigestUtil.toString(digest)));
-=======
     log.log(Level.FINER, format("downloading %s", DigestUtil.toString(digest)));
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     try {
       retrier.execute(
           () -> {
@@ -2935,7 +2899,6 @@ public abstract class CASFileCache implements ContentAddressableStorage {
     }
   }
 
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
   // Atomic file system mutation helpers
   //
   // Assuming the the file system has atomic renames.
@@ -2972,18 +2935,10 @@ public abstract class CASFileCache implements ContentAddressableStorage {
     }
   }
 
-  private void deleteExpiredKey(Path path) throws IOException {
-    // We don't want publishing the metric to delay the deletion of the file.
-    // We publish the metric only after the file has been deleted.
-    long createdTime = 0;
-    if (publishTtlMetric) {
-      createdTime = path.toFile().lastModified();
-    }
-=======
   private void deleteExpiredKey(String key) throws IOException {
     Path path = getRemovingPath(key);
     long createdTimeMs = Files.getLastModifiedTime(path).to(MILLISECONDS);
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
+
 
     deleteFilePath(path);
 
@@ -3021,7 +2976,6 @@ public abstract class CASFileCache implements ContentAddressableStorage {
                     (expiredEntry) -> {
                       String expiredKey = expiredEntry.key;
                       try {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
                         // asnyc and possible to have gotten a key by now
                         if (keyReferences.get(key) != null) {
                           log.log(
@@ -3030,12 +2984,8 @@ public abstract class CASFileCache implements ContentAddressableStorage {
                                   "CASFileCache::putImpl ignore deletion for %s expiration due to key reference",
                                   expiredKey));
                         } else {
-                          Path path = getPath(expiredKey);
-                          deleteExpiredKey(path);
+                          deleteExpiredKey(expiredKey);
                         }
-=======
-                        deleteExpiredKey(expiredKey);
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
                       } catch (NoSuchFileException eNoEnt) {
                         log.log(
                             Level.SEVERE,
@@ -3251,16 +3201,10 @@ public abstract class CASFileCache implements ContentAddressableStorage {
         Entry existingEntry = null;
         boolean inserted = false;
         try {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
           log.log(Level.FINEST, "comitting " + key + " from " + writePath);
           Path cachePath = CASFileCache.this.getPath(key);
           CASFileCache.this.renamePath(writePath, cachePath);
-          existingEntry = storage.putIfAbsent(key, entry);
-=======
-          // acquire the key lock
-          Files.createLink(CASFileCache.this.getPath(key), writePath);
           existingEntry = safeStorageInsertion(key, entry);
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
           inserted = existingEntry == null;
         } catch (FileAlreadyExistsException e) {
           log.log(Level.FINER, "file already exists for " + key + ", nonexistent entry will fail");
diff --git a/src/main/java/build/buildfarm/common/OperationFailer.java b/src/main/java/build/buildfarm/common/OperationFailer.java
remerge CONFLICT (content): Merge conflict in src/main/java/build/buildfarm/common/OperationFailer.java
index 63a0bdbb..17c9a045 100644
--- a/src/main/java/build/buildfarm/common/OperationFailer.java
+++ b/src/main/java/build/buildfarm/common/OperationFailer.java
@@ -20,14 +20,9 @@ import build.bazel.remote.execution.v2.ExecutionStage;
 import build.buildfarm.v1test.ExecuteEntry;
 import com.google.longrunning.Operation;
 import com.google.protobuf.Any;
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-import com.google.rpc.PreconditionFailure;
-import io.grpc.Status.Code;
-import java.net.InetAddress;
-import com.google.common.base.Strings;
-=======
 import com.google.rpc.Status;
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
+import com.google.common.base.Strings;
+import java.net.InetAddress;
 
 /**
  * @class OperationFailer
@@ -36,31 +31,20 @@ import com.google.rpc.Status;
  *     finished and failed.
  */
 public class OperationFailer {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-
-  // Not great - consider using publicName if we upstream
+    // Not great - consider using publicName if we upstream
   private static String hostname = null;
   private static String getHostname() {
-    if (!Strings.isNullOrEmpty(hostname)) {
+      if (!Strings.isNullOrEmpty(hostname)) {
+          return hostname;
+      }
+      try {
+          hostname = InetAddress.getLocalHost().getHostName();
+      } catch (Exception e) {
+          hostname = "_unknown_host_";
+      }
       return hostname;
-    }
-    try {
-      hostname = InetAddress.getLocalHost().getHostName();
-    } catch (Exception e) {
-      hostname = "_unknown_host_";
-    }
-    return hostname;
   }
-
-  public static Operation get(
-      Operation operation,
-      ExecuteEntry executeEntry,
-      String failureType,
-      String failureMessage,
-      String failureDetails) {
-=======
   public static Operation get(Operation operation, ExecuteEntry executeEntry, Status status) {
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     return operation
         .toBuilder()
         .setDone(true)
@@ -80,27 +64,4 @@ public class OperationFailer {
         .setStage(stage)
         .build();
   }
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-
-  private static ExecuteResponse failResponse(
-      ExecuteEntry executeEntry, String failureType, String failureMessage, String failureDetails) {
-    PreconditionFailure.Builder preconditionFailureBuilder = PreconditionFailure.newBuilder();
-    preconditionFailureBuilder
-        .addViolationsBuilder()
-        .setType(failureType)
-        .setSubject(String.format("[%s] %s", OperationFailer.getHostname(), "blobs/" + DigestUtil.toString(executeEntry.getActionDigest())))
-        .setDescription(String.format("[%s] %s", OperationFailer.getHostname(), failureDetails));
-    PreconditionFailure preconditionFailure = preconditionFailureBuilder.build();
-
-    return ExecuteResponse.newBuilder()
-        .setStatus(
-            com.google.rpc.Status.newBuilder()
-                .setCode(Code.FAILED_PRECONDITION.value())
-                .setMessage(failureMessage)
-                .addDetails(Any.pack(preconditionFailure))
-                .build())
-        .build();
-  }
-=======
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
 }
diff --git a/src/main/java/build/buildfarm/worker/PipelineStage.java b/src/main/java/build/buildfarm/worker/PipelineStage.java
remerge CONFLICT (content): Merge conflict in src/main/java/build/buildfarm/worker/PipelineStage.java
index e41e887a..49bd392a 100644
--- a/src/main/java/build/buildfarm/worker/PipelineStage.java
+++ b/src/main/java/build/buildfarm/worker/PipelineStage.java
@@ -58,17 +58,8 @@ public abstract class PipelineStage implements Runnable {
 
   @Override
   public void run() {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-    try {
-      runInterruptible();
-    } catch (InterruptedException e) {
-      // ignore
-    } finally {
-      boolean wasInterrupted = Thread.interrupted();
-=======
     boolean keepRunningStage = true;
     while (keepRunningStage) {
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
       try {
         runInterruptible();
 
diff --git a/src/main/java/build/buildfarm/worker/shard/CFCExecFileSystem.java b/src/main/java/build/buildfarm/worker/shard/CFCExecFileSystem.java
remerge CONFLICT (content): Merge conflict in src/main/java/build/buildfarm/worker/shard/CFCExecFileSystem.java
index 0bd9771a..bb667fcd 100644
--- a/src/main/java/build/buildfarm/worker/shard/CFCExecFileSystem.java
+++ b/src/main/java/build/buildfarm/worker/shard/CFCExecFileSystem.java
@@ -215,15 +215,11 @@ class CFCExecFileSystem implements ExecFileSystem {
           onKey.accept(key);
           if (digest.getSizeBytes() != 0) {
             try {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
               // Coordinated with the CAS - consider adding an API for safe path
               // access
               synchronized (fileCache) {
-                Files.createLink(filePath, fileCachePath);
+                Files.createLink(path, fileCachePath);
               }
-=======
-              Files.createLink(path, fileCachePath);
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
             } catch (IOException e) {
               return immediateFailedFuture(e);
             }
@@ -303,25 +299,6 @@ class CFCExecFileSystem implements ExecFileSystem {
                     onKey,
                     inputDirectories));
       } else {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-        downloads =
-            concat(
-                downloads,
-                ImmutableList.of(
-                    transform(
-                        linkDirectory(dirPath, digest, directoriesIndex),
-                        (result) -> {
-                          // note: this could non-trivial make sync due to
-                          // the way decrementReferences is implemented.
-                          // we saw null entries in the built immutable list
-                          // without synchronization
-                          synchronized (inputDirectories) {
-                            inputDirectories.add(digest);
-                          }
-                          return null;
-                        },
-                        fetchService)));
-=======
         linkedDirectories.add(
             transform(
                 linkDirectory(dirPath, digest, directoriesIndex),
@@ -333,7 +310,6 @@ class CFCExecFileSystem implements ExecFileSystem {
                   return null;
                 },
                 fetchService));
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
       }
       if (Thread.currentThread().isInterrupted()) {
         break;
@@ -460,14 +436,6 @@ class CFCExecFileSystem implements ExecFileSystem {
     ImmutableList.Builder<String> inputFiles = new ImmutableList.Builder<>();
     ImmutableList.Builder<Digest> inputDirectories = new ImmutableList.Builder<>();
 
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-    // Get lock keys so we can increment them prior to downloading
-    // and no other threads can to create/delete during
-    // eviction or the invocation of fetchInputs
-    Iterable<String> lockedKeys =
-        fileCache.lockDirectoryKeys(execDir, inputRootDigest, directoriesIndex);
-
-=======
     Set<Path> linkedInputDirectories =
         ImmutableSet.copyOf(
             Iterables.transform(
@@ -476,7 +444,12 @@ class CFCExecFileSystem implements ExecFileSystem {
 
     log.log(
         Level.FINER, "ExecFileSystem::createExecDir(" + operationName + ") calling fetchInputs");
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
+    // Get lock keys so we can increment them prior to downloading
+    // and no other threads can to create/delete during
+    // eviction or the invocation of fetchInputs
+    Iterable<String> lockedKeys =
+        fileCache.lockDirectoryKeys(execDir, inputRootDigest, directoriesIndex);
+
     Iterable<ListenableFuture<Void>> fetchedFutures =
         fetchInputs(
             execDir,
@@ -528,12 +501,8 @@ class CFCExecFileSystem implements ExecFileSystem {
       if (!success) {
         log.log(Level.INFO, "Failed to create exec dir (" + operationName + "), cleaning up");
         fileCache.decrementReferences(inputFiles.build(), inputDirectories.build());
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
         fileCache.unlockKeys(lockedKeys);
-        Directories.remove(execDir);
-=======
         Directories.remove(execDir, fileStore);
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
       }
     }
 
diff --git a/src/main/java/build/buildfarm/worker/shard/RemoteCasWriter.java b/src/main/java/build/buildfarm/worker/shard/RemoteCasWriter.java
remerge CONFLICT (content): Merge conflict in src/main/java/build/buildfarm/worker/shard/RemoteCasWriter.java
index fab7e0cf..ba9f3d9a 100644
--- a/src/main/java/build/buildfarm/worker/shard/RemoteCasWriter.java
+++ b/src/main/java/build/buildfarm/worker/shard/RemoteCasWriter.java
@@ -50,22 +50,13 @@ import lombok.extern.java.Log;
 
 @Log
 public class RemoteCasWriter implements CasWriter {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-  private final Set<String> workerSet;
-=======
   private final Backplane backplane;
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
   private final LoadingCache<String, Instance> workerStubs;
   private final Retrier retrier;
 
   public RemoteCasWriter(
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-      Set<String> workerSet, LoadingCache<String, Instance> workerStubs, Retrier retrier) {
-    this.workerSet = workerSet;
-=======
       Backplane backplane, LoadingCache<String, Instance> workerStubs, Retrier retrier) {
     this.backplane = backplane;
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     this.workerStubs = workerStubs;
     this.retrier = retrier;
   }
@@ -81,37 +72,24 @@ public class RemoteCasWriter implements CasWriter {
   private void insertFileToCasMember(Digest digest, DigestFunction.Value digestFunction, Path file)
       throws IOException, InterruptedException {
     try (InputStream in = Files.newInputStream(file)) {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
       retrier.execute(() -> writeToCasMember(digest, digestFunction, in));
-=======
-      retrier.execute(() -> writeToCasMember(digest, in));
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     } catch (RetryException e) {
       Throwable cause = e.getCause();
       Throwables.throwIfInstanceOf(cause, IOException.class);
       Throwables.throwIfUnchecked(cause);
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-      throw new RuntimeException(cause);
-    }
-  }
-
-  private long writeToCasMember(Digest digest, DigestFunction.Value digestFunction, InputStream in)
-=======
       throw new IOException(cause);
     }
   }
 
   private long writeToCasMember(Digest digest, InputStream in)
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
+      throws IOException, InterruptedException {
+  private long writeToCasMember(Digest digest, DigestFunction.Value digestFunction, InputStream in)
       throws IOException, InterruptedException {
     // create a write for inserting into another CAS member.
     String workerName = getRandomWorker();
     Write write = getCasMemberWrite(digest, digestFunction, workerName);
 
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-=======
     write.reset();
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     try {
       return streamIntoWriteFuture(in, write, digest).get();
     } catch (ExecutionException e) {
@@ -119,11 +97,7 @@ public class RemoteCasWriter implements CasWriter {
       Throwables.throwIfInstanceOf(cause, IOException.class);
       // prevent a discard of this frame
       Status status = Status.fromThrowable(cause);
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-      throw status.asRuntimeException();
-=======
       throw new IOException(status.asException());
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     }
   }
 
@@ -142,7 +116,6 @@ public class RemoteCasWriter implements CasWriter {
   @Override
   public void insertBlob(Digest digest, DigestFunction.Value digestFunction, ByteString content)
       throws IOException, InterruptedException {
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
     insertBlobToCasMember(digest, digestFunction, content);
   }
 
@@ -150,19 +123,11 @@ public class RemoteCasWriter implements CasWriter {
       throws IOException, InterruptedException {
     try (InputStream in = content.newInput()) {
       retrier.execute(() -> writeToCasMember(digest, digestFunction, in));
-=======
-    try (InputStream in = content.newInput()) {
-      retrier.execute(() -> writeToCasMember(digest, in));
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     } catch (RetryException e) {
       Throwable cause = e.getCause();
       Throwables.throwIfInstanceOf(cause, IOException.class);
       Throwables.throwIfUnchecked(cause);
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-      throw new RuntimeException(cause);
-=======
       throw new IOException(cause);
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     }
   }
 
diff --git a/src/main/java/build/buildfarm/worker/shard/Worker.java b/src/main/java/build/buildfarm/worker/shard/Worker.java
remerge CONFLICT (content): Merge conflict in src/main/java/build/buildfarm/worker/shard/Worker.java
index a85c4032..0548f135 100644
--- a/src/main/java/build/buildfarm/worker/shard/Worker.java
+++ b/src/main/java/build/buildfarm/worker/shard/Worker.java
@@ -42,10 +42,7 @@ import build.buildfarm.common.config.Cas;
 import build.buildfarm.common.config.GrpcMetrics;
 import build.buildfarm.common.grpc.Retrier;
 import build.buildfarm.common.grpc.Retrier.Backoff;
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-=======
 import build.buildfarm.common.grpc.TracingMetadataUtils.ServerHeadersInterceptor;
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
 import build.buildfarm.common.services.ByteStreamService;
 import build.buildfarm.common.services.ContentAddressableStorageService;
 import build.buildfarm.instance.Instance;
@@ -66,11 +63,8 @@ import build.buildfarm.worker.SuperscalarPipelineStage;
 import build.buildfarm.worker.resources.LocalResourceSetUtils;
 import com.google.common.cache.LoadingCache;
 import com.google.common.collect.Lists;
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
 import com.google.common.util.concurrent.SettableFuture;
 import com.google.devtools.common.options.OptionsParsingException;
-=======
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
 import com.google.longrunning.Operation;
 import com.google.protobuf.ByteString;
 import com.google.protobuf.Duration;
@@ -96,24 +90,18 @@ import java.util.Random;
 import java.util.UUID;
 import java.util.concurrent.Executor;
 import java.util.concurrent.ExecutorService;
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
 import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.ScheduledFuture;
-=======
 import java.util.concurrent.atomic.AtomicBoolean;
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
 import java.util.logging.Level;
 import javax.annotation.Nullable;
 import javax.naming.ConfigurationException;
 import lombok.extern.java.Log;
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.boot.SpringApplication;
 import org.springframework.boot.autoconfigure.SpringBootApplication;
 import org.springframework.context.ApplicationContext;
 import org.springframework.context.annotation.ComponentScan;
-=======
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
 
 @Log
 public final class Worker extends LoggingMain {
@@ -204,7 +192,10 @@ public final class Worker extends LoggingMain {
     }
   }
 
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
+  private Worker() {
+    super("BuildFarmShardWorker");
+  }
+
   private void exitPostPipelineFailure() {
     // Shutdown the worker if a pipeline fails. By means of the spring lifecycle
     // hooks - e.g. the `PreDestroy` hook here - it will attempt to gracefully
@@ -240,10 +231,6 @@ public final class Worker extends LoggingMain {
     termFuture.cancel(false);
     shutdownDeadlineExecutor.shutdown();
     System.exit(code);
-=======
-  private Worker() {
-    super("BuildFarmShardWorker");
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
   }
 
   private Operation stripOperation(Operation operation) {
@@ -643,11 +630,7 @@ public final class Worker extends LoggingMain {
     CasWriter writer;
     if (!configs.getWorker().getCapabilities().isCas()) {
       Retrier retrier = new Retrier(Backoff.sequential(5), Retrier.DEFAULT_IS_RETRIABLE);
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-      writer = new RemoteCasWriter(backplane.getStorageWorkers(), workerStubs, retrier);
-=======
       writer = new RemoteCasWriter(backplane, workerStubs, retrier);
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
     } else {
       writer = new LocalCasWriter(execFileSystem);
     }
@@ -693,7 +676,7 @@ public final class Worker extends LoggingMain {
     PrometheusPublisher.startHttpServer(configs.getPrometheusPort());
 
     // An executor can also be used as storage worker as scheduler treats all new workers as storage workers
-    // TODO (Congt) Fix it upstream and revert it. 
+    // TODO (Congt) Fix it upstream and revert it.
     if (configs.getWorker().getCapabilities().isCas()) {
       startFailsafeRegistration();
     } else {
diff --git a/src/test/java/build/buildfarm/cas/cfc/CASFileCacheTest.java b/src/test/java/build/buildfarm/cas/cfc/CASFileCacheTest.java
remerge CONFLICT (content): Merge conflict in src/test/java/build/buildfarm/cas/cfc/CASFileCacheTest.java
index ad63ca47..48a57ee8 100644
--- a/src/test/java/build/buildfarm/cas/cfc/CASFileCacheTest.java
+++ b/src/test/java/build/buildfarm/cas/cfc/CASFileCacheTest.java
@@ -1116,10 +1116,6 @@ class CASFileCacheTest {
             /* maxEntrySizeInBytes=*/ 1024,
             /* hexBucketLevels=*/ 1,
             storeFileDirsIndexInMemory,
-<<<<<<< 0090ef07 (A temporary workaround of executors being used as storage worker)
-            /* publishTtlMetric=*/ false,
-=======
->>>>>>> 14810a42 (migrate dependency to Bitnami Redis helm chart (#1637))
             /* execRootFallback=*/ false,
             DIGEST_UTIL,
             expireService,

@chenj-hub chenj-hub force-pushed the jackies/upgrade-bazel-buildfarm-to-v2.9.0 branch from e438475 to 9938118 Compare August 15, 2024 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.