Increase the timeout for storage operations before resetting an actor

The previous, lower timeout could plausibly be hit purely because of a busy CPU loop running after a storage operation was issued. The new value matches the per-request CPU limit. The two numbers don't need to be identical, but that was as good a reason as any for choosing this particular larger number.
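To illustrate the reasoning, here is a minimal sketch of how a wall-clock timeout can fire even though the storage operation itself was fast. It is a simplified, hypothetical model and not workerd code: `timeoutFires`, `busyCpu`, and `ioLatency` are names invented for this example, and the model assumes that the event loop cannot observe a completed operation until the busy loop ends, and that an expired timer wins over a completion that is ready at the same moment.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using std::chrono::seconds;

// Hypothetical model (assumption, not the workerd implementation):
// a storage operation is issued at t = 0 and finishes after `ioLatency`,
// but the event loop is blocked by a CPU-bound loop for `busyCpu`.
// The completion can only be observed once both have elapsed, while the
// timeout is measured in wall-clock time from t = 0.
constexpr bool timeoutFires(seconds busyCpu, seconds ioLatency, seconds timeout) {
  // Earliest wall-clock time at which the completion can be observed.
  const seconds observedAt = std::max(busyCpu, ioLatency);
  // Assumption: an already-expired timer is treated as having fired first.
  return observedAt >= timeout;
}

// A 1 s operation followed by 15 s of busy CPU trips a 10 s timeout...
static_assert(timeoutFires(seconds(15), seconds(1), seconds(10)),
              "spurious timeout with the old 10 s value");
// ...but not a 30 s timeout.
static_assert(!timeoutFires(seconds(15), seconds(1), seconds(30)),
              "no timeout with the new 30 s value");
// A genuine hang (operation never observed within 30 s) is still caught.
static_assert(timeoutFires(seconds(0), seconds(60), seconds(30)),
              "real hangs still time out");
```

Under these assumptions, a busy loop that stays below the per-request CPU limit cannot, by itself, push the observation time past a timeout equal to that limit, which is the rationale for matching the two values.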
cloudflare · Mar 11, 2024 · 3fbf7c8
1 parent 0afa205
Showing 1 changed file with 4 additions and 8 deletions.
diff --git a/src/workerd/io/worker.c++ b/src/workerd/io/worker.c++
@@ -2858,14 +2858,10 @@ struct Worker::Actor::Impl {
     // Implements InputGate::Hooks.
 
     kj::Promise<void> makeTimeoutPromise() override {
-#if __has_feature(address_sanitizer) || defined(__SANITIZE_ADDRESS__)
-      // Give more time under ASAN.
-      //
-      // TODO(cleanup): Should this be configurable?
-      auto timeout = 20 * kj::SECONDS;
-#else
-      auto timeout = 10 * kj::SECONDS;
-#endif
+      // This really only protects against total hangs. Lowering the timeout drastically is risky,
+      // since low timeouts can spuriously fire when under heavy CPU load, failing requests that
+      // would otherwise succeed.
+      auto timeout = 30 * kj::SECONDS;
       co_await timerChannel.afterLimitTimeout(timeout);
       kj::throwFatalException(KJ_EXCEPTION(FAILED,
             "broken.outputGateBroken; jsg.Error: Durable Object storage operation exceeded "