backoff machine: Add, for replacement of progressiveTimeout.
Improve state logic for sleep durations in API request retry loops.

progressiveTimeout uses global state, with the effect that one
network request's retry attempts affect the sleep durations used for
any other request. This has the benefit of enabling a general
throttle on all request retry loops that use progressiveTimeout.
But different requests may have transient failures for different
reasons, so it's not obvious that we want this general throttle. And
hard-to-trace bugs can creep in when the behavior of
progressiveTimeout can't be determined from a particular call site.

Also, progressiveTimeout uses a 60-second threshold as a heuristic
to distinguish request retry loops from each other, with the
simplifying assumption that different types of requests will not
occur within 60 seconds of each other. This distinction is more
effectively done by managing the state per-loop in the first place,
and doing so eliminates more of those hard-to-trace bugs mentioned
in the previous paragraph.

So, introduce the ability to handle state locally. Capped
exponential backoff still mitigates high request traffic, but
per request rather than globally.

Preparation for zulip#3829, to be completed in this series of commits.
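
To illustrate the per-loop state described above, here is a minimal sketch of a retry loop that owns its own backoff machine. `retryWithBackoff` and its `maxWaits` parameter are hypothetical names used only for illustration; they are not part of this commit. The `sleep` and `BackoffMachine` definitions are trimmed, untyped stand-ins for the ones added in src/utils/async.js (the `_startTime` field is omitted), included so the sketch runs on its own.

```javascript
// Stand-ins for `sleep` and `BackoffMachine` from src/utils/async.js,
// trimmed (no Flow types, no _startTime) so this sketch is self-contained.
const sleep = (ms = 0) => new Promise(resolve => setTimeout(resolve, ms));

class BackoffMachine {
  constructor() {
    this._firstDuration = 100;
    this._durationCeiling = 10 * 1000;
    this._base = 2;
    this._waitsCompleted = 0;
  }

  waitsCompleted() {
    return this._waitsCompleted;
  }

  async wait() {
    // Exponential growth, capped at the ceiling.
    const duration = Math.min(
      this._durationCeiling,
      this._firstDuration * this._base ** this._waitsCompleted,
    );
    await sleep(duration);
    this._waitsCompleted++;
  }
}

// Hypothetical helper (not part of this commit): call `attempt` until it
// succeeds, rethrowing the last error once `maxWaits` waits have completed.
async function retryWithBackoff(attempt, maxWaits = 5) {
  // A fresh machine per loop: no other request's retries affect these delays.
  const backoffMachine = new BackoffMachine();
  while (true) {
    try {
      return await attempt();
    } catch (e) {
      if (backoffMachine.waitsCompleted() >= maxWaits) {
        throw e;
      }
      await backoffMachine.wait();
    }
  }
}
```

Because the machine is created inside the function, two concurrent calls to `retryWithBackoff` each start at 100ms and back off independently, which is the behavior the global state in progressiveTimeout could not guarantee.
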
Chris Bobbe committed Mar 4, 2020
1 parent c36e2bb commit 577ac4c
Showing 2 changed files with 100 additions and 0 deletions.
44 changes: 44 additions & 0 deletions src/utils/__tests__/backoffMachine-test.js
@@ -0,0 +1,44 @@
/* @flow strict-local */
import { BackoffMachine } from '../async';
import { Lolex } from '../../__tests__/aux/lolex';

// Since BackoffMachine is in async.js, these tests *should* be in
// async-test.js. But doing that introduces some interference between these
// tests and the other Lolex-based tests, since Jest is running both of them in
// the same environment in parallel. This may be resolved out of the box in Jest
// 26, and it might even be safe in Jest 25.1.0 with a custom environment
// (https://github.com/facebook/jest/pull/8897). But as of 2020-03, putting them
// in a separate file is our workaround.

describe('BackoffMachine', () => {
  const lolex: Lolex = new Lolex();

  afterEach(() => {
    lolex.clearAllTimers();
  });

  afterAll(() => {
    lolex.dispose();
  });

  const measureWait = async (promise: Promise<void>) => {
    const start = Date.now();
    lolex.runOnlyPendingTimers();
    await promise;
    return Date.now() - start;
  };

  test('timeouts are 100ms, 200ms, 400ms, 800ms...', async () => {
    const expectedDurations = [100, 200, 400, 800, 1600, 3200, 6400, 10000, 10000, 10000, 10000];
    const results: number[] = [];

    const backoffMachine = new BackoffMachine();
    for (let j = 0; j < expectedDurations.length; j++) {
      const duration = await measureWait(backoffMachine.wait());
      results.push(duration);
    }
    expectedDurations.forEach((expectedDuration, i) => {
      expect(results[i]).toBe(expectedDuration);
    });
  });
});
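
The expected durations in the test above follow from a single formula: the minimum of the ceiling and `firstDuration * base ** waitsCompleted`. As a quick check of that arithmetic, the sketch below computes the sequence directly. `backoffDuration` is a hypothetical standalone helper, not something this commit adds; its default values are copied from the BackoffMachine constructor.

```javascript
// Hypothetical helper (not in this commit): the duration of the wait that
// comes after `waitsCompleted` earlier waits have finished.
function backoffDuration(
  waitsCompleted,
  firstDuration = 100,
  durationCeiling = 10 * 1000,
  base = 2,
) {
  return Math.min(durationCeiling, firstDuration * base ** waitsCompleted);
}

const durations = Array.from({ length: 11 }, (_, i) => backoffDuration(i));
console.log(durations);
// [100, 200, 400, 800, 1600, 3200, 6400, 10000, 10000, 10000, 10000]
```

The cap first applies at the eighth wait, where the uncapped value would be 100 * 2 ** 7 = 12800.
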
56 changes: 56 additions & 0 deletions src/utils/async.js
@@ -10,6 +10,62 @@ export function delay<T>(callback: () => T): Promise<T> {
export const sleep = (ms: number = 0): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));

/**
 * Makes a machine that can sleep for increasing durations, for network backoff.
 *
 * Call the constructor before a loop starts, and call .wait() in each iteration
 * of the loop. Do not re-use the instance after exiting the loop.
 */
export class BackoffMachine {
  _firstDuration: number;
  _durationCeiling: number;
  _base: number;

  _startTime: number | void;
  _waitsCompleted: number;

  constructor() {
    this._firstDuration = 100;
    this._durationCeiling = 10 * 1000;
    this._base = 2;

    this._startTime = undefined;
    this._waitsCompleted = 0;
  }

  /**
   * How many waits have completed so far.
   *
   * Use this to implement "give up" logic by breaking out of the loop after a
   * threshold number of waits.
   */
  waitsCompleted = (): number => this._waitsCompleted;

  /**
   * Promise to resolve after the appropriate duration.
   *
   * Until a ceiling is reached, the duration grows exponentially with the number
   * of sleeps completed, with a base of 2. E.g., if firstDuration is 100 and
   * durationCeiling is 10 * 1000 = 10000, the sequence is
   *
   * 100, 200, 400, 800, 1600, 3200, 6400, 10000, 10000, 10000, ...
   */
  wait = async (): Promise<void> => {
    if (this._startTime === undefined) {
      this._startTime = Date.now();
    }

    const duration = Math.min(
      // Should not exceed durationCeiling
      this._durationCeiling,
      this._firstDuration * this._base ** this._waitsCompleted,
    );
    await sleep(duration);

    this._waitsCompleted++;
  };
}

/**
 * Calls an async function and if unsuccessful retries the call.
 *