This repository has been archived by the owner on Jan 12, 2019. It is now read-only.

Fix any possible fillBuffer_ race conditions by debouncing all fillBuffers_ #959

Merged
merged 1 commit into master from fill-buffer-fix on Jan 24, 2017

Conversation

@imbcmdth (Member) commented Jan 13, 2017

Description

On a rendition switch, SegmentLoader seems to be able to enter a state where it fetches the same segment twice at the same time, causing bandwidth starvation. The requests then time out and trigger an emergency rendition switch, which can set off the same behavior again.

What's gone wrong?

The root problem is that handleUpdateEnd_ can trigger a rendition switch (by emitting the 'progress' event). Eventually, a rendition switch results in load being called on the SegmentLoader, which has the side effect of calling fillBuffer_ and starting a request for a segment. At the same time, the original handleUpdateEnd_ call invokes fillBuffer_ itself, triggering a second request.
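
A minimal sketch of that re-entrant path (the class, method bodies, and the `RacySegmentLoader` name are simplified stand-ins for illustration, not the actual library code):

```js
class RacySegmentLoader {
  constructor() {
    this.listeners = { progress: [] };
    this.requests = [];
    this.state = 'READY';
  }

  on(type, callback) {
    this.listeners[type].push(callback);
  }

  trigger(type) {
    this.listeners[type].forEach((callback) => callback());
  }

  // Runs when the SourceBuffer finishes appending a segment.
  handleUpdateEnd_() {
    this.state = 'READY';
    this.trigger('progress'); // a listener may switch renditions -> load()
    this.fillBuffer_();       // ...and then we fill the buffer again ourselves
  }

  // Runs on a rendition switch.
  load() {
    this.state = 'READY';
    this.fillBuffer_(); // side effect: starts a segment request
  }

  fillBuffer_() {
    this.requests.push('GET next-segment.ts'); // an XHR in the real loader
  }
}

const loader = new RacySegmentLoader();

// The rendition-switch path ends up calling load() synchronously:
loader.on('progress', () => loader.load());

loader.handleUpdateEnd_();
console.log(loader.requests.length); // 2 -- the same segment is requested twice
```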

Why doesn't this happen every time we change renditions?

The issue is "masked" by two different mechanisms each of which can stop the second request from happening:

  1. If the rendition we are switching to has never been loaded before, the PlaylistLoader must first make a request for that rendition's playlist/manifest. That introduces asynchronicity which leaves the SegmentLoader in a waiting state, so fillBuffer_ is never called by SegmentLoader#load.
  2. If there is buffered data more than 30 seconds before the playhead, the first call to fillBuffer_ will, via loadSegment_, result in a SourceUpdater.remove that causes SourceBuffer#updating to be true. When the next fillBuffer_ runs, it exits early because the SourceBuffer is updating (see the sketch after this list).
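
A hypothetical illustration of that second guard; the `sourceUpdater` object and its `updating()` flag here are simplified stand-ins for the loader's SourceUpdater:

```js
const sourceUpdater = {
  updating: () => true // a remove() issued by the first fillBuffer_ is still pending
};

function fillBuffer_() {
  if (sourceUpdater.updating()) {
    // SourceBuffer#updating is true, so exit early -- no second request.
    return;
  }
  // ...otherwise choose and request the next segment
}

fillBuffer_(); // the racing call is a no-op while the SourceBuffer is busy
```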

Specific Changes proposed

  • Convert all calls to fillBuffer_ into calls to monitorBuffer_
  • Rename monitorBuffer_ to monitorBufferTick_ which becomes the 500ms buffer check timer-loop
  • Make a new monitorBuffer_ that schedules an immediate (1 ms) timer for monitorBufferTick_ (see the sketch below)
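
A minimal sketch of the proposed shape, using the names and timer values from the list above; everything else (the class skeleton, the `checkBufferTimeout_` property) is a simplified stand-in rather than the actual diff:

```js
class SegmentLoaderSketch {
  constructor() {
    this.state = 'READY';
    this.checkBufferTimeout_ = null;
  }

  // Entry point used everywhere fillBuffer_ used to be called directly:
  // (re)schedule an almost-immediate buffer check.
  monitorBuffer_() {
    if (this.checkBufferTimeout_) {
      clearTimeout(this.checkBufferTimeout_);
    }
    this.checkBufferTimeout_ = setTimeout(this.monitorBufferTick_.bind(this), 1);
  }

  // The 500 ms buffer-check timer loop.
  monitorBufferTick_() {
    if (this.state === 'READY') {
      this.fillBuffer_();
    }
    if (this.checkBufferTimeout_) {
      clearTimeout(this.checkBufferTimeout_);
    }
    this.checkBufferTimeout_ = setTimeout(this.monitorBufferTick_.bind(this), 500);
  }

  fillBuffer_() {
    // would choose the next segment and start a request; omitted here
  }
}
```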

Requirements Checklist

  • Feature implemented / Bug fixed
  • If necessary, more likely in a feature request than a bug fix
  • Reviewed by Two Core Contributors

…ffers_

- Convert all calls to fillBuffer_ to calls to monitorBuffer_
- Rename monitorBuffer_ to monitorBufferTick_ which becomes the 500ms buffer check timer loop
- Make monitorBuffer_ schedule an immediate timer for monitorBufferTick_
*
* @private
*/
monitorBuffer_() {
Contributor

Why do we need this in addition to monitorBufferTick_? It seems that by changing calls from fillBuffer_ to monitorBuffer_, we already get the guard that checks whether the state is READY.

fillBuffer_, if it starts a segment request, synchronously moves the loader to the WAITING state. This means that any subsequent calls to monitorBuffer_ would be safe with the READY guard.

Member Author

The real difference is in how far in the future things are scheduled. Calling monitorBuffer_ forces the tick to happen next (it schedules or reschedules the timer to fire in just one millisecond), whereas monitorBufferTick_ always schedules the next tick 500 ms in the future.

Originally, I had a piece of code with an optional parameter to monitorBuffer_ to set/reset different timer durations in one function, but I decided that was "too clever".
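
For illustration, reusing the hypothetical SegmentLoaderSketch from the sketch under "Specific Changes proposed":

```js
const sketch = new SegmentLoaderSketch();

sketch.monitorBufferTick_(); // runs a check now and schedules the next one ~500 ms out
sketch.monitorBuffer_();     // cancels that timer and reschedules the check for ~1 ms
```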

Contributor

As a follow-up, I spoke with @imbcmdth, and he pointed out that monitorBuffer_ is contextually async, and that keeping it async is safer when listeners react to events fired around the end of a segment's processing (for example, if monitorBuffer_ were synchronous, a new segment might already have been requested and a lot of state changed by the time listeners received the events).


// so the loader should try the next segment
assert.equal(this.requests[0].url, '1.ts', 'moved ahead a segment');

// verify stats
assert.equal(loader.mediaBytesTransferred, 20, '20 bytes');
- assert.equal(loader.mediaTransferDuration, 2, '2 ms (clocks above)');
+ assert.equal(loader.mediaTransferDuration, 1, '1 ms (clocks above)');
Contributor

Why did this decrease?

Member Author

Prior to this change, the request happened immediately, so two clock ticks elapsed between the request and the response. Now the request happens later, and only one tick elapses between them (the first tick becomes the tick that triggers the request).
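
A rough sketch of what that looks like in the test, assuming the sinon fake clock (this.clock) and fake XHRs (this.requests) that the project's QUnit tests use; the exact test code differs:

```js
loader.load();       // before this PR the XHR fired right here (t = 0);
                     // now monitorBuffer_ only sets a ~1 ms timer
this.clock.tick(1);  // t = 1 ms: after the change, this tick starts the request
this.clock.tick(1);  // t = 2 ms
this.requests.shift().respond(200, null, '');
// mediaTransferDuration: previously 2 ms (t = 0 -> 2), now 1 ms (t = 1 -> 2)
```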

@gesinger (Contributor) left a comment

LGTM

@forbesjo forbesjo assigned forbesjo and unassigned forbesjo Jan 20, 2017
@forbesjo forbesjo self-requested a review January 24, 2017 15:07
@forbesjo (Contributor)

LGTM

@mjneil mjneil merged commit 409313c into master Jan 24, 2017
@mjneil mjneil deleted the fill-buffer-fix branch January 24, 2017 18:35