
Streams using Paginators/Kotlin coroutines #206

Closed
mattbdean opened this issue Dec 11, 2017 · 2 comments
@mattbdean
Owner

mattbdean commented Dec 11, 2017

The purpose of this feature is to continuously poll an API endpoint that returns a Listing (for example, /r/{subreddit}/comments), and notify the user when that listing provides new data. This is similar to PRAW's SubredditStream.

A Stream is an object that can be iterated infinitely. Each new model detected is yielded to the Stream's consumer.

Mock usage:

Stream&lt;Comment&gt; stream = redditClient.subreddit("redditdev").commentStream();

for (Comment newComment : stream) {
  // do something
}

Here are a few strategies for managing network requests:

  • Constant rate/no backoff: Each request is sent at a constant, user-configurable rate
  • Exponential backoff: Each request that does not yield new data doubles the delay before the next request. For example, the third consecutive request that yields no new data will be followed by a delay of 2³ = 8 seconds. The maximum delay should be user-configurable.
  • Short-term learned backoff: The first few requests are executed at the maximum OAuth2 rate, 1 req/s. The number of new models per request is tracked. After a user-configurable number of requests, the average number of models per request is used to calculate the new request rate. Each request updates the average and, by extension, the request rate. There should be some user-configurable margin of error on the request rate, so that if there's a 10% increase in comments we won't miss out on all of them. If the request rate proves too low, fall back to exponential backoff.
  • Long-term learned backoff: Like short-term learned backoff, but taking more variables into account, such as the time of day and the day of the week. This is a machine learning problem at its core. The recorded data should be able to be saved/loaded so that when the stream shuts down it doesn't have to build up all of its data again. This option is a bit optimistic and I'm not too sure how much use it'll actually get.
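As a rough sketch of how the exponential-backoff strategy above might compute delays (all names here are hypothetical, not part of any existing JRAW API):

```java
// Hypothetical sketch of the exponential-backoff strategy: each
// consecutive request with no new data doubles the delay, capped at
// a user-configurable maximum. The nth empty response yields a delay
// of base * 2^n, so with a 1-second base the third empty response is
// followed by 2^3 = 8 seconds, matching the description above.
final class ExponentialBackoff {
    private final long baseMillis;   // delay unit (e.g. 1000 ms)
    private final long maxMillis;    // user-configurable ceiling
    private int emptyResponses = 0;  // consecutive requests with no new data

    ExponentialBackoff(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    /** Call after each request; pass how many new models it yielded. */
    long nextDelayMillis(int newModels) {
        if (newModels > 0) {
            // New data resets the backoff to the base rate
            emptyResponses = 0;
            return baseMillis;
        }
        emptyResponses++;
        // Cap the shift amount to avoid long overflow on long dry spells
        long delay = baseMillis * (1L << Math.min(emptyResponses, 30));
        return Math.min(delay, maxMillis);
    }
}
```

The constant-rate strategy is just the degenerate case where `nextDelayMillis` always returns `baseMillis`.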

Streams would ideally be implemented using Kotlin coroutines, specifically buildSequence.

A very basic example:

val seq = buildSequence {
  while (true) {
    // drop models we've already yielded
    val data: List<T> = fetchLatestData().filter { /* if we haven't seen this model */ }
    if (data.isEmpty()) {
      // delay the next request (per the chosen backoff strategy)
    } else {
      yieldAll(data)
    }
  }
}
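The "seen this model" filter needs bounded memory for an infinite stream. One possible sketch (the class and method names are hypothetical) is an access-ordered LinkedHashMap that evicts the oldest IDs:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: remembers the last `capacity` model IDs so the
// stream can discard models it has already yielded without growing
// memory unboundedly.
final class BoundedSeenSet {
    private final Map<String, Boolean> seen;

    BoundedSeenSet(int capacity) {
        // accessOrder=true makes this an LRU map; removeEldestEntry
        // evicts the least-recently-seen ID once we exceed capacity.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true if the ID was NOT seen before (i.e. the model is new). */
    boolean markSeen(String id) {
        return seen.put(id, Boolean.TRUE) == null;
    }
}
```

The capacity should comfortably exceed one Listing's page size, otherwise old models could be re-yielded after eviction.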

Here is PRAW's implementation for reference.

All this is subject to change, of course. Any feedback is welcome!

Related: #200

@mattbdean mattbdean added this to the v1.1.0 milestone Dec 11, 2017
@mattbdean mattbdean self-assigned this Dec 11, 2017
@mattbdean mattbdean changed the title feat: Streams using Paginators/Kotlin coroutines Streams using Paginators/Kotlin coroutines Dec 11, 2017
@eduard-netsajev
Contributor

Constant rate is a valid strategy as well; at least I haven't had any problems using it for smaller subreddits.

@mattbdean
Owner Author

@eduard-netsajev Agreed, I've added it to the list.
