Add opaque cursor-based pagination #665

Closed
matthew-white opened this issue Nov 12, 2022 · 1 comment · Fixed by #934

Comments

matthew-white commented Nov 12, 2022

There are a couple of ways to fetch submissions in chunks:

  • Use the OData query parameters $top and $skip. However, it is difficult to use those to fetch only new submissions, because submissions are sorted by createdAt descending, not ascending.
  • As an alternative, each time after fetching submissions, store the latest createdAt timestamp. The next time, use the $filter OData query parameter to fetch submissions after that timestamp. However, this runs into the precision problem tracked in "OData filter: date/time fields more precise than millisecond" (#459).
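
To illustrate the second bullet, here is a minimal sketch of how a precision mismatch can make a "newer than the last timestamp" filter return a submission the client already has. It assumes the stored createdAt carries microseconds while the client only sees milliseconds. The timestamp value and the `truncate_to_ms` helper are made up for illustration and are not taken from #459.

```python
from datetime import datetime, timezone

# Assumption: the stored value has microsecond precision.
stored = datetime(2022, 11, 12, 10, 0, 0, 123456, tzinfo=timezone.utc)


def truncate_to_ms(ts: datetime) -> datetime:
    """Drop everything below the millisecond (hypothetical client-side view)."""
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)


# The value the client stores as "latest createdAt seen so far".
seen_by_client = truncate_to_ms(stored)

# A filter of the form "createdAt gt <latest seen>" still matches the stored
# row, because the stored value is slightly later than the truncated one.
refetched = stored > seen_by_client
```

Under these assumptions `refetched` is true, so the same submission would come back on the next fetch.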

As an alternative to both of these, we could implement opaque cursor-based pagination. Quoting from https://graphql.org/learn/pagination/:

Especially if the cursors are opaque, either offset or ID-based pagination can be implemented using cursor-based pagination (by making the cursor the offset or the ID), and using cursors gives additional flexibility if the pagination model changes in the future. As a reminder that the cursors are opaque and that their format should not be relied upon, we suggest base64 encoding them.

@lognaturel wrote about this concept:

I think usually this opaque cursor concept is used for pagination.

Aggregate/Briefcase use it as a “query resume point”: https://docs.getodk.org/briefcase-api/#get-view-submissionlist. That’s pagination but with the page being “whatever is available now”

[Opacity] prevents clients from trying to build their own cursors and enforces that they just pass them on to the next request.
So that you have the flexibility to change the actual contents of your cursor.

I imagine we could use createdAt as a string as a basis for a cursor to get around the node precision loss issue [described in #459].

This proposal for using $skiptoken for server-side paging is similar to what I have in mind.
(I don’t really understand the context for this but the general description of the problem it solves is helpful I think)
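
As a rough sketch of the opaque-cursor idea quoted above: the server could serialize whatever state it needs (here createdAt kept as a string, so no precision is lost) and base64-encode it before handing it to the client. The function names and the JSON payload shape are assumptions for illustration, not an existing API.

```python
import base64
import json


def encode_cursor(state: dict) -> str:
    """Serialize server-side paging state into an opaque token."""
    payload = json.dumps(state, separators=(",", ":"), sort_keys=True)
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_cursor(token: str) -> dict:
    """Recover the paging state from a token produced by encode_cursor."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))


# Hypothetical state: the timestamp stays a string with full precision.
state = {"createdAt": "2022-11-12T10:00:00.123456Z"}
token = encode_cursor(state)
round_tripped = decode_cursor(token)
```

The client only passes `token` back on the next request. Since it never parses the token, the payload could later change (for example to add an id) without breaking clients.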

At the moment, the main resource for which cursor-based pagination is of interest is submissions. However, at some point soon, we will face similar questions around entities.

@matthew-white matthew-white added the enhancement New feature or behavior label Nov 12, 2022
@matthew-white
Member Author

This came up again in the context of entities. It'd be useful to have cursor-based pagination for the infinite scroll of the entities table (getodk/central-frontend#741). Since we're planning to support entity deletion (#803), we can't use the same strategy for that infinite scroll as we do for the submissions table, which fetches chunks based on offsets. Deletion + offset-based pagination could result in entities being skipped in the infinite scroll:

  • Navigate to the entities page. There are 500 total entities. 250 entities are shown.
  • Delete the latest entity.
  • Scroll to the bottom of the page, fetching the next chunk of entities ($skip=250).
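  • Even though there are 250 entities left to fetch, only 249 entities are returned: the deletion shifted every remaining entity up by one position, so the entity that was 251st in the list is now 250th and $skip=250 passes over it.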
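
The steps above can be reproduced with a small in-memory simulation. Integer ids stand in for entities sorted newest first, and `fetch` is a hypothetical stand-in for a request with $skip and $top.

```python
PAGE = 250

# ids 500..1 stand in for 500 entities sorted by createdAt descending.
entities = list(range(500, 0, -1))


def fetch(skip: int, top: int) -> list:
    """Offset-based chunk, like $skip/$top against the current table."""
    return entities[skip:skip + top]


shown = fetch(0, PAGE)          # initial page load: 250 entities
entities.remove(500)            # delete the latest entity
next_chunk = fetch(PAGE, PAGE)  # scroll to the bottom: $skip=250

# Entities that still exist but were never returned to the client.
never_seen = set(entities) - set(shown) - set(next_chunk)
```

In this simulation `next_chunk` holds 249 ids and `never_seen` is `{250}`: the entity that moved from position 251 to position 250 after the deletion.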

Instead of using an offset, the infinite scroll could filter based on entities."createdAt", fetching the latest 250 entities created at or before the earliest entity fetched so far. However, that might not work with bulk entity creation (#804), because potentially there would be more than 250 entities with the same timestamp, causing the infinite scroll to try to fetch the same chunk over and over. The infinite scroll could filter based on entities."createdAt" and also use an offset, but it'd be more convenient to use a cursor.
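
A small sketch of the bulk-creation failure mode, assuming 600 entities that share one timestamp. The data and the `fetch_at_or_before` helper are made up for illustration.

```python
PAGE = 250

# Hypothetical bulk upload: 600 entities with an identical createdAt,
# already sorted newest first (by id, since the timestamps tie).
rows = [{"id": i, "createdAt": "2023-05-01T00:00:00.000Z"}
        for i in range(600, 0, -1)]


def fetch_at_or_before(timestamp: str, top: int) -> list:
    """Latest `top` entities created at or before `timestamp`."""
    matching = [r for r in rows if r["createdAt"] <= timestamp]
    return matching[:top]


first = fetch_at_or_before("9999-12-31T23:59:59.999Z", PAGE)
earliest = min(r["createdAt"] for r in first)

# Filtering on the earliest timestamp seen so far returns the same chunk.
second = fetch_at_or_before(earliest, PAGE)
stuck = second == first
```

Here `stuck` is true: the timestamp alone cannot tell rows in the batch apart, so the scroll never advances past the first 250.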

One idea for the cursor is for it to filter on entities."createdAt" and entities.id under the hood. Those are the two columns that are used to sort entities. The cursor would specify createdAt and id opaquely so that we have flexibility to change how the cursor works. A cursor could be passed to Backend to fetch entities after the cursor. If we have an index on createdAt, then we should see some performance improvements as well. (Side thoughts: Should the cursor specify entities.uuid instead of entities.id? Could it even just specify entities.uuid, and Backend will look up createdAt?)
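
One possible shape for the createdAt-plus-id idea, sketched with an in-memory SQLite table standing in for the real database. The schema, index name, sample rows, and `fetch_page` function are all assumptions for illustration. The cursor is shown as a plain tuple for readability; in practice it would be wrapped opaquely before being returned to the client.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE entities (id INTEGER PRIMARY KEY, "createdAt" TEXT NOT NULL)')
# Hypothetical index covering both sort columns.
conn.execute('CREATE INDEX entities_created_at_id ON entities ("createdAt", id)')

# Five bulk-created rows share a timestamp; two later rows do not.
sample = [(i, "2023-05-01T00:00:00.000000Z") for i in range(1, 6)]
sample += [(6, "2023-05-02T00:00:00.000000Z"), (7, "2023-05-03T00:00:00.000000Z")]
conn.executemany("INSERT INTO entities VALUES (?, ?)", sample)


def fetch_page(cursor, limit):
    """Return (rows, next_cursor). cursor is None or (createdAt, id) of the last row seen."""
    if cursor is None:
        sql = ('SELECT id, "createdAt" FROM entities '
               'ORDER BY "createdAt" DESC, id DESC LIMIT ?')
        params = (limit,)
    else:
        created_at, last_id = cursor
        # Strictly "after" the cursor in (createdAt DESC, id DESC) order.
        sql = ('SELECT id, "createdAt" FROM entities '
               'WHERE "createdAt" < ? OR ("createdAt" = ? AND id < ?) '
               'ORDER BY "createdAt" DESC, id DESC LIMIT ?')
        params = (created_at, created_at, last_id, limit)
    page = conn.execute(sql, params).fetchall()
    next_cursor = (page[-1][1], page[-1][0]) if page else None
    return page, next_cursor


def fetch_all_ids(limit):
    """Walk every page and collect ids in the order returned."""
    ids, cursor = [], None
    while True:
        page, next_cursor = fetch_page(cursor, limit)
        if not page:
            return ids
        ids.extend(row[0] for row in page)
        cursor = next_cursor
```

Because the cursor points at a row rather than an offset, deleting an entity that was already fetched does not shift the next page, and ties on createdAt are broken by id.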

@github-project-automation github-project-automation bot moved this to 🕒 backlog in ODK Central May 1, 2023
@sadiqkhoja sadiqkhoja self-assigned this Jul 18, 2023
@sadiqkhoja sadiqkhoja moved this from 🕒 backlog to ✏️ in progress in ODK Central Jul 18, 2023
@github-project-automation github-project-automation bot moved this from ✏️ in progress to ✅ done in ODK Central Sep 12, 2023