Reporting: Lifetime stats #1472

Open
dyaffe opened this issue May 10, 2024 · 2 comments

dyaffe commented May 10, 2024

Goal
Understand how a data flow is progressing and whether / how much a materialization is behind a capture.

Proposal

  • We don’t need too much detail, just a running total for how much a collection has read / written.
  • If possible, resetting that on truncation when we support truncation.
  • Showing a % completed in the collections view and potentially materializations view using this.
@jgraettinger

quick thoughts:

  • Bucket life-cycle policies remove data -- that means, if I create a materialization for a collection after the fact, I will never see as many documents read as have been written, and a % completion metric can never be accurate. This seems a likely potential source of confusion.

  • Is this really just a larger grain of time than "month"? "year"?

  • If we introduce a "compaction" feature for a collection, that also could reduce the number of docs / bytes I need to actually read -- though compaction can likely be framed as a truncation, which makes them the same problem.

  • These smell a bit like gauges (rather than counters) that are tracked and reported by tasks -- "I've captured this many docs / bytes since the binding was last truncated" or "I've read this many docs / bytes since I last saw a truncation for this binding"
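
To make the gauge framing concrete, here is a minimal sketch of a per-binding gauge that resets on truncation, plus the % completed it would feed. Every name here (`BindingGauge`, `observe`, `truncate`, `percent_complete`) is hypothetical and not part of any existing API, and treating "nothing written" as 100% is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BindingGauge:
    """Hypothetical gauge: docs / bytes seen since the last truncation."""
    docs: int = 0
    byte_count: int = 0

    def observe(self, docs: int, byte_count: int) -> None:
        # Accumulate what the task captured (or read) in one transaction.
        self.docs += docs
        self.byte_count += byte_count

    def truncate(self) -> None:
        # A truncation (or a compaction framed as one) resets the gauge.
        self.docs = 0
        self.byte_count = 0


def percent_complete(read: BindingGauge, written: BindingGauge) -> float:
    """Percent of written docs that have been read, capped at 100."""
    if written.docs == 0:
        return 100.0  # assumption: nothing written counts as caught up
    return min(100.0, 100.0 * read.docs / written.docs)


captured = BindingGauge()
captured.observe(docs=1000, byte_count=50_000)

materialized = BindingGauge()
materialized.observe(docs=600, byte_count=30_000)

print(percent_complete(materialized, captured))
```

If a bucket life-cycle policy has already removed 400 of those 1000 docs before the materialization was created, the reader tops out at 600 and the figure stays at 60%, which is the confusion described in the first bullet.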

@jgraettinger

Discussed options:

  • Derivations / Materializations can self-report the maximum publication time they've read through in each transaction.
  • Derivations / Materializations can self-report the observed delta between journal read offset and journal write head (summed across all journals).
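
A rough sketch of the second option, assuming a task can see a read offset and a write head (both in bytes) for each journal it reads. `JournalProgress` and `total_lag_bytes` are hypothetical names used only for illustration, and clamping negative deltas to zero is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class JournalProgress:
    """Hypothetical snapshot of one journal as seen by a reading task."""
    name: str
    read_offset: int  # byte offset the task has read through
    write_head: int   # byte offset of the journal's current write head


def total_lag_bytes(journals: Iterable[JournalProgress]) -> int:
    """Sum of (write head - read offset) across all journals."""
    # Clamp at zero in case a stale write head is observed (assumption).
    return sum(max(0, j.write_head - j.read_offset) for j in journals)


journals = [
    JournalProgress("part-a", read_offset=100, write_head=150),
    JournalProgress("part-b", read_offset=200, write_head=200),
]

print(total_lag_bytes(journals))
```

Because this measures only the distance to the write head, it does not depend on how much data life-cycle policies have removed from the start of a journal.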
