-
Notifications
You must be signed in to change notification settings - Fork 514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Synchronous iterator #2165
Synchronous iterator #2165
Conversation
Signed-off-by: Joe Elliott <[email protected]>
@mdisibio Will we need doc for this? |
Can we resurrect this PR? I keep seeing traces like this: Issue here: #2336 This PR would eliminate the possibility that the large gaps seen here are due to synchronization. It would also hopefully allow for easier performance analysis by simplifying this bit of code. |
Yes. A couple updates:
|
no concerns. do what makes sense to you. is the PR that risky? or are the gains not very clear?
👍 |
…pping out of a partially scanned page, and fix optimization to skip more pages in SeekTo
I don't think so. This is a performance experiment and if it pans out we will make it permanent. No user-facing changes.
A little of both but mostly the former. Concerned about correctness because although our test suite is fairly comprehensive, there are execution paths that are hard to cover synthetically (like jumping between pages, etc). The gains are shaping up better after the last batch of changes, but still not 100% beneficial. Here are some benchmarks on the wider traceql test suite. Executed by
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. some small comments, but excited to get this one in.
var _ Iterator = (*SyncIterator)(nil) | ||
|
||
func NewSyncIterator(ctx context.Context, rgs []pq.RowGroup, column int, columnName string, readSize int, filter Predicate, selectAs string) *SyncIterator { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx not used? it looks like we're losing the columnIterator.iterate
span with this change which i find somewhat useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, the span and inspected/kept tags are useful. It's a little atypical but we could start a span in NewSyncIterator
and then finish it in iter.Close()
. Any other ideas? There's no longer an over-arching method like iterate
, and Next
/SeekTo
are too fine-grained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oof, good point. our traces need attention generally and this PR is too obvious a win to block.
fine with the "NewSyncIterator" approach to see how it goes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Tested it out and it works well enough for a first pass or if we have a better idea. The side-effect is that the span starts as soon as the iter is created, instead of on the first "pull" like the async one.
Pushed a small fix to |
What this PR does:
This PR creates a new synchronous version of the ColumnIterator. Current tests are passing and performance is looking better, but needs a bit more polish and testing. Need to run it against the enhanced set of tests in #2119.
Which issue(s) this PR fixes:
n/a
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]