Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Enhancement]: Pass mvccTS rather than BeginTS for Iterators #37158

Open
1 task done
PwzXxm opened this issue Oct 25, 2024 · 11 comments
Open
1 task done

[Enhancement]: Pass mvccTS rather than BeginTS for Iterators #37158

PwzXxm opened this issue Oct 25, 2024 · 11 comments
Labels
kind/enhancement Issues or changes related to enhancement

Comments

@PwzXxm
Copy link
Contributor

PwzXxm commented Oct 25, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

Use mvccTS rather than BeginTS in the first search call of the Iterator, affecting seach/query iterators.

Why is this needed?

The second next() becomes slower than the first and the third.
One or two magnitude slower.

Anything else?

No response

@PwzXxm PwzXxm added the kind/enhancement Issues or changes related to enhancement label Oct 25, 2024
@xiaofan-luan
Copy link
Collaborator

Is there an existing issue for this?

  • I have searched the existing issues

What would you like to be added?

Use mvccTS rather than BeginTS in the first search call of the Iterator, affecting seach/query iterators.

Why is this needed?

The second next() becomes slower than the first and the third. One or two magnitude slower.

Anything else?

No response

what is beginTS? why does that has impact on performance?

@congqixia
Copy link
Contributor

@xiaofan-luan the background of this issue was to use same mvcc timestamp for each sub requests of SearchIterator
current implementation is using BeginTs() (see code below)

t.result.Results.OutputFields = t.userOutputFields
t.result.CollectionName = t.request.GetCollectionName()
if t.isIterator && t.request.GetGuaranteeTimestamp() == 0 {
// first page for iteration, need to set up sessionTs for iterator
t.result.SessionTs = t.BeginTs()
}

BeginTs() is almost current ts (since first search is quick say about 10ms), then second sub search request is like using strong consistency level and take 200ms +

@PwzXxm
Copy link
Contributor Author

PwzXxm commented Oct 28, 2024

The current implementation is returning the current ts of the first seach/query request and send back to SDK. The sdk captures this and send the following requests with this TS, which results in the second request needs to catch up, could take extra 200ms if unlucky

Ideally, we could use mvcc ts in the first request, so that the following requests does not need to catch up

@PwzXxm
Copy link
Contributor Author

PwzXxm commented Oct 28, 2024

/assign @MrPresent-Han

@PwzXxm
Copy link
Contributor Author

PwzXxm commented Oct 28, 2024

For detailed explanation, see #37180

sre-ci-robot pushed a commit that referenced this issue Oct 28, 2024
issue: #37158

Return the GuaranteeTS so that the subsequent requests following the
correct TS.

BeginTS is the current timestamp when the task is created.
The GuaranteeTS is the one parsed based on both consistency level and
beginTS, in PreExecute of the task on Proxy.
The delegator will wait until GuaranteeTS is met.
In PostExecute of the task on Proxy, the TS of the first iterator
request will be returned to the SDK and add it to the subsequent
requests.
Hence, if the default consistency level is Eventually or Bounded, the
order of TS will be
> Guarantee TS < BeginTS

If it returns the BeginTS, the second request will need to catch up and
result in extra 200ms max of latency, which results in something like

| Call | Latency |
| --- | --- |
| first call on `Next()` | 30ms |
| second call on `Next()` | 210ms |
| third call on `Next()` | 10ms |
| fourth call on `Next()` | 11 ms |
| ... | ... |

where we expect

| Call | Latency |
| --- | --- |
| first call on `Next()` | 30ms |
| second call on `Next()` | 10ms |
| third call on `Next()` | 10ms |
| fourth call on `Next()` | 11 ms |
| ... | ... |

Signed-off-by: Patrick Weizhi Xu <[email protected]>
sre-ci-robot pushed a commit that referenced this issue Oct 28, 2024
issue: #37158 
pr: #37180

Signed-off-by: Patrick Weizhi Xu <[email protected]>
(cherry picked from commit 80b0acd)
@PwzXxm
Copy link
Contributor Author

PwzXxm commented Oct 29, 2024

Consistency Level Eventually and Session may lose results on 2.5/master branch. Fixing.

@PwzXxm
Copy link
Contributor Author

PwzXxm commented Oct 29, 2024

Copy from

For detailed explanation, see #37180

BeginTS is the current timestamp when the task is created.
The GuaranteeTS is the one parsed based on both consistency level and beginTS, in PreExecute of the task on Proxy.
The delegator will wait until GuaranteeTS is met.
In PostExecute of the task on Proxy, the TS of the first iterator request will be returned to the SDK and add it to the subsequent requests.

image

Hence, if the default consistency level is Eventually or Bounded, the order of TS will be

Guarantee TS < BeginTS

If it returns the BeginTS, the second request will need to catch up and result in extra 200ms max of latency, which results in something like

Call Latency
first call on Next() 30ms
second call on Next() 210ms
third call on Next() 10ms
fourth call on Next() 11 ms
... ...

where we expect

Call Latency
first call on Next() 30ms
second call on Next() 10ms
third call on Next() 10ms
fourth call on Next() 11 ms
... ...

@PwzXxm
Copy link
Contributor Author

PwzXxm commented Oct 29, 2024

Hence, we return the MvccTs (tsafe) from the first request. Will deal with multiple shards in a separate issue/PR.

sre-ci-robot pushed a commit that referenced this issue Oct 30, 2024
issue: #37158

Signed-off-by: Patrick Weizhi Xu <[email protected]>
sre-ci-robot pushed a commit that referenced this issue Oct 30, 2024
issue: #37158
pr: #37247

Signed-off-by: Patrick Weizhi Xu <[email protected]>
(cherry picked from commit 5ed7230)
@xiaofan-luan
Copy link
Collaborator

Hence, we return the MvccTs (tsafe) from the first request. Will deal with multiple shards in a separate issue/PR.

Is there a difference between multi shards and single shard or simply some extra work?

@MrPresent-Han
Copy link
Contributor

/unassign

@MrPresent-Han
Copy link
Contributor

/assign PwzXxm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement Issues or changes related to enhancement
Projects
None yet
Development

No branches or pull requests

4 participants