Allocate start timestamps for exact staleness read-only transactions #21717
Comments
The timestamp |
Updated. |
@Yisaer So, can we close this issue? |
We can close this issue after |
Done. |
Background
This is a subtask of #21094.
An exact staleness timestamp bound executes reads at a user-specified timestamp.
Users can specify two kinds of timestamps: absolute and relative.
An absolute timestamp can always be taken as the start timestamp, because it is typically inferred from the commit timestamp of another transaction.
A relative timestamp, however, is a duration relative to the PD leader time rather than the TiDB local time, so obtaining the right start timestamp is more complex.
Infer start timestamps from absolute timestamps
The physical part of a start timestamp can be inferred directly from the user-specified absolute timestamp, and the logical part is 0, just like what setSnapshotTS does.
Since staleness read-only transactions are started explicitly with START TRANSACTION statements, the entrances where timestamps are allocated are obvious. Just follow the code from SimpleExec.executeBegin.
Infer start timestamps from relative timestamps
Relative timestamps need to be transformed to absolute timestamps that conform to PD leader time. However, the timestamps need not be accurate, as the staleness is always up to tens of seconds.
pdOracle.lastTSMap maintains a lastTS map for all transaction scopes. Each lastTS is the last allocated timestamp in the corresponding transaction scope and can be taken as a rough current timestamp. For example, lastTS can be used to obtain a low-resolution timestamp, judge whether a lock is expired, etc.
Note that the difference between lastTS and the real current timestamp is not bounded, but it is typically small. The pdOracle ticks every 2 seconds to refresh lastTS, so it stays fresh even if no timestamp is allocated for a long time.
Since low-resolution timestamps are based on lastTS, we can replace normal timestamps with low-resolution timestamps for exact staleness read-only transactions. To derive the start timestamp, call GetLowResolutionTimestamp / GetLowResolutionTimestampAsync to get a fresh timestamp, then subtract the duration from the physical time, and compose a timestamp again.
Uniqueness of stale timestamps
Some modules assume that the start timestamps of all transactions are distinct and thus take start timestamps as transaction IDs. For the TiDB component, I only found that the lock resolver does so. However, read-only transactions won't lock data, so duplicate stale timestamps should not affect it.
For now, we just skip this part. Please leave a comment if you think it's a problem.
Compatibility with local transactions
Obtaining start timestamps from relative timestamps relies on lastTS, which can be either local or global. It's practical to make them compatible. The rule is simple: get the lastTS that corresponds to the txn_scope of the current session. Actually, it doesn't matter much because stale timestamps are not accurate anyway.
This problem doesn't exist in the absolute timestamp cases because an absolute timestamp isn't specified to be local or global.
Validate stale timestamps
Both absolute timestamps and relative timestamps may be illegal. They may be too old or too new; a timestamp that is too new, for example, can fail with DataIsNotReady.
The safe points in the GC leader and the applied timestamps in followers are both calculated from PD leader timestamps. Thus, to validate relative timestamps, they must be transformed to absolute timestamps first.
It's also acceptable to skip the check and let it fail in the execution phase, so it can be taken as a future improvement task.