Allocate start timestamps for exact staleness read-only transactions #21717

Closed
djshow832 opened this issue Dec 14, 2020 · 6 comments
Labels
component/pd sig/transaction SIG:Transaction type/enhancement The issue or PR belongs to an enhancement.

Comments

@djshow832
Contributor

djshow832 commented Dec 14, 2020

Background

This is a subtask of #21094.

An exact staleness timestamp bound executes reads at a user-specified timestamp.

Users can specify two kinds of timestamps:

  • An absolute timestamp, such as "2020-12-31 00:00:00".
  • A duration relative to the current time, such as "00:00:05".

An absolute timestamp can always be taken as the start timestamp, because it is typically inferred from the commit timestamp of another transaction.

For a relative timestamp, however, the duration is relative to the PD leader time rather than the TiDB local time, so it's more complex to obtain the right start timestamp.

Infer start timestamps from absolute timestamps

The physical part of a start timestamp can be inferred directly from the user-specified absolute timestamp, and the logical part is set to 0, just as setSnapshotTS does.

Since staleness read-only transactions are started explicitly with the START TRANSACTION statements, the entrances where timestamps are allocated are obvious. Just follow the code from SimpleExec.executeBegin.

Infer start timestamps from relative timestamps

Relative timestamps need to be transformed to absolute timestamps that conform to PD leader time. However, the timestamps need not be accurate, as the staleness is always up to tens of seconds.

pdOracle.lastTSMap maintains a lastTS map for all transaction scopes. Each lastTS is the last allocated timestamp in the corresponding transaction scope and can be taken as a rough current timestamp. For example, lastTS can be used to obtain a low-resolution timestamp, judge whether a lock is expired, etc.

Note that the lastTS has these characteristics:

  • It's always behind the real current timestamp because it's the last allocated timestamp.
  • The difference between lastTS and the real current timestamp is not bounded, but typically, the difference is small. The pdOracle ticks every 2 seconds to refresh lastTS to make sure it's fresh even if no timestamp is allocated for a long time.

Since low-resolution timestamps are based on lastTS, we can replace normal timestamps with low-resolution timestamps for exact staleness read-only transactions. There are two ways to obtain them:

  • Reuse GetLowResolutionTimestamp / GetLowResolutionTimestampAsync to get a fresh timestamp, then subtract the duration from the physical time, and compose a timestamp again.
  • Create a new interface to obtain a stale low-resolution timestamp.

Uniqueness of stale timestamps

Some modules assume that the start timestamps of all transactions are distinct and thus take start timestamps as transaction IDs. For the TiDB component, I only found the lock resolver does so. However, read-only transactions won't lock data.

For now, we skip this part. Please leave a comment if you think it's a problem.

Compatibility with local transactions

Obtaining start timestamps from relative timestamps relies on lastTS, which can be either local or global. It's practical to make them compatible. The rule is simple: get the lastTS which corresponds to the txn_scope in the current session. Actually, it doesn't matter because stale timestamps are not accurate anyway.
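The per-scope lookup rule can be pictured with the following sketch. The lastTSCache type and its methods are hypothetical stand-ins for pdOracle.lastTSMap, and the scope names used in the usage note ("global", "dc-1") are assumed examples of txn_scope values.

```go
package main

import "sync"

// lastTSCache is a hypothetical stand-in for a per-scope lastTS map: it
// remembers the last allocated timestamp for each transaction scope.
type lastTSCache struct {
	mu      sync.RWMutex
	byScope map[string]uint64
}

func newLastTSCache() *lastTSCache {
	return &lastTSCache{byScope: make(map[string]uint64)}
}

// update records ts for the scope, but never moves lastTS backwards.
func (c *lastTSCache) update(scope string, ts uint64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if ts > c.byScope[scope] {
		c.byScope[scope] = ts
	}
}

// get returns the lastTS of the session's txn_scope, and false if no
// timestamp has been allocated in that scope yet.
func (c *lastTSCache) get(scope string) (uint64, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	ts, ok := c.byScope[scope]
	return ts, ok
}
```

A session whose txn_scope is "dc-1" would call get("dc-1"), while a session in the global scope would call get("global"); the stale timestamp is then derived from whichever lastTS is returned.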

This problem doesn't exist in the absolute timestamp cases because the absolute timestamp isn't specified to be local or global.

Validate stale timestamps

Both absolute timestamps and relative timestamps may be invalid: they may be too old or too new.

  • Too old: the timestamp is older than the safe point. In this case, the transaction will fail when the data is found erased.
  • Too new: the timestamp is even larger than the current time. In this case, the transaction will fail when the follower returns DataIsNotReady.

The safe point on the GC leader and the applied timestamps on followers are both calculated from PD leader timestamps. Thus, to validate relative timestamps, they must first be transformed to absolute timestamps.
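A minimal sketch of the two checks, assuming all three values are already absolute timestamps on the PD leader timeline. The function validateStaleTS and the error values are hypothetical names for this example, and how safePoint and currentTS are obtained is left out.

```go
package main

import "errors"

// Hypothetical error values for the two failure modes described above.
var (
	errTooOld = errors.New("stale timestamp is older than the GC safe point")
	errTooNew = errors.New("stale timestamp is newer than the current timestamp")
)

// validateStaleTS checks a start timestamp against the safe point and a
// (possibly rough) current timestamp. All arguments are assumed to be
// comparable timestamps derived from PD leader time.
func validateStaleTS(startTS, safePoint, currentTS uint64) error {
	if startTS < safePoint {
		return errTooOld
	}
	if startTS > currentTS {
		return errTooNew
	}
	return nil
}
```

Because currentTS may itself be a low-resolution value such as lastTS, a check like this can reject a timestamp that is only slightly ahead of it; whether that strictness is wanted is a design decision this sketch does not settle.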

It's also acceptable to skip the check and let it fail in the execution phase, so it can be taken as a future improvement task.

@djshow832 djshow832 added the type/enhancement The issue or PR belongs to an enhancement. label Dec 14, 2020
@nolouch
Member

nolouch commented Dec 14, 2020

Uniqueness of stale timestamps

The timestamps start_ts and commit_ts serve as versions in TiKV MVCC, so each version should be unique. A read-only transaction, however, can read at any version (timestamp); for example, multiple transactions can read repeatedly at the same timestamp, the same as in https://docs.pingcap.com/tidb/stable/read-historical-data.

@djshow832
Contributor Author

Uniqueness of stale timestamps

The timestamps start_ts and commit_ts serve as versions in TiKV MVCC, so each version should be unique. A read-only transaction, however, can read at any version (timestamp); for example, multiple transactions can read repeatedly at the same timestamp, the same as in https://docs.pingcap.com/tidb/stable/read-historical-data.

Updated.

@Yisaer
Contributor

Yisaer commented Dec 25, 2020

I think #21713 and #21967 have solved most of the tasks here.

Only "Validate stale timestamps" still needs to be solved.

@nolouch
Member

nolouch commented Dec 25, 2020

@Yisaer So, can we close this issue?

@Yisaer
Contributor

Yisaer commented Dec 25, 2020

We can close this issue after "Validate stale timestamps" is done. It's not difficult.

@nolouch
Member

nolouch commented Dec 13, 2021

Done.

@nolouch nolouch closed this as completed Dec 13, 2021