Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sink-to-mysql (CDC) missing explicit timezone session variable may cause wrong TIMESTAMP value #10393

Closed
zhangjinpeng87 opened this issue Dec 29, 2023 · 2 comments · Fixed by #10423
Assignees
Labels
type/enhancement The issue or PR belongs to an enhancement.

Comments

@zhangjinpeng87
Copy link
Contributor

zhangjinpeng87 commented Dec 29, 2023

How TiDB/MySQL handle timezone for TIMESTAMP columns

Like MySQL, TiDB's TIMESTAMP columns will be internally transformed to UTC timezone value and stored in TiDB. When users in different regions want to retrieve the TIMESTAMP columns values, they should set their time_zone session value to read the correct value. The whole progress is:

  • Write TIMESTAMP columns with local timezone to TiDB
  • TiDB transform the TIMESTAMP to UTC timezone value and store it
  • User use set different time_zone session variable to read the TIMESTAMP column, TiDB internally transform the UTC timezone column value to time_zone session variable specified timezone value and return to user, so different users in different time zone can see correct TIMESTAMP value

Using CDC replicate data from TiDB to MySQL/TiDB

When using CDC replicate data from TiDB to MySQL/TiDB, if upstream TiDB, CDC and downstream MySQL/TiDB are all deployed in the same local timezone, there is no issue. But if CDC and downstream MySQL/TiDB are deployed in different local timezones, the replicated value of TIMEZONE columns may be wrong.

This is because CDC received upstream TiDB row changes, and CDC use its local timezone (system_timezone) to decode these row values, and then execute SQL to downstream MySQL/TiDB without explicitly set time_zone session variable. CDC should explicitly set its session variable time_zone here.

See related code here:
https://github.com/pingcap/tiflow/blob/master/cdc/processor/processor.go#L584
https://github.com/pingcap/tiflow/blob/master/cdc/entry/mounter_group.go#L70
https://github.com/pingcap/tidb/blob/master/pkg/util/rowcodec/decoder.go#L143

Updates 12/29/2023

Seems https://github.com/pingcap/tiflow/blob/master/pkg/sink/mysql/db_helper.go#L116 set the timezone for all connections. But mounter always use server-config's timezone to decode row data changes, the sink also should use the same timezone to write to MySQL.

@zhangjinpeng87
Copy link
Contributor Author

zhangjinpeng87 commented Dec 29, 2023

Seems https://github.com/pingcap/tiflow/blob/master/pkg/sink/mysql/db_helper.go#L116 set the timezone for all connections. But mounter always use server-config's timezone to decode row data changes, the sink also should use the same timezone to write to MySQL.

@3AceShowHand
Copy link
Contributor

3AceShowHand commented Feb 2, 2024

I would like to implement this functionality as soon as possible.

I found that time-zone also may affect row-level checksum functionality when using Kafka sink.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement The issue or PR belongs to an enhancement.
Projects
2 participants