From 4738cda67e2ff9538f068db6023dc13ecea10f7f Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 16:03:41 +0800 Subject: [PATCH 01/15] adding tz env --- docs/adding_tz_env.md | 57 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+) create mode 100644 docs/adding_tz_env.md diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md new file mode 100644 index 0000000000000..e2878e23f912f --- /dev/null +++ b/docs/adding_tz_env.md @@ -0,0 +1,57 @@ +Proposal: adding TZ as an environment variable as a necessary condition before starting TiDB + +# Proposal: adding TZ as an environment variable as a necessary condition before starting TiDB + +- Author(s): [Zhexuan Yang](www.github.com/zhexuany) +- Last updated: 2018/09/09 +- Discussion at: Not applicable + +## Abstract + +When it comes to time-related calculation, it is hard for the distributed system. This proposal tries to solve the system's timezone issue. In this protocol, we'll evaluate the advantage and disadvantage of different approaches and choose the optimal one. The impact of this proposal is changing the default timezone name when we push down to TiKV. Before this proposal, the default timezone name pushed down to TiKV is `System`. After this, TiDB evaluates system's timezone name and then pushes down the real name rather than `System`. + +## Background + +After we solved the daylight saving time issue, we found the performance delegation of TiKV side. Thanks for the investigation done by engineers from TiKV. The root cause of such performance delegation is that TiKV infers `System` timezone name via a third party lib, which calls a syscall and costs a lot. In our internal benchmark system, after [this PR](https://github.com/pingcap/tidb/search?q=daylight+saving+time&type=Issues), our codebase is 1000 times slower than before. We have to address this. + +## Proposal + +Firstly, we need to introduce the `TZ` environment. In POSIX system, the value of `TZ` variable can be one of the following three formats. A detailed description can be found in [this link](http://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html) + + * std offset + * std offset dst [offset], start[/time], end[/time] + * :characters + +The std means the IANA timezone name; the offset means timezone offset; the dst indicates the leading timezone having daylight saving time. + +In our case, which I mean both `TiDB` and `TiKV`, we need care the first and third formats. For answering why we do not need the second format, we need to review how Golang evaluates timezone. In `time` package, the method `LoadLocation` reads tzData from pre-specified sources(directories may contain tzData) and then builds `time. Location` from such tzData which already contains daylight saving time information. + +In this proposal, we suggest setting `TZ` to a valid IANA timezone name which can be read from `DB`. If `TiDB` can't get `TZ` or the supply of `TZ` is invalid, `TiDB` just quits and prints an understandable message telling the user you should set `TZ` properly. Setting `TZ` can be done in our `tidb-ansible` project. It is the configuration in config file and it will never go wrong. + +The positive side of this change is resolving performance delegation issue to avoid any potential time-related calculation. + +The negative side is just adding a config item which is a very small matter and the user probably does not care it if we can take care of it and more importantly guarantee the correctness. + + +## Rationale + +We tried to read system timezone name by checking the path of the soft link of `/etc/localtime` but, sadly, failed. The failed case is docker. In docker image, it copies the real timezone file and links to `/usr/share/zoneinfo/utc`. The timezone data is correct but the path is not. Regarding of `UTC`, Golang just returns `UTC` instance and will not further read tzdata from sources. I guess that why docker chooses to link real timezone file to `/usr/share/zoneinfo/utc`. + +The impact of this proposed change affects the deploying logic of TiDB, which is just a small matter. + +## Compatibility + +It does not have compatibility issue as long as the user deploys by `tidb-ansible`. We may mention this in our release-node and the message printed before tidb quits, which must be easy to understand. + + +## Implementation + +The implementation is relatively easy. We just get `TZ` environment from system and check whether it is valid or not. If it is invalid, TiDB evaluates the path of soft link of `/etc/localtime`. +In addition, a warning message needs to be printed indicating user has to set `TZ` variable. For example, if `/etc/localtime` links to /usr/share/zoneinfo/Asia/Shanghai, then timezone name should be `Asia/Shangahi`. + +In order to ensure the uniqueness of global timezone, we need to write timezone name into `mysql.tidb` under column `system_tz`. + +## Open issues (if applicable) + +PR of this proposal: https://github.com/pingcap/tidb/pull/7638/files +PR of change TZ loading logic of golang: https://github.com/golang/go/pull/27570 From a8e2c784d3b447e731301d745ffaf4dcd1d01d29 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 17:21:08 +0800 Subject: [PATCH 02/15] change title and context --- docs/adding_tz_env.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index e2878e23f912f..99b699a11b513 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -1,4 +1,4 @@ -Proposal: adding TZ as an environment variable as a necessary condition before starting TiDB +Proposal: adding TZ as an environment variable # Proposal: adding TZ as an environment variable as a necessary condition before starting TiDB @@ -26,7 +26,7 @@ The std means the IANA timezone name; the offset means timezone offset; the dst In our case, which I mean both `TiDB` and `TiKV`, we need care the first and third formats. For answering why we do not need the second format, we need to review how Golang evaluates timezone. In `time` package, the method `LoadLocation` reads tzData from pre-specified sources(directories may contain tzData) and then builds `time. Location` from such tzData which already contains daylight saving time information. -In this proposal, we suggest setting `TZ` to a valid IANA timezone name which can be read from `DB`. If `TiDB` can't get `TZ` or the supply of `TZ` is invalid, `TiDB` just quits and prints an understandable message telling the user you should set `TZ` properly. Setting `TZ` can be done in our `tidb-ansible` project. It is the configuration in config file and it will never go wrong. +In this proposal, we suggest setting `TZ` to a valid IANA timezone name which can be read from `TiDB`. If `TiDB` can't get `TZ` or the supply of `TZ` is invalid, `TiDB` just fallback to evaluates the path of the soft link of `/etc/localtime`. An warning message telling the user you should set `TZ` properly will be printed. Setting `TZ` can be done in our `tidb-ansible` project. The value of `localStr` should initialized at `TiDB`'s bootstrap stage in order to ensure the uniqueness of gloabl timezone name. If system timezone can not be read from `mysql.tidb`, `UTC` will be used as default timezone name. The positive side of this change is resolving performance delegation issue to avoid any potential time-related calculation. @@ -35,7 +35,7 @@ The negative side is just adding a config item which is a very small matter and ## Rationale -We tried to read system timezone name by checking the path of the soft link of `/etc/localtime` but, sadly, failed. The failed case is docker. In docker image, it copies the real timezone file and links to `/usr/share/zoneinfo/utc`. The timezone data is correct but the path is not. Regarding of `UTC`, Golang just returns `UTC` instance and will not further read tzdata from sources. I guess that why docker chooses to link real timezone file to `/usr/share/zoneinfo/utc`. +We tried to read system timezone name by checking the path of the soft link of `/etc/localtime` but, sadly, failed. The failed case is docker. In docker image, it copies the real timezone file and links to `/usr/share/zoneinfo/utc`. The timezone data is correct but the path is not. Regarding of `UTC`, Golang just returns `UTC` instance and will not further read tzdata from sources. The impact of this proposed change affects the deploying logic of TiDB, which is just a small matter. From f3c958b6ad96bb765f74fa140198d5e6f48be99a Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 17:31:28 +0800 Subject: [PATCH 03/15] fix a wrong word --- docs/adding_tz_env.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index 99b699a11b513..ead6546acc3fb 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -12,7 +12,7 @@ When it comes to time-related calculation, it is hard for the distributed system ## Background -After we solved the daylight saving time issue, we found the performance delegation of TiKV side. Thanks for the investigation done by engineers from TiKV. The root cause of such performance delegation is that TiKV infers `System` timezone name via a third party lib, which calls a syscall and costs a lot. In our internal benchmark system, after [this PR](https://github.com/pingcap/tidb/search?q=daylight+saving+time&type=Issues), our codebase is 1000 times slower than before. We have to address this. +After we solved the daylight saving time issue, we found the performance degradation of TiKV side. Thanks for the investigation done by engineers from TiKV. The root cause of such performance delegation is that TiKV infers `System` timezone name via a third party lib, which calls a syscall and costs a lot. In our internal benchmark system, after [this PR](https://github.com/pingcap/tidb/search?q=daylight+saving+time&type=Issues), our codebase is 1000 times slower than before. We have to address this. ## Proposal From 71b643bc55c54affbc67c7e0af157dc2caf0025c Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 18:00:11 +0800 Subject: [PATCH 04/15] refine doc --- docs/adding_tz_env.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index ead6546acc3fb..f4cb93c86792f 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -26,9 +26,11 @@ The std means the IANA timezone name; the offset means timezone offset; the dst In our case, which I mean both `TiDB` and `TiKV`, we need care the first and third formats. For answering why we do not need the second format, we need to review how Golang evaluates timezone. In `time` package, the method `LoadLocation` reads tzData from pre-specified sources(directories may contain tzData) and then builds `time. Location` from such tzData which already contains daylight saving time information. -In this proposal, we suggest setting `TZ` to a valid IANA timezone name which can be read from `TiDB`. If `TiDB` can't get `TZ` or the supply of `TZ` is invalid, `TiDB` just fallback to evaluates the path of the soft link of `/etc/localtime`. An warning message telling the user you should set `TZ` properly will be printed. Setting `TZ` can be done in our `tidb-ansible` project. The value of `localStr` should initialized at `TiDB`'s bootstrap stage in order to ensure the uniqueness of gloabl timezone name. If system timezone can not be read from `mysql.tidb`, `UTC` will be used as default timezone name. +In this proposal, we suggest setting `TZ` to a valid IANA timezone name which can be read from `TiDB` later. If `TiDB` can't get `TZ` or the supply of `TZ` is invalid, `TiDB` just falls back to evaluates the path of the soft link of `/etc/localtime`. In addition, an warning message telling the user you should set `TZ` properly will be printed. Setting `TZ` can be done in our `tidb-ansible` project, it is also can be done at user side by `export TZ="Asia/Shanghai"`. If both of them are failed, `TiDB` will use `UTC` as timezone name. -The positive side of this change is resolving performance delegation issue to avoid any potential time-related calculation. +The value of `localStr` should be initialized by method `initLocalStr()`at `TiDB`'s bootstrap stage. `localStr` will be stored in `mysql.tidb` under `system_tz` column. If system timezone can not get valid IANA timezone name from `mysql.tidb`, `UTC` will be used as default timezone name. + +The positive side of this change is to resolve performance degradation issue and to ensure the uniqueness of gloabl timezone name in multiple `TiDB` instances. The negative side is just adding a config item which is a very small matter and the user probably does not care it if we can take care of it and more importantly guarantee the correctness. @@ -47,9 +49,9 @@ It does not have compatibility issue as long as the user deploys by `tidb-ansibl ## Implementation The implementation is relatively easy. We just get `TZ` environment from system and check whether it is valid or not. If it is invalid, TiDB evaluates the path of soft link of `/etc/localtime`. -In addition, a warning message needs to be printed indicating user has to set `TZ` variable. For example, if `/etc/localtime` links to /usr/share/zoneinfo/Asia/Shanghai, then timezone name should be `Asia/Shangahi`. +In addition, a warning message needs to be printed indicating user has to set `TZ` variable. For example, if `/etc/localtime` links to /usr/share/zoneinfo/Asia/Shanghai, then timezone name should be `Asia/Shangahi`. -In order to ensure the uniqueness of global timezone, we need to write timezone name into `mysql.tidb` under column `system_tz`. +In order to ensure the uniqueness of global timezone across multiple `TiDB` instances, we need to write timezone name into `mysql.tidb` under column `system_tz`. ## Open issues (if applicable) From 419a8da62cbcde049bbbf9834e815c5d13e25db9 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 21:19:20 +0800 Subject: [PATCH 05/15] refine writing --- docs/adding_tz_env.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index f4cb93c86792f..471dae666897a 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -8,12 +8,14 @@ Proposal: adding TZ as an environment variable ## Abstract -When it comes to time-related calculation, it is hard for the distributed system. This proposal tries to solve the system's timezone issue. In this protocol, we'll evaluate the advantage and disadvantage of different approaches and choose the optimal one. The impact of this proposal is changing the default timezone name when we push down to TiKV. Before this proposal, the default timezone name pushed down to TiKV is `System`. After this, TiDB evaluates system's timezone name and then pushes down the real name rather than `System`. +When it comes to time-related calculation, it is hard for the distributed system. This proposal tries to resolve two problems: 1. timezone maybe inconsisten across multiple `TiDB` instance, 2. performance degradation casued by pushing `System` down to `TiKV`. The impact of this proposal is changing the way of `TiDB` inferring system's timezone name. Before this proposal, the default timezone name pushed down to TiKV is `System` when session's timezone is not set. After this, TiDB evaluates system's timezone name via `TZ` environment variable and then pushes the real name down to `TiKV` rather than `System`. ## Background After we solved the daylight saving time issue, we found the performance degradation of TiKV side. Thanks for the investigation done by engineers from TiKV. The root cause of such performance delegation is that TiKV infers `System` timezone name via a third party lib, which calls a syscall and costs a lot. In our internal benchmark system, after [this PR](https://github.com/pingcap/tidb/search?q=daylight+saving+time&type=Issues), our codebase is 1000 times slower than before. We have to address this. +Another problem needs also to be addressed is the the poentially incosistent timezone name across multiple `TiDB` instances. `TiDB` instances may reside at different timezone which could produce incorrect calculation when it comes to time-related calculation. Just getting `TiDB`'s sytem timezone could be be broken. We need find a way to ensure the uniqueness of global timezone name across multiple `TiDB`'s timezone name. + ## Proposal Firstly, we need to introduce the `TZ` environment. In POSIX system, the value of `TZ` variable can be one of the following three formats. A detailed description can be found in [this link](http://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html) @@ -24,22 +26,18 @@ Firstly, we need to introduce the `TZ` environment. In POSIX system, the value o The std means the IANA timezone name; the offset means timezone offset; the dst indicates the leading timezone having daylight saving time. -In our case, which I mean both `TiDB` and `TiKV`, we need care the first and third formats. For answering why we do not need the second format, we need to review how Golang evaluates timezone. In `time` package, the method `LoadLocation` reads tzData from pre-specified sources(directories may contain tzData) and then builds `time. Location` from such tzData which already contains daylight saving time information. +In our case, which means both `TiDB` and `TiKV`, we need care the first and third formats. For answering why we do not need the second format, we need to review how Golang evaluates timezone. In `time` package, the method `LoadLocation` reads tzData from pre-specified sources(directories may contain tzData) and then builds `time. Location` from such tzData which already contains daylight saving time information. In this proposal, we suggest setting `TZ` to a valid IANA timezone name which can be read from `TiDB` later. If `TiDB` can't get `TZ` or the supply of `TZ` is invalid, `TiDB` just falls back to evaluates the path of the soft link of `/etc/localtime`. In addition, an warning message telling the user you should set `TZ` properly will be printed. Setting `TZ` can be done in our `tidb-ansible` project, it is also can be done at user side by `export TZ="Asia/Shanghai"`. If both of them are failed, `TiDB` will use `UTC` as timezone name. -The value of `localStr` should be initialized by method `initLocalStr()`at `TiDB`'s bootstrap stage. `localStr` will be stored in `mysql.tidb` under `system_tz` column. If system timezone can not get valid IANA timezone name from `mysql.tidb`, `UTC` will be used as default timezone name. - -The positive side of this change is to resolve performance degradation issue and to ensure the uniqueness of gloabl timezone name in multiple `TiDB` instances. +The positive side of this change is resolving performance degradation issue and ensuring the uniqueness of gloabl timezone name in multiple `TiDB` instances. The negative side is just adding a config item which is a very small matter and the user probably does not care it if we can take care of it and more importantly guarantee the correctness. ## Rationale -We tried to read system timezone name by checking the path of the soft link of `/etc/localtime` but, sadly, failed. The failed case is docker. In docker image, it copies the real timezone file and links to `/usr/share/zoneinfo/utc`. The timezone data is correct but the path is not. Regarding of `UTC`, Golang just returns `UTC` instance and will not further read tzdata from sources. - -The impact of this proposed change affects the deploying logic of TiDB, which is just a small matter. +We tried to read system timezone name by checking the path of the soft link of `/etc/localtime` but, sadly, failed at a corner case. The failed case is docker. In docker image, it copies the real timezone file and links to `/usr/share/zoneinfo/utc`. The timezone data is correct but the path is not. Regarding of `UTC`, Golang just returns `UTC` instance and will not further read tzdata from sources. This leads to a fallback solution. When we cannot evaluate from the path, we fall back to `UTC`. ## Compatibility From 779ad7c903f8c51135ca21c26e771c2ee3dbe373 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 21:36:50 +0800 Subject: [PATCH 06/15] rename titile --- docs/adding_tz_env.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index 471dae666897a..2a8ffc42fe2da 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -1,6 +1,4 @@ -Proposal: adding TZ as an environment variable - -# Proposal: adding TZ as an environment variable as a necessary condition before starting TiDB +# Proposal: adding TZ as an environment variable - Author(s): [Zhexuan Yang](www.github.com/zhexuany) - Last updated: 2018/09/09 From 1c9d9c37e6976d4dd2c3fef83161c592765c8b4f Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 21:37:54 +0800 Subject: [PATCH 07/15] remove empty string --- docs/adding_tz_env.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index 2a8ffc42fe2da..93a0db807d5ec 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -1,6 +1,6 @@ # Proposal: adding TZ as an environment variable -- Author(s): [Zhexuan Yang](www.github.com/zhexuany) +- Author(s): [Zhexuan Yang](www.github.com/zhexuany) - Last updated: 2018/09/09 - Discussion at: Not applicable From 0f59b85068b7434623d080cbd37860b4a3bf9aa9 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 21:38:32 +0800 Subject: [PATCH 08/15] correct spelling --- docs/adding_tz_env.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index 93a0db807d5ec..72679965567b0 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -6,7 +6,7 @@ ## Abstract -When it comes to time-related calculation, it is hard for the distributed system. This proposal tries to resolve two problems: 1. timezone maybe inconsisten across multiple `TiDB` instance, 2. performance degradation casued by pushing `System` down to `TiKV`. The impact of this proposal is changing the way of `TiDB` inferring system's timezone name. Before this proposal, the default timezone name pushed down to TiKV is `System` when session's timezone is not set. After this, TiDB evaluates system's timezone name via `TZ` environment variable and then pushes the real name down to `TiKV` rather than `System`. +When it comes to time-related calculation, it is hard for the distributed system. This proposal tries to resolve two problems: 1. timezone may be inconsistent across multiple `TiDB` instance, 2. performance degradation casued by pushing `System` down to `TiKV`. The impact of this proposal is changing the way of `TiDB` inferring system's timezone name. Before this proposal, the default timezone name pushed down to TiKV is `System` when session's timezone is not set. After this, TiDB evaluates system's timezone name via `TZ` environment variable and then pushes the real name down to `TiKV` rather than `System`. ## Background From b61058947b18d2794c65e09ffb4651f6128f2847 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 21:40:04 +0800 Subject: [PATCH 09/15] refine doc --- docs/adding_tz_env.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index 72679965567b0..5db7b15bf394c 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -6,13 +6,13 @@ ## Abstract -When it comes to time-related calculation, it is hard for the distributed system. This proposal tries to resolve two problems: 1. timezone may be inconsistent across multiple `TiDB` instance, 2. performance degradation casued by pushing `System` down to `TiKV`. The impact of this proposal is changing the way of `TiDB` inferring system's timezone name. Before this proposal, the default timezone name pushed down to TiKV is `System` when session's timezone is not set. After this, TiDB evaluates system's timezone name via `TZ` environment variable and then pushes the real name down to `TiKV` rather than `System`. +When it comes to time-related calculation, it is hard for the distributed system. This proposal tries to resolve two problems: 1. timezone may be inconsistent across multiple `TiDB` instances, 2. performance degradation casued by pushing `System` down to `TiKV`. The impact of this proposal is changing the way of `TiDB` inferring system's timezone name. Before this proposal, the default timezone name pushed down to TiKV is `System` when session's timezone is not set. After this, TiDB evaluates system's timezone name via `TZ` environment variable and the path of the soft link of `/etc/localtime`. If both of them are failed, `TiDB` then push `UTC` to `TiKV`. ## Background -After we solved the daylight saving time issue, we found the performance degradation of TiKV side. Thanks for the investigation done by engineers from TiKV. The root cause of such performance delegation is that TiKV infers `System` timezone name via a third party lib, which calls a syscall and costs a lot. In our internal benchmark system, after [this PR](https://github.com/pingcap/tidb/search?q=daylight+saving+time&type=Issues), our codebase is 1000 times slower than before. We have to address this. +After we solved the daylight saving time issue, we found the performance degradation of TiKV side. Thanks for the investigation done by engineers from TiKV. The root cause of such performance degradation is that TiKV infers `System` timezone name via a third party lib, which calls a syscall and costs a lot. In our internal benchmark system, after [this PR](https://github.com/pingcap/tidb/pull/6823), our codebase is 1000 times slower than before. We have to address this. -Another problem needs also to be addressed is the the poentially incosistent timezone name across multiple `TiDB` instances. `TiDB` instances may reside at different timezone which could produce incorrect calculation when it comes to time-related calculation. Just getting `TiDB`'s sytem timezone could be be broken. We need find a way to ensure the uniqueness of global timezone name across multiple `TiDB`'s timezone name. +Another problem needs also to be addressed is the potentially incosistent timezone name across multiple `TiDB` instances. `TiDB` instances may reside at different timezone which could produce incorrect calculation when it comes to time-related calculation. Just getting `TiDB`'s sytem timezone could be be broken. We need find a way to ensure the uniqueness of global timezone name across multiple `TiDB`'s timezone name. ## Proposal From f53f5dc3dce3f33424a69549f611fe721c8fb386 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 21:53:16 +0800 Subject: [PATCH 10/15] final refine --- docs/adding_tz_env.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index 5db7b15bf394c..d5b361512c920 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -12,7 +12,7 @@ When it comes to time-related calculation, it is hard for the distributed system After we solved the daylight saving time issue, we found the performance degradation of TiKV side. Thanks for the investigation done by engineers from TiKV. The root cause of such performance degradation is that TiKV infers `System` timezone name via a third party lib, which calls a syscall and costs a lot. In our internal benchmark system, after [this PR](https://github.com/pingcap/tidb/pull/6823), our codebase is 1000 times slower than before. We have to address this. -Another problem needs also to be addressed is the potentially incosistent timezone name across multiple `TiDB` instances. `TiDB` instances may reside at different timezone which could produce incorrect calculation when it comes to time-related calculation. Just getting `TiDB`'s sytem timezone could be be broken. We need find a way to ensure the uniqueness of global timezone name across multiple `TiDB`'s timezone name. +Another problem needs also to be addressed is the potentially incosistent timezone name across multiple `TiDB` instances. `TiDB` instances may reside at different timezone which could cause incorrect calculation when it comes to time-related calculation. Just getting `TiDB`'s sytem timezone could be broken. We need find a way to ensure the uniqueness of global timezone name across multiple `TiDB`'s timezone name and also to leverage to resolve the performance degradation. ## Proposal @@ -44,10 +44,9 @@ It does not have compatibility issue as long as the user deploys by `tidb-ansibl ## Implementation -The implementation is relatively easy. We just get `TZ` environment from system and check whether it is valid or not. If it is invalid, TiDB evaluates the path of soft link of `/etc/localtime`. -In addition, a warning message needs to be printed indicating user has to set `TZ` variable. For example, if `/etc/localtime` links to /usr/share/zoneinfo/Asia/Shanghai, then timezone name should be `Asia/Shangahi`. +The implementation is relatively easy. We just get `TZ` environment from system and check whether it is valid or not. If it is invalid, TiDB evaluates the path of soft link of `/etc/localtime`. In addition, a warning message needs to be printed indicating user has to set `TZ` variable properly. For example, if `/etc/localtime` links to `/usr/share/zoneinfo/Asia/Shanghai`, then timezone name `TiDB` gets should be `Asia/Shangahi`. -In order to ensure the uniqueness of global timezone across multiple `TiDB` instances, we need to write timezone name into `mysql.tidb` under column `system_tz`. +In order to ensure the uniqueness of global timezone across multiple `TiDB` instances, we need to write timezone name into `mysql.tidb` under column `system_tz`. This cached value can be read once `TiDB` finishs its bootstrap stage. A method `loadLocalStr` can do this job. ## Open issues (if applicable) From 6308b997b4aac3cbe360e7c4e3a37c5108bc8f71 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 21:54:15 +0800 Subject: [PATCH 11/15] rename title --- docs/adding_tz_env.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index d5b361512c920..fadcea0a3466f 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -1,4 +1,4 @@ -# Proposal: adding TZ as an environment variable +# Proposal: Infer the System Timezone of a TiDB cluster via TZ environment variable - Author(s): [Zhexuan Yang](www.github.com/zhexuany) - Last updated: 2018/09/09 From cabd5434545c8f3b901c4744a365e4cbb4a43ef4 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 21:56:46 +0800 Subject: [PATCH 12/15] correct grammar --- docs/adding_tz_env.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index fadcea0a3466f..ff818bc548b83 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -26,7 +26,7 @@ The std means the IANA timezone name; the offset means timezone offset; the dst In our case, which means both `TiDB` and `TiKV`, we need care the first and third formats. For answering why we do not need the second format, we need to review how Golang evaluates timezone. In `time` package, the method `LoadLocation` reads tzData from pre-specified sources(directories may contain tzData) and then builds `time. Location` from such tzData which already contains daylight saving time information. -In this proposal, we suggest setting `TZ` to a valid IANA timezone name which can be read from `TiDB` later. If `TiDB` can't get `TZ` or the supply of `TZ` is invalid, `TiDB` just falls back to evaluates the path of the soft link of `/etc/localtime`. In addition, an warning message telling the user you should set `TZ` properly will be printed. Setting `TZ` can be done in our `tidb-ansible` project, it is also can be done at user side by `export TZ="Asia/Shanghai"`. If both of them are failed, `TiDB` will use `UTC` as timezone name. +In this proposal, we suggest setting `TZ` to a valid IANA timezone name which can be read from `TiDB` later. If `TiDB` can't get `TZ` or the supply of `TZ` is invalid, `TiDB` just falls back to evaluate the path of the soft link of `/etc/localtime`. In addition, a warning message telling the user you should set `TZ` properly will be printed. Setting `TZ` can be done in our `tidb-ansible` project, it is also can be done at user side by `export TZ="Asia/Shanghai"`. If both of them are failed, `TiDB` will use `UTC` as timezone name. The positive side of this change is resolving performance degradation issue and ensuring the uniqueness of gloabl timezone name in multiple `TiDB` instances. From 7a2f6b2da4acf414c53515773ef17d008940bb82 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Mon, 10 Sep 2018 21:58:26 +0800 Subject: [PATCH 13/15] add link for loadlocation --- docs/adding_tz_env.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index ff818bc548b83..a18d6c166359d 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -24,7 +24,7 @@ Firstly, we need to introduce the `TZ` environment. In POSIX system, the value o The std means the IANA timezone name; the offset means timezone offset; the dst indicates the leading timezone having daylight saving time. -In our case, which means both `TiDB` and `TiKV`, we need care the first and third formats. For answering why we do not need the second format, we need to review how Golang evaluates timezone. In `time` package, the method `LoadLocation` reads tzData from pre-specified sources(directories may contain tzData) and then builds `time. Location` from such tzData which already contains daylight saving time information. +In our case, which means both `TiDB` and `TiKV`, we need care the first and third formats. For answering why we do not need the second format, we need to review how Golang evaluates timezone. In `time` package, the method [LoadLocation](https://golang.org/pkg/time/#LoadLocation) reads tzData from pre-specified sources(directories may contain tzData) and then builds `time.Location` from such tzData which already contains daylight saving time information. In this proposal, we suggest setting `TZ` to a valid IANA timezone name which can be read from `TiDB` later. If `TiDB` can't get `TZ` or the supply of `TZ` is invalid, `TiDB` just falls back to evaluate the path of the soft link of `/etc/localtime`. In addition, a warning message telling the user you should set `TZ` properly will be printed. Setting `TZ` can be done in our `tidb-ansible` project, it is also can be done at user side by `export TZ="Asia/Shanghai"`. If both of them are failed, `TiDB` will use `UTC` as timezone name. From f2de5a2204d9b6ac90884a01f9ce69568ba42741 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Tue, 11 Sep 2018 11:47:20 +0800 Subject: [PATCH 14/15] addding desc of upgrading process. --- docs/adding_tz_env.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index a18d6c166359d..b9de6cd37fb9a 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -41,6 +41,8 @@ We tried to read system timezone name by checking the path of the soft link of ` It does not have compatibility issue as long as the user deploys by `tidb-ansible`. We may mention this in our release-node and the message printed before tidb quits, which must be easy to understand. +The upgrading process need to be handled in particular. `TZ` environment variable has to be set before we start new `TiDB` binary. In this way, the following bootstrap process can benefit from this and avoid any hazard happening. + ## Implementation From 8669dd8cd3b11b371f7a913b815664ad4bd26db2 Mon Sep 17 00:00:00 2001 From: zhexuany Date: Tue, 11 Sep 2018 12:51:09 +0800 Subject: [PATCH 15/15] update writing logic --- docs/adding_tz_env.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/adding_tz_env.md b/docs/adding_tz_env.md index b9de6cd37fb9a..b95cfef472f6c 100644 --- a/docs/adding_tz_env.md +++ b/docs/adding_tz_env.md @@ -48,7 +48,7 @@ The upgrading process need to be handled in particular. `TZ` environment variabl The implementation is relatively easy. We just get `TZ` environment from system and check whether it is valid or not. If it is invalid, TiDB evaluates the path of soft link of `/etc/localtime`. In addition, a warning message needs to be printed indicating user has to set `TZ` variable properly. For example, if `/etc/localtime` links to `/usr/share/zoneinfo/Asia/Shanghai`, then timezone name `TiDB` gets should be `Asia/Shangahi`. -In order to ensure the uniqueness of global timezone across multiple `TiDB` instances, we need to write timezone name into `mysql.tidb` under column `system_tz`. This cached value can be read once `TiDB` finishs its bootstrap stage. A method `loadLocalStr` can do this job. +In order to ensure the uniqueness of global timezone across multiple `TiDB` instances, we need to write timezone name into `variable_value` with variable name `system_tz` in `mysql.tidb`. This cached value can be read once `TiDB` finishs its bootstrap stage. A method `loadLocalStr` can do this job. ## Open issues (if applicable)