linux: free datetime interface from tz #3420

Merged: 1 commit merged on Aug 4, 2017

Conversation

obastemur
Collaborator

@obastemur obastemur commented Jul 24, 2017

  • Improves performance on XPLAT (up to 10 times)
  • The new OSX Date interface dropped the old Windows-specific date tests. Do the same for Linux.
  • XPLAT DST for BC (negative years) doesn't have to match Windows.

@MSLaguana
Contributor

How much of a performance change do you see here? Removing the redundant tzsets is good, but since mktime implicitly calls tzset won't it still end up making syscalls every time unless the TZ environment variable is set?

tzset();
time_t ltime = timegm(&utc_tm);
time_t ltime = mktime(&utc_tm);
ltime += -(utc_tm.tm_isdst * 3600) + (utc_tm.tm_gmtoff);
Contributor

How does this behave around a DST transition? And the docs seem to indicate that tm_isdst can be negative if it is unknown, which would cause an incorrect conversion here.
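To make the question concrete, here is a minimal sketch of the "mktime plus offset" idea, assuming glibc or a BSD libc where `struct tm` has the non-standard `tm_gmtoff` field. The helper name `utc_fields_to_time` is hypothetical and this is not the PR's actual code; it sets `tm_isdst` to -1 so that `mktime` decides the DST state itself rather than multiplying a possibly negative value.

```c
#define _DEFAULT_SOURCE   /* exposes tm_gmtoff on glibc */
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper (not the PR's code): convert broken-down UTC fields
 * to time_t with mktime() plus the tm_gmtoff that mktime() fills in,
 * instead of calling timegm().
 */
time_t utc_fields_to_time(const struct tm *utc_fields)
{
    assert(utc_fields != NULL);

    struct tm tmp = *utc_fields;   /* work on a copy; mktime() normalizes it */
    tmp.tm_isdst = -1;             /* let mktime() determine the DST state */

    time_t t = mktime(&tmp);       /* interprets the fields as local time */
    if (t == (time_t)-1)
        return (time_t)-1;         /* treated as "not representable" */

    /* tm_gmtoff now holds the UTC offset (DST included) in effect at t,
       so adding it undoes the local-time interpretation. */
    return t + tmp.tm_gmtoff;
}
```

One caveat under these assumptions: if the fields fall into a local "spring forward" gap, `mktime` normalizes them to a different wall-clock time, so the result may differ from `timegm` by the size of the DST shift. That is the transition case the question above is about.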

Collaborator Author

And the docs seem to indicate that tm_isdst can be negative if it is unknown

We fix up the date in another part of the same file. If the system really doesn't know the DST state for a modern date, we will end up showing the wrong date.

Besides, the opposite calculation was already being done in LocalToUTC, so it's a give-and-take situation.

@obastemur
Collaborator Author

but since mktime implicitly calls tzset

@MSLaguana Could you please share a reference for this claim? mktime reaches the tz database only for a negative tm_isdst value. There is no way around that.

@MSLaguana
Contributor

That comment was based on the man page and experimentation. The man page actually only seemed to mention it for localtime:

The localtime() function converts the calendar time timep to broken-down time representation, expressed relative to the
user's specified timezone. The function acts as if it called tzset(3) and sets the external variables tzname with
information about the current timezone, timezone with the difference between Coordinated Universal Time (UTC) and local
standard time in seconds, and daylight to a nonzero value if daylight savings time rules apply during some part of the
year. The return value points to a statically allocated struct which might be overwritten by subsequent calls to any of
the date and time functions. The localtime_r() function does the same, but stores the data in a user-supplied struct.
It need not set tzname, timezone, and daylight.

Experimenting myself, I do see the behavior that I mention though. Compile this program:

#include <time.h>
#include <stdio.h>

int main() {
  struct tm tm = {0};
  time_t t = mktime(&tm);
  /* time_t is wider than unsigned here; cast so that %u is well-defined */
  printf("%u\n", (unsigned)t);
  tm.tm_sec = 1;
  t = mktime(&tm);
  printf("%u\n", (unsigned)t);
  return 0;
}

and note that I never invoke tzset. Then run the output through strace, and see that each call to mktime has a corresponding stat call (add/remove mktime calls if you want to be sure):

open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\0"..., 4096) = 2845
lseek(3, -1811, SEEK_CUR)               = 1034
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\0"..., 4096) = 1811
close(3)                                = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 9), ...}) = 0
write(1, "2085920896\n", 112085920896
)            = 11
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0
write(1, "2085920897\n", 112085920897

@obastemur
Collaborator Author

We do not use localtime either.

From your comment:

The localtime_r() function does the same, but stores the data in a user-supplied struct.
It need not set tzname, timezone, and daylight.

As for the example you shared: again, mktime reaches the tz database only for a negative tm_isdst value. There is no way around that. Forcing it to +1 doesn't make much difference there.

@obastemur
Collaborator Author

This implementation replaces timegm with mktime plus an additional arithmetic operation. Otherwise we would have to do the following:

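#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Portable timegm() substitute: temporarily force TZ to UTC around mktime(). */
time_t
my_timegm(struct tm *tm)
{
    time_t ret;
    char *tz;

    tz = getenv("TZ");
    if (tz)
        tz = strdup(tz);    /* setenv() below may invalidate getenv()'s pointer */
    setenv("TZ", "", 1);
    tzset();
    ret = mktime(tm);
    if (tz) {
        setenv("TZ", tz, 1);
        free(tz);
    } else
        unsetenv("TZ");
    tzset();
    return ret;
}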

@MSLaguana
Contributor

I don't understand your comment there. I just added to my example to use all values of tm_isdst and it always acts as if it calls tzset:

int main() {
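  struct tm tm = {0};
  time_t t = mktime(&tm);
  /* time_t is wider than unsigned here; cast so that %u is well-defined */
  printf("%u\n", (unsigned)t);
  tm.tm_sec = 1;
  tm.tm_isdst = -1;   /* DST state unknown */
  t = mktime(&tm);
  printf("%u\n", (unsigned)t);
  tm.tm_sec = 2;
  tm.tm_isdst = 0;    /* standard time */
  t = mktime(&tm);
  printf("%u\n", (unsigned)t);
  tm.tm_sec = 3;
  tm.tm_isdst = 1;    /* DST in effect */
  t = mktime(&tm);
  printf("%u\n", (unsigned)t);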

  tzset();
  tzset();
  return 0;
}
open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\0"..., 4096) = 2845
lseek(3, -1811, SEEK_CUR)               = 1034
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\0"..., 4096) = 1811
close(3)                                = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 9), ...}) = 0
write(1, "2085920896\n", 112085920896
)            = 11
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0
write(1, "2085920897\n", 112085920897
)            = 11
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0
write(1, "2085920898\n", 112085920898
)            = 11
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0
write(1, "2085920899\n", 112085920899
)            = 11
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2845, ...}) = 0

Whenever tzset is called, it makes a stat system call on /etc/localtime, and whenever I call mktime it does the same, as if it had called tzset.
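For context on the "unless the TZ environment variable is set" remark earlier in the thread: on glibc, as far as I know, the repeated stat of /etc/localtime only happens while TZ is unset. A minimal sketch of pinning TZ once at startup follows. The helper name `pin_tz_if_unset` is hypothetical, the `:/etc/localtime` value assumes a conventional Linux layout, and none of this is part of the PR.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical startup helper (not part of the PR).
 * Returns 1 if TZ was pinned, 0 if TZ was already set, -1 on error.
 */
int pin_tz_if_unset(void)
{
    if (getenv("TZ") != NULL)
        return 0;                       /* respect the user's explicit choice */

    /* Assumption: the system zone file lives at /etc/localtime. */
    if (setenv("TZ", ":/etc/localtime", 1) != 0)
        return -1;

    tzset();                            /* load the zone once, up front */
    assert(getenv("TZ") != NULL);       /* post-condition of a successful setenv */
    return 1;
}
```

The trade-off is the one debated in this thread: a process pinned this way would presumably stop noticing later changes to the system timezone.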

@obastemur
Collaborator Author

@MSLaguana I see what you are comparing. See my sample above (#3420 (comment))

In order to produce a correct result, we have to do the same. However, this PR replaces that with a single mktime instead.

@MSLaguana
Contributor

How does performance compare? From what I can see, we will still be waiting on syscalls that may touch the disk every time we call one of our date conversion functions because they still implicitly call tzset. Is that correct? Do you see performance improvements here?

@obastemur
Collaborator Author

they still implicitly call tzset

No, they won't. Once again, no timezone is being set. You are conflating tzset with reading the tz database to get DST. We used to set both the timezone (back and forth) and the environment; now we don't. We still read the database. We have to.

@MSLaguana
Contributor

Then let me put it this way: We still make system calls each time we perform date conversions, which will still have an adverse perf impact. Do you actually see perf improvements? When I was experimenting previously, I didn't see gains until I had removed everything that would make the stat system calls.

- New OSX Date interface had dropped old Windows specific date tests. Do the same for Linux
- Up to 10 times better perf.
@obastemur
Collaborator Author

Simplified the interface further. Performance gain is up to 10 times!

Please review. See also updated PR comment #3420 (comment)

@obastemur
Collaborator Author

We still make system calls each time we perform date conversions, which will still have an adverse perf impact.

@MSLaguana we have to make some calls to get DST. We could embed a DST db but that would be beyond the scope.

@MSLaguana
Contributor

I think that this newest change looks good, but I have one concern: The docs for localtime_r state

The localtime_r() function does the same, but stores the data in a user-supplied struct.
It need not set tzname, timezone, and daylight

I think that for portability, we might need to guarantee that we invoke tzset at least once, to make sure that the timezone is loaded. I found some (old, yes) discussions showing that the behavior differed between implementations if you don't call tzset: https://mm.icann.org/pipermail/tz/2008-January/014773.html

I'm also not sure what the behavior would be around daylight savings changes in that case, if it would not update until the next time tzset was invoked, and so if it was run on a system with localtime_r which didn't invoke tzset, it might never get updated.

@MSLaguana
Contributor

Looks like my DST concern is backed up by some SO questions: https://stackoverflow.com/questions/19170721/real-time-awareness-of-timezone-change-in-localtime-vs-localtime-r

@obastemur
Collaborator Author

@MSLaguana I would leave this responsibility (handling system timezone changes) to the host application. If we are going to care about TZ changes, we could do it in UtcToLocal, but that would cost an extra 10% in perf.

If the host cares about TZ changes, it can easily sync at a suitable interval. It has to do that for its own application anyway! I'm not sure why we need to spend 10% perf on something that won't be used.

@MSLaguana MSLaguana requested a review from dilijev July 28, 2017 22:13
@MSLaguana
Contributor

My perspective here is that since by-spec the JS engine is expected to deal with computing the timezone offset and daylight savings time, the host should be able to rely on that behavior and not have to know about internal details like how timezones are handled.

I'd prefer an approach more similar to what we do on Windows, where we have a cache that expires every ~second. In the xplat case that could mean invoking tzset at most once a second. We'd still get significant gains while being as correct WRT system timezone changes as possible.
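A minimal sketch of that "at most once a second" idea, under stated assumptions: the function name `refresh_tz_if_stale` and the global are hypothetical, the caller supplies the current time in whole seconds, and the sketch is not thread-safe. It is not the Windows cache or anything in ChakraCore, only an illustration of the throttling.

```c
#include <assert.h>
#include <time.h>

/* Second in which tzset() last ran; (time_t)-1 means "never". */
static time_t g_last_tz_refresh = (time_t)-1;

/*
 * Hypothetical throttle (not ChakraCore code): call tzset() at most once
 * per distinct second. Returns 1 if tzset() ran, 0 if the cached zone
 * data was considered fresh. Not thread-safe as written.
 */
int refresh_tz_if_stale(time_t now_seconds)
{
    assert(now_seconds != (time_t)-1);   /* caller must pass a valid time */

    if (g_last_tz_refresh != (time_t)-1 && now_seconds == g_last_tz_refresh)
        return 0;                        /* already refreshed in this second */

    tzset();                             /* re-read the zone configuration */
    g_last_tz_refresh = now_seconds;
    return 1;
}
```

A date conversion routine would call `refresh_tz_if_stale(time(NULL))` before using `localtime_r` or `mktime`, so the expensive check is paid roughly once per second instead of on every conversion.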

@obastemur
Collaborator Author

@MSLaguana Well, once we trigger tzset, we do the math for the host app too, no? Besides, node doesn't do anything about this, does it? Also, having the time zone change on a server or user machine is not common, so why do we bother dealing with it? ChakraCore is a library and handles timezones for date/time operations according to the spec. We just don't care whether the system timezone has changed; that is up to the host app. On Windows, the system deals with that, not the host app or ChakraCore.

@MSLaguana
Contributor

Right: On Windows, the system deals with it, and ChakraCore asks the system. On xplat, if we don't invoke tzset then we don't ask the system.

However, I just ran an experiment. On node running v8 on linux, if you change the system timezone, then it behaves in an inconsistent way:

> var d = new Date()
undefined
> d.toString()
'Mon Jul 31 2017 13:17:33 GMT-0700 (PDT)'
// System timezone change
> d.toString()
'Mon Jul 31 2017 13:17:33 GMT-0700 (ACST)'
> var d2 = new Date()
undefined
> d2.toString()
'Mon Jul 31 2017 12:19:48 GMT-0800 (ACST)'
> d.toString()
'Mon Jul 31 2017 13:17:33 GMT-0700 (ACST)'
// System timezone change
> d.toString()
'Mon Jul 31 2017 13:17:33 GMT-0700 (PDT)'
> d2.toString()
'Mon Jul 31 2017 12:19:48 GMT-0800 (PDT)'

Currently node-chakracore behaves in a way that I feel is more correct, because we do check the system timezone every time:

> var d = new Date()
undefined
> d.toString()
'Mon Jul 31 2017 13:26:11 GMT-0700 (PDT)'
// Change system timezone
> d.toString()
'Mon Jul 31 2017 13:26:11 GMT-0700 (ACST)'
> var d2 = new Date()
undefined
> d2.toString()
'Tue Aug 01 2017 05:56:30 GMT+0930 (ACST)'
// Change system timezone
> d2.toString()
'Tue Aug 01 2017 05:56:30 GMT+0930 (PDT)'
> d.toString()
'Mon Jul 31 2017 13:26:11 GMT-0700 (PDT)'

This is still arguably wrong, since the timezone abbreviation always reports the current timezone and not the timezone of the date, but at least times are correctly converted to local time.

However, given the behavior of node-v8, I am not opposed to matching that behavior by only checking the system timezone once, as long as daylight savings behaves correctly (I'm not sure how best to check that experimentally).

@dilijev
Contributor

dilijev commented Jul 31, 2017

https://tc39.github.io/ecma262/#sec-local-time-zone-adjustment

An implementation of ECMAScript is expected to determine the local time zone adjustment. The local time zone adjustment is a value LocalTZA measured in milliseconds which when added to UTC represents the local standard time.

If the timezone did change and as a result this computation did not show the current local standard time, we would be in violation of the spec (at a minimum, in violation of the spirit of the spec).

https://tc39.github.io/ecma262/#sec-daylight-saving-time-adjustment

An implementation of ECMAScript is expected to make its best effort to determine the local daylight saving time adjustment.

There's always potential for being incorrect on the edge cases, no matter how frequently you check, unless you ensure that the exact time you get when you determine the TZ is the same as the time you are formatting.

Worst case, there's a glitch in those edge cases and maybe someone notices (or not) but the glitch goes away when you make a new request because the TZ will be corrected at that time.

"Best effort" tells us we should do the best we can, but if we're out of date by a small interval or the system is wrong (which we can't do anything about) then we're okay as far as the spec is concerned.

Also, having time zone changed for a server or user machine is not something in common? So, why do we bother dealing with it?

It's a good point that this isn't especially likely to come up in practice. However, there will be long running server processes which would probably like to have the current date and timezone information if something changed (including political changes). Which is precisely why we can cache the result and only check occasionally (~1s = 1000-2000 requests).

We should take the spirit of the spec and do best effort to get the best possible answer while getting good performance.

Making the expensive check only every ~1s seems like a good compromise here.

@obastemur
Collaborator Author

@MSLaguana @dilijev Thanks for the comments.

I understand your points. Well, on OSX, we could listen for an event. Although we do cache the time and make sure it only goes forward, we could arrange some tricks.

However, we cannot sync to the tz instantly unless something triggers it (UTCToLocal, getting a new time, etc.). As I understand it, you are proposing a 1s update window whenever any of these are called; please read on.

On Linux, timezone data is cached when the process starts. Unless the process calls tzset, the mapped data remains as is. tzset reads the database and updates the process's mapping, nothing more. Heck, that's why it's slow and implemented this way.
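The role of tzset described here can be illustrated with a small sketch, assuming glibc or a BSD libc (for the non-standard `tm_gmtoff` field). The helper name `utc_offset_for_zone` is hypothetical. It changes TZ, calls `tzset` explicitly so the process picks up the new rules, and reports the resulting UTC offset; the explicit call matters because `localtime_r` is not required to behave as if it called `tzset`.

```c
#define _DEFAULT_SOURCE   /* exposes tm_gmtoff on glibc */
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical demo helper: switch the process to the given POSIX TZ string,
 * reload the zone rules with tzset(), and report the UTC offset in seconds
 * at the Unix epoch. Returns 0 on success, -1 on failure.
 * Side effect: changes TZ for the whole process.
 */
int utc_offset_for_zone(const char *posix_tz, long *offset_seconds)
{
    assert(posix_tz != NULL && offset_seconds != NULL);

    if (setenv("TZ", posix_tz, 1) != 0)
        return -1;
    tzset();                      /* update this process's zone data */

    time_t epoch = 0;
    struct tm local;
    if (localtime_r(&epoch, &local) == NULL)
        return -1;

    *offset_seconds = local.tm_gmtoff;
    return 0;
}
```

With POSIX-style strings such as `"EST5"` or `"IST-5:30"` no zone file is consulted, so the helper reports -18000 and 19800 seconds respectively.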

One more interesting part: what if a particular time zone's data is not available? I'm not sure of the behavior. And what if the system tzdata is updated? That will require a system restart on Linux.

Concern:
Suppose that, for whatever reason, the system time zone changes and my process reacts to it because of a library I embedded to execute JS code. That may turn out to be unexpected. It is an even bigger problem when I have multiple processes and the same library doesn't sync consistently across them (one may pick up the new data while another sticks to the tick count).

One example:

My host app (along with its many child processes) doesn't care about time zone changes (including political ones) and treats time as a value that only moves forward. If, for any reason, time goes backwards, what will happen?

One may argue that time already goes backwards (by an hour) during DST. Well, that's part of the current tz configuration and applies to all processes simultaneously.

Previously, every piece of time information we provided was synced, hence the behavior across processes was consistent (and hence extremely slow).

I'm okay with best effort. I'm just against altering the host's currently expected behavior while sacrificing performance and introducing other potential issues.

@obastemur
Collaborator Author

@MSLaguana @dilijev I opened #3467 to track the problem and continue the discussion on updating tzdata if the system timezone changes while a process is actively running.

Contributor

@MSLaguana MSLaguana left a comment

For now since this isn't any worse than v8's behavior I'm happy to take the change.

Contributor

@dilijev dilijev left a comment

Agree with Jimmy. This is an improvement. Happy to continue the discussion later.

@obastemur
Collaborator Author

@MSLaguana @dilijev Thanks for the review

@chakrabot chakrabot merged commit e2a51b9 into chakra-core:release/1.7 Aug 4, 2017
chakrabot pushed a commit that referenced this pull request Aug 4, 2017
Merge pull request #3420 from obastemur:time_faster

- Improves performance on XPLAT (up to 10 times)
- New OSX Date interface had dropped old Windows specific date tests. Do the same for Linux
- XPLAT DST for BC (negative years) doesn't have to match to Windows.

5 participants