Enumeration of time zones #435

Open
ptomato opened this issue May 12, 2020 · 46 comments
Labels
c: datetime Component: dates, times, timezones Proposal Larger change requiring a proposal s: in progress Status: the issue has an active proposal

Comments

@ptomato
Contributor

ptomato commented May 12, 2020

In the Temporal proposal we currently have an API for enumerating all named time zones known to the system. We are currently discussing removing this time zone enumeration API from the proposal as it's not clearly related to Temporal and seems like it is an orthogonal project.

It was originally added in response to this use case: the list of named time zones is useful for implementing a time zone picker in UI. (Although, to be useful for UI, there would also have to be a way to get human-friendly display names for time zones rather than IANA names; #31?)

ECMA-402 seems like a good place to investigate this.

@leobalter
Member

It seems interesting to discuss.

@leobalter leobalter added the s: discuss Status: TG2 must discuss to move forward label May 12, 2020
@sffc sffc added the c: datetime Component: dates, times, timezones label May 12, 2020
@sffc
Contributor

sffc commented May 12, 2020

Agreed; this seems roughly related to Intl. We need to put the enumeration API somewhere (could be Intl, could be elsewhere), and then we need to add time zone names to Intl.DisplayNames.

@FrankYFTang

@littledan
Member

littledan commented May 12, 2020

I think enumeration is a broader problem than just timezones. Some other things that you might want to enumerate:

  • Locales (all of them, not just from a subset of a list that the user provides as we have with supportedLocalesOf)
  • Regions
  • Currencies
  • Calendars
  • Numbering systems
  • Months of the year/days of the week
  • IIRC someone asked that the allowed hourCycle settings be listed, even though that's a fixed set... There are many other options that might be included this way

I don't think we need to ship all these together, but it'd be nice to come up with an API pattern that would potentially capture all of them. That's why I was a bit skeptical of tying Timezone enumeration to Temporal: there's no clear way to extend this to regions, locales or currencies, but those seem quite important as pickers.

Let's watch out for data size issues in Intl.DisplayNames with timezones, but it could definitely be useful if someone wants to make a timezone picker.

@leobalter
Member

My only concern is that Intl is optional and not available in all JS platforms (e.g. Moddable XS). If we identify use cases for some of these enumerations out of the Intl API, there is a chance we might want this enumeration elsewhere.

Still, Intl might be a good place to provide this enumeration, and I wouldn't complain. This also doesn't seem to be a heavy weight addition.

ptomato added a commit to tc39/proposal-temporal that referenced this issue May 12, 2020
We have discussed removing enumeration of time zones from Temporal, and
suggested to investigate it in ECMA-402 instead. It seems like an
orthogonal problem to Temporal.

See tc39/ecma402#435
@littledan
Member

I don't know whether this should be in Intl or not; my main concern is that we find an API design that will eventually work for all sorts of enumerations.

The only use case I've heard about for enumeration of timezones was for a sort of timezone picker. The consensus among internationalization experts is that this only really makes sense if the timezones are localized (even if some application developers are OK with displaying IANA names, we don't want to encourage this). If there's a non-picker use case, then that might change the calculus.

Overall, it's a bit hard for me to think about the space of environments where some parts of JS aren't present. I imagine some of them have other restrictions or allowances that aren't specifically sanctioned by the specification; we may or may not want to permit these in the standard. If we end up making major decisions on the basis of this kind of optionality/grouping, I wonder if we might weaken/unconstrain things to permit engines to let them support enumeration but not other parts.

@ljharb
Member

ljharb commented May 13, 2020

I can localize my own time zone names, i don’t need Intl to do it for me (and because Intl isn’t everywhere, i can’t rely on it anyways). All i need is the data, in 262.

@littledan
Member

@ljharb Which data do you need, and why do you need it?

@ljharb
Member

ljharb commented May 13, 2020

The list of valid time zone identifiers. Otherwise, i have to maintain my own list, and laboriously validate each one at runtime.

The goal is to know what timezones an engine supports, so i can, among many other things, set up backend validation, and notify myself when i need to support a new timezone.

@anba
Contributor

anba commented May 13, 2020

A list of things which directly pop into my head when thinking about time zones:

  • Should this possible API be restricted to IANA time zone names or do we also want to allow CLDR or ICU time zone names? (I'm not sure if https://github.com/unicode-org/icu/blob/master/icu4c/source/tools/tzcode/icuzones is covered by CLDR, which would make the time zones listed there ICU-only.)
  • Any possible time zone name which is accepted by Intl.DateTimeFormat, or just the canonical names?
  • Any thoughts about the time zones in backward, pacificnew, or systemv?
  • "timezones an engine supports" is tricky, because sometimes engines are lying a bit; see "IANA timezone db reference in the spec: should backzone be taken into account?" (#272).
  • When displaying time zones to the user, directly showing IANA time zone names should probably be avoided, because of issues like "Kiev" vs. "Kyiv". (See the numerous threads about this topic on the tz mailing list.)
    • CLDR provides "exemplar cities" for displaying purposes.
    • But engines are currently stripping these exemplar city names from the ICU data file, so there are data size issues we should be aware of.
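For illustration, here is a minimal sketch of getting a localized label for a zone today instead of showing the raw IANA name. It assumes the engine ships time zone display data; `timeZoneDisplayName` is a hypothetical helper name, and the label returned is the zone's long name, not a CLDR exemplar city.

```javascript
// Hypothetical helper: ask Intl.DateTimeFormat for the localized long name of
// a time zone at a given instant. Falls back to the raw identifier if the
// engine returns no timeZoneName part.
function timeZoneDisplayName(timeZone, locale, date = new Date()) {
  const parts = new Intl.DateTimeFormat(locale, {
    timeZone,
    timeZoneName: "long",
  }).formatToParts(date);
  const part = parts.find((p) => p.type === "timeZoneName");
  return part ? part.value : timeZone;
}

// The exact string depends on the engine's locale data and on the date
// (standard vs. daylight time), so no expected output is shown here.
console.log(timeZoneDisplayName("America/New_York", "en"));
```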

@ljharb
Member

ljharb commented May 13, 2020

Perhaps an object, whose keys are the canonical names, and whose values are all the valid aliases.
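A rough sketch of what such an object could look like when built in userland from a hand-maintained candidate list. `groupByResolvedName` is a hypothetical helper, and the grouping relies on whatever name the engine reports in `resolvedOptions().timeZone`, which is an assumption: engines differ on which identifier they report for an alias.

```javascript
// Hypothetical helper: group candidate identifiers by the name the engine
// resolves them to. Keys are resolved names, values are the accepted inputs.
function groupByResolvedName(candidates) {
  const groups = {};
  for (const id of candidates) {
    let resolved;
    try {
      resolved = new Intl.DateTimeFormat("en", { timeZone: id })
        .resolvedOptions().timeZone;
    } catch (err) {
      if (err instanceof RangeError) continue; // identifier not accepted
      throw err;
    }
    if (!groups[resolved]) groups[resolved] = [];
    groups[resolved].push(id);
  }
  return groups;
}

// Object.keys(groups) then gives the resolved names only.
const groups = groupByResolvedName(["UTC", "Etc/UTC", "Bogus/Zone"]);
console.log(Object.keys(groups));
```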

ptomato added a commit to tc39/proposal-temporal that referenced this issue May 14, 2020
We have discussed removing enumeration of time zones from Temporal, and
suggested to investigate it in ECMA-402 instead. It seems like an
orthogonal problem to Temporal.

See tc39/ecma402#435
@littledan
Member

@ljharb Because the list of timezones is so long, it's a common pattern in applications' timezone pickers to show a subset, so it's unclear to me what signal application developers should take based on just the existence of a timezone in a browser's tzdb. Does this need occur in the frontend, or is it more of a development-time need, or some other context?

@ljharb
Member

ljharb commented May 15, 2020

The object form I suggested, run through Object.keys, seems like it'd be a subset?

My use case is for both the frontend (rerendering in the client), and the backend that generates the initial HTML hydrated in the frontend.

@FrankYFTang
Contributor

ICU's API provides three styles of enumeration call that we could surface to JS:

  1. return the whole list by calling icu::TimeZone::createEnumeration()
  2. return the list of time zones for a specific country/region by calling icu::TimeZone::createEnumeration( region code )
  3. return the list of time zones at a given offset

I think 1 and 2 above are useful, and I have some doubts about 3.
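To make the third style concrete, here is a userland sketch of filtering a candidate list by UTC offset at a given instant. It does not use ICU; `offsetMinutes` and `zonesAtOffset` are hypothetical helpers, and the offset is derived by formatting the instant in the target zone and reading the fields back.

```javascript
// Hypothetical helper: UTC offset of `timeZone` at `date`, in minutes.
function offsetMinutes(timeZone, date) {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hourCycle: "h23",
    year: "numeric", month: "numeric", day: "numeric",
    hour: "numeric", minute: "numeric", second: "numeric",
  }).formatToParts(date);
  const get = (type) => Number(parts.find((p) => p.type === type).value);
  // Reinterpret the zone's wall-clock fields as if they were UTC.
  const asUtc = Date.UTC(get("year"), get("month") - 1, get("day"),
                         get("hour"), get("minute"), get("second"));
  return Math.round((asUtc - date.getTime()) / 60000);
}

// Hypothetical helper: keep only the candidates at the requested offset.
function zonesAtOffset(candidates, minutes, date) {
  return candidates.filter((tz) => offsetMinutes(tz, date) === minutes);
}

const when = new Date(Date.UTC(2020, 4, 15, 12, 0, 0));
console.log(zonesAtOffset(["UTC", "Asia/Kolkata", "Asia/Tokyo"], 540, when));
```

The result depends on the instant chosen, since many zones change offset over the year, which may be part of why this style is less obviously useful.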

@FrankYFTang
Contributor

How about

Intl.DateTimeFormat.getSupportedCalendars()
Intl.DateTimeFormat.getSupportedTimeZones()
Intl.NumberFormat.getSupportedNumberingSystems()
Intl.NumberFormat.getSupportedCurrencies()
Intl.NumberFormat.getSupportedUnits()

Later, we could let each of the above take an optional argument to restrict the returned list.

@sffc
Contributor

sffc commented May 15, 2020

These enumerations are not really specific to specific Intl formatters. If we were to add methods like this, I think it makes more sense to put them on the top Intl namespace:

Intl.getSupportedCalendars()
Intl.getSupportedTimeZones()
Intl.getSupportedNumberingSystems()
Intl.getSupportedCurrencies()
Intl.getSupportedUnits()

@FrankYFTang
Contributor

OK, I will start championing an "Intl Enumeration API Specification" to address this. I am starting a draft at https://github.com/FrankYFTang/proposal-intl-enumeration/blob/master/README.md now.

@zbraniecki
Member

Btw. This is a major fingerprinting increase as basically the API does nothing else but add identifiable bits.

@ljharb
Member

ljharb commented May 15, 2020

@zbraniecki seems more like it collects existing bits into a single list, as opposed to me having to maintain the list myself and laboriously feature-test against the runtime? iow, not a new capability, just happens to make it easier?

@zbraniecki
Member

I'm not a security/privacy expert, so please, take my read with a grain of salt, but my understanding is that the race for privacy vs fingerprinting is composed of two pieces:

  1. Number of APIs that give me the highest number of uniquely identifiable information
  2. Selection of APIs that give me the highest number of uniquely identifiable information at the lowest CPU/time

Number (1) is important because all/any anti-fingerprinting attempts will have to mask all those APIs to return some jammed responses that are generic and unidentifiable.
Number (2) is important because if my tracker needs to take 10 seconds of your CPU to get a fingerprint, it's hard to hide. If my tracker can get it in 16ms, I'm good.

Now, if I understand correctly, Intl API originally was designed to force the fingerprint script to cycle through API calls attempting to ask for various bits and checking the output in hope to collect a bit. That's time consuming and CPU costly.
On the other hand getting a white-hat API use was easy - just ask for a date, tell me your calendar of choice, and accept the result (which may be suboptimal).

The "give me all available/supported X" type of API is making it trivial to ask for all fonts, all calendars, all languages, all numerical systems.

The common driver for such requests are "pickers", and I recognize the value for a picker to know what's available. I'm not sure how to resolve that tradeoff and I would love to get some privacy/security experts involved in guidelines for API design to strike the right tradeoffs.

Otherwise, non privacy experts will keep adding fingerprinting APIs as the API surface grows, and then privacy engineers will struggle to add "anti-fingerprinting" masking mode to each and every one of them. That seems suboptimal.

@FrankYFTang
Contributor

Why doesn't the hacker just read the user agent string instead? That will cost less CPU power, right? How would this API provide more fingerprint information than the user agent string?

@zbraniecki
Member

Why doesn't the hacker just read the user agent string instead?

Funny you should ask: https://www.zdnet.com/article/google-to-phase-out-user-agent-strings-in-chrome/

And anti-fingerprinting is always masking your UA string anyway.

How would this API provide more fingerprint information than the user agent string?

UA string adds some bits of entropy, your screen dimensions, color depth, add more, your installed plugins, refresh rate (vsync) even more, and so on.

You can see an example of such fingerprinting on https://panopticlick.eff.org/ if you click "Test me" and then "Show full results for fingerprinting".
If you use Tor browser, or turn on anti-fingerprinting bits in Safari, Firefox, Tor, Brave etc. you'll see how they fake many of those API results to make them less unique.
As we increase the surface, Intl bits are becoming part of the "game". The question is what design should we use to make it a costly process for the fingerprinters, or how can we make our APIs easy to mask for anti-fingerprinting techniques.

As I said, I'm not an expert, I just know that I often end up reviewing patches for Gecko/SM that add the masking and I remember the reasoning behind "supportedLocalesOf" rather than "getSupportedLocales".
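As a reminder of the existing pattern, here is a small sketch of `supportedLocalesOf`: the caller supplies the candidates and gets back only the subset the engine supports, so the full list is never handed over in one call. `filterSupportedLocales` is a hypothetical wrapper name, and `xx-XX` is assumed to be a well-formed tag that no engine ships data for.

```javascript
// Hypothetical wrapper around the existing supportedLocalesOf pattern:
// the caller must bring its own candidate list.
function filterSupportedLocales(candidates) {
  return Intl.DateTimeFormat.supportedLocalesOf(candidates);
}

// "xx-XX" is assumed to be unsupported; the result also depends on which
// locale data the engine was built with.
console.log(filterSupportedLocales(["en-US", "xx-XX"]));
```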

@zbraniecki
Member

The only picker I can realistically see being commonly used is the unit picker, and it's still not a generic "what units do you support overall", but rather a "do you support both celsius and kelvin and fahrenheit" kind of picker.
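That narrower question can already be asked by feature-testing, roughly as sketched below. `supportsUnit` and `supportsAllUnits` are hypothetical helpers; the sketch assumes an engine that implements `style: "unit"` in `Intl.NumberFormat` and rejects unsupported unit identifiers with a `RangeError`.

```javascript
// Hypothetical helper: does the engine accept this unit identifier?
function supportsUnit(unit) {
  try {
    new Intl.NumberFormat("en", { style: "unit", unit });
    return true;
  } catch (err) {
    if (err instanceof RangeError) return false; // unit not accepted
    throw err;
  }
}

// Hypothetical helper: "do you support all of these?"
function supportsAllUnits(units) {
  return units.every(supportsUnit);
}

console.log(supportsAllUnits(["celsius", "fahrenheit"]));
console.log(supportsUnit("kelvin")); // may well be false, depending on the engine
```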

@sffc
Contributor

sffc commented May 16, 2020

I'd like to get Tor people involved in this discussion. I don't know any, but I can ask around.

@zbraniecki Please do. Thanks!

ptomato added a commit to tc39/proposal-temporal that referenced this issue May 18, 2020
We have discussed removing enumeration of time zones from Temporal, and
suggested to investigate it in ECMA-402 instead. It seems like an
orthogonal problem to Temporal.

See tc39/ecma402#435
@jswalden
Collaborator

Seems to me you can determine whether any individual time zone is supported by just using it and seeing if it shows up as a resolved option. For purposes of revealing differentiations across UAs and across their successive versions, an enumeration API does not expose new information. It makes it easier to query in bulk, but if the differentiations pertain to specific time zone strings, an attempt to fingerprint could just check behavior of those specific time zone strings.

Or is this a more theoretical concern, about user agents that the would-be fingerprinter hasn't taken the time to individually figure out the distinctions of? Because I guess an enumeration API does mean the fingerprinter doesn't have the ongoing maintenance burden of figuring out which time zones are differentiably supported by distinct UAs and UA versions -- it could just grab the whole list and generate a hash from it, for fingerprinting purposes.
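A minimal sketch of the per-zone check described above, assuming the engine throws a `RangeError` for identifiers it does not recognize; `isTimeZoneSupported` is a hypothetical helper name.

```javascript
// Hypothetical helper: use the identifier and see whether the engine accepts
// it and reports a resolved time zone.
function isTimeZoneSupported(timeZone) {
  if (typeof timeZone !== "string") return false;
  try {
    const resolved = new Intl.DateTimeFormat("en", { timeZone })
      .resolvedOptions().timeZone;
    return typeof resolved === "string";
  } catch (err) {
    if (err instanceof RangeError) return false; // unknown identifier
    throw err;
  }
}

console.log(isTimeZoneSupported("UTC"));
console.log(isTimeZoneSupported("Mars/Olympus_Mons"));
```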

@ljharb
Member

ljharb commented May 21, 2020

It seems to me like not exposing the list is just security by obscurity.

@zbraniecki
Member

@zbraniecki Please do. Thanks!

Hi all. I hear your feedback. I understand that it's hard for me to explain the concerns around the privacy area, and since I'm not an expert in the area, I may not even be able to.

I want to assure you, though, that it is not "security by obscurity": the idea is not just to hide the information or make it harder to retrieve in an attempt to discourage fingerprinting.
If I made you feel this way, it is just the shortcoming of my ability to explain my position.

I reached out to several people working on the Tor browser at Mozilla and I'll try to get them to help us make decisions around this area.

The topic is complex and several dynamics are intertwined, making it harder to design clear guidelines as multiple tradeoffs are in play.
What concerns me personally is that the ECMA402 group currently does not seem to see this as any form of tradeoff, and instead sees it as a clear "someone requested it, let's add it" kind of situation. It further indicates to me that I either don't understand the problem scope or I failed to explain my concerns.

I'll try to get back to this thread within the next week or so with more feedback on how to design such APIs in a privacy-friendly manner.

@sffc
Contributor

sffc commented May 21, 2020

Wearing my hat as ECMA-402 chair: the contents of this post don't necessarily reflect my personal opinion.

The consensus from the ECMA-402 meeting today is that we think this proposal has solid use cases, but acknowledge the potential fingerprint concerns. We plan to present it for Stage 1 at the upcoming TC39 meeting, and continue investigating the privacy and security implications before it reaches Stage 2.

@zbraniecki
Member

I spun off #442 for the generic conversation about the scope of ECMA402, which I believe is important for assessment of this API.

I'd also like to say that I did not see the "solid use cases" list beyond "pickers" and did not receive an answer to my question about them.

Putting privacy aside, and putting the generic "How far should ECMA402 go" aside as well, I'd like to better understand what makes this a "solid use case".
In particular, I'd like to understand how the group distinguishes "any use case" from a "solid use case". Basically every API ever added to any library ended up there because there was a use case. I don't doubt that every request to ECMA402 can bring some use case.

But I struggle to evaluate whether the use case is solid.
For example, when I started with ECMA402, one of the strategies we used was to see what JS libraries get developed with an assumption that if the use case is important enough, people will develop userland libraries, and from that we can collect in-field experience, validate that a use case is high profile and common enough to gather momentum around a library or libraries, and extract low-level API that can make such libraries easier or even unnecessary.

I don't know if this is applicable to today's velocity and dynamic behind ECMA402, but I don't think I've seen userland libraries around time zone names, numbering system names and calendar names.

I also don't know if "pickers" is the right use case: should the "pickers" be hand-written, or part of HTML? What is the use of "pickers" outside of a Web environment? (They don't help Node.js much, right?) Are there uses other than "pickers"?

I understand the Stage 1 and I hope to see motivation for the API in the Stage 1 proposal that can be verified against the outcome of #442.

@ljharb
Member

ljharb commented May 21, 2020

@zbraniecki anything the browser needs, is helpful in node, because generating HTML on the server is a very important a11y/performance/robustness practice.

@zbraniecki
Member

anything the browser needs, is helpful in node, because generating HTML on the server is a very important a11y/performance/robustness practice.

I'm not sure if I understand. You can generate <input type="date"/> on server side. Or you can write your own date picker. Those two have very different API surface requirements.

@ljharb
Member

ljharb commented May 21, 2020

Sure - but assuming there's no native HTML control for a timezone picker, eg, you'd need every possible timezone to be present in the serverside HTML, generated in node. I agree that if a native form control existed, that was sufficiently styleable and hookable in the browser, then there'd likely be no need to expose the data as a list.

@litherum

(I didn't realize this discussion was happening here, and opened https://github.com/FrankYFTang/proposal-intl-enumeration/issues/1 about it)

@zbraniecki
Member

I agree that if a native form control existed, that was sufficiently styleable and hookable in the browser, then there'd likely be no need to expose the data as a list.

I see. Thank you for your patience!

I think my question is then: should we first evaluate the native form control path for pickers, rather than extending the API scope, since it is the more privacy-friendly approach, makes it easier to get internationalization right, and has lower overhead for the user?

It may be related to #442 and #443

@sffc
Contributor

sffc commented May 22, 2020

My personal opinion on this feature request:

We have heard from multiple Temporal stakeholders that exposing this API covers their use case of making a time zone picker. This information is already available via Intl APIs, but less efficiently. In Temporal, where time zones and calendars are first-class objects, one can also imagine use cases expanding beyond only pickers.

If you are building a client-side app and want to let people select their time zone, calendar system, etc., right now you would hard-code an expected list, even if a browser engine is capable of doing more. I think it is better for the JavaScript engine to provide a list of what it can support than making the programmer start with their own list and essentially take the intersection of that list with the browser's list by feature-testing each entry.
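The "intersection by feature-testing" approach looks roughly like the sketch below, shown here for calendars. `supportedCalendars` is a hypothetical helper and the candidate list is the hard-coded part; the sketch assumes that an unsupported `-u-ca-` value is silently dropped, so the resolved calendar differs from the one requested. Alias identifiers that resolve to a different canonical name would also be filtered out by this comparison.

```javascript
// Hypothetical helper: keep only the calendars from a hard-coded candidate
// list that the engine resolves as requested.
function supportedCalendars(candidates) {
  return candidates.filter((cal) => {
    try {
      const resolved = new Intl.DateTimeFormat(`en-u-ca-${cal}`)
        .resolvedOptions().calendar;
      return resolved === cal;
    } catch (err) {
      if (err instanceof RangeError) return false; // malformed locale tag
      throw err;
    }
  });
}

// "notacal" is a made-up identifier used only to show the filtering.
console.log(supportedCalendars(["gregory", "notacal"]));
```

Anything the engine supports but the hard-coded list omits is never discovered this way, which is the gap an enumeration API would close.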

Although we should consider also supporting this in HTML, I still think this proposal has merits in JavaScript. The ecosystem is likely to never reach a point in which all web sites can use only the W3C pickers. Although I like using them in personal projects, I can't remember the last time I've visited a web site that has used a native HTML date picker in production, for example. I would hope that we can at least agree that JavaScript-based pickers are a legitimate use case.

@ljharb
Member

ljharb commented May 22, 2020

Also, if the effort of creating the list is acceptable for the majority of devs, the good users, why wouldn’t the minority that are malicious just do the same? Someone will probably make a library for it nigh immediately anyway, so any effort barrier vanishes.

@aphillips

The thing that has me confused about this thread is that the list of time zones is finite (if rather larger than the logical minimum and unstable to boot) and, ignoring Etc/offset "private-use" values, reasonably well-defined. So it's not exactly "fingerprintable" to get a list of available time zones.

Making a time zone picker is a little more complicated, since many time zones need to be "rolled up" into a representative zone and a bunch of zones are obsolete. Hence all the "metazone" gunk in ICU (or, in my case, a bunch of utility classes).

@sffc I agree that this should be supported in HTML--in fact W3C I18N has asked for first-class time zone support in HTML going back a ways and I should probably follow up on that with WHATWG in the near future--but I also agree that JS APIs should provide access also.

@sffc
Contributor

sffc commented May 22, 2020

Good point about metazones and containment. I guess we should consider exposing that additional information in an enumeration API? A flat list of time zones will get a lot of obsolete junk.

@ljharb
Member

ljharb commented May 22, 2020

@sffc #435 (comment)
