i18n.getMessage() language fallback paths #296

Open
carlosjeurissen opened this issue Oct 17, 2022 · 23 comments
Labels
i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. implemented: chrome Implemented in Chrome implemented: safari Implemented in Safari inconsistency Inconsistent behavior across browsers supportive: firefox Supportive from Firefox topic: localization

Comments

@carlosjeurissen
Contributor

Not all browsers handle language fallbacks the same way. Consider the following situation:

An extension is using the native i18n APIs with "default_locale": "en" in manifest.json, and three messages.json files in the languages en, pt and pt-BR.

Both en and pt include the message ids message1 and message2, while pt-BR includes only message1.

In the above situation, browsers handle i18n.getMessage('message2') differently.

Chromium first checks pt_BR/messages.json. If the message is not present, it checks pt/messages.json. Finally, if the message is still not found, it checks the default_locale, in this case en/messages.json. In the above situation, this means it gets the message2 value from pt.

In Firefox, however, the browser first checks pt_BR/messages.json. If the message is not in this file, it falls back directly to default_locale, so it checks en/messages.json. As a result, the message2 value is the one from en.

Interestingly enough, if pt_BR/messages.json is not present at all, Firefox checks pt/messages.json first, before checking en/messages.json.
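
The two behaviours can be compared with a small in-memory simulation. This is only a sketch based on the description above, not actual browser code: the `catalogs` object stands in for the three messages.json files, and `chromiumLookup` and `firefoxLookup` are hypothetical helper names.

```javascript
// Hypothetical in-memory stand-ins for the three messages.json files.
const catalogs = {
  en: { message1: 'en message1', message2: 'en message2' },
  pt: { message1: 'pt message1', message2: 'pt message2' },
  pt_BR: { message1: 'pt_BR message1' },
};
const defaultLocale = 'en';

// Chromium-style, as described above: for each message, try
// pt_BR, then pt, then default_locale.
function chromiumLookup(uiLocale, id) {
  const language = uiLocale.split('_')[0];
  for (const locale of [uiLocale, language, defaultLocale]) {
    const message = catalogs[locale]?.[id];
    if (message !== undefined) return message;
  }
  return '';
}

// Firefox-style, as described above: select a single locale file (the exact
// locale if it exists, otherwise the bare language), then fall back
// straight to default_locale.
function firefoxLookup(uiLocale, id) {
  const language = uiLocale.split('_')[0];
  const selected = uiLocale in catalogs ? uiLocale : language;
  for (const locale of [selected, defaultLocale]) {
    const message = catalogs[locale]?.[id];
    if (message !== undefined) return message;
  }
  return '';
}

console.log(chromiumLookup('pt_BR', 'message2')); // "pt message2"
console.log(firefoxLookup('pt_BR', 'message2')); // "en message2"
```

Removing the `pt_BR` entry from `catalogs` makes `firefoxLookup('pt_BR', 'message2')` return the pt value, which corresponds to the last case described above.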

What is the behaviour we want in these cases?

@carlosjeurissen carlosjeurissen added inconsistency Inconsistent behavior across browsers agenda Discuss in future meetings labels Oct 17, 2022
@hanguokai
Member

I support Chrome's behavior, which seems more reasonable. Usually, developers want a language-region locale to fall back to the language locale first, then to the default locale.

If the browser wants to support multiple different behaviors at the same time, I recommend adding a new property to the 3rd parameter (options) of this API.

@carlosjeurissen
Contributor Author

@hanguokai generally speaking this can be useful to reduce the overall package size.

However, there are cases where the fallback might not be welcome. Say your zh/messages.json is in Simplified Chinese script, and zh_TW is in Traditional Chinese. Would it be desirable for a different script to be used as the fallback? The same can happen with other languages that have multiple scripts, like Serbian (Latin and Cyrillic).

@hanguokai
Member

I know the difference, but it's better than the default (a completely different language like English). For example,

en: "Software"
zh: "软件"
zh_TW: "軟體"

The difference between "软件" and "軟體" is smaller than the difference between either of them and the English text.

For the best user experience, developers need to supply a full (1:1) message map if the translations differ. Messages can be omitted only when they are the same or the fallback is acceptable.

@carlosjeurissen
Contributor Author

@hanguokai relying on good developer behaviour can be tricky. I can imagine there are Chinese people who know only English and either Traditional or Simplified Chinese. I could be wrong?

@hanguokai
Member

Do you know how many people only understand Chinese (zh-CN and/or zh-TW) but not English? Of course, there are real examples of every situation (combination).

I said in my previous post:

If the browser wants to support multiple different behaviors at the same time, I recommend adding a new property to the 3rd parameter (options) of this API.

@hanguokai
Member

There are multiple possible strategies. Another possible fallback strategy is following navigator.languages order. For example:

If navigator.languages is ['zh-TW', 'en'], then the search order is zh-TW -> en -> extension default locale.
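
A minimal sketch of this strategy, assuming the preferred languages are passed in as an array (in a browser they would come from `navigator.languages`). The `searchOrder` helper is a hypothetical name, not an existing API.

```javascript
// Build the search order: the user's preferred languages in order,
// followed by the extension's default locale, skipping duplicates.
function searchOrder(preferredLanguages, defaultLocale) {
  const order = [];
  for (const tag of [...preferredLanguages, defaultLocale]) {
    if (!order.includes(tag)) order.push(tag);
  }
  return order;
}

// Hard-coded here; in a browser this would be navigator.languages.
const preferred = ['zh-TW', 'en'];

console.log(searchOrder(preferred, 'de')); // [ 'zh-TW', 'en', 'de' ]
console.log(searchOrder(preferred, 'en')); // [ 'zh-TW', 'en' ]
```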

@xeenon
Collaborator

xeenon commented Oct 27, 2022

I believe Safari matches Chrome here after looking at the code.

@carlosjeurissen
Contributor Author

Reached out to the ltli w3c group here: w3c/ltli#35.

Safari currently matches the behaviour of Chrome. If the discussion above concludes that this is the preferred process, Firefox will follow.

@carlosjeurissen carlosjeurissen removed the agenda Discuss in future meetings label Oct 27, 2022
@xfq xfq added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label Dec 5, 2022
@carlosjeurissen
Contributor Author

Quick update: @aphillips mentioned two potential fallback algorithms. One is a simple progressive removal of subtags. The other is the more advanced algorithm from Unicode's CLDR, which is used in ICU. See:
w3c/ltli#35 (comment)
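
The first of these, progressive removal of subtags, can be sketched as follows. This is a simplified illustration with a hypothetical `subtagFallbackChain` helper; it ignores any special cases a real implementation would handle, and it does not attempt the CLDR algorithm.

```javascript
// Produce the fallback chain for a language tag by repeatedly dropping
// the last subtag. Underscores (as used in _locales directory names)
// are treated like hyphens.
function subtagFallbackChain(tag) {
  const subtags = tag.replace(/_/g, '-').split('-');
  const chain = [];
  while (subtags.length > 0) {
    chain.push(subtags.join('-'));
    subtags.pop();
  }
  return chain;
}

console.log(subtagFallbackChain('zh-Hant-TW')); // [ 'zh-Hant-TW', 'zh-Hant', 'zh' ]
console.log(subtagFallbackChain('pt_BR')); // [ 'pt-BR', 'pt' ]
```

The extension's default_locale would be appended as the final step of the chain.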

@xeenon @oliverdunk Do you know which algorithm is used in Safari and Chrome? Based on this we can figure out which algorithm should be used in Firefox, considering that Firefox lacks any fallback algorithm (except falling back to the default_locale).

@xeenon
Collaborator

xeenon commented May 7, 2024

@carlosjeurissen Safari removes subtags, which we coded to match Chrome.

@oliverdunk
Member

I had a brief look through the code and Chrome appears to remove subtags as @xeenon suspected 👍

@Rob--W
Member

Rob--W commented Sep 24, 2024

Some of us (@dotproto, @Rob--W, @oliverdunk, @carlosjeurissen) met with the I18n group (@aphillips, @eemeli and others) and discussed the topic of whether to fall back (partial minutes). Chrome and Safari already have the same behavior of falling back from specific language tags to less specific ones, ultimately to default_locale. Firefox is supportive of implementing the same, and there was already a feature request at https://bugzilla.mozilla.org/show_bug.cgi?id=1381580.

Arguments in favor of multi-step fallback include the ability to have smaller messages.json files, e.g. a generic English file plus small en-US and en-GB specific files.
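
As an illustration of the file-size argument, the sketch below layers a small regional catalog over a full generic one. The catalogs and the `effectiveMessages` helper are hypothetical, and the entries are simplified to plain strings rather than the full messages.json entry format.

```javascript
// Hypothetical catalogs: a complete generic English file plus a small
// en-GB file that contains only the messages that differ.
const en = { color: 'Color', trash: 'Trash', settings: 'Settings' };
const enGB = { color: 'Colour', trash: 'Bin' };

// Layer catalogs from least to most specific; later layers win.
function effectiveMessages(...layers) {
  return Object.assign({}, ...layers);
}

const britishEnglish = effectiveMessages(en, enGB);
console.log(britishEnglish.color); // "Colour"
console.log(britishEnglish.settings); // "Settings"
```

With per-message fallback, the en-GB file only needs the two overridden entries; without it, every message would have to be duplicated in each regional file.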

@erosman

erosman commented Sep 24, 2024

@Rob--W Since the fallback process is being updated, can the following #258 (comment) be relevant as it suggests an additional step in the fallback chain?

@Rob--W
Member

Rob--W commented Sep 29, 2024

> @Rob--W Since the fallback process is being updated, can the following #258 (comment) be relevant as it suggests an additional step in the fallback chain?

I don't see the relevance of that other issue. The issue here is about unifying the fallback behavior across browsers (basically for Firefox to match Chrome and Safari). What you are proposing is an additional step, but the referenced comment mentions a feature request that has not been adopted by any browser.

@carlosjeurissen
Contributor Author

@Rob--W I believe @erosman is trying to say that once this language fallback logic has been improved in Firefox, it would be more valuable to extension authors to have a way to make use of the fallback logic, using getMessage with a specific locale tag or some form of setLanguage(), rather than just loading messages.json files directly.

@birtles

birtles commented Oct 21, 2024

> @carlosjeurissen Safari removes subtags, which we coded to match Chrome.

@xeenon is this to say it doesn't even try looking up subtags? Because that's what several people are reporting.

@xeenon
Collaborator

xeenon commented Oct 21, 2024

@birtles I'll take a look. We do look for the sub-tags first, but there might be a bug somewhere.

@xeenon
Collaborator

xeenon commented Oct 21, 2024

@birtles I'm not seeing any issues with Safari's locale fallback in Safari 18. We use zh_CN and zh_TW for Simplified Chinese and Traditional Chinese on Apple platforms. Your change to rename zh_hans to zh_CN is correct for Safari (and seems fine for Chrome and Firefox).

@birtles

birtles commented Oct 22, 2024

> @birtles I'm not seeing any issues with Safari's locale fallback in Safari 18. We use zh_CN and zh_TW for Simplified Chinese and Traditional Chinese on Apple platforms. Your change to rename zh_hans to zh_CN is correct for Safari (and seems fine for Chrome and Firefox).

Thank you so much for looking into this. I'll follow up in the issue you kindly commented on, since I'm not quite able to get this working in Safari 18 yet.

@birtles

birtles commented Oct 25, 2024

I filed Chromium issue 375528194 for the fact that Chrome doesn't seem to recognize zh_hans, only zh_CN.

@hanguokai
Member

hanguokai commented Oct 25, 2024

zh-CN and zh-TW are language code + region code.
zh-Hans and zh-Hant are language code + script code.
zh-Hans-CN, zh-Hans-SG, zh-Hant-HK and zh-Hant-TW are language code + script code + region code.
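
These components can be inspected with the standard `Intl.Locale` API. The sketch below uses a hypothetical `describeLocale` helper and normalises underscores to hyphens first, on the assumption that the input may come from a _locales directory name.

```javascript
// Split a locale identifier into language, script and region.
// Missing components are reported as undefined.
function describeLocale(tag) {
  const locale = new Intl.Locale(tag.replace(/_/g, '-'));
  return {
    language: locale.language,
    script: locale.script,
    region: locale.region,
  };
}

console.log(describeLocale('zh-CN')); // language + region, no script
console.log(describeLocale('zh-Hans')); // language + script, no region
console.log(describeLocale('zh_Hant_TW')); // language + script + region
```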

However, for historical reasons, some operating systems, browsers and other software still use or support only zh-CN rather than zh-Hans. We also discussed this in #641 (see link-1, link-2).

@xeenon
Collaborator

xeenon commented Oct 25, 2024

After my change in https://commits.webkit.org/285633@main, Safari will support script codes in _locales, including three-part locale identifiers.

We have always reported the script (if used) in i18n locale APIs as well.

@carlosjeurissen
Contributor Author

Firefox patch can be found here: https://phabricator.services.mozilla.com/D224084
