GTFS-Translations #180
+1 from Google. We have a big provider that supplies more than 100 feeds in several countries and uses the GTFS-Translations spec. |
Here is a feed that Google gets from our producer for the city of Lviv in Ukraine: |
The default language, defined per dataset, is not clear to me. Shouldn't we allow a different default language per record? If, for example, a dataset covers the whole of Switzerland, what would the default language be? To me it sounds like Zürich (de) should be the default for Zurich, but Genève (fr) the default for Geneva. |
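@flocsy: Yes, this is the example cited in the definition of feed_lang. When the default language must vary from place to place, you can define it as mul and provide the local version in every place:
If the dataset contains values in multiple languages (e.g. in multilingual countries like Switzerland, Belgium or Canada), the norm ISO 639-2 contains the language code “mul” to describe such reality. In such case, the best practice is to provide a translation for each of the languages used in the dataset.
For example, a dataset in Switzerland will have feed_lang=mul and will contain by default stop names “Genève” for Geneva, “Zürich” for Zurich and “Biel/Bienne” for the bilingual city of Biel/Bienne. But translations will be provided, in German: “Genf”, “Zürich” and “Biel”; in French: “Genève”, “Zurich” and “Bienne”; in Italian: “Ginevra”, “Zurigo” and “Bienna”; and in English: “Geneva”, “Zurich” and “Biel/Bienne”.
If what you're suggesting is that we attach default_lang information to a subsection of the feed, like agency or stop, that could be doable, but I don't see the added value in doing it. |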
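To illustrate the feed_lang=mul approach for a Swiss dataset, where stops.txt holds the local names and translations.txt holds the other languages, here is a minimal Python sketch of a consumer-side lookup. The sample rows, the stop_id values, and the helper names (load_translations, stop_name) are hypothetical; the column names are assumed to follow the translations.txt layout of the proposal (table_name, field_name, language, translation, record_id).

```python
import csv
import io

# Hypothetical sample data modelled on the Swiss example in the proposal.
STOPS_TXT = """stop_id,stop_name
1,Genève
2,Zürich
3,Biel/Bienne
"""

# Only values that differ from the default stop_name need a row here.
TRANSLATIONS_TXT = """table_name,field_name,language,translation,record_id
stops,stop_name,de,Genf,1
stops,stop_name,de,Biel,3
stops,stop_name,fr,Zurich,2
stops,stop_name,fr,Bienne,3
stops,stop_name,it,Ginevra,1
stops,stop_name,it,Zurigo,2
stops,stop_name,it,Bienna,3
stops,stop_name,en,Geneva,1
stops,stop_name,en,Zurich,2
"""


def load_translations(text):
    """Index translations by (table, field, record_id, language)."""
    index = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["table_name"], row["field_name"],
               row["record_id"], row["language"])
        index[key] = row["translation"]
    return index


def stop_name(stop, translations, lang):
    """Return the stop name in `lang`, falling back to the default value."""
    key = ("stops", "stop_name", stop["stop_id"], lang)
    return translations.get(key, stop["stop_name"])


stops = list(csv.DictReader(io.StringIO(STOPS_TXT)))
translations = load_translations(TRANSLATIONS_TXT)
for stop in stops:
    print(stop["stop_id"], stop_name(stop, translations, "de"))
```

When no translation row exists for the requested language, the sketch simply returns the default stop_name; a real consumer may choose a different fallback policy.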
Ah, ok, I wasn't aware of the "mul" standard. I thought it meant something
else. Maybe you should emphasize it so everyone understands. Just to help
you see where to rephrase, this is what I thought it meant:
I thought that feeds usually have one language only, so they would have
feed_lang=hu (which would also mean there's no translations.txt).
However, if there is a translations.txt, then feed_lang must be set to "mul".
Now I understand that the two (the existence of translations.txt and
feed_lang=mul) are not necessarily connected.
Regarding the added value of default_lang per sub section: no, per section
it wouldn't give any useful info, I agree.
But I would maybe like to see an optional field: "lang" in all the tables
that can be translated, and it would only be used/useful if the
default_lang="mul".
Well, it really depends on the consumer apps... but I can say that in a
place where they use Latin letters I might prefer to see the local names
(Geneva, Zürich), because that's the way I will most probably see/hear them,
so why display them in English just because my phone's language is set to
English, Hungarian, or Hebrew?
On the other hand, if we're talking about a place where they use non-Latin
letters, then I am not able to read them, so I'd prefer English.
I agree that for the 2 examples I gave it's not necessary to know the
default language of each record, but it might be useful.
On Thu, Aug 8, 2019 at 6:40 PM Leo Frachet wrote:
@flocsy <https://github.com/flocsy>: Yes, this is the example cited in
the definition of feed_lang. When the default language must vary from
place to place, you can define it as mul and provide the local version in
every place:
If the dataset contains values in multiple languages (e.g. in multilingual
countries like Switzerland, Belgium or Canada), the norm ISO 639-2 contains
the language code “mul” to describe such reality. In such case, the best
practice is to provide a translation for each of the languages used in the
dataset.
For example, a dataset in Switzerland will have feed_lang=mul and will
contain by default stop names “Genève” for Geneva, “Zürich” for Zurich and
“Biel/Bienne” for the bilingual city of Biel/Bienne. But translations will
be provided, in German: “Genf”, “Zürich” and “Biel”; in French: “Genève”,
“Zurich” and “Bienne”; in Italian: “Ginevra”, “Zurigo” and “Bienna”; and in
English: “Geneva”, “Zurich” and “Biel/Bienne”.
If what you're suggesting is that we attach default_lang information to a
subsection of the feed, like agency or stop, that could be doable, but I
don't see the added value in doing it.
--
Gavriel Fleischer
|
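The display preference described in the comment above (keep the local name when it is written in Latin letters, otherwise fall back to English) could be sketched on the consumer side as follows. This is only one possible heuristic, not something the proposal prescribes; the function names is_latin and display_name and the sample values are hypothetical.

```python
import unicodedata


def is_latin(text):
    """True if every alphabetic character in `text` is a Latin-script letter."""
    return all(
        unicodedata.name(ch, "").startswith("LATIN")
        for ch in text
        if ch.isalpha()
    )


def display_name(local_name, translations, fallback_lang="en"):
    """Prefer the local name if it is readable in Latin script.

    `translations` maps a language code to the translated name.
    If the local name is not in Latin script and no fallback
    translation exists, the local name is returned unchanged.
    """
    if is_latin(local_name):
        return local_name
    return translations.get(fallback_lang, local_name)


print(display_name("Zürich", {"en": "Zurich"}))
print(display_name("Львів", {"en": "Lviv"}))
```

Other consumers may reasonably prefer the user's own language in every case; the choice stays with the consuming application.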
It would be useful for some, but sometimes just one stop name would already be
Indeed, it depends on the consumer apps. There would be a lot to say about how those translated fields should be filled (e.g. should "Köln" be translated in English as "Cologne"? "Köln (Cologne)"?), but IMHO it is up to the data producer to produce them, and up to the consumer to decide how to display them. Maybe some guidelines will be useful down the road if we see inconsistent behavior. |
I'm opening the vote on this proposal. Vote will be open until next Thursday 22nd, 23:59:59 UTC. |
I still would like to change the following sentence to make it clearer:
It's unclear IMHO what "dataset contains values in multiple languages" means. In my reading, this means that there is more than one language in the DATASET. However, if the default is "en" and I provide a translation to "fr", then there is no need for "mul". I would suggest something like:
If the default values in the dataset contain values in multiple languages (e.g. in multilingual countries like Switzerland, Belgium or Canada in stops.txt you have more than one language), the norm ISO 639-2 contains the language code “mul” to describe such reality. In such case, the best practice is to provide a translation for each of the languages used in the dataset. If all the labels in stops.txt are in one language, and there are translations in translations.txt, then "mul" is not to be used.
I'm sure the English speakers can improve it even further; I'd like it to be as explicit as possible. |
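The rule suggested in the comment above can be written down as a small Python sketch. It assumes the producer already knows the language of each untranslated (default) value; the function name suggest_feed_lang is hypothetical and not part of any GTFS tooling.

```python
def suggest_feed_lang(default_value_langs):
    """Suggest a feed_lang from the languages of the untranslated values.

    `default_value_langs` is an iterable of language codes, one per
    default value (for example, one per stop_name in stops.txt).
    Whether translations.txt exists plays no role in the decision.
    """
    langs = set(default_value_langs)
    if not langs:
        raise ValueError("at least one language code is required")
    if len(langs) == 1:
        return langs.pop()
    return "mul"


# All default values in English, translations provided separately: not "mul".
print(suggest_feed_lang(["en", "en", "en"]))
# Default values mix French and German, as in a Swiss dataset: "mul".
print(suggest_feed_lang(["fr", "de", "de"]))
```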
Thanks @flocsy for the suggested language. I'm adding a slightly altered version of your proposal:
|
Since nobody has voted since I opened the vote, and since we changed the phrasing, I'm closing and reopening the vote. The vote will be open until Thursday 22nd, 23:59:59 UTC. |
+1 from Google. |
+1 |
It took me a while to understand the text "If the untranslated values in the dataset are in multiple languages (e.g. in multilingual countries like Switzerland, Belgium or Canada the stop_name in stops.txt will be by default in different languages depending of the area)". I think it will not be immediately apparent to many readers what this means. The expressions "untranslated values are in multiple languages" and "depending on the area" are ambiguous. At first I thought this was describing some kind of system that reacted to the location of the reader or consumer and extracted language-specific sub-values out of multilingual individual records.
Here is my attempt at a rewrite (also correcting some small errors with prepositions etc.):
Datasets may contain untranslated values in multiple languages. For example, in a multilingual country like Switzerland, Belgium, or Canada, the stop_name field of each stop could be in a different language, depending on the dominant language in that stop's geographic location. In such cases, the feed_lang field should contain the language code
Though the comments mention putting off the stop_name="Biel/Bienne" case for the future, the proposal in its current form covers both known use cases of the |
So is it up for vote again? |
+1 (Google) |
+1 (Kisio) |
@flocsy and @aababilov |
@flocsy and @abyrd
Below would be the whole field description:
Please, don’t hesitate to provide any feedback! |
this is clear to me! |
Since both a producer and a consumer have implemented
The vote will be open until Monday, December 23rd at 23:59:59 UTC. |
+1 Stichting OpenGeo / Bliksem Labs |
+1 from Google. |
+1 from Transit |
+1 from Kisio |
+1 from SkedGo |
+1 (Moovit) |
+1 from Trillium |
The vote is closed. We have 6 votes in favor. Zero against. We have a producer and a consumer. So the proposal is adopted 🎉 ! |
As explained in issue #138 (Jan 29th 2019) and then in issue #175 (Jul 28th 2019), we drafted a GTFS-Translations proposal (bit.ly/gtfs-translations), which is based on Google's old private GTFS translation extension.
Since then, and after a few modifications to the proposal (see the Google doc), Google has shifted to using it internally, deprecating their old private GTFS translation extension, as described in their documentation (here).
I'm opening a pull request with the current (2019-08-07T22:00:00-04:00) state of the Google Doc.
Google has already been consuming it for quite a while. What's currently missing to open the vote is a producer.