Option to use .xliff as translation interchange files #11

Closed
prat0088 opened this issue May 3, 2016 · 27 comments

@prat0088
Contributor

prat0088 commented May 3, 2016

I'm new to Serge. This week I have been reading through the docs, code, and presentations. From what I gather, translation interchange files must be .po files. I was wondering if you're open to adding an option for XLIFF files, and how you think it would be best implemented in Serge.

@iafan
Contributor

iafan commented May 3, 2016

My vision is that the transport file serializer/parser should be a plugin, and support for .po transport files would become a plugin as well (used by default if not specified explicitly, for backward compatibility). This requires externalizing the .po handling code and preparing an infrastructure for such plugins. After that, writing support for XLIFF or other formats should be straightforward (provided such a format has a notion of "developer comment", "translator comment", "context", and a "fuzzy/needs work" flag besides just having key-value pairs; otherwise much of Serge's power in providing this information would be lost).

Having said that, the .po format has worked perfectly for our needs so far, which is why I haven't externalized the related code yet. If you need that, I can work on preparing the infrastructure.

It would also be good to know why you need XLIFF. Is it strictly required for integration with some translation service, or just a matter of preference?
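
To make the plugin idea above more concrete, here is a minimal sketch in Python (Serge itself is written in Perl, and this is not its actual plugin API). Every name here (`Unit`, `Serializer`, `JsonSerializer`, `get_serializer`) is hypothetical; JSON stands in for a real transport format only to show that a format must carry the comments, context, and fuzzy flag in addition to the key-value pair.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class Unit:
    """One translatable string plus the metadata a transport format must carry."""
    key: str
    source: str
    target: str = ""
    context: str = ""
    dev_comment: str = ""
    translator_comment: str = ""
    fuzzy: bool = False  # the "fuzzy / needs work" flag


class Serializer(ABC):
    """Hypothetical plugin interface: one subclass per transport format."""

    @abstractmethod
    def serialize(self, units: List[Unit]) -> str: ...

    @abstractmethod
    def deserialize(self, text: str) -> List[Unit]: ...


class JsonSerializer(Serializer):
    """Stand-in format; a .po or XLIFF plugin would implement the same two methods."""

    def serialize(self, units: List[Unit]) -> str:
        return json.dumps([asdict(u) for u in units], ensure_ascii=False, indent=2)

    def deserialize(self, text: str) -> List[Unit]:
        return [Unit(**item) for item in json.loads(text)]


# Registry with a default, mirroring "used by default if not specified explicitly".
REGISTRY = {"json": JsonSerializer}
DEFAULT_FORMAT = "json"


def get_serializer(name: Optional[str] = None) -> Serializer:
    return REGISTRY[name or DEFAULT_FORMAT]()
```

A format that cannot round-trip all fields of `Unit` would lose information on the way back, which is the concern raised above.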

@prat0088
Contributor Author

prat0088 commented May 6, 2016

I see. That seems like a reasonable design.

The reason I'm interested in this possibility is because the commercial CAT tools I'm trialing have better support for xliff.

@iafan
Contributor

iafan commented May 6, 2016

If your intent is to use Serge with some external CAT tool in order to implement the continuous localization approach, make sure such a tool does not work in terms of "uploading translation jobs", but can simply reflect the current state of your translation files (in other words, there should be a way to synchronize with the files by adding missing strings to the CAT tool's internal database and removing outdated ones from it). Alternatively, this could be an offline CAT tool that you simply use to open the translation files directly.

If you want, ping me on our IRC channel so we could discuss this in a bit more detail.

I also strongly suggest looking at Pootle as your translation frontend (this is what we use at Evernote), as it works with Serge beautifully.
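
The synchronization requirement mentioned above (adding missing strings and removing outdated ones) can be sketched roughly as follows. This is an illustration only: `sync_catalog` and the dictionary-based "database" are hypothetical and do not correspond to any particular CAT tool's API.

```python
from typing import Dict, List, Tuple


def sync_catalog(catalog: Dict[str, str],
                 file_units: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Make `catalog` (key -> translation) reflect the keys present in the files.

    `file_units` maps key -> source string as found in the current translation
    files. Existing translations are kept, missing keys are added untranslated,
    and keys that no longer exist in the files are dropped.
    """
    added = [key for key in file_units if key not in catalog]
    removed = [key for key in catalog if key not in file_units]
    for key in removed:
        del catalog[key]
    for key in added:
        catalog[key] = ""  # new string, not translated yet
    return added, removed
```

The point is that the tool's state is derived from the files on every run, rather than from a one-off uploaded job.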

@nemoeslovo

nemoeslovo commented May 22, 2016

A possible workaround for now could be a TranslationService plugin that uses the xliff2po utility from the Pootle developers to convert the generated files before actually performing pull-ts and push-ts.

@prat0088
Contributor Author

I just came across another use case for this request:

For very large translation requests, it's sometimes convenient for our translators to use Excel because of macros and navigation speed. It's what most are familiar with. If we could load Pootle with .csv files instead of .po files, then the translators would have the option of doing offline translation on the .csv.

I'm still interested in modularizing Engine.pm so that users can choose their ts output format.

@iafan
Contributor

iafan commented May 23, 2016

I'll look at externalizing this soon. This needs to be done for the sake of code quality/maintainability regardless of the external use cases (which I might not fully understand or agree with).

Speaking of CSV/Excel, I understand that it might sound convenient (it would ensure a smoother path for your translators), but just so you are aware, the downsides are:

  1. Lack of TM (similar translations) provided within translation UI
  2. Lack of terminology suggestions
  3. Lack of quality checks and immediate feedback
  4. Lack of the "Needs review" notion
  5. Translators might be working with the strings which are already outdated (and which were removed from Pootle).

If your concern is primarily speed, I'd suggest looking into both options:

  1. Improving Pootle speed when it comes to navigating between units. There's room for optimization there, but if you see poor performance, this is something that the Pootle devs need to be aware of.
  2. As a transitional step, you might want to look at offline editing of .po files: there are tools that allow you to translate .po offline, and their UI is still orders of magnitude better than using Excel for translation. Take a look at Virtaal (done by the same guys behind Pootle), or POEdit.

@prat0088
Contributor Author

prat0088 commented May 24, 2016

Thanks for pointing out the downsides. I am aware of them, but there are a few small cases where it can make sense for the type and quantity of content we translate here. I'm not advocating it as the go-to tool. Just something to keep in their back pockets in the rare case it is needed.

If your concern is primarily speed...

General interface responsiveness is one part of "speed". I agree Pootle should be enhanced as you suggest.

Pre-translation and suggestions are the other part of "speed" in certain instances for certain categories of requests. We found Excel and one commercial CAT tool work great for us here. This is entirely in-house and business related so I can't go into more detail. I'm reasonably certain this makes sense to us so I'll just leave it as our special case.

@erikogan
Contributor

erikogan commented Jun 9, 2016

I would love to be able to use XLIFF 2.0 as the translation interchange format over PO. I would also be happy to help with the efforts to externalize the PO support.

(I am relatively new to Serge, and my Perl skills have atrophied over the last 8-10 years, but I think this could be a great way to familiarize myself with the codebase, and those muscles were well developed once, they will return eventually.)

@iafan
Contributor

iafan commented Jun 10, 2016

@erikogan thanks for volunteering! I think we can split the effort: I'll deal with externalizing the current code and creating a .po serializer plugin, and you could build on that to provide an XLIFF serializer.

@iafan iafan self-assigned this Jun 10, 2016
iafan pushed a commit that referenced this issue Jun 16, 2016
- .PO file support is now implemented as a serialization plugin
- .CSV serialization plugin added
@iafan
Contributor

iafan commented Jun 16, 2016

Ok, the first part of the work is done.

@prat0088 I also added a CSV serialization plugin. Let me know if this works for you. In addition to providing translations, you can also mark strings as needing work there, and translators can provide comments in a separate column. See the new serialize_csv test for more information and a sample .csv file. Docs on serge.io will be added later.

@prat0088
Contributor Author

@iafan Thanks! I think I'll have time in the next week to try it out.

@iafan
Contributor

iafan commented Jun 18, 2016

Documentation has been added.

@whereisjim

Any update on the XLIFF 2.0 serializer?

@iafan
Contributor

iafan commented Feb 3, 2017

@whereisjim I haven't heard of any such activity, but it shouldn't be that hard to add now that we have an XLIFF 1/2 parser to borrow code from.

Is the absence of an XLIFF serializer preventing you from using Serge in a specific scenario? What do you do with the serialized files?

@whereisjim

We are trying to use the markup tag in XLIFF to block some keywords so we can prevent over-translation of certain strings such as product names, function names, code, etc.

@iafan
Contributor

iafan commented Feb 3, 2017

Ok, so you need an XLIFF serializer with the ability to specify (e.g. by means of regular expressions) some sequences that should be marked as untranslatable.
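
As a rough illustration of that idea, the sketch below wraps regex matches in XLIFF 1.2 `<mrk mtype="protected">` markers while XML-escaping everything else. It is written in Python for brevity and is not taken from any Serge plugin; the function name `protect` and the sample patterns are assumptions made for this example.

```python
import re
from typing import List
from xml.sax.saxutils import escape


def protect(text: str, patterns: List[str]) -> str:
    """Return `text` as XLIFF 1.2 inline content with matches marked protected.

    `patterns` is a list of regular expressions (hypothetical configuration)
    describing sequences that translators should not change.
    """
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    parts = []
    pos = 0
    for match in combined.finditer(text):
        if match.start() == match.end():
            continue  # ignore empty matches so no empty <mrk> is emitted
        parts.append(escape(text[pos:match.start()]))
        parts.append(f'<mrk mtype="protected">{escape(match.group(0))}</mrk>')
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "".join(parts)


print(protect("Open Evernote & press %s", [r"Evernote", r"%s"]))
# Open <mrk mtype="protected">Evernote</mrk> &amp; press <mrk mtype="protected">%s</mrk>
```

Whether a given CAT tool actually honors `mtype="protected"` would need to be checked per tool.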

Do you send these XLIFF files to some external localization vendor? Or do you have your own localization software? The reason I ask is that we use Serge with the Zing translation server (our own fork of Pootle), and we found that, instead of locking parts of the string as non-translatable, it's easier (both implementation-wise and from the translator's experience perspective) to allow editing the entire string, but have this string validated immediately afterwards. We implemented many quality checks, so that if someone breaks placeholders or tags, they will immediately see this in the translation UI, and localization managers will see this as well, so it is pretty trivial to go through failing units and fix them.
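
The validate-after-editing approach can be sketched as a simple placeholder and tag comparison between source and translation. This is only an illustration and not one of Zing's actual checks; the `PLACEHOLDER` pattern and the `check_placeholders` name are assumptions, and real projects would tune the pattern to their own placeholder syntax.

```python
import re
from collections import Counter
from typing import List, Tuple

# Assumed placeholder syntax: printf-style %s/%d (optionally positional),
# {name}-style variables, and simple HTML/XML tags.
PLACEHOLDER = re.compile(r"%(?:\d+\$)?[sd]|\{\w+\}|</?\w+[^>]*>")


def check_placeholders(source: str, target: str) -> Tuple[List[str], List[str]]:
    """Return (missing, extra) placeholders in `target` relative to `source`."""
    src = Counter(PLACEHOLDER.findall(source))
    tgt = Counter(PLACEHOLDER.findall(target))
    missing = sorted((src - tgt).elements())  # present in source, lost in target
    extra = sorted((tgt - src).elements())    # introduced by the translator
    return missing, extra


missing, extra = check_placeholders(
    "Hello, {name}! You have %d <b>new</b> notes.",
    "Bonjour ! Vous avez %d <b>nouvelles notes.",
)
print(missing, extra)  # ['</b>', '{name}'] []
```

A unit with a non-empty `missing` or `extra` list would be flagged as failing, so it can be found and fixed without locking any part of the string.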

@whereisjim

Right now, our process is built around Catalyst, and we are thinking about moving to the web.
Thanks for the link to Zing. I just checked the page and have a quick question about 'Requirements: TTK bump'.
What does TTK mean here?

@whereisjim

Actually, we have a similar plan for locking or not.
We are also thinking about using Terminology to prevent over-translation instead of actually locking the terms in the sentences.

@iafan
Contributor

iafan commented Feb 3, 2017

What does TTK mean here?

TTK stands for Translate Toolkit (an underlying library used in Pootle and [still] used in Zing)

@dragosv
Contributor

dragosv commented Mar 2, 2018

Beta versions of xliff serializers are here

Xliff 1.2
https://github.com/dragosv/serge/tree/xliff
https://github.com/dragosv/serge/blob/xliff/lib/Serge/Engine/Plugin/serialize_xliff.pm

Xliff 2.0
https://github.com/dragosv/serge/tree/xliff2
https://github.com/dragosv/serge/blob/xliff2/lib/Serge/Engine/Plugin/serialize_xliff2.pm

Xliff 1.2 has been tested against various translation services, while 2.0 was only tested manually (using Ocelot).

Any feedback would be appreciated. Will add documentation soon. For now, there are tests for most of the options the serializers support.

@dragosv
Contributor

dragosv commented Mar 2, 2018

In case anyone is wondering, two versions of the xliff serializer are needed because Xliff 1.x and 2.x diverged heavily and there is no backwards compatibility.

@iafan
Contributor

iafan commented Mar 3, 2018

@dragosv why do you think it's important to support both versions? Do other services only support 1.x at the moment?

@dragosv
Contributor

dragosv commented Mar 3, 2018

Most only support 1.2 or a subset of it. Just a few support 2.0, so initially it is OK if only 1.2 is supported; then, after testing the 2.0 serializer against providers that support it, the xliff 2.0 serializer should be added.

@dragosv
Contributor

dragosv commented Mar 3, 2018

On top of that, 1.2 was tested against 6 providers, and 5 of them support it to a certain level, while I have not tested 2.0 against any provider.

@dragosv
Contributor

dragosv commented Mar 3, 2018

@erikogan the xliff 2.0 serializer is here in a beta version. Please take a look and let me know what you think.

https://github.com/dragosv/serge/tree/xliff2
https://github.com/dragosv/serge/blob/xliff2/lib/Serge/Engine/Plugin/serialize_xliff2.pm

@dragosv
Contributor

dragosv commented Mar 5, 2018

Pull request #78 created for the xliff 1.2 serializer

@dragosv
Contributor

dragosv commented Feb 1, 2019

@iafan This was merged and should be closed.

@iafan iafan closed this as completed Feb 1, 2019