
Glossary & Resources

Romulo Cintra edited this page Feb 19, 2020 · 3 revisions

Glossary

  • language - a system of communication used by a particular country or community. ISO 639 is the main standard used to define language codes.
  • locale - the implementation of a language in a given market, including formatting (numbers, dates, etc.), common expressions, and cultural differences. For example, French as used in France (fr-FR) differs from French as used in Canada (fr-CA). BCP 47 is the main standard used to define locale codes.
  • internationalization / i18n - a set of best practices and design processes that ensure an application can be adapted to various locales without requiring code changes.
  • localization / l10n - adapting a program to run in a different locale. Most of the effort revolves around translating UI strings, so "localization" is often used synonymously with "translation". Technically, however, it also includes designing different layouts (e.g. for scripts written top-to-bottom, right-to-left) and adapting UI widgets (e.g. some icons are flipped for right-to-left languages).
  • CAT (Computer Assisted Translation) tool - an editor designed to let translators work efficiently, and integrated with other l10n services. The CAT tool UI usually has a 2-column interface in which each message's source (original) and target (translation) text are kept aligned with each other, side by side.
  • TMS (Translation Management System) - a workflow system that manages the end-to-end work of translation. Includes user upload and download, cost estimation & billing, distributing work to translators, integration of reviewers and secondary reviewers, QA / issue management, and post-editing. Some TMSes provide their own integrated CAT tools. In other cases translators choose to use their own CAT tool, in which case they may use an industry standard format like XLIFF to download the translation source and upload their finished translation.
  • post-editing - after the user receives their translated document as the result of the main translation workflow in the TMS, the user may want to make their own final touches to the translated doc. Those final touches are called post-editing.
  • Translation Memory (TM) - a database of previous translations (translation entry = source string, source language, target language, target string). TMs typically store individual messages as source strings in separate entries. TMs can be shared globally, shared within a company, and/or private to a single user.
  • Machine Translation (MT) - letting an automatic translation program translate the source text. MT is usually applied only when no entry in the Translation Memory matches the source text. It is used because correcting Machine Translation output is usually easier and cheaper than writing the translated string from scratch.
  • XLIFF (XML Localization Interchange File Format) - a localization industry standard file format that defines the structure for translation task data.
  • Okapi Framework - a software framework that enables people to develop their own l10n software (CAT tools / TMSes). Its class hierarchy is similar to the data hierarchy in the XLIFF spec. It supports XLIFF and many common textual document file formats via a plugin-style architecture. Many CAT tools / TMSes are built on Okapi.
  • translatable unit / text unit - the first level of granularity into which a translation document is broken down in XLIFF/Okapi. Represents translatable text, i.e. text to be translated by a translator. Usually corresponds to a paragraph, but this depends on the file format handler implementation.
  • segment - a sub-unit of a text unit. Usually corresponds to a sentence. Also represents translatable text.
  • placeholder - a piece of information inline within a segment that should not be translated. Usually represented in the UI as an indivisible widget (or substring).
  • Natural Language Generation (NLG) - a process that transforms structured data into natural language.
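
As an illustration of how the Translation Memory and Machine Translation entries above relate, here is a minimal sketch of a pre-translation step: look the source string up in a TM first, and fall back to MT only when there is no match. All names here (`TMEntry`, `TranslationMemory`, `pretranslate`, `fake_mt`) are hypothetical and invented for this example. Real TMs typically also support fuzzy matching, which this exact-match sketch leaves out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class TMEntry:
    """One translation entry, with the four fields named in the glossary."""
    source_lang: str
    target_lang: str
    source: str
    target: str


class TranslationMemory:
    """Exact-match TM keyed by (source language, target language, source string)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str, str], TMEntry] = {}

    def add(self, entry: TMEntry) -> None:
        key = (entry.source_lang, entry.target_lang, entry.source)
        self._entries[key] = entry

    def lookup(self, source_lang: str, target_lang: str, source: str) -> Optional[TMEntry]:
        return self._entries.get((source_lang, target_lang, source))


# Signature assumed for the MT engine: (text, source_lang, target_lang) -> translation.
MTEngine = Callable[[str, str, str], str]


def pretranslate(tm: TranslationMemory, mt: MTEngine,
                 source_lang: str, target_lang: str, source: str) -> Tuple[str, str]:
    """Return (draft translation, origin), where origin is "tm" or "mt"."""
    hit = tm.lookup(source_lang, target_lang, source)
    if hit is not None:
        return hit.target, "tm"
    # No TM match: fall back to MT; a translator would then correct the draft.
    return mt(source, source_lang, target_lang), "mt"


def fake_mt(text: str, source_lang: str, target_lang: str) -> str:
    """Stand-in for a real MT engine; it only tags the text with the target language."""
    return f"[{target_lang}] {text}"


tm = TranslationMemory()
tm.add(TMEntry("en", "fr", "Save", "Enregistrer"))
draft_from_tm = pretranslate(tm, fake_mt, "en", "fr", "Save")
draft_from_mt = pretranslate(tm, fake_mt, "en", "fr", "Cancel")
```

Returning the origin alongside the draft lets a CAT tool show the translator whether a suggestion came from a previous human translation or from MT.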

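To make the XLIFF, text unit, and placeholder entries more concrete, the sketch below parses a small XLIFF 1.2 document with Python's standard library. In XLIFF 1.2 a `<trans-unit>` corresponds roughly to a text unit, and `<ph>` marks an inline placeholder. The sample document (file name, unit id, strings) and the `[ph N]` marker format are made up for this example. XLIFF 2.x uses a different element structure that this sketch does not cover.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the XLIFF 1.2 specification.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Hypothetical sample document with one text unit containing one placeholder.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="messages.json" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="greeting">
        <source>Hello, <ph id="1">{name}</ph>!</source>
        <target>Bonjour, <ph id="1">{name}</ph> !</target>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def flatten(elem):
    """Return the element's text with each <ph> replaced by a [ph ID] marker.

    The marker stands in for the indivisible widget a CAT tool would display.
    """
    if elem is None:
        return None
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == f"{{{XLIFF_NS}}}ph":
            parts.append(f"[ph {child.get('id')}]")
        # Text that follows the inline element is stored in its tail.
        parts.append(child.tail or "")
    return "".join(parts)


def read_units(xliff_text):
    """Return a list of (unit id, source text, target text) tuples."""
    root = ET.fromstring(xliff_text)
    ns = {"x": XLIFF_NS}
    units = []
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        source = flatten(unit.find("x:source", ns))
        target = flatten(unit.find("x:target", ns))  # None if not yet translated
        units.append((unit.get("id"), source, target))
    return units


units = read_units(SAMPLE)
```

Keeping the placeholder as a single marker rather than exposing `{name}` as editable text reflects the glossary's point that placeholders should not be translated.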
Resources
