Convert translations #22


asmecher
Contributor

A proposal for a rearrangement of translation data; see pkp/pkp-lib#857 (comment)

Just in case I lose it -- the source of the throwaway conversion tool is...

```php
<?php
// Throwaway conversion tool. Requires PHP 8.0+ (uses a match expression).

// Get an associative mapping of English term => URI.
$englishRoleDocument = simplexml_load_file('https://raw.githubusercontent.com/JATS4R/jats-schematrons/master/schematrons/1.0/credit-roles.xml');
if ($englishRoleDocument === false) throw new Exception('Unable to load the CRediT roles document!');
$englishRoleToUri = [];
foreach ($englishRoleDocument->item as $item) {
  $englishTerm = (string) $item['term'];
  $uri = (string) $item['uri'];
  // Map the terms from the roles document onto the English keys used in the old-style translation files.
  $englishTerm = match($englishTerm) {
    'Writing – original draft' => 'Writing – Original Draft Preparation',
    'Writing – review & editing' => 'Writing – Review & Editing',
    'Data curation' => 'Data Curation',
    'Funding acquisition' => 'Funding Acquisition',
    'Project administration' => 'Project Administration',
    default => $englishTerm,
  };

  $englishRoleToUri[$englishTerm] = $uri;
}

// Mapping of role URI => ID, collected across all translation files.
$uriToId = [];

foreach (glob('translations/credit_translation_*.json') as $filename) {
  if (!preg_match('~translations/credit_translation_([a-z_]+)\.json~', $filename, $matches)) throw new Exception('Unexpected filename!');
  $localeCode = $matches[1];
  // Load a translation
  $data = json_decode(file_get_contents($filename), true);
  if (!is_array($data)) throw new Exception('Unable to parse ' . $filename);
  $newData = ['metadata' => $data['metadata'], 'translations' => []];
  foreach ($englishRoleToUri as $englishRole => $uri) {
    // Store old-style data in new translation data
    $newData['translations'][$uri] = [
      'name' => $data['translations'][$englishRole]['translation']['name'],
      'description' => $data['translations'][$englishRole]['translation']['description'],
    ];

    // Handle the IDs, which should be in a common descriptor file
    $id = $data['translations'][$englishRole]['id'];
    if (!isset($uriToId[$uri])) $uriToId[$uri] = $id;
    elseif ($uriToId[$uri] !== $id) throw new Exception('ID mismatch in ' . $filename . ' for role ' . $uri . ': ' . $id . ' vs. ' . $uriToId[$uri]);
  }
  file_put_contents("translations/$localeCode.json", json_encode($newData, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE));
}

// Write the mapping of CRediT roles to IDs
file_put_contents('credit_roles.json', json_encode($uriToId, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE));

// Finally, output the English versions (such as we have them)
$newData = ['metadata' => [], 'translations' => []];
foreach ($englishRoleToUri as $englishRole => $uri) {
  // Use the English term as the name; no English description is available here
  $newData['translations'][$uri] = [
    'name' => $englishRole,
    'description' => '',
  ];
}
file_put_contents('translations/en.json', json_encode($newData, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE));
```
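
To make the before/after shape easier to see, here is a minimal sketch of the same transformation applied to a single in-memory entry. Everything in the sample is a placeholder and an assumption for illustration: the `locale` metadata key, the `placeholder-id` value, the translated strings, and the `https://example.org/...` URI are hypothetical and not taken from the real files. The real script reads the URIs from `credit-roles.xml`.

```php
<?php
// Hypothetical old-style data, keyed by English term (all values are placeholders).
$old = [
  'metadata' => ['locale' => 'xx'], // assumed metadata key, for illustration only
  'translations' => [
    'Conceptualization' => [
      'id' => 'placeholder-id',
      'translation' => [
        'name' => 'Localized name',
        'description' => 'Localized description',
      ],
    ],
  ],
];

// Placeholder URI standing in for the value read from credit-roles.xml.
$englishRoleToUri = ['Conceptualization' => 'https://example.org/roles/conceptualization'];

// New-style data: translations keyed by URI; IDs moved to a separate mapping.
$new = ['metadata' => $old['metadata'], 'translations' => []];
$uriToId = [];
foreach ($englishRoleToUri as $englishRole => $uri) {
  $entry = $old['translations'][$englishRole];
  $new['translations'][$uri] = [
    'name' => $entry['translation']['name'],
    'description' => $entry['translation']['description'],
  ];
  $uriToId[$uri] = $entry['id'];
}

// Print what would be written to the locale file and to credit_roles.json.
$flags = JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES;
echo json_encode($new, $flags), "\n";
echo json_encode($uriToId, $flags), "\n";
```

In this shape each locale file carries only the translated `name` and `description` per role URI, while the IDs, which the script checks are identical across locales, live once in the common mapping.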

@marton-balazs-kovacs
Contributor

I want to loop in @mfenner, who has worked extensively on article metadata. Martin, as I mentioned in our call today, we had a JSON schema for storing translations, which you can find in the main branch of the repo. Here, @asmecher suggests a new format that would be more streamlined and easier to translate to JATS XML for validation with the JATS schematron. The goal is to incorporate the CRediT translations in OJS so people can upload their metadata in their local language.

What do you think of the suggested format? Since we are modifying the previous version of the translation JSON schema, this is a good opportunity to discuss the structure.

@asmecher asmecher marked this pull request as ready for review July 10, 2024 21:17
@marton-balazs-kovacs marton-balazs-kovacs merged commit ae7dd44 into contributorshipcollaboration:main Jul 10, 2024