Strings and Languages

Martin Ledvinka edited this page May 27, 2024 · 1 revision

RDF allows strings to be language tagged (using the RDF langString datatype), which simplifies application internationalization.

On the repository access level, language tagged strings are represented by the LangString class. An RDF langString value such as "test"@en is thus represented by a LangString instance.

JOPA itself works with language tagged strings in multiple ways. First, it allows setting a persistence unit-level language (using the cz.cvut.jopa.lang config parameter) that is used for all String-valued attributes. This global setting can be overridden at the attribute level by specifying the simpleLiteral attribute, or at the instance level by using a descriptor. If no global language is configured, all String-valued attribute values are stored as RDF simple literals (i.e., strings without a language tag).
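The override rules above can be sketched in plain Java. This is only an illustration, not JOPA's implementation: the class LanguageResolutionSketch and its methods are hypothetical, and the relative precedence of the two overrides (simpleLiteral winning over a descriptor language) is an assumption of this sketch that the text above does not specify. Only the cz.cvut.jopa.lang key is taken from the text.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical, stand-alone sketch of how a String attribute's language might be resolved. */
class LanguageResolutionSketch {

    /** Persistence unit-level configuration key named in the text. */
    static final String LANG_KEY = "cz.cvut.jopa.lang";

    /**
     * @param config             persistence unit configuration
     * @param simpleLiteral      whether the attribute is declared as a simple literal
     * @param descriptorLanguage language set in a descriptor, or null if none was set
     * @return the language to tag the value with, or empty for a simple literal
     */
    static Optional<String> resolveLanguage(Map<String, String> config,
                                            boolean simpleLiteral,
                                            String descriptorLanguage) {
        if (simpleLiteral) {
            // Assumption: an attribute-level simple literal is never language tagged.
            return Optional.empty();
        }
        if (descriptorLanguage != null) {
            // Instance-level override via a descriptor.
            return Optional.of(descriptorLanguage);
        }
        // Fall back to the global language; absent means a simple literal.
        return Optional.ofNullable(config.get(LANG_KEY));
    }

    /** Renders the value the way it would appear in the repository. */
    static String toLiteral(String value, Optional<String> language) {
        return language.map(lang -> "\"" + value + "\"@" + lang)
                       .orElse("\"" + value + "\"");
    }

    public static void main(String[] args) {
        Map<String, String> config = Map.of(LANG_KEY, "en");
        System.out.println(toLiteral("test", resolveLanguage(config, false, null)));
        System.out.println(toLiteral("test", resolveLanguage(config, false, "cs")));
        System.out.println(toLiteral("test", resolveLanguage(config, true, null)));
        System.out.println(toLiteral("test", resolveLanguage(Map.of(), false, null)));
    }
}
```

With the global language set to en, the four calls in main cover the global default, a descriptor override, a simple literal attribute, and an unconfigured persistence unit.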

JOPA also supports working with multilingual strings, i.e. translations of the same value in multiple languages.

Multilingual string attributes allow the JOPA object model to access language tagged string values (RDF langString datatype). On the object model level, these are represented by the MultilingualString class.

Probably the best way to illustrate how the support for language tagged strings works is by an example, so here's one.

Example

Ontology Level

ex:a a ex:ClassA ;
rdfs:label "Building"@en , "Budova"@cs , "Bau, der"@de .

OntoDriver Level

Maps to four Axiom instances:

final Axiom<NamedResource> classAssertion; // ex:a a ex:ClassA
final Axiom<LangString> enLabel; // ex:a rdfs:label "Building"@en
final Axiom<LangString> csLabel; // ex:a rdfs:label "Budova"@cs
final Axiom<LangString> deLabel; // ex:a rdfs:label "Bau, der"@de

Object Model

@OWLClass(iri = "ex:ClassA")
public class ClassA {

@Id
private URI id;

@OWLDataProperty(iri = "rdfs:label")
private MultilingualString label;

// ...
}

final ClassA a;
// a.id = ex:a
// a.label = en -> "Building", cs -> "Budova", de -> "Bau, der"

Multilingual strings are supported both as annotation and data property values.

Another example, which also illustrates the persistence unit-level language setting, can be found in the Strings and languages demo.

Plural Multilingual String Attribute Mapping

Multilingual strings can also be used as values of plural attributes. In that case, however, it is impossible to determine which translations belong to the same value (unless a sequence is used to structure the values). Because of this, when loading such attribute values, JOPA tries to 'fill' translations from the beginning: it finds the first MultilingualString instance that does not yet contain a translation in the language currently being processed and adds the value to it. If no such instance exists, a new MultilingualString instance is created and added to the target collection.

Again, an example will probably best illustrate the principle.

ex:a a ex:ClassA ;
rdfs:label "Building"@en , "Stavba"@cs , "Construction"@en , "Budova"@cs , "Bau, der"@de .

@OWLClass(iri = "ex:ClassA")
public class ClassA {

@Id
private URI id;

@OWLDataProperty(iri = "rdfs:label")
private Set<MultilingualString> label;

// ...
}

final ClassA a;
// a.id = ex:a

Now, when loading a's labels, the following sequence of events will occur (assuming the values are loaded in the order written above):

  1. Processing "Building"@en, a.label = [{en -> "Building"}]
  2. Processing "Stavba"@cs, a.label = [{en -> "Building", cs -> "Stavba"}]
  3. Processing "Construction"@en, a.label = [{en -> "Building", cs -> "Stavba"}, {en -> "Construction"}]
  4. Processing "Budova"@cs, a.label = [{en -> "Building", cs -> "Stavba"}, {en -> "Construction", cs -> "Budova"}]
  5. Processing "Bau, der"@de, a.label = [{en -> "Building", cs -> "Stavba", de -> "Bau, der"}, {en -> "Construction", cs -> "Budova"}]
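
The loading steps above can be reproduced with a small stand-alone Java sketch of the 'fill from the beginning' strategy. It is not JOPA's actual code: PluralFillSketch and the Tagged record are hypothetical names, each MultilingualString is approximated by a plain language-to-value map, and a List is used instead of the Set from the entity so that the order of instances stays visible.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, stand-alone sketch of filling translations from the beginning. */
class PluralFillSketch {

    /** Stand-in for a language tagged literal such as "Building"@en. */
    record Tagged(String value, String language) {}

    /**
     * Distributes the literals, in the given order, into translation maps
     * (language -> value). Each map stands in for one MultilingualString.
     */
    static List<Map<String, String>> fill(List<Tagged> literals) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Tagged literal : literals) {
            Map<String, String> target = null;
            // Find the first instance without a translation in this language.
            for (Map<String, String> candidate : result) {
                if (!candidate.containsKey(literal.language())) {
                    target = candidate;
                    break;
                }
            }
            if (target == null) {
                // None found: start a new instance and add it to the collection.
                target = new LinkedHashMap<>();
                result.add(target);
            }
            target.put(literal.language(), literal.value());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Tagged> labels = List.of(
                new Tagged("Building", "en"),
                new Tagged("Stavba", "cs"),
                new Tagged("Construction", "en"),
                new Tagged("Budova", "cs"),
                new Tagged("Bau, der", "de"));
        // Prints two maps: the first with en, cs and de, the second with en and cs.
        System.out.println(fill(labels));
    }
}
```

Because the result depends on the order in which the values are loaded, feeding the same literals in a different order can group the translations differently, which is why a sequence is needed when the grouping matters.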