Input Data Syntax: N-Tuples

rdelbru edited this page Sep 7, 2011 · 9 revisions

SIREn extends Lucene with a new field type, 'tuples'. The field accepts structured information in a syntax called N-Tuples, which is derived from the [N-Triples|http://www.w3.org/TR/rdf-testcases/#ntriples] syntax and is a superset of it. N-Tuples is a line-based, plain-text format for encoding semi-structured data such as RDF graphs or other data formats. The content of a field of type 'tuples' is an ordered list of tuples, each tuple being an ordered list of cells. The current syntax distinguishes three types of cells:

  • URIs, or Uniform Resource Identifiers, are enclosed in '<' and '>';
  • Literals, or plain text, are written using double-quotes;
  • Blank nodes, or local identifiers (specific to the RDF data model), are written as '_:nodeID'.

A dot signifies the end of a tuple. The following sections present various examples of semi-structured data encoded as N-Tuples. The possibilities are not restricted to these examples; it is up to you to structure your data the way you want.
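The rules above can be sketched as a small tokenizer. This is only an illustration in Python and not SIREn's own parser: the names `CELL` and `parse_tuple` are hypothetical, the blank node pattern follows the N-Triples naming rule as an assumption, and language tags or datatypes on literals are not handled.

```python
import re

# Hypothetical helper, not part of SIREn: one alternative per cell type.
CELL = re.compile(
    r'<(?P<uri>[^>]*)>'                       # URI enclosed in '<' and '>'
    r'|"(?P<literal>(?:[^"\\]|\\.)*)"'        # literal in double quotes
    r'|_:(?P<bnode>[A-Za-z][A-Za-z0-9]*)'     # blank node written as _:nodeID
)

def parse_tuple(line):
    """Split one N-Tuples line into a list of (cell_type, value) pairs."""
    text = line.strip()
    if not text.endswith('.'):
        raise ValueError('a tuple must end with a dot')
    body = text[:-1]  # drop the terminating dot
    cells, pos = [], 0
    while pos < len(body):
        if body[pos].isspace():
            pos += 1  # cells are separated by whitespace
            continue
        match = CELL.match(body, pos)
        if match is None:
            raise ValueError(f'unexpected input at column {pos}')
        kind = match.lastgroup  # 'uri', 'literal' or 'bnode'
        cells.append((kind, match.group(kind)))  # escapes are kept as written
        pos = match.end()
    return cells
```

For instance, `parse_tuple('_:a1 <http://xmlns.com/foaf/0.1/name> "Renaud" .')` yields one blank node cell, one URI cell and one literal cell, in that order.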

N-Triples

Here is a sample of a plain N-Triples document that encodes an RDF graph. Each line is a tuple of three cells: subject, predicate and object. The document describes itself, i.e., the FOAF file of Renaud Delbru, and the entity identified by the URI [http://renaud.delbru.fr/rdf/foaf#me].

<http://renaud.delbru.fr/rdf/foaf> <http://www.w3.org/2000/01/rdf-schema#label> "FOAF file of Renaud Delbru" .
<http://renaud.delbru.fr/rdf/foaf> <http://xmlns.com/foaf/0.1/maker> <http://renaud.delbru.fr/rdf/foaf#me> .
<http://renaud.delbru.fr/rdf/foaf#me> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://renaud.delbru.fr/rdf/foaf#me> <http://xmlns.com/foaf/0.1/name> "Renaud Delbru" .
<http://renaud.delbru.fr/rdf/foaf#me> <http://xmlns.com/foaf/0.1/givenname> "Renaud" .
<http://renaud.delbru.fr/rdf/foaf#me> <http://xmlns.com/foaf/0.1/family_name> "Delbru" .
<http://renaud.delbru.fr/rdf/foaf#me> <http://xmlns.com/foaf/0.1/homepage> <http://renaud.delbru.fr/> .
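To make the structure of such a document concrete, the following Python sketch groups the triples by their first cell, so that the two entities described above become visible. It is an illustration only: `CELL` and `group_by_subject` are hypothetical names, cells are kept in their raw written form, and language tags or datatypes on literals are assumed to be absent.

```python
import re
from collections import defaultdict

# Matches a URI, a double-quoted literal, or a blank node, in raw form.
CELL = re.compile(r'<[^>]*>|"(?:[^"\\]|\\.)*"|_:[A-Za-z][A-Za-z0-9]*')

def group_by_subject(document):
    """Map each subject cell to the list of its (predicate, object) pairs."""
    entities = defaultdict(list)
    for line in document.splitlines():
        cells = CELL.findall(line)
        if not cells:
            continue  # skip blank lines
        if len(cells) != 3:
            raise ValueError(f'expected 3 cells, got {len(cells)}: {line!r}')
        subject, predicate, obj = cells
        entities[subject].append((predicate, obj))
    return dict(entities)
```

Applied to the sample above, the result has two keys: the FOAF file itself and the entity <http://renaud.delbru.fr/rdf/foaf#me>.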

Entity-Centric

Here is a sample entity description using N-Tuples. Compared to the previous example, where the first cell was the identifier of an entity, the first cell of each tuple here is a predicate (or property name). The subsequent cells of the tuple are the values associated with the predicate.

As you can see, the syntax is flexible. In lines 1 and 3 of the example below, a multi-valued predicate is modelled with a first cell representing the predicate and the following cells holding its values. You can also mix different cell types (URIs, literals) in the same tuple.

<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> "A Person" .
<http://xmlns.com/foaf/0.1/name> "Renaud Delbru" .
<http://xmlns.com/foaf/0.1/knows> <http://g1o.net#me> <http://eyaloren.org/foaf.rdf#me> .
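One way to produce such an entity-centric description is to merge all values of a predicate into a single tuple. The Python sketch below assumes the input is a list of (predicate, value) pairs whose cells are already written in N-Tuples form; `entity_tuples` is a hypothetical helper, not a SIREn API.

```python
def entity_tuples(pairs):
    """Merge (predicate, value) pairs into one N-Tuples line per predicate."""
    merged = {}
    for predicate, value in pairs:
        # Values of the same predicate accumulate in order of appearance.
        merged.setdefault(predicate, []).append(value)
    # First cell is the predicate, following cells are its values, then a dot.
    return [' '.join([predicate, *values]) + ' .'
            for predicate, values in merged.items()]
```

Given two pairs for <http://xmlns.com/foaf/0.1/knows>, the function emits a single tuple with two value cells, as in line 3 of the example above.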

Tabular Data

TODO