Input Data Syntax: N-Tuples

scampi edited this page Sep 14, 2011 · 9 revisions

SIREn extends Lucene with a new field type, 'tuples'. The field accepts structured information in a dedicated syntax called N-Tuples, which is derived from the N-Triples syntax (http://www.w3.org/TR/rdf-testcases/#ntriples) and is a superset of it. N-Tuples is a line-based, plain-text format for encoding semi-structured data such as RDF graphs or other data formats. The content of a tuples field is an ordered list of tuples, each tuple being an ordered list of cells. The current syntax distinguishes three types of cells:

  • URIs, or Uniform Resource Identifiers, are enclosed in '<' and '>';
  • Literals, or plain text, are written using double-quotes;
  • Blank nodes, or local identifiers (specific to the RDF data model), are written as '_:nodeID'.

A dot marks the end of a tuple. The following sections present examples of semi-structured data encoded in N-Tuples. The possibilities are not restricted to these examples; it is up to you to structure your data the way you want.
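
To make the cell syntax concrete, here is a small Python sketch that splits one N-Tuples line into typed cells. It is not SIREn's parser: the function name `parse_tuple` and the regular expression are assumptions made for illustration, the blank node pattern is deliberately simple, and language tags and datatypes on literals are not handled.

```python
import re

# One alternative per cell type (illustrative only, not SIREn's actual grammar).
CELL = re.compile(
    r'\s*(?:<(?P<uri>[^>]*)>'            # URI enclosed in '<' and '>'
    r'|"(?P<literal>(?:[^"\\]|\\.)*)"'   # literal in double quotes, backslash escapes allowed
    r'|_:(?P<bnode>[A-Za-z0-9]+))'       # blank node written as _:nodeID
)

def parse_tuple(line):
    """Split one N-Tuples line into a list of (cell_type, value) pairs."""
    line = line.strip()
    if not line.endswith('.'):
        raise ValueError("a tuple must end with a dot")
    body = line[:-1].rstrip()            # drop the terminating dot
    cells, pos = [], 0
    while pos < len(body):
        match = CELL.match(body, pos)
        if match is None:
            raise ValueError("unrecognised cell at offset %d" % pos)
        kind = match.lastgroup           # 'uri', 'literal' or 'bnode'
        cells.append((kind, match.group(kind)))
        pos = match.end()
    return cells
```

Calling `parse_tuple` on a line with three cells returns three pairs, and on a line with five cells it returns five, which reflects that an N-Triples line is simply an N-Tuples line with exactly three cells.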

N-Triples

Here is a sample of a plain N-Triples document that encodes an RDF graph. The document describes itself, i.e., the FOAF file of Renaud Delbru, and the entity identified by the URI <http://renaud.delbru.fr/rdf/foaf#me>.

<http://renaud.delbru.fr/rdf/foaf> <http://www.w3.org/2000/01/rdf-schema#label> "FOAF file of Renaud Delbru" .
<http://renaud.delbru.fr/rdf/foaf> <http://xmlns.com/foaf/0.1/maker> <http://renaud.delbru.fr/rdf/foaf#me> .
<http://renaud.delbru.fr/rdf/foaf#me> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://renaud.delbru.fr/rdf/foaf#me> <http://xmlns.com/foaf/0.1/name> "Renaud Delbru" .
<http://renaud.delbru.fr/rdf/foaf#me> <http://xmlns.com/foaf/0.1/givenname> "Renaud" .
<http://renaud.delbru.fr/rdf/foaf#me> <http://xmlns.com/foaf/0.1/family_name> "Delbru" .
<http://renaud.delbru.fr/rdf/foaf#me> <http://xmlns.com/foaf/0.1/homepage> <http://renaud.delbru.fr/> .

Entity-Centric

Here is a sample entity description using N-Tuples. Unlike the previous example, where the first cell was the identifier of an entity, here the first cell of a tuple is a predicate (or property name). The subsequent cells of the tuple are the values associated with the predicate.

As the example below shows, the syntax is flexible. In lines 1 and 3, a multi-valued predicate is modelled with a first cell representing the predicate and the following cells holding its values. You can also mix different cell types (URIs, literals) in the same tuple.

<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> "A Person" .
<http://xmlns.com/foaf/0.1/name> "Renaud Delbru" .
<http://xmlns.com/foaf/0.1/knows> <http://g1o.net#me> <http://eyaloren.org/foaf.rdf#me> .
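
One possible way to obtain an entity-centric description from plain triples is to drop the shared subject and group the objects by predicate. The Python sketch below assumes the triples are already available as (subject, predicate, object) strings in N-Tuples surface form; the function name `entity_tuples` is hypothetical, and the sample triples are assembled from the examples on this page for illustration only.

```python
def entity_tuples(triples, subject):
    """Render the triples about `subject` as entity-centric N-Tuples lines."""
    grouped = {}                                  # predicate -> list of values, in input order
    for s, p, o in triples:
        if s == subject:
            grouped.setdefault(p, []).append(o)
    # First cell is the predicate, the remaining cells are its values.
    return [" ".join([p] + values) + " ." for p, values in grouped.items()]

ME = "<http://renaud.delbru.fr/rdf/foaf#me>"
triples = [
    (ME, "<http://xmlns.com/foaf/0.1/name>", '"Renaud Delbru"'),
    (ME, "<http://xmlns.com/foaf/0.1/knows>", "<http://g1o.net#me>"),
    (ME, "<http://xmlns.com/foaf/0.1/knows>", "<http://eyaloren.org/foaf.rdf#me>"),
]

for line in entity_tuples(triples, ME):
    print(line)
```

With this input, the two `knows` triples collapse into a single tuple with one predicate cell followed by two value cells, matching the third line of the example above.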

Tabular Data

TODO