DMRS representation: named graphs #24
An idea close to the RDF dataset that is implemented in RDFLib is the RDF store (more on the ideas behind it here).
When we serialize the RDF in a format that does not encode named graphs, such as N-Triples, the same triples are still written out but the graph names are dropped. As we are still connecting each semantic representation to its parts, the file remains consistent with the earlier versions. The only thing that is not consistent is the TOP and INDEX encoding, which is not satisfactory either: we end up with a lot of blank nodes linked to MRS variables. The problem with connecting the MRS URI to its TOP/INDEX is that there would be a triple whose subject is the same URI as the graph URI (in the case above, we would have |
RDFLib version 6.0.0 is out and it offers a new way of solving this problem.
The newest version provides the class Dataset, an implementation of the RDF 1.1 Dataset notion. In terms of usage, it is similar to ConjunctiveGraph, which is a graph that contains all graphs of its store. Both are easier to use because they already have a "default graph" to which we can add triples directly. Therefore, named graphs can be created in four different ways: creating a |
It is not clear what the pros and cons of each approach are, or whether you have already decided which way to go.
Maybe related to the discussion above about HOW to use RDFLib to implement named graphs, we still need clarity about how information is modeled in named graphs. From the beginning, we have known that named graphs introduce one disadvantage: not all triple stores and libraries implement them. So we should, if possible, be able to produce RDF both with and without named graphs. The simpler solution is to keep a single code path and, given the desired output format, decide whether the named graph part (the context, the fourth element of each quad) should be serialized or not. In other words, if the user asks for a format that only supports triples (e.g. Turtle), the fourth element of each quad is discarded. But there is a potential problem with that: redundancy. Below, the first quad says that the URI has a type, and this information is in the named graph named by the same URI. See #30, but it is fine. The next two quads say that a Node and a Link belong to the DMRS, which is itself the named graph where those triples are defined.
We can live with this redundancy: if the N-Quads are loaded in a triple store, all triples of
If I just ignore the fourth element of the quads, it gives me:
Or should we remove those triples with |
The redundancy in the RDF quad generation is there to make the output compatible with applications that only support triples, without having to decide whether to generate triples or quads, as you pointed out; the CLI application can be used to output |
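The idea discussed above, generating quads once and dropping the fourth element only when the requested format cannot carry it, can be sketched without any RDF library. This is a minimal illustration, not the project's code: the `to_lines` helper, the `QUAD_FORMATS` set, the `http://example.org/` namespace and the `hasNode`/`hasLink` predicates are all hypothetical names chosen for the example, and every term is assumed to be an IRI.

```python
from typing import Iterable, List, Optional, Tuple

# A quad is (subject, predicate, object, graph); graph is None for the default graph.
Quad = Tuple[str, str, str, Optional[str]]

# Formats assumed here to carry the fourth element (illustrative, not RDFLib's registry).
QUAD_FORMATS = {"nquads", "trig"}


def to_lines(quads: Iterable[Quad], fmt: str) -> List[str]:
    """Write one statement per line, keeping the graph only for quad formats."""
    keep_graph = fmt in QUAD_FORMATS
    seen = set()
    lines = []
    for s, p, o, g in quads:
        terms = [s, p, o]
        if keep_graph and g is not None:
            terms.append(g)
        # All terms are IRIs in this sketch, so each is wrapped in angle brackets.
        line = " ".join(f"<{t}>" for t in terms) + " ."
        if line not in seen:  # drop statements that become identical without the graph
            seen.add(line)
            lines.append(line)
    return lines


EX = "http://example.org/"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
dmrs = EX + "dmrs/0"  # used both as a subject and as the graph name

quads = [
    (dmrs, RDF_TYPE, EX + "DMRS", dmrs),
    (dmrs, EX + "hasNode", EX + "dmrs/0/node/1", dmrs),
    (dmrs, EX + "hasLink", EX + "dmrs/0/link/1", dmrs),
]

nquads_lines = to_lines(quads, "nquads")  # four terms per line
ntriples_lines = to_lines(quads, "nt")    # same statements, graph dropped
```

In this sketch the triples that tie a Node or Link to its DMRS survive in the triple-only output, which is why the redundancy keeps both outputs consistent with each other.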
Related to #21
If a user asks for a representation that supports named graphs, we should be able to produce it. In the CLI, the representations are limited to the ones that RDFLib supports. See https://en.wikipedia.org/wiki/N-Triples#N-Quads for one such format.
But a user may want to save in JSON-LD with or without one named graph per semantic representation... Moreover, inside Python code, the user may need to specify whether a named graph should be used or all triples should go into the single default graph (more about these concepts). What alternatives do we have?
Regarding triple stores, AllegroGraph supports N-Quads and JSON-LD, both formats compatible with named graphs. More about N-Quads.
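One possible alternative for the Python side is a single flag that decides where the triples of each semantic representation are placed. The sketch below is library-independent and only illustrates the choice: `build_dataset`, the `named_graphs` parameter and the example URIs are hypothetical, and the dataset is modeled as a plain dict from graph name (or `None` for the default graph) to a set of triples.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]
DEFAULT_GRAPH = None  # key used in this sketch for the single default graph


def build_dataset(
    representations: Dict[str, Iterable[Triple]],
    named_graphs: bool,
) -> Dict[Optional[str], Set[Triple]]:
    """Group triples per representation URI, or put them all in the default graph."""
    dataset: Dict[Optional[str], Set[Triple]] = {}
    for rep_uri, triples in representations.items():
        graph_name = rep_uri if named_graphs else DEFAULT_GRAPH
        dataset.setdefault(graph_name, set()).update(triples)
    return dataset


EX = "http://example.org/"  # hypothetical namespace
representations = {
    EX + "dmrs/0": [
        (EX + "dmrs/0", EX + "hasNode", EX + "dmrs/0/node/1"),
        (EX + "dmrs/0", EX + "hasLink", EX + "dmrs/0/link/1"),
    ],
    EX + "dmrs/1": [
        (EX + "dmrs/1", EX + "hasNode", EX + "dmrs/1/node/1"),
        (EX + "dmrs/1", EX + "hasLink", EX + "dmrs/1/link/1"),
    ],
}

with_graphs = build_dataset(representations, named_graphs=True)
single_graph = build_dataset(representations, named_graphs=False)
```

With the flag on, each semantic representation gets its own named graph; with it off, the same triples end up together in the default graph, so the choice does not change which triples exist.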