Parsing external owl entities (direct/indirect imports) + individual facts with external owl entites #668

Merged: 5 commits, Jan 9, 2024

Contributor

@vChavezB vChavezB commented Jan 2, 2024

This solves #667.

This PR allows tagging owl entities that are not part of the main ontology (i.e. direct/indirect imports) as an external property. Using the owlapi, an ExternalPropertyParser class is added which looks for the external property superscripts in the html generated by LODE and tries to find each owl entity based on its IRI.

The xslt extraction sheet was modified to add the external property superscript and the html class type-ep. In addition, named individual facts which contain IRIs from imported ontologies are now added.
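To make the first step concrete, here is a minimal stdlib-only sketch of how the tagged links could be collected from the generated html. It is not the code from this PR: the class name `ExternalTagScanner` is hypothetical, and the markup shape (a link immediately followed by `<sup class="type-ep">`) is an assumption based on the description above. A real implementation would more likely use an html parser than a regular expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Collects the IRIs of links that are followed by an "ep" superscript. */
final class ExternalTagScanner {

    // Assumed markup (hypothetical): <a href="IRI">label</a><sup class="type-ep" ...>ep</sup>
    private static final Pattern TAGGED_LINK = Pattern.compile(
            "<a\\s[^>]*href=\"([^\"]+)\"[^>]*>[^<]*</a>\\s*<sup\\s[^>]*class=\"type-ep\"[^>]*>");

    /** Returns the href of every link tagged as external, in document order. */
    static List<String> findExternalIris(String html) {
        List<String> iris = new ArrayList<>();
        Matcher m = TAGGED_LINK.matcher(html);
        while (m.find()) {
            iris.add(m.group(1));
        }
        return iris;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://xmlns.com/foaf/0.1/knows\">knows</a>"
                + "<sup class=\"type-ep\" title=\"external property\">ep</sup> "
                + "<a href=\"http://example.org/onto#worksWith\">works with</a>"
                + "<sup class=\"type-op\" title=\"object property\">op</sup>";
        // Only the first link carries the type-ep class.
        System.out.println(findExternalIris(html));
    }
}
```

Each IRI returned this way is a candidate for a lookup in the loaded ontology and its imports.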

Example

The assertions for the sample ontology I provided in #667 look as follows:

[Image: screenshot of the generated assertions]

In this case the only assertion that was not found is foaf:membershipclass, because I did not import foaf and I am directly using a definition that is not loaded in the ontology.

Named individuals with assertions not belonging to the URI
of the ontology were not parsed. This commit allows
to parse object/data/annotation properties which are
not in the URI of the ontology and marks them with
superscript "ep" (external property)
@dgarijo
Owner

dgarijo commented Jan 3, 2024

@vChavezB thanks for this contribution! The results look great.
I am wondering, is it possible to have this directly in the xslt transformation? It feels a little hackish to have some of it done in the xslt and some added directly afterwards.

@vChavezB
Contributor Author

vChavezB commented Jan 3, 2024

The xslt transformation sheets only work with the data provided from an xml serialized ontology, in this case the serialization provided here.

The transformation cannot know the type of an OWL object from its URI if the content is not provided in the xml. You would need a run-time environment to load the imports and find the missing URIs, which is what I propose with the ExternalParser. Perhaps you could do the same in the xslt language, but it would require more effort, as you would need to retrieve the imports, load them as xml rdf and then do the parsing.
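As a rough illustration of that run-time lookup, the sketch below keeps an index from IRI to entity kind and picks a superscript for a given IRI. Everything here is hypothetical: the class `ExternalTypeResolver`, the `Kind` enum and the short tags (`c`, `op`, `dp`, `ap`, `ni`) are assumptions for illustration. In the PR the index would be filled from the ontology loaded with the owlapi rather than by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Resolves the kind of an entity from its IRI using a run-time index. */
final class ExternalTypeResolver {

    /** Entity kinds with the superscript assumed for each (hypothetical tags). */
    enum Kind {
        CLASS("c"), OBJECT_PROPERTY("op"), DATA_PROPERTY("dp"),
        ANNOTATION_PROPERTY("ap"), NAMED_INDIVIDUAL("ni");

        final String tag;

        Kind(String tag) {
            this.tag = tag;
        }
    }

    private final Map<String, Kind> index = new LinkedHashMap<>();

    /** Records an entity found in the main ontology or one of its imports. */
    void register(String iri, Kind kind) {
        index.put(iri, kind);
    }

    Optional<Kind> resolve(String iri) {
        return Optional.ofNullable(index.get(iri));
    }

    /** Falls back to the generic "ep" tag when the IRI is unknown. */
    String superscriptFor(String iri) {
        return resolve(iri).map(k -> k.tag).orElse("ep");
    }

    public static void main(String[] args) {
        ExternalTypeResolver resolver = new ExternalTypeResolver();
        resolver.register("http://xmlns.com/foaf/0.1/knows", Kind.OBJECT_PROPERTY);
        System.out.println(resolver.superscriptFor("http://xmlns.com/foaf/0.1/knows"));
        System.out.println(resolver.superscriptFor("http://example.org/not-loaded"));
    }
}
```

The fallback matters for entities that are used but never loaded: they keep the generic tag instead of getting a guessed type.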

One thing that is missing is that the xslt transformation has a language file that provides the appropriate translation. This is obtained dynamically here with the xslt function getDescriptionLabel. What would need to be added to this PR is loading these same files (en.xml, de.xml, etc.) to reproduce this functionality. At the moment I just add the title to the superscript in English, such as here.
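A possible shape for that label lookup is sketched below, assuming a flat language file in which each child element of the root holds one label. That file layout is an assumption and may not match the real en.xml or de.xml; the class `LabelCatalog` and the element names used in the example are hypothetical as well. The sketch falls back to English, and then to the key itself, when a translation is missing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Holds labels per language, read from a flat XML file (assumed layout). */
final class LabelCatalog {

    private final Map<String, Map<String, String>> byLanguage = new HashMap<>();

    /** Parses XML such as <labels><class>class</class></labels>. */
    void load(String language, String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> labels = new HashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child instanceof Element) {
                    labels.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            byLanguage.put(language, labels);
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot read labels for " + language, e);
        }
    }

    /** Looks up a label, falling back to English and then to the key itself. */
    String label(String language, String key) {
        String value = byLanguage.getOrDefault(language, Collections.emptyMap()).get(key);
        if (value == null) {
            value = byLanguage.getOrDefault("en", Collections.emptyMap()).get(key);
        }
        return value != null ? value : key;
    }

    public static void main(String[] args) {
        LabelCatalog catalog = new LabelCatalog();
        catalog.load("en", "<labels><class>class</class></labels>");
        catalog.load("de", "<labels><class>Klasse</class></labels>");
        System.out.println(catalog.label("de", "class"));
    }
}
```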

@dgarijo
Owner

dgarijo commented Jan 3, 2024

I see, thanks! However, if I recall correctly, there is an option to document not only the ontology, but the ontology + imports. If that's used, then that would address the type, no?

Or is it that the target ontology is not even imported, just reused, and therefore you don't even know?

@vChavezB
Contributor Author

vChavezB commented Jan 3, 2024

I did a quick test with the option you mentioned (-includeImportedOntologies) and this solves the issue, as the ontology is imported in the xml rdf serialization. However, the documentation becomes unreadable with the imported definitions. For the example ontology I uploaded in issue #667, I get all the assertions from foaf.

[Image: screenshot of the documentation including the foaf assertions]

However, the use case I have in mind is when you don't want to import the ontology definitions into your documentation.

For example, I am importing an ontology of units (qudt). This has around thirty thousand assertions for a vocabulary of units, which would pollute my documentation. In this case I would not want to document them, as the qudt organization already provides documentation at qudt.org.

Perhaps I could make this functionality available only when the user does not provide the option -includeImportedOntologies.

@vChavezB vChavezB force-pushed the individual_external_props branch 6 times, most recently from 905ed37 to 4bfc2c6 Compare January 4, 2024 10:46
@vChavezB vChavezB marked this pull request as ready for review January 4, 2024 10:47
@dgarijo
Owner

dgarijo commented Jan 4, 2024

@vChavezB one question: are you importing the external vocabularies in your ontology? If you are not, then I agree with an external parser that brings in the information from the ontologies, but that would require downloading them, etc.

If the ontologies are imported but you don't want to pollute the doc, maybe we can have 2 loaded models (one for the ontology, one for the ontology + imported) and do the xslt transformation on the simple one and the xslt for the individuals on the complete one. Then, mix the individual section of one with the simplified documentation of the other.

I am brainstorming here, it's just that doing things outside the xslt still looks to me like a hard-to-maintain solution. And external properties would only be added for individuals, making the rest of the documentation inconsistent (e.g., if you are extending existing external classes or properties).

@vChavezB
Contributor Author

vChavezB commented Jan 4, 2024

Just an update on the current PR.

  • Minor modifications to ignore entities that belong to the namespaces rdfs and xs, which are found frequently when assigning a literal constant to a data property.
  • Use language resources from LODE to add the correct translation.
  • I have updated the code to do the parsing for all entities, as you mentioned: all entities should be parsed and not only individuals.
  • Updated commit messages and PR to reflect the latest changes.
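The namespace filter from the first item can be sketched as follows. This is an illustration rather than the PR's code: the class `BuiltinNamespaceFilter` is hypothetical, and it assumes that the `xs` prefix is bound to the XML Schema namespace and `rdfs` to the RDF Schema namespace.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Drops IRIs from built-in namespaces before any entity lookup is attempted. */
final class BuiltinNamespaceFilter {

    // Assumption: "rdfs" and "xs" are bound to these standard namespace IRIs.
    private static final List<String> IGNORED = Arrays.asList(
            "http://www.w3.org/2000/01/rdf-schema#",
            "http://www.w3.org/2001/XMLSchema#");

    static boolean isIgnored(String iri) {
        return IGNORED.stream().anyMatch(iri::startsWith);
    }

    /** Keeps only the IRIs that are worth looking up in the loaded ontologies. */
    static List<String> keepLookupCandidates(List<String> iris) {
        return iris.stream()
                .filter(iri -> !isIgnored(iri))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> iris = Arrays.asList(
                "http://www.w3.org/2001/XMLSchema#string",
                "http://www.w3.org/2000/01/rdf-schema#Literal",
                "http://xmlns.com/foaf/0.1/name");
        System.out.println(keepLookupCandidates(iris));
    }
}
```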

are you importing the external vocabularies in your ontology?

Yes I am importing the external vocabularies (i.e., owl:imports).

maybe we can have 2 loaded models

That could also work. Just a minor detail I have found.

  • The second model should import the axioms not only from the direct imports but also from the indirect imports. For example: the main ontology imports A, and A imports B. Then the axioms from B should also be added to the main ontology.

I noticed this while using an ontology with a vocabulary of units (qudt), which imports the main schema with object properties such as has unit, has quantity kind, etc. As these are not in the imported ontology itself but rather in an indirect import, the xslt transformation will be missing this information.

So for this alternative, all the imports should be added recursively and the information then taken from this second serialized model, either with another xslt transformation sheet (?) or with a java implementation that does this.
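The recursive step can be pictured as a reachability computation over the import graph. The sketch below is stdlib-only and purely illustrative: the class `ImportsClosure` and the map-based graph are hypothetical stand-ins for ontologies that would really be loaded and resolved by a library such as the owlapi. The visited set also guards against import cycles.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Computes every ontology reachable through direct or indirect imports. */
final class ImportsClosure {

    /**
     * @param root          identifier of the main ontology
     * @param directImports direct imports declared by each ontology
     * @return all imported ontologies, excluding the root itself
     */
    static Set<String> of(String root, Map<String, List<String>> directImports) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            List<String> imports = directImports.getOrDefault(current, Collections.emptyList());
            for (String imported : imports) {
                // Skip the root and anything already visited, so cycles terminate.
                if (!imported.equals(root) && seen.add(imported)) {
                    pending.push(imported);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("main", Arrays.asList("A"));
        graph.put("A", Arrays.asList("B"));
        // B is reached only through A, i.e. it is an indirect import of main.
        System.out.println(of("main", graph));
    }
}
```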

From my point of view, I find it more practical to work with the owlapi, as there is no need to create a second ontology, serialize it and then extract the metadata from the xml. With the owlapi I can just look up the owl entities and find their metadata, and simply add an external property tag to know which owl entities have to be looked up.

I am not against the other alternative you suggest but I will probably not have time to develop a second solution.

@vChavezB vChavezB changed the title Individual external props Parsing external owl entities (direct/indirect imports) + individual facts with external owl entites Jan 4, 2024
The xslt parser cannot extract metadata from imported ontologies when
`-includeImportedOntologies` is not enabled. This is important because
sometimes you do not want to document the imported ontologies but you
still want to include metadata from the imported ontologies.

This change allows tagging owl entities that are not part of the
main ontology (i.e. direct/indirect imports) as an external property.
Using the owlapi, an ExternalPropertyParser class is added which looks for the
external property superscripts in the html generated by LODE and tries to find
the owl entity based on the IRI.

The xslt extraction sheet was modified to add the `external property` superscript
and html class `type-ep`.
@dgarijo
Owner

dgarijo commented Jan 5, 2024

@vChavezB I understand. Let me review the PR and approve when I have a bit of time. Thanks again for your contributions!

@vChavezB
Contributor Author

vChavezB commented Jan 5, 2024

@dgarijo ok, let me see if I can create a test case so it's easier to check automatically in the future.

@vChavezB
Contributor Author

vChavezB commented Jan 5, 2024

I have added a test case which parses the generated html and asserts that the superscripts for the owl entities are correct.
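For readers who want a feel for such a check, here is a minimal stdlib-only sketch, not the test added in this PR. The class `SuperscriptCheck` is hypothetical, and the markup shape (a link followed by `<sup class="type-XX">`) is an assumption about the generated html. It maps each linked IRI to its superscript class and compares the result with an expected table.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Compares the superscript class of each linked IRI with an expected value. */
final class SuperscriptCheck {

    // Assumed markup (hypothetical): <a href="IRI">label</a><sup class="type-op" ...>
    private static final Pattern LINK_WITH_TAG = Pattern.compile(
            "<a\\s[^>]*href=\"([^\"]+)\"[^>]*>[^<]*</a>\\s*<sup\\s[^>]*class=\"type-([a-z]+)\"[^>]*>");

    /** Maps each linked IRI to the suffix of its superscript class, e.g. "op". */
    static Map<String, String> tagsByIri(String html) {
        Map<String, String> tags = new LinkedHashMap<>();
        Matcher m = LINK_WITH_TAG.matcher(html);
        while (m.find()) {
            tags.put(m.group(1), m.group(2));
        }
        return tags;
    }

    /** Throws AssertionError on the first IRI whose tag differs from the expectation. */
    static void assertTags(String html, Map<String, String> expected) {
        Map<String, String> actual = tagsByIri(html);
        for (Map.Entry<String, String> entry : expected.entrySet()) {
            String found = actual.get(entry.getKey());
            if (!entry.getValue().equals(found)) {
                throw new AssertionError(entry.getKey() + ": expected "
                        + entry.getValue() + " but found " + found);
            }
        }
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://xmlns.com/foaf/0.1/knows\">knows</a>"
                + "<sup class=\"type-op\" title=\"object property\">op</sup>";
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("http://xmlns.com/foaf/0.1/knows", "op");
        assertTags(html, expected);
        System.out.println("all superscripts match");
    }
}
```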

@dgarijo dgarijo merged commit 5b9b1de into dgarijo:develop Jan 9, 2024
4 checks passed