Representing type hierarchies #68

Closed
wetneb opened this issue Apr 13, 2021 · 5 comments · Fixed by #73

Comments

@wetneb
Member

wetneb commented Apr 13, 2021

In some reconciliation services (such as Wikidata), the types are organized in a hierarchy: some types are included in others. For now, the specifications do not say anything about this. In today's call, it was mentioned that the specifications could also say something about that. What would be the impact on the API? What workflows would it enable?

@fsteeg
Member

fsteeg commented Jul 16, 2021

After our last meeting, this is what I understand about the use case here. We can currently choose a suggested default type in the reconciliation dialog (1) or search and get suggestions for a specific type (2):

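[Image: recon-types, the reconciliation dialog showing (1) the suggested default types and (2) the type search field]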

So the idea here would be to additionally provide a way to visually browse a type hierarchy when selecting a type to reconcile against. This would be very nice to have for the GND class/type hierarchy too. It would provide a way to discover which types are available in the dataset, whereas today we can only discover default types that are suggested (1) or types that we already know of and search for specifically (2).

On the protocol level, this would require expressing the hierarchy somehow when describing types. This could be achieved by adding fields like broader and/or narrower to each type (like in SKOS). These fields could be ignored by current clients/functionality (like (1), using the defaultTypes in the service manifest and (2), using type suggest responses). New clients/functionality could use broader and/or narrower to build a visual representation of the hierarchy. To get the full hierarchy, they could use a type suggest query with an empty prefix param (or omit the prefix param).
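
To make the idea concrete, here is a minimal sketch of how a new client might turn such a response into an indented tree. The response shape, the type ids, and the `render_hierarchy` helper are all assumptions for illustration; `broader` and `narrower` are the fields proposed above, not part of the current specification.

```python
# Hypothetical type suggest response. The `broader` and `narrower` fields are
# the proposal under discussion; the type ids and names are made up.
suggest_response = {
    "result": [
        {"id": "Work", "name": "Work", "narrower": ["Book", "Map"]},
        {"id": "Book", "name": "Book", "broader": ["Work"]},
        {"id": "Map", "name": "Map", "broader": ["Work"]},
    ]
}


def render_hierarchy(types):
    """Return indented lines, one per type, starting from the root types."""
    by_id = {t["id"]: t for t in types}
    # A root is any type that declares no broader type.
    roots = [t for t in types if not t.get("broader")]
    lines = []

    def walk(type_id, depth, seen):
        if type_id in seen or type_id not in by_id:
            return  # skip cycles and ids missing from the response
        lines.append("  " * depth + by_id[type_id]["name"])
        for child_id in by_id[type_id].get("narrower", []):
            walk(child_id, depth + 1, seen | {type_id})

    for root in roots:
        walk(root["id"], 0, frozenset())
    return lines


print("\n".join(render_hierarchy(suggest_response["result"])))
```

A client that does not know about the new fields would simply read `id` and `name` and ignore the rest, which is why the proposal should not break existing functionality.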

@wetneb
Member Author

wetneb commented Jul 20, 2021

My initial reaction to this proposal was that, to make this work in Wikidata's case, we would need to add paging mechanisms, because Wikidata's type hierarchy is huge:

  • the number of items that can be used as types is too large to be sent in one HTTP response
  • the number of children (subclasses) of a given type is too large to be stored in full in one JSON object

So I was thinking that it would be better to have a dedicated endpoint for this feature. But in fact, Wikidata's full type hierarchy is so big and messy that we probably do not want to show it to the user anyway. So, if I had to implement this in the Wikibase reconciliation service, I could simply let the service provider define a hierarchy of important, sensible types that should be advertised to the user by default, and we would use that. It would essentially be an improvement over the current "defaultType" that we offer.

So, all in all, @fsteeg's proposal above looks good to me. In today's call we discussed that it could be better to simply supply each type's broader classes (parents in the hierarchy) and omit the narrower ones, to save some bytes and avoid redundancy.
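
If only `broader` is supplied, a client can still recover each type's children by inverting the parent links. A minimal sketch, assuming `broader` is a list of parent type ids (the exact shape was still under discussion at this point); the type ids and the `children_from_broader` helper are made up for illustration:

```python
from collections import defaultdict


def children_from_broader(types):
    """Map each parent type id to the ids of the types that list it as broader."""
    children = defaultdict(list)
    for t in types:
        for parent_id in t.get("broader", []):
            children[parent_id].append(t["id"])
    return dict(children)


# Hypothetical types advertised by a service; only parents are declared.
advertised_types = [
    {"id": "Work", "name": "Work"},
    {"id": "Book", "name": "Book", "broader": ["Work"]},
    {"id": "Map", "name": "Map", "broader": ["Work"]},
]

print(children_from_broader(advertised_types))
```

This only works for the types the client has actually received, so it fits the curated, reasonably small hierarchy described above better than a full Wikidata-sized one.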

@thadguidry
Contributor

I like @wetneb's idea of simply sidestepping Wikidata's currently hard-to-visualize type hierarchy (it's hard for good reason: not everything in the world fits in neat buckets, as Denny always says). But for reconciliation service providers where the type hierarchy is known or not as messy, a good first step might be to let them define a hierarchy of sensible types.

Hmm, but then practically, what does this look like for reconciliation service providers to provide?
Where and how would that hierarchy of sensible types be defined in Wikibase? A JSON file or separate config file referenced in the manifest sounds too hardcoded. Or a URL in the manifest pointing to a JSON type hierarchy spec? Is that yet another small spec we would need to ratify, or can we borrow one, maybe Hydra's JSON-LD or something else? I do like the idea of Hydra's Collections, where a set of types can be managed within Collections and large collections can be paginated through PartialCollectionView. It also seems very useful for attaching a Collection to supportedProperty: https://www.hydra-cg.com/spec/latest/core/#supported-property-data-source

@wetneb
Member Author

wetneb commented Jul 21, 2021

In the case of Wikibase, yes the service provider would define the hierarchy in the configuration file of the service.

@wetneb
Member Author

wetneb commented Sep 22, 2021

Ah sorry, yet another detail comes to mind. In the JSON serialization of this new broader field, do we want to only give the type ids, or also the corresponding names?
See for instance the case of reconciliation candidates, where the service can return the names of the corresponding types as well.
We could:

  • keep this PR as it is (services can only return ids)
  • require services to return objects instead of strings, with fields id and name
  • allow both syntaxes (as in the case of reconciliation candidates if my memory serves me well?)
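
For the third option, a client would have to accept both forms. A minimal sketch of how that normalization might look, assuming each `broader` entry is either an id string or an object with `id` and an optional `name`; the `normalize_broader` helper and the sample ids are hypothetical:

```python
def normalize_broader(broader):
    """Convert a mixed list of id strings and type objects into type objects."""
    normalized = []
    for entry in broader:
        if isinstance(entry, str):
            # Bare id: no name is available, so only the id is kept.
            normalized.append({"id": entry})
        else:
            item = {"id": entry["id"]}
            if "name" in entry:
                item["name"] = entry["name"]
            normalized.append(item)
    return normalized


print(normalize_broader(["Work", {"id": "Document", "name": "Document"}]))
```

Requiring objects everywhere would make this branch unnecessary, at the cost of slightly larger responses.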

@fsteeg fsteeg linked a pull request Nov 16, 2021 that will close this issue
wetneb pushed a commit that referenced this issue Dec 14, 2021
* Add optional `broader` field for types (#68)

* Switch `broader` field from string to array (#68)

* Switch `broader` from array of strings to array of types (#68)
wetneb pushed a commit to reconciliation-api/testbench that referenced this issue Dec 14, 2021
* Display single-level `broader` types as breadcrumb

With default and suggested type IDs

See:

reconciliation-api/specs#68
reconciliation-api/specs#73

* Handle `broader` as array

See reconciliation-api/specs#73

* Handle `broader` array elements as types

See reconciliation-api/specs#73

3 participants