Associated record / Store all relations in index. #4912

fxprunayre · 2020-08-07T08:46:42Z

Store all associated records using a structure like:

"recordLink" : [
    {
      "to" : "792361bb-4cfa-409f-9762-ab42e5a05b39",
      "origin" : "catalog",
      "created" : "bySearch",
      "title" : "Concentration en habitants dans un rayon de 500m en Wallonie - Service de visualisation REST",
      "url" : "http://localhost:8080/geonetwork/srv/api/records/792361bb-4cfa-409f-9762-ab42e5a05b39",
      "type" : "services"
    }, ...

in the document index. With this information, the list of related records can be displayed directly in search results or in the record view.
The main drawback is that user privileges are not taken into account: the title of a record that the current user is not allowed to see may still be displayed.
The main advantage of this approach is that it is much faster than the related API.
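
To illustrate how a client could display related records directly from the index, here is a minimal Python sketch. It assumes an index document that carries the `recordLink` array shown above; the function name `group_record_links` and the shape of the returned dictionary are hypothetical and not part of GeoNetwork.

```python
from collections import defaultdict


def group_record_links(index_doc):
    """Group the recordLink entries of one index document by their "type"."""
    grouped = defaultdict(list)
    for link in index_doc.get("recordLink", []):
        grouped[link.get("type", "unknown")].append(
            {
                "uuid": link["to"],
                "title": link.get("title", ""),
                "url": link.get("url", ""),
            }
        )
    return dict(grouped)


# Sample document reusing the entry from the description above.
doc = {
    "recordLink": [
        {
            "to": "792361bb-4cfa-409f-9762-ab42e5a05b39",
            "origin": "catalog",
            "created": "bySearch",
            "title": "Concentration en habitants dans un rayon de 500m en Wallonie - Service de visualisation REST",
            "url": "http://localhost:8080/geonetwork/srv/api/records/792361bb-4cfa-409f-9762-ab42e5a05b39",
            "type": "services",
        }
    ]
}

for link_type, links in group_record_links(doc).items():
    for link in links:
        print(f"{link_type}: {link['title']} -> {link['url']}")
```

No extra request is needed per related record, which is where the speed gain comes from. The sketch also makes the drawback visible: nothing in it checks whether the current user may see the linked record.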

A couple of issues have been identified. For now, this mode is disabled by default and marked as experimental.

Because we use the index to query for relations (relations not stored in children, bidirectional siblings, dataset operatedBy), indexing must be done in two steps:

* First index all records
* Then index relations
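
The two steps above can be sketched as follows. This is a simplified, in-memory Python illustration, not the GeoNetwork implementation: the index is a plain dictionary, only a parent/child relation is handled, and the names `index_records`, `index_relations` and `parentUuid`, as well as the `type` values, are assumptions made for the example.

```python
def index_records(records):
    """Step 1: index every record, without any relation yet."""
    return {
        record["uuid"]: {
            "uuid": record["uuid"],
            "title": record["title"],
            "parentUuid": record.get("parentUuid"),
            "recordLink": [],
        }
        for record in records
    }


def index_relations(index):
    """Step 2: derive relations by looking records up in the populated index."""
    for doc in index.values():
        parent_uuid = doc["parentUuid"]
        if parent_uuid in index:
            parent = index[parent_uuid]
            # Link stored in the child, pointing to its parent.
            doc["recordLink"].append(
                {"to": parent_uuid, "title": parent["title"], "type": "parent"}
            )
            # Reverse link, which the parent record itself does not store.
            parent["recordLink"].append(
                {"to": doc["uuid"], "title": doc["title"], "type": "children"}
            )
    return index


records = [
    {"uuid": "p", "title": "Parent"},
    {"uuid": "c", "title": "Child", "parentUuid": "p"},
]
index = index_relations(index_records(records))
print(index["p"]["recordLink"])
```

In this sample the parent is indexed before the child, so a single pass could not fill in the parent's link to the child at the time the parent is indexed. That is the reason for indexing all records first.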

Currently, it is hard to do only a partial indexing, although in this case we only need to collect recordLink and update the document in the index.
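
As a rough idea of what such a partial update could look like, the sketch below builds the request for the Elasticsearch Update API (`POST /<index>/_update/<id>` with a `doc` body), which merges the given fields into an existing document. The index name `gn-records`, the use of the record UUID as document id, and the helper name `build_record_link_update` are assumptions for illustration only.

```python
import json


def build_record_link_update(index_name, uuid, record_links):
    """Return (path, body) for a partial update touching only recordLink."""
    path = f"/{index_name}/_update/{uuid}"
    # Only the recordLink field is sent; other fields of the document are left alone.
    body = {"doc": {"recordLink": record_links}}
    return path, json.dumps(body)


path, body = build_record_link_update(
    "gn-records",  # hypothetical index name
    "792361bb-4cfa-409f-9762-ab42e5a05b39",
    [{"to": "another-uuid", "type": "services"}],  # placeholder link
)
print("POST", path)
print(body)
```

The request is only built here, not sent, so the sketch does not depend on a particular client library.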

While editing a record, all records related to it before and after the editing session need to be updated.
We should collect all UUIDs affected by the current session and index them following the rule above.
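
Collecting the affected UUIDs could look like the following Python sketch. It assumes the `recordLink` entries of the edited record are available both before and after the session; the function name `affected_uuids` is hypothetical.

```python
def affected_uuids(edited_uuid, links_before, links_after):
    """UUIDs to re-index after an editing session.

    Takes the union of the targets linked before and after editing, plus the
    edited record itself, so records that lost a relation are refreshed too.
    """
    before = {link["to"] for link in links_before}
    after = {link["to"] for link in links_after}
    return {edited_uuid} | before | after


to_reindex = affected_uuids(
    "edited-record",
    links_before=[{"to": "service-a"}, {"to": "dataset-b"}],
    links_after=[{"to": "dataset-b"}, {"to": "dataset-c"}],
)
print(sorted(to_reindex))
```

Here `service-a` is no longer related after the session, but it is still in the result: its own index document may contain a stale link back to the edited record.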


* Update to Elasticsearch 8. Use of Elasticsearch Java API Client instead of Java High Level REST Client
* Update to Elasticsearch 8 / WFS indexing draft. (#88)
* Update to Elasticsearch 8 / remove TODOs
* Update Elasticsearch client to version 8.11.3
* Elasticsearch / Update maven plugin.
* Associated record / Store all relations in index / Remove experimental feature, not used. Related to #4912
* Elasticsearch / Update maven plugin configuration. Avoid error like ERROR: Elasticsearch exited unexpectedly, with exit code 143
* Elasticsearch / Update MetadataUtils.getAssociated to retrieve scripted overview field
* Elasticsearch / Update MetadataUtils.getAssociated remove TODO comment
* Elasticsearch / Fix and refactor index readonly health check
* Elasticsearch / Log query error details
* Elasticsearch / Sonarlint improvements
* Elasticsearch / WrapperQuery use base64 encoded JSON string query.
* Elasticsearch / Remove unused commented code from EsSearchManager
* Elasticsearch / More strict Xlink query based on UUID and fix check on hits. A request may return no hits but can be used to check number of hits. In such case we should avoid using hits.hits.size and use hits.total.value to get number of match.
* Elasticsearch / Health check / Fix number of hits info.
* Elasticsearch / Cleaning / No need to retrieve hits to only get matches.
* Elasticsearch / Deprecated field [include] used, expected [includes] instead.
* Elasticsearch / Remove 'Clear XLink cache' from Administration > Tools, clear the Xlink cache automatically before indexing and remove non-implemented code to retrieve metadata with XLink (not required anymore)
* Kibana / Update install instruction Related to elastic/kibana#82521.
* Elasticsearch / Remove unused imports
* Kibana / Update default dashboards.
* Elasticsearch / Documentation / Update Elasticsearch version
* Elasticsearch / Fix logger module name typo

---------

Co-authored-by: François Prunayre

fxprunayre added this to the 4.0.0 milestone Aug 7, 2020

fxprunayre added a commit to metadata101/iso19115-3.2018 that referenced this pull request Aug 7, 2020

Associated record / Store all relations in index.

cf7c688

Related to geonetwork/core-geonetwork#4912

fxprunayre marked this pull request as ready for review August 10, 2020 11:03

Update en-admin.json

728ca8a

fxprunayre modified the milestones: 4.0.0, 4.0.0-alpha.2 Aug 10, 2020

fxprunayre merged commit cd0009e into geonetwork:4.0.x Aug 10, 2020

fxprunayre mentioned this pull request Aug 10, 2020

Associated record / Store all relations in index. metadata101/iso19115-3.2018#84

Merged

fxprunayre added a commit to metadata101/iso19115-3.2018 that referenced this pull request Aug 10, 2020

Associated record / Store all relations in index. (#84)

7d2b92c

Related to geonetwork/core-geonetwork#4912

josegar74 added a commit that referenced this pull request Jan 8, 2024

Associated record / Store all relations in index / Remove experimental feature, not used. Related to #4912

4004b10
