Associated record / Store all relations in index. #4912
Merged: fxprunayre merged 2 commits into geonetwork:4.0.x from fxprunayre:es-recordlink-at-indexingtime on Aug 10, 2020
Store all associated records in the document index, using a structure like:

```json
"recordLink" : [ {
    "to" : "792361bb-4cfa-409f-9762-ab42e5a05b39",
    "origin" : "catalog",
    "created" : "bySearch",
    "title" : "Concentration en habitants dans un rayon de 500m en Wallonie - Service de visualisation REST",
    "url" : "http://localhost:8080/geonetwork/srv/api/records/792361bb-4cfa-409f-9762-ab42e5a05b39",
    "type" : "services"
  }, ...
```

With this information, the list of related records can be displayed directly in search results or in the record view.

The main drawback is that user privileges are not taken into account: the title of a record that is not visible to the current user may be displayed.

The main advantage of this approach is that it is much faster than the related API.

A couple of issues have been identified. This mode is for now disabled by default and marked as experimental.

Because the index is used to query for relations (relations not stored in children, bidirectional siblings, dataset operatedBy), indexing must be done in 2 steps:

* First index all records
* Then index relations

Currently it is hard to do only a partial indexing. In this case, we only need to collect recordLink and update the doc in the index.

While editing a record, all records related to it before and after the editing session need to be updated. We should collect all UUIDs affected by the current session, and index them following the rule above.
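The two-step rule and the editing-session rule can be illustrated with a small sketch. It is not GeoNetwork code: the index is a plain in-memory dict, and the `parentUuid` field, the `parent` / `children` link types, and all function names are assumptions made for the example. Only the `recordLink` field names and the URL prefix come from the structure shown above.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical stand-in for the document index: UUID -> indexed document.
Index = Dict[str, dict]

# URL prefix copied from the recordLink example in the description.
BASE_URL = "http://localhost:8080/geonetwork/srv/api/records/"


def index_records(index: Index, records: Dict[str, dict], uuids: Iterable[str]) -> None:
    """Step 1: (re)index the records themselves, without any recordLink."""
    for uuid in uuids:
        rec = records[uuid]
        index[uuid] = {
            "uuid": uuid,
            "title": rec["title"],
            "parentUuid": rec.get("parentUuid"),  # assumed field name
        }


def make_link(target: dict, link_type: str) -> dict:
    """Build one recordLink entry; origin/created values mirror the example."""
    return {
        "to": target["uuid"],
        "origin": "catalog",
        "created": "bySearch",
        "title": target["title"],
        "url": BASE_URL + target["uuid"],
        "type": link_type,
    }


def collect_record_links(index: Index, uuid: str) -> List[dict]:
    """Derive the links of one record by searching the index.

    Children are not stored in the parent, so they can only be found once
    every record is already indexed. This is why relations need a second step.
    """
    links = []
    parent = index[uuid].get("parentUuid")
    if parent in index:
        links.append(make_link(index[parent], "parent"))
    for other in index.values():
        if other.get("parentUuid") == uuid:
            links.append(make_link(other, "children"))
    return links


def index_relations(index: Index, uuids: Iterable[str]) -> None:
    """Step 2: partial update, only the recordLink field is rewritten."""
    for uuid in uuids:
        index[uuid]["recordLink"] = collect_record_links(index, uuid)


def affected_uuids(uuid: str, linked_before: Iterable[str], linked_after: Iterable[str]) -> Set[str]:
    """The edited record plus everything it was linked to before or after the edit."""
    return {uuid} | set(linked_before) | set(linked_after)
```

For a full reindex, both steps run over every UUID. After an editing session that moves a child from one parent to another, `affected_uuids` yields the child and both parents, and the same two steps run over that set only.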
fxprunayre added a commit to metadata101/iso19115-3.2018 that referenced this pull request on Aug 7, 2020
MichelGabriel pushed a commit to MichelGabriel/core-geonetwork that referenced this pull request on Aug 10, 2020
* Associated record / Store all relations in index.
* Update en-admin.json
fxprunayre added a commit to metadata101/iso19115-3.2018 that referenced this pull request on Aug 10, 2020
josegar74 added a commit that referenced this pull request on Jan 8, 2024
…l feature, not used. Related to #4912
fxprunayre added a commit that referenced this pull request on Feb 9, 2024
* Update to Elasticsearch 8. Use of Elasticsearch Java API Client instead of Java High Level REST Client
* Update to Elasticsearch 8 / WFS indexing draft. (#88)
* Update to Elasticsearch 8 / remove TODOs
* Update Elasticsearch client to version 8.11.3
* Elasticsearch / Update maven plugin.
* Associated record / Store all relations in index / Remove experimental feature, not used. Related to #4912
* Elasticsearch / Update maven plugin configuration. Avoid error like ERROR: Elasticsearch exited unexpectedly, with exit code 143
* Elasticsearch / Update MetadataUtils.getAssociated to retrieve scripted overview field
* Elasticsearch / Update MetadataUtils.getAssociated remove TODO comment
* Elasticsearch / Fix and refactor index readonly health check
* Elasticsearch / Log query error details
* Elasticsearch / Sonarlint improvements
* Elasticsearch / WrapperQuery use base64 encoded JSON string query.
* Elasticsearch / Remove unused commented code from EsSearchManager
* Elasticsearch / More strict Xlink query based on UUID and fix check on hits. A request may return no hits but can be used to check number of hits. In such case we should avoid using hits.hits.size and use hits.total.value to get number of match.
* Elasticsearch / Health check / Fix number of hits info.
* Elasticsearch / Cleaning / No need to retrieve hits to only get matches.
* Elasticsearch / Deprecated field [include] used, expected [includes] instead.
* Elasticsearch / Remove 'Clear XLink cache' from Administration > Tools, clear the Xlink cache automatically before indexing and remove non-implemented code to retrieve metadata with XLink (not required anymore)
* Kibana / Update install instruction Related to elastic/kibana#82521.
* Kibana / Update default dashboards.
* Elasticsearch / Documentation / Update Elasticsearch version
* Elasticsearch / Fix logger module name typo

---------

Co-authored-by: François Prunayre <[email protected]>
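Two items in this commit message lend themselves to a short illustration: counting matches from `hits.total.value` rather than from the size of `hits.hits`, and passing a query to the `wrapper` query as a base64-encoded JSON string. The sketch below works on plain dicts shaped like Elasticsearch search requests and responses. The function names and the `uuid` field in the usage are hypothetical, and this is not the code from the commit.

```python
import base64
import json


def total_hits(response: dict) -> int:
    """Number of matching documents, read from hits.total.

    A search sent with size=0 returns an empty hits.hits array even when
    documents match, so len(hits.hits) cannot be used as a match count.
    """
    total = response["hits"]["total"]
    # Elasticsearch 7+ returns {"value": n, "relation": "eq" | "gte"};
    # older versions returned a bare integer. With relation "gte" the
    # value is only a lower bound.
    return total["value"] if isinstance(total, dict) else int(total)


def wrapper_query(query: dict) -> dict:
    """Wrap a query as the base64-encoded JSON string the wrapper query expects."""
    encoded = base64.b64encode(json.dumps(query).encode("utf-8")).decode("ascii")
    return {"wrapper": {"query": encoded}}
```

For a response such as `{"hits": {"total": {"value": 3, "relation": "eq"}, "hits": []}}`, `total_hits` reports 3 matches while the `hits.hits` array is empty, which is the situation the commit message describes.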