Elasticsearch / Update to 8.14.3. #8337

Open
wants to merge 1 commit into
base: main

fxprunayre
Member

Fix for #8305

Checklist

  • I have read the contribution guidelines
  • Pull request provided for main branch, backports managed with label
  • Good housekeeping of code, cleaning up comments, tests, and documentation
  • Clean commit history broken into understandable chunks, avoiding big commits with hundreds of files, cautious of reformatting and whitespace changes
  • Clean commit messages, longer verbose messages are encouraged
  • API Changes are identified in commit messages
  • Testing provided for features or enhancements using automatic tests
  • User documentation provided for new features or enhancements in manual
  • Build documentation provided for development instructions in README.md files
  • Library management using pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation

Funded by Ifremer

@fxprunayre fxprunayre added this to the 4.4.6 milestone Sep 2, 2024

sonarcloud bot commented Sep 2, 2024

Member

@josegar74 josegar74 left a comment

I have done the following test:

  1. Start Elasticsearch 8.4.2
  2. Start GeoNetwork
  3. Admin console > Tools > Delete index and reindex
  4. Load ISO19139 samples
  5. Go to the search page: Query returned an error. Check the console for details.

The response error:

{
    "servlet": "spring",
    "message": "Error is: Bad Request.\nRequest:\n{"from":0,"size":30,"sort":["_score"],"query":{"function_score":{"boost":"5","functions":[{"filter":{"match":{"resourceType":"series"}},"weight":1.5},{"filter":{"exists":{"field":"parentUuid"}},"weight":0.3},{"filter":{"match":{"cl_status.key":"obsolete"}},"weight":0.2},{"filter":{"match":{"cl_status.key":"superseded"}},"weight":0.3},{"gauss":{"changeDate":{"scale":"365d","offset":"90d","decay":0.5}}}],"score_mode":"multiply","query":{"bool":{"must":[{"terms":{"isTemplate":["n"]}}],"filter":{"query_string":{"query":"*:* AND (draft:n OR draft:e)"}}}}}},"aggregations":{"resourceType":{"terms":{"field":"resourceType"},"meta":{"decorator":{"type":"icon","prefix":"fa fa-fw gn-icon-"},"field":"resourceType"}},"cl_spatialRepresentationType.key":{"terms":{"field":"cl_spatialRepresentationType.key","size":10},"meta":{"field":"cl_spatialRepresentationType.key"}},"format":{"terms":{"field":"format"},"meta":{"collapsed":true,"field":"format"}},"availableInServices":{"filters":{"filters":{"availableInViewService":{"query_string":{"query":"+linkProtocol:/OGC:WMS.*/"}},"availableInDownloadService":{"query_string":{"query":"+linkProtocol:/OGC:WFS.*/"}}}},"meta":{"decorator":{"type":"icon","prefix":"fa fa-fw 
","map":{"availableInViewService":"fa-globe","availableInDownloadService":"fa-download"}}}},"th_gemet_tree.key":{"terms":{"field":"th_gemet_tree.key","size":100,"order":{"_key":"asc"},"include":"[^^]+^?[^^]+"},"meta":{"field":"th_gemet_tree.key"}},"th_httpinspireeceuropaeumetadatacodelistPriorityDataset-PriorityDataset_tree.default":{"terms":{"field":"th_httpinspireeceuropaeumetadatacodelistPriorityDataset-PriorityDataset_tree.default","size":100,"order":{"_key":"asc"}},"meta":{"field":"th_httpinspireeceuropaeumetadatacodelistPriorityDataset-PriorityDataset_tree.default"}},"th_httpinspireeceuropaeutheme-theme_tree.key":{"terms":{"field":"th_httpinspireeceuropaeutheme-theme_tree.key","size":34},"meta":{"decorator":{"type":"icon","prefix":"fa fa-fw gn-icon iti-","expression":"http://inspire.ec.europa.eu/theme/(.*)"},"field":"th_httpinspireeceuropaeutheme-theme_tree.key"}},"tag":{"terms":{"field":"tag.langeng","include":".*","size":10},"meta":{"caseInsensitiveInclude":true,"field":"tag.langeng"}},"th_regions_tree.default":{"terms":{"field":"th_regions_tree.default","size":100,"order":{"_key":"asc"}},"meta":{"field":"th_regions_tree.default"}},"resolutionScaleDenominator":{"histogram":{"field":"resolutionScaleDenominator","interval":10000,"keyed":true,"min_doc_count":1},"meta":{"collapsed":true}},"creationYearForResource":{"histogram":{"field":"creationYearForResource","interval":5,"keyed":true,"min_doc_count":1},"meta":{"collapsed":true}},"OrgForResource":{"terms":{"field":"OrgForResourceObject.langeng","include":".*","size":20},"meta":{"caseInsensitiveInclude":true,"field":"OrgForResourceObject.langeng"}},"cl_maintenanceAndUpdateFrequency.key":{"terms":{"field":"cl_maintenanceAndUpdateFrequency.key","size":10},"meta":{"collapsed":true,"field":"cl_maintenanceAndUpdateFrequency.key"}}},"_source":{"includes":["uuid","id","groupOwner","logo","cat","inspireThemeUri","inspireTheme_syn","cl_topic","resourceType","resourceTitle*","resourceAbstract*","draft","draftId","owner",
"link","status*","rating","geom","contact*","Org*","isTemplate","valid","isHarvested","dateStamp","documentStandard","standardNameObject.default","cl_status*","mdStatus*","op*","documentStandard","groupOwner","owner","id"]},"script_fields":{"overview":{"script":{"source":"return params['_source'].overview == null ? [] : params['_source'].overview.stream().findFirst().orElse([]);"}}},"track_total_hits":true}\n.\nError:\n{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cl_spatialRepresentationType.key] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"gn-records","node":"I1k01ZfGSYuYfRxVmwOglQ","reason":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cl_spatialRepresentationType.key] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}],"caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cl_spatialRepresentationType.key] in order to load field data by uninverting the inverted index. 
Note that this can use significant memory.","caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [cl_spatialRepresentationType.key] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}},"status":400}.",
    "url": "/geonetwork/srv/api/search/records/_search",
    "status": "400"
}

It seems related to the facets: clicking the search button displays the results.

Tested the same steps with Elasticsearch 8.14 and it looks OK.

Elasticsearch 8.4.2 was tested with security enabled, but I don't think that is the problem.
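
The useful part of the response above is the nested root cause. As a minimal sketch, assuming the Elasticsearch error object (the JSON after "Error:" in the message) has already been separated from the surrounding text, the distinct reasons could be pulled out like this. The function name and the abbreviated sample are illustrative, not part of GeoNetwork:

```python
import json


def root_cause_reasons(error_body: str) -> list:
    """Return distinct "type: reason" strings from an Elasticsearch error body."""
    payload = json.loads(error_body)
    causes = payload.get("error", {}).get("root_cause", [])
    seen = []
    for cause in causes:
        summary = f"{cause.get('type')}: {cause.get('reason')}"
        if summary not in seen:  # the same reason is often repeated per shard
            seen.append(summary)
    return seen


# Abbreviated stand-in for the error reported above (the reason text is shortened).
sample = json.dumps({
    "error": {
        "root_cause": [{
            "type": "illegal_argument_exception",
            "reason": "set fielddata=true on [cl_spatialRepresentationType.key]",
        }],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
    },
    "status": 400,
})

for line in root_cause_reasons(sample):
    print(line)
```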

@fxprunayre
Member Author

set fielddata=true on [cl_spatialRepresentationType.key] in order to load

So it relates to the mapping.
When the catalogue is empty, checking http://localhost:9200/gn-records/_mapping returns:

 {
          "codelist": {
            "match": "[cl_*]",
            "mapping": {
              "properties": {
                "default": {
                  "type": "keyword"
                },
                "link": {
                  "type": "keyword"
                },
                "text": {
                  "type": "text"
                },
                "key": {
                  "type": "keyword"
                }
              },
              "type": "object"
            }
          }
        },

and when you index records, a field is created for each codelist, with key mapped as text (plus a keyword subfield) instead of keyword:

"cl_topic": {
          "properties": {
            "default": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "key": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "lang": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "langeng": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },

Not sure why the dynamic_templates entry is no longer matched. Checking it.
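
A quick way to spot the symptom described above is to scan the mapping for codelist fields whose key ended up as text. This is only a sketch: it assumes the mapping has already been fetched from the _mapping URL and parsed into a dict, and the function name is hypothetical.

```python
def text_mapped_codelist_keys(mapping: dict, index: str = "gn-records") -> list:
    """List cl_* fields whose "key" sub-field is not mapped as keyword."""
    properties = mapping[index]["mappings"].get("properties", {})
    offenders = []
    for name, definition in sorted(properties.items()):
        if not name.startswith("cl_"):
            continue
        key_type = definition.get("properties", {}).get("key", {}).get("type")
        # Only flag fields that have a key sub-field with the wrong type.
        if key_type is not None and key_type != "keyword":
            offenders.append(f"{name}.key")
    return offenders


# Hand-written sample in the shape returned by GET /gn-records/_mapping.
sample_mapping = {
    "gn-records": {
        "mappings": {
            "properties": {
                "cl_topic": {"properties": {"key": {"type": "text"}}},
                "cl_status": {"properties": {"key": {"type": "keyword"}}},
                "resourceType": {"type": "keyword"},
            }
        }
    }
}

print(text_mapped_codelist_keys(sample_mapping))
```

An empty list would suggest the codelist dynamic template was applied when the records were indexed.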

Contributor

@jahow jahow left a comment

@fxprunayre
Member Author

Not sure why the dynamic_templates is not matched anymore. Checking it

Probably related to elastic/elasticsearch-java#841

 "codelist": {
            "match": "[cl_*]",

instead of

          "codelist": {
            "match": "cl_*",

Not sure what the best way to solve this is.
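
To illustrate why the bracketed pattern would stop matching, here is a small sketch of wildcard matching in which only * is special. It assumes the server treats the stored match value as a plain wildcard string; simple_match is a hypothetical helper, not Elasticsearch code.

```python
import re


def simple_match(pattern: str, field_name: str) -> bool:
    """Match field_name against pattern, where only '*' acts as a wildcard."""
    # Escape every literal chunk so that '[' and ']' are compared verbatim.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, field_name) is not None


for pattern in ("cl_*", "[cl_*]"):
    print(pattern, simple_match(pattern, "cl_topic"))
```

Under that assumption, "[cl_*]" only matches names that literally start with "[cl_" and end with "]", so cl_topic falls through to the default dynamic mapping (text with a keyword subfield).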

@fxprunayre
Member Author

Not sure what is the best way to solve this?

Maybe in the long run, each GeoNetwork branch should stick to one Elasticsearch version branch, e.g. 4.4.x on 8.14.x (or, if we don't want the issue above, roll back to a version before 8.9? 8.11 was used and we did not notice that issue).

On my side, no setup requires a fixed (and "old") version of Elasticsearch, and usually the request is rather to update to the latest, so using the same version for the Java client and the server is also fine.
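
Since 8.4 and 8.14 are easy to mix up, a check that the client and server are on the same minor version should compare numeric parts rather than strings. A minimal sketch, with hypothetical helper names and the version numbers mentioned in this thread:

```python
def parse_version(version: str) -> tuple:
    """Turn "8.14.3" into (8, 14, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def same_minor(client: str, server: str) -> bool:
    """True when client and server share the same major.minor version."""
    return parse_version(client)[:2] == parse_version(server)[:2]


# Numeric comparison orders 8.4.2 before 8.9.0 before 8.14.3.
print(parse_version("8.4.2") < parse_version("8.9.0") < parse_version("8.14.3"))
# Plain string comparison would wrongly place 8.14.3 before 8.4.2.
print("8.14.3" < "8.4.2")
print(same_minor("8.14.3", "8.4.2"))
```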

@fxprunayre
Member Author

Users who would like to use 8.4 servers can always create the index manually with the mapping:

curl -X DELETE http://localhost:9200/gn-records
curl -X PUT http://localhost:9200/gn-records -H "Content-Type:application/json"  -d @web/src/main/webapp/WEB-INF/data/config/index/records.json
{"acknowledged":true,"shards_acknowledged":true,"index":"gn-records"}

and then use the "reindex record" option, to avoid having the 8.14 Java client send the index mapping to the server.
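
For scripted setups, the two curl calls above could be expressed with the Python standard library. This is a sketch only: the function name is hypothetical, the defaults mirror the URL and index name used above, and nothing is sent until the requests are passed to urlopen.

```python
import urllib.request


def recreate_index_requests(mapping_json: bytes,
                            base_url: str = "http://localhost:9200",
                            index: str = "gn-records") -> list:
    """Build the DELETE and PUT requests that recreate the index with a mapping."""
    url = f"{base_url}/{index}"
    delete = urllib.request.Request(url, method="DELETE")
    create = urllib.request.Request(
        url,
        data=mapping_json,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return [delete, create]


# Placeholder body; in practice, read the bytes of records.json instead.
requests_to_send = recreate_index_requests(b"{}")
for request in requests_to_send:
    print(request.get_method(), request.full_url)
    # To actually run it against a server: urllib.request.urlopen(request)
```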
