[BUG] Does saveToOpenSearch support index mappings? #490

Open
pjpastrana opened this issue Jul 11, 2024 · 1 comment
Labels
bug (Something isn't working), enhancement (New feature or request)

Comments


pjpastrana commented Jul 11, 2024

What is the bug?

I created an index with explicit settings and mappings for its data fields.
When I try to bulk insert my DataFrame by calling saveToOpenSearch, I get the following error:
Could not write all entries for bulk operation [1000/1000]. Error sample (first [5] error messages): org.opensearch.hadoop.rest.OpenSearchHadoopRemoteException: illegal_argument_exception: mapper [field.x] cannot be changed from type [keyword] to [text]...

This is most likely because saveToOpenSearch assumes a dynamic mapping instead of using the mapping already defined on the index.

I can't find any reference in the docs to saveToOpenSearch accepting the index mapping definitions as an argument. Is this a bug, or am I missing something?
I don't get this behavior when the index was created without any mappings, or when it does not exist and is created automatically by the saveToOpenSearch call.
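
To check which type the index actually declares for a dotted path such as field.x, the mapping body can be flattened into dotted paths and compared against the field named in the error. The following is a minimal sketch in plain Python. The helper name flatten_mapping and the example mapping are hypothetical and are not the mapping from this report; the sketch only assumes the usual mapping layout, where object sub-fields sit under "properties" and multi-fields sit under "fields".

```python
from typing import Any, Dict


def flatten_mapping(mapping: Dict[str, Any], prefix: str = "") -> Dict[str, str]:
    """Return {dotted_path: type} for every field declared in a mapping body."""
    flat: Dict[str, str] = {}
    for name, spec in mapping.get("properties", {}).items():
        path = f"{prefix}{name}"
        # A field with sub-properties and no explicit type is an object.
        flat[path] = spec.get("type", "object")
        # Multi-fields (for example a keyword sub-field) are addressed as path.sub.
        for sub_name, sub_spec in spec.get("fields", {}).items():
            flat[f"{path}.{sub_name}"] = sub_spec.get("type", "unknown")
        # Recurse into nested object properties.
        flat.update(flatten_mapping(spec, prefix=f"{path}."))
    return flat


# Hypothetical mapping, for illustration only; not the mapping from this issue.
example_mappings = {
    "properties": {
        "field": {
            "properties": {
                "x": {"type": "keyword"},
            }
        },
        "title": {
            "type": "text",
            "fields": {"raw": {"type": "keyword"}},
        },
    }
}

declared = flatten_mapping(example_mappings)
for path, field_type in sorted(declared.items()):
    print(f"{path}: {field_type}")
```

Any path that the incoming records contain but that is missing from this flattened view would be a candidate for dynamic mapping, assuming dynamic mapping is enabled on the index.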

How can one reproduce the bug?

Create the index

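from opensearchpy import OpenSearch

# index_settings, mappings, host, port, and index_name are defined elsewhere.
index_body = {
    "settings": {
        "index": index_settings
    },
    "mappings": mappings
}
client = OpenSearch(
    hosts=[{"host": host, "port": port}],
    use_ssl=True,
    verify_certs=True
)
# Create the index with the explicit settings and mappings above.
client.indices.create(
    index=index_name,
    body=index_body
)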

Push data to OpenSearch

import org.opensearch.spark.sql._  // brings saveToOpenSearch into scope for DataFrames

val df = spark.sql("select * from table")
df.saveToOpenSearch(index_name)

What is the expected behavior?

I expect the records to be added to the existing index, just as they are when saveToOpenSearch creates the index automatically.

What is your host/environment?

Operating system, version.
Scala 2.12
Spark 3.4.1
OpenSearch 1.3.2
opensearch-hadoop:1.1.0

pjpastrana added the bug and untriaged labels on Jul 11, 2024
dblock (Member) commented Jul 29, 2024

[Catch All Triage - 1, 2, 3, 4]

dblock added the enhancement label and removed the untriaged label on Jul 29, 2024