ingest: upsert with drop processor returns error #36746

Closed
jakelandis opened this issue Dec 17, 2018 · 4 comments · Fixed by #104585
Labels
>bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team

Comments

@jakelandis
Contributor

If an upsert targets an index with a default pipeline, and that default pipeline drops the document, an error is returned to the client.

This is a minor bug since the end result is the same: the document is not indexed. However, it can pollute error logs, cause confusion, and incur a minor performance penalty due to the exception and logging.

Both the normal upsert and the bulk upsert return an error, but with different signatures. To reproduce:

DELETE test
PUT test
{
  "settings": {
    "index.default_pipeline": "dropper"
  }
}
PUT _ingest/pipeline/dropper
{
  "processors": [
    {
      "drop": {}
    }
  ]
}

Non-bulk upsert:

POST test/doc/1/_update
{
  "script":{
    "source": "ctx._source.foo = 'bar'" 
  },
  "upsert" :{
    "foo" : "bar"
  }
}

Results in:

{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[node-0][127.0.0.1:9300][indices:data/write/update[s]]"
      }
    ],
    "type": "class_cast_exception",
    "reason": "class org.elasticsearch.action.update.UpdateResponse cannot be cast to class org.elasticsearch.action.index.IndexResponse (org.elasticsearch.action.update.UpdateResponse and org.elasticsearch.action.index.IndexResponse are in unnamed module of loader 'app')"
  },
  "status": 500
}

Bulk upsert results in:

    {
      "index" : {
        "_index" : null,
        "_type" : null,
        "_id" : null,
        "status" : 500,
        "error" : {
          "type" : "null_pointer_exception",
          "reason" : null
        }
      }
    }
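
Because the bulk failure is reported per item rather than as a top-level error, a client has to inspect each item to notice it. The following is a minimal Python sketch of that check, not part of Elasticsearch or its clients. The function name `failed_items` is hypothetical, and the surrounding `errors`/`items` wrapper is an assumption added for illustration, since only the failing item is shown above.

```python
import json

# The failing item from the report above. The "errors"/"items" wrapper is
# an assumption added for illustration; only the item itself is from the issue.
SAMPLE_RESPONSE = json.loads("""
{
  "errors": true,
  "items": [
    {
      "index": {
        "_index": null,
        "_type": null,
        "_id": null,
        "status": 500,
        "error": {"type": "null_pointer_exception", "reason": null}
      }
    }
  ]
}
""")


def failed_items(bulk_response):
    """Return (position, action, error_type) for every item carrying an error.

    Hypothetical helper for illustration only.
    """
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is a single-key object keyed by the action name.
        for action, result in item.items():
            error = result.get("error")
            if error is not None:
                failures.append((position, action, error.get("type")))
    return failures


if __name__ == "__main__":
    for position, action, error_type in failed_items(SAMPLE_RESPONSE):
        print(f"item {position} ({action}) failed with {error_type}")
```

Note that the item is keyed by `index` even though the request was an update, and `_index`, `_type`, and `_id` are all null, so the failure cannot be mapped back to a document by anything other than its position in the request.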
@jakelandis jakelandis added >bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP labels Dec 17, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-core-features

@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@joegallo
Contributor

Here's a note for my own benefit: this is what an appropriate _bulk request would look like. It took me a couple of minutes to piece it together.

POST _bulk
{ "update" : { "_index" : "test-1", "_id" : "1" } }
{ "upsert" : { "some" : "fields" }, "script" : "ctx" }
{ "update" : { "_index" : "test-2", "_id" : "1" } }
{ "upsert" : { "some" : "fields" }, "script" : "ctx" }
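
For anyone scripting the reproduction, the same body can be generated rather than typed by hand. This is a rough Python sketch using only the standard library. The helper name `build_bulk_upsert_body` is made up for illustration, and sending the body to a cluster is deliberately left out.

```python
import json


def build_bulk_upsert_body(targets, upsert_doc, script):
    """Build a newline-delimited _bulk body of update actions with an upsert.

    Hypothetical helper: `targets` is a list of (index, doc_id) pairs.
    """
    lines = []
    for index, doc_id in targets:
        # Action line: which document to update.
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        # Source line: the upsert document plus the script, as in the request above.
        lines.append(json.dumps({"upsert": upsert_doc, "script": script}))
    # Each line, including the last, must be terminated by a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_upsert_body(
    [("test-1", "1"), ("test-2", "1")],
    {"some": "fields"},
    "ctx",
)

if __name__ == "__main__":
    print(body, end="")
```

The trailing newline matters: the bulk API expects every line of the body, including the last one, to end with a newline.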

@joegallo
Contributor

On more recent versions, the _update API path has changed, so the non-bulk upsert now looks like this:

POST test/_update/1
{
  "script":{
    "source": "ctx._source.foo = 'bar'"
  },
  "upsert" :{
    "foo" : "bar"
  }
}

5 participants