
Parse Error while loading into Elasticsearch Cluster #457

Closed
vrozental opened this issue Jul 2, 2019 · 3 comments · Fixed by pelias/schema#372

@vrozental

I got `POST http://sdt-dev-elastic-001.test.pro:9200/_bulk => Parse Error` while running `npm start` to import data into our Elasticsearch cluster.

What might be the cause?

myuser@sdt-dev-elastic-001:~/pelias-setup/whosonfirst$ time npm start

> [email protected] start /home/myuser/pelias-setup/whosonfirst
> ./bin/start

Elasticsearch ERROR: 2019-07-02T09:22:28Z
  Error: Request error, retrying
  POST http://sdt-dev-elastic-001.test.pro:9200/_bulk => Parse Error
      at Log.error (/home/myuser/pelias-setup/whosonfirst/node_modules/elasticsearch/src/lib/log.js:226:56)
      at checkRespForFailure (/home/myuser/pelias-setup/whosonfirst/node_modules/elasticsearch/src/lib/transport.js:259:18)
      at HttpConnector.<anonymous> (/home/myuser/pelias-setup/whosonfirst/node_modules/elasticsearch/src/lib/connectors/http.js:164:7)
      at ClientRequest.wrapper (/home/myuser/pelias-setup/whosonfirst/node_modules/lodash/lodash.js:4935:19)
      at ClientRequest.emit (events.js:198:13)
      at Socket.socketOnData (_http_client.js:448:9)
      at Socket.emit (events.js:198:13)
      at addChunk (_stream_readable.js:288:12)
      at readableAddChunk (_stream_readable.js:269:11)
      at Socket.Readable.push (_stream_readable.js:224:10)

The importer settings from pelias.json:

"whosonfirst": {
  "datapath": "/mnt/local_drive_xvdc/pelias/data/whosonfirst",
  "importPostalcodes": true,
  "maxDownloads": 8
}

The Elasticsearch version is:

  "version" : {
    "number" : "5.6.16",
    "build_hash" : "3a740d1",
    "build_date" : "2019-03-13T15:33:36.565Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
@missinglink
Member

I've also seen these recently, usually two at the beginning of the import, and they aren't repeated.
I'm not sure exactly why it's happening, but it appears the request is retried and eventually succeeds.

Nothing to worry about.

@vrozental
Author

Thank you @missinglink

@orangejulius
Member

Yeah, these messages are nothing to worry about. It turns out Elasticsearch 5 emits a LOT of deprecation warnings for our current schema, and does so using HTTP headers. Node.js recently added a max header size, which triggers these errors. You can read all the details in pelias/schema#337 (comment)

We'll be able to fix these errors soon by cleaning up some of the deprecation warnings, now that we've completely dropped support for ES2. Until then, there's nothing to worry about.
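As an aside for anyone who wants to silence the errors before the schema fix lands: Node.js exposes the header limit mentioned above as a flag. A possible workaround (a sketch, assuming a Node version that supports `--max-http-header-size`, which was added in 10.15.0/11.6.0, and that the size shown is large enough for the deprecation headers your cluster emits) is to raise the limit when launching the importer:

```shell
# Raise Node's HTTP header limit from the 8 KB default to 64 KB so the
# Elasticsearch 5 deprecation-warning headers fit in a single response.
NODE_OPTIONS="--max-http-header-size=65536" npm start
```

If your Node version rejects the flag in `NODE_OPTIONS`, the same option can be passed to `node` directly on the command line.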

orangejulius added a commit to pelias/schema that referenced this issue Jul 5, 2019
This change makes our Elasticsearch schema compatible with Elasticsearch
5 and 6. It shouldn't have any effect on performance or operation, but it
will completely drop compatibility for Elasticsearch 2.

The primary change is that Elasticsearch 5 introduces two types of text
fields: `text` and `keyword`, whereas Elasticsearch 2 only had 1:
`string`.

Roughly, a `text` field is for true full text search and a `keyword`
field is for simple values that are primarily used for filtering or
aggregation (for example, our `source` and `layer` fields). The `string` datatype previously filled both of those roles depending on
how it was configured.

Fortunately, we had already roughly created a concept similar to the
`keyword` datatype in our schema, but called it `literal`. This has been
renamed to `keyword` to cut down on the number of terms needed.

One nice effect of this change is that it removes all deprecation
warnings printed by Elasticsearch 5. Notably, as discovered in
#337 (comment), these
warnings were quite noisy and required special handling to work around
Node.js header size restrictions. This special handling can now be
removed.

Fixes pelias/whosonfirst#457
Connects pelias/pelias#719
Connects pelias/pelias#461
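The `text`/`keyword` split described in the commit message can be sketched as a minimal Elasticsearch 5 mapping fragment. This is an illustration, not the actual Pelias schema; the `name` field and the `document` type name are assumptions, while `source` and `layer` are the filtering fields mentioned above:

```json
{
  "mappings": {
    "document": {
      "properties": {
        "name":   { "type": "text" },
        "source": { "type": "keyword" },
        "layer":  { "type": "keyword" }
      }
    }
  }
}
```

`text` fields are analyzed for full-text search, while `keyword` fields are indexed verbatim for exact-match filters and aggregations, which is exactly the role the old `literal` configuration of `string` played.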