
Kibana can't start against latest ES on cloud 8.0 snapshot deploy #71399

Closed
EricDavisX opened this issue Jul 9, 2020 · 14 comments
Labels
bug — Fixes for quality problems that affect the customer experience
Team:Fleet — Team label for Observability Data Collection Fleet team

Comments

@EricDavisX
Contributor

Run the cloud deploy of the latest 8.0 as of today (a new snapshot became available around 4 PM).

Plugins installed: []
Use all defaults.

Verified on multiple cloud regions and providers.

Description of the problem including expected versus actual behavior:
See the logs attached.

Steps to reproduce:
Deploy to cloud and see that Kibana never comes up. Get the logs and see that ES seems to have problems (I may not be reading the logs right, please forgive).

Logs: cloud-deploy-logs.txt (attached)

A particular log line (the latest error, of several):
Jul 9, 2020 @ 20:18:48.000 - Unable to connect to Elasticsearch. Error: [resource_already_exists_exception] index [.kibana_task_manager_1/RdWVpDqGTqy442X2O0gE-Q] already exists, with { index_uuid="RdWVpDqGTqy442X2O0gE-Q" & index=".kibana_task_manager_1" }

It sounds like a Kibana problem, right? But I have a special usage of the snapshots in place on the Endpoint Kibana demo server: we updated it with just the Kibana snapshot, and it connected fine to an ES snapshot that was 1-2 days older, which is what makes me think it's on the ES side. But I am not sure at all. Thank you in advance for helping this issue on towards its best home.

Image seen on cloud: [screenshot attached: Screen Shot 2020-07-09 at 4.23.29 PM]

@EricDavisX
Contributor Author

The snapshot was built this morning, from this job:
https://internal-ci.elastic.co/view/All/job/elastic+release-manager+master+unified-snapshot/518/
518 - master - 8.0.0-28d950ce

@tylersmalley
Contributor

tylersmalley commented Jul 9, 2020

This error is being addressed here: #71343

From the logs it's the same error as we are seeing with our ES snapshot promotion:

Jul 9, 2020 @ 20:15:48.000	 - 	{ Error: [mapper_parsing_exception] unknown parameter [index] on mapper [api_key] of type [binary]
    at respond (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:349:15)
    at checkRespForFailure (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:306:7)
    at HttpConnector.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/connectors/http.js:173:7)
    at IncomingMessage.wrapper (/usr/share/kibana/node_modules/lodash/lodash.js:4929:19)
    at IncomingMessage.emit (events.js:203:15)
    at endReadableNT (_stream_readable.js:1145:12)
    at process._tickCallback (internal/process/next_tick.js:63:19)

Kibana sets index: false on a binary field type. From the docs this appears to be unnecessary and not a valid option for that type; prior to today, it wasn't something that resulted in a rejection.
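
For illustration, here is a minimal sketch of the kind of mapping that now gets rejected (the index name is hypothetical; the field shape is inferred from the error above), since index is not a valid mapping parameter for the binary field type:

curl -X PUT 'localhost:9200/test-index' -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "properties": {
      "api_key": { "type": "binary", "index": false }
    }
  }
}'
# With the stricter validation, this returns a mapper_parsing_exception:
# unknown parameter [index] on mapper [api_key] of type [binary]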

We will handle this from the Kibana side and open an Elasticsearch issue with details on what the breaking change was if that is warranted.

@EricDavisX EricDavisX changed the title ES can't start on latest cloud 8.0 snapshot deploy Kibana can't start against latest ES on cloud 8.0 snapshot deploy Jul 10, 2020
@jasontedor jasontedor transferred this issue from elastic/elasticsearch Jul 12, 2020
@joegallo
Contributor

I think it's OOMing:

root@ip-172-23-148-242:~$ docker logs fac-617fade7e92544489eb723bceb66bba0-instance-0000000001
*** Running setuser kibana /app/kibana.sh...
2020-07-14T19:44:46+0000 Booting at Tue Jul 14 19:44:46 UTC 2020
2020-07-14T19:44:46+0000 Done preparing, starting Kibana. See Kibana logs for further output.
[BABEL] Note: The code generator has deoptimised the styling of /usr/share/kibana/x-pack/plugins/canvas/server/templates/pitch_presentation.js as it exceeds the max of 500KB.

 FATAL  Error: [config validation of [xpack.ingestManager].epm]: definition for this key is missing


<--- Last few GCs --->

[32:0x3621b40]   121414 ms: Mark-sweep 475.9 (732.2) -> 475.9 (666.2) MB, 1101.3 / 0.0 ms  (average mu = 0.739, current mu = 0.000) last resort GC in old space requested
[32:0x3621b40]   122614 ms: Mark-sweep 475.9 (666.2) -> 475.9 (638.2) MB, 1199.7 / 0.0 ms  (average mu = 0.558, current mu = 0.000) last resort GC in old space requested


<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x1dc0959dbe1d]
Security context: 0x32f68db1e6c1 <JSObject>
    1: byteLength(aka byteLength) [0x16aa76b7a669] [buffer.js:531] [bytecode=0x38fddb31e7f1 offset=204](this=0x35c0efe826f1 <undefined>,string=0x3a1a7e1eac99 <Very long string[147846958]>,encoding=0x32f68db3ddd9 <String[4]: utf8>)
    2: arguments adaptor frame: 3->2
    3: fromString(aka fromString) [0x3e8c8913379] [buffer.js:342] [bytecode=0x38fddb318891 o...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: 0x8fb090 node::Abort() [/usr/share/kibana/bin/../node/bin/node]
 2: 0x8fb0dc  [/usr/share/kibana/bin/../node/bin/node]
 3: 0xb0322e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/share/kibana/bin/../node/bin/node]
 4: 0xb03464 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/share/kibana/bin/../node/bin/node]
 5: 0xef74c2  [/usr/share/kibana/bin/../node/bin/node]
 6: 0xf06cdf v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/share/kibana/bin/../node/bin/node]
 7: 0xed688b v8::internal::Factory::NewRawTwoByteString(int, v8::internal::PretenureFlag) [/usr/share/kibana/bin/../node/bin/node]
 8: 0x1020113 v8::internal::String::SlowFlatten(v8::internal::Handle<v8::internal::ConsString>, v8::internal::PretenureFlag) [/usr/share/kibana/bin/../node/bin/node]
 9: 0xb00bd4 v8::internal::String::Flatten(v8::internal::Handle<v8::internal::String>, v8::internal::PretenureFlag) [/usr/share/kibana/bin/../node/bin/node]
10: 0xb0e5a0 v8::String::Utf8Length() const [/usr/share/kibana/bin/../node/bin/node]
11: 0x914779  [/usr/share/kibana/bin/../node/bin/node]
12: 0xb9166f  [/usr/share/kibana/bin/../node/bin/node]
13: 0xb921d9 v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/share/kibana/bin/../node/bin/node]
14: 0x1dc0959dbe1d
/app/kibana.sh: line 60:    32 Aborted                 (core dumped) NODE_OPTIONS="--max-old-space-size=800" ${KIBANA_HOME}/bin/kibana -c /app/config/kibana.yml $*
*** setuser exited with status 134.
*** Killing all processes...
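
For reference, the heap cap comes from the launcher invocation in the last log lines above; a sketch of the same invocation with a larger cap (the 4096 value mirrors the 4 GB retry in the next comment; KIBANA_HOME as in the log):

NODE_OPTIONS="--max-old-space-size=4096" ${KIBANA_HOME}/bin/kibana -c /app/config/kibana.yml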

@joegallo
Contributor

I tried again with 4 GB (rather than the default of 1 GB) and it still failed, but this time with:

root@ip-172-23-148-242:~$ docker logs -f fac-617fade7e92544489eb723bceb66bba0-instance-0000000002
*** Running setuser kibana /app/kibana.sh...
2020-07-14T19:59:01+0000 Booting at Tue Jul 14 19:59:01 UTC 2020
2020-07-14T19:59:01+0000 Done preparing, starting Kibana. See Kibana logs for further output.
[BABEL] Note: The code generator has deoptimised the styling of /usr/share/kibana/x-pack/plugins/canvas/server/templates/pitch_presentation.js as it exceeds the max of 500KB.

 FATAL  Error: [config validation of [xpack.ingestManager].epm]: definition for this key is missing

*** setuser exited with status 1.
*** Killing all processes...

@tylersmalley
Contributor

FATAL Error: [config validation of [xpack.ingestManager].epm]: definition for this key is missing

@elastic/ingest-management is this a known issue?

@liza-mae
Contributor

It was probably a blocked failure that is now being uncovered after the fix for #71343.

@jfsiii
Contributor

jfsiii commented Jul 14, 2020

The messages which started earlier today are almost certainly from #71542

There was also an email to ingest-management@ yesterday titled "Kibana xpack.ingestManager.epm.* settings removed", which said:

Hi team,
This is an FYI that the following changes related to Ingest Manager settings were recently merged into Kibana:
xpack.ingestManager.epm.enabled was removed
xpack.ingestManager.epm.registryUrl was renamed (flattened) to xpack.ingestManager.registryUrl
If you have been using these settings previously, make sure to remove or update them the next time you update to the latest Kibana. Kibana will fail to start if it detects unknown settings.
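
For concreteness, a sketch of what that change means for a kibana.yml (the registry URL value here is purely illustrative):

# 7.8 and earlier:
#   xpack.ingestManager.epm.enabled: true                                # removed in 7.9+
#   xpack.ingestManager.epm.registryUrl: 'https://registry.example.com'  # renamed
# 7.9+ (flattened):
xpack.ingestManager.registryUrl: 'https://registry.example.com'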

@EricDavisX
Contributor Author

The info here is 5 days old and I'm sure it is out of date, per all of our comments. The 8.0 line of snapshots still seems blocked due to a Beats problem (I think), so I don't know what's up. I'm focusing on 7.9, but am happy to post back specific tests or more info as needed based on current info; I just don't know how helpful this is at this point. I wasn't using any manual override settings like John has posted above; if those are hard-coded in the cloud setup somewhere, that could certainly be it.

@ph
Contributor

ph commented Jul 14, 2020

@joegallo @jfsiii Do we need to create a PR to fix the EPM issue?

@nachogiljaldo

I can confirm what Joe sees. There seem to be two problems (or is one perhaps a consequence of the other?).

The xpack.ingestManager.epm.enabled setting (and

xpack.ingestManager.enabled: true
xpack.ingestManager.epm.enabled: true
xpack.ingestManager.fleet.enabled: true

in general) were introduced by @jfsiii for testing. If those are not valid, please raise a PR to fix it ASAP. The good thing is that this is on stackpacks only, so we can release it as soon as the PR is approved.

@jfsiii
Contributor

jfsiii commented Jul 15, 2020

Thanks, @nachogiljaldo

@ph I'm afk until this afternoon, so I can't dig in until then. If anyone else wants to start on it, https://github.com/elastic/cloud/pull/56653 shows the relevant changes.

@nachogiljaldo I believe we want xpack.ingestManager.epm.enabled to remain valid only for 7.8, and become invalid for 7.9 & 8.0. I'll update the tests to verify 7.9 & 8.x are invalid for that flag, but I'm not sure what code changes I need to make. Do I need to update https://github.com/elastic/cloud/blob/master/scala-services/adminconsole/src/main/resources/settings/kibana/ingest_manager.yml#L5-L7 or add/change something elsewhere?

@nachogiljaldo

@jfsiii my understanding is that you have to revert/adapt https://github.com/elastic/cloud/pull/55464/files#diff-895cf982094d02bdfd8e44c277afc1bf to the new reality

@jfsiii
Contributor

jfsiii commented Jul 15, 2020

@nachogiljaldo Oh, if this is just about the block of overrides in kibana.yml, then this is much simpler. We can drop that line, or even that whole block, since those are the new defaults.

I'll submit a PR as soon as I can.
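
To make that concrete, a sketch (assuming the override block has the shape quoted earlier in this thread) of the cleanup being proposed:

# Cloud kibana.yml overrides that can now go away:
# xpack.ingestManager.enabled: true        # now the default, safe to drop
# xpack.ingestManager.fleet.enabled: true  # now the default, safe to drop
# xpack.ingestManager.epm.enabled: true    # no longer a valid key in 7.9+, must be removed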

@nachogiljaldo

@jfsiii I think that's the key thing, yes. Please remember to do it on the cloud-assets repository as well.

@ph ph added Team:ingest-management bug Fixes for quality problems that affect the customer experience labels Jul 15, 2020
@jen-huang jen-huang added Team:Fleet Team label for Observability Data Collection Fleet team and removed Team:ingest-management labels Jul 16, 2020
@jfsiii jfsiii closed this as completed Jul 20, 2020