[Fleet] Issues installing APM package #89829

Closed
simitt opened this issue Feb 1, 2021 · 8 comments · Fixed by #94040
Assignees
Labels
bug Fixes for quality problems that affect the customer experience Feature:Fleet Fleet team's agent central management project Team:Fleet Team label for Observability Data Collection Fleet team v7.13.0

Comments

@simitt
Contributor

simitt commented Feb 1, 2021

Kibana version: master (docker)

Elasticsearch version: master (docker)

Describe the bug:
Trying to install the apm integration to the default policy in Fleet occasionally leads to:
(a) the UI showing that the enabled_rum field is required
(b) the resulting policy configuration missing the apm-server section where the variables are configured
(c) the Elastic Agent subscribed to this policy going into an endless restart loop of the apm-server
Unfortunately nothing helpful is logged in the agent logs; apm-server logs are not created.
This happens for apm package 0.1.0-dev.4, which hasn't been changed in a while.

Steps to reproduce:
The behavior was observed twice, but I don't know exactly how to reproduce it. @jalvz experienced a similar error.
Wiping the ES, Kibana, and Elastic Agent docker containers helped to get back to a clean state.

Expected behavior:
Expected apm part in config:

  - id: a64b61e3-08cf-4865-b700-99ce867d3f67
    name: apm-1
    revision: 1
    type: apm
    use_output: default
    meta:
      package:
        name: apm
        version: 0.1.0-dev.4
    data_stream:
      namespace: default
    apm-server:
      host: '0.0.0.0:8200'
      secret_token: null
      rum.enabled: false

Observed apm config:
Screenshot 2021-01-29 at 19 51 28

Adding APM integration:
Screenshot 2021-01-29 at 19 46 10

cc @jen-huang

@simitt simitt added bug Fixes for quality problems that affect the customer experience Feature:Fleet Fleet team's agent central management project labels Feb 1, 2021
@elasticmachine
Contributor

Pinging @elastic/fleet (Feature:Fleet)

@ph ph added the Team:Fleet Team label for Observability Data Collection Fleet team label Feb 1, 2021
@ph
Contributor

ph commented Feb 1, 2021

@jen-huang @skh or @jfsiii any idea where the issue is coming from?

@jen-huang jen-huang self-assigned this Feb 1, 2021
@simitt
Contributor Author

simitt commented Feb 4, 2021

A similar or the same issue came up in one of our system tests recently (elastic/apm-server#4684) - the apm section was not injected into the policy.

@jen-huang
Contributor

Based on the symptoms I'm seeing, this could happen if enable_rum becomes undefined instead of false, but I'm not sure how this could happen after reviewing the code paths.

@simitt How were you able to get to the APM config after the Enable rum error in the UI? Did you have to toggle/enable it? Then, was the config still not showing the apm-server section?

@simitt
Contributor Author

simitt commented Feb 8, 2021

Yes, I toggled it to enabled and could then store it. I believe your assumption that it might be undefined could be right: the host field should have a default value of localhost:8200, and I realized that when this error happens the host field is also empty (I manually entered the value shown in the screenshot).

@axw
Member

axw commented Mar 6, 2021

I think I've tracked this down to https://github.com/elastic/kibana/blob/master/x-pack/plugins/fleet/server/services/epm/archive/validation.ts

There's a couple of bugs IIANM:

@axw
Member

axw commented Mar 8, 2021

Also I wonder if we could make that code a bit more robust to changes in the types? I barely know TypeScript so I don't know if this is possible, but it would be nice if when RegistryInput etc. changed (namely, fields added) the type checker would complain about dropped fields.
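
One way to get that kind of compile-time check is sketched below. This is only an illustration under stated assumptions: RegistryVarsEntry here is a simplified, hypothetical stand-in rather than Kibana's actual type, and VARS_ENTRY_KEYS and pickVarsEntry are made-up names. The idea is to drive the field copying from a key map typed as Record<keyof T, true>, so that adding a field to the interface without updating the map fails type checking.

```typescript
// Hypothetical, simplified stand-in for a registry type; not Kibana's actual RegistryInput.
interface RegistryVarsEntry {
  name: string;
  type: string;
  required?: boolean;
  default?: string | boolean;
}

// Every key of the interface must appear here. If a field is added to
// RegistryVarsEntry and this map is not updated, the compiler reports an error.
const VARS_ENTRY_KEYS: Record<keyof RegistryVarsEntry, true> = {
  name: true,
  type: true,
  required: true,
  default: true,
};

// Copy only the known keys from untrusted manifest data, dropping everything else.
function pickVarsEntry(raw: Record<string, unknown>): Partial<RegistryVarsEntry> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(VARS_ENTRY_KEYS)) {
    if (key in raw) {
      result[key] = raw[key];
    }
  }
  return result as Partial<RegistryVarsEntry>;
}

const parsed = pickVarsEntry({ name: 'enable_rum', type: 'bool', default: false, unknown: 1 });
console.log(parsed);
```

The unknown key is still filtered out, but a known field such as default can no longer be dropped silently, because the list of copied keys is tied to the type.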

@jen-huang
Contributor

jen-huang commented Mar 8, 2021

When I looked at the above code paths initially, it seemed like it would only get reached from installing a package by upload, but now I see that we use some of the validation functions when retrieving installed packages from ES storage too:

      const streams = parseAndVerifyStreams(dataStreamManifest, dataStreamPath);
      dataStreams.push({
        dataset: dataset || `${pkgName}.${dataStreamPath}`,
        title: dataStreamTitle,
        release,
        package: pkgName,
        ingest_pipeline: ingestPipeline || 'default',
        path: dataStreamPath,
        type,
        streams,
      });
    })
  );
  packageInfo.policy_templates = parseAndVerifyPolicyTemplates(packageInfo);

This looks like the smoking gun. I'll take a deeper look and come up with a PR. Thanks @axw for the pointer!
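
To make the suspected failure mode concrete, here is a minimal sketch. It assumes, without confirmation from the thread, that a parsing step copies only a fixed set of keys and so loses a variable's default. VarDef, parseVarLossy, parseVar and validateVar are hypothetical names for illustration and are not Kibana functions.

```typescript
// Hypothetical shapes for illustration; these are not Kibana's real types or functions.
interface VarDef {
  name: string;
  type: string;
  required?: boolean;
  default?: unknown;
}

// A lossy parser that forgets to copy `default` (the assumed bug).
function parseVarLossy(raw: VarDef): VarDef {
  const { name, type, required } = raw;
  return { name, type, required };
}

// A parser that carries `default` through.
function parseVar(raw: VarDef): VarDef {
  const { name, type, required, default: defaultValue } = raw;
  return { name, type, required, default: defaultValue };
}

// Returns an error message when a required variable ends up without a value.
function validateVar(def: VarDef): string | null {
  const value = def.default;
  if (def.required && (value === undefined || value === null)) {
    return `${def.name} is required`;
  }
  return null;
}

const manifestVar: VarDef = { name: 'enable_rum', type: 'bool', required: true, default: false };

console.log(validateVar(parseVarLossy(manifestVar))); // "enable_rum is required"
console.log(validateVar(parseVar(manifestVar))); // null
```

In this sketch a default of false passes validation, while a default that was dropped and became undefined produces the "required" error, which matches the symptom reported for the rum toggle and the empty host field.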

6 participants