[Fleet integration] Package agent binaries in APM Server #383

Closed
felixbarny opened this issue Dec 9, 2020 · 12 comments

@felixbarny
Member

felixbarny commented Dec 9, 2020

Adding the APM integration to a Fleet policy will prompt all corresponding Elastic Agents to download the APM Server binaries from elastic.co and to start the bundled apm-server.

As we eventually want to support auto-attachment of APM Agents for those that support it (currently Java, .NET, PHP, and Node.js) we need a way of downloading the APM Agent binaries, too.

The most straightforward way seems to be to bundle the APM Agent binaries with APM Server.

An alternative would be to download the agent binaries from their respective package registries (such as Nuget, Maven central, or NPM) on demand. This would, however, complicate the setup in restricted environments where the Elastic Agent has no connection to the internet.

@axw
Member

axw commented Dec 14, 2020

An alternative would be to download the agent binaries from their respective package registries (such as Nuget, Maven central, or NPM) on demand. This would, however, complicate the setup in restricted environments where the Elastic Agent has no connection to the internet.

Do we have docs around offline environments for Node.js already? I think that Java, .NET, and PHP are all self-contained, but the Node.js agent requires the installation of various external modules.

Also, do you have a rough idea of the size on disk of the agents? We're considering bundling APM Server with every Elastic Agent, and wouldn't want to blow out the size. I expect it'll be fine, but something we should check.

@felixbarny
Member Author

Do we have docs around offline environments for Node.js already? I think that Java, .NET, and PHP are all self-contained, but the Node.js agent requires the installation of various external modules.

@elastic/apm-agent-node-js could you help out here?

Also, do you have a rough idea of the size on disk of the agents?

Not sure about the Node.js agent but the Java agent would be under 10MB and .NET and PHP under 1MB

@trentm
Member

trentm commented Dec 16, 2020

Do we have docs around offline environments for Node.js already?

No docs that I'm aware of. Typically, yes: to install the Node.js agent you'd need to hit npm for myriad external packages, approximately 130 of them:

% npm install elastic-apm-node
...
+ elastic-apm-node@<version>
added 131 packages from 106 contributors and audited 131 packages in 4.243s

It is possible to zip/tar-up this installation into a redistributable form that doesn't require running npm install and hitting npm (and running 3rd party install scripts!) on the target container/VM. I could help with that. Is this a thing we'd want to do at apm-server build time? Or have apm-agent-nodejs.git publish bundled releases (these would be separate packages to what is published to npm)?

Also, do you have a rough idea of the size on disk of the agents?

Currently the Node.js agent adds up to 10MB:

% du -sh node_modules
 10M	node_modules

@felixbarny felixbarny added this to the 7.12 milestone Dec 17, 2020
@felixbarny
Member Author

Thanks, Trent. That's really helpful!

Is this a thing we'd want to do at apm-server build time? Or have apm-agent-nodejs.git publish bundled releases (these would be separate packages to what is published to npm)?

Good question. Both would probably work. I'd lean towards building the package on the Node.js agent releases and then just including the binary in APM Server.

Typically yes to install the node.js agent you'd need to hit npm for myriad external packages. [...] It is possible to zip/tar-up this installation into a redistributable form that doesn't require running npm install and hitting npm (and running 3rd party install scripts!) on the target container/VM.

What happens when some of the modules are conflicting with the user's application and the Node.js agent gets installed via NODE_OPTIONS=--require $ELASTIC_AGENT_HOME/apm/agents/nodejs/elastic-agent.js?

@trentm
Member

trentm commented Dec 17, 2020

What happens when some of the modules are conflicting with the user's application

I need to play with this a little bit. First, the elastic-agent.js code would load its own versions of its package deps from "$ELASTIC_AGENT_HOME/apm/agents/nodejs/node_modules/...". However, the part I'm not sure about is whether the APM agent's loading of package foo@v$version might cause a conflict for an app trying to import package foo@v$anotherVersion.

@trentm
Member

trentm commented Dec 17, 2020

However, the part I'm not sure about is ...

Never mind, I'm more sure now. This apm-agent-nodejs deployment will be fine: modules won't conflict.

I had a momentary brain-freeze thinking Node.js's require cache was based on module name (like Python's sys.modules) rather than module path.

If it helps to illustrate, say we have this layout of the APM agent and the user's app:

    $ELASTIC_AGENT_HOME/apm/agents/nodejs/
        elastic-agent.js
        package.json
        node_modules/
            foo/        # at version 1.0.0
            ...

    /some/dir/the-user-app/
        package.json
        app.js
        node_modules/
            foo/        # at conflicting version 2.0.0

When "elastic-agent.js" does require('foo') it'll get its "$ELASTIC_AGENT_HOME/apm/agents/nodejs/node_modules/foo"
and when "app.js" does require('foo') it'll get its own "/some/dir/the-user-app/node_modules/foo".

@axw
Member

axw commented Dec 18, 2020

Thanks for the details @trentm @felixbarny. Seems like it'll be reasonably straightforward then.

@felixbarny
Member Author

felixbarny commented Jan 13, 2021

Let's discuss how publishing Docker images containing agent binaries for k8s init containers relates to this.

TL;DR: Publishing Docker images for init containers is a good idea but doesn't replace agent binaries packaged in APM Server. However, the lower-effort alternative of just using wget to fetch the binaries in an init container sounds interesting, too.

Background: We're about to add k8s observability docs that include attachment instructions (zero code changes) for the .NET and Java agents. The idea is to copy the agent binaries from a container we provide into the user's container before startup and to attach the agent with the JAVA_TOOL_OPTIONS or DOTNET_STARTUP_HOOKS environment variable.

Is that an alternative for packaging agent binaries in APM Server?
I don't think so. Init containers don't cover non-k8s use cases.

The question is: when we're doing the k8s auto-startup attach via MutatingAdmissionWebhooks (#385), should we use init containers or the agent binaries that are packaged in APM Server?
I guess both options are valid. But for the sake of consistency and to not require access to a public Docker registry, I lean towards using the binaries that are packaged within APM Server.

Once there's an Elastic Agent for k8s, we'll have to rework the k8s monitoring docs (elastic/observability-docs#151) but I guess that's no surprise.

Is using init containers a good idea?
I think it is. It seems native to k8s and simplifies the distribution of the agent binaries without requiring them to be packaged in APM Server.

An alternative to init containers is just downloading the agent binaries from the package registry: https://gist.github.com/bmorelli25/a8e31252f218e3db85d1ba72640b8a46#file-kube-yaml-L23-L25
However, that requires access to the public internet.

The question is whether the benefits of the init-container approach outweigh the effort of creating and publishing the Docker image. We're currently doing that for Java but it's still a manual step in the release process.

Given that we'll likely not be using init containers for the Elastic Agent/Fleet integration I doubt that the effort of publishing an image as part of the release process of at least 3 agents (Java, .NET, Node.js) is going to be worth it, given there's a simpler alternative (wget).

cc @eyalkoren @bmorelli25

@jalvz
Contributor

jalvz commented Jan 21, 2021

What happens on the integration side?

Do we just show the path to the agent binary in the integration UI and how to attach it to the service?

@felixbarny
Member Author

felixbarny commented Jan 28, 2021

The idea for phase 2 (tentatively 7.13) of integrating APM Agents into the integration UI is to provide a copy/pasteable snippet of environment variables that can be added to the startup script of the application.

For Java, it could look something like this:

Instrument Java applications
Manual startup attach (Phase 2)
Add the following line to the startup script of your application
JAVA_TOOL_OPTIONS=-javaagent:/opt/elastic/agent/apm-server/agents/elastic-apm-agent.jar
Programmatic attach (Phase 1)
1. add dependency...
2. start agent...

Therefore, the APM Server should bundle the APM Agent binaries for .NET, Node.js, and Java. In upcoming iterations, the bundled agent binaries should be used for auto-attach in k8s: #385

There are several challenges that need to be discussed:

Installation directory of agent

This proposal assumes that there’s a fixed path to an agent binary that can just be copy/pasted to an application startup script. That is, however, probably not possible for many reasons:

Facts: Elastic Agent installation directory

  • The installation directory of Elastic Agent includes variables such as the git sha, the version number and a platform identifier.
  • When installed via a package manager the path looks like this: /usr/share/elastic-agent/data/elastic-agent-d4b590/install/apm-server-8.0.0-linux-x86_64.
  • When installed from the tar archive, the path is /opt/Elastic/Agent/* for Linux or /Library/Elastic/Agent/ for macOS.
  • A single agent policy can include agents in different installation directories
  • A single agent policy can include agents running on different operating systems

Discussion

The integration policy editor could show different paths for each type of installation to account for that. There are other difficulties related to Agent version updates, covered in "Independent updates of APM Server and APM Agents".

Hosted vs local agent

As the agent may not run on-premises but hosted in Cloud or ECE, the integration policy editor needs to show different installation instructions based on that.
For hosted agents, it would need to include the APM Server URL and the secret token/API Key.

This is very similar to the behavior of the current "add APM data" dialog that is aware of the APM Server URL and the secret token when running on cloud. The difference is that there can be multiple Agent policies that can either be local or hosted. Eventually, the "add APM data" should be replaced by the instructions within the integration policy editor.

Independent updates of APM Server and APM Agents

Requirements

  1. To decrease the risk of breaking things in their applications, users should be able to stay on the same APM Agent version even when updating Elastic Agent.
  2. As a nice-to-have, they should also be able to update an APM Agent without having to update Elastic Agent.

Non-requirements

  • For now, we don’t see using different APM Agent versions in different applications as a requirement.

Facts: Elastic Agent update procedure

  • When updating Elastic Agent, the directory of the current agent doesn’t change. Instead a new one is created so that every time a new update is installed, there’s another folder matching /usr/share/elastic-agent/data/elastic-agent-*.
  • All agents of a given Agent policy always have the same version. An exception to that rule is that during an update process, agents of two different versions may run at the same time.
  • After an Elastic Agent update is complete, the installation directory of the older Elastic Agent version is not deleted right away. Once the new one is stable, the old installation is deleted.

Discussion

So if we instruct users to add an environment variable that includes a path to the agent directory, that path becomes invalid after updating to a new version because the old directory will be deleted. On the other hand, if we somehow provide a symlink that always points to the agent binaries in the latest installation, we can’t meet requirement 1.
Maybe we could copy the APM Agent binaries to a path that’s outside of the Elastic Agent installation directory such as /usr/share/elastic-agent/data/apm-agents/.

When thinking about k8s and Java runtime attachment, there’s a similar challenge to meet requirement 1 as we’d always attach the APM Agent that’s bundled with the current version of Elastic Agent/APM Server.

For the nice-to-have requirement 2, bundling a single version of an APM Agent with APM Server might not be enough, and we’d need to dynamically download the agent binaries from a package registry or the GitHub Releases page, or package the APM Agents in a separate Elastic Agent integration.

Fully automated attach via environment variables?

I’m not sure if it’s something that’s feasible to achieve but I’d like us to challenge ourselves to think about how we could fully auto-attach the .NET and Node.js agents by automatically adding the environment variables to starting processes. Maybe we could leverage some ld_preload tricks or shim the binaries for the runtimes?
Another advantage of that approach is that it allows each agent instance to inject the path to the agent binaries, depending on where it's installed.


I propose to split this effort into two phases. For 7.12, we complete a definition phase as a joint effort between the server team and me. For 7.13, the server team takes ownership of delivering the implementation.

@felixbarny
Member Author

felixbarny commented Feb 23, 2021

Based on several discussions, we concluded that we will not focus on either semi-automatic or fully-automatic (ld_preload) attachment using JAVA_TOOL_OPTIONS/DOTNET_STARTUP_HOOKS for now. Instead, we're focussing on improving and integrating the already existing Java runtime attachment as a first step and k8s attachment #385 as a second step.

As independent updates of APM Agents and Elastic Agent/APM Server are an important use case, bundling the agents in APM Server does not make sense. Instead, we are considering adding the agents to artifacts.elastic.co and extending Elastic Agent so APM Server can request an agent binary in a specific version, depending on which version has been selected in the policy editor. This is in the early stage of the definition.
This does, however, not block us from doing a POC for Java runtime attachment by just assuming that the necessary binaries are in a specific folder. The current thinking is that we'll bundle the attacher-cli-slim with APM Server and download the agent binary on demand.

@AlexanderWert AlexanderWert modified the milestones: 7.12, 7.14 Mar 29, 2021
@felixbarny
Member Author

Closing in favor of elastic/apm-server#4830

@felixbarny felixbarny removed this from the 7.14 milestone Sep 20, 2021