This document describes all the assets available in a package and provides some details on each asset.
- Asset Path: manifest.yml
The manifest.yml contains the information about the package. It can contain the following entries:
- name: Name of the package (required)
- description: Description of the package (required)
- version: Version of the package (required)
- categories: List of categories this package falls under. The available categories still need to be defined.
- requirement: An object that contains all the requirements for the stack versions of this package. It contains an entry for each possible service, which can in turn contain version.min and version.max. Other requirements might be added here if needed, such as a dependency on a specific Elasticsearch plugin or ingest pipeline. In the past this was needed for geo and user_agent, as they were not installed by default.
- format_version: The package format version this package was built on top of. For now this is always 1.0.0 by default.
An example manifest might look as follows:
format_version: 1.0.0
name: envoyproxy
title: Envoy Proxy
description: This is the envoyproxy package.
version: 0.0.2
categories: ["logs", "metrics"]
# Options are experimental, beta, ga
release: beta
# The package type. The options for now are [integration, solution]; more types might be added in the future.
# The default type is integration and will be set if empty.
type: integration
compatibility: [1.0.2, 2.0.1]
os.platform: [darwin, freebsd, linux, macos, openbsd, windows]
requirement:
  elasticsearch:
    # Quoted because a YAML plain scalar cannot start with ">".
    versions: ">7.0"
  kibana:
    versions: ">7.0"
  agent:
    versions: ">7.1"
# The order of the items listed here is the order they show up in the package overview.
screenshots:
  # The src path is relative from inside the package. The full path will be generated by the server
  # and exposed through the API.
  - src: /img/overview-logs.png
    title: This shows the overview of the logs dashboard.
    # The type does not have to be set explicitly if it can be derived from the file extension,
    # but the server will extract it and expose it through the API.
    type: image/png
    # The size of the image could be detected by the server too if needed.
    # We must come up with a recommended image size.
    size: 800x600
  - src: /img/overview-metrics.jpg
    title: Metrics Dashboard.
    size: 800x600
  - src: getting-started.mp4
    title: Getting started with the envoyproxy integration.
    size: 800x600
    type: video/mp4
The definition of the manifest is not complete yet and further details will follow.
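As an illustration of the entries listed above, the following is a minimal sketch of how a consumer of the manifest might check the required entries and apply the default package type. It assumes the manifest.yml has already been parsed into a dictionary; the function names are hypothetical and not part of any existing tool.

```python
# Entries marked as required in the list above.
REQUIRED_ENTRIES = ("name", "description", "version")


def missing_manifest_entries(manifest: dict) -> list:
    """Return the required entries that are absent or empty in a parsed manifest."""
    return [key for key in REQUIRED_ENTRIES if not manifest.get(key)]


def with_default_type(manifest: dict) -> dict:
    """Return a copy of the manifest with type set to integration if it is empty."""
    result = dict(manifest)
    if not result.get("type"):
        result["type"] = "integration"
    return result


manifest = {
    "format_version": "1.0.0",
    "name": "envoyproxy",
    "description": "This is the envoyproxy package.",
    "version": "0.0.2",
}
print(missing_manifest_entries(manifest))
print(with_default_type(manifest)["type"])
```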
The changelog of a package always contains all previous changes, not only those from the last major, minor, or bugfix release. Each array entry is a release. The type entry can contain one of the following values: [added, bugfix, deprecated, breaking-change, known-issue]
The file looks as follows:
- version: 1.0.4
  changes:
    - description: >
        Unexpected breaking change had to be introduced. This should not happen in a minor.
      type: breaking-change
      link: https://github.com/elastic/beats/issues/13504
- version: 1.0.3
  changes:
    - description: Fix broken template
      type: bugfix
      link: https://github.com/elastic/beats/issues/13507
    - description: It is a known issue that the dashboard does not load properly
      type: known-issue
      link: https://github.com/elastic/beats/issues/13506
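The allowed type values could be checked with something like the sketch below. It assumes the changelog has already been parsed into a list of releases; the function name is hypothetical.

```python
# The values the type entry can contain, as listed above.
ALLOWED_TYPES = {"added", "bugfix", "deprecated", "breaking-change", "known-issue"}


def invalid_change_types(changelog: list) -> list:
    """Return (version, type) pairs whose type is not one of the allowed values."""
    problems = []
    for release in changelog:
        for change in release.get("changes", []):
            if change.get("type") not in ALLOWED_TYPES:
                problems.append((release.get("version"), change.get("type")))
    return problems


changelog = [
    {"version": "1.0.4", "changes": [{"description": "Breaking", "type": "breaking-change"}]},
    {"version": "1.0.3", "changes": [{"description": "Fix broken template", "type": "fix"}]},
]
print(invalid_change_types(changelog))
```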
- Asset Path: fields/*.yml
The fields.yml files are used for field definitions and can be used to generate the index pattern in Kibana, the Elasticsearch index template, or rollup jobs. It is not yet clear whether and how the package manager should use these files.
The directory is reserved for multiple fields.yml files, as each package, Beat, and ECS has its own fields.yml.
Elasticsearch assets are the assets which are loaded into Elasticsearch. All of them are inside the elasticsearch directory.
- Asset Path: elasticsearch/ingest-pipeline/*.json
The Elasticsearch ingest pipeline contains information on how the data should be processed. Multiple ingest pipelines can depend on each other thanks to the pipeline processor. As the exact names the package manager gives to the pipelines are not known during package creation, variables are needed to reference them. An example of this can be found here in Beats. It means the package manager will have to be able to understand this template language (we still need to decide what our template language is) and replace the pipeline ids with the correct values.
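The replacement step described above might look like the following sketch. Since the template language is not decided, the `{{pipeline:NAME}}` placeholder syntax, the function name, and the generated pipeline id are all assumptions made up for illustration.

```python
import json
import re

# Hypothetical placeholder syntax: {{pipeline:NAME}}. The real template language is undecided.
PLACEHOLDER = re.compile(r"\{\{\s*pipeline:([\w.-]+)\s*\}\}")


def resolve_pipeline_ids(raw_pipeline: str, installed_ids: dict) -> str:
    """Replace each placeholder with the id the package manager gave to that pipeline."""
    def substitute(match):
        local_name = match.group(1)
        if local_name not in installed_ids:
            raise KeyError(f"no installed pipeline for {local_name!r}")
        return installed_ids[local_name]

    return PLACEHOLDER.sub(substitute, raw_pipeline)


raw = json.dumps({
    "description": "Entry pipeline",
    # The pipeline processor references another pipeline by name.
    "processors": [{"pipeline": {"name": "{{pipeline:plaintext}}"}}],
})
resolved = json.loads(resolve_pipeline_ids(raw, {"plaintext": "nginx-access-0.0.2-plaintext"}))
print(resolved["processors"][0]["pipeline"]["name"])
```

Substituting on the raw text rather than on the parsed JSON keeps the sketch independent of where in the pipeline definition a reference appears.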
- Asset Path: elasticsearch/index-template/*.json
The Elasticsearch index template is used to have a template applied to certain index patterns. Inside the Index Template, the value index_patterns defines which indices the template matches. As the indexing convention is either given by the index manager or the user, the package manager must be able to overwrite the index pattern.
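Overwriting the index pattern could be as simple as the sketch below. It assumes the template JSON has been parsed into a dictionary with a top-level index_patterns entry; the function name and the example patterns are hypothetical.

```python
import copy


def with_index_patterns(template: dict, patterns: list) -> dict:
    """Return a copy of the index template that matches the given index patterns."""
    updated = copy.deepcopy(template)
    updated["index_patterns"] = list(patterns)
    return updated


packaged = {"index_patterns": ["logs-nginx-*"], "priority": 1}
installed = with_index_patterns(packaged, ["custom-nginx-*"])
print(installed["index_patterns"])
```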
On the Beats side today, the Index Template is generated out of the fields.yml files. This gives more flexibility to generate the correct template for different Elasticsearch versions. As we can release packages for different versions of Elasticsearch independently, this is probably not needed anymore. I expect fields.yml to stick around, as it's a nice way to create index templates and index patterns in one go. The package manager should be able to generate index templates and index patterns out of all the combined fields.yml files.
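To make the idea of generating a template from fields.yml concrete, here is a rough sketch that turns parsed field definitions into mapping properties. It assumes each field has a name and a type, that group fields carry their children under fields, and that a missing type means keyword; these follow the Beats convention as an assumption, not a definition of the package format.

```python
def fields_to_properties(fields: list) -> dict:
    """Convert a list of parsed field definitions into mapping properties."""
    properties = {}
    for field in fields:
        if field.get("type") == "group":
            # A group becomes an object with its own nested properties.
            properties[field["name"]] = {
                "properties": fields_to_properties(field.get("fields", []))
            }
        else:
            # Assumption: fields without an explicit type are keywords.
            properties[field["name"]] = {"type": field.get("type", "keyword")}
    return properties


fields = [
    {"name": "nginx", "type": "group", "fields": [
        {"name": "status", "type": "long"},
        {"name": "method"},
    ]},
]
print(fields_to_properties(fields))
```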
An Index Template also relates to the ILM policy, as it can reference the ILM policy that should be applied to the indices created.
Prebuilt Index Templates contained in packages have to follow the Index Templates v2 specification. Index Templates generated by EPM also follow this specification.
- Asset Path: elasticsearch/ilm-policy/*.json
The Elasticsearch index lifecycle management policy can be added / removed through the API. It is important that the id / name of the ILM policy matches what was configured in the index template.
Setting up ILM also requires creating an alias and a write index. It's important that this happens before the first data is ingested. More details on this can be found in the rollover documentation.
Even if an ILM policy is created after a template and the write index were created, it will still apply. But if data is ingested before the template and the write index exist, this will break the system.
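The alias and write index can be created in a single request. The sketch below only builds that request and does not send it; the function name is hypothetical, and the -000001 suffix follows the usual rollover naming convention.

```python
def bootstrap_write_index(alias: str) -> tuple:
    """Build the request that creates the first index and marks it as the alias's write index."""
    index_name = f"{alias}-000001"
    body = {"aliases": {alias: {"is_write_index": True}}}
    return ("PUT", f"/{index_name}", body)


method, path, body = bootstrap_write_index("logs-nginx")
print(method, path, body)
```

This request has to be sent after the index template is installed and before the first data is ingested, so that the new index picks up the template.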
- Asset Path: elasticsearch/rollup-job/*.json
A rollup job defines how metric data is rolled up. One special thing about rollup jobs is that they can only be created if the index and data already exist.
Rollup jobs can potentially also be generated from fields.yml, but it should be up to the package creator to do this.
Rollup jobs today do not support rollup templates, which would be nice to have; see discussion here. This would allow separating the creation of the template from the actual creation / start of the job.
A rollup job depends on an index pattern and has a target index. The package manager should potentially be able to configure this.
- Asset Path: elasticsearch/index/*.json
Index data is data that should be written into Elasticsearch. The data format is expected to be in the Bulk format.
If the user can configure the index, the package manager should potentially be able to overwrite / prefix the index fields inside the loaded data.
Loading of data can fail or partially fail. Because of this, failure handling must be possible.
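A sketch of both points follows: prefixing the index name in Bulk action lines and picking out the failed items from a Bulk response. The function names and the prefix are hypothetical; the action/source line layout and the errors / items response fields are those of the Elasticsearch Bulk API.

```python
import json


def prefix_bulk_indices(bulk_body: str, prefix: str) -> str:
    """Prefix the _index of every action line in a newline-delimited Bulk body."""
    out = []
    expect_source = False
    for line in bulk_body.splitlines():
        if not line.strip():
            continue
        if expect_source:
            # Source documents are passed through untouched.
            out.append(line)
            expect_source = False
            continue
        action = json.loads(line)
        op, meta = next(iter(action.items()))
        if "_index" in meta:
            meta["_index"] = prefix + meta["_index"]
        out.append(json.dumps(action))
        # Every action except delete is followed by a source line.
        expect_source = op != "delete"
    return "\n".join(out) + "\n"


def failed_items(bulk_response: dict) -> list:
    """Return the items of a Bulk response that carry an error."""
    if not bulk_response.get("errors"):
        return []
    return [
        item for item in bulk_response["items"]
        if any("error" in result for result in item.values())
    ]


body = '{"index": {"_index": "nginx", "_id": "1"}}\n{"message": "hello"}\n'
print(prefix_bulk_indices(body, "user-"))
```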
- Asset Path: elasticsearch/ml-job/*.json
Elasticsearch Machine Learning jobs can be created in Elasticsearch, assuming ML is enabled. As soon as a job is started, it creates results. Once results exist, a job can no longer simply be removed; the results must be removed first (more details needed).
- Asset Path: elasticsearch/data-frame-transform/*.json
Data Frame Transforms can be used to transform documents. There are a few things which are special about data frame transforms:
- Destination index must exist before creation
- Source index must exist before creation
- If the data frame transform uses an ingest pipeline, it must exist before creation
- Data Frame Transform must be stopped before deletion
Some of the above limitations might be removed in the future.
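The creation-time limitations could be checked before installing a transform, roughly as sketched below. It assumes the transform definition uses source.index, dest.index, and an optional dest.pipeline, and that the caller already knows which indices and pipelines exist; wildcard source patterns are not resolved here, and the function name is hypothetical.

```python
def missing_prerequisites(transform: dict, existing_indices: set, existing_pipelines: set) -> list:
    """List what must be created before this transform can be created."""
    missing = []
    source = transform["source"]["index"]
    # source.index may be a single name or a list of names.
    sources = [source] if isinstance(source, str) else list(source)
    for index in sources:
        if index not in existing_indices:
            missing.append(f"source index {index}")
    dest = transform["dest"]
    if dest["index"] not in existing_indices:
        missing.append(f"destination index {dest['index']}")
    pipeline = dest.get("pipeline")
    if pipeline and pipeline not in existing_pipelines:
        missing.append(f"ingest pipeline {pipeline}")
    return missing


transform = {
    "source": {"index": "nginx-access"},
    "dest": {"index": "nginx-summary", "pipeline": "add-timestamp"},
}
print(missing_prerequisites(transform, {"nginx-access"}, set()))
```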
Kibana assets are the assets which are loaded into Kibana. The Kibana API docs can be found here. A large portion of the Kibana assets are saved objects. All saved objects are space aware, meaning the same object id with a different prefix can exist in multiple spaces.
Assuming the package manager generates the ids of the assets, it must be able to adjust the reference ids across dashboards, visualizations, searches, and index patterns.
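Adjusting the reference ids might look like the sketch below. It assumes each saved object carries its own id plus a references list whose entries have an id, as in the Kibana saved objects export format; the function name and the ids are made up for illustration.

```python
import copy


def remap_ids(saved_objects: list, id_map: dict) -> list:
    """Return copies of the saved objects with their ids and reference ids rewritten."""
    remapped = []
    for obj in saved_objects:
        new_obj = copy.deepcopy(obj)
        new_obj["id"] = id_map.get(obj["id"], obj["id"])
        for ref in new_obj.get("references", []):
            # Ids that are not in the map (for example external objects) are left as they are.
            ref["id"] = id_map.get(ref["id"], ref["id"])
        remapped.append(new_obj)
    return remapped


objects = [
    {"id": "dash-1", "type": "dashboard",
     "references": [{"id": "vis-1", "name": "panel_0", "type": "visualization"}]},
    {"id": "vis-1", "type": "visualization", "references": []},
]
id_map = {"dash-1": "pkg-dash-1", "vis-1": "pkg-vis-1"}
for obj in remap_ids(objects, id_map):
    print(obj["id"], [ref["id"] for ref in obj["references"]])
```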
- Asset Path: kibana/dashboard/*.json
A Kibana dashboard consists of multiple visualisations it references.
- Asset Path: kibana/visualization/*.json
Visualizations are referenced inside dashboards and can reference a search object.
- Asset Path: kibana/search/*.json
The search object contains a saved search and is referenced by visualisations. A search object also references an index, like "index": "filebeat-*". In case we allow users to adjust indices, this would have to be adjusted in the search object.
- Asset Path: kibana/infrastructure-ui-source/*.json
The Infrastructure UI source is used to tell the Logs and Metrics UI which indices to query for data and how to visualise the data.
Like dashboards and visualizations, the asset is just a saved object and can be loaded the same way. But the Logs UI could also add an API for a tighter integration. At the moment there is no selection in the UI to change / switch the source, but it can be triggered through URL parameters.
- Asset Path: dataset/{dataset-name}/{package-structure}
All datasets are defined inside the dataset directory. An example here is the access dataset of the nginx package.
Inside each dataset, the same structure that is defined for the overall package is repeated. In general, ingest pipelines and field definitions are only expected inside a dataset. A dataset is basically a template for an input.
manifest.yml
Each dataset must contain a manifest.yml. It contains all information about the dataset and how to configure it.
# Needs to describe the type of this input. Currently either metric or log
type: metric
# Each input can be in its own release status
release: beta
# If set to true, this will be enabled by default in the input selection
default: true
# Defines variables which are used in the config files and can be configured by the user / replaced by the package manager.
vars:
  -
    # Name of the variable that should be replaced
    name: hosts
    # Default value of the variable which is used in the UI and in the config if not specified
    default:
      ["http://127.0.0.1"]
    required: true
    # OS specific configurations!
    os.darwin:
      - /usr/local/var/log/nginx/error.log*
    os.windows:
      - c:/programdata/nginx/logs/error.log*
    # Below are UI Configs. Should we prefix these with ui.*?
    # Title used for the UI
    title: "Hosts lists"
    # Description of the variable which could be used in the UI
    description: Nginx hosts
    # A special type can be specified here for the UI Input document. By default it is just a
    # text field.
    type: password
  - name: period
    description: "Collection period. Valid values: 10s, 5m, 2h"
    default: "10s"
  - name: username
    type: text
  - name: password
    # This is the html input type?
    type: password
requirements:
  # Defines on which platforms this input is available
  platform: ["linux", "freebsd"]
  elasticsearch.processors:
    # If a user does not have the user_agent processor, they should still be able to install the package but not
    # enable the access input
    - name: user_agent
      plugin: ingest-user-agent
    - name: geoip
      plugin: ingest-geoip
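How the vars could be resolved into concrete values is sketched below, assuming the vars list has been parsed into dictionaries. User-provided values win over defaults, and a required variable with neither raises an error. The function name is hypothetical, and OS specific overrides are left out.

```python
def resolve_vars(var_definitions: list, user_values: dict) -> dict:
    """Combine user-provided values with the defaults from the dataset manifest."""
    resolved = {}
    for definition in var_definitions:
        name = definition["name"]
        if name in user_values:
            resolved[name] = user_values[name]
        elif "default" in definition:
            resolved[name] = definition["default"]
        elif definition.get("required"):
            raise ValueError(f"missing required variable: {name}")
    return resolved


var_definitions = [
    {"name": "hosts", "default": ["http://127.0.0.1"], "required": True},
    {"name": "period", "default": "10s"},
    {"name": "username", "type": "text"},
]
print(resolve_vars(var_definitions, {"period": "5m"}))
```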
fields
The fields directory contains all the fields.yml files that are needed to build the full template. All fields related to the dataset must be in here, in one or multiple files.
An open question is how the fields for all the processors and autodiscovery are loaded.
docs
The docs for each dataset are combined with the overall docs. For datasets, it is encouraged to have a data.json available as an example event.
agent/input
Agent input configuration for the input. It's by design not an array but a single entry. The package manager will build a list out of it for the user.