Package assets

This document describes the assets that a package can contain and gives some details on each of them.

General Assets

Manifest

Asset Path: manifest.yml

The manifest.yml contains information about the package. It can contain the following entries:

name: Name of the package (required)
description: Description of the package (required)
version: Version of the package (required)
categories: List of categories this package falls under. The available categories still need to be defined.
requirement: An object that contains the stack version requirements of this package. It contains an entry for each possible service, each of which can contain version.min and version.max. Other requirements might be added here if needed, such as a dependency on a specific Elasticsearch plugin or ingest pipeline. In the past this was needed for geo and user_agent, as they were not installed by default.
format_version: The package format version this package was built on. For now this is always 1.0.0.

An example manifest might look as follows:

format_version: 1.0.0
name: envoyproxy
title: Envoy Proxy
description: This is the envoyproxy package.
version: 0.0.2
categories: ["logs", "metrics"]
# Options are experimental, beta, ga
release: beta
# The package type. The options for now are [integration, solution]; more types might be added in the future.
# The default type is integration and will be set if empty.
type: integration
compatibility: [1.0.2, 2.0.1]
os.platform: [darwin, freebsd, linux, macos, openbsd, windows]

requirement:
  elasticsearch:
    versions: ">7.0"
  kibana:
    versions: ">7.0"
  agent:
    versions: ">7.1"

# The order of the items listed here is the order they show up in the package overview.
screenshots:
  # The src path is relative from inside the package. The full path will be generated by the server
  # and exposed through the API.
- src: /img/overview-logs.png
  title: This shows the overview of the logs dashboard.
  # The type does not have to be set explicitly if it can be derived from the file extension,
  # but the server will extract it and expose it through the API.
  type: image/png
  # The size of the image could be detected by the server too if needed.
  # We must come up with a recommended image size.
  size: 800x600
- src: /img/overview-metrics.jpg
  title: Metrics Dashboard.
  size: 800x600
- src: getting-started.mp4
  title: Getting started with the envoyproxy integration.
  size: 800x600
  type: video/mp4

The definition of the manifest is not complete yet and further details will follow.
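The required entries and the default type described above could be checked with a small helper. The following is only a minimal sketch, not the package manager's actual validation: the function names and the error format are hypothetical, and the manifest is assumed to be already parsed into a Python dict.

```python
# Hypothetical helpers; names and error format are assumptions for illustration.
REQUIRED_KEYS = ("name", "description", "version")
DEFAULT_TYPE = "integration"


def validate_manifest(manifest: dict) -> list:
    """Return a list of problems; an empty list means all required entries are set."""
    problems = []
    for key in REQUIRED_KEYS:
        if not manifest.get(key):
            problems.append(f"missing required entry: {key}")
    return problems


def with_defaults(manifest: dict) -> dict:
    """Return a copy of the manifest with the default type applied if it is empty."""
    result = dict(manifest)
    if not result.get("type"):
        result["type"] = DEFAULT_TYPE
    return result
```

For example, validate_manifest({"name": "envoyproxy"}) reports the missing description and version entries.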

changelog.yml

The changelog of a package always contains all previous changes, not only those from the last major, minor, or bugfix release. Each array entry is a release. The type entry of a change can contain one of the following values: [added, bugfix, deprecated, breaking-change, known-issue]

The file looks as follows:

- version: 1.0.4
  changes:
    - description: >
        Unexpected breaking change had to be introduced. This should not happen in a minor.
      type: breaking-change
      link: https://github.com/elastic/beats/issues/13504
- version: 1.0.3
  changes:
    - description: Fix broken template
      type: bugfix
      link: https://github.com/elastic/beats/issues/13507
    - description: It is a known issue that the dashboard does not load properly
      type: known-issue
      link: https://github.com/elastic/beats/issues/13506
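A check of the type values could look roughly like the sketch below. It assumes the changelog has already been parsed into a list of dicts with the structure shown above; the function name invalid_changes is hypothetical.

```python
# Allowed values for the "type" entry, as listed in this document.
ALLOWED_TYPES = {"added", "bugfix", "deprecated", "breaking-change", "known-issue"}


def invalid_changes(changelog: list) -> list:
    """Return (version, type) pairs for changes whose type is not allowed."""
    bad = []
    for release in changelog:
        for change in release.get("changes", []):
            if change.get("type") not in ALLOWED_TYPES:
                bad.append((release.get("version"), change.get("type")))
    return bad
```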

Fields.yml

Asset Path: fields/*.yml

The fields.yml files are used for field definitions and can be used to generate the index pattern in Kibana, the Elasticsearch index template, or rollup jobs. It is not yet clear how, or whether, the package manager should use these files.

The directory is reserved for multiple fields.yml files, as each package, Beat, and ECS has its own fields.yml.

Elasticsearch

Elasticsearch assets are the assets which are loaded into Elasticsearch. All of them are inside the elasticsearch directory.

Ingest Pipeline

Asset Path: elasticsearch/ingest-pipeline/*.json

The Elasticsearch ingest pipeline contains information on how the data should be processed. Multiple ingest pipelines can depend on each other thanks to the pipeline processor. Because the exact names the package manager gives to the pipelines are not known during package creation, variables are needed to reference them. An example of this can be found in Beats. This means the package manager has to understand the template language (we still need to decide what our template language is) and replace the pipeline ids with the correct values.
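As the template language is still undecided, the sketch below only illustrates the replacement step. The placeholder syntax {{pipeline:NAME}} and the function name are invented for this example and are not the syntax used by Beats or by the package manager.

```python
import re

# Hypothetical placeholder syntax; the real template language is not decided yet.
PLACEHOLDER = re.compile(r"\{\{pipeline:([A-Za-z0-9_.-]+)\}\}")


def resolve_pipeline_ids(raw_pipeline: str, installed_ids: dict) -> str:
    """Replace each placeholder with the id the package manager assigned."""
    def substitute(match):
        name = match.group(1)
        if name not in installed_ids:
            raise KeyError(f"no installed pipeline for placeholder: {name}")
        return installed_ids[name]

    return PLACEHOLDER.sub(substitute, raw_pipeline)
```

An unknown placeholder raises an error instead of being left in place, so a broken reference is caught before the pipeline is loaded into Elasticsearch.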

Index Template

Asset Path: elasticsearch/index-template/*.json

The Elasticsearch index template applies a template to indices that match certain index patterns. Inside the index template, the value index_patterns defines the matching indices. As the indexing convention is given either by the index manager or by the user, the package manager must be able to overwrite the index pattern.

On the Beats side today, the index template is generated from the fields.yml files. This gives the flexibility to generate the correct template for different Elasticsearch versions. As we can release packages for different versions of Elasticsearch independently, this is probably not needed anymore. I expect fields.yml to stick around as it's a nice way to create index templates and index patterns in one go. The package manager should be able to generate index templates and index patterns from all the combined fields.yml files.

An index template also relates to the ILM policy, as it can reference the ILM policy that should be applied to the indices created.
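Overwriting the index pattern could be as simple as the following sketch, which assumes the template has been loaded from its JSON file into a dict. The function name is hypothetical.

```python
import copy


def with_index_patterns(template: dict, index_patterns: list) -> dict:
    """Return a copy of the template whose index_patterns are replaced."""
    updated = copy.deepcopy(template)
    updated["index_patterns"] = list(index_patterns)
    return updated
```

Working on a copy keeps the template shipped in the package unchanged, so the same asset can be installed again with a different pattern.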

Prebuilt Index Templates contained in packages have to follow the Index Templates v2 specification. Index Templates generated by EPM also follow this specification.

ILM Policy

Asset Path: elasticsearch/ilm-policy/*.json

The Elasticsearch index lifecycle management (ILM) policy can be added and removed through the API. It is important that the id / name of the ILM policy matches what was configured in the index template.

The setup of ILM also requires creating an alias and a write index. It's important that this happens before the first data is ingested. More details on this can be found in the rollover documentation.

An ILM policy still applies even if it is created after the template and the write index. But if data is ingested before the template and the write index exist, this will break the system.
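The pieces that have to line up can be sketched as follows. The setting names index.lifecycle.name and index.lifecycle.rollover_alias and the is_write_index alias flag are standard Elasticsearch options; the function names and the -000001 suffix convention for the first index are assumptions of this sketch, which only builds request bodies and sends nothing.

```python
def lifecycle_settings(policy_name: str, rollover_alias: str) -> dict:
    """Index settings for the index template that tie new indices to the policy."""
    return {
        "index.lifecycle.name": policy_name,
        "index.lifecycle.rollover_alias": rollover_alias,
    }


def bootstrap_index(alias: str) -> tuple:
    """Return (index_name, body) for the first write index behind the alias."""
    index_name = f"{alias}-000001"  # numeric suffix so that rollover can increment it
    body = {"aliases": {alias: {"is_write_index": True}}}
    return index_name, body
```

Both the template carrying these settings and the bootstrapped write index have to exist before the first document is sent to the alias.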

Rollup Job

Asset Path: elasticsearch/rollup-job/*.json

A rollup job defines how metric data is rolled up. One special thing about rollup jobs is that they can only be created if the index and data already exist. Rollup jobs can potentially also be generated from fields.yml, but it should be up to the package creator to do this.

Rollup jobs today do not support rollup templates, which would be nice to have (see the related discussion). This would allow separating the creation of the template from the actual creation / start of the job.

A rollup job depends on an index pattern and has a target index. The package manager should potentially be able to configure this.

Index Data

Asset Path: elasticsearch/index/*.json

Index data is data that should be written into Elasticsearch. The data format is expected to be in the Bulk format.

If the user can configure the index, the package manager should potentially be able to overwrite / prefix the index fields inside the loaded data.
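Prefixing the index names in Bulk-format data could be sketched as below. It assumes newline-delimited JSON in which every action line except delete is followed by one source line; the function name is hypothetical.

```python
import json


def prefix_bulk_indices(ndjson: str, prefix: str) -> str:
    """Prefix the _index of every action line in Bulk-format data."""
    lines_out = []
    expect_source = False
    for line in ndjson.splitlines():
        if not line.strip():
            continue
        if expect_source:
            # Source documents are passed through untouched.
            lines_out.append(line)
            expect_source = False
            continue
        action = json.loads(line)
        op, meta = next(iter(action.items()))
        if "_index" in meta:
            meta["_index"] = prefix + meta["_index"]
        lines_out.append(json.dumps(action))
        # A delete action has no source line; all other actions have one.
        expect_source = op != "delete"
    # The Bulk format requires a trailing newline.
    return "\n".join(lines_out) + "\n"
```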

Loading of data can fail or partially fail. Because of this, it must be possible to handle failures.
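A partial failure can be detected from the Bulk API response, which carries a top-level errors flag and one result per action in items. The sketch below collects the failed items from an already parsed response; the function name is hypothetical, and what to do with the failures (retry, roll back, report) is left open.

```python
def failed_items(bulk_response: dict) -> list:
    """Return (position, action, error) for each failed item of a Bulk response."""
    if not bulk_response.get("errors"):
        return []
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is a single-key dict such as {"index": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append((position, action, result["error"]))
    return failures
```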

ML Jobs

Asset Path: elasticsearch/ml-job/*.json

Elasticsearch Machine Learning jobs can be created in Elasticsearch, assuming ML is enabled. As soon as a job is started, it creates results. Once results exist, a job can no longer simply be removed; the results must be removed first (more details needed).

Data Frames Transform

Asset Path: elasticsearch/data-frame-transform/*.json

Data Frame Transforms can be used to transform documents. There are a few things which are special about data frame transforms:

Destination index must exist before creation
Source index must exist before creation
If the data frame transform uses an ingest pipeline, it must exist before creation
Data frame transform must be stopped before deletion

Some of the above limitations might be removed in the future.
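The creation preconditions listed above could be checked before the transform is sent to Elasticsearch, roughly as in this sketch. It assumes a transform definition with source.index, dest.index, and an optional dest.pipeline, and that the caller already knows which indices and pipelines exist. The function name is hypothetical, and wildcard source patterns are not handled.

```python
def transform_blockers(transform: dict, existing_indices: set, existing_pipelines: set) -> list:
    """Return the reasons why the transform cannot be created yet."""
    blockers = []
    source = transform.get("source", {}).get("index", [])
    if isinstance(source, str):
        source = [source]
    for index in source:
        if index not in existing_indices:
            blockers.append(f"source index missing: {index}")
    dest = transform.get("dest", {})
    dest_index = dest.get("index")
    if dest_index not in existing_indices:
        blockers.append(f"destination index missing: {dest_index}")
    pipeline = dest.get("pipeline")
    if pipeline is not None and pipeline not in existing_pipelines:
        blockers.append(f"ingest pipeline missing: {pipeline}")
    return blockers
```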

Kibana

Kibana assets are the assets which are loaded into Kibana. Details can be found in the Kibana API docs. A large portion of the Kibana assets are saved objects. All saved objects are space aware, meaning the same object id with a different prefix can exist in multiple spaces.

Assuming the package manager generates the ids of the assets, it must be able to adjust the reference ids across dashboards, visualizations, searches, and index patterns.
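Adjusting the ids could look like the sketch below. It assumes each saved object is a dict with a top-level id and a references array of {id, name, type} entries, as exported by recent Kibana versions. The function name and the id mapping are hypothetical, and ids embedded elsewhere in the object (for example inside serialized JSON strings) are not covered.

```python
import copy


def remap_references(saved_object: dict, id_map: dict) -> dict:
    """Return a copy with the object id and its reference ids remapped."""
    updated = copy.deepcopy(saved_object)
    if updated.get("id") in id_map:
        updated["id"] = id_map[updated["id"]]
    for ref in updated.get("references", []):
        if ref.get("id") in id_map:
            ref["id"] = id_map[ref["id"]]
    return updated
```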

Dashboard

Asset Path: kibana/dashboard/*.json

A Kibana dashboard consists of multiple visualizations which it references.

Visualization

Asset Path: kibana/visualization/*.json

Visualizations are referenced inside dashboards and can reference a search object.

Search

Asset Path: kibana/search/*.json

The search object contains a saved search and is referenced by visualizations. A search object also references an index, for example "index": "filebeat-*". If we allow users to adjust indices, this would have to be adjusted in the search object.

Infrastructure UI Source

Asset Path: kibana/infrastructure-ui-source/*.json

The Infrastructure UI source is used to tell the Logs and Metrics UI which indices to query for data and how to visualize the data.

The asset is, like dashboards and visualizations, just a saved object and can be loaded the same way. But the Logs UI could also add an API for a tighter integration. At the moment there is no selection in the UI to change / switch the source, but it can be triggered through URL parameters.

Dataset

Asset Path: dataset/{dataset-name}/{package-structure}

All datasets are defined inside the dataset directory. An example is the access dataset of the nginx package. Inside each dataset, the same structure that is defined for the overall package is repeated. In general, ingest pipelines and field definitions are only expected inside datasets. A dataset is basically a template for an input.

manifest.yml

Each dataset must contain a manifest.yml. It contains all information about the dataset and how to configure it.

# Needs to describe the type of this input. Currently either metric or log
type: metric

# Each input can be in its own release status
release: beta

# If set to true, this will be enabled by default in the input selection
default: true

# Defines variables which are used in the config files and can be configured by the user / replaced by the package manager.
vars:
  -
    # Name of the variable that should be replaced
    name: hosts

    # Default value of the variable which is used in the UI and in the config if not specified
    default:
      ["http://127.0.0.1"]
    required: true

    # OS specific configurations!
    os.darwin:
      - /usr/local/var/log/nginx/error.log*
    os.windows:
      - c:/programdata/nginx/logs/error.log*


    # Below are UI Configs. Should we prefix these with ui.*?

    # Title used for the UI
    title: "Hosts lists"

    # Description of the variable which could be used in the UI
    description: Nginx hosts

    # A special type can be specified here for the UI Input document. By default it is just a
    # text field.
    type: password

  - name: period
    description: "Collection period. Valid values: 10s, 5m, 2h"
    default: "10s"
  - name: username
    type: text
  - name: password
    # This is the html input type?
    type: password

requirements:
  # Defines on which platforms this input is available
  platform: ["linux", "freebsd"]
  elasticsearch.processors:
    # If a user does not have the user_agent processor, they should still be able to install the package but not
    # enable the access input
    - name: user_agent
      plugin: ingest-user-agent
    - name: geoip
      plugin: ingest-geoip
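How a value is picked for a variable could be sketched as follows. The precedence used here (user value, then an OS-specific entry such as os.darwin, then default) is an assumption of this sketch and is not defined by the manifest format; the function name is hypothetical.

```python
def resolve_var(var: dict, platform: str, user_value=None):
    """Pick the value for a variable from a dataset manifest entry.

    Assumed precedence: user value, OS-specific value, default.
    """
    if user_value is not None:
        return user_value
    os_key = f"os.{platform}"
    if os_key in var:
        return var[os_key]
    if "default" in var:
        return var["default"]
    if var.get("required"):
        raise ValueError(f"required variable has no value: {var['name']}")
    return None
```

With the hosts variable from the manifest above, resolve_var(hosts, "linux") falls back to the default, while resolve_var(hosts, "darwin") picks the os.darwin entry.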

fields

The fields directory contains all fields.yml files which are needed to build the full template. All fields related to the dataset must be in here, in one or multiple files.

An open question is how the fields for all the processors and autodiscovery are loaded.

docs

The docs for each dataset are combined with the overall docs. For datasets, it is encouraged to provide a data.json file as an example event.

agent/input

Agent input configuration for the input. By design, it is not an array but a single entry. The package manager will build a list out of it for the user.
