How to load / process fields.yml #50325

Status: Closed (wants to merge 3 commits)
@@ -0,0 +1,128 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/

import path from 'path';
import { readFileSync } from 'fs';
import { safeLoad } from 'js-yaml';

// This should become a copy of https://github.com/elastic/beats/blob/d9a4c9c240a9820fab15002592e5bb6db318543b/libbeat/mapping/field.go#L39
export interface Field {
keyword: string;
type: string;
required: boolean;
description: string;
fields: Field[];
}

test('test reading fields.yml', () => {
const yaml = readFileSync(path.join(__dirname, '/tests/fields.yml'));
const data = safeLoad(yaml.toString());

console.log(keyword());
data.forEach(data => {
Contributor:
Does this work? I'd do data.forEach(item => {...}), i.e. name the callback argument differently from the object you call forEach on, if only for readability.

console.log(data as Field);
Contributor (@skh, Nov 13, 2019):
as is a TypeScript keyword, similar to a cast in Java (though since all type information is transpiled away, it is also not quite like a cast in Java; it basically makes the red squiggly underlines go away when nothing else works). It does nothing here, and in particular it won't give you any added information in the console output.
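A minimal standalone illustration of that point (not code from this PR): `as` changes only the static type, so the value on both sides is the exact same object at runtime.

```typescript
interface Field {
  name: string;
  type: string;
}

const raw: unknown = { name: 'log.offset', type: 'long' };

// `as` performs no runtime conversion or check; it only tells the
// compiler to treat `raw` as a Field from here on.
const field = raw as Field;

// At runtime both names point to the same object:
console.log(field === raw); // true
```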

});
expect(1 + 2).toStrictEqual(3);
});

function getDefaultProperties() {
// TODO: to be extended with https://github.com/elastic/beats/blob/d9a4c9c240a9820fab15002592e5bb6db318543b/libbeat/template/processor.go#L364
// Currently no defaults exist
return {};
}

function keyword() {
const property = getDefaultProperties();
Contributor:
Another option is:

Suggested change:
-  const property = getDefaultProperties();
+  const property = Object.assign(getDefaultProperties(), { type: 'keyword' });

Member Author:
There will be quite a few more fields added to the properties object, but behind if/else conditions. So I wonder: is it discouraged in JS to use dot notation to do this directly?

Contributor:
One situation where this is necessary is if you type the object (in TypeScript) and the properties are mandatory. Then Object.assign() is a good way to prefill everything in one go. Other than that, I don't think so.
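A small side-by-side sketch of the two styles discussed above (the `KeywordProperty` type is hypothetical, just for illustration): dot notation builds the object up step by step, while Object.assign prefills a fully-typed object in one go.

```typescript
interface KeywordProperty {
  type: string;
}

// Dot notation: start empty and fill in step by step.
// A Partial type is needed here, since the object starts incomplete.
const viaDots: Partial<KeywordProperty> = {};
viaDots.type = 'keyword';

// Object.assign: all mandatory properties set in one expression,
// so the result satisfies the full type immediately.
const viaAssign: KeywordProperty = Object.assign({}, { type: 'keyword' });

console.log(viaDots.type === viaAssign.type); // true
```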


property.type = 'keyword';

return property;
}

function getBaseTemplate() {
Member Author:
@skh Our base elasticsearch template. Putting all these things for now but I'm aware it will not end up in the tests.

return {
// We need to decide which order we use for the templates
order: 1,
// To be completed with the correct index patterns
index_patterns: ['logs-nginx-access-abcd-*'],
settings: {
index: {
// ILM Policy must be added here
lifecycle: {
name: 'logs-default',
rollover_alias: 'logs-nginx-access-abcd',
},
// What should be our default for the compression?
codec: 'best_compression',
// What should be our default for the total fields limit?
mapping: {
total_fields: {
limit: '10000',
},
},
// This is the default from Beats? So far seems to be a good value
refresh_interval: '5s',
// Default in the stack now, still good to have it in
number_of_shards: '1',
// All the default fields which should be queried have to be added here.
// So far we add all keyword and text fields here.
query: {
default_field: ['message'],
},
// We are setting 30 because it can be divided by several numbers. Useful when shrinking.
number_of_routing_shards: '30',
},
},
mappings: {
// To be filled with interesting information about this specific index
_meta: {
package: 'foo',
},
// All the dynamic field mappings
dynamic_templates: [
// This makes sure all mappings are keywords by default
{
strings_as_keyword: {
mapping: {
ignore_above: 1024,
type: 'keyword',
},
match_mapping_type: 'string',
},
},
// Example of a dynamic template
{
labels: {
path_match: 'labels.*',
mapping: {
type: 'keyword',
},
match_mapping_type: 'string',
},
},
],
// As we define fields ahead, we don't need any automatic field detection
// This makes sure all the fields are mapped to keyword by default to prevent mapping conflicts
date_detection: false,
// All the properties we know from the fields.yml file
properties: {
container: {
properties: {
image: {
properties: {
name: {
ignore_above: 1024,
type: 'keyword',
},
},
},
},
},
},
},
// To be filled with the aliases that we need
aliases: {},
};
}
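The fields parsed from fields.yml in the test above would eventually need to be folded into the `properties` section of this template. A minimal sketch of that step, assuming dotted field names like `log.file.path` (the helper `addFieldToProperties` is hypothetical, not part of this PR):

```typescript
interface ParsedField {
  name: string;
  type: string;
}

// Hypothetical helper: expand a dotted field name into nested
// `properties` objects, the shape Elasticsearch mappings expect.
function addFieldToProperties(
  properties: Record<string, any>,
  field: ParsedField
): void {
  const parts = field.name.split('.');
  let current = properties;
  for (const part of parts.slice(0, -1)) {
    if (!current[part]) {
      current[part] = { properties: {} };
    }
    current = current[part].properties;
  }
  current[parts[parts.length - 1]] = { type: field.type };
}

const props: Record<string, any> = {};
addFieldToProperties(props, { name: 'log.offset', type: 'long' });
console.log(JSON.stringify(props));
// {"log":{"properties":{"offset":{"type":"long"}}}}
```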
@@ -0,0 +1,21 @@
- name: log.file.path
  type: keyword
  required: false
  description: >
    The file from which the line was read. This field contains the absolute path to the file.
    For example: `/var/log/system.log`.
- name: log.source.address
  type: keyword
  required: false
  description: >
    The source address from which the log event was read or sent.
- name: log.offset
  type: long
  required: false
  description: >
    The file offset the reported line starts at.
- name: stream
  type: keyword
  required: false
  description: >
    Log stream when reading container logs; can be 'stdout' or 'stderr'.