Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Filebeat] Add Shorewall module #12199

Closed
wants to merge 34 commits into from

Conversation

mdelapenya
Copy link
Contributor

I contributed Logstash patterns a few years ago (https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/firewalls#L86-L88), and wanted to migrate it following a more modern Beats approach.

Besides that, as part of the automation team, I wanted to know how a developer feels when contributing a new module to the project, trying to infer the pain points during the dev process, with the end goal of improving it within build & automation.

Type of change

  • New feature
  • Bugfix
  • This change requires a documentation update

What does this PR do?

This PR adds a filebeat module for Shorewall a Linux firewall, which follows Net Filter pattern for logging messages (Please see http://logi.cc/en/2010/07/netfilter-log-format/)

Why is this PR important?

This PR adds a new filebeat module, which is great in terms of contribution, but also adds a seed about a very basic tutorial or blog post about how to create a filebeat module from scratch.

Alongside, the PR is trying to represent a canonical manner of creating a new filebeat module. Therefore, all commits in here are written in a very isolated and educational manner, making easier to follow their history by reading commit messages from the initial one to the latest one. For that reason I tried to explain on each of them the rationale for that specific commit.

Development checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Main concerns about the development process in terms of dev experience

I tried to follow the tutorials but failed (my fault, it could have been a RTFM issue). I googled "create filebeat module" and clicked on the first result: https://www.elastic.co/guide/en/beats/devguide/current/filebeat-modules-devguide.html. Then I started following the steps in there. But failed at the very beginning when I had to create a fileset:

$ make create-fileset MODULE=shorewall FILESET=XXX'

At that point of the guide, it's not explained what a fileset is. I'm super lucky to have @sayden in the team so I hijacked him and his knowledge to help me during the entire process. That's why I completely skipped the tutorial at all (my RTFM fault again 😞).

Here I list with more topics that caused me trouble:

  • The guide does not explain that there is a make unit to run just the unit tests.
  • The guide just shows a way to run all tests: from filebeat directory make testsuite is an all-or-nothing approach, which takes a lot (more than 30 mins?).
  • The generation of the expected test output happens after tests run, which failed at any time because the documentation of the fields was incomplete or wrong. I had to disable those tests in the beats.py to be able to run the tests. In my mind, autogenerating a expected output must not be blocked by those checks, because it's autogenerated. I could have evolved it with every documentation iteration, but failed to it.
  • The grok pattern for shorewall already existed, so I expected to paste the %{SHOREWALL} keyword into the grok pattern and voilà! But no, I had to copy the representation of it, and parse it manually.
  • It was annoying to add a field type to the expression when I had already defined it in the expression:
- %{INT:nf_src_port}
+ %{INT:nf_src_port:int}
  • It was hard to understand the _meta/fields.yml approach to define fields. Also understanding that each fileset has its own fields.yml, which is superseeded by the parent one was not clear. I had to copy NATS module descriptors for that, and groomed/pruned what I need.
  • I missed fields in the ECS, like network.interface.in and network.interface.out, that's why I have to add custom shorewall.network.in/out.

@mdelapenya mdelapenya requested review from a team as code owners May 18, 2019 13:39
@elasticcla
Copy link

Hi @mdelapenya, we have found your signature in our records, but it seems like you have signed with a different e-mail than the one used in your Git commit. Can you please add both of these e-mails into your Github profile (they can be hidden), so we can match your e-mails to your Github profile?

@mdelapenya mdelapenya requested review from ruflin and sayden May 18, 2019 13:39
@sayden sayden changed the title [WIP] Contribute Shorewall module [WIP] Add Shorewall Filebeat module May 18, 2019
@sayden sayden added Team:Integrations Label for the Integrations team enhancement Filebeat Filebeat in progress Pull request is currently in progress. labels May 18, 2019
@sayden sayden changed the title [WIP] Add Shorewall Filebeat module [Filebeat] Add Shorewall module May 18, 2019
Copy link
Contributor

@sayden sayden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing with this 🙂 A handful of changes are needed and CI didn't pass the quick test. (probably a make update in beats/filebeat folder is missing and / or a mage fmt update in x-pack/filebeat folder. Overall looks good but expect more changes after you submit the changes indicated now and / or after making CI green

Shorewal log files
fields:
- name: in
type: keywork
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description of fields is "mandatory" and "extended" even if the potential target user is a Shorewall expert

@@ -0,0 +1 @@
May 28 17:23:25 myHost kernel: [3124658.791874] Shorewall:FORWARD:REJECT:IN=eth2 OUT=eth2 SRC=1.2.3.4 DST=1.2.3.4 LEN=141 TOS=0x00 PREC=0x00 TTL=63 ID=55251 PROTO=UDP SPT=5353 DPT=5353 LEN=121
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be nice to have a handful more example lines with different configurations to avoid "happy paths". I guess that this line is the default logging configuration but Filebeat must be able to handle also missing fields. For example, I don't know if it's possible that OUT=eth2 appears as OUT= if no interface is defined as out or it will simply dissapear from the line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added!

Shorewal network log files
fields:
- name: one
type: keywork
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actions look like keyword indeed, good catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm possibly going to change them...

},
{
"date": {
"field": "timestamp",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timezone is missing. Here you need the convert timezone plugin. Check another module for guidance about how to use it (or this PR #12079), it will make the document non JSON valid but this is expected when using the convert timezone plugin

filebeat/filebeat.reference.yml Outdated Show resolved Hide resolved
description: >
Shorewal log files
fields:
- name: in
Copy link
Member

@ruflin ruflin May 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this map to source.bytes and destination.bytes? https://www.elastic.co/guide/en/ecs/current/ecs-source.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently looking into it!

default:
- /var/log/shorewall.log
os.darwin:
- /usr/local/example/test.log*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't know the paths for darwin / windows lets rather remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My fault, removing those now

Run make create-module MODULE=shorewall from the X-Pack's filebeat root directory
The command used to execute the tests is:
TESTING_FILEBEAT_MODULES=shorewall INTEGRATION_TESTS=1 GENERATE=1 BEAT_STRICT_PERMS=false nosetests -v -d -s tests/system/test_modules.py
This change is required because the test execution is the responsible
of generating the expected output for tests, which seems a bit weird from
testing prespective to run a test to obtains the expected output.

Maybe adding a check phase for this validation could help
I followed NATS example, so that here I define the parent field froup (shorewall)
and the children fields will be defined at the fileset level
The LEN field represents "Total length of IP packet in bytes"
    This command will:
    - format and normalise code
    - include new module into the existing list of Filebeat modules
… include new module into the existing list of Filebeat modules
We found a bug in mage when collecting fields from x-path modules
@mdelapenya mdelapenya force-pushed the shorewall-module branch 2 times, most recently from c5947c2 to 48e92fd Compare May 21, 2019 07:50
Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skimmed through it and left some high level comments. Didn't deep dive into it yet as I see it's still in progress.

I wonder if you could also open an issue with the module Github issue template so we don't forget about other things for it like the Add Data UI which will not happen in this PR.

description: >
Reserved bits. The ECN flags "CWR" and "ECNE" will show up in the two
least significant bits of this field.
- name: time1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is time1 and time2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna change this, as that [x.y] is the timestamp coming from kernel

[float]
=== Compatibility

TODO: document with what versions of the software is this tested
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO should be completed or removed

description: >
Fields from Shorewall logs.
fields:
- name: network
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@webmat Potentially useful for ECS?

@@ -0,0 +1,3 @@
dashboards:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I don't see dashboards in this PR (yet?) this should probably be removed.

@mdelapenya
Copy link
Contributor Author

Related to #12238

@sayden sayden removed their assignment May 29, 2019
@mdelapenya
Copy link
Contributor Author

I'm gonna close this, as it's staled. Reopening if needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Filebeat Filebeat in progress Pull request is currently in progress. Team:Integrations Label for the Integrations team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants