Port two Harvester tests of log input to filestream in Golang #24190

kvch · 2021-02-23T16:22:23Z

What does this PR do?

This PR ports the following Python tests of the log input to filestream to Golang:

test_harvester.py:test_close_renamed
test_harvester.py:test_close_eof

The checks no longer rely on messages on Filebeat logs.

Why is it important?

Increased the test coverage of filestream:

File	Before	After
`filebeat/input/filestream/input.go`	0%	57.4%
`filebeat/input/filestream/prospector.go`	86.2%	90.0%
`filebeat/input/filestream/filestream.go`	67.1%	79.1%
`filebeat/input/filestream/fswatch.go`	51.9%	74.8%

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
~~- [ ] I have made corresponding changes to the documentation~~
~~- [ ] I have made corresponding change to the default configuration files~~
~~- [ ] I have added tests that prove my fix is effective or that my feature works~~
~~- [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.~~

elasticmachine · 2021-02-23T16:24:42Z

Pinging @elastic/agent (Team:Agent)

elasticmachine · 2021-02-23T16:33:58Z

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS

Expand to view the summary

Build stats

Build Cause: Pull request #24190 updated
Start Time: 2021-02-25T15:02:09.288+0000
Duration: 71 min 41 sec
Commit: 9ff606c

Test stats 🧪

Test	Results
Failed	0
Passed	13059
Skipped	2215
Total	15274

Trends 🧪

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test	Results
Failed	0
Passed	13059
Skipped	2215
Total	15274

filebeat/input/filestream/fswatch.go

filebeat/input/filestream/input_test.go

urso · 2021-02-23T23:00:24Z

filebeat/input/filestream/input_test.go

+	err = ioutil.WriteFile(testlogPath, []byte("new first log line\nnew second log line\n"), 0644)
+	if err != nil {
+		t.Fatalf("cannot write log lines to test file: %+v", err)
+	}


Maybe this is too much for this test case?Given the test name we just want to validate that the original harvester is closed when the file is renamed. Yet the test simulates a file rotation scenario. Although I agree we need that kind of test, I wonder if we want to have a separate set of tests that deal with different configuration-combinations and file rotation.

Indeed. But in the first round of moving tests to Golang I would prefer to keep this functionality. This helps me keep track of what tests I have ported from the log input. When I move on to supporting various log rotation methods, I am OK with renaming or moving the test around.

Maybe you want to add an 'indicator' comment that you do not forget this particular line/test. e.g. // TODO: ... or XXX: ...

filebeat/input/filestream/input_test.go

urso · 2021-02-24T13:23:46Z

Although it is nice to see the test coverage increased, these tests have quite a few moving pieces and might even be sensitive to timing. How about naming the file input_integration_test.go with the appropriate build tags.

I wonder how 'stable' the test coverage is. Depending on timing I would presume coverage going up or down a little.

The checks no longer rely on messages on Filebeat logs.

+100

Out of curiosity... how does the translation to go impact the runtime of the particular tests?

kvch · 2021-02-24T14:28:30Z

It is significantly faster in Golang:

Test	Golang [sec]	Python [sec]
close_eof	0.019	0.39
close_renamed	0.03	0.59

(The values above are the averages of 5 test runs in both languages.)

urso · 2021-02-24T15:48:11Z

It is significantly faster in Golang:

Nice... it's like a factor of 20

filebeat/input/filestream/input_integration_test.go

urso · 2021-02-24T21:42:12Z

be careful with the filename. Not sure, but testing.go might be included with all builds, meaning that we might link in the full go testing system in our binary by accident here (which introduces quite a many new CLI flags).

urso · 2021-02-24T21:43:55Z

filebeat/input/filestream/input_integration_test.go

+func TestFilestreamCloseRenamed(t *testing.T) {
+	if runtime.GOOS == "windows" {
+		t.Skip("renaming files while Filebeat is running is not supported on Windows")
+	}


Really? I wonder if the issue is due to use trying to write to the "original" path after the rotation.

I got this error when I ran os.Rename: The process cannot access the file because it is being used by another process.

I think no because the test tried to rename the file after it had written lines to the file. It rather fails because the filestream input keeps the file open as it is waiting for new events to show up. To test my theory I closed the file on EOF, afterwards, I was able to rename the file on Windows as well.

When reopening restarting the input, is that detected as a rename? Can we create a test to validate the expected behavior?

If someone is able to rename a file on Windows while Filebeat is keeping it open, it is detected by the checker goroutine of logReader. So the reader is going to behave correctly.

When reopening restarting the input, is that detected as a rename? Can we create a test to validate the expected behavior?

What is the expected behaviour? If the input is restarted, it does not make sense to check if a file was renamed because close_renamed applies to opened files that are renamed during Filebeat running a Harvester for.

When the input is restarted, it finds the state in the registry for the renamed file, unless path file_identity strategy is selected, it will not start a new Harvester for the file because it is still the same file it encountered during the previous run.

These test cases are not special, we either already have tests for it or going to have them when all tests of log input is ported to use filestream.

I still would expect that the meta-data in the registry are updated... meaning that the prospector detects the file was renamed. Maybe it is rather worth an unit test for the prospector

OK, I will address this in a separate PR.

filebeat/input/filestream/input_integration_test.go

…c#24190) This PR ports the following Python tests of the `log` input to `filestream` to Golang: - `test_harvester.py:test_close_renamed` - `test_harvester.py:test_close_eof` The checks no longer rely on messages on Filebeat logs. Increased the test coverage of `filestream`: File | Before | After ----- | ----- | ----- `filebeat/input/filestream/input.go` | 0% | 57.4% `filebeat/input/filestream/prospector.go` | 86.2% | 90.0% `filebeat/input/filestream/filestream.go` | 67.1% | 79.1% `filebeat/input/filestream/fswatch.go` | 51.9% | 74.8% (cherry picked from commit c3b5a17)

#24604) This PR ports the following Python tests of the `log` input to `filestream` to Golang: - `test_harvester.py:test_close_renamed` - `test_harvester.py:test_close_eof` The checks no longer rely on messages on Filebeat logs. Increased the test coverage of `filestream`: File | Before | After ----- | ----- | ----- `filebeat/input/filestream/input.go` | 0% | 57.4% `filebeat/input/filestream/prospector.go` | 86.2% | 90.0% `filebeat/input/filestream/filestream.go` | 67.1% | 79.1% `filebeat/input/filestream/fswatch.go` | 51.9% | 74.8% (cherry picked from commit c3b5a17)

botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Feb 23, 2021

kvch requested a review from urso February 23, 2021 16:22

kvch added the Team:Elastic-Agent Label for the Agent team label Feb 23, 2021

botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Feb 23, 2021

add tests

15d61e4

kvch force-pushed the test-filebeat-filestream-integration-tests branch from 437d52d to 15d61e4 Compare February 23, 2021 16:30

urso reviewed Feb 23, 2021

View reviewed changes