[CI][flaky] reporting for PRs in GitHub #21853

Merged: 24 commits, Oct 27, 2020

@v1v v1v commented Oct 15, 2020

What does this PR do?

Enable the flaky test analyser using the master branch as a source of truth for the time being.

Why is it important?

This automates the flaky test analyser process by:

  • Tracking test failures and flaky test failures.
  • Raising issues for flaky test failures, labelled flaky-test and ci-reported.
  • Reporting the flaky test status as a GitHub comment.

Contributors and reviewers can then see the full test context, including whether a failure has already been reported as flaky.

IMPORTANT: the current GitHub issue automation creates at most 3 issues per build. This avoids corner cases when a build suddenly produces a large number of flaky test failures.
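
The per-build cap can be sketched roughly as follows. This is a minimal Python illustration, not the apm-pipeline-library implementation: `MAX_ISSUES_PER_BUILD`, `plan_issue_actions`, and the `already_reported` set are hypothetical names, and the sketch assumes that comments on existing issues do not count toward the cap.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical constant mirroring the "at most 3 issues per build" rule.
MAX_ISSUES_PER_BUILD = 3


def plan_issue_actions(
    flaky_failures: Iterable[str],
    already_reported: Set[str],
    max_new_issues: int = MAX_ISSUES_PER_BUILD,
) -> Tuple[List[str], List[str], List[str]]:
    """Split flaky failures into: issues to create, issues to comment on, skipped."""
    to_create: List[str] = []
    to_comment: List[str] = []
    skipped: List[str] = []
    for test_name in flaky_failures:
        if test_name in already_reported:
            # An issue already exists, so only a new comment is needed.
            to_comment.append(test_name)
        elif len(to_create) < max_new_issues:
            to_create.append(test_name)
        else:
            # Cap reached: open no further issues in this build.
            skipped.append(test_name)
    return to_create, to_comment, skipped
```

With five flaky failures of which one is already tracked, this plan opens three issues, comments on one, and skips the last.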

What flaky test failures have been found?

With the flaky test analyser:

image
image

With the runbld watcher analyser:

image

Screenshots

When flaky tests are found but not reported

For instance, if two test failures are flaky, the comment looks like the screenshot below:

image

When flaky tests are found and reported

image

A GitHub issue is also created:

image

If the GitHub issue already exists, a new comment is added to it:

image

Issues

Consumes elastic/apm-pipeline-library#754 and elastic/apm-pipeline-library#791

Follow ups

  • Support release branches and 7.x

@v1v v1v added automation ci Team:Automation Label for the Observability productivity team v7.10.0 labels Oct 15, 2020
@v1v v1v self-assigned this Oct 15, 2020
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Oct 15, 2020
Jenkinsfile Outdated
- notifyBuildResult(prComment: true, slackComment: true, slackNotify: (isBranch() || isTag()))
+ notifyBuildResult(prComment: true,
+                   slackComment: true, slackNotify: (isBranch() || isTag()),
+                   analyzeFlakey: true, flakyReportIdx: "reporter-beats-pipeline-master")
v1v (Member, Author):

I'm not sure whether we should use one index per release branch, selected with CHANGE_TARGET for PRs and BRANCH_NAME for branches/tags.

If we do, we might need to create a new index every time there is a new release.

@cachedout , what are your thoughts?
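
The per-branch option raised in this comment could look roughly like the Python sketch below. The `reporter-beats-pipeline` prefix comes from the index name in the diff. The per-branch suffix, the `flaky_report_index` helper, and the fallback to `master` are assumptions made for illustration only.

```python
import re
from typing import Mapping

# Prefix taken from the index name in the diff; the per-branch suffix is an assumption.
INDEX_PREFIX = "reporter-beats-pipeline"


def flaky_report_index(env: Mapping[str, str]) -> str:
    """Pick the index for a build: target branch for PRs, own branch otherwise."""
    # Jenkins multibranch jobs set CHANGE_TARGET only for pull requests.
    # Falling back to "master" when neither variable is set is an assumption.
    branch = env.get("CHANGE_TARGET") or env.get("BRANCH_NAME") or "master"
    # Keep the suffix to lowercase letters, digits, dots, underscores, and hyphens.
    suffix = re.sub(r"[^a-z0-9._-]+", "-", branch.lower())
    return f"{INDEX_PREFIX}-{suffix}"
```

A PR that targets 7.x would resolve to `reporter-beats-pipeline-7.x`, which shows the drawback mentioned above: each new release branch needs its own index.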

Contributor:

Yeah, this is an issue we need to figure out. I think the thing to do here is to use one index per project and include a field that we can filter against when searching for documents. To achieve that, though, I think we need to change the way we actually generate the flaky test analysis.

For now, I propose that we keep it as is and address this in the test analyzer itself. Then we can come back and change the pipelines as needed.
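
The single-index approach proposed here could be queried along these lines. This is a hedged sketch that only builds an Elasticsearch search body. The `branch` field name and the `flaky_tests_query` helper are hypothetical, and the field is assumed to be mapped as `keyword` so that a `term` filter matches it exactly.

```python
import json
from typing import Any, Dict


def flaky_tests_query(branch: str, size: int = 100) -> Dict[str, Any]:
    """Build a search body that restricts results to one branch."""
    return {
        "size": size,
        "query": {
            "bool": {
                # `filter` clauses match without contributing to the score.
                "filter": [{"term": {"branch": branch}}]
            }
        },
    }


print(json.dumps(flaky_tests_query("7.x"), indent=2))
```

With this design a new release only adds documents with a new `branch` value, and no new index has to be created.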

elasticmachine commented Oct 15, 2020

💚 Build Succeeded

Build stats

  • Build Cause: [Pull request #21853 event]

  • Start Time: 2020-10-27T12:50:37.767+0000

  • Duration: 92 min 16 sec

Test stats 🧪

Test Results

  • Failed: 0
  • Passed: 16367
  • Skipped: 1344
  • Total: 17711

@v1v v1v added windows-2016 Enable builds in the CI for windows-2016 windows-2012 Enable builds in the CI for windows-2012 arm Enable builds in the CI for ARM testing labels Oct 15, 2020
v1v commented Oct 15, 2020

jenkins run the tests please

@v1v v1v added the macOS Enable builds in the CI for darwin testing label Oct 15, 2020
v1v commented Oct 15, 2020

Jenkins run the tests please

v1v commented Oct 16, 2020

Jenkins run the tests please

…laky-test-analyser

* upstream/master: (22 commits)
  [Ingest Manager] Prevent reporting ecs version twice (elastic#21616)
  [CI] Use google storage to keep artifacts (elastic#21910)
  Update docs.asciidoc (elastic#21849)
  Kubernetes leaderelection improvements (elastic#21896)
  Apply name changes to elastic agent docs (elastic#21549)
  Add 7.7.1 relnotes to 7.8 docs (elastic#21937) (elastic#21941)
  [libbeat] Fix potential deadlock in the disk queue + add more unit tests (elastic#21930)
  Refactor docker watcher to fix flaky test and other small issues (elastic#21851)
  [CI] Add stage name in the step (elastic#21887)
  [docs] Remove extra word in autodiscover docs (elastic#21871)
  [CI] lint stage doesn't produce test reports (elastic#21888)
  Add tests of reader of filestream input (elastic#21814)
  [Ingest Manager] Use local temp instead of system one (elastic#21883)
  chore: delegate variant pushes to the right method (elastic#21861)
  [CI] kind setup fails sometimes (elastic#21857)
  Fix panic on add_docker_metadata close (elastic#21882)
  Add tests for fileProspector in filestream input (elastic#21712)
  [Filebeat][okta] Fix okta pagination (elastic#21797)
  Add cloud.account.id into add_cloud_metadata for gcp (elastic#21776)
  Fix syslog RFC 5424 parsing in CheckPoint module (elastic#21854)
  ...
elasticmachine commented Oct 19, 2020

💚 Flaky test report

Tests succeeded.

Test stats 🧪

Test Results

  • Failed: 0
  • Passed: 16367
  • Skipped: 1344
  • Total: 17711

v1v commented Oct 22, 2020

Jenkins run the tests please

@v1v v1v removed arm Enable builds in the CI for ARM testing macOS Enable builds in the CI for darwin testing labels Oct 22, 2020
v1v commented Oct 22, 2020

Jenkins run the tests please

Build and Test / Filebeat Windows / test_close_renamed – test_harvester.Test

Oct 21, 2020 @ 04:00:25.019	Build and Test / Libbeat / Libbeat oss / TestClientPublishEventKerberosAware – elasticsearch
v1v added 10 commits October 26, 2020 18:20
…beats into feature/support-flaky-test-analyser

* 'feature/support-flaky-test-analyser' of github.com:v1v/beats:
  [CI] Enable winlogbeat (elastic#22142)
This reverts commit ba9957b.
…laky-test-analyser

* upstream/master:
  Add new licence status: expired (elastic#22180)
  [filebeat][okta] Make cursor optional for okta and update docs (elastic#22091)
  Add documentation of filestream input (elastic#21615)
  [Ingest Manager] Skip flaky gateway tests elastic#22177
  [CI] set env variable for the params (elastic#22143)
  Fix zeek connection pipeline (elastic#22151)
  Fix Google Cloud Function configuration file issues (elastic#22156)
  Remove old TODO on kubernetes node update (elastic#22074)
…laky-test-analyser

* upstream/master:
  Add new licence status: expired (elastic#22180)
  [filebeat][okta] Make cursor optional for okta and update docs (elastic#22091)
  Add documentation of filestream input (elastic#21615)
  [Ingest Manager] Skip flaky gateway tests elastic#22177
  [CI] set env variable for the params (elastic#22143)
  Fix zeek connection pipeline (elastic#22151)
  Fix Google Cloud Function configuration file issues (elastic#22156)
  Remove old TODO on kubernetes node update (elastic#22074)
…beats into feature/support-flaky-test-analyser

* 'feature/support-flaky-test-analyser' of github.com:v1v/beats:
@v1v v1v marked this pull request as ready for review October 27, 2020 12:51
@v1v v1v requested a review from a team as a code owner October 27, 2020 12:51
@v1v v1v merged commit 44bdabc into elastic:master Oct 27, 2020
v1v added a commit to v1v/beats that referenced this pull request Oct 27, 2020
v1v added a commit to v1v/beats that referenced this pull request Oct 27, 2020
v1v added a commit to v1v/beats that referenced this pull request Oct 29, 2020
* upstream/master: (93 commits)
  Update commands used in the quick start (elastic#22248)
  Add interval documentation to `monitor` metricset (elastic#22152)
  [CI] enable x-pack/packetbeat in the CI (elastic#22252)
  Fix awscloudwatch input documentation (elastic#22247)
  Add support for different Azure Cloud environments in the metricbeat azure module (elastic#21044)
  [CI] support windows-2008-r2 (elastic#19791)
  protect against accessing undefined variables in sysmon module (elastic#22236)
  [CI] archive only if failed steps (elastic#22220)
  Add pe fields to Sysmon module (elastic#22217)
  [CI][flaky] Support 7.x branches and PRs (elastic#22197)
  Perfmon - Fix regular expressions to comply to multiple parentheses in instance name and object (elastic#22146)
  ci: improve linting speed (elastic#22103)
  Move cloudfoundry tags with metadata to common metadata fields (elastic#22150)
  [Docs] Update custom beat docs (elastic#22194)
  [Ingest Manager] Agent fix snapshot download for upgrade (elastic#22175)
  Update shared-autodiscover.asciidoc (elastic#21827)
  [DOCS] Warn about compression and Azure Event Hub for Kafka (elastic#21578)
  [CI][flaky] reporting for PRs in GitHub (elastic#21853)
  [Packetbeat] Create x-pack magefile (elastic#21979)
  [Elastic Agent] Fix deb/rpm installation (elastic#22153)
  ...
Labels
automation ci Team:Automation Label for the Observability productivity team v7.10.0
4 participants