[CI] Implement queue-based approach to running kibana test suites #44934

Closed

brianseeders opened this issue Sep 5, 2019 · 1 comment

Labels: Feature:CI (Continuous integration), impact:low (low impact on product quality/strength), loe:small (Small Level of Effort), Team:Operations

@brianseeders (Contributor)

Revisit this work after #44925 completes.

My initial testing for this saw full pipeline runtimes as low as 41 minutes, and this approach included creating a tarball of the entire post-build workspace and transferring it to other machines via GCS. With our latest pipeline work, we may actually be able to do this effectively without transferring workspaces, which could bring our pipeline time below 40 minutes.

General Idea (repeated for OSS and X-Pack):

  1. Create a list of all test suites (e.g. individual files with tests in them), along with the config file needed to run each suite
  2. Have multiple, parallel workers keep pulling from the queue of suites until all of the tests are complete
  3. Re-run any suite with failed tests, to combat flaky tests
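The three steps above can be sketched as a worker pool draining a shared queue. This is a minimal illustrative sketch, not Kibana's actual CI code: the suite names, the `run_suite()` stub, and the retry policy are all hypothetical stand-ins for real FTR invocations.

```python
import queue
import threading

def run_suite(suite, attempt):
    """Placeholder for invoking the real test runner (e.g. FTR) for one suite.

    Simulates a flaky suite that fails on its first attempt only, so the
    re-run step below has something to exercise.
    """
    return not (suite == "oss/flaky_suite" and attempt == 0)

def run_all(suite_names, workers=4, max_attempts=2):
    # Step 1: build the queue of suites, each tagged with an attempt counter.
    work = queue.Queue()
    for name in suite_names:
        work.put((name, 0))

    results, lock = {}, threading.Lock()

    # Step 2: parallel workers keep pulling from the queue until it drains.
    def worker():
        while True:
            try:
                suite, attempt = work.get_nowait()
            except queue.Empty:
                return
            passed = run_suite(suite, attempt)
            # Step 3: re-queue a failed suite to combat flakiness, up to a cap.
            if not passed and attempt + 1 < max_attempts:
                work.put((suite, attempt + 1))
            else:
                with lock:
                    results[suite] = passed

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With this shape, a suite that fails once and then passes on re-run is reported as passing, which is the intended behavior for flaky tests; a suite that fails `max_attempts` times is reported as a real failure.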

Considerations

  • Invocations of FTR have 5-10 seconds of overhead, because of on-the-fly Babel transpiling, and this quickly adds up with lots of invocations for small test suites
  • Starting up Elasticsearch and Kibana is time-consuming, so workers should not switch between suites requiring different test configs too often. Ideally, they also wouldn't spend a minute spinning up ES+Kibana just to run one test suite for 5 seconds.
  • GitHub checks broken down by ciGroup wouldn't make sense anymore, as ciGroups will be gone. There are too many test suites (hundreds) to show them individually. What should be shown instead?
  • If we're still primarily using the Jenkins UI for status, etc., these changes will make its existing problems worse
  • Even some individual test suites take a long time to run (e.g. functional.apps.maps.es_geo_grid_source) and should be further broken up or otherwise prioritized
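One way to address the startup-cost consideration is to batch queued suites by their test config, so a worker only restarts ES+Kibana when the config actually changes. This is a sketch under assumed inputs: the config paths and suite paths below are illustrative, and a real scheduler would also weigh batch sizes against worker count.

```python
from itertools import groupby

def plan_batches(suites):
    """Group (config_path, suite_path) pairs into per-config batches.

    A worker that processes one batch end-to-end pays the ES+Kibana
    startup cost once per config instead of once per suite.
    """
    ordered = sorted(suites, key=lambda s: s[0])  # stable sort keeps suite order
    return [
        (config, [suite for _, suite in group])
        for config, group in groupby(ordered, key=lambda s: s[0])
    ]
```

Batching like this also amortizes the 5-10 seconds of FTR/Babel startup overhead across every suite in the batch, rather than paying it per tiny suite.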
@brianseeders added the Team:Operations and Feature:CI labels Sep 5, 2019
@elasticmachine (Contributor)

Pinging @elastic/kibana-operations

@tylersmalley added and removed labels Oct 11, 2021
@exalate-issue-sync bot added the impact:low and loe:small labels Feb 16, 2022