Flaky test: appsearch/stats metricset integration and system tests #19739
Comments
Pinging @elastic/integrations-services (Team:Services) |
Indeed, I tried pulling the docker image used by the tests from the
If I try to build the docker image for the
@jsoriano any ideas what might be going on? Are you the person to ping about the |
@ycombinator it is ok if you ping me about these problems 🙂 Compose logging is a bit too verbose and confusing. We always try to pull the image before trying to build it, so if it is available in the registry we use it, and if not, we build it. For any image that is not available in the registry, the manifest-not-found error is going to appear, but it is ignored, and then the image is built. Actually, it did build the image at the end:
This error is more significant:
Container finished on startup. This usually means that the container was actually started (thus the image was there), but stopped before reaching a healthy state. This may be caused by a lack of resources or some other source of flakiness in the image or service startup. |
Ah, thanks @jsoriano. I didn't realize I could ignore the error from the registry, so I didn't look further. Thanks for clarifying this! Yes, we have had issues with this container taking too long to start up in the past. That's why we added a very long timeout here:
But evidently either that is not enough time in CI, or something more fundamental breaks when the container tries to start up. So now we need to figure out what's going on during startup. |
Yes, this service usually takes some time to start. But take into account that the error in this case means that the container stopped, so it wouldn't have reached a healthy state no matter how long we waited. |
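To illustrate the distinction between "still starting" and "stopped", here is a minimal Python sketch of a wait loop that fails fast when the container exits instead of waiting out the timeout. It is not the code the Beats test framework uses; `wait_until_healthy`, `ContainerExited`, and the state strings `"starting"`, `"healthy"`, and `"exited"` are hypothetical names chosen for the example, and `get_state` stands in for whatever call reports the container state.

```python
import time
from typing import Callable


class ContainerExited(Exception):
    """Hypothetical error: the container stopped before becoming healthy."""


def wait_until_healthy(
    get_state: Callable[[], str],
    timeout_s: float,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll get_state() until it reports "healthy".

    Raises ContainerExited as soon as the state is "exited", because a
    longer timeout cannot help a container that has already stopped.
    Raises TimeoutError if the deadline passes while still starting.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "healthy":
            return
        if state == "exited":
            raise ContainerExited("container stopped before reaching a healthy state")
        if clock() >= deadline:
            raise TimeoutError("container did not become healthy in time")
        sleep(poll_s)
```

With a loop shaped like this, the two failure modes show up as different errors, which makes it clearer whether raising the timeout is worth trying.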
Good point. So something is more fundamentally broken during this container's start up then. It's not just taking too long to start up. Makes sense, thanks. |
Take into account that the container may be "logically" fine, but get killed by OOM for example. |
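One way to check for this is to look at the `State` object reported by `docker inspect` for the stopped container, which includes `Running`, `OOMKilled`, and `ExitCode` fields. The sketch below is only an illustration: `classify_exit` and its return labels are made up for this example, and treating exit code 137 (128 + SIGKILL) as a possible OOM kill is a heuristic, not a certainty.

```python
def classify_exit(state: dict) -> str:
    """Classify a container from the "State" object of `docker inspect`.

    The labels returned here are hypothetical, chosen for this example.
    """
    if state.get("Running"):
        return "running"
    if state.get("OOMKilled"):
        # Docker recorded that the container was killed for exceeding memory.
        return "oom-killed"
    exit_code = state.get("ExitCode", 0)
    if exit_code == 137:
        # 137 = 128 + 9 (SIGKILL); an OOM kill is one possible cause.
        return "killed"
    if exit_code != 0:
        return "failed"
    return "exited-cleanly"
```

For example, the dict passed in could come from `json.loads` on the output of `docker inspect --format '{{json .State}}' <container>`.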
Here is the issue, from the logs of the appsearch container:
|
Bumping up the appsearch image version to
However, bumping it up to @ioanatia any ideas why there is no |
Flaky Test
TestFetch (integration test) and test_stats (system test)
beats/x-pack/metricbeat/module/appsearch/stats/stats_integration_test.go, line 18 in a00eaba
beats/x-pack/metricbeat/module/appsearch/test_appsearch.py, line 14 in a00eaba
master
The test occasionally times out while bringing up the Docker container for appsearch. This error from the stack trace of the system test (see below) seems relevant:
Stack Trace
Integration test
System test