
Refactor some integration tests #1359

Merged
81 commits merged into spring-cloud:main on Jun 5, 2023

Conversation

@wind57 (Contributor) commented May 31, 2023

No description provided.

wind57 and others added 30 commits December 4, 2021 07:59
/**
* equivalent of 'docker system prune', but for crictl.
*/
public static void systemPrune() {

@wind57 (Contributor Author) commented:

I think there has to be an honest opinion about this: I was an idiot, this should never have existed. The logs were misleading and I got trapped into thinking that this solved our problem. It did not.

Our "system prune" fix, while it does not hurt anything, does not solve much either. I started digging and found this: k3s-io/k3s#1857. The problems we are seeing in those logs are related to k3s itself and have no impact on the cluster.

I'm dropping it.

@@ -612,38 +649,6 @@ private static String labelSelector(Map<String, String> labels) {
return labels.entrySet().stream().map(en -> en.getKey() + "=" + en.getValue()).collect(Collectors.joining(","));
}

private String events() {

@wind57 (Contributor Author) commented:

These were added while I was trying to understand what was wrong with the cluster when integration tests were failing; now that I know, we do not need them. In all fairness, they never helped, they were useless. Worse, looking into these logs I saw "a problem" (that is why systemPrune was introduced), but that was never the real cause of the failing tests.

@wind57 (Contributor Author) commented Jun 2, 2023

This PR has two purposes:

  • clean up the debugging code that I added while trying to understand why integration tests sometimes fail; also clean up code that is not needed, i.e. the equivalent of docker system prune.

  • the second part is far more interesting, imho.

It partially changes the way we do integration tests, for at least one package. The easiest way to explain is via an example: consider two tests, testA and testB.

In testA:

  • deploy our test image in namespace default
  • deploy busybox in namespace one
  • deploy wiremock in namespace two
  • make discovery work only based on namespace one (selective namespaces)
  • assert a few things
  • clean-up all things created above

In testB:

  • deploy our test image in namespace default
  • deploy busybox in namespace one
  • deploy wiremock in namespace two
  • make discovery work based only on two namespaces, one and two (selective namespaces)
  • assert a few things
  • clean-up all things created above

The only difference between these two tests is what selective namespaces we use for assertions.


If that is the case, why deploy and redeploy everything again and again when the only thing that needs to change is ONE environment variable (the selective namespaces one)? Instead, we can patch the existing deployment (it will trigger a restart of the pods) and then make our assertions.

This cuts the test run time by tens of minutes.

The downside is that the code is slightly harder to follow and may be a bit more cumbersome to debug, but considering the massive gains we get on the pipeline, this is nothing, imo.


Tests have not changed logically one bit; all that changed is that I now run a few of them in order and issue kubectl patch deployment ... (via java code) to add or override environment variables specific to that particular test (see the sketch at the end of this comment).

This is also the reason our pipeline build fails at times: it takes a lot of time to schedule these deployments and give them enough resources on the cluster we get on GitHub Actions. The issue is CPU, we only get 2 cores there.
I was able to reproduce the same issue on my wife's laptop (Intel) by limiting the cluster to two cores only. We are not bulletproof with this solution, but it is far better: instead of deploying/deleting manifests 6 times (for 6 integration tests), we now do it only twice (still those 6 integration tests, but manifests are installed/deleted only twice, the rest is patching them).
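
For illustration only, here is a minimal sketch of what such a patch can look like with the fabric8 client; the deployment name, container name, and environment variable below are hypothetical, and the actual test support code may use a different client or helpers:

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import io.fabric8.kubernetes.client.dsl.base.PatchContext;
import io.fabric8.kubernetes.client.dsl.base.PatchType;

class DeploymentPatchSketch {

	// strategic merge patch: 'containers' and 'env' entries are merged by 'name',
	// so this adds or overrides a single environment variable on the existing deployment
	private static final String PATCH_BODY = """
			{
			  "spec": { "template": { "spec": { "containers": [{
			    "name": "discovery-it-container",
			    "env": [{ "name": "SPRING_CLOUD_KUBERNETES_DISCOVERY_NAMESPACES_0", "value": "one" }]
			  }]}}}
			}
			""";

	// patching the pod template triggers a rolling restart of the pods, so the next
	// test sees the new environment variable without re-installing any manifests
	static void patchDeployment(KubernetesClient client) {
		client.apps().deployments()
			.inNamespace("default")
			.withName("discovery-it-deployment")
			.patch(PatchContext.of(PatchType.STRATEGIC_MERGE), PATCH_BODY);
	}

	public static void main(String[] args) {
		try (KubernetesClient client = new KubernetesClientBuilder().build()) {
			patchDeployment(client);
		}
	}
}

The same idea works with kubectl patch deployment on the command line; doing it via java code simply keeps everything inside the tests.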

@@ -51,7 +54,8 @@
/**
* @author wind57
*/
class KubernetesClientDiscoveryClientIT {
@TestMethodOrder(MethodOrderer.OrderAnnotation.class)

@wind57 (Contributor Author) commented:

Tests now need to run in a certain order: because we patch the deployment with new env variables, we need to preserve order for correctness.
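
For context, this is the standard JUnit 5 ordering mechanism; a minimal sketch (the test method names here are hypothetical):

import org.junit.jupiter.api.MethodOrderer;
import org.junit.jupiter.api.Order;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.TestMethodOrder;

@TestMethodOrder(MethodOrderer.OrderAnnotation.class)
class OrderedDiscoverySketchIT {

	@Test
	@Order(1)
	void singleNamespaceWorks() {
		// runs first, against the manifests as initially deployed
	}

	@Test
	@Order(2)
	void twoNamespacesWork() {
		// runs second, after the deployment has been patched with new env variables
	}
}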

@wind57 changed the title from "System prune improvements" to "Refactor some integration tests" on Jun 3, 2023
@wind57 marked this pull request as ready for review on June 3, 2023 05:55
@wind57 (Contributor Author) commented Jun 3, 2023

@ryanjbaxter ready to be looked at, thank you

@ryanjbaxter merged commit b0b26c7 into spring-cloud:main on Jun 5, 2023
@ryanjbaxter added this to the 3.0.4 milestone on Jun 5, 2023