Skip to content
This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Integration testing on master #568

Open
foxish opened this issue Nov 30, 2017 · 24 comments
Open

Integration testing on master #568

foxish opened this issue Nov 30, 2017 · 24 comments

Comments

@foxish
Copy link
Member

foxish commented Nov 30, 2017

Elaborating on #545 (comment)
We need to separate our integration testing out - so we can make it run any arbitrary branch - including master.

cc @apache-spark-on-k8s/contributors @ifilonenko @erikerlandson @ssuchter @kimoonkim

@erikerlandson
Copy link
Member

are you thinking of factoring the IT into a separate repo?

@foxish
Copy link
Member Author

foxish commented Nov 30, 2017

I think that might be one way. If we just have the integration tests be built separately, as a project by itself, and then invoke it against master. It might not be the most expedient - open to other suggestions as well.

@erikerlandson
Copy link
Member

One advantage of that would be that it would be convenient to make separate branches, having different IT subsets targeted to various upstreaming stages of functionality

@erikerlandson
Copy link
Member

Would we then have to re-integrate them into the upstream so it has the correct IT?

@foxish
Copy link
Member Author

foxish commented Nov 30, 2017

It might be desirable for upstream also to have the same decoupled integration testing, similar to YARN/Mesos.

@erikerlandson
Copy link
Member

+1, that makes it a two-fer

@erikerlandson
Copy link
Member

maybe even a 3-fer, it keeps more code out of the main upstream PRs

@erikerlandson
Copy link
Member

cc @felixcheung

@tnachen
Copy link

tnachen commented Nov 30, 2017

Having it outside of the main repo is easier and faster to iterate, which is why we did it also for Mesos..

@erikerlandson
Copy link
Member

I added a tracking issue for jenkins-infra: https://github.com/ucbrise/jenkins-infra/issues/86

@felixcheung
Copy link

when you say decoupled testing, same as YARN/mesos, do you mean the test and source will be completely outside of the Apache Spark git repo?

(because as far as I can see, there is no open source YARN integration tests for Spark, for example)

@ssuchter
Copy link
Member

ssuchter commented Dec 1, 2017 via email

@ifilonenko
Copy link
Member

I agree with the separate repo. However, this will require a group of separate group to maintain and monitor this repo as well.

@felixcheung
Copy link

felixcheung commented Dec 1, 2017

If it's about fast iteration, I'd say go for it.
At some point though, I'd agree with @ssuchter OSS (or even, under ASF) and same branch/keep in-sync is very important.

@foxish
Copy link
Member Author

foxish commented Dec 1, 2017

I agree - we shouldn't look at it as a separate thing long term. It's expedient for now to separate it out - to get some confidence in upstream by running tests against it. Eventually, it should live alongside the rest of the code upstream, but I do think it needs some hardening before that, and we can't hit that for 2.3

@foxish
Copy link
Member Author

foxish commented Dec 1, 2017

@mccheah
Copy link

mccheah commented Dec 1, 2017

We probably want to use the testing version provided by #521. There's one more change required to make this pass on Amplab Jenkins. I'll patch this branch and bring it up to speed with our mainline, then we should think about how we're going to move the code over.

Automation is something to consider as well. We probably want to run these tests in the separate repository on every commit to master that affects Kubernetes. We could run the suite on every commit to master in general as well.

@foxish
Copy link
Member Author

foxish commented Dec 6, 2017

I think we can break the integration testing out further after #521 is merged.

The immediate next steps I see are:

For now, it would still depend on minikube. In the future, we can do even better.

  • Separate out the image building to use a registry. In minikube, we'll build and push to the minikube local registry which can be enabled as an addon.
  • Enable the integration-testing JAR to take additional parameters for K8s cluster, docker registry, and runnable spark distro.

This would enable running on other k8s clusters, enabling for example, stress testing on GKE if we want to do that, and also let other users test the release on their own clusters using our integration tests.

cc @ifilonenko @mccheah

@mccheah
Copy link

mccheah commented Dec 6, 2017

Are we still planning to run integration tests in the Jenkins PR builder on master? Or was the plan to run the tests in our own automated system?

@foxish
Copy link
Member Author

foxish commented Dec 6, 2017

Not sure if we'll get that done in time.
With this split, even if we don't get the integration tests checked in for Spark 2.3 or the spark prb changes, we'd still have the option to run it ourselves.

@mccheah
Copy link

mccheah commented Dec 6, 2017

@foxish I was thinking about this a little more and am concerned that this setup isn't too intuitive. (edit: the proposed setup being #568 (comment))

What these semantics are basically saying is that the main repository depends on the tests. This communicates that the main repository is checking the correctness of the tests, not the other way around.

To illustrate this point, consider the situation where a developer writes a new integration test for a feature they just merged into master. Consider the workflow:

  1. Developer writes unit tests and main feature in apache/master. They don't update the integration test version because integration tests aren't published yet. The change is merged.
  2. Developer writes and publishes integration tests that are broken for the feature.
  3. Developer updates main repository's integration test dependency to try the new tests.
  4. Developer finds the tests are wrong, so they have to publish a new integration test version.

This sequence of events makes it clear that the main repository is determining whether or not the integration tests are correct. But the more intuitive mental model is that the integration tests are validating the correctness of a given Spark artifact.

Consider also the situation when the main repository is changed such that the assumptions made by the given integration test version are invalid. In this workflow, the developer is blocked from merging into master before the new integration test version is published. But in between the time that the new integration tests are published and the main change merges, that new integration test version is only applicable for a version of the main repository that doesn't exist yet.

So I think we want the integration tests to pull down and test the repository under test, not the other way around.

@ssuchter
Copy link
Member

ssuchter commented Dec 6, 2017 via email

@foxish
Copy link
Member Author

foxish commented Dec 6, 2017

So I think we want the integration tests to pull down and test the
repository under test, not the other way around.

I agree with that @mccheah. I wasn't proposing the opposite but I guess it got confusing since I went into a bit of the implementation as well in my comment. I don't think we want to "publish" the integration test jar as a separate entity. My point was that jenkins could take care of building the latest integration tests (from wherever they live, which may be for now - https://github.com/apache-spark-on-k8s/spark-integration), and the distro (from a particular PR), and then have the integration tests run on the distro. The point about jars was just to make the separation point and isn't really a necessity.

I don't mind having the integration tests actually pull down the repo and build it for now if it helps save time. I think it's the same whether jenkins does, or if we have the test logic itself do it.

@mccheah
Copy link

mccheah commented Dec 6, 2017

Ah, sorry for the misunderstanding there. I think we want the integration tests to target a git hash on either a remote repository to clone or a local checkout for development iteration. But maybe we should support testing against Spark distribution tarballs also because we can save build time by using a pre-published Spark distribution instead of building it ourselves in the integration test CI.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants