
[SPARK-34736][K8S][TESTS] Kubernetes and Minikube version upgrade for integration tests #31829

Closed · wants to merge 2 commits

Conversation

@attilapiros (Contributor) commented Mar 14, 2021

What changes were proposed in this pull request?

This PR upgrades the Kubernetes and Minikube versions for the integration tests and removes or updates the old code accordingly.

Details of these changes:

  • As discussed on the mailing list: updating the Minikube version from v0.34.1 to v1.7.3 and the Kubernetes version from v1.15.12 to v1.17.3.
  • Checking the Minikube version and failing with an explanation when the test is started on a version < v1.7.3 (see the sketch after this list).
  • Removing the minikube status checking code related to old Minikube versions.
  • In the Minikube backend, using fabric8's Config.autoConfigure() method to configure the Kubernetes client to use the minikube k8s context (as in one of Minikube's own examples).
  • Introducing a persistentVolume test tag: this is a temporary change to skip the PVC tests in the Kubernetes integration tests, as currently the PVC tests block the move to Docker as Minikube's driver (for details please check https://issues.apache.org/jira/browse/SPARK-34738).
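
For illustration, a minimal sketch of what such a version guard could look like (hypothetical object and method names; the actual check lives in Minikube.scala and its parsing may differ):

  // Hypothetical sketch of the Minikube version guard described above.
  // The real check lives in Minikube.getKubernetesClient (Minikube.scala).
  object MinikubeVersionCheck {
    private val MinVersion = (1, 7, 3)

    // `minikube version` prints a line like "minikube version: v1.7.2"
    private val VersionPattern = """v(\d+)\.(\d+)\.(\d+)""".r.unanchored

    def check(minikubeVersionOutput: String): Unit = minikubeVersionOutput match {
      case VersionPattern(major, minor, patch) =>
        val version = (major.toInt, minor.toInt, patch.toInt)
        assert(Ordering[(Int, Int, Int)].gteq(version, MinVersion),
          s"Unsupported Minikube version is detected: $minikubeVersionOutput. " +
            "For integration testing Minikube version 1.7.3 or greater is expected.")
      case _ =>
        throw new IllegalArgumentException(
          s"Cannot parse Minikube version from: $minikubeVersionOutput")
    }
  }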

Why are the changes needed?

With the currently suggested versions one can run into several problems without noticing that the Minikube/Kubernetes version is the cause.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

It was tested on Mac with a script that installs each Minikube version from v1.7.2 onward (v1.7.2 is included to test the negative case of the version check) and runs the integration tests; a rough sketch of such a script follows below.

It was started with:

./check_minikube_versions.zsh > test_log 2>&1
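
The script itself is not part of this excerpt; a rough, hypothetical sketch of the loop it runs (illustrative version list, macOS binary URL, and test entry point):

  #!/usr/bin/env zsh
  # Hypothetical sketch of check_minikube_versions.zsh. It installs each
  # Minikube version from v1.7.2 onward and runs the K8s integration tests.
  versions=(v1.7.2 v1.7.3 v1.8.2 v1.9.2)  # illustrative subset

  for version in $versions; do
    echo "=== Testing with Minikube $version ==="
    curl -Lo minikube \
      "https://storage.googleapis.com/minikube/releases/$version/minikube-darwin-amd64"
    chmod +x minikube && sudo mv minikube /usr/local/bin/minikube
    minikube start
    ./resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh
    minikube delete
  done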

There was only one build failure; the rest were successful:

$ grep "BUILD SUCCESS" test_log | wc -l
      26
$ grep "BUILD FAILURE" test_log | wc -l
       1

It was for Minikube v1.7.2 and the log is:

KubernetesSuite:
*** RUN ABORTED ***
  java.lang.AssertionError: assertion failed: Unsupported Minikube version is detected: minikube version: v1.7.2.For integration testing Minikube version 1.7.3 or greater is expected.
  at scala.Predef$.assert(Predef.scala:223)
  at org.apache.spark.deploy.k8s.integrationtest.backend.minikube.Minikube$.getKubernetesClient(Minikube.scala:52)
  at org.apache.spark.deploy.k8s.integrationtest.backend.minikube.MinikubeTestBackend$.initialize(MinikubeTestBackend.scala:33)
  at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.beforeAll(KubernetesSuite.scala:163)
  at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:212)
  at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
  at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
  at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.org$scalatest$BeforeAndAfter$$super$run(KubernetesSuite.scala:43)
  at org.scalatest.BeforeAndAfter.run(BeforeAndAfter.scala:273)
  at org.scalatest.BeforeAndAfter.run$(BeforeAndAfter.scala:271)
  ...

Moreover, I also ran a test with multiple k8s cluster contexts configured.

@SparkQA commented Mar 14, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40621/

@SparkQA commented Mar 14, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40621/

@rvesse (Member) left a comment:

LGTM; one comment about needing a doc update around the requirement for the context to be called minikube.

@fbalicchia commented Apr 6, 2021

Hi, I'm experiencing the same issue, and to resolve it I've applied the same changes. Could you please merge it?
Thanks

@attilapiros (Contributor, Author):

@fbalicchia Could you please share with us the error you ran into (the one this PR helped with)?
By the way, I cannot merge my own PR without approval from an expert of the area the PR belongs to, but your input will help the case.

@fbalicchia:

Using Minikube v1.18.1, the default driver is Docker, and at the moment of writing the masterUrl is not necessarily bound to port 8443: https://github.com/apache/spark/blob/master/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/backend/minikube/Minikube.scala#L69.
Instead, using the default config, or the factory with autoConfigure("minikube"), helps us.
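
For reference, a minimal sketch of configuring the fabric8 client against the minikube context (this uses the standard fabric8 Config.autoConfigure API; the exact wiring in Spark's Minikube backend may differ):

  import io.fabric8.kubernetes.client.{Config, DefaultKubernetesClient}

  // Build a client from the local kubeconfig, pinned to the "minikube"
  // context, instead of assuming the master URL is https://<ip>:8443.
  val config: Config = Config.autoConfigure("minikube")
  val client = new DefaultKubernetesClient(config)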

@@ -176,4 +180,4 @@ properties+=(
   -Dlog4j.logger.org.apache.spark=DEBUG
 )

-$TEST_ROOT_DIR/build/mvn integration-test -f $TEST_ROOT_DIR/pom.xml -pl resource-managers/kubernetes/integration-tests -am -Pscala-$SCALA_VERSION -P$HADOOP_PROFILE -Pkubernetes -Pkubernetes-integration-tests ${properties[@]}
+$TEST_ROOT_DIR/build/mvn install -f $TEST_ROOT_DIR/pom.xml -pl resource-managers/kubernetes/integration-tests $BUILD_DEPENDENCIES_MVN_FLAG -Pscala-$SCALA_VERSION -P$HADOOP_PROFILE -Pkubernetes -Pkubernetes-integration-tests ${properties[@]}
@attilapiros (Contributor, Author) commented on the diff:

The install phase comes after integration-test; see the relevant part of the default lifecycle in the Maven documentation (surrounding phases omitted):

  • package: take the compiled code and package it in its distributable format, such as a JAR.
  • pre-integration-test: perform actions required before integration tests are executed. This may involve things such as setting up the required environment.
  • integration-test: process and deploy the package if necessary into an environment where integration tests can be run.
  • post-integration-test: perform actions required after integration tests have been executed. This may include cleaning up the environment.
  • verify: run any checks to verify the package is valid and meets quality criteria.
  • install: install the package into the local repository, for use as a dependency in other projects locally.

With install, the local repo will be updated with the fresh jars, and in the case of --skip-building-dependencies the build will reuse those.
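
For example, because install comes after integration-test in the default lifecycle, a trimmed variant of the command from the diff above still runs the integration tests and then additionally installs the fresh jars (paths and profiles as in the diff, shortened here for readability):

  $TEST_ROOT_DIR/build/mvn install \
    -f $TEST_ROOT_DIR/pom.xml \
    -pl resource-managers/kubernetes/integration-tests \
    -Pkubernetes -Pkubernetes-integration-tests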

Member:

Well, I know that we need install sometimes, but for this one, I'm a little negative about converting integration-test to install because it pollutes the local repo while testing and reviewing someone else's PR. It's really painful for committers to clean up the local repo every time to avoid side effects on other PRs. Is this really inevitable?

@attilapiros (Contributor, Author):

I would say it is the other way around, assuming we stick to install in both PRs.
When we call install in the first PR, we can be sure the fresh jars will be used (even when the artifact we build references something outside of the Maven reactor), and when we move to the next PR, because of install we again have fresh jars.

@SparkQA commented Apr 8, 2021

Test build #137071 has finished for PR 31829 at commit de32835.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Apr 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41649/

@SparkQA commented Apr 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41649/

@SparkQA commented Apr 17, 2021

Test build #137499 has finished for PR 31829 at commit 1f594aa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Apr 17, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42073/

@github-actions:

Test build #758035203 for PR 31829 at commit 855273a.

@attilapiros (Contributor, Author):

rebased on master

@SparkQA commented Apr 17, 2021

Test build #137502 has finished for PR 31829 at commit 855273a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Apr 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42076/

@SparkQA commented Apr 17, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42076/

@SparkQA commented Apr 19, 2021

Test build #137566 has finished for PR 31829 at commit 855273a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Apr 19, 2021

Test build #137602 has finished for PR 31829 at commit 855273a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun (Member) left a comment:

@attilapiros. Is there an ETA for the blocker you mentioned? If there is no ETA, it would be great if we could proceed with a subset first. For the parts that are out of our control, we had better reduce the bottleneck or dependency as much as possible.

BTW, Spark 3.2 code freeze is July 1st, 2021, so we still have enough dev time.

@attilapiros (Contributor, Author):

@dongjoon-hyun actually I am pretty sure this helps (even with the PV tests skipped; once @shaneknapp sets the Minikube driver to Docker, that will fix the Kubernetes integration tests).

So could you please review this as it is?

@attilapiros changed the title from [WIP][SPARK-34736][K8S][TESTS] Kubernetes and Minikube version upgrade for integration tests to [SPARK-34736][K8S][TESTS] Kubernetes and Minikube version upgrade for integration tests on Apr 30, 2021
@dongjoon-hyun (Member):

Oh, I missed your last comment here. Very sorry, @attilapiros .

@dongjoon-hyun (Member) left a comment:

+1, LGTM. Thank you, @attilapiros and all.

Sorry for the back and forth. On second thought, all of my comments above can be handled later when we need them. Let's merge this and move forward. Thanks!

@attilapiros (Contributor, Author):

Thanks @dongjoon-hyun.

Tested locally:

[INFO] --- scalatest-maven-plugin:2.0.0:test (integration-test) @ spark-kubernetes-integration-tests_2.12 ---
Discovery starting.
Discovery completed in 446 milliseconds.
Run starting. Expected test count is: 25
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
Run completed in 16 minutes, 51 seconds.
Total number of tests run: 25
