Add wait-timeout flag to start command and refactor util/kubernetes #5121
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: medyagh. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Thank you for taking on the much-needed refactor.
@@ -241,6 +241,11 @@ export MINIKUBE_HOME="${TEST_HOME}/.minikube"
export MINIKUBE_WANTREPORTERRORPROMPT=False
export KUBECONFIG="${TEST_HOME}/kubeconfig"

# Build the gvisor image. This will be copied into minikube and loaded by ctr.
# Used by TestContainerd for Gvisor Test.
docker build -t gcr.io/k8s-minikube/gvisor-addon:latest -f testdata/gvisor-addon-Dockerfile out
Unrelated?
This was added by Priya to fix the gvisor test (it needs to run before the integration tests), but it was placed after the minikube cleanup, so it was not being used by the test.
This change only moved the command up in the script.
cmd/minikube/cmd/start.go
Outdated
@@ -148,6 +150,7 @@ func initMinikubeFlags() {
startCmd.Flags().String(networkPlugin, "", "The name of the network plugin.")
startCmd.Flags().Bool(enableDefaultCNI, false, "Enable the default CNI plugin (/etc/cni/net.d/k8s.conf). Used in conjunction with \"--network-plugin=cni\".")
startCmd.Flags().Bool(waitUntilHealthy, true, "Wait until Kubernetes core services are healthy before exiting.")
startCmd.Flags().Duration(waitTimeout, 3*time.Minute, "max time to wait for Kubernetes core services to be healthy.")
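For readers unfamiliar with `Duration` flags, here is a minimal, self-contained sketch of the pattern the new flag enables: bounding a per-component wait with a parsed duration. `componentHealthy` and the `main` wiring are hypothetical stand-ins, not minikube's actual code:

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// componentHealthy is a hypothetical stand-in for a real health check
// (minikube polls components such as the apiserver and kube-proxy).
func componentHealthy() bool {
	return time.Now().Second()%5 == 0
}

func main() {
	// A Duration flag parses values like "3m" or "90s" from the command line.
	waitTimeout := flag.Duration("wait-timeout", 3*time.Minute, "max time to wait per component")
	flag.Parse()

	deadline := time.Now().Add(*waitTimeout)
	for !componentHealthy() {
		if time.Now().After(deadline) {
			fmt.Println("timed out waiting for component")
			return
		}
		time.Sleep(500 * time.Millisecond)
	}
	fmt.Println("component healthy")
}
```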
This default seems quite short for certain environments: previously, it was 5 minutes per pod. 5 minutes overall, perhaps?
I should have said per Kubernetes component (this is not the total wait), and it used to be 5 minutes per component.
@@ -32,7 +32,7 @@ func TestMain(m *testing.M) {
os.Exit(m.Run())
}

-var startTimeout = flag.Int("timeout", 25, "number of minutes to wait for minikube start")
+var startTimeout = flag.Duration("timeout", 25*time.Minute, "max duration to wait for a full minikube start")
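As an aside, a `Duration` test flag like this can also put a hard bound on the whole start. A hypothetical sketch of the consuming side — the test body, command invocation, and forwarding of the flag are illustrative, not the PR's code:

```go
package integration

import (
	"context"
	"flag"
	"fmt"
	"os/exec"
	"testing"
	"time"
)

var startTimeout = flag.Duration("timeout", 25*time.Minute, "max duration to wait for a full minikube start")

// TestStartSketch is illustrative only: it bounds `minikube start` with the
// flag's duration and forwards that duration as --wait-timeout.
func TestStartSketch(t *testing.T) {
	ctx, cancel := context.WithTimeout(context.Background(), *startTimeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, "minikube", "start",
		fmt.Sprintf("--wait-timeout=%s", *startTimeout))
	if out, err := cmd.CombinedOutput(); err != nil {
		t.Fatalf("minikube start failed: %v\n%s", err, out)
	}
}
```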
8x the default we give to users is crazy. I can understand 2x, but more than that I feel like we are making ourselves reliable at the expense of users.
Sorry for the confusion: --wait-timeout is per component, not for the whole start.
Before this PR, the wait was 5 minutes per component.
This PR reduces it to 3 minutes per component for end users,
so the test value is not too much more than what we give the end user (if we count 5 minutes per component).
I also intend to open another cleanup PR for the integration tests to get rid of all the retrying of start in parallel, once all the flakes (certs, corruptions, ...) are fixed.
@tstromberg I believe I have addressed all the comments.
/retest this please
This PR:

- Removes more than a few unused functions, such as:
- Moves funcs out of `pkg/utils` into `pkg/kube`.
- Moves the `ExtraOptions` type and its funcs from `pkg/util` to the `pkg/minikube/config` package, and renames the func `.ContainsString` to `.ContainsParam`.
- Chooses better timeouts based on Kubernetes consts.
- Adds logs for how long each component and k8s-app took to come up, to be used later to fine-tune our default waiting time (see the sketch after this list); these will appear in the logs like:
- Adds a new flag for the start cmd, `wait-timeout`, that specifies the max wait per component.
- Reduces the default wait per component from 5 minutes to 3 minutes for end users.
- Increases the default wait per component from 5 minutes to 13 minutes for parallel integration tests.
- Parameterizes the integration tests to accept `wait-timeout`.
- Fixes the test setup for the gvisor test, which was placed after the e2e test run; moved it up in the script.

Closes #5122 and hopefully reduces some test flakes due to timeouts.
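A rough sketch of the per-component duration logging described in the list above, under the assumption that it wraps each wait with a timer; the names and output format here are invented, not the PR's actual log lines:

```go
package main

import (
	"log"
	"time"
)

// waitFor is a hypothetical stand-in for minikube's per-component wait.
func waitFor(component string) error {
	time.Sleep(50 * time.Millisecond) // pretend the component takes a moment
	return nil
}

func main() {
	for _, c := range []string{"apiserver", "etcd", "kube-proxy"} {
		start := time.Now()
		if err := waitFor(c); err != nil {
			log.Fatalf("%s never became healthy: %v", c, err)
		}
		// Duration logs like this make it possible to fine-tune default timeouts later.
		log.Printf("%s came up in %s", c, time.Since(start))
	}
}
```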
Topics I'd like the reviewer's opinion on:

Golang parallel test logging gotcha!
I was hoping to see the duration metric logs in the tests, but I found that sometimes they don't appear at all and sometimes at most 2 sets of them show up. That makes me believe Go only outputs the non-paused tests, and the paused tests (whose VMs are still running and which our wait-for-running func is still working on) never write to the logs. I created an issue in golang to track this: golang/go#33706
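For what it's worth, the missing output is consistent with `go test` buffering a parallel test's log until that test finishes. A minimal reproduction of the suspected behavior — run it with `go test -v` and note that the `t.Log` lines only print once each test completes:

```go
package example

import (
	"testing"
	"time"
)

// Each t.Log call happens immediately, but the output of a parallel test is
// buffered and only flushed when the test finishes, so a test that never
// finishes (or whose process is killed) never shows its logs.
func TestSlow(t *testing.T) {
	t.Parallel()
	t.Log("TestSlow: logged at start, printed at the end")
	time.Sleep(2 * time.Second)
}

func TestFast(t *testing.T) {
	t.Parallel()
	t.Log("TestFast: logged at start, printed at the end")
	time.Sleep(500 * time.Millisecond)
}
```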