rebase 1.10.0 #19137
Conversation
@sjenning seeing a kubelet startup failure in our e2e setup that seems caused by https://github.com/kubernetes/kubernetes/pull/59769/files#diff-bf28da68f62a8df6e99e447c4351122dR1331. Comparing origin.log from a good e2e run from master (kube 1.9.1) and a bad e2e run on this PR (kube 1.10.0), both appear to have hit the same condition. In 1.9, it was logged as an error repeatedly:
in 1.10, this error is now fatal:
questions:
@@ -169,6 +168,22 @@ func ClientMapperFromConfig(config *rest.Config) resource.ClientMapperFunc {
 	})
 }

+// setKubernetesDefaults sets default values on the provided client config for accessing the
+// Kubernetes API or returns an error if any of the defaults are impossible or invalid.
+func setKubernetesDefaults(config *rest.Config) error {
You think this is safer than exposing the function upstream?
yes, I expect further shenanigans in legacyscheme usage upstream
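For readers without the diff expanded: a minimal sketch of what a vendored setKubernetesDefaults along these lines typically looks like, modeled on the upstream kubectl helper of this era. The exact body in this PR may differ, and the package name and choice of legacyscheme.Codecs are assumptions:

```go
package clientcfg // hypothetical package name

import (
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/rest"
	"k8s.io/kubernetes/pkg/api/legacyscheme"
)

// setKubernetesDefaults pins the client config to the core ("" group) v1 API
// at the /api root path, fills in a serializer if none is set, then defers to
// client-go for the remaining defaults.
func setKubernetesDefaults(config *rest.Config) error {
	config.GroupVersion = &schema.GroupVersion{Group: "", Version: "v1"}
	if config.APIPath == "" {
		config.APIPath = "/api"
	}
	if config.NegotiatedSerializer == nil {
		// assumption: the vendored copy pins the legacy scheme's codec factory
		config.NegotiatedSerializer = legacyscheme.Codecs
	}
	return rest.SetKubernetesDefaults(config)
}
```

Keeping a local copy like this is what makes the "further shenanigans in legacyscheme usage upstream" concern moot: the defaulting behavior is frozen in-tree rather than tracking upstream changes.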
@@ -7,7 +7,7 @@ import (
 	"github.com/golang/glog"

-	controllerapp "k8s.io/kubernetes/cmd/kube-controller-manager/app"
+	_ "k8s.io/kubernetes/plugin/pkg/scheduler/algorithmprovider"
Wow, this is tragic and so symptomatic of the scheduler problems.
 	newFunc := func(protocol api.Protocol, ip net.IP, port int) (userspace.ProxySocket, error) {
 		return newUnidlerSocket(protocol, ip, port, signaler)
 	}
-	return userspace.NewCustomProxier(loadBalancer, listenIP, iptables, exec, pr, syncPeriod, minSyncPeriod, udpIdleTimeout, newFunc)
+	return userspace.NewCustomProxier(loadBalancer, listenIP, iptables, exec, pr, syncPeriod, minSyncPeriod, udpIdleTimeout, nodePortAddresses, newFunc)
@DirectXMan12 ptal
@liggitt https://github.com/liggitt/origin/commit/be4d28cd7f303d13c0dc95fb530f386739c5b306 (i was able to spawn a 1.10 cluster via cluster up with this; guess it will make some tests pass)
@@ -420,6 +421,7 @@ func (c *DeploymentController) makeDeployerPod(deployment *v1.ReplicationControl
 	RestartPolicy:                 v1.RestartPolicyNever,
 	ServiceAccountName:            c.serviceAccount,
 	TerminationGracePeriodSeconds: &gracePeriod,
+	ShareProcessNamespace:         &shareProcessNamespace,
Why is this a thing? Is the default bad somehow?
Also, the deployer pod is a single-container pod... From what I saw in kube, this is still an alpha feature that is disabled by default anyway?
This makes the default explicit; I had to either set something or explicitly ignore the field in the unit test, and I didn't want to do the latter.
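Roughly, the idea in Go (the field name is from the diff above; the surrounding spec and function are illustrative, not the actual deployer pod template):

```go
package deployer

import v1 "k8s.io/api/core/v1"

// deployerPodSpec sets ShareProcessNamespace to its default (false) explicitly,
// so a unit test can compare the full spec without ignoring the field.
func deployerPodSpec() v1.PodSpec {
	shareProcessNamespace := false
	return v1.PodSpec{
		RestartPolicy:         v1.RestartPolicyNever,
		ShareProcessNamespace: &shareProcessNamespace,
	}
}
```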
@@ -109,12 +109,9 @@ function os::build::version::kubernetes_vars() {
 	# Try to match the "git describe" output to a regex to try to extract
 	# the "major" and "minor" versions and whether this is the exact tagged
 	# version or whether the tree is between two tagged versions.
-	if [[ "${KUBE_GIT_VERSION}" =~ ^v([0-9]+)\.([0-9]+)(\.[0-9]+)*([-].*)?$ ]]; then
+	if [[ "${KUBE_GIT_VERSION}" =~ ^v([0-9]+)\.([0-9]+)\. ]]; then
Why?
Because it didn't actually work against the output of git describe, and all we care about are the major/minor bits.
Can you give the output of git describe? IIRC this will break stuff in OSE
> Can you give the output of git describe?

v1.10.0-46-g9070269

> IIRC this will break stuff in OSE

nope, this is setting the kube major/minor versions, which are unset today. this isn't touching anything openshift-related
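For the record, here is the new pattern applied to that output, shown in Go regexp syntax (which behaves the same as the bash [[ =~ ]] form for this pattern):

```go
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Only the major/minor captures are consumed; the tail of the
	// git-describe output (patch level, commit count, sha) is ignored.
	re := regexp.MustCompile(`^v(\d+)\.(\d+)\.`)
	m := re.FindStringSubmatch("v1.10.0-46-g9070269")
	fmt.Printf("major=%s minor=%s\n", m[1], m[2]) // major=1 minor=10
}
```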
hack/test-cmd.sh
@@ -19,7 +19,7 @@ function find_tests() {
 	local full_test_list=()
 	local selected_tests=()

-	full_test_list=( $(find "${OS_ROOT}/test/cmd" -name '*.sh') )
+	full_test_list=( $(find "${OS_ROOT}/test/cmd" -name '*.sh' | sort) )
I prefer unsorted. Can you randomize instead please.
Why? I like knowing how much time I have left in my tests
We found a lot of bugs once we stopped sorting in the past; people were not cleaning up cluster resources in test X, not questioning the state of the universe in test X+N, and then writing test X+N to only work when run after test X...
randomizing without being reproducible leads to flakes that magically pass on a subsequent retry
Is that an issue today? You're arguing to change the status quo :)
> Is that an issue today?

yeah, hit it while running this suite.

> You're arguing to change the status quo :)

will push a revert with the next batch of changes. I don't think the current random+unreproducible state is helpful, but I don't care enough to argue.
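Not from this PR, just a sketch of the reproducibility point being argued: randomized ordering stays debuggable when the seed is logged, so a failing order can be replayed. The test names here are hypothetical stand-ins (the real harness is bash):

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	// hypothetical test list; illustrates the idea only
	tests := []string{"basicresources.sh", "router.sh", "setprobe.sh"}

	// log the seed so the exact failing order can be reproduced later
	seed := time.Now().UnixNano()
	fmt.Println("test shuffle seed:", seed)

	r := rand.New(rand.NewSource(seed))
	r.Shuffle(len(tests), func(i, j int) { tests[i], tests[j] = tests[j], tests[i] })
	fmt.Println(tests)
}
```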
@@ -129,7 +129,7 @@ func NewCmdDebug(fullName string, f *clientcmd.Factory, in io.Reader, out, errou
 	}

 	cmd := &cobra.Command{
-		Use: "debug RESOURCE/NAME [ENV1=VAL1 ...] [-c CONTAINER] [options] [-- COMMAND]",
+		Use: "debug RESOURCE/NAME [ENV1=VAL1 ...] [-c CONTAINER] [flags] [-- COMMAND]",
Not that I object, but why do I care about this change?
Because cobra changed to auto-append [flags] if you don't include it in your usage, which screws up usage for things like rsh
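A small demonstration of the cobra behavior in question (the command here is a stand-in, not the PR's code):

```go
package main

import (
	"fmt"

	"github.com/spf13/cobra"
)

func main() {
	cmd := &cobra.Command{
		Use: "debug RESOURCE/NAME [-c CONTAINER] [-- COMMAND]",
	}
	cmd.Flags().StringP("container", "c", "", "container name")

	// UseLine() appends " [flags]" when the command has flags and the Use
	// string doesn't already contain "[flags]", which lands it after the
	// trailing [-- COMMAND]:
	fmt.Println(cmd.UseLine())
	// Output: debug RESOURCE/NAME [-c CONTAINER] [-- COMMAND] [flags]
}
```

Spelling out [flags] in Use (as this diff does), or setting DisableFlagsInUseLine = true, keeps the usage line accurate for commands that take a trailing [-- COMMAND].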
@@ -117,7 +117,7 @@ echo "certs: ok"
 os::test::junit::declare_suite_end

 os::test::junit::declare_suite_start "cmd/admin/groups"
-os::cmd::expect_success_and_text 'oc adm groups new shortoutputgroup -o name' 'groups/shortoutputgroup'
+os::cmd::expect_success_and_text 'oc adm groups new shortoutputgroup -o name' 'group/shortoutputgroup'
oh, that's neat. The command needs updating. Printer looks off
test/cmd/basicresources.sh
@@ -92,7 +92,7 @@ echo "pods: ok"
 os::test::junit::declare_suite_end

 os::test::junit::declare_suite_start "cmd/basicresources/label"
-os::cmd::expect_success_and_text 'oc create -f examples/hello-openshift/hello-pod.json -o name' 'pod/hello-openshift'
+os::cmd::expect_success_and_text 'oc create -f examples/hello-openshift/hello-pod.json -o name' 'pod.*/hello-openshift'
Not obvious to me. What are the characters between pod and slash?
removed
 # Test that infos printer supports all outputFormat options
 os::cmd::expect_success_and_text 'oc new-app node -o yaml | oc set env -f - MYVAR=value' 'deploymentconfig "node" updated'
 os::cmd::expect_success 'oc new-app node -o yaml | oc set env -f - MYVAR=value -o custom-colums="NAME:.metadata.name"'
 os::cmd::expect_success_and_text 'oc new-app node -o yaml | oc set env -f - MYVAR=value -o yaml' 'apiVersion: v1'
 os::cmd::expect_success_and_text 'oc new-app node -o yaml | oc set env -f - MYVAR=value -o json' '"apiVersion": "v1"'
 os::cmd::expect_success_and_text 'oc new-app node -o yaml | oc set env -f - MYVAR=value -o wide' 'node'
-os::cmd::expect_success_and_text 'oc new-app node -o yaml | oc set env -f - MYVAR=value -o name' 'deploymentconfigs/node'
+os::cmd::expect_success_and_text 'oc new-app node -o yaml | oc set env -f - MYVAR=value -o name' 'deploymentconfig/node'
@juanvallejo @soltysh another command that needs updating.
test/cmd/router.sh
@@ -84,7 +84,7 @@ os::cmd::expect_success 'oc adm policy add-scc-to-user privileged -z ipfailover'
 os::cmd::expect_success_and_text 'oc adm ipfailover --virtual-ips="1.2.3.4" --dry-run' 'Creating IP failover'
 os::cmd::expect_success_and_text 'oc adm ipfailover --virtual-ips="1.2.3.4" --dry-run' 'Success \(dry run\)'
 os::cmd::expect_success_and_text 'oc adm ipfailover --virtual-ips="1.2.3.4" --dry-run -o yaml' 'name: ipfailover'
-os::cmd::expect_success_and_text 'oc adm ipfailover --virtual-ips="1.2.3.4" --dry-run -o name' 'deploymentconfig/ipfailover'
+os::cmd::expect_success_and_text 'oc adm ipfailover --virtual-ips="1.2.3.4" --dry-run -o name' 'deploymentconfig.*/ipfailover'
specificity please
done
@@ -19,7 +19,7 @@ os::cmd::expect_success_and_text 'oc status --suggest' 'dc/simple-deployment has
 os::cmd::expect_failure_and_text 'oc set probe dc/simple-deployment --liveness --get-url=http://google.com:80 --local' 'You must provide one or more resources by argument or filename'
 # test --dry-run flag with -o formats
 os::cmd::expect_success_and_text 'oc set probe dc/simple-deployment --liveness --get-url=http://google.com:80 --dry-run' 'simple-deployment'
-os::cmd::expect_success_and_text 'oc set probe dc/simple-deployment --liveness --get-url=http://google.com:80 --dry-run -o name' 'deploymentconfigs/simple-deployment'
+os::cmd::expect_success_and_text 'oc set probe dc/simple-deployment --liveness --get-url=http://google.com:80 --dry-run -o name' 'deploymentconfig/simple-deployment'
@juanvallejo @soltysh another command to update
@@ -117,7 +117,7 @@ echo "certs: ok"
 os::test::junit::declare_suite_end

 os::test::junit::declare_suite_start "cmd/admin/groups"
-os::cmd::expect_success_and_text 'oc adm groups new shortoutputgroup -o name' 'groups/shortoutputgroup'
+os::cmd::expect_success_and_text 'oc adm groups new shortoutputgroup -o name' 'group/shortoutputgroup'
@juanvallejo @soltysh another command to update
And I'm going to take a break. Someone is going to help fix the apiserver wiring, right.... right...?
 	scaler, _ := kubectl.ScalerFor(kapi.Kind("ReplicationController"), client)

 	// TODO: implement for RC?
 	var scalesGetter scaleclient.ScalesGetter
this is now panicking the deployer pods...
It's in a FIXME commit in a WIP PR… not quite ready for review :)
@mfojtik fixed
yeah, openshift/kubernetes@3ca675e effectively changed the default back to false. will do the same for the rebase and follow up with a long-term plan
no backport needed; realized this was fixing up an incorrect choice I made when picking replacements for the factory-provided decoder that went away. picked the things that can go into master now into #19200
re @liggitt: I don't think so. Upstream used to tolerate this error with retries. There is no indication on the PR that changed it as to why this was done. fyi @derekwaynecarr
@liggitt @sjenning - the issue described from the kubelet is tied to changes from the LocalStorageCapacityIsolation feature, which unfortunately, even when disabled, still has the kubelet do actions to figure out rootfs configuration. i am trying to figure out if we can change cAdvisor to handle /tmpfs as a --root-dir; will report back.
/retest
1 similar comment
/retest
new commits that need reviewing (in service of fixing the "no kind Namespace found in v1" error the cluster-up tests were hitting when /oapi was unavailable)
seeing flakes in gcp (1-2 different failures in successive runs)
/retest
@liggitt i believe these flakes are pre-existing
👍 from me to merging as-is
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, mfojtik

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
@Kargakis tide does not seem to be handling the merge of this PR; should we merge it by hand, or is there some special label that enables merging for this one?
By hand for now.
Green, merging this after talking to @Kargakis
@liggitt awesome job!
Indeed!
Go ahead and pause the origin->kubernetes publishing until we record the origin sha and adjust the bot to publish to the origin-3.10-kubernetes-1.10.0 branch.
@liggitt paused
based on:
- `openshift start`
- `openshift start --master-config=... --node-config=...`
- `openshift start master --config=...`
- `openshift start node --config=...`
- `openshift start master api --config=...`
- `openshift start master controllers --config=...`
- `oc cluster up`
- extended_networking

follow-ups:
- `-p` flag is removed
- `--runtime-config` `apis/` prefix deprecated (`--runtime-config=apis/...`)
- sweep commands outputting ungrouped objects with `-o name` in test/cmd tests and update to use grouped APIs #19947