k8s e2e tests [149/149] #533

Closed
runcom opened this issue May 20, 2017 · 34 comments

Comments

@runcom
Member

runcom commented May 20, 2017

Not that bad for a very first run of k8s e2e tests: 117/151 😸

Full logs available at:

test yourself with:

# kube commit 2899f47bc

$ sudo setenforce 0

# either this way or start crio.service
$ cd $GOPATH/src/github.com/kubernetes-incubator/cri-o && \
  sudo ./crio --cgroup-manager=systemd --log debug.log --debug --runtime \
  $GOPATH/src/github.com/opencontainers/runc/runc --conmon $PWD/conmon/conmon \
  --seccomp-profile=$PWD/seccomp.json

$ sudo PATH=$GOPATH/src/k8s.io/kubernetes/third_party/etcd:${PATH} \
  PATH=$PATH GOPATH=$GOPATH \
  ALLOW_PRIVILEGED=1 \
  CONTAINER_RUNTIME=remote CONTAINER_RUNTIME_ENDPOINT='/var/run/crio.sock \
  --runtime-request-timeout=5m' ALLOW_SECURITY_CONTEXT="," \
  DNS_SERVER_IP="192.168.1.5" API_HOST="192.168.1.5" \
  API_HOST_IP="192.168.1.5" KUBE_ENABLE_CLUSTER_DNS=true ./hack/local-up-cluster.sh

# on Fedora
$ sudo systemctl stop firewalld
$ sudo iptables -F

$ KUBERNETES_PROVIDER=local KUBECONFIG=/var/run/kubernetes/admin.kubeconfig \
  go run hack/e2e.go -v --test -test_args="-host=https://localhost:6443 --ginkgo.focus=\[Conformance\]" \
  | tee e2e.log

# enjoy
Summarizing 2 Failures:

[Fail] [k8s.io] Projected [It] should project all components that make up the projection API [Conformance] [Volume] [Projection]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/util.go:2213

[Fail] [k8s.io] DNS [It] should provide DNS for the cluster [Conformance]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/dns.go:223

Ran 151 of 603 Specs in 5721.093 seconds
FAIL! -- 149 Passed | 2 Failed | 0 Pending | 452 Skipped --- FAIL: TestE2E (5721.12s)
FAIL

Ginkgo ran 1 suite in 1h35m21.34754297s
Test Suite Failed
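
The tail of e2e.log shown above has a stable shape, so the score used in the issue title can be pulled out mechanically. A minimal sketch, assuming the log keeps the "Ran N of M Specs" and "N Passed | ..." lines exactly as printed here; the function names e2e_score and e2e_failures are made up for illustration:

```shell
# Hypothetical helpers, not part of Kubernetes or CRI-O.
# Print "<passed>/<ran>" from a Ginkgo log given as a file argument or on stdin.
e2e_score() {
    awk '
        /^Ran [0-9]+ of [0-9]+ Specs/ { ran = $2 }
        / Passed / {
            # The count is the field right before the word "Passed".
            for (i = 1; i < NF; i++) if ($(i + 1) == "Passed") passed = $i
        }
        END {
            if (ran == "" || passed == "") exit 1
            print passed "/" ran
        }
    ' "$@"
}

# List only the "[Fail] ..." summary lines.
e2e_failures() {
    grep '^\[Fail\]' "$@"
}
```

For the summary lines quoted above, e2e_score prints 149/151 and e2e_failures lists the two [Fail] entries.
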
@mrunalp
Member

mrunalp commented May 21, 2017

Nice - next target is getting these to 0 failures :)

@runcom
Member Author

runcom commented May 22, 2017

alright, trimming down the list above a lot :) many of the failures are just because local-up-cluster.sh runs with SecurityContextDeny. I'll re-run the tests and post an updated list :)

(also updated first comment for a way to run e2e for anyone to try out)

@runcom runcom changed the title k8s e2e tests k8s e2e tests [117/151] May 22, 2017
@runcom
Member Author

runcom commented May 22, 2017

FYI I figured out that the following test:

[Fail] [k8s.io] EmptyDir volumes [It] volume on default medium should have the correct mode [Conformance] [Volume]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/pods.go:76

will be fixed by using ALLOW_PRIVILEGED=1 in local-up-cluster.sh or, if you run k8s directly, by adding --allow-privileged to kubelet.

Not sure that fixes other tests here, I'll re-run them once I finish the run I started in my previous comment :)

@runcom runcom changed the title k8s e2e tests [117/151] k8s e2e tests [137/151] May 22, 2017
@runcom
Member Author

runcom commented May 22, 2017

first comment updated with new result, 137/151, going to re-run the suite as per my previous comment.

@runcom
Member Author

runcom commented May 22, 2017

diff --git a/server/container_create.go b/server/container_create.go
index c15985e..5cbab69 100644
--- a/server/container_create.go
+++ b/server/container_create.go
@@ -589,6 +589,14 @@ func (s *Server) createSandboxContainer(ctx context.Context, containerID string,
 
 	containerImageConfig := containerInfo.Config
 
+	// TODO: volume handling in CRI-O
+	//       right now, we do just mount tmpfs in order to have images like
+	//       gcr.io/k8s-testimages/redis:e2e to work with CRI-O
+	for dest := range containerImageConfig.Config.Volumes {
+		destOptions := []string{"mode=1777", "size=" + strconv.Itoa(64*1024*1024), label.FormatMountLabel("", sb.mountLabel)}
+		specgen.AddTmpfsMount(dest, destOptions)
+	}
+
 	processArgs, err := buildOCIProcessArgs(containerConfig, containerImageConfig)
 	if err != nil {
 		return nil, err

The patch above makes many other tests pass, namely those using images that define a Config.Volumes map. It can be used as a workaround until we have correct volume handling in CRI-O. I'll make a PR for it shortly because there's no reason to keep those images from working. The tmpfs stuff is a hack, yes, but still useful for now.
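
For readers less familiar with the mount options in the patch, the list it builds corresponds to an ordinary tmpfs option string. A rough sketch of the arithmetic only, in shell rather than Go; tmpfs_mount_opts is a hypothetical name, and the context= suffix is an assumption about what label.FormatMountLabel appends when a mount label is set:

```shell
# Hypothetical helper mirroring the option list from the patch above.
# $1 (optional): SELinux mount label; the context= form is an assumption.
tmpfs_mount_opts() {
    size=$((64 * 1024 * 1024))   # 64 MiB, same value as strconv.Itoa(64*1024*1024)
    opts="mode=1777,size=${size}"
    if [ -n "${1:-}" ]; then
        opts="${opts},context=\"$1\""
    fi
    printf '%s\n' "$opts"
}
```

With no label this prints mode=1777,size=67108864, so each VOLUME declared by an image gets its own tmpfs capped at 64 MiB.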

@mrunalp
Member

mrunalp commented May 22, 2017

I think we can add this tmpfs volume as a temporary fix. The downside is more RAM usage for containers with VOLUMEs so would want to move to disk-backed cri-o managed volumes.

@runcom
Member Author

runcom commented May 22, 2017

I think we can add this tmpfs volume as a temporary fix. The downside is more RAM usage for containers with VOLUMEs so would want to move to disk-backed cri-o managed volumes.

I'll test with that patch and see how it goes. Will report status here

@runcom
Member Author

runcom commented May 22, 2017

for the record, I'm running e2e on Fedora 25 and all the network tests currently failing seem to be resolved by just running sudo iptables -F before running the tests again - no clue why, but it's something documented

@mrunalp
Member

mrunalp commented May 22, 2017

@runcom iptables -F removes all the rules so not surprising :) We do want to fix it the right way by just adding the right rules. I wouldn't advise running iptables -F outside of a test VM.
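
One way to limit the damage in a test VM is to snapshot the rules before flushing and put them back afterwards. A sketch only, assuming iptables-save and iptables-restore are installed and the caller is root; with_flushed_iptables and the IPTABLES* override variables are hypothetical names introduced here, and this is not the proper fix of adding the right rules:

```shell
# Hypothetical helper: save rules, flush the filter table, run a command, restore.
# IPTABLES, IPTABLES_SAVE and IPTABLES_RESTORE may be overridden (e.g. for dry runs).
with_flushed_iptables() {
    backup=$(mktemp) || return 1
    if ! "${IPTABLES_SAVE:-iptables-save}" > "$backup"; then
        rm -f "$backup"
        return 1
    fi
    "${IPTABLES:-iptables}" -F

    # Run the wrapped command and remember its exit status.
    status=0
    "$@" || status=$?

    # Put the saved rules back regardless of how the command ended.
    "${IPTABLES_RESTORE:-iptables-restore}" < "$backup"
    rm -f "$backup"
    return "$status"
}
```

The wrapped command would be the e2e invocation from the first comment; its exit status is passed through so a failing run is still reported as failing.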

@runcom runcom changed the title k8s e2e tests [137/151] k8s e2e tests [139/151] May 22, 2017
@runcom
Member Author

runcom commented May 22, 2017

139/151, updated first comment and logs available at https://runcom.red/e2e-1.log

@mrunalp
Member

mrunalp commented May 22, 2017

One of them is the projected failure which is really a bug in kube so one less to worry about ;)

@runcom
Member Author

runcom commented May 22, 2017

1st and 2nd test used to pass with node-e2e though (and also if you look at the first run of e2e tests I did https://runcom.red/e2e.log you can see them passing), we need to understand why they're failing now.

@runcom
Member Author

runcom commented May 28, 2017

[Fail] [k8s.io] PreStop [It] should call prestop when killing a pod [Conformance]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/pre_stop.go:174

Fixed by #537

@mrunalp
Member

mrunalp commented May 28, 2017 via email

@runcom
Member Author

runcom commented May 28, 2017

Port forwarding tests panic, fix is here #542

though it seems our nsenter/socat implementation isn't working correctly, even though it's copy/pasted from the dockershim one. I verified this by running the same tests with docker: they pass with it but fail with CRI-O, for some weird network issue I guess.

@mrunalp
Member

mrunalp commented May 28, 2017 via email

@runcom
Member Author

runcom commented May 28, 2017

found the root cause, it's actually a bug in CRI-O itself; port forwarding tests now pass with #542 + a fix coming in a moment 😎

@runcom
Member Author

runcom commented May 28, 2017

All port forwarding tests fixed by #542 and #543 - @mrunalp PTAL at those :)

@runcom
Member Author

runcom commented May 28, 2017

I'll re-run the whole e2e once those 2 PRs are merged.

note that these 4 pass also:

[Fail] [k8s.io] Kubectl client [k8s.io] Kubectl run rc [It] should create an rc from an image [Conformance]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/kubectl.go:1165

[Fail] [k8s.io] Kubectl client [k8s.io] Kubectl rolling-update [It] should support rolling-update to same image [Conformance]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/kubectl.go:177

[Fail] [k8s.io] Probing container [It] should *not* be restarted with a /healthz http liveness probe [Conformance]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/common/container_probe.go:404

[Fail] [k8s.io] KubeletManagedEtcHosts [It] should test kubelet managed /etc/hosts file [Conformance]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/exec_util.go:107
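
Re-checking individual specs like the four above does not need the full suite; the Conformance focus from the first comment can be narrowed to a single test name. A small sketch, where focus_regex is a hypothetical helper that backslash-escapes regex metacharacters. Spaces are turned into \s on the assumption that the -test_args string gets word-split on its way to Ginkgo:

```shell
# Hypothetical helper: turn a spec name into a regex usable with --ginkgo.focus.
focus_regex() {
    printf '%s\n' "$1" \
        | sed -e 's/[][\.*^$(){}?+|]/\\&/g' \
              -e 's/ /\\s/g'
}

# Example invocation (not run here):
#   go run hack/e2e.go -v --test \
#     -test_args="-host=https://localhost:6443 --ginkgo.focus=$(focus_regex 'should call prestop when killing a pod')"
```
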

@runcom runcom changed the title k8s e2e tests [139/151] k8s e2e tests [144/151] May 29, 2017
@runcom
Member Author

runcom commented May 29, 2017

we're now at 144/151 🎉 , I've updated the result in the first comment so look there for failing tests. The remaining failing tests are either network issues (probably because of my laptop) or tests that rely on attach (but @mrunalp is working on it 👍)

@runcom
Member Author

runcom commented May 29, 2017

@sameo @mrunalp this test works fine only after disabling firewalld and flushing iptables (and also applying #544 to master), otherwise it always fails (maybe more net tests are also blocked by firewalld)

systemctl stop firewalld
iptables -F
[Fail] [k8s.io] PreStop [It] should call prestop when killing a pod [Conformance]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/pre_stop.go:174

@runcom
Member Author

runcom commented May 29, 2017

So 3 of the 7 failures are flaky or misconfigurations, we need to tackle just 4 then, that are all related to attach :)

@mrunalp
Member

mrunalp commented May 29, 2017 via email

@runcom
Member Author

runcom commented Jun 5, 2017

Turns out this test:

[Fail] [k8s.io] Kubectl client [k8s.io] Guestbook application [It] should create and stop a working application [Conformance]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/kubectl.go:1718

is failing just because we run the tests w/o a DNS service when doing local-up-cluster, one way to fix it when running the test is to follow this https://github.com/linzichang/kubernetes/blob/master/examples/guestbook/README.md#finding-a-service

(in the CI, we'll likely switch to env DNS as pointed out in that readme)

@mrunalp
Member

mrunalp commented Jun 5, 2017

Yeah, we could switch to env for that test 👍

@runcom
Member Author

runcom commented Jun 5, 2017

That test is basically failing because hack/local-up-cluster.sh is docker specific I guess https://github.com/kubernetes/kubernetes/blob/master/hack/local-up-cluster.sh#L52-L58

@runcom
Member Author

runcom commented Jun 7, 2017

So the only test failing now is the attach one :) I just confirmed the following test works fine (it was a kubelet misconfiguration wrt kube-dns). We need to enable kube-dns in local-up-cluster to test it, like this:

$ sudo PATH=$GOPATH/src/k8s.io/kubernetes/third_party/etcd:${PATH} \
  PATH=$PATH GOPATH=$GOPATH \
  ALLOW_PRIVILEGED=1 \
  CONTAINER_RUNTIME=remote CONTAINER_RUNTIME_ENDPOINT='/var/run/crio.sock \
  --runtime-request-timeout=5m' ALLOW_SECURITY_CONTEXT="," \
  DNS_SERVER_IP="192.168.1.5" API_HOST="192.168.1.5" \
  API_HOST_IP="192.168.1.5" KUBE_ENABLE_CLUSTER_DNS=true ./hack/local-up-cluster.sh

After setting KUBE_ENABLE_CLUSTER_DNS=true and setting API_HOST, API_HOST_IP and DNS_SERVER_IP to the vm ip address, the test passes:

• [SLOW TEST:116.981 seconds]
[k8s.io] Kubectl client
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:656
  [k8s.io] Guestbook application
  /home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:656
    should create and stop a working application [Conformance]
    /home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/kubectl.go:375
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSJun  7 10:54:12.965: INFO: Running AfterSuite actions on all node
Jun  7 10:54:12.965: INFO: Running AfterSuite actions on node 1

Ran 1 of 603 Specs in 117.032 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 602 Skipped PASS

Ginkgo ran 1 suite in 1m57.331986003s
Test Suite Passed

So, we are at 148/151 :)
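
The three address variables in the command above all carry the same VM IP, so they can be derived from one value. A sketch under assumptions: cluster_dns_env is a hypothetical helper, and the IP has to be supplied by the caller since how to discover it depends on the machine:

```shell
# Hypothetical helper: print the DNS-related assignments for local-up-cluster.sh.
# $1: IP address of the VM/host (192.168.1.5 in the command above).
cluster_dns_env() {
    if [ -z "${1:-}" ]; then
        echo "usage: cluster_dns_env <host-ip>" >&2
        return 1
    fi
    printf 'DNS_SERVER_IP=%s API_HOST=%s API_HOST_IP=%s KUBE_ENABLE_CLUSTER_DNS=true\n' \
        "$1" "$1" "$1"
}
```

The output is a space-separated list of NAME=VALUE pairs, so it can be spliced into the sudo command line in place of the four hand-written assignments.
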

@runcom runcom changed the title k8s e2e tests [144/151] k8s e2e tests [148/151] Jun 7, 2017
@runcom
Member Author

runcom commented Jun 7, 2017

Updated first comment :)

@runcom
Member Author

runcom commented Jun 14, 2017

attach test now passes! 🎉 🎉 🎉

@rajatchopra @dcbw @sameo could you help figure out the network flakiness? probably some misconfiguration:

[Fail] [k8s.io] DNS [It] should provide DNS for the cluster [Conformance]

the above is never passing for some reason

The test below passes only if we stop firewalld and flush iptables before running tests:

# systemctl stop firewalld
# iptables -F

[Fail] [k8s.io] PreStop [It] should call prestop when killing a pod [Conformance]
/home/amurdaca/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/pre_stop.go:174

@runcom runcom changed the title k8s e2e tests [148/151] k8s e2e tests [149/151] Jun 16, 2017
@rhatdan
Contributor

rhatdan commented Jul 11, 2017

@mrunalp Can we close this issue? I think all tests pass now.

@runcom
Member Author

runcom commented Jul 11, 2017

let's leave this open till we figure the DNS stuff out

@runcom runcom changed the title k8s e2e tests [149/151] k8s e2e tests [151/151] Sep 8, 2017
@runcom
Member Author

runcom commented Sep 8, 2017

FYI I got all tests running in the CI :) so I've updated the title and I'm just waiting for #631 to be merged :)

@runcom runcom changed the title k8s e2e tests [151/151] k8s e2e tests [149/149] Sep 9, 2017
@wking
Contributor

wking commented Feb 23, 2018

FYI I got all tests running in the CI :) so i've updated the title and I'm just waiting for #631 to be merged :)

#631 was merged last September. I'm not sure if we want to leave it open as some sort of tracker issue? Personally, I prefer issue labels for that sort of thing.

@rhatdan
Contributor

rhatdan commented Mar 16, 2018

I am going to close this issue, since it seems to be fixed.

@rhatdan rhatdan closed this as completed Mar 16, 2018