Workspace namespace is not prepared properly if metrics.k8s.io API is not enabled on the cluster #19869

Closed
23 tasks
Tracked by #20326
sleshchenko opened this issue May 26, 2021 · 8 comments
Labels
area/che-server kind/bug Outline of a bug - must adhere to the bug report template. severity/P1 Has a major impact to usage or development of the system. sprint/next

Comments

@sleshchenko
Member

Describe the bug

Workspace namespace is not prepared properly if metrics.k8s.io API is not enabled on the cluster.

Che version

  • latest
  • nightly
  • other: please specify

Steps to reproduce

  1. Get a cluster where the metrics.k8s.io API is disabled. I have
     CodeReady Containers version: 1.25.0+0e5748c8
     OpenShift version: 4.7.5 (embedded in executable)
     and it is disabled there by default.
  2. Deploy Che with OpenShift OAuth.
  3. Log in to Che as a developer.
  4. Create any workspace.
  5. Check that you get the error:

Workspace java-maven-5vxne failed to start. Failed to start the workspace with ID: workspaceqpox564qxxopua8h, reason: Internal Server Error occurred, error time: 2021-05-26 12:39:02

Expected behavior: The workspace is created and started without any issue on the first try.

  6. Try to start it again; this time it will start. But it may fail in other places, since roles are not initialized properly.
    ^ the second and further starts work only because of another bug, where we don't update Roles if the ServiceAccount already exists: "Update existing ServiceAccount if some Role/RoleBinding was changed" #19697

Expected behavior: Workspace is fully working.
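
To confirm the precondition in step 1, one option is to capture the output of `kubectl api-versions` (which prints one `group/version` entry per line) and check whether any entry belongs to the `metrics.k8s.io` group. The following is only a minimal sketch of that check; the class and method names are hypothetical and are not part of Che or the fabric8 client.

```java
import java.util.List;

class ApiGroupCheck {
    /**
     * Returns true if any "group/version" entry belongs to the given API group.
     * The entries are assumed to be lines captured from `kubectl api-versions`.
     */
    static boolean isApiGroupServed(List<String> apiVersions, String group) {
        for (String line : apiVersions) {
            String trimmed = line.trim();
            int slash = trimmed.indexOf('/');
            // Core API versions such as "v1" have no group prefix.
            String lineGroup = slash < 0 ? "" : trimmed.substring(0, slash);
            if (lineGroup.equals(group)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical sample output from a cluster without the metrics API.
        List<String> sample = List.of("apps/v1", "authorization.openshift.io/v1", "v1");
        System.out.println(isApiGroupServed(sample, "metrics.k8s.io")); // prints false
    }
}
```

If the check returns false, the cluster matches the setup described in step 1.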

Runtime

  • kubernetes (include output of kubectl version)
  • Openshift (include output of oc version)
  • minikube (include output of minikube version and kubectl version)
  • minishift (include output of minishift version and oc version)
  • docker-desktop + K8S (include output of docker version and kubectl version)
  • crc 1.25.0+0e5748c8

Screenshots

Screenshot_20210526_154303

Installation method

  • chectl
    • provide a full command that was used to deploy Eclipse Che (including the output)
    • provide an output of chectl version command
  • OperatorHub
  • I don't know

Environment

  • my computer
    • Windows
    • Linux
    • macOS
  • Cloud
    • Amazon
    • Azure
    • GCE
    • other (please specify)
  • Dev Sandbox (workspaces.openshift.com)
  • other: please specify

Eclipse Che Logs

2021-05-26 12:39:02,053[nio-8080-exec-5]  [ERROR] [c.a.c.r.RuntimeExceptionMapper 47]   - Internal Server Error occurred, error time: 2021-05-26 12:39:02
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.217.4.1/apis/authorization.openshift.io/v1/namespaces/developer-che/roles. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. roles.authorization.openshift.io "workspace-metrics" is forbidden: user "developer" (groups=["system:authenticated:oauth" "system:authenticated"]) is attempting to grant RBAC permissions not currently held:
{APIGroups:["metrics.k8s.io"], Resources:["nodes"], Verbs:["get" "list" "watch"]}
{APIGroups:["metrics.k8s.io"], Resources:["pods"], Verbs:["get" "list" "watch"]}.
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:583)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:520)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:487)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:448)
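
When triaging a failure like the one above, it can help to pull the API groups out of the rejected rules in the `Forbidden` message. The sketch below does this with plain regular expressions over the message text, assuming the rule format shown in the log (`{APIGroups:[...], Resources:[...], Verbs:[...]}`). The class and method names are hypothetical. This is a diagnostic aid only, not how Che or the fabric8 client handles the error.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ForbiddenRuleParser {
    // Matches the bracketed list that follows "APIGroups:" in each rule.
    private static final Pattern API_GROUPS = Pattern.compile("APIGroups:\\[([^\\]]*)\\]");
    // Matches each double-quoted item inside that list.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    /** Collects the distinct API groups named in the rules of an error message. */
    static Set<String> apiGroupsInRules(String message) {
        Set<String> groups = new LinkedHashSet<>();
        Matcher list = API_GROUPS.matcher(message);
        while (list.find()) {
            Matcher item = QUOTED.matcher(list.group(1));
            while (item.find()) {
                groups.add(item.group(1));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // The two rules quoted from the log in this issue.
        String message =
            "{APIGroups:[\"metrics.k8s.io\"], Resources:[\"nodes\"], Verbs:[\"get\" \"list\" \"watch\"]}\n"
          + "{APIGroups:[\"metrics.k8s.io\"], Resources:[\"pods\"], Verbs:[\"get\" \"list\" \"watch\"]}.";
        System.out.println(apiGroupsInRules(message)); // prints [metrics.k8s.io]
    }
}
```

For the log in this issue, both rejected rules point at the same group, `metrics.k8s.io`.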

Additional context

@sleshchenko sleshchenko added kind/bug Outline of a bug - must adhere to the bug report template. area/che-server labels May 26, 2021
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label May 26, 2021
@mmorhun mmorhun added severity/P1 Has a major impact to usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels May 26, 2021
@l0rd
Contributor

l0rd commented Jul 13, 2021

@skabashnyuk would you be able to work on this in the current sprint? It looks like a critical one.

@skabashnyuk
Contributor

@l0rd we are planning to work on #19697

@l0rd
Contributor

l0rd commented Jul 13, 2021

@skabashnyuk help me understand the impact of this issue vs #19697. This issue affects installations on CRC (workspaces fail to start), whereas #19697 looks like a less common use case (Che SA roles and rolebindings are updated, but I am not sure why).

@skabashnyuk
Contributor

@l0rd to me it looks like a test environment vs. production. However, if you insist, we can take #19869 instead of #19697.

@l0rd
Contributor

l0rd commented Jul 13, 2021

I don't want to change your priorities yet. I would like to understand when users are affected by #19697 first: always or in a particular circumstance?

@skabashnyuk
Contributor

I think #19697 is related to the namespaces/projects created prior to #19651.

@l0rd
Contributor

l0rd commented Jul 13, 2021

OK, I think I have a better understanding, and #19697 does indeed look more critical. I am continuing the discussion there.

@skabashnyuk
Contributor
