Test success but kubectl returns "couldn't get current server API group list" #23032

Closed
xmsanchez opened this issue Mar 14, 2023 · 6 comments

xmsanchez commented Mar 14, 2023

Expected behavior:
Kubectl commands should return the requested resources.

NOTE: Installing Teleport on a Kubernetes cluster through the Helm chart WORKS. The issue happens when deploying on AWS, based on the "starter-cluster" code: https://github.com/gravitational/teleport/tree/master/examples/aws/terraform/starter-cluster

Current behavior:

E0314 11:21:23.679777  356227 memcache.go:238] couldn't get current server API group list: an error on the server ("unknown") has prevented the request from succeeding
E0314 11:21:23.754503  356227 memcache.go:238] couldn't get current server API group list: an error on the server ("unknown") has prevented the request from succeeding
E0314 11:21:23.822577  356227 memcache.go:238] couldn't get current server API group list: an error on the server ("unknown") has prevented the request from succeeding
E0314 11:21:23.891471  356227 memcache.go:238] couldn't get current server API group list: an error on the server ("unknown") has prevented the request from succeeding
E0314 11:21:23.957102  356227 memcache.go:238] couldn't get current server API group list: an error on the server ("unknown") has prevented the request from succeeding
Error from server (InternalError): an error on the server ("unknown") has prevented the request from succeeding

With verbose output enabled, I get:

E0314 11:22:47.936300  356449 memcache.go:238] couldn't get current server API group list: an error on the server ("unknown") has prevented the request from succeeding
I0314 11:22:47.936327  356449 cached_discovery.go:120] skipped caching discovery info due to an error on the server ("unknown") has prevented the request from succeeding
I0314 11:22:47.936585  356449 helpers.go:246] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "an error on the server (\"unknown\") has prevented the request from succeeding",
  "reason": "InternalError",
  "details": {
    "causes": [
      {
        "reason": "UnexpectedServerResponse",
        "message": "unknown"
      }
    ]
  },
  "code": 500
}]

Bug details:

  • Teleport version --> 11.2.1 (using AWS OSS AMI: 126027368216/gravitational-teleport-ami-oss-11.2.1)
  • Recreation steps --> Add a new cluster and install the agent on the target cluster. Log in and run any kubectl command (e.g. kubectl get nodes)
  • Debug logs --> Attached above

Just for reference, this is my current configuration (sensitive data REDACTED):

/etc/teleport.d/conf

TELEPORT_ROLE=auth,node,proxy
EC2_REGION=eu-west-1
TELEPORT_AUTH_SERVER_LB=localhost
TELEPORT_CLUSTER_NAME=teleport
[email protected]
TELEPORT_DOMAIN_NAME=teleport.example.com
TELEPORT_EXTERNAL_HOSTNAME=teleport.example.com
TELEPORT_DYNAMO_TABLE_NAME=teleport
TELEPORT_DYNAMO_EVENTS_TABLE_NAME=teleport-events
TELEPORT_LICENSE_PATH=
TELEPORT_LOCKS_TABLE_NAME=teleport-locks
TELEPORT_PROXY_SERVER_LB=teleport.example.com
TELEPORT_S3_BUCKET=teleport-eu-west-1-xxxxxxxxxxxxx
TELEPORT_ENABLE_MONGODB=false
TELEPORT_ENABLE_MYSQL=false
TELEPORT_ENABLE_POSTGRES=false
USE_LETSENCRYPT=true
USE_ACM=false

/etc/teleport.yaml

# Auto-generated by /usr/local/bin/teleport-generate-config from values in /etc/teleport.d/conf
teleport:
  nodename: ip-10-x-x-x-eu-west-1-compute-internal
  advertise_ip: 10.x.x.x
  log:
    # output: stderr
    output: /var/lib/teleport/teleport.log
    severity: DEBUG
  data_dir: /var/lib/teleport
  storage:
    type: dynamodb
    region: eu-west-1
    table_name: teleport
    audit_events_uri: dynamodb://teleport-events
    audit_sessions_uri: s3://teleport-eu-west-1-xxxxxxxxxxxxx/records

auth_service:
  enabled: yes
  keep_alive_interval: 1m
  keep_alive_count_max: 3
  listen_addr: 0.0.0.0:3025
  authentication:
    second_factor: otp
  cluster_name: teleport

ssh_service:
  enabled: yes
  listen_addr: 0.0.0.0:3022

proxy_service:
  enabled: yes
  listen_addr: 0.0.0.0:3023
  tunnel_listen_addr: 0.0.0.0:3080
  web_listen_addr: 0.0.0.0:3080
  public_addr: teleport.example.com:3080
  ssh_public_addr: teleport.example.com:3023
  tunnel_public_addr: teleport.example.com:3080
  https_keypairs:
    - cert_file: /var/lib/teleport/fullchain.pem
      key_file: /var/lib/teleport/privkey.pem
  kubernetes:
    enabled: yes
    listen_addr: 0.0.0.0:3026
    public_addr: ['teleport.example.com:3026']

Generated kubeconfig when using tsh login

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: REDACTED
    server: https://teleport.example.com:3026
  name: teleport
contexts:
- context:
    cluster: teleport
    user: teleport-test-cluster
  name: teleport-test-cluster
current-context: teleport-test-cluster
kind: Config
preferences: {}
users:
- name: teleport-test-cluster
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - kube
      - credentials
      - --kube-cluster=test-cluster
      - --teleport-cluster=teleport
      command: /usr/local/bin/tsh
      env: null
      provideClusterInfo: false

xmsanchez added the bug label Mar 14, 2023

@xmsanchez (Author)

Solved. It was my mistake. I had to turn the verbosity way up to find it, though:
kubectl get nodes -v=10

This returned A LOT more information on the error.
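
Most of what -v=10 prints is request tracing, so it can help to filter it. The wrapper below is a minimal sketch and not part of the original report: kube_debug is a hypothetical name, and it assumes the useful lines contain "Response Status" or "Response Body", which is how kubectl labels the server's reply at high verbosity.

```shell
# kube_debug: run any kubectl command at verbosity 10 and keep only the
# lines that describe the server's HTTP response.
# Assumption: those lines contain "Response Status" or "Response Body".
kube_debug() {
  # kubectl writes its verbose log to stderr, so merge it into stdout
  # before filtering.
  kubectl "$@" -v=10 2>&1 | grep -E 'Response (Status|Body)'
}
```

For example, kube_debug get nodes would print only the status line and the response body, which is usually where the server's actual error message ends up.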

I don't have the output anymore, but basically the error was that the Teleport role contained multiple "kubernetes_users". So I ended up removing some roles for the user that was requesting access, and it started working.

For anyone struggling with non-descriptive kubectl errors: just use the verbose flag. It will help a lot.

I still need to better understand why this conflict appeared, but at least I got it working.
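
One possible explanation, offered as an assumption rather than something confirmed in this thread: when the roles assigned to a user resolve to more than one kubernetes_users entry, Teleport cannot tell which Kubernetes user to impersonate and refuses the request unless the client picks one. If that is what happened here, selecting the user explicitly with kubectl's standard --as flag may be an alternative to removing roles. The sketch below only wraps that flag; kube_as is a hypothetical helper name, and the user passed in must be one the Teleport roles actually allow.

```shell
# kube_as: run kubectl while impersonating one explicit Kubernetes user.
# Usage: kube_as <kubernetes-user> <kubectl arguments...>
# Assumption: the chosen user is listed in kubernetes_users of a role
# assigned to the logged-in Teleport user.
kube_as() {
  user="$1"
  shift
  kubectl --as="$user" "$@"
}
```

For example, kube_as some-user get nodes runs kubectl --as=some-user get nodes, where some-user is a placeholder.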

@senthil262006

swapoff -a
systemctl start crio
systemctl start kubelet.service
systemctl stop firewalld.service
export KUBECONFIG=/etc/kubernetes/admin.conf

(I was logged in as the root user.)

This resolved the issue for me.
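
Before restarting services or stopping the firewall, it may be worth checking which of them is actually down. The following is a minimal sketch, assuming a systemd host; check_node_services is a hypothetical helper name, and the service list simply mirrors the commands above.

```shell
# check_node_services: report whether the services started by the
# commands above are currently running.
# Assumption: the host uses systemd and the units are named crio and kubelet.
check_node_services() {
  for svc in crio kubelet; do
    if systemctl is-active --quiet "$svc"; then
      echo "$svc: active"
    else
      echo "$svc: NOT active"
    fi
  done
}
```

If both services report active, the commands above are unlikely to be the relevant fix.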

@sukritisharma05

@senthil262006 Where should we execute these commands? Should it be done on the Teleport server?

@Narullah404

In case someone gets this error in the future: I managed to resolve it by exporting AWS credentials, e.g. on Linux:

export AWS_ACCESS_KEY_ID="<your access key>"

https://repost.aws/knowledge-center/eks-api-server-unauthorized-error
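
A quick check that the credential variables are set in the current shell can help rule this cause in or out. This is a sketch under assumptions: the comment above shows only AWS_ACCESS_KEY_ID, while the check below also looks for AWS_SECRET_ACCESS_KEY, the other variable the AWS CLI and SDKs read for static credentials. aws_creds_present is a hypothetical helper name.

```shell
# aws_creds_present: print the name of each AWS credential variable that
# is unset or empty; return non-zero if any is missing.
# Assumption: static credentials are supplied through environment variables.
aws_creds_present() {
  missing=0
  for var in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Indirect lookup of the variable named in $var (POSIX sh compatible).
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "missing: $var"
      missing=1
    fi
  done
  return "$missing"
}
```

If nothing is reported missing, running aws sts get-caller-identity shows which identity those credentials resolve to.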

@ManuOlivarh

swapoff -a
systemctl start crio
systemctl start kubelet.service
systemctl stop firewalld.service
export KUBECONFIG=/etc/kubernetes/admin.conf

(I was logged in as the root user.)

This resolved the issue for me.

@alifiroozi80

Restarting the Teleport service did the trick for me:

sudo systemctl restart teleport
