[FEATURE] Support embedded etcd #286

Closed
http403 opened this issue Jun 17, 2020 · 6 comments

@http403

http403 commented Jun 17, 2020

Is your feature request related to a problem or a Pull Request?

As #262 already describes, the cluster's master-0 node can't be shut down, because the cluster is then unable to recover. dqlite also has problems with constant database locked errors, as shown in k3s-io/k3s#755 and k3s-io/k3s#1917. In my case, where I deploy Rancher on k3s, cattle-cluster-agent failed to start, and the restart process seems to have overwhelmed dqlite and rancher/kind, causing a cascading failure in which all master nodes exited.

Scope of your request

  • different functionality for an existing command/flag
    • create

Describe the solution you'd like

On Jun 7, embedded etcd was merged into k3s master. I believe switching to embedded etcd will solve the problem, and I see no reason not to switch over to a more mainstream solution.

http403 added the enhancement (New feature or request) label Jun 17, 2020
@iwilltry42
Member

Hi @http403, thanks for opening this issue!
You're definitely right that etcd promises a more stable experience here.
We'd also like to have it in place of dqlite.
k3d is a community project and I am not directly involved with the Rancher folks that are building k3s.
However, from that PR which introduced embedded etcd, we can take this comment:

This work is targeted to be included in k3s 1.19 (which should be released first half of August according to the k8s 1.19 release schedule).

So I fear that we just have to wait until it's fully included and active in k3s 🤔
If there's some feature flag, you can always use it by passing a flag to the k3s server command when starting the cluster (--k3s-server-arg) 👍
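
As a rough sketch of that pass-through, the helper below only prints the k3d command line it would build, repeating `--k3s-server-arg` once per k3s flag. The helper name `k3d_create_cmd` is made up for illustration, and `--example-feature-flag` is a placeholder, not a real k3s option.

```shell
#!/bin/sh
# Print (do not run) a `k3d cluster create` command that forwards each extra
# argument to the k3s server via --k3s-server-arg.
# usage: k3d_create_cmd <cluster-name> [k3s-server-flag...]
k3d_create_cmd() {
    name=$1
    shift
    cmd="k3d cluster create $name"
    for flag in "$@"; do
        cmd="$cmd --k3s-server-arg $flag"
    done
    printf '%s\n' "$cmd"
}

# --example-feature-flag is a hypothetical placeholder flag.
k3d_create_cmd dev --example-feature-flag
```

Printing the command first makes it easy to check what would be forwarded to k3s before actually creating a cluster.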

iwilltry42 self-assigned this Jun 18, 2020
iwilltry42 added this to the 3.1.0 milestone Jun 18, 2020
iwilltry42 modified the milestones: 3.1.0, 3.2.0 Oct 6, 2020
@iwilltry42
Member

Happy to see that embedded etcd landed in k3s v1.19.x and will be promoted to GA/stable very soon :)
I tested it and it works pretty well so far 👍

@http403
Author

http403 commented Nov 25, 2020

So I don't have to do anything and it will work out of the box?

@iwilltry42
Member

@http403 the latest k3d version will give you k3s v1.19.x as the default k3s version (you may as well choose one via --image rancher/k3s:v1.19.whatever).
And those use etcd by default as the embedded database for HA setups (so you need to create more than one server or pass the --cluster-init flag to the single server).
Additionally, we already ironed out some issues that k3d faced with k3s/etcd a while ago.
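
The two options could look roughly like this. The sketch assumes the k3d v3.x flag names (`--servers`, `--k3s-server-arg`), uses made-up cluster names and helper functions, and only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Option 1: more than one server node (the HA setup described above).
create_multi_server() {
    run k3d cluster create "$1" --servers 3
}

# Option 2: a single server, with --cluster-init passed through to k3s.
create_single_server_etcd() {
    run k3d cluster create "$1" --servers 1 --k3s-server-arg --cluster-init
}

# Cluster names below are placeholders.
create_multi_server etcd-ha
create_single_server_etcd etcd-single
```

Running the script with `DRY_RUN=0` would execute the commands for real, which requires k3d and Docker to be installed.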

@djfinnoy

djfinnoy commented Feb 1, 2021

I'm still seeing this issue. Am I doing something wrong?

$ k3d version
k3d version v3.4.0
k3s version v1.19.4-k3s1 (default)

I create, stop and restart a cluster:

k3d cluster create dev --servers 3
k3d cluster stop dev
k3d cluster start dev

Afterwards, kubectl becomes unstable:

k get pods -n kube-system
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods)

k get pods -n kube-system
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods)

k get pods -n kube-system
NAME                                     READY   STATUS      RESTARTS   AGE
coredns-66c464876b-ntj6m                 1/1     Running     0          6m39s
helm-install-traefik-zk7gh               0/1     Completed   0          6m39s
local-path-provisioner-7ff9579c6-z2zrg   1/1     Running     0          6m39s
metrics-server-7b4f8b595-58gqj           1/1     Running     0          6m39s
svclb-traefik-j42lj                      2/2     Running     0          6m13s
svclb-traefik-qxcn7                      2/2     Running     2          6m13s
svclb-traefik-v9hql                      2/2     Running     0          6m13s
traefik-5dd496474-bdb5d                  1/1     Running     0          6m14s

k get pods -n kube-system
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods)

@iwilltry42
Member

Hi @djfinnoy, there's a separate problem, tracked in #452 and #467, that also makes it fail with etcd.
