CA rotation: controller-manager needs a separate ca.crt file #1350

Open
anguslees opened this issue Jan 15, 2019 · 33 comments
Labels: area/security, help wanted, lifecycle/frozen, priority/important-longterm, sig/auth

Comments

@anguslees
Member

What happened?

I tried to (manually) rotate my cluster's CA key over the weekend. I discovered that /etc/kubernetes/pki/ca.crt can actually include multiple CA keys, and this is key (hah!) to rotating the CA key.

kube-controller-manager, however, can only accept a single key in the file pointed to by --cluster-signing-cert-file, since this is the key used to sign things, not to verify them (so having multiple keys doesn't make sense). kube-controller-manager exits immediately (with a helpful error) if --cluster-signing-cert-file includes multiple keys.

I think pointing kube-controller-manager --cluster-signing-cert-file to ca.crt works for the simple (single key) case, but is incorrect in general, since it prevents the ca.crt file from being used to rotate keys. I think the correct path is to either:

  • Use a different file for --cluster-signing-cert-file that only contains the single "primary" CA cert.
    or
  • Change kube-controller-manager upstream to only use the first cert in ca.crt or some other logic to ignore additional certs.
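
A minimal sketch of what the first option involves, assuming the bundle is PEM-encoded: it splits ca.crt-style text into individual certificate blocks, so one of them can be written to a separate file for --cluster-signing-cert-file. The helper names are hypothetical, and nothing here is provided by kubeadm or kube-controller-manager; the code only looks at the PEM markers and does not validate the certificates.

```python
import re
from typing import List

# Matches one PEM certificate block, including its BEGIN/END markers.
_PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_bundle(bundle_text: str) -> List[str]:
    """Return each PEM certificate block found in bundle_text, in file order."""
    return [m.group(0) + "\n" for m in _PEM_CERT.finditer(bundle_text)]


def is_single_cert(bundle_text: str) -> bool:
    """True if the text holds exactly one certificate block."""
    return len(split_pem_bundle(bundle_text)) == 1
```

A check like is_single_cert could be run against whatever file --cluster-signing-cert-file points to before restarting kube-controller-manager. Which block counts as the "primary" cert is left to the operator, since order in a bundle is not a reliable signal.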

What you expected to happen?

I expected to be able to append a new CA cert to /etc/kubernetes/pki/ca.crt and have both CA certs accepted by controller jobs without other impact.

How to reproduce it (as minimally and precisely as possible)?

Append an additional cert to /etc/kubernetes/pki/ca.crt and restart the kube-controller-manager pod.

@neolit123 neolit123 added area/security priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Jan 15, 2019
@neolit123 neolit123 added this to the v1.14 milestone Jan 15, 2019
@neolit123
Member

cc @liztio @fabriziopandini

@fabriziopandini
Member

@anguslees thanks for this issue and for the detailed explanation!
Certificate rotation IMO is a topic that deserves more attention in general, and I think that we should reiterate at SIG level on two points that up to now have had no clear prioritization:

  1. certificate rotation enhancements ??? (kubelet, kubeconfig files etc.)
  2. document how certificate rotation works (automatic or DIY)

What is discussed in this issue is clearly part of 1; I personally prefer the idea of changing kube-controller-manager, because this will ease the pain on users, but I'm open to reconsidering this in the context of the discussion above...

@fabriziopandini fabriziopandini added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jan 23, 2019
@timothysc timothysc modified the milestones: v1.14, Next Jan 27, 2019
@timothysc
Member

This is a broader topic and we should loop in @kubernetes/sig-auth-misc on approach + fixes.

@k8s-ci-robot k8s-ci-robot added the sig/auth Categorizes an issue or PR as relevant to SIG Auth. label Jan 27, 2019
@liggitt
Member

liggitt commented Jan 27, 2019

For the immediate question, the signing cert/key given to the controller manager should be distinct from the trust bundle. Coupling the two and having kube-controller-manager treat the first cert in the bundle specially seems ill-advised, since order in trust bundles is not typically significant.

@rojkov

rojkov commented Apr 24, 2019

Signing certs and trust bundles are clearly different things indeed.

So, I tried to separate one from the other in rojkov/kubernetes@afed24f. Cluster creation (and node join) seems to work.
The algorithm for cert renewal is such that new CA certs are added to the bundle and get removed only after their expiration date.
I'm not sure how to test upgrade scenarios though.
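
The renewal rule described above (add the new CA cert, remove old ones only once they have expired) could be sketched roughly as follows. This is not the code from the linked commit; BundleEntry and renew_bundle are hypothetical names, and the expiry date is assumed to have been read from each certificate already.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class BundleEntry:
    """Hypothetical record for one CA cert in the trust bundle."""
    name: str
    not_after: datetime  # expiry date taken from the certificate


def renew_bundle(bundle: List[BundleEntry], new_ca: BundleEntry,
                 now: datetime) -> List[BundleEntry]:
    """Add new_ca to the bundle and drop only entries that have expired."""
    kept = [entry for entry in bundle if entry.not_after > now]
    if new_ca not in kept:
        kept.append(new_ca)
    return kept
```

Under this rule an old CA stays trusted for as long as certs it signed could still be valid, so the bundle holds more than one cert for most of a rotation.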

@anguslees
Member Author

> Signing certs and trust bundles are clearly different things indeed.

Agreed. To be clear though, the signing happens with a private key, already called out separately (--cluster-signing-key-file). We just need some way to find the corresponding public key (cert). I agree the two are conceptually separate and that the signing cert might not also be in the trust bundle somewhere, but this would be odd in this situation.

(I think we should just use a separate file, as @liggitt suggests and @rojkov's change implements. Just pointing out that we could modify controller-manager to try to find the matching public key in the trust bundle if we really wanted to avoid a separate file)

@astrieanna

@rojkov it looks like you have already made some good changes for this issue. Are you planning to submit a PR, or interested in help testing your change?

We're interested in this feature as well and would consider working on a similar implementation and PR otherwise. Sounds like the current recommendation is to have a separate trust bundle file and keep the ca.crt file containing the signing credentials.

@rojkov

rojkov commented May 22, 2019

@astrieanna I'm consumed by another project ATM, but I'll try to submit a PR by the end of this week.

@rojkov

rojkov commented May 23, 2019

@astrieanna it seems a simple rebase is not enough now after @fabriziopandini refactored the cert renewal code. If you need to fix this issue urgently please consider submitting your own PR.

I'll get back to k8s in 2 weeks hopefully.

@neolit123 neolit123 modified the milestones: Next, v1.16 Jul 2, 2019
@neolit123 neolit123 removed the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jul 3, 2019
@neolit123
Member

neolit123 commented Aug 3, 2019

hi, this topic needs a design proposal.
the deadline for KEPs has passed for 1.16, so this has to be moved to the next cycle.

@neolit123 neolit123 modified the milestones: v1.16, v1.17 Aug 3, 2019
@fabriziopandini
Member

As per kubeadm office hours discussion, we are considering certificate rotation in scope of the kubeadm operator #1698

@neolit123
Member

this is too late for 1.17.
/milestone v1.18
also, if someone has the time, please create a proposal, ideally in a Google doc, and let's discuss it.
/help

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 22, 2020
@anguslees
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 23, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 21, 2021
@neolit123
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 21, 2021
@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 22, 2021
@SataQiu
Member

SataQiu commented May 24, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 24, 2021
@Swetad90

For this particular reason (i.e. having to point client-ca and cluster-signing-cert at different CA files), I'm not able to automate the CA rotation. I thought I could use a config file to change the arguments kube-controller-manager accepts, but the official doc https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/ doesn't show whether KCM can accept config files.

I use command line arguments to start a KCM docker container, so changing arguments to point to different certs is not possible while rotating the CA.

@neolit123
Member

this is the officially approved guide for rotating CA:
https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/

it includes notes about client-ca-file and cluster-signing-cert-file.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 29, 2021
@neolit123 neolit123 removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 23, 2021

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 21, 2022
@neolit123 neolit123 added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 22, 2022
@fabriziopandini
Member

/unassign

@enj enj added this to SIG Auth Jan 9, 2023
@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Auth Jan 9, 2023
@nilekhc nilekhc moved this from Needs Triage to Pending other SIGs in SIG Auth Apr 10, 2023
@chenk008

chenk008 commented Jan 17, 2024

> this is the officially approved guide for rotating CA: https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/
>
> it includes notes about client-ca-file and cluster-signing-cert-file.

May I ask why client-ca-file cannot be CA bundles? I think it is only related to KCM https auth.

@neolit123
Member

neolit123 commented Jan 17, 2024

> this is the officially approved guide for rotating CA: https://kubernetes.io/docs/tasks/tls/manual-rotation-of-ca-certificates/
> it includes notes about client-ca-file and cluster-signing-cert-file.
>
> May I ask why client-ca-file cannot be CA bundles? I think it is only related to KCM https auth.

if you have tested it and it works with bundles, you can PR the same page with a fix.
one would expect it to work with bundles, because it is about CAs for a client...
it's also documented as such here:
https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/

similar Q asked here:
kubernetes/website#31882

@anguslees
Member Author

anguslees commented Jan 18, 2024

Oh, I see the doc text points back to this issue as justification for the kube-controller-manager exception ("Issue 1350 for kubeadm tracks an bug with the kube-controller-manager being unable to accept a CA bundle.")

To be clear: This (#1350) is a bug report against kubeadm, not kube-controller-manager. kube-controller-manager always accepted CA bundles in all the right places just fine, but the way kubeadm configured kube-controller-manager was incorrect and conflated CA bundle and CA signing cert. If you don't use kubeadm, or are willing to deviate from the kubeadm-generated config files, then there was never an issue here.
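
As a rough illustration of that distinction, the sketch below takes a kube-controller-manager argument list in which several flags point at the same ca.crt and repoints only --cluster-signing-cert-file at a separate single-cert file. The flag names are existing kube-controller-manager flags; the helper and the file name ca-signing.crt are hypothetical, and the starting argument list is an assumption for the example rather than a quote of any generated manifest.

```python
from typing import List

SIGNING_FLAG = "--cluster-signing-cert-file"


def split_signing_cert(args: List[str], signing_cert_path: str) -> List[str]:
    """Return a copy of args with only the signing-cert flag repointed.

    Every other flag, including those that read a trust bundle, is left
    exactly as it was.
    """
    out = []
    for arg in args:
        if arg.startswith(SIGNING_FLAG + "="):
            out.append(f"{SIGNING_FLAG}={signing_cert_path}")
        else:
            out.append(arg)
    return out


# Example argument list (assumed for illustration).
example_args = [
    "--client-ca-file=/etc/kubernetes/pki/ca.crt",
    "--cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt",
    "--cluster-signing-key-file=/etc/kubernetes/pki/ca.key",
    "--root-ca-file=/etc/kubernetes/pki/ca.crt",
]
separated_args = split_signing_cert(
    example_args, "/etc/kubernetes/pki/ca-signing.crt"
)
```

With the flags separated this way, ca.crt can grow into a bundle during rotation while the signing cert file keeps exactly one cert matching --cluster-signing-key-file.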

@et304383

et304383 commented Jul 2, 2024

> Oh, I see the doc text points back to this issue as justification for the kube-controller-manager exception ("Issue 1350 for kubeadm tracks an bug with the kube-controller-manager being unable to accept a CA bundle.")
>
> To be clear: This (#1350) is a bug report against kubeadm, not kube-controller-manager. kube-controller-manager always accepted CA bundles in all the right places just fine, but the way kubeadm configured kube-controller-manager was incorrect and conflated CA bundle and CA signing cert. If you don't use kubeadm, or are willing to deviate from the kubeadm-generated config files, then there was never an issue here.

Can you please elaborate? If one used kubeadm previously, how does one move away from a kubeadm-generated config file? That is, how would you propose updating in place using a CA bundle?

Projects
Status: Pending other SIGs