
Watching TLS cert/key file changes to reload the server automatically #2850

Closed
3 of 6 tasks
StarAurryon opened this issue Nov 10, 2021 · 6 comments
Labels
feat: New feature or request. stale: Feedback from one or more authors is required to proceed.

Comments

@StarAurryon
Contributor

Preflight checklist

Describe your problem

When using Hydra within Kubernetes with cert-manager, the TLS certificates are only valid for 90 days. After 60 days, cert-manager renews the cert and updates the files in the Pod.

I have not found anything about this in the docs.

Describe your ideal solution

Hydra (but also Kratos) could watch the TLS key and cert files defined in the config file for changes and restart the HTTP server if needed.
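A rough sketch of the watching part, assuming the fsnotify library (github.com/fsnotify/fsnotify); the function name, paths, and reload callback are illustrative only:

```go
package main

import (
	"log"

	"github.com/fsnotify/fsnotify"
)

// watchTLSFiles is a hypothetical helper: it watches the configured cert and
// key paths and calls reload whenever either file changes on disk.
func watchTLSFiles(certFile, keyFile string, reload func()) error {
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	for _, f := range []string{certFile, keyFile} {
		if err := watcher.Add(f); err != nil {
			watcher.Close()
			return err
		}
	}

	go func() {
		defer watcher.Close()
		for {
			select {
			case ev, ok := <-watcher.Events:
				if !ok {
					return
				}
				// cert-manager rotates files via renames/symlink swaps inside the
				// mounted volume, so react to more than plain writes. A production
				// version would likely watch the parent directory and re-add watches.
				if ev.Op&(fsnotify.Write|fsnotify.Create|fsnotify.Remove|fsnotify.Rename) != 0 {
					reload()
				}
			case err, ok := <-watcher.Errors:
				if !ok {
					return
				}
				log.Printf("tls file watcher: %v", err)
			}
		}
	}()
	return nil
}

func main() {
	// Illustrative paths only; in practice these would come from the TLS section of the config.
	err := watchTLSFiles("/etc/tls/tls.crt", "/etc/tls/tls.key", func() {
		log.Println("TLS cert/key changed; trigger a reload")
	})
	if err != nil {
		log.Fatal(err)
	}
	select {} // keep the demo process alive
}
```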

Workarounds or alternatives

Restarting the service with an external supervisor.

Version

All

Additional Context

If there is no solution, as always, I can provide a PR ;)

StarAurryon added the feat (New feature or request) label Nov 10, 2021
@aeneasr
Member

aeneasr commented Nov 11, 2021

PRs welcome :) Technically, I think this probably needs an HTTP server restart, which might be difficult (but probably achievable) to implement given the way we bootstrap.

It will need a lot of tests to ensure the restart happens cleanly (e.g. no left-over config instances, no left-over FS watches, etc.).

@zepatrik
Member

zepatrik commented Nov 15, 2021

One could also not re-initialize everything, but instead keep the registry and just start a new http.Server that uses the same handlers. That still results in a short downtime, I assume, but if you instead keep the same net.Socket and only create a new server, it could even happen with zero downtime.
Definitely needs some smart ideas.
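A rough sketch of that variant (same handlers, new http.Server); the function name, address handling, and shutdown timeout are made up:

```go
package sketch

import (
	"context"
	"crypto/tls"
	"log"
	"net/http"
	"time"
)

// swapServer is a hypothetical sketch: gracefully drain the old server, then
// start a new one on the same address with the same handler (registry
// untouched) but a freshly loaded certificate. The port is unbound for a
// moment between the two, hence the short downtime mentioned above.
func swapServer(old *http.Server, addr string, handler http.Handler, cert tls.Certificate) (*http.Server, error) {
	// Shutdown stops accepting new connections and waits for in-flight
	// requests to finish, up to the context deadline.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := old.Shutdown(ctx); err != nil {
		return nil, err
	}

	next := &http.Server{
		Addr:      addr,
		Handler:   handler,
		TLSConfig: &tls.Config{Certificates: []tls.Certificate{cert}},
	}
	go func() {
		// Cert and key already live in TLSConfig, so the file arguments stay empty.
		if err := next.ListenAndServeTLS("", ""); err != nil && err != http.ErrServerClosed {
			log.Printf("http server: %v", err)
		}
	}()
	return next, nil
}
```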

@StarAurryon
Contributor Author

StarAurryon commented Nov 15, 2021

That still results in a short downtime

I don't know how Go handles the closing of the HTTP server. I hope that it finalizes current requests properly. After that, clients might have their TLS session dropped, as the key/cert will have changed. I don't know how this is handled with HTTP/2, as multiple requests can use the same TCP session, if I am not wrong.

If you instead keep the same net.Socket

I believe that's the way to go, as you don't have to check again whether the port is already bound; if Kubernetes probes it at the wrong time, the service could be considered down. It's better to keep the port open and delay request acceptance.
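Regarding the first question: Go's http.Server.Shutdown stops accepting new connections and waits for in-flight requests to complete (up to the context deadline), so current requests are finalized. A tiny stand-alone demo, with made-up timings and address:

```go
package main

import (
	"context"
	"log"
	"net/http"
	"time"
)

func main() {
	http.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(2 * time.Second) // pretend this request is still being processed
		w.Write([]byte("done\n"))
	})

	srv := &http.Server{Addr: "127.0.0.1:8080"}
	go srv.ListenAndServe()

	time.Sleep(100 * time.Millisecond) // let the server start
	go http.Get("http://127.0.0.1:8080/slow")
	time.Sleep(100 * time.Millisecond) // make sure the slow request is in flight

	// Shutdown closes the listener and then waits (up to the deadline) for the
	// active request to finish before returning.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Fatalf("shutdown did not finish cleanly: %v", err)
	}
	log.Println("shutdown returned only after the in-flight request completed")
}
```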

@aeneasr
Member

aeneasr commented Nov 28, 2021

@StarAurryon
Contributor Author

StarAurryon commented Dec 8, 2021

I have done some research. In my opinion, the best way to implement this is to use tls.Config's GetCertificate func(*ClientHelloInfo) (*Certificate, error), see https://pkg.go.dev/crypto/tls#Config.

Basically, each time a new TLS session is initiated, this function is called.

What I propose (a rough sketch follows the list) is to:

  1. Load the cert + key files at startup and crash if they are not valid.
  2. Watch the certificate file with an inotify mechanism.
  3. If it changes, verify that the cert + key files are still a valid pair, otherwise log an error message.
  4. Replace the certificate returned by GetCertificate if step 3 was OK.
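A minimal sketch of those four steps, assuming Go 1.19+ for atomic.Pointer; the type and function names, paths, and port are illustrative, and a file watcher (for example the fsnotify sketch in the issue description) would call reload() for step 2:

```go
package main

import (
	"crypto/tls"
	"log"
	"net/http"
	"sync/atomic"
)

// certReloader is a hypothetical implementation of the proposal above: the
// current key pair sits behind an atomic pointer and is handed out on every
// new TLS handshake via tls.Config.GetCertificate.
type certReloader struct {
	certFile, keyFile string
	current           atomic.Pointer[tls.Certificate]
}

// newCertReloader loads the pair once at startup and fails hard if it is
// invalid (step 1).
func newCertReloader(certFile, keyFile string) (*certReloader, error) {
	r := &certReloader{certFile: certFile, keyFile: keyFile}
	if err := r.reload(); err != nil {
		return nil, err
	}
	return r, nil
}

// reload re-parses the files (step 3) and swaps the certificate in (step 4);
// on error the previously loaded certificate stays active.
func (r *certReloader) reload() error {
	cert, err := tls.LoadX509KeyPair(r.certFile, r.keyFile)
	if err != nil {
		return err
	}
	r.current.Store(&cert)
	return nil
}

// getCertificate is plugged into tls.Config.GetCertificate, which the TLS
// stack calls for every new handshake, so new sessions pick up the swapped
// certificate without restarting the server.
func (r *certReloader) getCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
	return r.current.Load(), nil
}

func main() {
	// Illustrative paths and address only.
	reloader, err := newCertReloader("/etc/tls/tls.crt", "/etc/tls/tls.key")
	if err != nil {
		log.Fatal(err) // step 1: crash if the initial pair is not valid
	}
	// Step 2 would hook a file watcher to reloader.reload() and log (rather
	// than crash) when a broken pair shows up later.
	srv := &http.Server{
		Addr:      ":4444",
		Handler:   http.NotFoundHandler(),
		TLSConfig: &tls.Config{GetCertificate: reloader.getCertificate},
	}
	log.Fatal(srv.ListenAndServeTLS("", ""))
}
```

Existing connections keep the certificate they negotiated; only new handshakes see the rotated one, so the server itself never has to restart.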

@github-actions

github-actions bot commented Dec 9, 2022

Hello contributors!

I am marking this issue as stale as it has not received any engagement from the community or maintainers for a year. That does not imply that the issue has no merit! If you feel strongly about this issue:

  • open a PR referencing and resolving the issue;
  • leave a comment on it and discuss ideas on how you could contribute towards resolving it;
  • leave a comment and describe in detail why this issue is critical for your use case;
  • open a new issue with updated details and a plan for resolving the issue.

Throughout its lifetime, Ory has received over 10,000 issues and PRs. To sustain that growth, we need to prioritize and focus on issues that are important to the community. A good indication of importance, and thus priority, is activity on a topic.

Unfortunately, burnout has become a topic of concern amongst open-source projects.

It can lead to severe personal and health issues as well as opening catastrophic attack vectors.

The motivation for this automation is to help prioritize issues in the backlog and not ignore, reject, or belittle anyone.

If this issue was marked as stale erroneously you can exempt it by adding the backlog label, assigning someone, or setting a milestone for it.

Thank you for your understanding and to anyone who participated in the conversation! And as written above, please do participate in the conversation if this topic is important to you!

Thank you 🙏✌️

github-actions bot added the stale (Feedback from one or more authors is required to proceed) label Dec 9, 2022
github-actions bot closed this as completed Jan 9, 2023