Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor cert handling #9463

Merged
merged 7 commits into from
Dec 22, 2023
Merged

Conversation

katheris
Copy link
Contributor

Type of change

  • Refactoring

Description

Tidy up certificate code:

  • Remove duplicated logic to create Map for Secret data
  • Remove certificate logic from ModelUtils class

This PR contributes to #5630. I have created this as a separate PR to make it easier to review, this change reduces the number of places that are handling certificates to make it easier to remove the dependence on Kubernetes Secrets in a future PR.

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

@katheris katheris marked this pull request as ready for review December 14, 2023 18:00
Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The certificate-manager module has nothing to do with Secrets. It is only responsible for issuing the certificates. So I'm not sure if we should put any logic related to the Secrets into it. I'm surprised it even builds, because it IMHO should not have any dependency on Fabric8 and should therefore not know the Secret class.

@@ -25,6 +25,117 @@
*/
public class SecretCertProvider {

/**
* A certificate entry in a ConfigMap. Each entry contains an entry name and data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* A certificate entry in a ConfigMap. Each entry contains an entry name and data.
* A certificate entry in a Secret. Each entry contains an entry name and data.

@scholzj scholzj added this to the 0.40.0 milestone Dec 15, 2023
@katheris
Copy link
Contributor Author

The certificate-manager module has nothing to do with Secrets. It is only responsible for issuing the certificates. So I'm not sure if we should put any logic related to the Secrets into it. I'm surprised it even builds, because it IMHO should not have any dependency on Fabric8 and should therefore not know the Secret class.

The SecretCertProvider class was already in the certificate-manager module and contained methods to create Secrets for certificates so I moved the methods there rather than creating another separate class that deals with certificate Secrets.

It looks like the addSecret methods in that class are currently used from the Ca class which is in operator-common, so the other methods in SecretCertProvider that are there already can't be moved into cluster-operator without a refactor of the Ca class. However I could move the new methods that I am adding to SecretCertProvider to a new class in cluster-operator?

In a follow-up PR I think it would be good to remove any mention of Secrets from the Ca, but I am hesitant to do that as part of this PR as the bigger the refactor the harder it is to review. I did wonder though if as part of that refactor the best outcome would be to have the cluster-operator and user-operator modules have their own separate logic for storing certificates in Secrets, rather than having any of that in operator-common. But I figured that change probably needed an agreed proposal first, what do you think?

@scholzj
Copy link
Member

scholzj commented Dec 18, 2023

The certificate-manager module has nothing to do with Secrets. It is only responsible for issuing the certificates. So I'm not sure if we should put any logic related to the Secrets into it. I'm surprised it even builds, because it IMHO should not have any dependency on Fabric8 and should therefore not know the Secret class.

The SecretCertProvider class was already in the certificate-manager module and contained methods to create Secrets for certificates so I moved the methods there rather than creating another separate class that deals with certificate Secrets.

It looks like the addSecret methods in that class are currently used from the Ca class which is in operator-common, so the other methods in SecretCertProvider that are there already can't be moved into cluster-operator without a refactor of the Ca class. However I could move the new methods that I am adding to SecretCertProvider to a new class in cluster-operator?

In a follow-up PR I think it would be good to remove any mention of Secrets from the Ca, but I am hesitant to do that as part of this PR as the bigger the refactor the harder it is to review. I did wonder though if as part of that refactor the best outcome would be to have the cluster-operator and user-operator modules have their own separate logic for storing certificates in Secrets, rather than having any of that in operator-common. But I figured that change probably needed an agreed proposal first, what do you think?

That is a good point. But maybe the best would be to move the SecretCertProvider to operator-common rather than add to it? (It also seems to have its own issues as half of the methods in that class are used only by its test, so they probably should not be there at all) WDYT?

@scholzj
Copy link
Member

scholzj commented Dec 18, 2023

It might be also worth thinking about whether all of these things are really needed. SecretEntry seems like a weird thing used only in some parts of the project that can be easily replaced by a set of constants.

@katheris
Copy link
Contributor Author

That is a good point. But maybe the best would be to move the SecretCertProvider to operator-common rather than add to it? (It also seems to have its own issues as half of the methods in that class are used only by its test, so they probably should not be there at all) WDYT?

That's a good idea. Yeah, I did also notice that. It seemed like this whole module was added in a commit to introduce a separate certificate module, but the commit message implies that there might be follow up changes, but I didn't see any. @ppatierno do you know if there were other changes expected that would use these methods that are currently only used in test? If not then I think we should definitely clean up the methods that aren't being used.

@katheris
Copy link
Contributor Author

katheris commented Dec 18, 2023

It might be also worth thinking about whether all of these things are really needed. SecretEntry seems like a weird thing used only in some parts of the project that can be easily replaced by a set of constants.

My hope is to get to a point that we only have .p12 etc extensions in a single place, rather than constructed differently everywhere, but when I started looking through the changes needed it made sense to do this as a separate PR as there are so many different uses of it. I expect SecretEntry will get replaced with private constants only used by a single class in the end.

* Change ConfigMap to Secret in Javadoc

Signed-off-by: Katherine Stanley <[email protected]>
Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left bunch of comments. But honestly, I still don't get how you decided what should be moved to SecretCertProvider and what to CertUtils. They seem to originate from cluster-operator (at least those I checked - apart from SecretEntry) as well. So why not move them to CertUtils and keep them in the same module instead of moving it into module that is shared more widely into a class that does not seem to fit the module anyway?

/**
* @return the generation of the current CA certificate as an annotation
*/
public Map<String, String> caCertGenerationFullAnnotation() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I think we use to do Map.of() in most of the new code.
  • Should it return Map.Entry if it is a singular by name?
  • Why is this used from Cruise Control, ZooKeeper and then from CertUtils?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated to change it to Map.Entry. At the moment all of the clustered components (Cruise Control, ZooKeeper, Kafka) call ModelUtils directly to create certificate Secrets. The other components with a single pod call buildTrustedCertificateSecret which is in CertUtils. So that's why there are call from the different places.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do all the tests you modified here really test the ModelUtils? Or do they now test the CertUtils?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are certificate renewal tests, so not explicitly testing either ModelUtils or CertUtils

@katheris
Copy link
Contributor Author

I left bunch of comments. But honestly, I still don't get how you decided what should be moved to SecretCertProvider and what to CertUtils. They seem to originate from cluster-operator (at least those I checked - apart from SecretEntry) as well. So why not move them to CertUtils and keep them in the same module instead of moving it into module that is shared more widely into a class that does not seem to fit the module anyway?

I was trying to keep a separation between the classes that need to understand the structure of how certificates are stored in Kube Secrets and those that don't. However, I think you're right that sharing those methods more widely as a result is not the right choice. I think for this PR keeping them in CertUtils makes sense, and then I can hopefully create the better separation in a follow-up. I'll make some changes

@scholzj
Copy link
Member

scholzj commented Dec 19, 2023

Are the CI failures related to the PR? Especially these ones?

[ERROR] Errors: 
[ERROR] io.strimzi.operator.cluster.operator.assembly.CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewal
[ERROR]   Run 1: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewal:109 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 2: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewal:109 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 3: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewal:109 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 4: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewal:109 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 5: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewal:109 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 6: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewal:109 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[INFO] 
[ERROR] io.strimzi.operator.cluster.operator.assembly.CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewalOutsideOfMaintenanceWindow
[ERROR]   Run 1: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewalOutsideOfMaintenanceWindow:142 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 2: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewalOutsideOfMaintenanceWindow:142 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 3: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewalOutsideOfMaintenanceWindow:142 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 4: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewalOutsideOfMaintenanceWindow:142 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 5: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewalOutsideOfMaintenanceWindow:142 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 6: CertificateRenewalTest.testRenewalOfDeploymentCertificatesDelayedRenewalOutsideOfMaintenanceWindow:142 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[INFO] 
[ERROR] io.strimzi.operator.cluster.operator.assembly.CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithNullSecret
[ERROR]   Run 1: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithNullSecret:43 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 2: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithNullSecret:43 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 3: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithNullSecret:43 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 4: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithNullSecret:43 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 5: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithNullSecret:43 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 6: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithNullSecret:43 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[INFO] 
[ERROR] io.strimzi.operator.cluster.operator.assembly.CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithRenewingCa
[ERROR]   Run 1: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithRenewingCa:76 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 2: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithRenewingCa:76 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 3: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithRenewingCa:76 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 4: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithRenewingCa:76 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 5: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithRenewingCa:76 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null
[ERROR]   Run 6: CertificateRenewalTest.testRenewalOfDeploymentCertificatesWithRenewingCa:76 » NullPointer Cannot invoke "java.util.Map$Entry.getKey()" because "entries[0]" is null

@katheris
Copy link
Contributor Author

Are the CI failures related to the PR? Especially these ones?

Yeah, looks like changing the Collections.SingletonMap to Map.of caused some failures related to mock methods, I've added the correct mocking to the tests.

Signed-off-by: Katherine Stanley <[email protected]>
@katheris
Copy link
Contributor Author

@scholzj I've added an extra commit that makes better use of the SecretEntry enum

@scholzj
Copy link
Member

scholzj commented Dec 20, 2023

/azp run regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ppatierno
Copy link
Member

@ppatierno do you know if there were other changes expected that would use these methods that are currently only used in test? If not then I think we should definitely clean up the methods that aren't being used.

It's hard to remember and tell. Maybe they were used before then not anymore and being left overs in the tests. I think that after so long time having this cert manager implementation working, we can clean up by removing these methods.

Copy link
Member

@ppatierno ppatierno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the refactoring!

@scholzj
Copy link
Member

scholzj commented Dec 21, 2023

/azp run regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@scholzj
Copy link
Member

scholzj commented Dec 21, 2023

/azp run kraft-regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@scholzj scholzj merged commit e57742e into strimzi:main Dec 22, 2023
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants