Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLBC] Better certificate handling #639

Merged
merged 6 commits into from
Apr 26, 2017

Conversation

nicksardo
Copy link
Contributor

Fixes #609
Current Behavior:
On first sync after start, the controller asserts a certificate with the primary name, and possibly orphans a certificate with the secondary name. On next certificate update, a collision occurs.

New Behavior:
Due to the orphaned certificate, some users have two certificates (one with -1 and one without). Most likely the primary-named certificate is being used, but we can't make that assumption. Therefore, if the controller doesn't know what certificate is being used (on first sync), the code will now inspect the Target HTTPS Proxy for the existing ssl certificate. It'll continue using that certificate until it finds a change in the secret. If users have two certificates, the controller will attempt deletion of any certificate using the desired name.

The PR also fixes a bug when transitioning from a pre-shared certificate to a secret-based certificate. The controller deletes the pre-shared certificate when it really should ignore it.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 21, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@nicksardo
Copy link
Contributor Author

cc @tonglil I'm betting you haven't noticed this pre-shared cert bug, but better for you to know in advance.

@nicksardo nicksardo changed the title [GCLB] Better certificate handling [GLBC] Better certificate handling Apr 21, 2017
@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 46.42% when pulling 8f0f2b780ed511c624ff30fa5ecff744d8cfd2d3 on nicksardo:gce-ssl-name-fix into 4bd0f49 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 46.412% when pulling 8f0f2b780ed511c624ff30fa5ecff744d8cfd2d3 on nicksardo:gce-ssl-name-fix into 4bd0f49 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.8%) to 46.973% when pulling 748b5c5 on nicksardo:gce-ssl-name-fix into 4bd0f49 on kubernetes:master.

Copy link
Member

@bowei bowei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic is good -- suggested taking the code blocks and making them self contained functions for readability


// Retrieve known ssl certificate
if expectedCertName != "" {
l.sslCert, _ = l.cloud.GetSslCertificate(expectedCertName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

document why ignoring error here..,

// Use the named GCE cert when it is specified by the annotation.
if certName != "" {
if preSharedCertName != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this block to its own method func (l *L7) usePreSharedCert()

(you can most likely come up with a better name)

@@ -384,32 +408,28 @@ func (l *L7) checkSSLCert() (err error) {
// TODO: Clean this code up into a ring buffer.
primaryCertName := l.namer.Truncate(fmt.Sprintf("%v-%v", sslCertPrefix, l.Name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see below, move this code to func (l) nextCertificateName()

l.sslCert = cert
return nil
}

// Handle secret-created cert
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this block of code to own func (l) populateSSLCert()

} else {
certName = primaryCertName
if l.sslCert == nil || ingCert != l.sslCert.Certificate {
newCertName := primaryCertName
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make a new method nextCertficateName(), it'll make this logic easier to follow

newCertName := l.nextCertificateName()

func (l*L7) nextCertificateName() string {
  primaryCertName := ...
  ...
  if l.sslCert != nil && ...
}

// Perform a delete in case a certificate exists with the exact name
// This certificate should be unused since we check the target proxy's certificate prior
// to this point. Although, it's possible an actor pointed a target proxy to this certificate.
if err = l.cloud.DeleteSslCertificate(newCertName); err != nil && !utils.IsHTTPErrorCode(err, http.StatusNotFound) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a little tangential to thie review, but a lot of the StatusNotFound if checks can be made more succinct by

func ignoreHTTPNotFound(err error) error {
  if err != nil && utils.IsHTTPErrorCode(err, http.StatusNotFound) { return nil }
  return err
}

then all of these sites will be

if err = ignoreHTTPNotFound(l.cloud.DeleteSslCertificate(newCertName)); err != nil {
}

return fmt.Errorf("unable to delete ssl certificate with name %q, expected it to be unused. err: %v", newCertName, err)
}

glog.Infof("Creating new sslCertificates %v for %v", newCertName, l.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not plural? .V(2)

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.122% when pulling d592e3a225d0c9577c320bc40f378373b183b1f8 on nicksardo:gce-ssl-name-fix into 4bd0f49 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.107% when pulling d592e3a225d0c9577c320bc40f378373b183b1f8 on nicksardo:gce-ssl-name-fix into 4bd0f49 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.107% when pulling d592e3a225d0c9577c320bc40f378373b183b1f8 on nicksardo:gce-ssl-name-fix into 4bd0f49 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.126% when pulling f704d81 on nicksardo:gce-ssl-name-fix into 7deb1a2 on kubernetes:master.

}
l.fws = nil
Copy link
Contributor Author

@nicksardo nicksardo Apr 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this line was in the wrong block, so I moved it out.

}
l.ip = nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l.ip is a bit special since it can be a shared resource among ingresses. We want to continue other resource cleanup even if this fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the controller should only delete the IP if it was reserved by the controller. Might add a name-prefix check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, l.ip is never populated with a user-specified static IP.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.157% when pulling b8f81c0 on nicksardo:gce-ssl-name-fix into 7deb1a2 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.119% when pulling b8f81c0 on nicksardo:gce-ssl-name-fix into 7deb1a2 on kubernetes:master.

@bowei
Copy link
Member

bowei commented Apr 26, 2017

@bowei

Copy link
Member

@bowei bowei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nicksardo nicksardo merged commit cd3e546 into kubernetes:master Apr 26, 2017
@nicksardo nicksardo deleted the gce-ssl-name-fix branch April 26, 2017 18:15
@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.157% when pulling b2e5d44 on nicksardo:gce-ssl-name-fix into 7deb1a2 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.157% when pulling b2e5d44 on nicksardo:gce-ssl-name-fix into 7deb1a2 on kubernetes:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.9%) to 47.15% when pulling b2e5d44 on nicksardo:gce-ssl-name-fix into 7deb1a2 on kubernetes:master.

@tonglil
Copy link
Contributor

tonglil commented Apr 28, 2017

Thank you @nicksardo for notifying & fixing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GLBC: k8s 1.6 fails to replace certificate
6 participants