-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for configuring TLS ServerName for health checks #9475
Conversation
bfaa0a2
to
c69c85e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR!
I think this change looks good, and the implementation looks both correct and complete. Nice work!
The only concern I have with this change is the refactor in http_decode_test.go
. I appreciate the desire to reduce some of the boilerplate here, but I think these tests are already a quite difficult to understand, and with the reduced boilerplate I think unfortunately it makes then even harder to understand.
I suspect another way we can approach these tests is by defining two JSON inputs that set all the fields, one with all CamelCase keys, one with all snake_case keys, and use those two inputs to test the permutations. There will need to be one other test case for the one field that has more then 2 options, but that's probably ok.
I don't think we need to make that change in this PR, but I think it would be good to use the existing test structure until we can make that change. Alternatively if you wanted to try out that approach first and then introduce this additional field, that would be great as well.
We could also start that approach with only the new field being added, and later on come back and add the other fields as we remove the old test cases.
What do you think about that approach?
if !c.enableAgentTLSForChecks() { | ||
return &tls.Config{ | ||
InsecureSkipVerify: skipVerify, | ||
ServerName: serverName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a change in behaviour here where serviceName
will be defaulted to the server or node name, where as previously it was not. I'm not sure if this is likely to cause problems or not.
Was this change intentional or incidental? I think if this part of the change isn't required it would be good to maintain the previous behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did make this change intentionally: the client may not care to verify which cert the server is returning, but the server may still need the SNI to determine which cert to use.
I'm not sure if this is likely to cause problems or not.
I don't have a good sense of this either. I'd argue my change is good for consistency of SNI handling, but it's not required for my use case and I can revert it if we're worried about the behavior change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look into this some more to see if I can figure out how most servers handle this field. This change may very well be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JFTR, this broke the use-case of using Vault as a CA for Nomad services.
The service health checks used to verify the certificate against the service IP so we put it into ip_sans
. As of Consul 1.10 it checks against the unqualified hostname so they all got a mismatch and I had to change all my Nomad plans to use $alt_names := env "attr.unique.consul.name" | printf "alt_names=%s" -}}
and re-deploy.
I'm surprised this didn't break more people's clusters but everybody is probably running with TLS verification off anyways. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inclusion of the Consul server/ node name in the ServerName field of a client TLS config used exclusively for health checking doesn't appear to be a desirable behavior. I've opened #17481 to fix this and provide an explanation.
Yep that's fair, I was just optimizing for future authors adding new fields. I'll revert my changes and take a look at either implementing your suggestion or just carrying forward the existing copy/paste. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, thank you for working on this!
@@ -188,6 +188,11 @@ The table below shows this endpoint's support for | |||
The value can be further limited for all checks of a given agent using the | |||
`check_output_max_size` flag in the agent. | |||
|
|||
- `TLSServerName` `(string: "")` - Specifies an optional string used to set the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These docs looks good. I think we also need to document this in website/content/docs/discovery/checks.mdx
, which is linked from the config docs.
c69c85e
to
7db2474
Compare
@cbroglie is attempting to deploy a commit to the HashiCorp Team on Vercel. A member of the Team first needs to authorize it. |
cb05fa6
to
c981f49
Compare
@dnephin I believe I've made all the requested changes and rebased, let me know if anything is missing |
c981f49
to
c633249
Compare
Some TLS servers require SNI, but the Golang HTTP client doesn't include it in the ClientHello when connecting to an IP address. This change adds a new TLSServerName field to health check definitions to optionally set it. This fixes hashicorp#9473.
c633249
to
f0307c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! This looks good to me.
I noticed that CI was not running on this PR. I believe that is because you have CircleCI enabled on your fork, so the jobs are running as part of your project and for some reason are not being linked back to github.
Once https://app.circleci.com/pipelines/github/cbroglie/consul looks good, I think this is ready to merge.
🍒 If backport labels were added before merging, cherry-picking will start automatically. To retroactively trigger a backport after merging, add backport labels and re-run https://circleci.com/gh/hashicorp/consul/338341. |
Thanks! |
Some TLS servers require SNI, but the Golang HTTP client doesn't
include it in the ClientHello when connecting to an IP address. This
change adds a new TLSServerName field to health check definitions to
optionally set it. This fixes #9473.