Make TLS version configurable #540

Closed
chrishenzie opened this issue Oct 1, 2021 · 10 comments
Labels
feature New feature or request

Comments

@chrishenzie
Contributor

chrishenzie commented Oct 1, 2021

The ZAPI client for Harvest appears to use TLS 1.0 for its requests.

From the TLS config documentation:

	// MinVersion contains the minimum TLS version that is acceptable.
	// If zero, TLS 1.0 is currently taken as the minimum.
	MinVersion uint16

NetApp ONTAP clusters with FIPS mode enabled require a minimum TLS version of 1.2, which causes Harvest to fail with the following error:

5:56PM ERR goharvest2/cmd/collectors/zapiperf/zapiperf.go:1170 > instance request error="connection error => Post <URL>: remote error: tls: internal error"

It would be nice to make the minimum TLS version configurable, or to bump the default so that FIPS mode is supported.

CC @Jiawei0227

@chrishenzie chrishenzie added the feature New feature or request label Oct 1, 2021
@cgrinds
Collaborator

cgrinds commented Oct 1, 2021

Thanks for the bug report, @chrishenzie. You linked to the certificate_auth code path; is there a desire to fix this for the basic_auth path too? I'm inclined to bump the default, but I need to check how this impacts 7-mode filers first.

@chrishenzie
Contributor Author

Yes, I think fixing both paths is preferable.

@cgrinds
Collaborator

cgrinds commented Oct 1, 2021

Thanks! As suspected, using

TLSClientConfig: &tls.Config{
    MinVersion: tls.VersionTLS12,
}

fails when connecting to 7-mode systems.

connection error => connection error => Post "https://10.65.59.32:443/servlets/netapp.servlets.admin.XMLrequest_filer":
tls: server selected unsupported protocol version 301 Poller=v-7
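The 301 in the error above appears to be the protocol version printed in hexadecimal; 0x0301 is the wire value of TLS 1.0, which is consistent with a 7-mode system that cannot negotiate TLS 1.2. A small sketch that decodes such a code follows; the helper name decodeVersion is hypothetical:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"strconv"
)

// decodeVersion turns the hexadecimal version code from a TLS error message
// (for example "301") into a readable protocol name.
func decodeVersion(hexCode string) (string, error) {
	n, err := strconv.ParseUint(hexCode, 16, 16)
	if err != nil {
		return "", err
	}
	switch uint16(n) {
	case tls.VersionTLS10:
		return "TLS 1.0", nil
	case tls.VersionTLS11:
		return "TLS 1.1", nil
	case tls.VersionTLS12:
		return "TLS 1.2", nil
	case tls.VersionTLS13:
		return "TLS 1.3", nil
	}
	return "", fmt.Errorf("unknown TLS version code %q", hexCode)
}

func main() {
	name, err := decodeVersion("301")
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```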

@chrishenzie
Contributor Author

chrishenzie commented Oct 1, 2021

@cgrinds Thanks for the investigation!

More interesting behavior: when we updated our cluster from 9.8 to 9.9.1, this issue went away (i.e. API calls succeeded without setting a min or max TLS version). I'm not familiar with this area, but maybe there was a TLS version negotiation problem that was fixed in the more recent version?

@cgrinds
Collaborator

cgrinds commented Oct 2, 2021

That is interesting. What Harvest is doing now looks correct: MinVersion is zero, which means it defaults to TLS 1.0, and MaxVersion is zero, which defaults to TLS 1.3.

When negotiating with a FIPS-enabled server, Harvest and the server should settle on a version that both support. I tried this on a cluster with NetApp Release 9.8P2: Tue Feb 16 03:49:46 UTC 2021 and everything worked as expected when I enabled FIPS via security config modify -interface SSL -is-fips-enabled true. I'll see if I can find anything about FIPS or TLS in the ONTAP changelogs.

Do you know the exact version of 9.8 you were using?

@Jiawei0227

We are using NetApp Release 9.8P1. Maybe that's related? We use the Ansible module to turn on FIPS mode:

- name: Enable FIPS mode
  na_ontap_security_config:
    name: ssl
    is_fips_enabled: true
    <<: *ontap

It should be equivalent to running security config modify -interface SSL -is-fips-enabled true. After that, we suddenly started getting errors like "remote error: tls: internal error".

We switched to 9.9.1 release and the issue went away.

@cgrinds
Collaborator

cgrinds commented Oct 3, 2021

Thanks for the details @Jiawei0227

Yes, the Ansible block you pasted is equivalent to the shell commands.

Digging through the ONTAP changelogs, the closest FIPS-related TLS issue I've found so far is

  • 1355645 which is a corrupted sshd_config due to invalid key exchange algorithms

This issue was fixed in 9.8P3 and 9.9.1 (and earlier releases).
I'm going to try to find a 9.8P1 FIPS system and reproduce the problem to verify.

In summary, it sounds like you're unblocked and Harvest is fine from a TLS FIPS perspective.

@cgrinds
Collaborator

cgrinds commented Oct 4, 2021

@chrishenzie and @Jiawei0227 I found a 9.8 system (NetApp Release Dirtwolf__9.8.0: Wed Jan 20 19:00:08 UTC 2021) and enabled FIPS. With FIPS enabled, Harvest collects metrics from that cluster without issue.

curl -s 'http://localhost:13002/metrics' | wc -l
    2622

That supports the hypothesis that your system had a corrupted sshd_config. Do you have another system that hasn't been upgraded yet and exhibits the problem? If so, we can try to get more information there; otherwise I'm inclined to close this. Thoughts?

@chrishenzie
Contributor Author

@cgrinds I think both systems we are working on have since been upgraded. You can close and we can revisit this in the future should it happen again. Thank you for looking into this!

@cgrinds
Collaborator

cgrinds commented Oct 6, 2021

Sounds good. Feel free to reopen if you see it again.
