Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Adds SSL mutual auth support to intra-cluster requests #5482

Merged

Conversation

AlanConfluent
Copy link
Member

@AlanConfluent AlanConfluent commented May 26, 2020

Description

Allows for the configuration of SSL mutual auth for the internal listener ksql.internal.listener. KSQL then considers the request to be done by a SystemUser and considers it authenticated.

This is done by supporting a separate key store and trust store in the server for internal requests, and using these when issuing inter-node requests.

This way, the main ssl configurations can be used for the public API (e.g. https://external.example.com) and new internal configs can be used to set the ssl configuration for each node (e.g. https://node-1.internal.example.com). This way each node can have an internal identity, which is required to verify the certificates for internal requests.

Testing done

Wrote a functional test that uses https on both the listeners, each with a different certificate, as well as basic auth.

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

Copy link
Member

@JimGalasyn JimGalasyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with a few suggestions.

Copy link
Contributor

@purplefox purplefox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

import org.mockito.junit.MockitoJUnitRunner;

@RunWith(MockitoJUnitRunner.class)
public class SystemAuthenticationHandlerTest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of fine grained unit tests like these - they end up being very brittle and can provide a false sense of security. Would much rather see more coarse grained tests :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit tests are often somewhat brittle for this reason, but the benefit is that you can test every code path.

I think this is fairly in line with the codebase. What would you change to make it coarser?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit incongruous to add a fine grained mock test for this handler - the other handlers don't have this and that's been the approach we've taken with the new API. Instead we use much more coarse grained test which test the code as it would actually be used in a real setup. E.g. take a look at AuthTest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike full blown integration tests these kinds of tests run really fast, so you should be able to write enough tests that exercise all the code paths, but unlike fine grained unit tests, they run exactly as they would in the real system, and we're not testing the implementation directly thus allowing it to evolve and be refactored much more easily.

BTW... I'm well known for my opinion on fine grained unit tests (you should see my Twitter conversations), and I'm currently thinking of writing a blog post on them and why I think this approach - mainly championed by "TDD" has been super damaging in our industry ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following the pattern in ksqldb-rest-app/src/test/java/io/confluent/ksql/api/auth/... where other auth handers do this kind of test. It would seem a shame to delete my nice test.

I'm happen to try to write a few test cases in AuthTest to specifically test this, though it's likely going to be a bit complex since it requires setting up https. I'll give this a shot.

Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @AlanConfluent

If I'm honest I'm not sure we need all the SNI stuff, though it's not my area of expertise and I know you've spent some time looking into this.

SNI is new to me, but from my quick google it looks like SNI is related to virtual hosting, and seems completely unnecessary for ksqlDB. Have you enabled this to allow a ksqlDB node to use different ssl certs for its internal and external listeners? Do we think this is something users need?

I fear you may have implemented this off the back of talking with me - if that is the case, then my apologies. I seem to remember discussing something about needing a shadow set of config for SSL for the internal listener. However, thinking this through, each ksqlDB node only needs one identity, so only one ssl cert, which can be used for either the client facing endpoint, the internal endpoint, or both. Likewise, only a single trust store would be needed, (or no trust store at all), which would include the CAs / certs the node should trust. Then, as you have done, we'd need two separate 'client auth mode' settings to control if mutual auth is required for the client facing and internal endpoints.

Maybe you can understand your thinking?

The other main point I'd like to raise is that this PR adds complexity to the internal routing of requests as you've only added support for inter-node mutual-auth for some of the inter-node communication.
All inter-node communication should be covered by the enhancement you're making! If that means moving proxied pull query requests or scattered show queries requests onto a internal specific endpoint so that we can separate them from client requests, then we should do that. It sounds like a good long term design to clearly separate the two anyway.

Finally, there's a lot of code changes to existing integration tests in this PR. Why is this? I wouldn't have expected those tests to have to change at all. Is this change because of the SNI and/or the fact that we now need two clients for internal communication?

Personally, I'd strongly suggest raising a separate PR to migrate the internal requests to a separate endpoint and merging that first, so that this PR's scope is reduced, and removing the SNI and related code e.g. HostResolver, (assuming my gut is right and this isn't needed) - this will then leave a much smaller scoped and targeted PR.

And once again, my apologies if you thought this was the route to go after talking to me.

Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, didn't mean to approve - meant to block!

I'd like to review any changes you make, but if I'm unavailable and If you've clean up the internal communication to all use mutual auth, and if you've remove the SNI stuff, then you can consider my block removed.

@AlanConfluent
Copy link
Member Author

SNI is new to me, but from my quick google it looks like SNI is related to virtual hosting, and seems completely unnecessary for ksqlDB. Have you enabled this to allow a ksqlDB node to use different ssl certs for its internal and external listeners? Do we think this is something users need?

From my understanding, it enables the serving of multiple different identities/certificates, depending on the host name requested. It may get most use in virtual hosting, but my intention was to have both an external and an internal identity. The reason has to do with how they're requested and verified.

Externally, some user may request https://external.example.com and the server returns a cert which has the name "external.example.com". This match is what allows the client to verify the identity.

If internally, we have a different hostname, say https://internal.example.com, and it returns a cert with the name "external.example.com", verification is not possible. With SNI enabled (with two identities in the key store), the server matches the requested hostname with certs in its store and is able to return one for "internal.example.com" instead. Of course, we can avoid entirely using SNI by just configuring the internal listener with a separate key store with a cert for "internal.example.com".

However, thinking this through, each ksqlDB node only needs one identity, so only one ssl cert, which can be used for either the client facing endpoint, the internal endpoint, or both. Likewise, only a single trust store would be needed, (or no trust store at all), which would include the CAs / certs the node should trust. Then, as you have done, we'd need two separate 'client auth mode' settings to control if mutual auth is required for the client facing and internal endpoints.

If we want just one identity, the externally facing identity, it makes the verification on the client more complex. Also, it likely prevents you from having different identities for each node in your cluster (since you probably have just one externally facing cert). On the plus side, any KSQL currently configured to do https externally can just turn this on. We have a couple options to do this:

  • Route the internal requests externally using the external hostname, just as if it was an external client. This breaks the notion of a network protected internal listener since that interface must now be exposed. Also, any efficiencies that might have been gained may be lost since it would likely mean routing the request outside of the local network, and back in again. Also, it might be hard to differentiate between a trusted, mutually authed client, and an internal cluster node.
  • Not sure if you noticed the HostAliasResolver interface I added (for testing!), but it effectively allows you to make aliases for a hostname such that you could say I want to request host "external.example.com" using a SocketAddress referring to https://internal.example.com:internap_port. This would allow requests to the internal address, but would compare the cert to "external.example.com". The only downside is it adds complexity to the requests since there's a notion of recipient identity vs actual hostname.

Between these, the second is fairly straightforward to do. It would only require configuring internal clients correctly.

Do you see why I had originally thought to introduce two identities?

The other main point I'd like to raise is that this PR adds complexity to the internal routing of requests as you've only added support for inter-node mutual-auth for some of the inter-node communication.
All inter-node communication should be covered by the enhancement you're making! If that means moving proxied pull query requests or scattered show queries requests onto a internal specific endpoint so that we can separate them from client requests, then we should do that. It sounds like a good long term design to clearly separate the two anyway.

That's a fair comment. I considered that I might do this in two parts to reduce complexity by only using this on system endpoints first. But it's very easy to it for all internal calls on the internal listener. Since proxied pull query requests or scattered show queries are exposed separately on the internal listener, they are already separated, even if not in code organization.

I had originally had two separate Verticles, one with external, and the other internal, but I was told by @purplefox that in Vert.x this is an anti-pattern, so we have one Verticle installed in two places.

Finally, there's a lot of code changes to existing integration tests in this PR. Why is this? I wouldn't have expected those tests to have to change at all. Is this change because of the SNI and/or the fact that we now need two clients for internal communication?

I believe that most of the changes I made were in the new test I added. I added a bunch of helper functions that are necessary to do https requests in tests using HostAliasResolver. These were to test with certain certs even though the requests are going to localhost.

@AlanConfluent
Copy link
Member Author

@big-andy-coates I've addressed your comments. I switched from using SNI to just using a separate key store and trust store. It's simpler that way and also lets me control which identity is being sent from the client, which the Vert.x client api didn't otherwise allow.

@purplefox purplefox self-requested a review May 28, 2020 09:32
Copy link
Contributor

@purplefox purplefox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Alan,

A couple of questions/comments:

  1. Some of our users will be running a cluster in a completely secured environment with no public access and they don't require any kind of auth - does this solution allow all the internal endpoints to be used without auth requirement? (This shouldn't be the default, but needs to be supported).
  2. Do we now require executing inter node pull query requests over TLS? If so, have we you considered the performance implications? TLS handshake can have twice the network latency of a non TLS handshake. If this is the case we should consider prioritising reusing of connections for inter node pull query requests.

import org.mockito.junit.MockitoJUnitRunner;

@RunWith(MockitoJUnitRunner.class)
public class SystemAuthenticationHandlerTest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit incongruous to add a fine grained mock test for this handler - the other handlers don't have this and that's been the approach we've taken with the new API. Instead we use much more coarse grained test which test the code as it would actually be used in a real setup. E.g. take a look at AuthTest.

import org.mockito.junit.MockitoJUnitRunner;

@RunWith(MockitoJUnitRunner.class)
public class SystemAuthenticationHandlerTest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike full blown integration tests these kinds of tests run really fast, so you should be able to write enough tests that exercise all the code paths, but unlike fine grained unit tests, they run exactly as they would in the real system, and we're not testing the implementation directly thus allowing it to evolve and be refactored much more easily.

BTW... I'm well known for my opinion on fine grained unit tests (you should see my Twitter conversations), and I'm currently thinking of writing a blog post on them and why I think this approach - mainly championed by "TDD" has been super damaging in our industry ;)

@AlanConfluent
Copy link
Member Author

Hey Alan,

A couple of questions/comments:

  1. Some of our users will be running a cluster in a completely secured environment with no public access and they don't require any kind of auth - does this solution allow all the internal endpoints to be used without auth requirement? (This shouldn't be the default, but needs to be supported).

This works fine with no auth. If you don't set ksql.internal.ssl.client.authentication, it defaults to NONE and will fall back on other auth used in the system, which is to say that if you have jaas or the security plugin, it will disallow internal endpoints (though you can explicitly whitelist them if you want). If you have no auth configured, it will require no auth.

  1. Do we now require executing inter node pull query requests over TLS? If so, have we you considered the performance implications? TLS handshake can have twice the network latency of a non TLS handshake. If this is the case we should consider prioritising reusing of connections for inter node pull query requests.

For forwarded pull queries, yes, it would be over TLS. I do think that connection pooling would be required to enable if performance was a big concern.

@AlanConfluent
Copy link
Member Author

@big-andy-coates We had a conversation the other day offline and I think we're on the same page with respect to using two certificates.

Also, I was able to figure out how to use an alias to choose the certificate we want. While Vert.x doesn't directly support it, I wrote a quick utility that loads the keystore in memory, grabs the cert/key with the given alias, and creates an in memory single-entry version that can be passed to Vert.x. This way, we just have a config for the alias we want to use, if any and it all reads from the same keystore. It's a better fit for us than SNI since that allows for serving different certs from the same listener, which we don't need.

@purplefox
Copy link
Contributor

Also, I was able to figure out how to use an alias to choose the certificate we want. While Vert.x doesn't directly support it, I wrote a quick utility that loads the keystore in memory, grabs the cert/key with the given alias, and creates an in memory single-entry version that can be passed to Vert.x. This way, we just have a config for the alias we want to use, if any and it all reads from the same keystore. It's a better fit for us than SNI since that allows for serving different certs from the same listener, which we don't need.

Glad to hear that approach worked :)

Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good @AlanConfluent

I've left a fair few nits and suggestions which you may want to look at. There's also a resource leak highlighted.

Aside from that the docs could do with beefing up. I know engineers don't like writing docs, but our users also hate trying to work out how to configure security! ;). So the clearer the docs and the more worked examples the better. I also think this will help ensure this all works as expected in the different configurations. Adding integration tests that also test each configuration is a good idea too, (until such time that we can have tests that test the docs).

Finally, I think that the HostAliasResolver now peppered through the production code is a code smell. This is test-only code. I've noted an example of how you can remove this from one bit of the production code and it would be great if you can take a look and see how it can be fully removed. Happy to help if you get stuck.

Comment on lines 228 to 231
In addition to some of the HTTPS configurations above, your key store must now
contain not only the certificate/key for the endpoints exposed with `listeners`, but also
the additional certificate/key for your internal listener set with
`ksql.internal.listener`. If your internal certificate is self-signed, a trust store is required to contain certificates for nodes in your cluster.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This, and the example config below, seems to assume the users in securing the client facing endpoint via SSL as well. That may not be the case.

Also, the ksql.ssl.keystore.alias.xxx in the example config aren't explained anywhere.

Configuring this stuff is normally a pain point for users, so the more hand-holding and the better the examples we can provide the better. To that end, I would encourage adding different sections for :

  • Configuring ksqlDB to use SSL mutual auth for inter-node comms and SSL server auth for client comms. This section should explain about the aliases configs and how these relate to the keystores - I'd be tempted to give a step-by-step guide of how to create the necessary keystores and truststores, so that its really clear where in the process the user defines the alias they later reference in the ksql.ssl.keystore.alias.xxx configs.
  • Configuring ksqlDB to use SSL mutual auth for inter-node comms only, (which assumes some other auth method on the client facing comms).
  • Configuring ksqlDB to use jaas for external - does this require them to have specific paths in the exclude list for specific features to work? If it does, that's not really a workable solution going forward as that list will change. Users generally customize their properties files and then upgrade by updating the jars only - can we have some single setting that turns off auth on internal calls?
  • Configuring ksqlDB to use jaas for internal and external - if this is even possible?
  • Configuring ksqlDB to use a custom auth handler installed by the user - if this is even possible?

I know the later points above are outside the scope of this PR - but now that we've got this new internal listener config we need to make sure the documentation covers the different valid combinations of auth setup for the listeners and make sure they are easy to configure. Of course, you can update the config in a follow-on PR. But document it we should - otherwise we've done all the hard work, but users will likely struggle to use the feature!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to see if @JimGalasyn has the time to try and follow the steps to create a working secure two-node setup!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've improved the documentation here quite a lot.

This, and the example config below, seems to assume the users in securing the client facing endpoint via SSL as well. That may not be the case.

I removed that assumption. The first example is with internal only. I also give a second example that shows setting things up with both.

Also, the ksql.ssl.keystore.alias.xxx in the example config aren't explained anywhere.

I now give more elaborated explanations on it.

Configuring this stuff is normally a pain point for users, so the more hand-holding and the better the examples we can provide the better. To that end, I would encourage adding different sections for :

Configuring ksqlDB to use SSL mutual auth for inter-node comms and SSL server auth for client comms. This section should explain about the aliases configs and how these relate to the keystores - I'd be tempted to give a step-by-step guide of how to create the necessary keystores and truststores, so that its really clear where in the process the user defines the alias they later reference in the ksql.ssl.keystore.alias.xxx configs.

I added a whole section on how to generate the keystore and trust store. It seems like maybe this could live elsewhere, but it's right in the middle of things at the moment.

Configuring ksqlDB to use SSL mutual auth for inter-node comms only, (which assumes some other auth method on the client facing comms).

Added

Configuring ksqlDB to use jaas for external - does this require them to have specific paths in the exclude list for specific features to work? If it does, that's not really a workable solution going forward as that list will change.

Added.

This is what's used in one of my functional tests. It requires no exclude lists. That was one of the big changes I did was to have KSQL consider this request authenticated as a SystemUser.

Users generally customize their properties files and then upgrade by updating the jars only - can we have some single setting that turns off auth on internal calls?

I'm confused by what you mean. Do you want a flag that negates other auth settings and disables it? At the moment, ksql.internal.ssl.client.authentication=NONE works more or less that way. It may still use SSL on the internal connection, but it won't require verification on either side (server or client).

Configuring ksqlDB to use jaas for internal and external - if this is even possible?

Not possible at the moment for internal.

Configuring ksqlDB to use a custom auth handler installed by the user - if this is even possible?
I know the later points above are outside the scope of this PR - but now that we've got this new internal listener config we need to make sure the documentation covers the different valid combinations of auth setup for the listeners and make sure they are easy to configure. Of course, you can update the config in a follow-on PR. But document it we should - otherwise we've done all the hard work, but users will likely struggle to use the feature!

I added documentation for the above cases. I agree we should keep working on this and I'm open to doing followup work there.

@@ -54,6 +54,11 @@ public void handle(final RoutingContext routingContext) {
return;
}

if (SystemAuthenticationHandler.hasAuthorization(routingContext)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this just be:

Suggested change
if (SystemAuthenticationHandler.hasAuthorization(routingContext)) {
if (SystemAuthenticationHandler. isAuthenticatedAsSystemUser(routingContext)) {

Not sure the second hasAuthorization method adds anything here...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Done.

try {
input = new FileInputStream(keyStorePath);
final KeyStore keyStore = KeyStore.getInstance(KEYSTORE_TYPE);
keyStore.load(new FileInputStream(keyStorePath), pw);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

input stream leak...

Suggested change
keyStore.load(new FileInputStream(keyStorePath), pw);
keyStore.load(input, pw);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops. Good catch

final Optional<BasicCredentials> credentials,
final LocalProperties localProperties,
final HttpClientOptions httpClientOptions,
final Consumer<HttpClientOptions> sslHttpClientOptionsConsumer,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we passing in a consumer that is just called back with the httpClientOptions passed in? This feels like an anti-pattern to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's only used when SSL is used... I agree I don't like it. I'll instead pass in a Function<Boolean, HttpClientOptions> factory that takes in if it's ssl in place of this as well as the HttpClientOptions before it.

Comment on lines 87 to 88
final String aliasHost = hostAliasResolver.map(resolver -> resolver.resolve(server.getHost()))
.orElse(server.getHost());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than store a Optional<HostResolver>, why not default to a HostResolver impl that just returns the value passed? e.g. Function.identify(). This would simplify this code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I effectively did this with your BiFunction<Integer, String, SocketAddress> socketAddressFactory suggestion.


public KsqlClient(
final Map<String, String> clientProps,
final Optional<BasicCredentials> credentials,
final LocalProperties localProperties,
final HttpClientOptions httpClientOptions
final HttpClientOptions httpClientOptions,
final Optional<HostAliasResolver> hostAliasResolver
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a shame to require every use of this production class to supply a HostAliasResolver, given this should only be passed for test cases, right? And most test cases won't need this either. Just one or two specific tests that are testing the new code you're adding. Yet HostAliasResolver is mentioned around 40 times in this PR.

Is there another way?

At the very least you could remove this parameter from this constructor and add another constructor, annotated with @VisibleForTesting that takes the resolver. However, ideally, the production code should know nothing of HostAliasResolver...

I think this can be achieved by injecting an interface to handle the SocketAddress.inetSocketAddress call. So rather than taking a HostAliasResolver the @VisiableForTesting constructor takes BiFunction<Integer, String, SocketAddress> socketAddressFactory. Test code can then use this to perform the same aliasing that HostAliasResolver does now. Please see if you can remove HostAliasResolver from the production code base, or at least from the non- @VisibleForTesting constructors.

SocketAddress.inetSocketAddress(server.getPort(), aliasHost)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree I didn't really like having to add this, but you need some "seam" to inject the test version.

Nice suggestion. I did the factory as you mentioned. A named interface is definitely a little misleading for test-only code. Now I have a @VisibleForTesting constructor where I take it, and the other just uses the default SocketAddress::inetSocketAddress.

Copy link
Member Author

@AlanConfluent AlanConfluent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@big-andy-coates Thanks for the close review. I feel like this PR has improved a lot. I believe all of your comments have been addressed. Probably the main thing that could be improved upon (hopefully in a followup) is the documentation. Tell me what you think!

Comment on lines 228 to 231
In addition to some of the HTTPS configurations above, your key store must now
contain not only the certificate/key for the endpoints exposed with `listeners`, but also
the additional certificate/key for your internal listener set with
`ksql.internal.listener`. If your internal certificate is self-signed, a trust store is required to contain certificates for nodes in your cluster.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've improved the documentation here quite a lot.

This, and the example config below, seems to assume the users in securing the client facing endpoint via SSL as well. That may not be the case.

I removed that assumption. The first example is with internal only. I also give a second example that shows setting things up with both.

Also, the ksql.ssl.keystore.alias.xxx in the example config aren't explained anywhere.

I now give more elaborated explanations on it.

Configuring this stuff is normally a pain point for users, so the more hand-holding and the better the examples we can provide the better. To that end, I would encourage adding different sections for :

Configuring ksqlDB to use SSL mutual auth for inter-node comms and SSL server auth for client comms. This section should explain about the aliases configs and how these relate to the keystores - I'd be tempted to give a step-by-step guide of how to create the necessary keystores and truststores, so that its really clear where in the process the user defines the alias they later reference in the ksql.ssl.keystore.alias.xxx configs.

I added a whole section on how to generate the keystore and trust store. It seems like maybe this could live elsewhere, but it's right in the middle of things at the moment.

Configuring ksqlDB to use SSL mutual auth for inter-node comms only, (which assumes some other auth method on the client facing comms).

Added

Configuring ksqlDB to use jaas for external - does this require them to have specific paths in the exclude list for specific features to work? If it does, that's not really a workable solution going forward as that list will change.

Added.

This is what's used in one of my functional tests. It requires no exclude lists. That was one of the big changes I did was to have KSQL consider this request authenticated as a SystemUser.

Users generally customize their properties files and then upgrade by updating the jars only - can we have some single setting that turns off auth on internal calls?

I'm confused by what you mean. Do you want a flag that negates other auth settings and disables it? At the moment, ksql.internal.ssl.client.authentication=NONE works more or less that way. It may still use SSL on the internal connection, but it won't require verification on either side (server or client).

Configuring ksqlDB to use jaas for internal and external - if this is even possible?

Not possible at the moment for internal.

Configuring ksqlDB to use a custom auth handler installed by the user - if this is even possible?
I know the later points above are outside the scope of this PR - but now that we've got this new internal listener config we need to make sure the documentation covers the different valid combinations of auth setup for the listeners and make sure they are easy to configure. Of course, you can update the config in a follow-on PR. But document it we should - otherwise we've done all the hard work, but users will likely struggle to use the feature!

I added documentation for the above cases. I agree we should keep working on this and I'm open to doing followup work there.

@@ -54,6 +54,11 @@ public void handle(final RoutingContext routingContext) {
return;
}

if (SystemAuthenticationHandler.hasAuthorization(routingContext)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Done.

Comment on lines 69 to 73
// The requirements for being considered a system call on behalf of the SystemUser are that
// SSL mutual auth is in effect for the connection (meaning that the request is verified to be
// coming from a known set of servers in the cluster), and that it came on the internal
// listener interface, meaning that it's being done with the authorization of the system
// rather than directly on behalf of the user. Mutual auth is only enforced when SSL is used.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Moved to java docs.

try {
input = new FileInputStream(keyStorePath);
final KeyStore keyStore = KeyStore.getInstance(KEYSTORE_TYPE);
keyStore.load(new FileInputStream(keyStorePath), pw);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops. Good catch


public KsqlClient(
final Map<String, String> clientProps,
final Optional<BasicCredentials> credentials,
final LocalProperties localProperties,
final HttpClientOptions httpClientOptions
final HttpClientOptions httpClientOptions,
final Optional<HostAliasResolver> hostAliasResolver
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree I didn't really like having to add this, but you need some "seam" to inject the test version.

Nice suggestion. I did the factory as you mentioned. A named interface is definitely a little misleading for test-only code. Now I have a @VisibleForTesting constructor where I take it, and the other just uses the default SocketAddress::inetSocketAddress.

final Optional<BasicCredentials> credentials,
final LocalProperties localProperties,
final HttpClientOptions httpClientOptions,
final Consumer<HttpClientOptions> sslHttpClientOptionsConsumer,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's only used when SSL is used... I agree I don't like it. I'll instead pass in a Function<Boolean, HttpClientOptions> factory that takes in if it's ssl in place of this as well as the HttpClientOptions before it.

Comment on lines 87 to 88
final String aliasHost = hostAliasResolver.map(resolver -> resolver.resolve(server.getHost()))
.orElse(server.getHost());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I effectively did this with your BiFunction<Integer, String, SocketAddress> socketAddressFactory suggestion.

@AlanConfluent
Copy link
Member Author

@big-andy-coates You had wanted another review after the last. Please tell me if you're good with it.

Copy link
Contributor

@big-andy-coates big-andy-coates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @AlanConfluent

This is looking good!

Few nits and suggestions in the code. Defo worth fixing the hardcoded ports in the tests.

Aside from that, I still think the docs need more love. The new section you've added is much better. However, it could be integrated with the existing page better. Think about it from the perspective of a new user coming to the page and try and take them on a journey that gets them to the right part of the doc. Also, once you've happy with your doc changes, get Jim to have another look before committing.


ksqlDB supports securing inter-node communication using SSL mutual authentication.

For more information about configuring `ksql.internal.listener`, see [Configuring Listeners of a ksqlDB Cluster](index.html#configuring-listeners-of-a-ksqldb-cluster).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link to the markdown docs so that the links work in Github!

Suggested change
For more information about configuring `ksql.internal.listener`, see [Configuring Listeners of a ksqlDB Cluster](index.html#configuring-listeners-of-a-ksqldb-cluster).
For more information about configuring `ksql.internal.listener`, see [Configuring Listeners of a ksqlDB Cluster](index.md#configuring-listeners-of-a-ksqldb-cluster).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


### Using Authentication for Both the Internal and External Listener

Client facing basic HTTP authentication can be used alongside authentication for the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add recommendation about using SSL encryption for external listener if using plain-text BASIC http auth - or better still, just update the example to use SSL-client-auth.

The existing Configure ksqlDB for Basic HTTP Authentication section has such a recommendation. Equally this existing section could be updated to do-the-right-thing and have both BASIC and SSL client auth.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -219,6 +219,134 @@ credentials when starting the CLI by using the `--user` and
<ksql-install>bin/ksql --user fred --password letmein http://localhost:8088
```

Configure ksqlDB for Internal Authentication
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just had a look at what's already in this document and I think, rather than adding a brand new section, the details you're covering need to be integrated with what is already there.

Currently, the sections above cover:

  • "Configuring ksqlDB for HTTPS" i.e. External encrypted using SSL-client auth.
    • 'Configure the CLI for HTTPS', i.e. CLI side config needed to talk to server over ssl.
  • "Configure ksqlDB for Basic HTTP Authentication", i.e. External secured using HTTP-BASIC auth.
    • 'Configure the CLI for Basic HTTP Authentication', i.e. CLI config needed for HTTP-BASIC.

NOTE: the text right at the top of the doc, which is an index of sorts, link to the above sections.

Then the sections you've added are:

  • Internal secured using SSL-mutual auth
  • Internal secured using SSL-mutual auth + External secured using HTTP-BASIC auth.
  • Internal secured using SSL-mutual auth + External encrypted using SSL-client auth

NOTE: the text right at the top of the doc needs updating to link to your new sections.

Others have also previously added new sections without updating the preamble / index at the top. So we could do with fixing that.

Recommended structure:

What I'd recommend is updating the blurb at the top to something that calls out 'single listener' vs 'dual listener' difference, and 'authentication' vs 'encryption' and links to appropriate sections, and includes links to any new sections added by others. Maybe something like the following.

Note: Each index level would link to the appropriate section in the doc below. Each configuration section title clearly states, in a consistent manner, what's being configured. This will allow people to jump to the big they need.

It also directs ccloud users away to the ccloud docs, allowing this page to focus on the on-prem setup.


ksqlDB supports several combinations of encryption and authentication on its client-facing and internal endpoints, and also supports many of the security features of the other services it communicates with, like {{ site.aktm }} and {{ site.sr }}.


Then each section heading can explain the top level items that come within it. For example:


Securing ksqlDB on premise

This section covers how to secure installations of ksqlDB outside of Confluent Cloud, e.g. on-premise installations or manual installations on other cloud platforms.

The section is split into:

Securing ksqlDB installation

ksqlDB supports two main deployment modes:

  • Securing interactive deployments: interactive deployments are those where the ksqlDB servers accept client connections.
  • Securing headless deployments: headless deployments are those where the ksqlDB servers do not accept client connections. They read the SQL statements they should run from a file on-disk.

Securing interactive deployments

Securing the interactive ksqlDB installation involves securing the HTTP endpoints the ksqlDB server is listening on.

As well as accepting connections and requests from clients, a multi-node ksqlDB cluster also requires inter-node communications. You can choose to configure the external client and internal inter-node communication separately or over a single listener:

  • Securing single listener setup: ideal for single-node installations, or where the inter-node communication is over the same network interfaces as client communication.
  • Securing dual listener setup: useful where inter-node communication is over a different network interfaces or requires different authentication or encryption configuration.

I'm sure you get the idea!

Also, take a look at the existing preamble text at the top. I'm sure there's useful information in there. I'd be restructuring it and moving it into the appropriate sections. e.g. the bit about how to pass the config file to the server should be in the Securing ksqlDB on premise section.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just had a look at what's already in this document and I think, rather than adding a brand new section, the details you're covering need to be integrated with what is already there.

I followed your instructions fairly closely, so I think we're good there.

@JimGalasyn Can you take a look at my docs changes again? Also, How can I verify that the local anchor links work correctly? None of my tools allow for checking this.

ksql.internal.listener=https://node-1.internal.example.com:8099

# This enables mutual auth checking for the internal listener
ksql.internal.ssl.client.authentication=REQUIRED
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe include ksql.ssl.client.authentication= whatever, just so users are aware?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, added ksql.ssl.client.authentication=NONE since most configurations use another mechanism for client authentication.

Comment on lines 292 to 413
In such a configuration, you must specify which key pair is used for a given
listener by providing a key store alias. For example,
if set, `ksql.ssl.keystore.alias.internal` will be used to find the key store entry
with the given alias when setting up the internal listener. Similarly,
`ksql.ssl.keystore.alias.external` is used for the client listener `listeners`.
Below is an example configuration:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect!

if (keyStorePath != null && !keyStorePath.isEmpty()) {
final JksOptions keyStoreOptions = new JksOptions()
.setPassword(keyStorePassword.value());
if (keyStoreAlias != null && !keyStoreAlias.isEmpty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. Maybe explicitly setting it via ksql.what.ever=null in the config file? If that doesn't allow nulls then great!

private static final TestKsqlRestApp REST_APP_0 = TestKsqlRestApp
.builder(TEST_HARNESS::kafkaBootstrapServers)
.withEnabledKsqlClient(LOCALHOST_FACTORY)
.withProperty(KsqlRestConfig.LISTENERS_CONFIG, "http://0.0.0.0:8088")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid hardcoded ports in tests - it causes tests to fail if the ports in use, e.g.

  • If the build server is running two builds on the same box
  • An engineer is running two builds, for two different change sets, at the same time.
  • An engineer is running ksql to test something else, while they've also got a build running.
  • etc.

I think you should be able to just set it to http://0.0.0.0:0 and the server will auto-pick a free port. It certainly did in Jetty days, not sure about Vert.x. Check with Tim if it doesn't work. I'm sure he'll know how to.

Please update all places the port is hardcoded.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works fine for listeners, so I did that there. There's a bit of an issue with doing this for KsqlRestConfig.INTERNAL_LISTENER_CONFIG because I have to also reference the port when setting KsqlRestConfig.ADVERTISED_LISTENER_CONFIG. At the moment, configs have to be set by the time the server starts up, and yet using port 0, the port doesn't get resolved until the server starts.

This can possibly be resolved by reworking the code a bit, but that seems like work for a followup.

public static SimpleKsqlClient instance(Map<String, Object> clientProps) {
return new DefaultKsqlClient(Optional.empty(), clientProps);
public static SimpleKsqlClient instance(Map<String, Object> clientProps,
final BiFunction<Integer, String, SocketAddress> socketAddressFactory) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth having two versions: one with and one without this function? Seems like most things won't need the ability to override the host.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, added.

this.httpNonTlsClient = createHttpClient(vertx, clientProps, httpClientOptions, false);
this.httpTlsClient = createHttpClient(vertx, clientProps, httpClientOptions, true);
}

public KsqlClient(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth adding java docs on the params? The new socketAddressFactory probably isn't obvious

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

[see above](#configure-ksqldb-for-basic-http-authentication).

Configuring internal for SSL-mutual authentication + external for SSL encryption
--------------------------------------------------------------------------------
Copy link
Member

@JimGalasyn JimGalasyn Jun 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
--------------------------------------------------------------------------------
----------------------------------------------------------------------------------

Copy link
Member

@JimGalasyn JimGalasyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with a few suggestions, great content!

@AlanConfluent AlanConfluent merged commit 82b137f into confluentinc:master Jun 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants