feat: add config to disable pull queries when validation is required #3879

Merged
merged 3 commits into from
Nov 18, 2019

Conversation

vinothchandar
Contributor

@vinothchandar vinothchandar commented Nov 16, 2019

fixes #3863

Description

  • Added ksql.query.pull.skip.access.validator to control whether pull queries run without access validation
  • By default, pull queries error out if auth validation is needed
  • Replaced DUMMY_VALIDATOR with Optional<> interface for KsqlAuthorizationValidatorFactory
  • Fixed some tests, added test cases
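
The third bullet can be illustrated with a minimal sketch. Every name below (`ValidatorFactorySketch`, `AccessValidator`, the `kafkaAuthorizerConfigured` flag, the "alice" rule) is hypothetical and is not the real `KsqlAuthorizationValidatorFactory` API. It only shows the general idea of returning an empty `Optional` instead of a no-op dummy validator, so that callers can tell whether validation is actually in force.

```java
import java.util.Optional;

public class ValidatorFactorySketch {

  /** Hypothetical stand-in for an authorization validator. */
  interface AccessValidator {
    void checkAccess(String user, String topic);
  }

  /**
   * Returns Optional.empty() when no authorizer is configured, rather than
   * a dummy validator that silently accepts everything.
   */
  static Optional<AccessValidator> create(boolean kafkaAuthorizerConfigured) {
    if (!kafkaAuthorizerConfigured) {
      return Optional.empty();
    }
    // Illustrative rule only: a real validator would consult Kafka ACLs.
    AccessValidator validator = (user, topic) -> {
      if (!"alice".equals(user)) {
        throw new SecurityException(user + " may not read " + topic);
      }
    };
    return Optional.of(validator);
  }

  public static void main(String[] args) {
    System.out.println("unsecured: validator present = " + create(false).isPresent());
    System.out.println("secured:   validator present = " + create(true).isPresent());
  }
}
```

With a dummy validator, "validation passed" and "there is nothing to validate" look identical to the caller; the `Optional` makes that difference explicit.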

Testing done

  • Unit tests
  • Local performance testing with and without validation enabled

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@vinothchandar
Contributor Author

Perf out of the box (Kafka does not have ACLs)

15:56:37 [ksql-benchmark]$ wrk -t 8 -c 16 -d 120 --latency -s ./pull-query.lua http://localhost:8088/query
Running 2m test @ http://localhost:8088/query
  8 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.68ms    1.95ms  57.60ms   78.50%
    Req/Sec   560.74    159.23     1.37k    54.50%
  Latency Distribution
     50%    3.27ms
     75%    4.47ms
     90%    6.03ms
     99%   10.14ms
  535959 requests in 2.00m, 175.83MB read
Requests/sec:   4463.31
Transfer/sec:      1.46MB

@vinothchandar
Contributor Author

vinothchandar commented Nov 17, 2019

With validation turned on in Kafka

Pull queries fail

ksql> show tables;

 Table Name       | Kafka Topic      | Format | Windowed 
---------------------------------------------------------
 ORDER_QUANTITIES | ORDER_QUANTITIES | JSON   | false    
---------------------------------------------------------
ksql> SELECT * FROM order_quantities WHERE ROWKEY = 'order-1';
Pull queries are not currently supported, when access validation against Kafka is configured. If you really want to bypass this limitation please set ksql.query.pull.skip.access.validator=true If true, KSQL will  NOT enforce access validation checks for pull queries, which could expose Kafka topics which are secured with RBAC or ACLs. Please enable only after careful consideration. Default value: false
ksql> 

Failure is swift

[ksql-benchmark]$ wrk -t 8 -c 16 -d 120 --latency -s ./pull-query.lua http://localhost:8088/query
Running 2m test @ http://localhost:8088/query
  8 threads and 16 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.88ms   11.23ms 648.38ms   94.22%
    Req/Sec   487.08    258.83     1.10k    58.73%
  Latency Distribution
     50%    3.27ms
     75%    5.34ms
     90%   11.40ms
     99%   44.22ms
  410856 requests in 2.00m, 245.67MB read
  Socket errors: connect 0, read 0, write 0, timeout 31
  Non-2xx or 3xx responses: 410856
Requests/sec:   3420.79
Transfer/sec:      2.05MB

Has no spurious clients allocated

[Image: ksql-perf-w-validation]

and push queries work

ksql> SET 'auto.offset.reset'='earliest';
Successfully changed local property 'auto.offset.reset' from 'earliest' to 'earliest'.
ksql> SELECT * FROM order_quantities EMIT CHANGES LIMIT 10;

+---------------------------------------+---------------------------------------+---------------------------------------+---------------------------------------+
|ROWTIME                                |ROWKEY                                 |ORDER_ID                               |TOTAL_QTY                              |
+---------------------------------------+---------------------------------------+---------------------------------------+---------------------------------------+
|1572465670162                          |order-281.0                            |null                                   |9                                      |
|1572465670437                          |order-615.0                            |null                                   |8                                      |
|1572465677709                          |order-728.0                            |null                                   |7                                      |
|1572465682418                          |order-733.0                            |null                                   |7                                      |
|1572465684522                          |order-364.0                            |null                                   |5                                      |
|1572465698077                          |order-377.0                            |null                                   |0                                      |
|1572465698872                          |order-22.0                             |null                                   |5                                      |
|1572465699660                          |order-5.0                              |null                                   |4                                      |
|1572465701959                          |order-153.0                            |null                                   |5                                      |
|1572465704614                          |order-836.0                            |null                                   |4                                      |
Limit Reached
Query terminated
ksql>

@vinothchandar vinothchandar changed the title from "[WIP] feat: add config to disable pull queries when validation is required" to "feat: add config to disable pull queries when validation is required" on Nov 17, 2019
Member

@vpapavas vpapavas left a comment

Thank you @vinothchandar! LGTM +1 !

Contributor

@big-andy-coates big-andy-coates left a comment

Thanks @vinothchandar,

Pull queries are supported by the websocket endpoint, WSQueryEndpoint, so that class will need similar changes put in place.

I'm a little nervous of approving this PR for two main reasons:

  1. All the more functional tests (CliTest, RestQueryTranslationTest, PullQueryFunctionalTest, RestApiTest, etc.) have been switched over to set this 'skip validation' config to true. This feels very wrong. Where's the test testing our normal / recommended path?
  2. It looks like this PR disables pull queries for any KSQL server running against a secured Kafka cluster, e.g. one secured with ACLs or some custom auth mech, i.e. it disables pull queries for anyone using KSQL in a modern production environment. Is this really what we want? I guess we're releasing this as a preview and we can look to quickly iterate and remove this restriction. But just wanted to call this out and make sure this is indeed our intent with this PR.

"ksql.query.pull.skip.access.validator";
public static final boolean KSQL_PULL_QUERIES_SKIP_ACCESS_VALIDATOR_DEFAULT = false;
public static final String KSQL_PULL_QUERIES_SKIP_ACCESS_VALIDATOR_DOC = "If true, KSQL will "
+ " NOT enforce access validation checks for pull queries, which could expose Kafka topics"
Contributor

"validation checks" here is a bit vague. Can we be more explicit? This is really about skipping authorization checks. The docs and the name of this config should reflect this.

I think the actual functionality is that, with this set to the default false, KSQL won't support pull queries when running against a secure Kafka. With it set to true, KSQL won't check that the user has access to the topics underlying the materialized state the pull query is accessing.

Is that right?
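
The behavior described in this comment can be sketched as a small decision function. This is only an illustration under stated assumptions: the class, enum, and method names are hypothetical, and only the config key `ksql.query.pull.skip.access.validator` comes from the PR itself.

```java
public class PullQueryGateSketch {

  /** Config key from the PR; everything else in this class is hypothetical. */
  static final String SKIP_ACCESS_VALIDATOR = "ksql.query.pull.skip.access.validator";

  enum Outcome {
    ALLOWED,            // Kafka is not secured, so there is nothing to check
    ALLOWED_UNCHECKED,  // Kafka is secured, but the operator opted out of checks
    REJECTED            // Kafka is secured and the config is left at its default
  }

  /** Decides how a pull query is handled for a given cluster and config value. */
  static Outcome decide(boolean kafkaSecured, boolean skipAccessValidator) {
    if (!kafkaSecured) {
      return Outcome.ALLOWED;
    }
    return skipAccessValidator ? Outcome.ALLOWED_UNCHECKED : Outcome.REJECTED;
  }

  public static void main(String[] args) {
    for (boolean secured : new boolean[] {false, true}) {
      for (boolean skip : new boolean[] {false, true}) {
        System.out.printf("secured=%b, %s=%b -> %s%n",
            secured, SKIP_ACCESS_VALIDATOR, skip, decide(secured, skip));
      }
    }
  }
}
```

The `ALLOWED_UNCHECKED` case is the one the doc string warns about: the query runs, but nothing verifies that the user may read the underlying topics.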

Contributor Author

I just matched it to the config above. I think this is okay, since these aspects are explained in the other config anyway.

@big-andy-coates big-andy-coates requested a review from a team November 18, 2019 12:44
@apurvam
Contributor

apurvam commented Nov 18, 2019

It looks like this PR disables pull queries for any KSQL server running against a secured Kafka cluster, e.g. one secured with ACLs or some custom auth mech, i.e. it disables pull queries for anyone using KSQL in a modern production environment. Is this really what we want? I guess we're releasing this as a preview and we can look to quickly iterate and remove this restriction. But just wanted to call this out and make sure this is indeed our intent with this PR.

This is accurate. The main worry is that if someone runs a pull query benchmark against a production cluster, the current mechanism could actually destabilize the cluster due to its design. IMO the reputational damage from that could be worse than having the feature be opt-in with the risks understood.

@derekjn is on board with this. cc @MichaelDrogalis for visibility in case he missed the other threads.

Contributor Author

@vinothchandar vinothchandar left a comment

Left a note on the WebSocket endpoint.

This feels very wrong. Where's the test testing our normal / recommended path?

For pull queries, the normal path is to have them working, no? I made the test changes so the existing functional tests can keep testing functionality when pull queries are enabled, and added tests to ensure the error is thrown when validating and that we get past it when not validating. I feel this is adequate.

It looks like this PR disables pull queries for any KSQL server running against a secured Kafka cluster,

+1 to apurva's comment. Unfortunately, this is the suggested way forward and we can undo this quickly in the next release.

fixes confluentinc#3863

 - Added `ksql.query.pull.skip.access.validator` to control if pull queries work without validation
 - By default, Pull queries error out, if auth validation is needed
 - Replaced DUMMY_VALIDATOR with Optional<> interface for KsqlAuthorizationValidatorFactory
 - Fixed some tests, added test cases
@derekjn
Contributor

derekjn commented Nov 18, 2019

LGTM 👍

Contributor

@agavra agavra left a comment

LGTM - summarizing my thoughts on the main points:

  • I think our test coverage is OK. Our "supported" way to run KSQL queries on secure Kafka clusters is to accept the security risk in the meantime - this is what we test. We also test that the default behavior on secure clusters is to fail pull queries. Together I feel our coverage is sufficient, especially considering a long-term fix is already in the works.
  • I feel that we should secure the WebSocket API in the same way. Better safe than have the bad rep of the WebSocket API taking down Kafka clusters.

Not sure I like changing it to optional instead of a dummy class and just having the config logic in there, but I'm not opinionated enough to block you 😂

@agavra agavra requested a review from a team November 18, 2019 18:49
@vinothchandar vinothchandar merged commit ccc636d into confluentinc:master Nov 18, 2019
vinothchandar added a commit to vinothchandar/ksql that referenced this pull request Nov 26, 2019
…c#3879)

fixes confluentinc#3863

 - Added `ksql.query.pull.skip.access.validator` to control if pull queries work without validation
 - By default, Pull queries error out, if auth validation is needed
 - Replaced DUMMY_VALIDATOR with Optional<> interface for KsqlAuthorizationValidatorFactory
 - Fixed some tests, added test cases
 - Applied on both `query` and websocket endpoints
vinothchandar added a commit that referenced this pull request Nov 27, 2019
…sters (#3980)

* refactor: lazy initialization of clients (admin,sr,ksql,connect) (#3696)
- Made client creation lazy by memoizing them.

* feat: add config to disable pull queries when validating (#3879)

fixes #3863

 - Added `ksql.query.pull.skip.access.validator` to control if pull queries work without validation
 - By default, Pull queries error out, if auth validation is needed
 - Replaced DUMMY_VALIDATOR with Optional<> interface for KsqlAuthorizationValidatorFactory
 - Fixed some tests, added test cases
 - Applied on both `query` and websocket endpoints
vinothchandar added a commit that referenced this pull request Nov 27, 2019
No-op merge commit. kept all changes in master

* refactor: lazy initialization of clients (admin,sr,ksql,connect) (#3696)
* feat: add config to disable pull queries when validating (#3879)
Development

Successfully merging this pull request may close these issues.

Pull Query: Disable Pull Queries when auth validation is needed on /query endpoint
7 participants