-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(client): Java client with push + pull query support #5200
Conversation
* @param columnName name of column. | ||
* @return column value. | ||
*/ | ||
Boolean getBoolean(String columnName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd originally wanted to add similar methods for getting decimals but that requires either:
- parsing the schema to find the precision and scale
- asking the caller to specify precision and scale in the getter
and neither seems great. Is it worth updating the endpoint to return the schema in a more structured form? Feels maybe like overkill.
I also considered adding getter methods for arrays (lists) and maps but I'm not sure how valuable those methods would be without parsing the schema for specific subtypes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think most users care too muchabout the precision of a decimal. I think we should just return a BigDecimal for a DECIMAL column with whatever scale and precision is appropriate for the value. This is what JDBC does btw https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getBigDecimal(int)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a struct - we can just return that as a JsonObject.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I don't think many users will care about schema either. In the vast majority of cases the back end schema will be well known and fixed, and the developer will know this when issuing queries and doing stuff with the results. Very rarely will be a front end be coding against a completely dynamic and unknown back end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a struct - we can just return that as a JsonObject.
To continue an offline discussion: we previously thought it made sense to use vanilla Java types (List, Map) instead of Vert.x types (JsonArray, JsonObject) in the client interfaces so that apps using the client don't need Vert.x as a dependency, but more recently you said maybe it makes sense to use the Vert.x types for better type safety.
This still feels strange to me, though. It's one thing to give users the option to provide their own Vert.x instance and/or Vert.x HttpClientOptions, but requiring the use of Vert.x types in order to use the client at all feels odd. IMO it feels fine to expose the fact that the client uses Vert.x under the hood in order to benefit users seeking more advanced use cases, but requiring the use of Vert.x for all client use cases seems unwarranted.
A third option could be to wrap the Vert.x types but that seems like overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I think it's ok to expose the Vert.x types, or if you prefer to wrap them in our own type and delegate internally. Either way really. But I think it would be a shame to lose the functionality that those classes have which I think would be useful to users. E.g. type safe getters, easy conversion to JSON string, conversion to buffers etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in KLIP-26, I plan to introduce types that wrap the Vert.x types. Stand by for a follow-up PR with this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! The client certainly seems to be taking shape :)
A few comments, nothing major, the overall shape looks good and moving in the right direction.
|
||
<dependencies> | ||
<dependency> | ||
<groupId>io.confluent.ksql</groupId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ok for now. But before we ship I think it's important that we don't depend on any ksqlDB server side stuff - otherwise a whole load of dependencies will be pulled into the client jar, which will it hard to use by users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a look. The main dependencies right now are:
- QueryResponseMetadata, which is used by QueryResponseHandler in order to deserialize the object
- BufferedPublisher, which is extended by QueryResultImpl
- BaseSubscriber, which is extended by PollableSubscriber
What's your recommendation for removing these dependencies? I see the value in not having the client depend on any of the server modules but I also don't think it makes sense to duplicate these classes. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will need to factor out the reactive streams base classes into their own module - e.g. "reactive-common" and have both the server and client depend on that.
For QueryResponseMetaData we could follow the pattern of the old REST API and put the shared classes in their own package (i.e. like rest-model). Or perhaps we should use rest-model?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got'cha, this makes a lot of sense. Will do in a follow-up PR.
|
||
void close(); | ||
|
||
static Client create(ClientOptions clientOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice this method is not used in tests. I think it's better to use the interface method to create a client than directly instantiating ClientImpl. This enables us to change the implementation more easily without breaking clients. We should consider hiding the constructor of ClientImpl (e.g. making package protected or private and indirecting through a factory)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also add a version of create that takes a Vertx instance. This allows the client to use any existing Vert.x the user might already be using in their app, thus alllowing thread pools to be reused etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on the motivation for using the interface method over directly instantiating ClientImpl. I've updated the tests, and also applied the analogous change to ClientOptionsImpl.
I'm not seeing a way to make the constructor for ClientImpl package private, though. Client.java
is in a different package from ClientImpl.java
so if Client.java
is able to instantiate ClientImpl
, then ClientImpl
must have a public method for instantiation, whether that's a constructor or a factory method. How do people usually work around this?
I'd also add a version of create that takes a Vertx instance.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take a look how it's done in Vert.x https://github.com/eclipse-vertx/vert.x/blob/master/src/main/java/io/vertx/core/Vertx.java#L86
Basically the interface uses a static factory instance to actually create the implemention. The ServiceHelper is used to load the factory at run-time by scanning the classpath for implementations of the factory. The factory itself is in the same package as VertxImpl so the VertxImpl constructor can be package protected. It's a bit convoluted and may be overkill for us right now, might be sufficient to not worry about hiding the constructor but perhaps adding javadoc to it saying it should not be instantiated directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm interesting, the VertxImpl constructor is package protected but the factory implementation (in the same package) is still public: https://github.com/eclipse-vertx/vert.x/blob/3.8/src/main/java/io/vertx/core/impl/VertxFactoryImpl.java#L23
so if a user wanted to circumvent the intent of the VertxImpl constructor being private they could still do so by calling new VertxImplFactory().vertx()
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, but the intention here isn't to protect against malicious users - it's pretty trivial to construct any object, even if it has an inaccessible private/protected/package protected constructor, using reflection. The idea is to nudge users to the right API :)
|
||
package io.confluent.ksql.api.client; | ||
|
||
public interface ClientOptions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also expose the Vert.x HttpClientOptions? There are probably other settings (e.g. pool size) that users might want to tweak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to understand the expected behavior if a user provides HttpClienttOptions
: will ClientOptionsImpl
update the provided HttpClienttOptions
according to the other fields, or will the user-provided HttpClientOptions
be used directly?
The latter doesn't seem very user-friendly since then the user would be responsible for duplicating the work of ClientImpl
in populating HttpClientOptions
based on ClientOptions
, but the former also seems confusing since the user would have to know which HttpClientOptions
properties will be overridden and which won't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would probably expose the HttpClientOptions directly and not have similar methods on ClientOptions at all. I.e. only have options on ClientOptions if they're not related to HTTP. If you're not comfortable exposing the HttpClientOptions directly you could wrap them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I.e. only have options on ClientOptions if they're not related to HTTP.
What counts as "not related to HTTP"? Of the options exposed so far (host, port, useTls, trustStore, keyStore, and basicAuth), all of them have equivalents in the Vert.x HttpClientOptions besides basicAuth. If we were to only expose HttpClientOptions and not have similar options on ClientOptions then ClientOptions would become
public interface ClientOptions {
ClientOptions setBasicAuthCredentials(String username, String password);
boolean isUseBasicAuth();
String getBasicAuthUsername();
String getBasicAuthPassword();
ClientOptions copy();
static ClientOptions create(HttpClientOptions httpClientOptions) {
return new ClientOptionsImpl(httpClientOptions);
}
Is this what you're proposing? I feel like I've misunderstood.
If you're not comfortable exposing the HttpClientOptions directly you could wrap them.
I assume this means creating a wrapper type around HttpClientOptions, rather than wrapping the individual methods of HttpClientOptions into ClientOptions (as exposing too many options in ClientOptions feels like it'll overwhelm the user). If we were to create a wrapper type around HttpClientOptions, would it not be better to leave the more commonly used methods in ClientOptions itself (as is currently the case) and only wrap the other options in HttpClientOptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to change my mind on this one. I think it's ok to just use our own ClientOptions and not to expose HttpClientOptions. If we find we need to expose more properties over time on ClientOptions we can do that.
(Aside: BTW I think we should support token auth on the client too, not just basic auth)
ksqldb-api-client/src/main/java/io/confluent/ksql/api/client/QueryResult.java
Outdated
Show resolved
Hide resolved
* @param columnName name of column. | ||
* @return column value. | ||
*/ | ||
Boolean getBoolean(String columnName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think most users care too muchabout the precision of a decimal. I think we should just return a BigDecimal for a DECIMAL column with whatever scale and precision is appropriate for the value. This is what JDBC does btw https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getBigDecimal(int)
ksqldb-api-client/src/test/java/io/confluent/ksql/api/client/ClientTest.java
Outdated
Show resolved
Hide resolved
ksqldb-api-client/src/test/java/io/confluent/ksql/api/client/ClientTest.java
Outdated
Show resolved
Hide resolved
ksqldb-api-client/src/test/java/io/confluent/ksql/api/client/ClientTest.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
@Test | ||
public void shouldNotSubscribeIfPolling() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho I would like to see these kinds of tests conducted using the actual API and no mocks.
As you know I'm not a fan of fine grained unit tests and mocks as they can constrain the implementation, and often what you're testing doesn't really correspond to what the system really does thus resulting in bugs slipping through and a false sense of security.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll add equivalent tests to ClientTest
as part of revamping / improving test coverage. My vote is to leave these unit tests in place, though, until we see them become brittle. I think it's useful to be able to scan a test file and understand the key pieces of functionality for a class without having to dig through integration tests. Though I guess Java docs on the class/interface itself should serve this purpose in most cases, so maybe that's not a good reason...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as there's equivalent test coverage on the actual API I think that's fine. And as long as you won't be upset if I end up deleting them after spending an hour trying to refactor them if the implementation changes ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: going to add these additional tests in a follow-up PR.
And as long as you won't be upset if I end up deleting them after spending an hour trying to refactor them if the implementation changes ;)
Fine by me ;)
import org.junit.Before; | ||
import org.junit.Test; | ||
|
||
public class RowImplTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, not a fan of fine grained unit tests. I'd prefer to see the behaviour of a row tested on instance of Row interface returned from the actual API rather than the particular implementation RowImpl. If we later change the implementation these kinds of tests get very brittle and hard to refactor whereas tests that test against the interface don't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving as overall approach looks sound, on basis that review comments are addressed (or not addressed if my comment does not make sense - let's discuss ;) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @purplefox -- super helpful comments and suggestions! Responded inline.
|
||
<dependencies> | ||
<dependency> | ||
<groupId>io.confluent.ksql</groupId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a look. The main dependencies right now are:
- QueryResponseMetadata, which is used by QueryResponseHandler in order to deserialize the object
- BufferedPublisher, which is extended by QueryResultImpl
- BaseSubscriber, which is extended by PollableSubscriber
What's your recommendation for removing these dependencies? I see the value in not having the client depend on any of the server modules but I also don't think it makes sense to duplicate these classes. What do you think?
ksqldb-api-client/src/main/java/io/confluent/ksql/api/client/Client.java
Outdated
Show resolved
Hide resolved
ksqldb-api-client/src/main/java/io/confluent/ksql/api/client/Client.java
Outdated
Show resolved
Hide resolved
|
||
void close(); | ||
|
||
static Client create(ClientOptions clientOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on the motivation for using the interface method over directly instantiating ClientImpl. I've updated the tests, and also applied the analogous change to ClientOptionsImpl.
I'm not seeing a way to make the constructor for ClientImpl package private, though. Client.java
is in a different package from ClientImpl.java
so if Client.java
is able to instantiate ClientImpl
, then ClientImpl
must have a public method for instantiation, whether that's a constructor or a factory method. How do people usually work around this?
I'd also add a version of create that takes a Vertx instance.
Done.
|
||
package io.confluent.ksql.api.client; | ||
|
||
public interface ClientOptions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to understand the expected behavior if a user provides HttpClienttOptions
: will ClientOptionsImpl
update the provided HttpClienttOptions
according to the other fields, or will the user-provided HttpClientOptions
be used directly?
The latter doesn't seem very user-friendly since then the user would be responsible for duplicating the work of ClientImpl
in populating HttpClientOptions
based on ClientOptions
, but the former also seems confusing since the user would have to know which HttpClientOptions
properties will be overridden and which won't.
ksqldb-api-client/src/test/java/io/confluent/ksql/api/client/ClientTest.java
Outdated
Show resolved
Hide resolved
ksqldb-api-client/src/test/java/io/confluent/ksql/api/client/ClientTest.java
Outdated
Show resolved
Hide resolved
ksqldb-api-client/src/test/java/io/confluent/ksql/api/client/ClientTest.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
@Test | ||
public void shouldNotSubscribeIfPolling() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll add equivalent tests to ClientTest
as part of revamping / improving test coverage. My vote is to leave these unit tests in place, though, until we see them become brittle. I think it's useful to be able to scan a test file and understand the key pieces of functionality for a class without having to dig through integration tests. Though I guess Java docs on the class/interface itself should serve this purpose in most cases, so maybe that's not a good reason...
import org.junit.Before; | ||
import org.junit.Test; | ||
|
||
public class RowImplTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for the reviews (and patient explanations) @purplefox !
I've addressed the majority of comments and am planning to merge this PR once the build passes. Follow-up PRs will contain:
- additional test coverage: error cases, functionality currently only tested via unit tests, TLS mutual auth, result schemas with decimal and complex types, QueryResultImpl#isComplete()
- remove dependency on ksqlDB server module
- introduction of structured types (to wrap Vert.x JsonObject and JsonArray), and Row#getDecimal() methods
- expose Vert.x HttpClientOptions, pending discussion in feat(client): Java client with push + pull query support #5200 (comment)
- other interface changes resulting from the discussion in KLIP-26
* @param columnName name of column. | ||
* @return column value. | ||
*/ | ||
Boolean getBoolean(String columnName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in KLIP-26, I plan to introduce types that wrap the Vert.x types. Stand by for a follow-up PR with this change.
} | ||
|
||
@Test | ||
public void shouldNotSubscribeIfPolling() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: going to add these additional tests in a follow-up PR.
And as long as you won't be upset if I end up deleting them after spending an hour trying to refactor them if the implementation changes ;)
Fine by me ;)
|
||
@Override | ||
public boolean isComplete() { | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got'cha. I've renamed a couple internal variables in BufferedPublisher
to better reflect this. Thanks for the clarification!
|
||
ClientOptions setUseClientAuth(boolean useClientAuth); | ||
|
||
ClientOptions setVerifyHost(boolean verifyHost); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These additional TLS options were needed to get the tests working. In a follow-up PR they'll either be removed in favor of exposing Vert.x HttpClientOptions (pending discussion in #5200 (comment)) or I'll refactor all the TLS options into a separate interface in order to clean up this one.
* @param timeUnit unit for timeout param. | ||
* @return the row, if available; else, null. | ||
*/ | ||
Row poll(long timeout, TimeUnit timeUnit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vcrfxia As mentioned on the KLIP: why do we not use Duration
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the bump. I've updated the KLIP and will update the code in a future PR (along with a multitude of other feedback from the KLIP).
Description
Still in the process of adding more test coverage but the bulk of the changes and functional tests are here.
To review, start with
Client.java
and the associated interfaces. Then look at the implementations, starting withClientImpl.java
. Unit/functional tests with example usage are inClientTest.java
.Additional test coverage to come (potentially in follow-up PRs) include tests for:
Implementation for the insert stream methods will be a separate PR. Docs will come later as well.
Testing done
Manual + unit tests.
Reviewer checklist