-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support for Select, Scan and Search queries #85
Conversation
- Changes interval query and response API in order to support the three new queries - Legacy mode for Scan queries is configurable from application.conf - Updates documentation and examples for new queries - Updates unit tests to examine new queries - All Druid Queries have toDebugString utility function, in order to get the corresponding native JSON representation of the query as a string (useful for debugging purposes) ing-bankGH-83
Thanks @anskari. I’ll have a look at them this friday! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this adds a lot of value, and I'm in favour of merging.
|
||
For example the following: | ||
|
||
```scala |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would include the import ing.wbaa.druid.dql.DSL._
in this snippet.
} | ||
def as[T](implicit decoder: Decoder[T]): T = decoder.decodeJson(this.event) match { | ||
case Left(e) => throw e | ||
case Right(value) => value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These matches could be replaced by .toTry.get
, although I suppose it's debatable whether that's truly a readability improvement, as it obfuscates the throw
.
(This comment applies to a couple of other places too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with either as long as it is consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like both, but I will change to .toTry.get
since it is shorter.
val requestSeries = query.streamSeriesAs[SelectResult].runWith(Sink.seq) | ||
whenReady(requestSeries) { response => | ||
response.size shouldBe numberOfResults | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the details of how the Akka streaming stuff works, so open question: is the test response large enough compared to the buffer sizes that all the streaming stuff is truly exercised (i.e. it's more than a single-chunk response)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
Regarding the test, the granularity type of the testing query is hourly and the threshold of events for each result is 10. In practise the datasource contains data only for a single day 2015-09-12
. Therefore, the select query will result to an array of 24 JSON result structures (one for each hour) and each one will contain 10 events (due to threshold of 10).
Streaming JSON is being performed on the top-level JSON result structures (24 in our test), therefore in situations that the threshold is large we may not going to have any benefit from streaming. Only when the top-level JSON structures are many (e.g., months of data and hourly granularity) we benefit from streaming.
In order to have streaming support also for nested structures, we need something like akka-stream-alpakka-json-streaming. In fact this is something that I've tried to implement at the beginning, but I couldn't make it to work with the nested structure of Select response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Nice improvements indeed!
One minor thing I noticed is the naming DruidResponseScanImpl
e.g. I would probably have chosen to use DruidScanResponse
. You do use DruidScanResult(s)
, so just to make it more consistent.
I agree, I will change the name to |
- Simplify decoding functions by using `.toTry.get` instead of pattern matching. - Rename DruidResponseScanImpl to DruidScanResponse ing-bankGH-83
Thank you for updating the PR |
GH-83