
Adds support for Select, Scan and Search queries #85

Merged
merged 3 commits into ing-bank:master on Jan 18, 2020

Conversation

anskarl
Contributor

@anskarl anskarl commented Jan 1, 2020

  • Changes the internal query and response API in order to support the three new queries
  • Legacy mode for Scan queries is configurable from application.conf
  • Updates documentation and examples for the new queries
  • Updates unit tests to cover the new queries
  • All Druid queries have a toDebugString utility function that returns the corresponding native JSON representation of the query as a string (useful for debugging; see the sketch below)

GH-83
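
For illustration, a minimal sketch of how the new utility can be used (`q` is a hypothetical, already-constructed Druid query; only `toDebugString` itself comes from this PR):

```scala
// `q` stands for any constructed Druid query (hypothetical value).
// toDebugString renders the query as its native Druid JSON string,
// which is handy for comparing against hand-written native queries.
val json: String = q.toDebugString
println(json)
```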

@bjgbeelen
Collaborator

Thanks @anskarl. I’ll have a look at them this Friday!

Contributor

@barend barend left a comment


I think this adds a lot of value, and I'm in favour of merging.


Contributor

I would include the import ing.wbaa.druid.dql.DSL._ in this snippet.
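
For illustration, the shape such a snippet could take with the import included (the query below is a hypothetical example, not the documentation snippet under review, and its builder calls are illustrative assumptions):

```scala
// Brings the DQL query-building DSL into scope.
import ing.wbaa.druid.dql.DSL._

// Hypothetical query; the exact builder calls are assumptions.
val query = DQL
  .from("wikipedia")
  .interval("2015-09-12/2015-09-13")
  .scan()
  .build()
```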

```scala
def as[T](implicit decoder: Decoder[T]): T = decoder.decodeJson(this.event) match {
  case Left(e)      => throw e
  case Right(value) => value
}
```
Contributor

These matches could be replaced by .toTry.get, although I suppose it's debatable whether that's truly a readability improvement, as it obfuscates the throw.

(This comment applies to a couple of other places too)
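
A minimal sketch of the suggested form (this relies on io.circe.Error extending Throwable, which makes Either#toTry applicable; not necessarily the exact code that landed):

```scala
// Equivalent to the pattern match: toTry converts Either[io.circe.Error, T]
// into a Try[T], and .get rethrows the decoding error if one occurred.
def as[T](implicit decoder: Decoder[T]): T =
  decoder.decodeJson(this.event).toTry.get
```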

Collaborator

I'm fine with either as long as it is consistent

Contributor Author

I like both, but I will change to .toTry.get since it is shorter.

```scala
val requestSeries = query.streamSeriesAs[SelectResult].runWith(Sink.seq)

whenReady(requestSeries) { response =>
  response.size shouldBe numberOfResults
}
```
Contributor

I don't know the details of how the Akka streaming stuff works, so open question: is the test response large enough compared to the buffer sizes that all the streaming stuff is truly exercised (i.e. it's more than a single-chunk response)?

Contributor Author

Good point.

Regarding the test: the granularity of the test query is hourly and the threshold of events for each result is 10. In practice the datasource contains data only for a single day, 2015-09-12. Therefore, the select query results in an array of 24 JSON result structures (one for each hour), each containing 10 events (due to the threshold).

JSON streaming is performed over the top-level JSON result structures (24 in our test), so when the threshold is large we may not get any benefit from streaming. We only benefit when there are many top-level JSON structures (e.g., months of data at hourly granularity); the sketch below illustrates this top-level framing.

In order to also have streaming support for nested structures, we would need something like akka-stream-alpakka-json-streaming. In fact, this is something I tried to implement at the beginning, but I couldn't make it work with the nested structure of the Select response.
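
Not scruid's actual implementation, but a minimal Akka Streams sketch of what streaming over the top-level structures means here (responseBody is a hypothetical stand-in for the raw response entity):

```scala
import akka.NotUsed
import akka.stream.scaladsl.{ JsonFraming, Source }
import akka.util.ByteString

// Hypothetical stand-in for the raw HTTP body of a Select response.
val responseBody: Source[ByteString, NotUsed] =
  Source.single(ByteString("""[ {"result": 1}, {"result": 2} ]"""))

// objectScanner frames the bytes into one element per top-level JSON object,
// so each of the 24 hourly result structures in the test can be decoded as
// soon as it arrives; the 10 events nested inside each structure are not
// streamed individually.
val topLevelResults: Source[ByteString, NotUsed] =
  responseBody.via(JsonFraming.objectScanner(maximumObjectLength = 64 * 1024))
```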

Collaborator

@bjgbeelen bjgbeelen left a comment

Looks good. Nice improvements indeed!

One minor thing I noticed is the naming of DruidResponseScanImpl; I would probably have chosen DruidScanResponse. You already use DruidScanResult(s), so this would make it more consistent.

@anskarl
Contributor Author

anskarl commented Jan 17, 2020

> Looks good. Nice improvements indeed!
>
> One minor thing I noticed is the naming of DruidResponseScanImpl; I would probably have chosen DruidScanResponse. You already use DruidScanResult(s), so this would make it more consistent.

I agree, I will change the name to DruidScanResponse.

  - Simplify decoding functions by using `.toTry.get` instead of pattern matching.
  - Rename DruidResponseScanImpl to DruidScanResponse

@barend
Contributor

barend commented Jan 18, 2020

Thank you for updating the PR.

@krisgeus krisgeus merged commit a4317b6 into ing-bank:master Jan 18, 2020
@anskarl anskarl mentioned this pull request May 8, 2020