LIVY-19. Add Spark SQL support #148

meisam · 2016-06-10T17:25:57Z

LIVY-19. Add Spark SQL support

This leaves the SparkInterpreter untouched and confines most of the changes to the livy-repl component.
The SparkSqlInterpreter is based on SparkInterpreter and needs to be polished.
Your feedback is really appreciated.

Task-Url: https://issues.cloudera.org/browse/LIVY-19

ksakellis · 2016-06-10T17:40:06Z

Please update this pr to point to the livy bug: LIVY-19

Implementing a Livy SparkSql Interpreter. This leaves the SparkInterpreter untouched and confines most of the changes to the livy-repl component. Task-Url: https://issues.cloudera.org/browse/LIVY-19

meisam · 2016-06-10T18:54:56Z

I updated the pull request description and reworded the commits to link to LIVY-19.

codecov-io · 2016-06-10T19:18:59Z

Codecov Report

Merging #148 into master will decrease coverage by 10.71%.

@@             Coverage Diff             @@
##           master     #148       +/-   ##
===========================================
- Coverage   71.53%   60.83%   -10.71%     
===========================================
  Files          91       73       -18     
  Lines        4697     3965      -732     
  Branches      811      651      -160     
===========================================
- Hits         3360     2412      -948     
- Misses        861     1243      +382     
+ Partials      476      310      -166

Impacted Files	Coverage Δ
...c/main/scala/com/cloudera/livy/sessions/Kind.scala	`0% <ø> (-50%)`	❌
...a/com/cloudera/livy/repl/SparkSqlInterpreter.scala	`0% <ø> (ø)`
...main/scala/com/cloudera/livy/repl/ReplDriver.scala	`66.66% <ø> (+34.16%)`	✅
api/src/main/java/com/cloudera/livy/JobHandle.java	`0% <ø> (-100%)`	❌
core/src/main/scala/com/cloudera/livy/Utils.scala	`0% <ø> (-93.75%)`	❌
...ore/src/main/scala/com/cloudera/livy/Logging.scala	`0% <ø> (-81.82%)`	❌
...ain/scala/com/cloudera/livy/server/WebServer.scala	`0% <ø> (-61.23%)`	❌
...cala/com/cloudera/livy/sessions/SessionState.scala	`0% <ø> (-44.45%)`	❌
...ain/java/com/cloudera/livy/rsc/FutureListener.java	`33.33% <ø> (-33.34%)`	❌
... and 74 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 69ac11e...3ee5815.

zjffdu · 2016-06-10T23:59:29Z

repl/src/main/scala/com/cloudera/livy/repl/SparkSqlInterpreter.scala

+  override def close(): Unit = synchronized {
+    if (hiveContext != null) {
+      // clean up and close hive context here
+    }


Maybe set hiveContext to null here.

zjffdu · 2016-06-13T05:56:48Z

Another thought: maybe we can reuse the SparkContext in SparkInterpreter and just add SQL support to the REST API, e.g.

curl http://localhost:8998/sessions/0/statements -X POST -H 'Content-Type: application/json' -d '{"sql":"select * from table_1"}'

That way, in tools like Hue and Zeppelin, different paragraphs can share data easily because they share the same SparkContext.

vanzin · 2016-06-13T16:41:14Z

Hi @meisam,

Before we can even look at this patch, we need a signed ICLA and, if it applies, a signed CCLA from your employer. Please check the wiki: https://github.com/cloudera/livy/wiki/Contributing-to-Livy

vanzin · 2016-06-13T16:41:47Z

maybe we can reuse the SparkContext in SparkInterpreter

Without having looked at the code, I like that idea.

meisam · 2016-06-13T18:14:08Z

@vanzin
Where can I find the link to the CCLA? I'll work with my manager to have it signed and sent.

vanzin · 2016-06-13T18:17:16Z

It's at the link I posted above.

meisam · 2016-06-13T18:20:43Z

@zjffdu I was thinking more along these lines:

curl http://localhost:8998/sessions/0/sql-statements -X POST -H 'Content-Type: application/json' -d '{"code":"select * from table_1"}'

This keeps com.cloudera.livy.ExecuteRequest untouched.
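
The two request shapes proposed so far differ only in the endpoint path and the JSON key. A minimal Python sketch that builds both without sending anything is below. The helper name `build_sql_request` and the `style` values are hypothetical, and both the `sql` key and the `sql-statements` endpoint are proposals from this thread, not existing Livy APIs.

```python
import json


def build_sql_request(base_url, session_id, sql, style):
    """Return (url, body) for one of the two proposed request shapes.

    style="sql-key":      POST .../statements      with {"sql": ...}
    style="sql-endpoint": POST .../sql-statements  with {"code": ...}
    Both shapes are proposals from this thread, not existing Livy APIs.
    """
    session_url = f"{base_url.rstrip('/')}/sessions/{session_id}"
    if style == "sql-key":
        return f"{session_url}/statements", json.dumps({"sql": sql})
    if style == "sql-endpoint":
        return f"{session_url}/sql-statements", json.dumps({"code": sql})
    raise ValueError(f"unknown style: {style!r}")


url, body = build_sql_request(
    "http://localhost:8998", 0, "select * from table_1", "sql-endpoint"
)
print(url)
print(body)
```

With the second shape the body keeps the existing `code` key, which is why the request class would not need to change.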

zjffdu · 2016-06-14T02:44:40Z

@meisam That makes sense.

ksakellis · 2016-06-20T19:02:37Z

@meisam Please print, sign and email to [email protected] both the ICLA:
https://github.com/cloudera/livy/wiki/Individual-Contributor-License-Agreement-(ICLA)
and the CCLA with your name on the Schedule A signed by your employer:
https://github.com/cloudera/livy/wiki/Corporate-Contributor-License-Agreement-(CCLA)

meisam · 2016-10-10T18:04:39Z

@ksakellis I mailed the ICLA and CCLA to [email protected] last week.

alex-the-man · 2016-11-02T00:40:50Z

Livy interpreters support two magics: %json and %table.
I think we can add another magic, %sql, to support this use case without adding a new REST API.

To run a SQL statement, a user can just POST to sessions/<id>/statements with {"code":"%sql <sql>"}.

zjffdu · 2016-11-02T00:47:25Z

Before that, I think we should implement SparkContext sharing across languages (#178), because users very often want to access a table that was created in the spark/pyspark/sparkr interpreter.

alex-the-man · 2016-11-02T01:06:30Z

Sorry, let me clarify. I'm proposing to add a %sql magic to all interpreters instead of creating a new interpreter. It shouldn't depend on #178.

meisam · 2016-11-02T21:49:31Z

@tc0312 How would a %sql magic work with SQL code that spans multiple lines? Or with queries that contain comments?

alex-the-man · 2016-11-02T21:59:26Z

It's up to us to define how the %sql magic works. We can define it so that it works with multiple lines and comments.
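
One possible definition, sketched in Python for brevity (Livy itself is written in Scala): treat the statement as SQL when its first non-blank token is `%sql`, and pass everything after that token, including newlines and comments, to the SQL parser unchanged. The function name `split_sql_magic` is hypothetical, and leaving comment handling to the SQL parser is an assumption rather than a settled design.

```python
def split_sql_magic(code):
    """Return the SQL text if `code` starts with the %sql magic, else None."""
    stripped = code.lstrip()
    if not stripped.startswith("%sql"):
        return None
    rest = stripped[len("%sql"):]
    # Require whitespace (or end of input) after the token so that
    # something like "%sqlfoo" is not mistaken for the magic.
    if rest and not rest[0].isspace():
        return None
    # Everything after the token is the query: inner newlines and
    # comments are kept as-is and left to the SQL parser.
    return rest.strip()


statement = """%sql
-- top customers
select name, total
from orders  /* multi-line query */
order by total desc"""
print(split_sql_magic(statement))
```

Because the magic is recognized only at the start of the statement, a `%sql` that appears later inside a query or a string literal is not reinterpreted.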

meisam · 2016-11-02T22:12:30Z

@tc0312 I guess I should clarify. My question is: which approach is easier to implement, a %sql magic or a separate SQL interpreter?

meisam · 2016-11-02T22:15:37Z

@tc0312 Actually there's a third approach that @zjffdu suggested: '{"sql":"select * from table_1"}'

alex-the-man · 2016-11-03T19:56:58Z

I think they are both easy to implement. I would prefer the magic approach because Livy already has magics and cloudera/hue is using them.

sven0726 · 2017-05-29T14:20:07Z

What was the final result of this issue?

zjffdu · 2017-05-31T23:05:43Z

It needs a more careful design; see https://issues.cloudera.org/browse/LIVY-194
We'd like to share the SparkContext across Scala/Python/R/SQL.

pkasinathan · 2017-06-01T00:40:23Z

Hi @zjffdu,

Enabling a shared SparkContext across Scala/Python/R is the next phase, and it may take some time.

But since we already support Scala, Python and R interpreters, can we also add a SQL interpreter?

If we had a direct SQL interpreter, users could submit SQL statements straight to it instead of wrapping them in a HiveContext (before 2.0) or SparkSession (2.0 and later) call, or using a SQL magic every time.
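
To make the wrapping concrete, here is a Python sketch that generates the PySpark statement a user has to submit today for a given SQL string. It assumes the session exposes a `sqlContext` variable before Spark 2.0 and a `spark` variable from 2.0 on; those variable names and the helper `wrap_sql` are assumptions for illustration, not something this PR defines.

```python
def wrap_sql(sql, spark_major_version):
    """Wrap a SQL string in the PySpark call needed to run it.

    Assumes `sqlContext` (Spark 1.x) or `spark` (Spark 2.x) is already
    defined in the interpreter session.
    """
    entry_point = "spark" if spark_major_version >= 2 else "sqlContext"
    # repr() yields a valid Python string literal, so quotes and
    # newlines inside the query are escaped correctly.
    return f"{entry_point}.sql({sql!r}).show()"


print(wrap_sql("select * from table_1", 1))
print(wrap_sql("select * from table_1", 2))
```

With a dedicated SQL interpreter, the SQL string itself would be the whole statement and this wrapping step would disappear.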

Please let me know your thoughts.

Prabhu

Livy-19. Add Spark SQL support

3ee5815

Implementing a Livy SparkSql Interpreter. This leaves the SparkInterpreter untouched and confines most of the changes to the livy-repl component. Task-Url: https://issues.cloudera.org/browse/LIVY-19

meisam force-pushed the LIVY-19 branch from 12e4780 to 3ee5815 Compare June 10, 2016 18:51

meisam changed the title ~~DTTAHOE-77: Create SQL Kind Session on Livy Job Server~~ LIVY-19. Add Spark SQL support Jun 10, 2016

zjffdu reviewed Jun 10, 2016

alex-the-man force-pushed the master branch 2 times, most recently from 39a5162 to 2d6e026 Compare September 7, 2016 14:23
