[SPARK-5135][SQL] Add support for describe [extended] table to DDL in SQLContext #3935

Closed
wants to merge 2 commits into from

OopsOutOfMemory
Contributor

@AmplabJenkins

Can one of the admins verify this patch?

@OopsOutOfMemory
Contributor Author

e.g.

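// Run in spark-shell, where `sc` (the SparkContext) is already defined.
import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)
import sqlContext._

// Register a temporary table backed by the JSON data source.
val jsonDDL =
  """
    |CREATE TEMPORARY TABLE jsonTable
    |USING org.apache.spark.sql.json
    |OPTIONS (
    |  path 'file:///Users/shengli/git_repos/spark/examples/src/main/resources/people.json'
    |)""".stripMargin

sqlContext.sql(jsonDDL)

// Basic form: one row per column.
sqlContext.sql("describe jsonTable").collect()

// Extended form: additionally returns extended information.
sqlContext.sql("describe extended jsonTable").collect()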

scala> sqlContext.sql("describe jsonTable").collect()
15/01/08 01:20:11 INFO sources.DDLParser: is extended ? -> false
res1: Array[org.apache.spark.sql.Row] = Array([age,int,null], [name,string,null])

scala> 

scala> sqlContext.sql("describe extended jsonTable").collect()
15/01/08 01:20:14 INFO sources.DDLParser: is extended ? -> true
res2: Array[org.apache.spark.sql.Row] = Array([age,int,null], [name,string,null], [# extended,null,null])
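
The output above suggests the shape of the result: one row per column, plus a marker row for the extended form. A minimal sketch of that shape in plain Scala, assuming nothing about how the PR actually implements it; `Column`, `DescRow`, and `DescribeSketch` are hypothetical names, not Spark classes:

```scala
// Hypothetical stand-ins for illustration only; none of these are Spark classes.
final case class Column(name: String, dataType: String, comment: Option[String] = None)
final case class DescRow(colName: String, dataType: Option[String], comment: Option[String])

object DescribeSketch {
  // One row per column. None plays the role of the null cells in the REPL output.
  def describe(columns: Seq[Column], isExtended: Boolean): Seq[DescRow] = {
    val base = columns.map(c => DescRow(c.name, Some(c.dataType), c.comment))
    // Assumption: the extended form only appends a marker row, as in the sample output.
    if (isExtended) base :+ DescRow("# extended", None, None) else base
  }

  // Columns matching the people.json example.
  val people: Seq[Column] = Seq(Column("age", "int"), Column("name", "string"))
}
```

Calling `DescribeSketch.describe(DescribeSketch.people, isExtended = true)` yields the two column rows followed by the marker row.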

@marmbrus
Contributor

marmbrus commented Jan 7, 2015

ok to test

@marmbrus
Contributor

marmbrus commented Jan 7, 2015

you'll also need to add test cases

@SparkQA

SparkQA commented Jan 7, 2015

Test build #25179 has started for PR 3935 at commit 1c02744.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 7, 2015

Test build #25179 has finished for PR 3935 at commit 1c02744.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DDLStrategy(context: SQLContext) extends Strategy
    • case class DDLDescribeCommand(

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25179/
Test FAILed.

@@ -328,18 +328,19 @@ class SQLContext(@transient val sparkContext: SparkContext)
def numPartitions = self.numShufflePartitions

def strategies: Seq[Strategy] =
extraStrategies ++ (
Contributor

Nit: a :: b :: c :: Nil is exactly the same as Seq(a, b, c). :)
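
The nit can be checked in plain Scala. A minimal sketch, with strings standing in for the strategy objects (`ListVsSeq` and its members are hypothetical names used only for illustration):

```scala
object ListVsSeq {
  // The cons-style spelling builds a List explicitly.
  val viaCons: List[String] = "a" :: "b" :: "c" :: Nil
  // The factory-style spelling builds a Seq.
  val viaSeq: Seq[String] = Seq("a", "b", "c")

  // Sequences compare element by element, so the two spellings are equal.
  val sameElements: Boolean = viaCons == viaSeq
  // With the standard library's default, Seq(...) is backed by an immutable List.
  val seqIsList: Boolean = viaSeq.isInstanceOf[List[_]]
}
```

Both flags evaluate to `true`, so the choice between the two forms is stylistic.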

Contributor Author

Yeah, I just made this the same as the HiveContext way.
@chenghao-intel
This test failed with "Refer to this link for build results (access rights to CI server needed)".
Do I have no access rights to the CI server?

Contributor

This seems to be an error-reporting bug. In any case, the root cause is that the build failed the code style check:
/home/jenkins/workspace/SparkPullRequestBuilder/sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala:198:21: Insert a space after the start of the comment

Contributor

Can we revert the SQLContext change in this PR, since the two forms are equivalent?

Contributor Author

I think it's OK; it at least saves some space, since "::" is two characters and "," is one.

Contributor

As @chenghao-intel said, do you mind not changing this part? It would be great if the PR focuses on what it does.

Contributor Author

@chenghao-intel @rxin
ok, got it. I will change this.

@chenghao-intel
Contributor

That's a nice feature in general. I agree with @marmbrus: we do need a test suite for this.

@OopsOutOfMemory
Contributor Author

@chenghao-intel @marmbrus
Added test cases.

@OopsOutOfMemory
Contributor Author

ok to test

@SparkQA

SparkQA commented Jan 8, 2015

Test build #25198 has started for PR 3935 at commit d0fe2d6.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 8, 2015

Test build #25198 has finished for PR 3935 at commit d0fe2d6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DDLStrategy(context: SQLContext) extends Strategy
    • case class DDLDescribeCommand(

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25198/
Test PASSed.


override def run(sqlContext: SQLContext) = {
val rows = new ArrayBuffer[Row]()
rows += Row("# col_name", "data_type", "comment")
Contributor

"# col_name", "data_type", "comment" are the output field names; we'd better not include them as part of the output.

Contributor Author

@chenghao-intel
ok, I will remove it.
If partitioned tables are supported here in the future, I will add it the way Hive does:

# Partition Information
# col_name              data_type               comment
partition_col_name      string                  None

@SparkQA

SparkQA commented Jan 8, 2015

Test build #25216 has started for PR 3935 at commit fbe6f6d.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 8, 2015

Test build #25216 has finished for PR 3935 at commit fbe6f6d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25216/
Test PASSed.

@SparkQA

SparkQA commented Jan 8, 2015

Test build #25226 has started for PR 3935 at commit 47f5593.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 8, 2015

Test build #25226 has finished for PR 3935 at commit 47f5593.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DDLDescribeCommand(dbName: Option[String], tableName: String, isExtended: Boolean) extends RunnableCommand

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25226/
Test FAILed.

@OopsOutOfMemory
Contributor Author

@scwf @chenghao-intel
I've changed the implementation to RunnableCommand and removed DDLStrategy.
Could you review it?
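
For readers unfamiliar with the command style mentioned here, the idea is that a command object carries its parameters and produces its own result rows when run. A rough, self-contained sketch of that pattern follows; `CatalogSketch`, `RunnableSketch`, and `DescribeCommandSketch` are hypothetical stand-ins, not Spark's `SQLContext` or `RunnableCommand`, and the lookup logic is an assumption made only for illustration:

```scala
// Hypothetical stand-ins for illustration; Spark's own SQLContext and
// RunnableCommand are not used here.
final case class CatalogSketch(tables: Map[String, Seq[(String, String)]])

trait RunnableSketch {
  def run(catalog: CatalogSketch): Seq[(String, String)]
}

// Constructor shape copied from the SparkQA report above.
final case class DescribeCommandSketch(
    dbName: Option[String],
    tableName: String,
    isExtended: Boolean) extends RunnableSketch {

  // Assumption: tables are keyed as "db.table", or just "table" without a database.
  private def key: String = dbName.map(_ + ".").getOrElse("") + tableName

  override def run(catalog: CatalogSketch): Seq[(String, String)] = {
    val columns = catalog.tables.getOrElse(key, sys.error(s"Table not found: $key"))
    // Assumption: extended output only appends a marker row.
    if (isExtended) columns :+ ("# extended" -> "") else columns
  }
}
```

A caller would build the command from the parsed statement and invoke `run` against whatever holds the table metadata.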

@OopsOutOfMemory
Contributor Author

test this please.

@SparkQA

SparkQA commented Jan 8, 2015

Test build #25227 has started for PR 3935 at commit f5821ae.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 8, 2015

Test build #25229 has started for PR 3935 at commit aa34164.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25639 has started for PR 3935 at commit 521bbd7.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25639 has finished for PR 3935 at commit 521bbd7.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DDLDescribeCommand(

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25639/
Test FAILed.

@OopsOutOfMemory
Contributor Author

@rxin The rebase may have gone wrong, because the PR now shows 278 changed files. How do I revert it?

@rxin
Contributor

rxin commented Jan 16, 2015

Not sure - do you have a backup? Maybe just take a diff and apply the diff on master.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25651 has started for PR 3935 at commit 9efdf35.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25651 has finished for PR 3935 at commit 9efdf35.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DDLDescribeCommand(

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25651/
Test FAILed.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25654 has started for PR 3935 at commit 9d22708.

  • This patch merges cleanly.

@scwf
Contributor

scwf commented Jan 16, 2015

@OopsOutOfMemory, you need to revert the unnecessary changes.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25655 has started for PR 3935 at commit 5b7ae19.

  • This patch merges cleanly.

@OopsOutOfMemory
Contributor Author

Thanks @scwf @rxin, the conflicts are resolved cleanly and this is now up to date.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25654 has finished for PR 3935 at commit 9d22708.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DDLDescribeCommand(

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25654/
Test PASSed.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25655 has finished for PR 3935 at commit 5b7ae19.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DDLDescribeCommand(

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25655/
Test PASSed.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25665 has started for PR 3935 at commit d1689e2.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 16, 2015

Test build #25665 has finished for PR 3935 at commit d1689e2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class DDLDescribeCommand(

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25665/
Test PASSed.

val ddlPlan = ddlParser(sqlText)
val basicPlan = try {
HiveQl.parseSql(sqlText)
}catch {
Contributor

add a space here

Contributor

Actually, it's OK; I will fix it myself.

rxin added a commit to rxin/spark that referenced this pull request Jan 21, 2015
[SPARK-5135][SQL] Add support for describe [extended] table to DDL in SQLContext

Conflicts:
	sql/catalyst/src/main/scala/org/apache/spark/sql/types/dataTypes.scala
@rxin
Contributor

rxin commented Jan 21, 2015

Thanks. I resolved the conflict and opened PR #4127.

@OopsOutOfMemory
Contributor Author

Thanks @rxin
Should I close this PR?

@rxin
Contributor

rxin commented Jan 21, 2015

Yea we can close this one. Will merge that one when tests pass.
