
[SPARK-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase #31273

Closed · 38 commits

Conversation

imback82 (Contributor, Author):

What changes were proposed in this pull request?

This PR proposes to make CreateViewStatement.child one of LogicalPlan's children so that it is resolved during the analysis phase.

Why are the changes needed?

Currently, the CreateViewStatement.child is resolved when the create view command runs, which is inconsistent with other plan resolutions. For example, you may see the following in the physical plan:

== Physical Plan ==
Execute CreateViewCommand (1)
   +- CreateViewCommand (2)
         +- Project (4)
            +- UnresolvedRelation (3)

Does this PR introduce any user-facing change?

Yes. For the example, you will now see the resolved plan:

== Physical Plan ==
Execute CreateViewCommand (1)
   +- CreateViewCommand (2)
         +- Project (5)
            +- SubqueryAlias (4)
               +- LogicalRelation (3)
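The mechanics can be sketched with a toy resolver (hypothetical simplified types, not Spark's real classes): a node that exposes its query through `children` is visited by the analyzer's generic traversal, while a node that hides its query is skipped and stays unresolved until the command actually runs.

```scala
// Toy sketch: `CreateView` hides its query (pre-change behavior),
// `CreateViewAsChild` exposes it via `children` (post-change behavior).
sealed trait Plan { def children: Seq[Plan]; def resolved: Boolean }

case class UnresolvedRelation(name: String) extends Plan {
  def children: Seq[Plan] = Nil; def resolved = false
}
case class Relation(name: String) extends Plan {
  def children: Seq[Plan] = Nil; def resolved = true
}
case class CreateView(hiddenQuery: Plan) extends Plan {
  // The query is NOT part of `children`, so a generic traversal never sees it.
  def children: Seq[Plan] = Nil
  def resolved = true
}
case class CreateViewAsChild(query: Plan) extends Plan {
  // The query IS a child, so the resolver reaches and resolves it.
  def children: Seq[Plan] = Seq(query)
  def resolved: Boolean = query.resolved
}

// A stand-in for the analyzer: resolve relations, recurse into exposed children.
def resolve(p: Plan): Plan = p match {
  case UnresolvedRelation(n) => Relation(n)
  case c: CreateViewAsChild  => c.copy(query = resolve(c.query))
  case other                 => other
}
```

With the query exposed as a child, the explain output above shows a resolved `SubqueryAlias`/`LogicalRelation` under the command because the analyzer already resolved the query before execution.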

How was this patch tested?

Updated existing tests.

@github-actions github-actions bot added the SQL label Jan 21, 2021
@@ -2029,7 +2029,7 @@ class DataSourceV2SQLSuite
   test("CREATE VIEW") {
     val v = "testcat.ns1.ns2.v"
     val e = intercept[AnalysisException] {
-      sql(s"CREATE VIEW $v AS SELECT * FROM tab1")
+      sql(s"CREATE VIEW $v AS SELECT 1")
imback82 (Contributor, Author):

This needs to be updated now that ResolveSessionCatalog handles CreateViewStatement only if its child is resolved.

 case plan if !plan.resolved => plan.expressions.flatMap(_.collect {
-  case s @ SubqueryAlias(_, view: View) if view.isTempView =>
+  case s: SubqueryAlias if s.getTagValue(SUBQUERY_TYPE_TAG).exists(_ == "tempView") =>
     Seq(s.identifier.qualifier :+ s.identifier.name)
imback82 (Contributor, Author):

This seems a bit hacky, but SubqueryAlias(_, view: View) couldn't handle all the cases. For example,

spark.range(10).createTempView("t")
sql("CREATE VIEW v AS SELECT * FROM t")

The child is:

Project [id#16L]
+- SubqueryAlias t
   +- Range (0, 10, step=1, splits=Some(2))

@cloud-fan, @viirya Any suggestion to capture the view properly? Does it make sense to add an isView field to SubqueryAlias?

imback82 (Contributor, Author) · Jan 21, 2021:

Or is it safe to do: case s: SubqueryAlias if catalog.isTempView(s.identifier.name)?

I see the comment "After replacement, it is impossible to detect whether the SubqueryAlias is added/generated from a temporary view." If we can add a field to SubqueryAlias, this should be easy to handle.
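To make the tradeoff concrete, here is a heavily simplified, hypothetical sketch of the two detection strategies under discussion (the types and the `SUBQUERY_TYPE` tag name below are stand-ins, not Spark's real API):

```scala
// Stand-ins for Spark's SubqueryAlias/View; `tags` mimics TreeNode tags.
case class View(isTempView: Boolean)
case class SubqueryAlias(name: String,
                         child: Any,
                         tags: Map[String, String] = Map.empty)

// Strategy A: match on a wrapped View. Misses dataframe temp views, whose
// child is a bare plan (e.g. Range), not a View node.
def isTempViewByWrapper(s: SubqueryAlias): Boolean = s.child match {
  case v: View => v.isTempView
  case _       => false
}

// Strategy B: a tag set when the alias is created. Catches both cases, but
// the tag can silently disappear across plan transformations ("a bit hacky").
def isTempViewByTag(s: SubqueryAlias): Boolean =
  s.tags.get("SUBQUERY_TYPE").contains("tempView")
```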

Contributor:

This is a good point. We probably should use View to wrap temp view as well, to be truly consistent with permanent views. cc @linhongliu-db

imback82 (Contributor, Author):

Thanks, will try that.

Member:

+1 if it works, adding tag seems hacky.

Contributor:

Shall we not wrap dataframe temp view with View? The original goal is to make SQL temp view and permanent view consistent.

imback82 (Contributor, Author):

OK. In that case, we are left with two options:

  1. Add isView: Boolean field to the SubqueryAlias, or
  2. Introduce a new logical plan for handling dataframe temp view.

imback82 (Contributor, Author):

I will try 2) if there is no opposition to the approach.

Contributor:

OK I get your point now. If we need a new logical plan, then it's better to reuse View with some modification. +1 to make View.desc optional and set it to None for dataframe temp view. Or we can generate column names for dataframe temp view in View.desc, like col1, col2, ..., to bypass the check.
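The suggested shape can be sketched as follows (simplified stand-in types; the real `View` node carries output attributes and more):

```scala
// Simplified stand-ins: making the description optional lets dataframe temp
// views, which have no CatalogTable behind them, still be wrapped in View.
case class CatalogTable(viewText: String)
case class View(desc: Option[CatalogTable], isTempView: Boolean)

// SQL view: the description comes from the catalog / the parsed view text.
val sqlView = View(Some(CatalogTable("SELECT * FROM t")), isTempView = true)

// Dataframe temp view (spark.range(10).createTempView("t")): no view text,
// so desc is None and the viewText-based checks can be bypassed.
val dfView = View(None, isTempView = true)
```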

imback82 (Contributor, Author):

@cloud-fan / @viirya I updated to make View.desc optional to support dataframe temp views. Could you please check again? Thanks!

-    val e = cls.getConstructor(classOf[Seq[Expression]], clsForUDAF, classOf[Int], classOf[Int])
-      .newInstance(input,
-        clazz.getConstructor().newInstance().asInstanceOf[Object], Int.box(1), Int.box(1))
+    val e = cls.getConstructor(
Member:

There are some UDAF-related changes; are they related to the view work here?

imback82 (Contributor, Author):

Yea, we need to collect temp functions by name. Currently, UDAF doesn't store a name, so I added it.

SparkQA commented Jan 21, 2021

Test build #134304 has finished for PR 31273 at commit 71c01e8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38891/

SparkQA commented Jan 21, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38891/

@imback82 imback82 changed the title [Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase Jan 23, 2021
@imback82 imback82 marked this pull request as draft January 23, 2021 05:10
SparkQA commented Jan 23, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38982/

SparkQA commented Jan 23, 2021

Test build #134396 has finished for PR 31273 at commit 8dc1961.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 23, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38982/

SparkQA commented Jan 25, 2021

Test build #134465 has started for PR 31273 at commit 4b3f184.

SparkQA commented Jan 25, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39051/

SparkQA commented Jan 25, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39051/

SparkQA commented Jan 28, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39217/

SparkQA commented Jan 28, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39217/

SparkQA commented Jan 28, 2021

Test build #134629 has finished for PR 31273 at commit bfabe9f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jan 31, 2021

Test build #134683 has finished for PR 31273 at commit ac663aa.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Feb 24, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39972/

SparkQA commented Feb 24, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39972/

// Inserting into a file-based temporary view is allowed.
// (e.g., spark.read.parquet("path").createOrReplaceTempView("t")).
// Thus, we need to look at the raw plan of a temporary view.
getTempViewRawPlan(relation) match {
Contributor:

Ah sorry, I missed this. The relation here can be a table relation, not a temp view, so unwrapRelationPlan may be a better name.

We can fix it in followup if there are no other comments to address.

- * child should be a logical plan parsed from the `CatalogTable.viewText`, should throw an error
- * if the `viewText` is not defined.
+ * A container for holding the view description(CatalogTable) and info whether the view is temporary
+ * or not. If the view description is available, the child should be a logical plan parsed from the
cloud-fan (Contributor) · Feb 24, 2021:

Re "If the view description is available": it is always available now. This should say "if it's a SQL (temp) view, ...".

inputPlan: LogicalPlan,
expectedPlan: LogicalPlan,
caseSensitive: Boolean = true): Unit = {
checkAnalysisWithTransform(inputPlan, expectedPlan, caseSensitive) { plan =>
Contributor:

shall we inline checkAnalysisWithTransform here?

@@ -111,12 +111,11 @@ case class CreateViewCommand(

// When creating a permanent view, not allowed to reference temporary objects.
// This should be called after `qe.assertAnalyzed()` (i.e., `child` can be resolved)
-    verifyTemporaryObjectsNotExists(catalog, isTemporary, name, child)
+    verifyTemporaryObjectsNotExists(catalog, isTemporary, name, analyzedPlan)
Contributor:

since child is already resolved, we don't need to call sparkSession.sessionState.executePlan(child) and re-analyze it now.

imback82 (Contributor, Author):

CacheTableAsSelectExec.query may not be analyzed? (other places seem fine)

imback82 (Contributor, Author):

Just analyze it in CacheTableAsSelectExec and remove analyzing here?

Contributor:

Ok let's fix it in followup then. We probably should also fix CacheTableAsSelect and resolve the query in the analyzer.

imback82 (Contributor, Author) · Feb 25, 2021:

@cloud-fan if we remove sparkSession.sessionState.executePlan(child) and use child directly, do you think there would be an issue similar to CacheTable since child will be an optimized plan, not an analyzed plan?

SparkQA commented Feb 24, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39985/

SparkQA commented Feb 24, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39985/

SparkQA commented Feb 24, 2021

Test build #135392 has finished for PR 31273 at commit 84d817c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

cloud-fan (Contributor):

thanks, merging to master!

@cloud-fan cloud-fan closed this in 714ff73 Feb 24, 2021
SparkQA commented Feb 24, 2021

Test build #135406 has finished for PR 31273 at commit 82d58ba.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

imback82 (Contributor, Author):

Thanks @cloud-fan and @viirya for the review!

LuciferYang (Contributor):

@imback82 @cloud-fan @viirya after this PR, when running the UTs of the sql/core module, there are a lot of warning logs like the following:

20:14:06.700 WARN org.apache.spark.sql.execution.command.CommandUtils: Exception when attempting to uncache `largeAndSmallInts`
org.apache.spark.sql.AnalysisException: Table or view not found: largeAndSmallInts;
'UnresolvedRelation [largeAndSmallInts], [], false

	at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:123)
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:94)
	at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:182)
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:94)
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:91)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:155)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:176)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:228)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:173)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:74)
	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:144)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:144)
	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:74)
	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:72)
	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:64)
	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:90)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:88)
	at org.apache.spark.sql.DataFrameReader.table(DataFrameReader.scala:918)
	at org.apache.spark.sql.SparkSession.table(SparkSession.scala:593)
	at org.apache.spark.sql.internal.CatalogImpl.$anonfun$uncacheTable$2(CatalogImpl.scala:495)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.internal.CatalogImpl.uncacheTable(CatalogImpl.scala:495)
	at org.apache.spark.sql.execution.command.CommandUtils$.uncacheTableOrView(CommandUtils.scala:395)
	at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:121)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
	at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3705)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3703)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:228)
	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:91)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:88)
	at org.apache.spark.sql.Dataset.withPlan(Dataset.scala:3733)
	at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3318)
	at org.apache.spark.sql.test.SQLTestData.largeAndSmallInts(SQLTestData.scala:91)
	at org.apache.spark.sql.test.SQLTestData.largeAndSmallInts$(SQLTestData.scala:83)
	at org.apache.spark.sql.test.TestSparkSession$testData$.largeAndSmallInts$lzycompute(TestSQLContext.scala:50)
	at org.apache.spark.sql.test.TestSparkSession$testData$.largeAndSmallInts(TestSQLContext.scala:50)
	at org.apache.spark.sql.test.SQLTestData.loadTestData(SQLTestData.scala:300)
	at org.apache.spark.sql.test.SQLTestData.loadTestData$(SQLTestData.scala:293)
	at org.apache.spark.sql.test.TestSparkSession$testData$.loadTestData(TestSQLContext.scala:50)
	at org.apache.spark.sql.test.TestSparkSession.loadTestData(TestSQLContext.scala:47)
	at test.org.apache.spark.sql.JavaDatasetSuite.setUp(JavaDatasetSuite.java:62)
	at sun.reflect.GeneratedMethodAccessor121.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:258)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:364)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:237)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:158)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:428)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
	at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:562)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:548)

Is this expected behavior?

LuciferYang (Contributor):

Similar `Exception when attempting to uncache xxx` logs have been printed more than 3,700 times in the test log.

imback82 (Contributor, Author):

Thanks @LuciferYang, let me take a look.

cloud-fan (Contributor):

@LuciferYang thanks for reporting! I'm fixing it in #31650

dongjoon-hyun pushed a commit that referenced this pull request Feb 25, 2021
…'t exist

### What changes were proposed in this pull request?

This PR fixes a mistake in #31273. When we CREATE OR REPLACE a temp view, we need to uncache the to-be-replaced existing temp view. However, we shouldn't uncache if there is no existing temp view.

This doesn't cause real issues because the uncache action is failure-safe. But it produces a lot of warning messages.

### Why are the changes needed?

Avoid unnecessary warning logs.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

manually run tests and check the warning messages.

Closes #31650 from cloud-fan/warnning.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
   } else {
-    aliasedPlan
+    TemporaryViewRelation(
+      prepareTemporaryViewFromDataFrame(name, aliasedPlan),
Contributor:

Here is a mistake: in the if branch we call prepareTemporaryView(viewIdent, ...), and here we should also use viewIdent instead of name. A global temp view has a 2-part name, and the name may not be qualified (e.g. df.createGlobalTempView).

@imback82 can you help to fix it with a test?

imback82 (Contributor, Author):

good catch. will fix.

imback82 (Contributor, Author):

Created #31783. Thanks!
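The bug fixed in #31783 can be illustrated with a hypothetical helper (the real qualification logic and the configured global temp database name differ): for a global temp view the stored identifier must carry the global temp database, but the raw name from `df.createGlobalTempView` may be a bare 1-part name.

```scala
// Hypothetical sketch: qualifying a possibly 1-part view name with the
// global temp database before storing it ("global_temp" is the usual default).
case class TableIdentifier(table: String, database: Option[String])

val globalTempDb = "global_temp"

def qualify(name: TableIdentifier): TableIdentifier =
  name.copy(database = name.database.orElse(Some(globalTempDb)))

// df.createGlobalTempView("v") hands us a bare name...
val rawName = TableIdentifier("v", None)
// ...but the identifier we store must be the qualified one.
val viewIdent = qualify(rawName)
```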

cloud-fan pushed a commit that referenced this pull request Mar 9, 2021
… correctly stored

### What changes were proposed in this pull request?

This PR proposed to fix a bug introduced in #31273 (https://github.com/apache/spark/pull/31273/files#r589494855).
### Why are the changes needed?

This fixes a bug where global temp view's database name was not passed correctly.

### Does this PR introduce _any_ user-facing change?

Yes, now the global temp view's database is correctly stored.

### How was this patch tested?

Added a new test that catches the bug.

Closes #31783 from imback82/SPARK-34152-bug-fix.

Authored-by: Terry Kim <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
cloud-fan pushed a commit that referenced this pull request Mar 11, 2021
…alysis phase, and AlterViewAs should invalidate the cache

### What changes were proposed in this pull request?

This PR proposes the following:
   * `AlterViewAs.query` is currently analyzed in the physical operator `AlterViewAsCommand`, but it should be analyzed during the analysis phase.
   *  When `spark.sql.legacy.storeAnalyzedPlanForView` is set to true, store `TemporaryViewRelation` which wraps the analyzed plan, similar to #31273.
   *  Try to uncache the view you are altering.

### Why are the changes needed?

Analyzing a plan should be done in the analysis phase if possible.

Not uncaching the view (existing behavior) seems like a bug since the cache may not be used again.

### Does this PR introduce _any_ user-facing change?

Yes, now the view can be uncached if it's already cached.

### How was this patch tested?

Added new tests around uncaching.

The existing tests such as `SQLViewSuite` should cover the analysis changes.

Closes #31652 from imback82/alter_view_child.

Authored-by: Terry Kim <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
cloud-fan pushed a commit that referenced this pull request Mar 22, 2021
…d take/return more concrete types

### What changes were proposed in this pull request?

Now that all the temporary views are wrapped with `TemporaryViewRelation`(#31273, #31652, and #31825), this PR proposes to update `SessionCatalog`'s APIs for temporary views to take or return more concrete types.

APIs that will take `TemporaryViewRelation` instead of `LogicalPlan`:
```
createTempView, createGlobalTempView, alterTempViewDefinition
```

APIs that will return `TemporaryViewRelation` instead of `LogicalPlan`:
```
getRawTempView, getRawGlobalTempView
```

APIs that will return `View` instead of `LogicalPlan`:
```
getTempView, getGlobalTempView, lookupTempView
```
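The spirit of the refactoring can be sketched with a toy catalog (hypothetical simplified types, not the real SessionCatalog): narrowing a LogicalPlan-typed API to the node it always handles turns an implicit invariant into a compile-time guarantee.

```scala
// Toy stand-ins for the plan nodes involved.
trait LogicalPlan
case class TemporaryViewRelation(name: String) extends LogicalPlan

class SessionCatalog {
  private var tempViews = Map.empty[String, TemporaryViewRelation]

  // Before: createTempView(name: String, plan: LogicalPlan) accepted any plan.
  // After: the parameter type itself documents what callers may pass.
  def createTempView(name: String, view: TemporaryViewRelation): Unit =
    tempViews += name -> view

  // Likewise, callers of getRawTempView no longer need to downcast.
  def getRawTempView(name: String): Option[TemporaryViewRelation] =
    tempViews.get(name)
}
```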

### Why are the changes needed?

Internal refactoring to work with more concrete types.

### Does this PR introduce _any_ user-facing change?

No, this is internal refactoring.

### How was this patch tested?

Updated existing tests affected by the refactoring.

Closes #31906 from imback82/use_temporary_view_relation.

Authored-by: Terry Kim <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>