[SPARK-34701][SQL] Remove analyzing temp view again in CreateViewCommand #31933
Changes from 5 commits
ff7fabb
5829e48
c0e9723
6fdd9e0
e04fbdd
a14b3b9
507b00c
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -48,8 +48,8 @@ import org.apache.spark.sql.util.SchemaUtils | |
* @param properties the properties of this view. | ||
* @param originalText the original SQL text of this view, can be None if this view is created via | ||
* Dataset API. | ||
* @param child the logical plan that represents the view; this is used to generate the logical | ||
* plan for temporary view and the view schema. | ||
* @param analyzedPlan the logical plan that represents the view; this is used to generate the | ||
* logical plan for temporary view and the view schema. | ||
* @param allowExisting if true, and if the view already exists, noop; if false, and if the view | ||
* already exists, throws analysis exception. | ||
* @param replace if true, and if the view already exists, updates it; if false, and if the view | ||
|
@@ -62,15 +62,15 @@ case class CreateViewCommand( | |
comment: Option[String], | ||
properties: Map[String, String], | ||
originalText: Option[String], | ||
child: LogicalPlan, | ||
analyzedPlan: LogicalPlan, | ||
cloud-fan marked this conversation as resolved.
|
||
allowExisting: Boolean, | ||
replace: Boolean, | ||
viewType: ViewType) | ||
extends RunnableCommand { | ||
|
||
import ViewHelper._ | ||
|
||
override def innerChildren: Seq[QueryPlan[_]] = Seq(child) | ||
override def innerChildren: Seq[QueryPlan[_]] = Seq(analyzedPlan) | ||
|
||
if (viewType == PersistedView) { | ||
require(originalText.isDefined, "'originalText' must be provided to create permanent view") | ||
|
@@ -96,11 +96,6 @@ case class CreateViewCommand( | |
} | ||
|
||
override def run(sparkSession: SparkSession): Seq[Row] = { | ||
// If the plan cannot be analyzed, throw an exception and don't proceed. | ||
val qe = sparkSession.sessionState.executePlan(child) | ||
qe.assertAnalyzed() | ||
val analyzedPlan = qe.analyzed | ||
Actually this is a regression. I think we need to make
Good catch. One issue with "make CreateViewCommand.analyzedPlan a real child" is that it will become an optimized plan, which will affect the caching. I can try to make certain command's children to skip optimizer. WDYT?
It will be great to have such a mechanism to skip optimizer, useful to
OK, I will give it a shot.
@cloud-fan I created #32032 to handle this scenario. Please let me know what you think. TIA!
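The thread above turns on the difference between a plan held as a real child and one exposed only through `innerChildren`. The following is a minimal toy sketch of that difference, assuming tree rules descend only through `children`. The types `Plan`, `Relation`, `Filter`, `CreateViewWithChild`, and `CreateViewWithInnerChild` are hypothetical stand-ins written for illustration; they are not Spark's `LogicalPlan` classes or rule API.

```scala
// Toy model only: hypothetical types, not Spark's LogicalPlan or Rule classes.
sealed trait Plan {
  def children: Seq[Plan]
  def withChildren(newChildren: Seq[Plan]): Plan

  // Apply `rule` bottom-up, descending only through `children`.
  def transformUp(rule: PartialFunction[Plan, Plan]): Plan = {
    val rewritten: Plan = withChildren(children.map(_.transformUp(rule)))
    rule.applyOrElse(rewritten, (p: Plan) => p)
  }
}

case class Relation(name: String) extends Plan {
  def children: Seq[Plan] = Nil
  def withChildren(newChildren: Seq[Plan]): Plan = this
}

case class Filter(condition: String, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
}

// Holds the view plan as a real child: rules can reach and rewrite it.
case class CreateViewWithChild(name: String, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
}

// Holds the view plan as an inner child: visible for display, never traversed by rules.
case class CreateViewWithInnerChild(name: String, analyzedPlan: Plan) extends Plan {
  def children: Seq[Plan] = Nil
  def innerChildren: Seq[Plan] = Seq(analyzedPlan)
  def withChildren(newChildren: Seq[Plan]): Plan = this
}

object ChildVsInnerChild {
  // Stand-in "optimizer" rule: drop filters whose condition is trivially true.
  val removeTrivialFilter: PartialFunction[Plan, Plan] = {
    case Filter("true", child) => child
  }

  val viewPlan: Plan = Filter("true", Relation("t"))

  // The real child is rewritten by the rule.
  val optimizedReal: Plan =
    CreateViewWithChild("v", viewPlan).transformUp(removeTrivialFilter)

  // The inner child is left exactly as it was passed in.
  val optimizedInner: Plan =
    CreateViewWithInnerChild("v", viewPlan).transformUp(removeTrivialFilter)
}
```

In this sketch the plan stored as a real child no longer equals the plan that was passed in once the rule has run, while the inner child is untouched. That mirrors the trade-off raised in the thread: a real child would be processed by later phases, but the command would then see a rewritten plan, which is the stated concern for caching.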
||
|
||
if (userSpecifiedColumns.nonEmpty && | ||
userSpecifiedColumns.length != analyzedPlan.output.length) { | ||
throw new AnalysisException(s"The number of columns produced by the SELECT clause " + | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -94,14 +94,21 @@ case class CacheTableAsSelectExec( | |
override lazy val relationName: String = tempViewName | ||
|
||
override lazy val planToCache: LogicalPlan = { | ||
// CacheTableAsSelectExec.query is not resolved yet (e.g., not a child of CacheTableAsSelect) | ||
// in order to skip optimizing it; note that we need to pass an analyzed plan to | ||
// CreateViewCommand for the cache to work correctly. Thus, the query is analyzed below. | ||
Can we further clean this up by creating
||
val qe = sparkSession.sessionState.executePlan(query) | ||
qe.assertAnalyzed() | ||
val analyzedPlan = qe.analyzed | ||
imback82 marked this conversation as resolved.
|
||
|
||
Dataset.ofRows(sparkSession, | ||
CreateViewCommand( | ||
name = TableIdentifier(tempViewName), | ||
userSpecifiedColumns = Nil, | ||
comment = None, | ||
properties = Map.empty, | ||
originalText = Some(originalText), | ||
child = query, | ||
analyzedPlan = analyzedPlan, | ||
allowExisting = false, | ||
replace = false, | ||
viewType = LocalTempView | ||
|
shall we move it to L682?