[SPARK-21871][SQL] Fix infinite loop when bytecode size is larger than spark.sql.codegen.hugeMethodLimit #19440
Conversation
Test build #82485 has finished for PR 19440 at commit
    s"`${SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key}`:\n$treeString")
  child match {
    // For batch file source scan, we should continue executing it
    case f: FileSourceScanExec if f.supportsBatch => // do nothing
I feel it's a little weird that WholeStageCodegenExec has specific error handling for each Spark plan. Could we handle this error inside FileSourceScanExec instead? For example, how about checking parent.isInstanceOf[WholeStageCodegenExec] in FileSourceScanExec?
If we do it in FileSourceScanExec, we are unable to know what caused the fallback. Right now, we have at least two reasons that can trigger the fallback. Ideally, we should not call WholeStageCodegenExec in doExecute at all.
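The mutual recursion being discussed can be sketched with a minimal, self-contained toy model (plain Python, not Spark's real API; the class names and sizes are stand-ins): the scan's doExecute wraps itself back in whole-stage codegen when it supports batch reads, so an unconditional fallback to child.execute() never terminates.

```python
class FileSourceScan:
    """Toy stand-in for FileSourceScanExec."""

    def __init__(self, supports_batch):
        self.supports_batch = supports_batch

    def execute(self, depth=0):
        if depth > 10:
            # Stand-in for the real never-ending loop between scan and codegen.
            raise RecursionError("fallback loops between scan and codegen")
        if self.supports_batch:
            # The re-entry into codegen that the reviewer suggests refactoring away.
            return WholeStageCodegen(self).execute(depth + 1)
        return "row-based scan"


class WholeStageCodegen:
    """Toy stand-in for WholeStageCodegenExec with an over-limit method."""

    def __init__(self, child, bytecode_size=9000, huge_method_limit=8000):
        self.child = child
        self.fallback = bytecode_size > huge_method_limit

    def execute(self, depth=0):
        if self.fallback:
            return self.child.execute(depth)  # unconditional fallback -> loop
        return "compiled whole-stage plan"
```

For a row-based child the fallback terminates normally; only the batch-capable scan re-enters codegen and loops, which is why the fix special-cases that child.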
Yea, I totally agree that we need to refactor this in the future. Anyway, it's OK for now.
Thanks for pinging! LGTM except for one comment.
  withSQLConf(SQLConf.WHOLESTAGE_MAX_NUM_FIELDS.key -> "202",
    SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key -> "8000") {
    // do not return batch, because whole stage codegen is disabled for wide table (>202 columns)
Is this comment wrong, or do I misunderstand it? It looks like it does return a batch, since it asserts supportsBatch.
This was copied and pasted; will fix it.
      return child.execute()
    s"`${SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key}`:\n$treeString")
  child match {
    // For batch file source scan, we should continue executing it
It's better to explain why we should continue it. Otherwise later readers may not understand it immediately.
LGTM except for two minor comments.
Test build #82494 has finished for PR 19440 at commit
Thanks! Merged to master.
@gatorsmile I have a question: should this also be handled in other execs? For example, like https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala#L306 and https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala#L111
…n spark.sql.codegen.hugeMethodLimit

When exceeding `spark.sql.codegen.hugeMethodLimit`, the runtime falls back to the Volcano iterator solution. This could cause an infinite loop when `FileSourceScanExec` can use the columnar batch to read the data. This PR is to fix the issue.

Added a test

Author: gatorsmile <[email protected]>

Closes apache#19440 from gatorsmile/testt.
What changes were proposed in this pull request?
When exceeding spark.sql.codegen.hugeMethodLimit, the runtime falls back to the Volcano iterator solution. This could cause an infinite loop when FileSourceScanExec can use the columnar batch to read the data. This PR is to fix the issue.
How was this patch tested?
Added a test
Added a test