[SPARK-33879][SQL] Char Varchar values fails w/ match error as partition columns #30887

Closed

yaooqinn
Member

What changes were proposed in this pull request?

spark-sql> select * from t10 where c0='abcd';
20/12/22 15:43:38 ERROR SparkSQLDriver: Failed in [select * from t10 where c0='abcd']
scala.MatchError: CharType(10) (of class org.apache.spark.sql.types.CharType)
	at org.apache.spark.sql.catalyst.expressions.CastBase.cast(Cast.scala:815)
	at org.apache.spark.sql.catalyst.expressions.CastBase.cast$lzycompute(Cast.scala:842)
	at org.apache.spark.sql.catalyst.expressions.CastBase.cast(Cast.scala:842)
	at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeEval(Cast.scala:844)
	at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:476)
	at org.apache.spark.sql.catalyst.catalog.CatalogTablePartition.$anonfun$toRow$2(interface.scala:164)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at org.apache.spark.sql.types.StructType.foreach(StructType.scala:102)
	at scala.collection.TraversableLike.map(TraversableLike.scala:238)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
	at org.apache.spark.sql.types.StructType.map(StructType.scala:102)
	at org.apache.spark.sql.catalyst.catalog.CatalogTablePartition.toRow(interface.scala:158)
	at org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$.$anonfun$prunePartitionsByFilter$3(ExternalCatalogUtils.scala:157)
	at org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils$.$anonfun$prunePartitionsByFilter$3$adapted(ExternalCatalogUtils.scala:156)

c0 is a partition column, so the query fails in the partition pruning rule.

In this PR, we replace char/varchar with string type before the CAST happens.
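
To illustrate the idea, here is a minimal toy model in Python. It is not Spark's code: every class and function name below is a hypothetical stand-in. It only shows why a cast with no branch for char/varchar fails when a partition value is converted, and how swapping those types for string beforehand avoids the failure.

```python
from dataclasses import dataclass


# Toy stand-ins for SQL data types. These are NOT Spark's classes.
@dataclass(frozen=True)
class StringType:
    pass


@dataclass(frozen=True)
class IntegerType:
    pass


@dataclass(frozen=True)
class CharType:
    length: int


@dataclass(frozen=True)
class VarcharType:
    length: int


def replace_char_varchar_with_string(dt):
    """Hypothetical helper: map char/varchar to string, leave other types alone."""
    if isinstance(dt, (CharType, VarcharType)):
        return StringType()
    return dt


def cast_partition_value(raw, dt):
    """Cast a raw partition string to dt. Deliberately has no char/varchar branch."""
    if isinstance(dt, StringType):
        return raw
    if isinstance(dt, IntegerType):
        return int(raw)
    # Stand-in for the scala.MatchError in the stack trace above.
    raise TypeError(f"no cast rule for {dt!r}")


def partition_to_row(spec, schema, normalize=True):
    """Turn a partition spec (column name -> raw string) into typed values."""
    row = []
    for name, dt in schema:
        # The fix modeled here: normalize the target type before casting.
        target = replace_char_varchar_with_string(dt) if normalize else dt
        row.append(cast_partition_value(spec[name], target))
    return row


schema = [("c0", CharType(10)), ("p", IntegerType())]
spec = {"c0": "abcd", "p": "7"}
row = partition_to_row(spec, schema)
```

Calling `partition_to_row(spec, schema, normalize=False)` raises the `TypeError`, which plays the role of the match error; with the default `normalize=True` the char column is cast as a plain string. The sketch ignores char padding and length checks.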

Why are the changes needed?

bugfix, see the case above

Does this PR introduce any user-facing change?

no

How was this patch tested?

yes, new tests

@github-actions github-actions bot added the SQL label Dec 22, 2020
@SparkQA

SparkQA commented Dec 22, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37806/

@SparkQA

SparkQA commented Dec 22, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37806/

@yaooqinn
Member Author

cc @cloud-fan @maropu @HyukjinKwon thanks for review

Contributor

@cloud-fan cloud-fan left a comment

good catch!

@SparkQA

SparkQA commented Dec 22, 2020

Test build #133208 has finished for PR 30887 at commit b46659b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Member Author

retest this please

@SparkQA

SparkQA commented Dec 22, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37825/

@SparkQA

SparkQA commented Dec 22, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37825/

@SparkQA

SparkQA commented Dec 22, 2020

Test build #133227 has finished for PR 30887 at commit b46659b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

Seems like real test failures: org.apache.spark.sql.HiveCharVarcharTestSuite.char type comparison: partition pruning

@yaooqinn
Member Author

So weird that GA passed...

@SparkQA

SparkQA commented Dec 22, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37833/

@SparkQA

SparkQA commented Dec 22, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37833/

@SparkQA

SparkQA commented Dec 22, 2020

Test build #133235 has finished for PR 30887 at commit 6e3e39b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Member Author

retest this please

@SparkQA

SparkQA commented Dec 23, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37846/

@SparkQA

SparkQA commented Dec 23, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37846/

@SparkQA

SparkQA commented Dec 23, 2020

Test build #133248 has finished for PR 30887 at commit 6e3e39b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Member Author

retest this please

@HyukjinKwon
Member

org.scalatest.exceptions.TestFailedException: transform.sql
Expected "...h status 127. Error:[ /bin/bash: some_non_existent_command: command not found]", but got "...h status 127. Error:[]" Result did not match for query #2
SELECT TRANSFORM(a)
USING 'some_non_existent_command' AS (a)
FROM t

is being fixed in #30896 (comment)

@HyukjinKwon
Member

Merged to master and branch-3.1.

@yaooqinn
Member Author

thanks for merging and review

HyukjinKwon pushed a commit that referenced this pull request Dec 23, 2020
…ion columns

Closes #30887 from yaooqinn/SPARK-33879.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit 2287f56)
Signed-off-by: HyukjinKwon <[email protected]>
@SparkQA

SparkQA commented Dec 23, 2020

Test build #133263 has finished for PR 30887 at commit 6e3e39b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
