[BUG] [Spark 4] Exceptions from casting to DATE/TIMESTAMP do not match the Spark's exceptions, with ANSI enabled #11556

Open
mythrocks opened this issue Oct 2, 2024 · 0 comments
Labels
bug Something isn't working Spark 3.5+ Spark 3.5+ issues Spark 4.0+ Spark 4.0+ issues

mythrocks commented Oct 2, 2024

Description
With ANSI enabled, when an invalid STRING row is cast to DATE, the exception raised by the plugin does not match the one raised by Apache Spark, in either exception type or message.

Edit: Looks like this isn't exclusive to Spark 4. Reproducible on Spark 3.5, and likely others.

Repro

sql(" SELECT '7452-68-35' AS str ").write.mode("overwrite").parquet("/tmp/myth/repro")

spark.conf.set("spark.rapids.sql.hasExtendedYearValues", false )
spark.conf.set("spark.sql.ansi.enabled", true )

spark.read.parquet("/tmp/myth/repro").selectExpr(" CAST(str AS DATE) ").show

With Apache Spark 4, the exception looks as follows:

org.apache.spark.SparkDateTimeException: [CAST_INVALID_INPUT] The value '7452-68-35' of the type "STRING" cannot be cast to "DATE" because it is malformed. Correct the value as per the syntax, or change its target type. Use `try_cast` to tolerate malformed input and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error. SQLSTATE: 22018

The plugin fails thus:

java.time.DateTimeException: One or more values could not be converted to DateType
        at com.nvidia.spark.rapids.GpuCast$.$anonfun$checkResultForAnsiMode$5(GpuCast.scala:1331)
        at com.nvidia.spark.rapids.GpuCast$.$anonfun$checkResultForAnsiMode$5$adapted(GpuCast.scala:1329)
        at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:30)

Expected behavior
The exceptions for the same Spark version with/without the plugin should match.

This needs to be sorted out whenever ANSI mode behaviours are tackled.
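
One way to make the expected behavior checkable is to compare the exception class name and the bracketed error class at the start of the message, rather than the full message text. The following is a minimal sketch under that assumption, not code from Spark or the plugin: `ExceptionParity`, `Failure`, `errorClass`, and `mismatches` are hypothetical names, and the CPU message in `main` is a truncated copy of the one quoted above.

```scala
// Hypothetical helper for illustration; not part of Spark or the RAPIDS plugin.
object ExceptionParity {
  // A captured failure: the exception's class name and its message.
  final case class Failure(className: String, message: String)

  // Spark error messages begin with an error class in square brackets,
  // e.g. "[CAST_INVALID_INPUT] The value ...".
  private val ErrorClass = """^\[([A-Z0-9_.]+)\]""".r

  // Extract the leading error class, if the message has one.
  def errorClass(message: String): Option[String] =
    ErrorClass.findFirstMatchIn(message).map(_.group(1))

  // List every difference between the CPU and GPU failures.
  // An empty result means the two failures match on both criteria.
  def mismatches(cpu: Failure, gpu: Failure): Seq[String] = {
    val classDiff =
      if (cpu.className == gpu.className) None
      else Some(s"exception class: expected ${cpu.className}, got ${gpu.className}")
    val cpuError = errorClass(cpu.message)
    val gpuError = errorClass(gpu.message)
    val errorClassDiff =
      if (cpuError == gpuError) None
      else Some(s"error class: expected $cpuError, got $gpuError")
    classDiff.toSeq ++ errorClassDiff.toSeq
  }

  def main(args: Array[String]): Unit = {
    // Messages taken from this issue; the CPU message is truncated here.
    val cpu = Failure(
      "org.apache.spark.SparkDateTimeException",
      "[CAST_INVALID_INPUT] The value '7452-68-35' of the type \"STRING\" cannot be cast to \"DATE\" because it is malformed.")
    val gpu = Failure(
      "java.time.DateTimeException",
      "One or more values could not be converted to DateType")
    mismatches(cpu, gpu).foreach(println)
  }
}
```

Comparing only the class name and error class keeps the check stable across Spark versions whose message wording differs. A stricter check could compare full messages as well.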

@mythrocks mythrocks added bug Something isn't working Spark 4.0+ Spark 4.0+ issues Spark 3.5+ Spark 3.5+ issues labels Oct 2, 2024
@mythrocks mythrocks changed the title [BUG] [Spark 4] Exceptions from casting STRING to DATE do not match the Spark's exceptions, with ANSI enabled [BUG] [Spark 4] Exceptions from casting STRING to DATE/TIMESTAMP do not match the Spark's exceptions, with ANSI enabled Oct 2, 2024
@mythrocks mythrocks changed the title [BUG] [Spark 4] Exceptions from casting STRING to DATE/TIMESTAMP do not match the Spark's exceptions, with ANSI enabled [BUG] [Spark 4] Exceptions from casting to DATE/TIMESTAMP do not match the Spark's exceptions, with ANSI enabled Oct 2, 2024