[SPARK-35170][SQL] Extend BinaryOperator by SubtractDates and SubtractTimestamps #32267

Closed
wants to merge 3 commits into from

MaxGekk
Member

@MaxGekk MaxGekk commented Apr 21, 2021

What changes were proposed in this pull request?

In the PR, I propose to modify the SubtractDates and SubtractTimestamps expressions to extend BinaryOperator instead of BinaryExpression.

Why are the changes needed?

To improve code maintenance.

Does this PR introduce any user-facing change?

No

How was this patch tested?

By existing test suites.

@MaxGekk
Member Author

MaxGekk commented Apr 21, 2021

@gengliangwang @cloud-fan FYI, this PR slightly changes the error messages (I guess it is not critical):

[info] - typeCoercion/native/promoteStrings.sql *** FAILED *** (5 seconds, 764 milliseconds)
[info]   typeCoercion/native/promoteStrings.sql
[info]   Expected "...data type mismatch: [argument 1 requires timestamp type, however, ''1'' is of string type].; line 1 pos 7", but got "...data type mismatch: [differing types in '('1' - CAST('2017-12-11 09:30:00.0' AS TIMESTAMP))' (string and timestamp)].; line 1 pos 7" Result did not match for query #23
[info]   typeCoercion/native/decimalPrecision.sql
[info]   Expected "...data type mismatch: [argument 2 requires timestamp type, however, 'CAST(1 AS DECIMAL(3,0))' is of decimal(3,0) type].; line 1 pos 7", but got "...data type mismatch: [differing types in '(CAST('2017-12-11 09:30:00.0' AS TIMESTAMP) - CAST(1 AS DECIMAL(3,0)))' (timestamp and decimal(3,0))].; line 1 pos 7" Result did not match for query #121
[info]   SELECT cast('2017-12-11 09:30:00.0' as timestamp) - cast(1 as decimal(3, 0)) FROM t (SQLQueryTestSuite.scala:459)
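
To make the difference between the two messages concrete, here is a minimal, self-contained sketch of the two styles of type check. It is a simplified model, not Spark's code: `Expr`, `TypeChecks`, `perArgument`, and `binaryOperator` are hypothetical names introduced only for illustration, and the message formats are copied from the test output quoted above.

```scala
// Simplified model for illustration only; these are NOT Spark's classes.
sealed trait DataType { def name: String }
case object StringType extends DataType { val name = "string" }
case object TimestampType extends DataType { val name = "timestamp" }

// A hypothetical stand-in for an analyzed expression: its SQL text and its type.
final case class Expr(sql: String, dataType: DataType)

object TypeChecks {
  // Per-argument check: report the first argument whose type differs
  // from the type expected at that position.
  def perArgument(args: Seq[Expr], expected: Seq[DataType]): Option[String] =
    args.zip(expected).zipWithIndex.collectFirst {
      case ((arg, exp), idx) if arg.dataType != exp =>
        s"argument ${idx + 1} requires ${exp.name} type, however, " +
          s"'${arg.sql}' is of ${arg.dataType.name} type"
    }

  // Operator-style check: both operands must have the same type,
  // so a mismatch is reported for the expression as a whole.
  def binaryOperator(left: Expr, right: Expr, symbol: String): Option[String] =
    if (left.dataType != right.dataType) {
      val sql = s"(${left.sql} $symbol ${right.sql})"
      Some(s"differing types in '$sql' " +
        s"(${left.dataType.name} and ${right.dataType.name})")
    } else None
}
```

With `'1'` as a string on the left and a timestamp cast on the right, the first check names the offending argument, while the second only reports that the two sides differ.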

@MaxGekk
Member Author

MaxGekk commented Apr 21, 2021

On second thought, no. It actually changes the behavior: the query now throws an exception instead of returning NULL:

[info] - typeCoercion/native/promoteStrings.sql *** FAILED *** (13 seconds, 407 milliseconds)
[info]   "NULL" did not contain "Exception" Exception did not match for query #24
[info]   SELECT '1' - cast('2017-12-11 09:30:00' as date)        FROM t, expected: NULL, but got: java.sql.SQLException
[info]   org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: cannot resolve '('1' - CAST('2017-12-11 09:30:00' AS DATE))' due to data type mismatch: differing types in '('1' - CAST('2017-12-11 09:30:00' AS DATE))' (string and date).; line 1 pos 7;

@SparkQA

SparkQA commented Apr 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42251/

@SparkQA

SparkQA commented Apr 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42251/

 -- !query output
-NULL
+org.apache.spark.sql.AnalysisException
Member Author

@cloud-fan @gengliangwang Are we OK with changing the behavior?

Member

This doesn't look good... Why does that happen?

Contributor

Probably because we have special type coercion logic for BinaryOperators?
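
One plausible reading of this, sketched as a toy model: if each argument is implicitly cast to the type the expression expects, an unparseable string quietly becomes NULL, whereas a rule that first looks for a type common to both operands finds none for a string and a date, so nothing is cast and the type check fails. This is an assumption about the mechanism, not a description of Spark's actual rules, and every name below (`Coercion`, `implicitCastToDate`, `commonType`, and so on) is hypothetical.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit
import scala.util.Try

// Toy model of two coercion strategies; names are hypothetical, not Spark's API.
sealed trait DataType
case object StringType extends DataType
case object DateType extends DataType

object Coercion {
  // Strategy A: cast each argument to the expected type.
  // A string that cannot be parsed as a date becomes NULL (None here).
  def implicitCastToDate(value: String): Option[LocalDate] =
    Try(LocalDate.parse(value)).toOption

  // Subtraction under strategy A: NULL in, NULL out.
  def subtractWithImplicitCast(left: String, right: LocalDate): Option[Long] =
    implicitCastToDate(left).map(l => ChronoUnit.DAYS.between(right, l))

  // Strategy B: find a type both operands share before casting anything.
  // In this toy model, only identical types have a common type.
  def commonType(left: DataType, right: DataType): Option[DataType] =
    if (left == right) Some(left) else None

  // Subtraction under strategy B: no common type means an analysis error.
  def subtractAsOperator(left: DataType, right: DataType): Either[String, DataType] =
    commonType(left, right).toRight(s"differing types ($left and $right)")
}
```

In this model, `Coercion.subtractWithImplicitCast("1", LocalDate.of(2017, 12, 11))` yields `None`, mirroring the old NULL result, while `Coercion.subtractAsOperator(StringType, DateType)` yields a `Left`, mirroring the new exception.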

@@ -1008,7 +1008,7 @@ SELECT cast('2017-12-11 09:30:00.0' as timestamp) - cast(1 as decimal(3, 0)) FRO
 struct<>
 -- !query output
 org.apache.spark.sql.AnalysisException
-cannot resolve '(CAST('2017-12-11 09:30:00.0' AS TIMESTAMP) - CAST(1 AS DECIMAL(3,0)))' due to data type mismatch: argument 2 requires timestamp type, however, 'CAST(1 AS DECIMAL(3,0))' is of decimal(3,0) type.; line 1 pos 7
+cannot resolve '(CAST('2017-12-11 09:30:00.0' AS TIMESTAMP) - CAST(1 AS DECIMAL(3,0)))' due to data type mismatch: differing types in '(CAST('2017-12-11 09:30:00.0' AS TIMESTAMP) - CAST(1 AS DECIMAL(3,0)))' (timestamp and decimal(3,0)).; line 1 pos 7
Member Author

From my point of view, the old error message looks better.

Member

+1

@SparkQA

SparkQA commented Apr 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42260/

@SparkQA

SparkQA commented Apr 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42260/

@SparkQA

SparkQA commented Apr 21, 2021

Test build #137724 has finished for PR 32267 at commit 02e2fbb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 21, 2021

Test build #137733 has finished for PR 32267 at commit f0abdbf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MaxGekk
Member Author

MaxGekk commented May 18, 2021

I am closing this because:

  1. It changes behavior.
  2. The new error message is worse than the old one.

@MaxGekk MaxGekk closed this May 18, 2021