[SPARK-34677][SQL] Support the +/- operators over ANSI SQL intervals #31789

Closed
wants to merge 7 commits into from

MaxGekk (Member) commented Mar 9, 2021

What changes were proposed in this pull request?

Extend the Add, Subtract and UnaryMinus expressions to support DayTimeIntervalType and YearMonthIntervalType added by #31614.

Note: for these interval types, the expressions throw an overflow exception regardless of the SQL config spark.sql.ansi.enabled. In this way, the modified expressions always behave as in ANSI mode for intervals.
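
To illustrate the note above, here is a minimal plain-Scala sketch; it is not the code from this PR. It assumes a year-month interval is held as an Int number of months and a day-time interval as a Long number of microseconds, and it relies on the JDK methods Math.addExact, Math.subtractExact and Math.negateExact, which throw ArithmeticException on overflow. The object and method names are hypothetical.

```scala
import scala.util.Try

// Hypothetical model, not Spark code: a year-month interval as an Int number
// of months, a day-time interval as a Long number of microseconds (assumption).
object IntervalOverflowSketch {
  // Each helper delegates to an exact JDK method, so overflow raises
  // ArithmeticException instead of silently wrapping around.
  def addMonths(a: Int, b: Int): Int = Math.addExact(a, b)
  def subtractMicros(a: Long, b: Long): Long = Math.subtractExact(a, b)
  def negateMonths(a: Int): Int = Math.negateExact(a)

  def main(args: Array[String]): Unit = {
    println(addMonths(10, 2))                          // in range
    println(Try(addMonths(Int.MaxValue, 1)).isFailure) // overflow is reported
    println(Try(subtractMicros(Long.MinValue, 1L)).isFailure)
    println(Try(negateMonths(Int.MinValue)).isFailure)
  }
}
```

Nothing in the sketch consults a configuration flag, which is the point of the note: the check happens unconditionally for interval operands.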

Why are the changes needed?

To conform to the ANSI SQL standard, which defines the - and + operators over intervals.
[Screenshot taken 2021-03-09; image not reproduced]

Does this PR introduce any user-facing change?

No, it should not, since the new types have not been released yet.

How was this patch tested?

By running new tests in the test suites:

$ build/sbt "test:testOnly *ArithmeticExpressionSuite"
$ build/sbt "test:testOnly *ColumnExpressionSuite"

github-actions bot added the SQL label Mar 9, 2021

SparkQA commented Mar 9, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40490/


SparkQA commented Mar 9, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40490/


SparkQA commented Mar 10, 2021

Test build #135908 has finished for PR 31789 at commit 18b1db3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

MaxGekk (Member Author) commented Mar 10, 2021

@HyukjinKwon @cloud-fan Could you take a look at this PR, please?

@@ -173,6 +180,11 @@ abstract class BinaryArithmetic extends BinaryOperator with NullIntolerant {
  def calendarIntervalMethod: String =
    sys.error("BinaryArithmetics must override either calendarIntervalMethod or genCode")

  /** Name of the function for this expression on [[DayTimeIntervalType]] and
   * [[YearMonthIntervalType]] types. */
  def intervalMethod: String =
Contributor

Shall we reuse exactMathMethod?

MaxGekk (Member Author) commented Mar 10, 2021

Sure. Since exactMathMethod is always defined in these expressions, we can invoke exactMathMethod.get directly, though I can add a guard that checks exactMathMethod.isDefined.

Contributor

We can use an assert to make sure it's defined for interval types.
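
The suggestion in this thread can be sketched roughly as follows. This is a simplified, self-contained model, not the merged change: Op, AddOp, NoExactOp and intervalCall are hypothetical names, and the "addExact" value is an assumption about what an Add-like expression would report. Only the Option-typed exactMathMethod and the idea of asserting before calling .get come from the discussion.

```scala
// Hypothetical stand-ins for the expression classes discussed in this thread.
object ExactMethodSketch {
  sealed trait Op { def exactMathMethod: Option[String] }

  // An operator that names an exact java.lang.Math method (assumed value).
  case object AddOp extends Op {
    val exactMathMethod: Option[String] = Some("addExact")
  }

  // An operator without an exact method, used to show the assertion firing.
  case object NoExactOp extends Op {
    val exactMathMethod: Option[String] = None
  }

  // Builds the text of the call that generated code could use for interval
  // operands; the assert guards the .get, as proposed in the review.
  def intervalCall(op: Op, left: String, right: String): String = {
    assert(op.exactMathMethod.isDefined,
      s"$op must define exactMathMethod to support interval types")
    s"java.lang.Math.${op.exactMathMethod.get}($left, $right)"
  }
}
```

With this shape there is no separate interval-specific method name to keep in sync; a missing exactMathMethod surfaces as an AssertionError at the point of use.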


SparkQA commented Mar 10, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40510/


SparkQA commented Mar 10, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40510/

MaxGekk (Member Author) commented Mar 11, 2021

@cloud-fan @HyukjinKwon @yaooqinn Any more feedback on the changes?

.toDF("year-month-A", "day-time-A", "year-month-B", "day-time-B")
val negatedDF = df.select(-$"year-month-A", -$"day-time-A")
checkAnswer(negatedDF, Row(Period.ofMonths(-10), Duration.ofDays(-10)))
val sumDF = df.select($"year-month-A" + $"year-month-B", $"day-time-A" + $"day-time-B")
Contributor

sumDF -> addDF

Contributor

Does the sum function accept the new interval types?

Member Author

> does sum function accept the new interval types?

sum operates over NumericType only; see its input type declaration:

override def inputTypes: Seq[AbstractDataType] = Seq(NumericType)

If you think we should support the new intervals in it, let's do that separately, since this PR is about the +/- operators.

Member Author

I opened the JIRA ticket SPARK-34716 so that we don't forget about it.
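
For readers following the test snippet in this thread, the expected values can be reproduced with plain java.time arithmetic. This sketch does not use Spark. The "A" values are inferred from the expected Row in the test (10 months and 10 days); the "B" values are not visible in the excerpt, so the ones below are made up for illustration, and the object name is hypothetical.

```scala
import java.time.{Duration, Period}

// Plain java.time arithmetic mirroring the negation and addition in the test.
object ExternalIntervalSketch {
  val yearMonthA: Period = Period.ofMonths(10)  // inferred from the expected Row
  val dayTimeA: Duration = Duration.ofDays(10)  // inferred from the expected Row
  val yearMonthB: Period = Period.ofMonths(1)   // hypothetical value
  val dayTimeB: Duration = Duration.ofDays(1)   // hypothetical value

  // Counterpart of df.select(-$"year-month-A", -$"day-time-A")
  val negated: (Period, Duration) =
    (yearMonthA.negated(), dayTimeA.negated())

  // Counterpart of the + over the A and B columns
  val added: (Period, Duration) =
    (yearMonthA.plus(yearMonthB), dayTimeA.plus(dayTimeB))
}
```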


SparkQA commented Mar 11, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40550/


SparkQA commented Mar 11, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40550/

cloud-fan (Contributor)

Thanks, merging to master!

cloud-fan closed this in 9d3d25b Mar 11, 2021