[SPARK-35220][SQL] DayTimeIntervalType/YearMonthIntervalType show different between Hive SerDe and row format delimited #32335

AngersZhuuuu · 2021-04-25T10:33:03Z

What changes were proposed in this pull request?

DayTimeIntervalType/YearMonthIntervalType values are displayed differently between Hive SerDe and ROW FORMAT DELIMITED.
This PR adds a test and opens the question for discussion.

For this problem I think we have two options:

1. Leave the behavior as it is and add an item to the migration guide explaining it.
2. Since we should not change the Hive SerDe behavior, change Spark's ROW FORMAT DELIMITED path to cast DayTimeIntervalType/YearMonthIntervalType using HIVE_STYLE.

Why are the changes needed?

To add a unit test.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added a unit test.

AngersZhuuuu · 2021-04-25T10:33:29Z

Gentle ping @cloud-fan @MaxGekk @maropu

SparkQA · 2021-04-25T11:26:46Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42441/

SparkQA · 2021-04-25T11:26:47Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42441/

SparkQA · 2021-04-25T12:23:22Z

Test build #137920 has finished for PR 32335 at commit ab94c21.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2021-04-25T15:11:02Z

Test build #137921 has finished for PR 32335 at commit 9f1bd02.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2021-04-26T02:26:19Z

Merged to master.

AngersZhuuuu · 2021-04-26T02:33:23Z

@HyukjinKwon Since you just merged this, I think I should open a follow-up to document this behavior in the migration guide. Is that OK?

HyukjinKwon · 2021-04-26T03:05:32Z

Please go ahead.

cloud-fan · 2021-04-26T04:58:28Z

sql/core/src/test/scala/org/apache/spark/sql/execution/BaseScriptTransformationSuite.scala

+            |FROM v
+            |""".stripMargin),
+          identity,
+          Row("INTERVAL '1 00:00:00' DAY TO SECOND", "INTERVAL '0-10' YEAR TO MONTH") :: Nil)


so the spark-sql shell and df.show have different formats for intervals?

Yes, the same difference shows up there, since spark-sql follows the Hive format. What should I do next?

Is interval format the only difference between hive format and spark cast?

Yes, ANSI_STYLE and HIVE_STYLE.

Maybe we should have a new Expression ToHiveString and use it in df.show and TRANSFORM, so that they are consistent.

Yes, I created a ticket: https://issues.apache.org/jira/browse/SPARK-35228
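To make the two styles discussed in this thread concrete, here is a minimal, Spark-free sketch of how one interval value can render in either form. The ANSI-style strings match the expected row in the test above. The Hive-style strings (`1 00:00:00.000000000` and `0-10`) are an assumption about how Hive usually prints intervals and are not quoted from this PR. All names (`IntervalStyleSketch`, `AnsiStyle`, `HiveStyle`, `dayTimeToString`, `yearMonthToString`) are hypothetical and are not Spark's API.

```scala
// Hypothetical sketch: illustrative names only, not Spark's implementation.
object IntervalStyleSketch {
  sealed trait Style
  case object AnsiStyle extends Style // e.g. INTERVAL '0-10' YEAR TO MONTH
  case object HiveStyle extends Style // assumed Hive rendering, e.g. 0-10

  private val MicrosPerSecond = 1000000L
  private val SecondsPerDay = 86400L
  private val MonthsPerYear = 12L

  // Render a year-month interval stored as a number of months.
  def yearMonthToString(months: Int, style: Style): String = {
    val sign = if (months < 0) "-" else ""
    val abs = math.abs(months.toLong)
    val body = s"$sign${abs / MonthsPerYear}-${abs % MonthsPerYear}"
    style match {
      case AnsiStyle => s"INTERVAL '$body' YEAR TO MONTH"
      case HiveStyle => body
    }
  }

  // Render a day-time interval stored as a number of microseconds.
  // Long.MinValue is not handled in this sketch.
  def dayTimeToString(micros: Long, style: Style): String = {
    val sign = if (micros < 0) "-" else ""
    val abs = math.abs(micros)
    val totalSeconds = abs / MicrosPerSecond
    val fracMicros = abs % MicrosPerSecond
    val days = totalSeconds / SecondsPerDay
    val hms = "%02d:%02d:%02d".format(
      (totalSeconds % SecondsPerDay) / 3600,
      (totalSeconds % 3600) / 60,
      totalSeconds % 60)
    style match {
      case AnsiStyle =>
        // Drop the fraction when it is zero; otherwise trim trailing zeros.
        val frac =
          if (fracMicros == 0) ""
          else "." + "%06d".format(fracMicros).replaceAll("0+$", "")
        s"INTERVAL '$sign$days $hms$frac' DAY TO SECOND"
      case HiveStyle =>
        // Assumed Hive form: always nine fractional digits (nanoseconds).
        s"$sign$days $hms." + "%09d".format(fracMicros * 1000)
    }
  }

  def main(args: Array[String]): Unit = {
    val oneDay = SecondsPerDay * MicrosPerSecond
    println(dayTimeToString(oneDay, AnsiStyle))
    println(dayTimeToString(oneDay, HiveStyle))
    println(yearMonthToString(10, AnsiStyle))
    println(yearMonthToString(10, HiveStyle))
  }
}
```

A shared formatter along these lines is one way to read the `ToHiveString` suggestion: if `df.show` and `TRANSFORM` both called the same function with the same style, their output could not diverge.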

[SPARK-35220][SQL] DayTimeIntervalType/YearMonthIntervalString show d…

ab94c21

…ifferent between hive SerDe and row format delimited

github-actions bot added the SQL label Apr 25, 2021

AngersZhuuuu changed the title ~~[SPARK-35220][SQL] DayTimeIntervalType/YearMonthIntervalString show different between Hive SerDe and row format delimited~~ [SPARK-35220][SQL] DayTimeIntervalType/YearMonthIntervalType show different between Hive SerDe and row format delimited Apr 25, 2021

Update BaseScriptTransformationSuite.scala

9f1bd02

HyukjinKwon approved these changes Apr 26, 2021

HyukjinKwon closed this in 6f782ef Apr 26, 2021

cloud-fan reviewed Apr 26, 2021
