[SPARK-36970][SQL] Manual disabled format B of date_format function to make Java 17 compatible with Java 8

### What changes were proposed in this pull request?
The `date_format` function with the `B` format behaves differently under Java 8 and Java 17. The query `select date_format('2018-11-17 13:33:33.333', 'B')` in `datetime-formatting-invalid.sql` demonstrates this.

With Java 8, the result of this case is:

```
-- !query
select date_format('2018-11-17 13:33:33.333', 'B')
-- !query schema
struct<>
-- !query output
java.lang.IllegalArgumentException
Unknown pattern letter: B
```

and with Java 17, the result is:

```
- datetime-formatting-invalid.sql *** FAILED ***
  datetime-formatting-invalid.sql
  Expected "struct<[]>", but got "struct<[date_format(2018-11-17 13:33:33.333, B):string]>" Schema did not match for query #34
  select date_format('2018-11-17 13:33:33.333', 'B'): -- !query
  select date_format('2018-11-17 13:33:33.333', 'B')
  -- !query schema
  struct<date_format(2018-11-17 13:33:33.333, B):string>
  -- !query output
  in the afternoon (SQLQueryTestSuite.scala:469)
```

We found that this is due to the new support for format `B` in Java 17:

```
'B' is used to represent Pattern letters to output a day period in Java 17

     *  Pattern  Count  Equivalent builder methods
     *  -------  -----  --------------------------
     *    B       1      appendDayPeriodText(TextStyle.SHORT)
     *    BBBB    4      appendDayPeriodText(TextStyle.FULL)
     *    BBBBB   5      appendDayPeriodText(TextStyle.NARROW)
```
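The difference can be observed outside Spark with a small probe against the JDK itself. The following is a minimal sketch, not part of this patch; the class name `DayPeriodProbe` and the method `probe` are hypothetical names chosen for illustration. It only uses `DateTimeFormatter.ofPattern` and reports whether the running JDK accepts the pattern letter `B`.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, for illustration only.
public class DayPeriodProbe {

    /** Formats the timestamp with pattern "B", or reports why the JDK rejected the pattern. */
    static String probe(LocalDateTime ts) {
        try {
            DateTimeFormatter formatter = DateTimeFormatter.ofPattern("B", Locale.US);
            return "formatted: " + formatter.format(ts);
        } catch (IllegalArgumentException e) {
            // Reached when the running JDK does not know the pattern letter.
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Same timestamp as the failing SQL test case.
        LocalDateTime ts = LocalDateTime.of(2018, 11, 17, 13, 33, 33, 333_000_000);
        System.out.println(probe(ts));
    }
}
```

Based on the test outputs quoted above, the Java 8 run should take the `rejected` branch and the Java 17 run should take the `formatted` branch.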

According to [http://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html](http://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html), format `B` is currently neither documented nor supported for the `date_format` function.

So the main change of this PR is to manually disable format `B` of the `date_format` function in `DateTimeFormatterHelper`, so that Java 17 behaves the same as Java 8.
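The idea of rejecting a pattern letter before the pattern ever reaches the JDK can be sketched as follows. This is a simplified illustration and not Spark's actual implementation: the class `PatternGuard`, the method `check`, and the quote-handling logic are assumptions. Only the letter set and the error message text are taken from this commit.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical guard, for illustration only; Spark's real check lives in DateTimeFormatterHelper.
public class PatternGuard {

    // Same letters as the updated `unsupportedLetters` set in this commit.
    static final Set<Character> UNSUPPORTED =
        new HashSet<>(Arrays.asList('A', 'B', 'n', 'N', 'p'));

    /** Throws if an unsupported letter appears outside single-quoted literal text. */
    static void check(String pattern) {
        boolean inLiteral = false;
        for (char c : pattern.toCharArray()) {
            if (c == '\'') {
                // Each quote toggles literal mode; an escaped quote ('') toggles twice.
                inLiteral = !inLiteral;
            } else if (!inLiteral && UNSUPPORTED.contains(c)) {
                throw new IllegalArgumentException("Illegal pattern character: " + c);
            }
        }
    }

    public static void main(String[] args) {
        check("yyyy-MM-dd 'B' HH:mm"); // accepted: here B is quoted literal text
        try {
            check("B");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "Illegal pattern character: B"
        }
    }
}
```

Because the check runs before any JDK formatter is built, the outcome no longer depends on which pattern letters the running JDK happens to support.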

### Why are the changes needed?
Ensure that Java 17 and Java 8 have the same behavior.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?

- Pass the Jenkins or GitHub Actions checks
- Manually run `SQLQueryTestSuite` with JDK 17

**Before**

```
- datetime-formatting-invalid.sql *** FAILED ***
  datetime-formatting-invalid.sql
  Expected "struct<[]>", but got "struct<[date_format(2018-11-17 13:33:33.333, B):string]>" Schema did not match for query #34
  select date_format('2018-11-17 13:33:33.333', 'B'): -- !query
  select date_format('2018-11-17 13:33:33.333', 'B')
  -- !query schema
  struct<date_format(2018-11-17 13:33:33.333, B):string>
  -- !query output
  in the afternoon (SQLQueryTestSuite.scala:469)
```

**After**
The test `select date_format('2018-11-17 13:33:33.333', 'B')` in `datetime-formatting-invalid.sql` passes.

Closes apache#34237 from LuciferYang/SPARK-36970.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
LuciferYang authored and MaxGekk committed Oct 12, 2021
1 parent dc1db95 commit 1af7072
Showing 3 changed files with 7 additions and 3 deletions.
@@ -279,7 +279,11 @@ private object DateTimeFormatterHelper {
   // localized, for the default Locale.US, it uses Sunday as the first day of week, while in Spark
   // 2.4, the SimpleDateFormat uses Monday as the first day of week.
   final val weekBasedLetters = Set('Y', 'W', 'w', 'u', 'e', 'c')
-  final val unsupportedLetters = Set('A', 'n', 'N', 'p')
+  // SPARK-36970: `select date_format('2018-11-17 13:33:33.333', 'B')` failed with Java 8,
+  // but use Java 17 will return `in the afternoon` because 'B' is used to represent
+  // `Pattern letters to output a day period` in Java 17. So there manual disabled `B` for
+  // compatibility with Java 8 behavior.
+  final val unsupportedLetters = Set('A', 'B', 'n', 'N', 'p')
   // The quarter fields will also be parsed strangely, e.g. when the pattern contains `yMd` and can
   // be directly resolved then the `q` do check for whether the month is valid, but if the date
   // fields is incomplete, e.g. `yM`, the checking will be bypassed.
@@ -76,7 +76,7 @@ trait DatetimeFormatterSuite extends SparkFunSuite with SQLHelper with Matchers

 Seq(true, false).foreach { isParsing =>
   // not support by the legacy one too
-  val unsupportedBoth = Seq("QQQQQ", "qqqqq", "eeeee", "A", "c", "n", "N", "p", "e")
+  val unsupportedBoth = Seq("QQQQQ", "qqqqq", "eeeee", "A", "B", "c", "n", "N", "p", "e")
   unsupportedBoth.foreach { pattern =>
     intercept[IllegalArgumentException](checkFormatterCreation(pattern, isParsing))
   }
@@ -314,7 +314,7 @@ select date_format('2018-11-17 13:33:33.333', 'B')
 struct<>
 -- !query output
 java.lang.IllegalArgumentException
-Unknown pattern letter: B
+Illegal pattern character: B


 -- !query
