Support user-defined and incomplete date formats #273

Merged

@GumpacG GumpacG commented Jun 1, 2023

Description

  • Added support for user-defined (custom) date formats (backed by Java syntax)
  • Updated support for predefined formats (doc, list, another doc)
  • Combinations of custom and named formats are also supported
  • Values produced by incomplete date formatters (e.g. year, week_year) are returned as TIMESTAMP
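
As a rough illustration of what "backed by Java syntax" can mean for a custom date format, here is a minimal sketch in plain java.time. It is not the plugin's code: the class and method names are hypothetical, and defaulting the missing month and day to 1 is an assumption chosen to match the 1999-01-01 value in the result set further down.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

/** Hypothetical helper (not part of the plugin): parses date-only values with java.time patterns. */
final class CustomDateFormatSketch {

  /** Parses a value with a complete custom date pattern such as "yyyy-MM-dd". */
  static LocalDate parseCompleteDate(String pattern, String value) {
    return LocalDate.parse(value, DateTimeFormatter.ofPattern(pattern));
  }

  /** Parses a value with a year-only pattern such as "uuuu"; month and day default to 1 (assumption). */
  static LocalDate parseYearOnly(String pattern, String value) {
    DateTimeFormatter formatter = new DateTimeFormatterBuilder()
        .appendPattern(pattern)
        .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
        .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
        .toFormatter();
    return LocalDate.parse(value, formatter);
  }

  public static void main(String[] args) {
    System.out.println(parseCompleteDate("yyyy-MM-dd", "1984-04-12")); // 1984-04-12
    System.out.println(parseYearOnly("uuuu", "1999"));                 // 1999-01-01
  }
}
```

Without the two parseDefaulting calls, a year-only pattern cannot be resolved to a LocalDate, which is why incomplete formats need special handling at all.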

TODOs

Sample for test

Mapping:

{
  "mappings" : {
    "properties" : {
      "custom_date" : {
        "type" : "date",
        "format" : "yyyy-MM-dd"
      }
    }
  }
}

Data:

{"index": {}}
{"custom_date": "1984-04-12"}

Sample 2

Mapping

{
    "mappings":
    {
        "properties":
        {
            "custom_time" :
            {
                "type" : "date",
                "format" : "::: k-A || A    "
            },
            "incomplete_1" :
            {
                "type" : "date",
                "format" : "year"
            },
            "incomplete_2" :
            {
                "type" : "date",
                "format" : "E-w"
            },
            "incomplete_custom_date" :
            {
                "type" : "date",
                "format" : "uuuu"
            },
            "incomplete_custom_time" :
            {
                "type" : "date",
                "format" : "HH"
            },
            "incorrect" :
            {
                "type" : "date",
                "format" : "'___'"
            },
            "epoch_sec" :
            {
                "type" : "date",
                "format" : "epoch_second"
            },
            "epoch_milli" :
            {
                "type" : "date",
                "format" : "epoch_millis"
            },
            "custom_no_delimiter_date" :
            {
                "type" : "date",
                "format" : "uuuuMMdd"
            },
            "custom_no_delimiter_time" :
            {
                "type" : "date",
                "format" : "HHmmss"
            },
            "custom_no_delimiter_ts" :
            {
                "type" : "date",
                "format" : "uuuuMMddHHmmss"
            }
        }
    }
}

Data

{"index": {}}
{ "custom_time":  "85476321", "incomplete_1" : 1984, "incomplete_2": null, "incomplete_custom_date": 1999, "incomplete_custom_time" : 10, "incorrect" : null, "epoch_sec" : 42, "epoch_milli" : 42, "custom_no_delimiter_date" : "19841020", "custom_no_delimiter_time" : "102030", "custom_no_delimiter_ts" : "19841020153548" }
{"index": {}}
{ "custom_time":  "::: 9-32476542", "incomplete_1" : 2022, "incomplete_2": null, "incomplete_custom_date": 3021, "incomplete_custom_time" : 20, "incorrect" : null, "epoch_sec" : 100500, "epoch_milli" : 100500, "custom_no_delimiter_date" : "19610412", "custom_no_delimiter_time" : "090700", "custom_no_delimiter_ts" : "19610412090700" }

Result set

SELECT * FROM test
+--------------------------+-----------+-------------------------+---------------------+------------------------+--------------+------------------------+--------------------------+--------------+------------------------+---------------------+ 
| custom_no_delimiter_date | incorrect | epoch_milli             | epoch_sec           | incomplete_custom_time | custom_time  | custom_no_delimiter_ts | custom_no_delimiter_time | incomplete_2 | incomplete_custom_date | incomplete_1        |
+--------------------------+-----------+-------------------------+---------------------+------------------------+--------------+------------------------+--------------------------+--------------+------------------------+---------------------+
| date                     | timestamp | timestamp               | timestamp           | time                   | time         | timestamp              | time                     | timestamp    | date                   | timestamp           |
+--------------------------+-----------+-------------------------+---------------------+------------------------+--------------+------------------------+--------------------------+--------------+------------------------+---------------------+
| 1984-10-20               | null      | 1970-01-01 00:00:00.042 | 1970-01-01 00:00:42 | 10:00:00               | 23:44:36.321 | 1984-10-20 15:35:48    | 10:20:30                 | null         | 1999-01-01             | 1984-01-01 00:00:00 |
| 1961-04-12               | null      | 1970-01-01 00:01:40.5   | 1970-01-02 03:55:00 | 20:00:00               | 09:01:16.542 | 1961-04-12 09:07:00    | 09:07:00                 | null         | 3021-01-01             | 2022-01-01 00:00:00 |
+--------------------------+-----------+-------------------------+---------------------+------------------------+--------------+------------------------+--------------------------+--------------+------------------------+---------------------+
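
The epoch and custom_time values in the result set can be checked by hand. The sketch below redoes that arithmetic with java.time; the class and method names are hypothetical, UTC is assumed for the epoch conversions, and the milli-of-day reading of custom_time relies on the Java pattern letter A meaning milli-of-day.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

/** Hypothetical helper (not part of the plugin): reproduces the arithmetic behind the sample rows. */
final class ResultSetArithmeticSketch {

  /** Interprets a number as seconds since 1970-01-01T00:00:00Z (epoch_second). */
  static LocalDateTime fromEpochSeconds(long seconds) {
    return LocalDateTime.ofEpochSecond(seconds, 0, ZoneOffset.UTC);
  }

  /** Interprets a number as milliseconds since 1970-01-01T00:00:00Z (epoch_millis). */
  static LocalDateTime fromEpochMillis(long millis) {
    return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
  }

  /** Interprets a number as milliseconds since midnight (pattern letter A). */
  static LocalTime fromMilliOfDay(long milliOfDay) {
    return LocalTime.ofNanoOfDay(milliOfDay * 1_000_000L);
  }

  public static void main(String[] args) {
    System.out.println(fromEpochSeconds(100500));  // 03:55:00 on 1970-01-02
    System.out.println(fromEpochMillis(100500));   // 00:01:40.5 on 1970-01-01
    System.out.println(fromMilliOfDay(85476321));  // 23:44:36.321
    System.out.println(fromMilliOfDay(32476542));  // 09:01:16.542
  }
}
```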

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@GumpacG GumpacG marked this pull request as draft June 2, 2023 15:44
@codecov

This comment was marked as spam.

{"index": {}}
{"epoch_millis": "450608862000.123456", "epoch_second": "450608862.000123456", "date_optional_time": "1984-04-12T09:07:42.000Z", "strict_date_optional_time": "1984-04-12T09:07:42.000Z", "strict_date_optional_time_nanos": "1984-04-12T09:07:42.000123456Z", "basic_date": "19840412", "basic_date_time": "19840412T090742.000Z", "basic_date_time_no_millis": "19840412T090742Z", "basic_ordinal_date": "1984103", "basic_ordinal_date_time": "1984103T090742.000Z", "basic_ordinal_date_time_no_millis": "1984103T090742Z", "basic_time": "090742.000Z", "basic_time_no_millis": "090742Z", "basic_t_time": "T090742.000Z", "basic_t_time_no_millis": "T090742Z", "basic_week_date": "1984W154", "strict_basic_week_date": "1984W154", "basic_week_date_time": "1984W154T090742.000Z", "strict_basic_week_date_time": "1984W154T090742.000Z", "basic_week_date_time_no_millis": "1984W154T090742Z", "strict_basic_week_date_time_no_millis": "1984W154T090742Z", "date": "1984-04-12", "strict_date": "1984-04-12", "date_hour": "1984-04-12T09", "strict_date_hour": "1984-04-12T09", "date_hour_minute": "1984-04-12T09:07", "strict_date_hour_minute": "1984-04-12T09:07", "date_hour_minute_second": "1984-04-12T09:07:42", "strict_date_hour_minute_second": "1984-04-12T09:07:42", "date_hour_minute_second_fraction": "1984-04-12T09:07:42.000", "strict_date_hour_minute_second_fraction": "1984-04-12T09:07:42.000", "date_hour_minute_second_millis": "1984-04-12T09:07:42.000", "strict_date_hour_minute_second_millis": "1984-04-12T09:07:42.000", "date_time": "1984-04-12T09:07:42.000Z", "strict_date_time": "1984-04-12T09:07:42.000123456Z", "date_time_no_millis": "1984-04-12T09:07:42Z", "strict_date_time_no_millis": "1984-04-12T09:07:42Z", "hour": "09", "strict_hour": "09", "hour_minute": "09:07", "strict_hour_minute": "09:07", "hour_minute_second": "09:07:42", "strict_hour_minute_second": "09:07:42", "hour_minute_second_fraction": "09:07:42.000", "strict_hour_minute_second_fraction": "09:07:42.000", "hour_minute_second_millis": 
"09:07:42.000", "strict_hour_minute_second_millis": "09:07:42.000", "ordinal_date": "1984-103", "strict_ordinal_date": "1984-103", "ordinal_date_time": "1984-103T09:07:42.000123456Z", "strict_ordinal_date_time": "1984-103T09:07:42.000123456Z", "ordinal_date_time_no_millis": "1984-103T09:07:42Z", "strict_ordinal_date_time_no_millis": "1984-103T09:07:42Z", "time": "09:07:42.000Z", "strict_time": "09:07:42.000Z", "time_no_millis": "09:07:42Z", "strict_time_no_millis": "09:07:42Z", "t_time": "T09:07:42.000Z", "strict_t_time": "T09:07:42.000Z", "t_time_no_millis": "T09:07:42Z", "strict_t_time_no_millis": "T09:07:42Z", "week_date": "1984-W15-4", "strict_week_date": "1984-W15-4", "week_date_time": "1984-W15-4T09:07:42.000Z", "strict_week_date_time": "1984-W15-4T09:07:42.000Z", "week_date_time_no_millis": "1984-W15-4T09:07:42Z", "strict_week_date_time_no_millis": "1984-W15-4T09:07:42Z", "weekyear_week_day": "1984-W15-4", "strict_weekyear_week_day": "1984-W15-4", "year_month_day": "1984-04-12", "strict_year_month_day": "1984-04-12", "yyyy-MM-dd": "1984-04-12", "HH:mm:ss": "09:07:42", "yyyy-MM-dd_OR_epoch_millis": "450608862000.123456", "hour_minute_second_OR_t_time": "T09:07:42.000Z"}
{"epoch_millis": "450608862000.123456", "epoch_second": "450608862.000123456", "date_optional_time": "1984-04-12T09:07:42.000Z", "strict_date_optional_time": "1984-04-12T09:07:42.000Z", "strict_date_optional_time_nanos": "1984-04-12T09:07:42.000123456Z", "basic_date": "19840412", "basic_date_time": "19840412T090742.000Z", "basic_date_time_no_millis": "19840412T090742Z", "basic_ordinal_date": "1984103", "basic_ordinal_date_time": "1984103T090742.000Z", "basic_ordinal_date_time_no_millis": "1984103T090742Z", "basic_time": "090742.000Z", "basic_time_no_millis": "090742Z", "basic_t_time": "T090742.000Z", "basic_t_time_no_millis": "T090742Z", "basic_week_date": "1984W154", "strict_basic_week_date": "1984W154", "basic_week_date_time": "1984W154T090742.000Z", "strict_basic_week_date_time": "1984W154T090742.000Z", "basic_week_date_time_no_millis": "1984W154T090742Z", "strict_basic_week_date_time_no_millis": "1984W154T090742Z", "date": "1984-04-12", "strict_date": "1984-04-12", "date_hour": "1984-04-12T09", "strict_date_hour": "1984-04-12T09", "date_hour_minute": "1984-04-12T09:07", "strict_date_hour_minute": "1984-04-12T09:07", "date_hour_minute_second": "1984-04-12T09:07:42", "strict_date_hour_minute_second": "1984-04-12T09:07:42", "date_hour_minute_second_fraction": "1984-04-12T09:07:42.000", "strict_date_hour_minute_second_fraction": "1984-04-12T09:07:42.000", "date_hour_minute_second_millis": "1984-04-12T09:07:42.000", "strict_date_hour_minute_second_millis": "1984-04-12T09:07:42.000", "date_time": "1984-04-12T09:07:42.000Z", "strict_date_time": "1984-04-12T09:07:42.000123456Z", "date_time_no_millis": "1984-04-12T09:07:42Z", "strict_date_time_no_millis": "1984-04-12T09:07:42Z", "hour": "09", "strict_hour": "09", "hour_minute": "09:07", "strict_hour_minute": "09:07", "hour_minute_second": "09:07:42", "strict_hour_minute_second": "09:07:42", "hour_minute_second_fraction": "09:07:42.000", "strict_hour_minute_second_fraction": "09:07:42.000", "hour_minute_second_millis": 
"09:07:42.000", "strict_hour_minute_second_millis": "09:07:42.000", "ordinal_date": "1984-103", "strict_ordinal_date": "1984-103", "ordinal_date_time": "1984-103T09:07:42.000123456Z", "strict_ordinal_date_time": "1984-103T09:07:42.000123456Z", "ordinal_date_time_no_millis": "1984-103T09:07:42Z", "strict_ordinal_date_time_no_millis": "1984-103T09:07:42Z", "time": "09:07:42.000Z", "strict_time": "09:07:42.000Z", "time_no_millis": "09:07:42Z", "strict_time_no_millis": "09:07:42Z", "t_time": "T09:07:42.000Z", "strict_t_time": "T09:07:42.000Z", "t_time_no_millis": "T09:07:42Z", "strict_t_time_no_millis": "T09:07:42Z", "week_date": "1984-W15-4", "strict_week_date": "1984-W15-4", "week_date_time": "1984-W15-4T09:07:42.000Z", "strict_week_date_time": "1984-W15-4T09:07:42.000Z", "week_date_time_no_millis": "1984-W15-4T09:07:42Z", "strict_week_date_time_no_millis": "1984-W15-4T09:07:42Z", "weekyear_week_day": "1984-W15-4", "strict_weekyear_week_day": "1984-W15-4", "year_month_day": "1984-04-12", "strict_year_month_day": "1984-04-12", "yyyy-MM-dd": "1984-04-12", "custom_time": "09:07:42 PM", "yyyy-MM-dd_OR_epoch_millis": "450608862000.123456", "hour_minute_second_OR_t_time": "T09:07:42.000Z", "custom_timestamp": "1984-04-12 10:07:42 ---- PM", "custom_date_or_date": "1984-04-12"}

This is getting really big.
We should split this out into a separate file when we do the real fix.

@Yury-Fridlyand Yury-Fridlyand changed the title [POC] Support user-defined date formats Support user-defined and incomplete date formats Jun 29, 2023
@Yury-Fridlyand Yury-Fridlyand marked this pull request as ready for review June 29, 2023 04:30
acarbonetto commented Jul 6, 2023

Found one issue: the year 1999 is missing from the response.

Mapping:

{
  "mappings" : {
    "properties" : {
      "custom_time" : {
        "type" : "date",
        "format" : "yyyy HH:mm"
      }
    }
  }
}

data:

{"index": {}}
{"custom_time": "1999 01:01"}

returns:

{
    "schema": [
        {
            "name": "custom_time",
            "type": "timestamp"
        }
    ],
    "datarows": [
        [
            "1970-01-01 01:01:00"
        ]
    ],
    "total": 2,
    "size": 2,
    "status": 200
}

GumpacG commented Jul 6, 2023

When a mapping with an incomplete time/date format is loaded and queried, the return type should remain the same.
For example:

"incomplete_custom_time" :
            {
                "type" : "date",
                "format" : "HH"
            },

This should remain a time, but it is currently returned as a timestamp. The incomplete custom date has the same behaviour.
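
For context, plain java.time already treats an hour-only value as a time rather than a timestamp. The sketch below shows that behaviour; it is an illustration with hypothetical names, not the plugin's type resolution.

```java
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper (not part of the plugin): what an "HH" value can and cannot resolve to. */
final class IncompleteTimeSketch {

  private static final DateTimeFormatter HOUR_ONLY = DateTimeFormatter.ofPattern("HH");

  /** An hour alone resolves to a LocalTime; minutes and seconds default to zero. */
  static LocalTime parseHourOnly(String value) {
    return LocalTime.parse(value, HOUR_ONLY);
  }

  /** An hour alone carries no date, so it cannot be resolved to a LocalDateTime. */
  static boolean canResolveTimestamp(String value) {
    try {
      LocalDateTime.parse(value, HOUR_ONLY);
      return true;
    } catch (DateTimeParseException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseHourOnly("10"));       // 10:00
    System.out.println(canResolveTimestamp("10")); // false
  }
}
```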

GumpacG commented Jul 6, 2023

When a mapping with an incomplete time/date format is loaded and queried, the return type should remain the same. For example:

"incomplete_custom_time" :
            {
                "type" : "date",
                "format" : "HH"
            },

This should remain a time, but it is currently returned as a timestamp. The incomplete custom date has the same behaviour.

Also test that CAST to timestamp from incomplete_custom_time returns today's date with the custom time.

GumpacG commented Jul 7, 2023

Checkstyle is failing and a few ITs need to be updated.

Yury-Fridlyand commented Jul 7, 2023

> Found one issue. Note that the year of 1999 isn't being received in the response.

This comes from org.opensearch.common.time.DateFormatters. It may not be a bug, though. To avoid it, we would probably need to split the custom format into date and time parts and then split the values accordingly.
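
The following is only an analogy in plain java.time, not the OpenSearch DateFormatters code: with a pattern like "yyyy HH:mm", the time part resolves on its own, but a year without month and day does not resolve to a date, so a consumer that reads only the resolved date and time would lose the year. The class and method names are hypothetical, and the defaulting variant is one possible alternative to the splitting approach mentioned above.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.time.temporal.TemporalAccessor;
import java.time.temporal.TemporalQueries;

/** Hypothetical helper (not part of the plugin): why a year next to a time can get lost. */
final class YearWithTimeSketch {

  private static final String PATTERN = "yyyy HH:mm";

  /** Returns the resolved date, or null when the parsed fields do not form a full date. */
  static LocalDate resolvedDate(String value) {
    TemporalAccessor parsed = DateTimeFormatter.ofPattern(PATTERN).parse(value);
    return parsed.query(TemporalQueries.localDate());
  }

  /** Returns the resolved time, or null when the parsed fields do not form a time. */
  static LocalTime resolvedTime(String value) {
    TemporalAccessor parsed = DateTimeFormatter.ofPattern(PATTERN).parse(value);
    return parsed.query(TemporalQueries.localTime());
  }

  /** Keeps the year by defaulting the missing month and day to 1 (assumption). */
  static LocalDateTime parseKeepingYear(String value) {
    DateTimeFormatter formatter = new DateTimeFormatterBuilder()
        .appendPattern(PATTERN)
        .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
        .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
        .toFormatter();
    return LocalDateTime.parse(value, formatter);
  }

  public static void main(String[] args) {
    System.out.println(resolvedDate("1999 01:01"));     // null: year alone is not a date
    System.out.println(resolvedTime("1999 01:01"));     // 01:01
    System.out.println(parseKeepingYear("1999 01:01")); // 1999-01-01T01:01
  }
}
```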


> Should remain a time. It currently is returned as a timestamp. Incomplete custom date also has the same behaviour

Fixed in 6d214a5


> Also test that CAST to timestamp from incomplete_custom_time returns todays date with the custom time.

Works as expected: converting time to date/dt/ts adds today's date (not epoch).

opensearchsql> select CAST(incomplete_custom_time as TIMESTAMP) from dt_formats;
fetched rows / total rows = 2/2
+---------------------------------------------+
| CAST(incomplete_custom_time as TIMESTAMP)   |
|---------------------------------------------|
| 2023-07-07 10:00:00                         |
| 2023-07-07 20:00:00                         |
+---------------------------------------------+
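
The "today's date, not epoch" rule can be sketched in java.time as follows. This is an illustration with hypothetical names rather than the plugin's CAST implementation; the Clock parameter is an assumption added so that the result is reproducible.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

/** Hypothetical helper (not part of the plugin): converts a time to a timestamp on the current date. */
final class TimeToTimestampSketch {

  /** Combines the given time with the current date taken from the clock. */
  static LocalDateTime castTimeToTimestamp(LocalTime time, Clock clock) {
    return time.atDate(LocalDate.now(clock));
  }

  public static void main(String[] args) {
    // A fixed clock stands in for "now" so the output matches the session above.
    Clock fixed = Clock.fixed(Instant.parse("2023-07-07T12:00:00Z"), ZoneOffset.UTC);
    System.out.println(castTimeToTimestamp(LocalTime.of(10, 0), fixed)); // 2023-07-07T10:00
    System.out.println(castTimeToTimestamp(LocalTime.of(20, 0), fixed)); // 2023-07-07T20:00
  }
}
```

Passing Clock.systemDefaultZone() instead of the fixed clock gives the behaviour of a live query.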

> Checkstyle is failing and a few IT needs to get updated

Fixed in c086dad, thanks for noticing.

@Yury-Fridlyand Yury-Fridlyand merged commit 56e5621 into integ-custom-datetime-formats Jul 8, 2023
@Yury-Fridlyand Yury-Fridlyand deleted the dev-custom-datetime-formats branch July 8, 2023 01:51
matthewryanwells pushed a commit that referenced this pull request Jul 11, 2023
…roject#1821)

* Support user-defined and incomplete date formats (#273)

* Check custom formats for characters

Signed-off-by: Guian Gumpac <[email protected]>

* Removed duplicated code

Signed-off-by: Guian Gumpac <[email protected]>

* Reworked checking for exprcoretype

Signed-off-by: Guian Gumpac <[email protected]>

* Changed check for time

Signed-off-by: Guian Gumpac <[email protected]>

* Rework processing custom and incomplete formats and add tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Values of incomplete and incorrect formats to be returned as `TIMESTAMP` instead of `STRING`.

Signed-off-by: Yury-Fridlyand <[email protected]>

* Complete fix and update tests.

Signed-off-by: Yury-Fridlyand <[email protected]>

* More fixes for god of fixes.

Signed-off-by: Yury-Fridlyand <[email protected]>

---------

Signed-off-by: Guian Gumpac <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Co-authored-by: Yury-Fridlyand <[email protected]>

* Refactoring.

Signed-off-by: Yury-Fridlyand <[email protected]>

---------

Signed-off-by: Guian Gumpac <[email protected]>
Signed-off-by: Yury-Fridlyand <[email protected]>
Co-authored-by: Guian Gumpac <[email protected]>
MitchellGale pushed a commit that referenced this pull request Jul 11, 2023
…roject#1821)

Signed-off-by: Mitchell Gale <[email protected]>
MitchellGale pushed a commit that referenced this pull request Jul 12, 2023
…roject#1821) (opensearch-project#1830)

(cherry picked from commit a60b222)

Co-authored-by: Yury-Fridlyand <[email protected]>
Yury-Fridlyand added a commit that referenced this pull request Aug 22, 2023
…roject#1821) (opensearch-project#1840)

(cherry picked from commit a60b222)

Co-authored-by: Yury-Fridlyand <[email protected]>