Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Interval Parsing More Permissive #6390

Open
samuelcolvin opened this issue Sep 12, 2024 · 8 comments
Open

Make Interval Parsing More Permissive #6390

samuelcolvin opened this issue Sep 12, 2024 · 8 comments
Labels
enhancement Any new improvement worthy of a entry in the changelog

Comments

@samuelcolvin
Copy link
Contributor

Describe the bug

While working on apache/datafusion#12448 I noticed that the string 5 day hour is not interpreted correctly by parse_interval_month_day_nano_config, it should be considered the same as 5 day 0 hour, instead it return an error.

To Reproduce

run that function, or look at apache/datafusion#12448

Expected behavior

parse_interval_month_day_nano_config should match postgres behaviour in general

Additional context

Related to #6211 where I improved interval parsing a lot, although apparently not enough.

@alamb
Copy link
Contributor

alamb commented Sep 13, 2024

I think 5 day hour is a pretty uncommon usecase . Thank you for filing this

@samuelcolvin
Copy link
Contributor Author

samuelcolvin commented Sep 14, 2024

The most likely scenario where it could come up is select interval '5 day' hour, but I agree even that seems like either allowing a common mistake, or a side effect of their very permissive parser.


While we're talking about weird things postgres supports, as per apache/datafusion-sqlparser-rs#1345 (comment), the following are also supported:

select interval '1 month + 2 day'
select interval '1 minute + 2 s'  -- makes sense, all units are supported in arithmetic expressions
select interval '1 minute - 2 s'
select interval '1 minute + 2'  -- makes sense since raw numbers are interpreted as seconds
select interval '1 minute * 2s'  -- this is crazy! the "*" operator is interpretted as "+" so this evaluates as 0:01:02
select interval '1 minute * 2' -- same! 
select interval '1 minute minute' -- repeated units are ignored
select interval '1 month + 2 day' day -- same thing with the units just concatenated

@ByteBaker
Copy link
Contributor

@alamb do we have a plan on integrating this inside Arrow?

@alamb
Copy link
Contributor

alamb commented Sep 21, 2024

@alamb do we have a plan on integrating this inside Arrow?

I am not sure what you mean by this question @ByteBaker

@ByteBaker
Copy link
Contributor

I meant to ask if we plan to handle more cases in our parsing logic? To be closer to parsing abilities of Postgres? As @samuelcolvin pointed out above.

@alamb
Copy link
Contributor

alamb commented Sep 24, 2024

I meant to ask if we plan to handle more cases in our parsing logic? To be closer to parsing abilities of Postgres? As @samuelcolvin pointed out above.

I guess in general I think it would be driven by contributor needs -- if someone wants needs more exotic parsing support, we can definitely consider it, but if it gets too specific, it probably makes sense to keep the parsing in the application and not in arrow-rs

@ByteBaker
Copy link
Contributor

In that case we need to decide if we wanna keep this one open or not.

@alamb
Copy link
Contributor

alamb commented Sep 25, 2024

Let's leave it open so it is easier to find if someone else hits the problem

@tustvold tustvold added enhancement Any new improvement worthy of a entry in the changelog and removed bug labels Sep 28, 2024
@tustvold tustvold changed the title Interval parsing interprets 5 day hour wrongly Make Interval Parsing More Permissive Sep 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Any new improvement worthy of a entry in the changelog
Projects
None yet
Development

No branches or pull requests

4 participants