orc Minimum is error? #51

Closed
hongli-my opened this issue Apr 11, 2022 · 3 comments

Comments

@hongli-my

Write an ORC file:

import zoneinfo
from datetime import datetime
import pyorc

output = open("/tmp/new.orc", "wb")
writer = pyorc.Writer(output, "struct<col0:int,col1:string,col2:timestamp>", timezone=zoneinfo.ZoneInfo('Asia/Shanghai'))
t = datetime(2022, 4, 10, 9, 30, 30, 0)
writer.write((100, "1000", t))
writer.close()
output.close()

Then use orc-statistics to show the column statistics:

--- Column 3 ---
Data type: Timestamp
Values: 1
Has null: no
Minimum: 2022-04-10 03:30:58.800
LowerBound: 2022-04-10 03:30:58.800
Maximum: 2022-04-10 03:30:58.800
UpperBound: 2022-04-10 03:30:58.801

Output of orc-contents:

./orc-1.7.3/build/tools/src/orc-contents /tmp/new.orc
{"col0": 100, "col1": "1000", "col2": "2022-04-10 11:30:30.0"}

Why are Minimum and Maximum reported as 2022-04-10 03:30:58.800?
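
To put a number on the mismatch, here is a minimal sketch that compares the value printed by orc-contents with the Minimum printed by orc-statistics. It uses only the two timestamps shown above and assumes the fixed UTC+8 offset of Asia/Shanghai; the variable names are made up for illustration and nothing here calls pyorc or the ORC tools.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Values copied from the tool output in this report.
stat_minimum = datetime.strptime("2022-04-10 03:30:58.800", FMT)  # orc-statistics
content_value = datetime.strptime("2022-04-10 11:30:30.0", FMT)   # orc-contents

# Assumption: Asia/Shanghai is a fixed UTC+8 offset on this date.
shanghai_offset = timedelta(hours=8)

# How far apart the two tools are for the same single row.
gap = content_value - stat_minimum

# How much of that gap is NOT explained by a plain 8-hour shift.
residual = shanghai_offset - gap

print("gap between contents and statistics:", gap)
print("difference from the 8-hour offset:", residual)
```

The gap comes out just under eight hours, so the statistics look like they are shifted by roughly the writer's time zone offset rather than by an arbitrary amount.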

@hongli-my
Author

Using the Java example at https://www.jianshu.com/p/a9f8b7aa3c28, the result is correct.

@noirello
Owner

Thank you for reporting.
It seems like an issue in the ORC C++ core library with timestamp columns when the writer's time zone is not UTC.

I opened a PR to fix it.

@noirello
Owner

0.7.0 has been released, using ORC C++ Core 1.7.5 with the time zone fix.
