I hit a weird error when using DuckDB WASM via `insertJSONFromPath` with JSON files. TL;DR: one of the fields contained a timestamp in milliseconds, which DuckDB WASM marked as `Timestamp[s]` and returned in microseconds. With my data, I found two things unexpected:
- DuckDB itself recognized the value as a timestamp (rather than just returning the int or float)
- The unit was incorrect
It was even more unexpected to notice that this behavior was inconsistent and depended on the number of records in the JSON file. Sometimes the column came back as INTEGER, at other times as TIMESTAMP. After some further digging, I was able to create a minimal file that shows the unexpected behavior: https://gist.github.com/pmm-motif/762bf9878a2f80629e664ba9534896d9
After loading the JSON via `insertJSONFromPath(...)` and running `DESCRIBE TABLE <table-name>`, the type of `ts` is `TIMESTAMP_S` (rather than, say, `INTEGER`, which I was anticipating).
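To illustrate why a seconds-based type is a problem for this kind of data, here is a small sketch in plain TypeScript. It does not call DuckDB at all. The sample value and the helper names `yearIfMillis` and `yearIfSeconds` are my own assumptions for illustration, not taken from the gist.

```typescript
// Hypothetical sample: an epoch timestamp in milliseconds (November 2023).
const tsMillis = 1700000000000;

const MS_PER_SECOND = 1000;

// Intended reading: the value is already in milliseconds.
function yearIfMillis(value: number): number {
  return new Date(value).getUTCFullYear();
}

// Mistaken reading: the same number treated as seconds,
// which is what a seconds-based timestamp type implies.
function yearIfSeconds(value: number): number {
  return new Date(value * MS_PER_SECOND).getUTCFullYear();
}

console.log(yearIfMillis(tsMillis));  // 2023
console.log(yearIfSeconds(tsMillis)); // a year tens of thousands of years in the future
```

Read as seconds, the same integer lands far outside any realistic date range, so the inferred type silently changes the meaning of the column.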
I suspect the reason is related to the scoring algorithm. I think it's actually wrong to try parsing any numeric data as timestamps, since:
- the check doesn't seem to analyze whether the timestamp looks like a valid value anyway
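For reference, a plausibility check of the kind I mean could look roughly like the sketch below. This is not how DuckDB's inference works. The function name `plausibleEpochUnits` and the 2000 to 2100 window are hypothetical choices made purely for illustration.

```typescript
type EpochUnit = "s" | "ms" | "us" | "ns";

// How many of each unit make up one second.
const UNITS_PER_SECOND: Record<EpochUnit, number> = { s: 1, ms: 1e3, us: 1e6, ns: 1e9 };

// Hypothetical plausibility window, expressed in epoch seconds.
const MIN_SECONDS = 946684800;  // 2000-01-01T00:00:00Z
const MAX_SECONDS = 4102444800; // 2100-01-01T00:00:00Z

// Return every unit under which the number would fall inside the window.
// An empty result suggests the value is just a number, not a timestamp.
function plausibleEpochUnits(value: number): EpochUnit[] {
  const units: EpochUnit[] = ["s", "ms", "us", "ns"];
  return units.filter((unit) => {
    const seconds = value / UNITS_PER_SECOND[unit];
    return seconds >= MIN_SECONDS && seconds < MAX_SECONDS;
  });
}

console.log(plausibleEpochUnits(1700000000000)); // [ 'ms' ]
console.log(plausibleEpochUnits(1700000000));    // [ 's' ]
console.log(plausibleEpochUnits(42));            // []
```

Even a check like this is only a heuristic, since an ordinary counter or ID can fall inside the window by accident.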
My proposal is to not auto-parse numbers as timestamps at all.