Bug: fix case where `num_rows` and `total_byte_size` are not defined (stat should be `None` instead of `Some(0)`)
#2976
Labels
bug
I'd love to create a pull request if there isn't any problem.
master...thomas-k-cameron:arrow-datafusion:stat-should-be-None-instead-of-Some(0)
Describe the bug
Some fields of `Statistics` can return `Some(0)` when the value is not available. A TODO comment says that some fields are supposed to be `None` instead of `Some(0)` under some circumstances, but the value is set to `Some(0)` even when it is supposed to be `None`.
To Reproduce
I haven't run into any breaking bugs yet. However, the function is used in one of the methods implemented on the `ListingTable` struct, and `Some(0)` is returned when `options.collect_stat` is set to `false` (i.e. when the statistics are not supposed to be available). It also appears that the generated `Statistics` is passed to a method implemented on `Arc<dyn FileFormat>`, so the chance of someone running into a bug is not zero. Also, `pub async fn get_statistics_with_limit` is an exported function.
Expected behavior
Returns `None` instead of `Some(0)`.
Alternatives Considered
`Some(0)` to `None`.
However, this solution would be rather labour-intensive, as the field is used everywhere (over 200 occurrences). Also, anyone using it cannot determine whether the size is 0 or the value is just not available.
`None` in `ListingTable::async fn list_files_for_scan<'a>`
`ListingTable::async fn list_files_for_scan<'a>` is the only function calling `get_statistics_with_limit`. It can be fixed by assigning `None` to the statistics. However, `get_statistics_with_limit` is exported, so I didn't think this really solves the problem.
The `Statistics` struct is basically an aggregate of all the `Statistics` structs passed to the function. Some values may have fields that are not `None`. Considering that `Statistics` is not meant to always return an accurate value, I think this could work too.
Additional context
Is it supposed to run `get_statistics_with_limit` when `options.collect_stat` is set to `false`?
Thank you very much for reading my issue, and please let me know if I'm getting it wrong.
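To make the expected behavior concrete, here is a minimal sketch. It is not the actual DataFusion code: the `Statistics` struct below is a simplified stand-in with only the two fields named in this issue, and `aggregate_statistics` and `add_known` are hypothetical helpers that only loosely mirror what `get_statistics_with_limit` does. The rule "the sum is unknown if any input is unknown" is also my assumption.

```rust
/// Simplified stand-in for `Statistics`; only the two fields discussed in
/// this issue are modelled (an assumption, not the real definition).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Statistics {
    pub num_rows: Option<usize>,
    pub total_byte_size: Option<usize>,
}

/// Adds two optional counts. The result is `None` (unknown) as soon as
/// either side is unknown, instead of silently treating it as zero.
fn add_known(acc: Option<usize>, next: Option<usize>) -> Option<usize> {
    match (acc, next) {
        (Some(a), Some(b)) => Some(a + b),
        _ => None,
    }
}

/// Hypothetical aggregation helper. When statistics are not collected,
/// every field stays `None` rather than becoming `Some(0)`.
pub fn aggregate_statistics(per_file: &[Statistics], collect_stat: bool) -> Statistics {
    if !collect_stat {
        // `Default` leaves both fields as `None`: "not available".
        return Statistics::default();
    }

    // Starting from `Some(0)` is fine here because we really are counting.
    let mut num_rows = Some(0);
    let mut total_byte_size = Some(0);
    for stats in per_file {
        num_rows = add_known(num_rows, stats.num_rows);
        total_byte_size = add_known(total_byte_size, stats.total_byte_size);
    }

    Statistics { num_rows, total_byte_size }
}
```

With `collect_stat` set to `false`, this sketch returns `None` for both fields, so a caller can tell "not collected" apart from a table that really has zero rows.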