-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cleanup array_has
#12460
cleanup array_has
#12460
Conversation
@@ -203,24 +192,26 @@ fn array_has_dispatch_for_scalar<O: OffsetSizeTrait>( | |||
return Ok(Arc::new(BooleanArray::from(vec![Some(false)]))); | |||
} | |||
let eq_array = compare_with_eq(values, needle, is_nested)?; | |||
let mut final_contained = vec![None; haystack.len()]; | |||
for (i, offset) in offsets.windows(2).enumerate() { | |||
let mut final_contained = BooleanArray::builder(haystack.len()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is using builder faster then From<Vec>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out what you had was faster, appologies, I had assumed boolean builder existed because it was significantly faster. Reverting.
BooleanArray_builder time: [18.503 µs 18.518 µs 18.532 µs]
change: [-0.1321% +0.0661% +0.2517%] (p = 0.51 > 0.05)
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low severe
2 (2.00%) low mild
7 (7.00%) high mild
BooleanArray_vec time: [12.353 µs 12.496 µs 12.653 µs]
change: [-26.653% -26.272% -25.814%] (p = 0.00 < 0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low mild
1 (1.00%) high mild
13 (13.00%) high severe
extern crate criterion;
use arrow::array::BooleanArray;
use criterion::{black_box, criterion_group, criterion_main, Criterion};
fn criterion_benchmark(c: &mut Criterion) {
c.bench_function("BooleanArray_builder", |b| {
b.iter(|| {
let mut final_contained = BooleanArray::builder(8192);
for i in 0..8192 {
if i % 3 == 0 {
final_contained.append_value(true);
} else if i % 2 == 0 {
final_contained.append_value(false);
} else {
final_contained.append_null();
}
}
black_box(final_contained.finish());
})
});
c.bench_function("BooleanArray_vec", |b| {
b.iter(|| {
let mut vec = vec![None; 8192];
for i in 0..8192 {
if i % 3 == 0 {
vec[i] = Some(true);
} else if i % 2 == 0 {
vec[i] = Some(false);
}
}
black_box(BooleanArray::from(vec));
})
});
}
criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
🚀 |
Thanks @samuelcolvin and @jayzhan211 |
Rationale for this change
While working on #12459 I noticed a few things that could do with being cleaned up.
What changes are included in this PR?
ArrayHas::invoke
to be cleaner and easier to understand — logic hasn't changedRewriteSmall tweak toarray_has_dispatch_for_scalar
to useBooleanArray
builder, not aVec
array_has_dispatch_for_scalar
Are these changes tested?
Tests for
array_has
already exist.Are there any user-facing changes?
no.