
Look into contributing / integrating with polars? #19

Closed
jvdd opened this issue Jan 19, 2023 · 15 comments
Labels
enhancement New feature or request

Comments

@jvdd
Member

jvdd commented Jan 19, 2023

Integration / operability with polars can be realized in two manners: tsdownsample could support polars data types directly (avoiding any conversion on the user's side), and/or the underlying Rust algorithms could be integrated into polars itself.

This issue is, of course, open for discussion about how users / contributors see this.


On another note: perhaps argminmax's runtime speed / algorithms could be contributed to various parts of polars?

@jvdd jvdd added the enhancement New feature or request label Jan 19, 2023
@jvdd jvdd changed the title Look into contributing / integrating this to polars? Look into contributing / integrating with polars? Jan 19, 2023
@jmakov

jmakov commented Feb 12, 2023

I'd be interested in both points, since it currently seems impossible to use tsdownsample with polars without conversion. Also, related to the second point: it would be really great to see a comparison between argminmax and the current polars implementation, especially since I seem to have stumbled upon a regression (fixed for now): pola-rs/polars#6794 (my code calls polars_expr.arg_max() a lot in a for loop).

@jvdd
Member Author

jvdd commented Feb 13, 2023

Hi @jmakov, I'll touch upon both points here. But first, some questions.

  • Are you using polars in Python (and thus not in Rust)?
    -> If you are using Python polars, I suppose that by "conversion" you mean casting the data to a numpy array? Does this result in a large runtime and/or memory overhead?

Below I'll share a bit more about the underlying Rust code and how I envision this integration.


tsdownsample integration with polars

Some background 🙃: At the time of creating this library I was mainly focused on creating Python bindings for (highly optimized) downsampling algorithms written in Rust. Back then I was only concerned with supporting numpy arrays. As a result, the implemented algorithms in this library are very tightly coupled to ndarray::ArrayView1<T>. This currently limits the flexibility / other use cases that the underlying Rust code could serve (e.g., integrating with polars).
=> So, the current tsdownsample architecture does not (conveniently) allow integration with other (array-like) data types.

How can we solve this?

Changing the implementation to slices (&[T]) and adding the downsampling algorithms to one or multiple traits should conveniently allow (i.e., with very little additional code) using tsdownsample's downsampling algorithms on different data types (e.g., Vec, arrow::PrimitiveArray, polars data types, ...).

With this in mind, I refactored argminmax (which is truly the core of the downsampling algorithms) to operate on &[T] and conveniently added support for Vec and Apache Arrow, see jvdd/argminmax#13.
Somewhere in the future (can't really give a proper timeline) I plan to do the same for tsdownsample.

argminmax and polars

Some related issues can be found in the argminmax repo.

(very basic) benchmarks

When comparing it with polars, I notice a ~10x performance gain for smaller arrays (e.g., 100k values); for larger arrays (e.g., 50M values) the gain is around 3x.

Results for 100k values ⬇️

argminmax: 0.000556642 sec
polars: 0.005885739 sec

My benchmarking code is below.

Disclaimer: this is certainly not best practice!! - but the results are still quite indicative.

main.rs ⬇️

use argminmax::ArgMinMax;
use polars::prelude::*;
use rand::Rng;
use std::time::Instant;

fn get_random_array(n: usize) -> Vec<i32> {
    let mut rng = rand::thread_rng();
    let mut v = Vec::with_capacity(n);
    for _ in 0..n {
        v.push(rng.gen_range(0, 100));
    }
    v[0] = i32::MAX;
    v
}


fn main() {
    let n: usize = 100_000;
    let n_repeat: usize = 50;
    let v: Vec<i32> = get_random_array(n);

    let start = Instant::now();
    for _ in 0..n_repeat {
        let (_, max_index) = v.argminmax();
        assert_eq!(max_index, 0);
    }
    let elapsed = start.elapsed();
    println!("argminmax: {}.{:09} sec", elapsed.as_secs(), elapsed.subsec_nanos());


    // Create a polars Series
    let s = Series::new("foo", v);

    let start = Instant::now();
    for _ in 0..n_repeat {
        let max_index = s.arg_max();
        assert_eq!(max_index, Some(0));
    }
    let elapsed = start.elapsed();
    println!("polars: {}.{:09} sec", elapsed.as_secs(), elapsed.subsec_nanos());
}

Cargo.toml ⬇️

[package]
name = "polars_bench"
version = "0.1.0"
edition = "2021"

[dependencies]
argminmax = { git = "https://github.com/jvdd/argminmax" }
polars = { version = "0.27.2", features = ["simd"] }
rand = { version = "0.7.2" }
rand_distr = { version = "0.2.2", default-features = false }

Hope this answer helps! I'm interested in hearing what you think of this.

Cheers,
Jeroen

P.S. you are (of course) always welcome to start contributing to either package or further extend (e.g., in Python) the above benchmarks and share your findings here 😊

@jmakov

jmakov commented Feb 13, 2023

Thanks for the quick response. I'm using polars from Python. I guess I could try it with numpy arrays only.

Another question: in my code I'm looking for the first occurrence (index) where a value is greater than a given value (in the same column). I wanted to call argminmax from tsdownsample, but having a quick look at the tsdownsample code I don't find an obvious way to call it on my numpy array. Am I missing something? I only checked the Python code since I'm not familiar with Rust.

@jvdd
Member Author

jvdd commented Feb 13, 2023

Indeed! I haven't created any proper numpy (or Python) bindings for argminmax.

One (somewhat hacky) solution is to use argminmax through tsdownsample's MinMaxDownsampler (which uses argminmax to return the indices of the min & max values) -> calling MinMaxDownsampler with n_out=2 calls argminmax exactly once for the whole array.

Here is a reusable Python implementation ⬇️

from typing import Tuple

import numpy as np
from tsdownsample import MinMaxDownsampler

downsampler = MinMaxDownsampler()

def argminmax(arr: np.ndarray) -> Tuple[int, int]:
    """Return (argmin_index, argmax_index) for the given array."""
    # Call argminmax exactly once on the data (by downsampling it to 2 datapoints)
    idx1, idx2 = downsampler.downsample(arr, n_out=2)
    # Return the (argmin_index, argmax_index)
    if arr[idx1] < arr[idx2]:
        # idx1 is the argmin_index
        return idx1, idx2
    else:
        # idx1 is the argmax_index
        return idx2, idx1

But if you will be working with numpy arrays, you might as well just call np.argmin / np.argmax 🤔 - should give +/- the same performance (except for f16).
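For completeness, the plain-numpy equivalents of the helper above are one-liners (the example data here is mine):

```python
import numpy as np

arr = np.array([3.0, 1.0, 4.0, 1.5, 9.0, 2.6])

min_idx = int(np.argmin(arr))  # index of the smallest value
max_idx = int(np.argmax(arr))  # index of the largest value
# min_idx == 1, max_idx == 4 for this array
```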

@jmakov

jmakov commented Feb 13, 2023

I was already calling np.argmax, then refactored to polars and saw a big perf increase (15 min on a 116-core cluster down to 5 min on one 24-core node). Oh, OK - I thought argminmax would give an additional boost. Thank you very much for the clarifications!

@jvdd
Member Author

jvdd commented Feb 13, 2023

Can you share some (minimal) Python code of how you use polars for this? Do I understand correctly that polars performs argmin / argmax multi-threaded?

=> If polars calls argmin / argmax multi-threaded, I am quite confident that integrating argminmax in polars will result in a further performance boost (the 3-10x performance gain vs. polars I talked about above is just single-threaded execution).

argminmax scales really well multi-threaded (as argmin / argmax is a CPU- and memory-bound problem) - this is exactly what we observed in the multi-threaded MinMaxDownsampler!

@jmakov

jmakov commented Feb 13, 2023

I'm using ray.io to distribute work and make polars use only 1 process. The code is taken from pola-rs/polars#6794, where I first reported the performance regression and started looking for alternatives:

test.py ⬇️
import numpy
import polars

numpy.random.seed(42)


df_size = 10**5

df = polars.DataFrame({"idx": range(0, df_size), "col1": numpy.random.randint(1, 1_000, df_size)})

def get_idx_where_first_value_greater(df):
    for row in df.iter_rows():
        idx = row[0]
        (df[idx, 1] < df[idx:, 1]).arg_max()

get_idx_where_first_value_greater(df)

You can run it like this to prevent multithreading:
export POLARS_MAX_THREADS=1 && time python test.py
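For readers without a polars install, the loop above can be mimicked in plain numpy to see what it computes: for each row i, the first index (relative to i) at which a later value exceeds col[i]. A sketch with small made-up data (note the caveat that np.argmax on a boolean mask returns 0 when there is no match):

```python
import numpy as np

col = np.array([5, 1, 9, 2])

def get_idx_where_first_value_greater(col):
    # Mirrors the polars loop: for each i, the first index (relative
    # to i) where col[i] < col[i:]. np.argmax on a bool mask returns
    # the first True - and 0 when no element matches.
    return [int(np.argmax(col[i] < col[i:])) for i in range(len(col))]

# get_idx_where_first_value_greater(col) == [2, 1, 0, 0]
```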

@jvdd
Member Author

jvdd commented Feb 13, 2023

I monitored the runtime of get_idx_where_first_value_greater and observed that by far the largest chunk (95+%) of the time goes to the (df[idx, 1] < df[idx:, 1]) operation (and not the arg_max).

See my experiment below ⬇️

import polars
import numpy as np
import timeit

np.random.seed(42)

df_size = 10**4  # set this to 10x lower (otherwise the benchmark takes way too long)

df = polars.DataFrame({"idx": range(0, df_size), "col1": np.random.randint(1, 1_000, df_size)})

# Only create a bool Series
def get_idx_where_first_value_greater_cmp(df):
    for row in df.iter_rows():
        idx = row[0]
        (df[idx, 1] < df[idx:, 1]) #.arg_max()

# Original code
def get_idx_where_first_value_greater_arg_max(df):
    for row in df.iter_rows():
        idx = row[0]
        (df[idx, 1] < df[idx:, 1]).arg_max()

print("polars diff")
res = timeit.timeit('get_idx_where_first_value_greater_cmp(df)', globals=globals(), number=7)
print(res, "s")

print("polars diff argmax")
res = timeit.timeit('get_idx_where_first_value_greater_arg_max(df)', globals=globals(), number=7)
print(res, "s")

Can you confirm this?

@jmakov

jmakov commented Feb 13, 2023

In the issue I referenced, somebody ran py-spy, and from that it seemed most of the time was spent in .arg_max(); the fix also addressed that and the regression went away.
I ran my original example with and without .arg_max() commented out. The original script takes about 6.14 s; with .arg_max() commented out, about 5.7 s. Looks like you're right - I didn't consider that at all; most of the time is taken by the "<" operation...

@jvdd
Member Author

jvdd commented Feb 13, 2023

I ran line_profiler and this seems to confirm that (df[idx, 1] < df[idx:, 1]) is the bottleneck:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    10                                           def get_idx_where_first_value_greater(df):
    11    100000  114806212.0   1148.1      0.4      for row in df.iter_rows():
    12    100000   48883868.0    488.8      0.2          idx = row[0]
    13    100000 25935368268.0 259353.7     98.4         d = (df[idx, 1] < df[idx:, 1])
    14    100000  265593691.0   2655.9      1.0          d.arg_max()
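As an aside (not something discussed in this thread): the quadratic cost of redoing the comparison for every row can be avoided entirely for this particular problem. "First later index with a greater value" is the classic next-greater-element question, solvable in a single O(n) pass with a stack. A hypothetical sketch:

```python
def next_greater_index(col):
    """res[i] = smallest j > i with col[j] > col[i] (absolute index),
    or -1 if no later value is greater - one O(n) pass instead of the
    O(n^2) per-row comparisons profiled above."""
    res = [-1] * len(col)
    stack = []  # indices whose "next greater" has not been found yet
    for j, v in enumerate(col):
        # Every index on the stack with a smaller value has just
        # found its next greater element at position j
        while stack and col[stack[-1]] < v:
            res[stack.pop()] = j
        stack.append(j)
    return res

# next_greater_index([5, 1, 9, 2]) == [2, 2, -1, -1]
```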

@jmakov

jmakov commented Feb 13, 2023

Thanks! I think that solves the issue with looking for faster argmax :)

@jvdd
Member Author

jvdd commented Feb 13, 2023

I'm not really sure about this, but I think using arg_where might suit your problem better than arg_max.
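In plain numpy terms (hedging here, since I'm guessing the exact lookup), "first index where the value exceeds a threshold" boils down to argmax over a boolean mask - which is roughly the lookup arg_where would express more directly:

```python
import numpy as np

arr = np.array([3, 7, 2, 9, 9])
threshold = 5

mask = arr > threshold
# np.argmax returns the first True; guard against the "no match"
# case, where it would silently return 0.
first_idx = int(np.argmax(mask)) if mask.any() else -1
# first_idx == 1 here
```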

Was fun toying around with some polars code! 👌

@jmakov

jmakov commented Feb 13, 2023

Didn't know that existed, will check it out, thanks!

@jvdd
Member Author

jvdd commented Jul 5, 2023

Closing as pola-rs/polars#8074 brings argminmax to polars

@jvdd jvdd closed this as completed Jul 5, 2023
@kszlim

kszlim commented Jan 4, 2024

It would be great to support this package's high-level functionality with Arrow arrays and/or polars DataFrames/Series directly too.
