Add support for polars #88

Open
kklein opened this issue Aug 27, 2024 · 2 comments · May be fixed by #90
Labels
enhancement New feature or request question Further information is requested

Comments

@kklein
Collaborator

kklein commented Aug 27, 2024

As of now, covariates X, treatment assignments w and observed outcomes y can be provided as numpy data structures (np.ndarray) or as pandas data structures (pd.DataFrame and pd.Series).

A PR allowing X to be a scipy.sparse.csr_matrix is in the works: PR #86

It might be beneficial to allow for polars datastructures, too.

One question that might arise is how we deal with a potential additional dependency. Do we want to wrap every polars-dependent piece of code in a try-block that tries to import? Do we want to make polars a run dependency of metalearners?

If you'd like to use metalearners with polars please let us know. :)

@kklein kklein added enhancement New feature or request question Further information is requested labels Aug 27, 2024
@kklein
Collaborator Author

kklein commented Aug 29, 2024

> One question that might arise is how we deal with a potential additional dependency. Do we want to wrap every polars-dependent piece of code in a try-block that tries to import? Do we want to make polars a run dependency of metalearners?

In this comment, Marco Gorelli suggests the following approach to avoid introducing a run-time dependency:

import sys

# Only true if the caller has already imported polars; polars is never imported here.
if (pl := sys.modules.get('polars', None)) is not None and isinstance(data, pl.DataFrame):
    # Polars-related logic
    ...
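
Wrapped in a small helper, the same idea could be sketched as follows. The function name `is_polars_dataframe` is made up for this example and is not part of metalearners or polars. The check relies on the fact that a polars DataFrame can only exist if polars has already been imported somewhere, in which case it is present in `sys.modules`.

```python
import sys
from typing import Any


def is_polars_dataframe(data: Any) -> bool:
    """Hypothetical helper: detect a polars DataFrame without importing polars."""
    # If polars was never imported in this process, `data` cannot be a
    # polars object, so a lookup in sys.modules is enough.
    pl = sys.modules.get("polars")
    return pl is not None and isinstance(data, pl.DataFrame)
```

With this approach polars stays out of the run dependencies entirely, and users who never import polars pay no import cost.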

@kklein kklein linked a pull request Sep 3, 2024 that will close this issue
@baggiponte

baggiponte commented Sep 26, 2024

Hey there! Sick project. I would suggest you look into narwhals by @MarcoGorelli, @FBruzzesi and others for supporting any dataframe backend.

narwhals is a very thin library that handles compatibility with a bunch of other dataframe backends (polars, dask, modin...). It's currently used by Altair and scikit-lego, among others, and it's being integrated into Plotly.

Marco and Francesco are super nice and can provide a lot of support for this.

2 participants