merge main and maybe fix pandas
FBruzzesi committed Oct 7, 2024
2 parents 0061e9b + f2b7a40 commit f5812e4
Showing 107 changed files with 4,469 additions and 900 deletions.
6 changes: 5 additions & 1 deletion .github/release-drafter.yml
@@ -1,6 +1,7 @@
exclude-labels:
- skip changelog
- ignore
- release
- dependencies
name-template: 'Narwhals v$RESOLVED_VERSION'

change-template: '- $TITLE (#$NUMBER)'
@@ -34,6 +35,9 @@ autolabeler:
- label: release
title:
- '/^([Rr]elease)/'
- label: ignore
title:
- '/^\[pre-commit.ci\]/'

version-resolver:
major:
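
The autolabeler entries in this hunk are regular expressions matched against the pull request title. As a rough sketch of which titles they select, the snippet below replays the two patterns visible in the diff with Python's `re` module. Python is only a stand-in for the regex engine release-drafter itself uses, and `TITLE_RULES` and `labels_for` are hypothetical names introduced for illustration.

```python
from __future__ import annotations

import re

# Title patterns copied from the autolabeler section of the diff,
# without the surrounding slashes.
TITLE_RULES = {
    "release": r"^([Rr]elease)",
    "ignore": r"^\[pre-commit.ci\]",
}


def labels_for(title: str) -> list[str]:
    """Return the labels whose pattern matches the given PR title."""
    return [
        label
        for label, pattern in TITLE_RULES.items()
        if re.search(pattern, title)
    ]


print(labels_for("[pre-commit.ci] pre-commit autoupdate"))
print(labels_for("Release: bump version"))
print(labels_for("feat: add unpivot"))
```

The `.` in `pre-commit.ci` is unescaped, so it matches any single character. That is harmless here, but escaping it (`pre-commit\.ci`) would make the pattern stricter.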
46 changes: 46 additions & 0 deletions .github/workflows/downstream_tests.yml
@@ -86,3 +86,49 @@ jobs:
run: |
cd scikit-lego
pytest -n auto --disable-warnings --cov=sklego -m "not cvxpy and not formulaic and not umap"
shiny:
strategy:
matrix:
python-version: ["3.12"]
os: [ubuntu-latest]

runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
enable-cache: "true"
cache-suffix: ${{ matrix.python-version }}
cache-dependency-glob: "**requirements*.txt"
- name: clone-shiny
run: |
git clone https://github.com/posit-dev/py-shiny.git
cd py-shiny
git log
- name: install-basics
run: uv pip install --upgrade tox virtualenv setuptools --system
- name: install-shiny-dev
run: |
cd py-shiny
uv pip install -e ".[dev,test]" --system
- name: install-narwhals-dev
run: |
uv pip uninstall narwhals --system
uv pip install -e . --system
- name: show-deps
run: uv pip freeze
- name: Run pytest
run: |
cd py-shiny
python tests/pytest/asyncio_prevent.py
pytest
- name: Run mypy
run: |
cd py-shiny
uv pip install mypy --system
mypy shiny
6 changes: 4 additions & 2 deletions .github/workflows/extremes.yml
@@ -119,6 +119,8 @@ jobs:
kaggle kernels output "marcogorelli/variable-brink-glacier"
- name: install-polars
run: python -m pip install *.whl
- name: install-pandas-nightly
run: pip install --pre --extra-index-url https://pypi.anaconda.org/scientific-python-nightly-wheels/simple pandas
- name: install-reqs
run: uv pip install --upgrade tox virtualenv setuptools pip -r requirements-dev.txt --system
- name: uninstall pyarrow
@@ -127,8 +129,8 @@ jobs:
# run: uv pip install --extra-index-url https://pypi.fury.io/arrow-nightlies/ --pre pyarrow --system
- name: uninstall pandas
run: uv pip uninstall pandas --system
- name: install-pandas-nightly
run: uv pip install --prerelease=allow --pre --extra-index-url https://pypi.anaconda.org/scientific-python-nightly-wheels/simple pandas --system
- name: show-deps
run: uv pip freeze
- name: uninstall numpy
run: uv pip uninstall numpy --system
- name: install numpy nightly
4 changes: 2 additions & 2 deletions .github/workflows/pytest.yml
@@ -25,7 +25,7 @@ jobs:
cache-suffix: ${{ matrix.python-version }}
cache-dependency-glob: "**requirements*.txt"
- name: install-reqs
run: uv pip install --upgrade tox virtualenv setuptools -r requirements-dev.txt ibis-framework[duckdb] --system
run: uv pip install --upgrade tox virtualenv setuptools -r requirements-dev.txt --system
- name: show-deps
run: uv pip freeze
- name: Run pytest
@@ -87,7 +87,7 @@ jobs:
- name: show-deps
run: uv pip freeze
- name: install ibis
run: uv pip install ibis-framework[duckdb] --system
run: uv pip install "ibis-framework[duckdb]>=6.0.0" --system
# Ibis puts upper bounds on dependencies, and requires Python3.10+,
# which messes with other dependencies on lower Python versions
if: matrix.python-version == '3.12'
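
A requirement that carries a `>=` version specifier needs quoting inside a shell `run:` step, because an unquoted `>` is parsed by the shell as output redirection rather than as part of the argument. The sketch below uses Python's standard `shlex` module to approximate how a POSIX shell splits the two spellings; `shell_tokens` is a hypothetical helper, and `shlex` only approximates real shell parsing.

```python
from __future__ import annotations

import shlex


def shell_tokens(command: str) -> list[str]:
    """Split a command roughly as a POSIX shell would.

    Operators such as `>` are kept as separate tokens.
    """
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


# Unquoted: `>` becomes a redirection operator and `=6.0.0` its target.
unquoted = shell_tokens("uv pip install ibis-framework[duckdb]>=6.0.0 --system")
# Quoted: the whole requirement stays a single argument.
quoted = shell_tokens('uv pip install "ibis-framework[duckdb]>=6.0.0" --system')

print(unquoted)
print(quoted)
```

In the unquoted form the installer would receive only `ibis-framework[duckdb]`, with its output redirected to a file named `=6.0.0`, so the version bound would be silently dropped.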
7 changes: 6 additions & 1 deletion .pre-commit-config.yaml
@@ -1,7 +1,7 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: 'v0.6.5'
rev: 'v0.6.7'
hooks:
# Run the formatter.
- id: ruff-format
@@ -46,3 +46,8 @@ repos:
args: [--skip-errors]
additional_dependencies:
- black==22.12.0
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.6.0
hooks:
- id: name-tests-test
exclude: ^tests/utils\.py
6 changes: 5 additions & 1 deletion README.md
@@ -13,11 +13,12 @@
Extremely lightweight and extensible compatibility layer between dataframe libraries!

- **Full API support**: cuDF, Modin, pandas, Polars, PyArrow
- **Lazy-only support**: Dask
- **Interchange-level support**: Ibis, Vaex, anything else which implements the DataFrame Interchange Protocol

Seamlessly support all, without depending on any!

- ✅ **Just use** a subset of **the Polars API**, no need to learn anything new
- ✅ **Just use** [a subset of **the Polars API**](https://narwhals-dev.github.io/narwhals/api-reference/), no need to learn anything new
- ✅ **Zero dependencies**, Narwhals only uses what
the user passes in so your library can stay lightweight
- ✅ Separate **lazy** and eager APIs, use **expressions**
@@ -117,6 +118,9 @@ Narwhals has been featured in several talks, podcasts, and blog posts:
- [Talk Python to me Podcast](https://youtu.be/FSH7BZ0tuE0)
Ahoy, Narwhals are bridging the data science APIs

- [Python Bytes Podcast](https://www.youtube.com/live/N7w_ESVW40I?si=y-wN1uCsAuJOKlOT&t=382)
Episode 402, topic #2

- [Super Data Science: ML & AI Podcast](https://www.youtube.com/watch?v=TeG4U8R0U8U)
Narwhals: For Pandas-to-Polars DataFrame Compatibility

1 change: 1 addition & 0 deletions docs/api-reference/dataframe.md
@@ -41,6 +41,7 @@
- to_numpy
- to_pandas
- unique
- unpivot
- with_columns
- with_row_index
- write_csv
3 changes: 3 additions & 0 deletions docs/api-reference/dtypes.md
@@ -4,6 +4,9 @@
handler: python
options:
members:
- Array
- List
- Struct
- Int64
- Int32
- Int16
2 changes: 1 addition & 1 deletion docs/api-reference/expr_str.md
@@ -8,9 +8,9 @@
- ends_with
- head
- len_chars
- slice
- replace
- replace_all
- slice
- starts_with
- strip_chars
- tail
1 change: 1 addition & 0 deletions docs/api-reference/lazyframe.md
@@ -25,6 +25,7 @@
- tail
- to_native
- unique
- unpivot
- with_columns
- with_row_index
show_root_heading: false
3 changes: 3 additions & 0 deletions docs/api-reference/narwhals.md
@@ -11,6 +11,7 @@ Here are the top-level functions available in Narwhals.
- any_horizontal
- col
- concat
- concat_str
- from_dict
- from_native
- get_level
@@ -22,12 +23,14 @@ Here are the top-level functions available in Narwhals.
- maybe_align_index
- maybe_convert_dtypes
- maybe_get_index
- maybe_reset_index
- maybe_set_index
- mean
- mean_horizontal
- min
- narwhalify
- new_series
- nth
- sum
- sum_horizontal
- when
2 changes: 2 additions & 0 deletions docs/api-reference/series.md
@@ -6,6 +6,7 @@
members:
- __arrow_c_stream__
- __getitem__
- __iter__
- abs
- alias
- all
@@ -42,6 +43,7 @@
- null_count
- pipe
- quantile
- rename
- round
- sample
- scatter
3 changes: 3 additions & 0 deletions docs/api-reference/series_str.md
@@ -14,5 +14,8 @@
- starts_with
- strip_chars
- tail
- to_datetime
- to_lowercase
- to_uppercase
show_source: false
show_bases: false
6 changes: 6 additions & 0 deletions docs/assets/logo.svg
4 changes: 2 additions & 2 deletions docs/backcompat.md
@@ -74,8 +74,8 @@ users of `narwhals.stable.v1` will have their code unaffected.
Which should you use? In general we recommend:

- When prototyping, use `import narwhals as nw`, so you can iterate quickly.
- Once you're happy with what you've got and what to release something production-ready and stable,
when switch out your `import narwhals as nw` usage for `import narwhals.stable.v1 as nw`.
- Once you're happy with what you've got and want to release something production-ready and stable,
then switch out your `import narwhals as nw` usage for `import narwhals.stable.v1 as nw`.

## Exceptions

81 changes: 41 additions & 40 deletions docs/extending.md
@@ -2,7 +2,7 @@

## List of supported libraries (and how to add yours!)

Currently, Narwhals supports the following libraries as inputs:
Currently, Narwhals has full API support for the following libraries:

| Library | 🔗 Link 🔗 |
| ------------- | ------------- |
@@ -12,46 +12,13 @@ Currently, Narwhals supports the following libraries as inputs:
| Modin | [github.com/modin-project/modin](https://github.com/modin-project/modin) |
| PyArrow ⇶ | [arrow.apache.org/docs/python](https://arrow.apache.org/docs/python/index.html) |

If you want your own library to be recognised too, you're welcome open a PR (with tests)!
Alternatively, if you can't do that (for example, if you library is closed-source), see
the next section for what else you can do.

To check which methods are supported for which backend in depth, please refer to the
[API completeness page](api-completeness/index.md).

## Extending Narwhals

We love open source, but we're not "open source absolutists". If you're unable to open
source you library, then this is how you can make your library compatible with Narwhals.

Make sure that, in addition to the public Narwhals API, you also define:

- `DataFrame.__narwhals_dataframe__`: return an object which implements public methods
from `Narwhals.DataFrame`
- `DataFrame.__narwhals_namespace__`: return an object which implements public top-level
functions from `narwhals` (e.g. `narwhals.col`, `narwhals.concat`, ...)
- `DataFrame.__native_namespace__`: return a native namespace object which must have a
`from_dict` method
- `LazyFrame.__narwhals_lazyframe__`: return an object which implements public methods
from `Narwhals.LazyFrame`
- `LazyFrame.__narwhals_namespace__`: return an object which implements public top-level
functions from `narwhals` (e.g. `narwhals.col`, `narwhals.concat`, ...)
- `LazyFrame.__native_namespace__`: return a native namespace object which must have a
`from_dict` method
- `Series.__narwhals_series__`: return an object which implements public methods
from `Narwhals.Series`

If your library doesn't distinguish between lazy and eager, then it's OK for your dataframe
object to implement both `__narwhals_dataframe__` and `__narwhals_lazyframe__`. In fact,
that's currently what `narwhals._pandas_like.dataframe.PandasLikeDataFrame` does. So, if you're stuck,
take a look at the source code to see how it's done!

Note that the "extension" mechanism is still experimental. If anything is not clear, or
doesn't work, please do raise an issue or contact us on Discord (see the link on the README).
It also has lazy-only support for [Dask](https://github.com/dask/dask), and interchange-only support
for [DuckDB](https://github.com/duckdb/duckdb) and [Ibis](https://github.com/ibis-project/ibis).

## Levels
### Levels

Narwhals comes with two levels of support: "full" and "interchange".
Narwhals comes with two levels of support ("full" and "interchange"), and we are working on defining
a "lazy-only" level too.

Libraries for which we have full support can benefit from the whole
[Narwhals API](https://narwhals-dev.github.io/narwhals/api-reference/).
@@ -91,4 +58,38 @@ def func(df: Any) -> Schema:
return df.schema
```
is also supported, meaning that, in addition to the libraries mentioned above, you can
also pass Ibis, Vaex, PyArrow, and any other library which implements the protocol.
also pass Ibis, DuckDB, Vaex, and any library which implements the protocol.

### Extending Narwhals

If you want your own library to be recognised too, you're welcome to open a PR (with tests)!
Alternatively, if you can't do that (for example, if your library is closed-source), read on
for what else you can do.

We love open source, but we're not "open source absolutists". If you're unable to open
source your library, then this is how you can make your library compatible with Narwhals.

Make sure that, in addition to the public Narwhals API, you also define:

- `DataFrame.__narwhals_dataframe__`: return an object which implements public methods
from `Narwhals.DataFrame`
- `DataFrame.__narwhals_namespace__`: return an object which implements public top-level
functions from `narwhals` (e.g. `narwhals.col`, `narwhals.concat`, ...)
- `DataFrame.__native_namespace__`: return a native namespace object which must have a
`from_dict` method
- `LazyFrame.__narwhals_lazyframe__`: return an object which implements public methods
from `Narwhals.LazyFrame`
- `LazyFrame.__narwhals_namespace__`: return an object which implements public top-level
functions from `narwhals` (e.g. `narwhals.col`, `narwhals.concat`, ...)
- `LazyFrame.__native_namespace__`: return a native namespace object which must have a
`from_dict` method
- `Series.__narwhals_series__`: return an object which implements public methods
from `Narwhals.Series`

If your library doesn't distinguish between lazy and eager, then it's OK for your dataframe
object to implement both `__narwhals_dataframe__` and `__narwhals_lazyframe__`. In fact,
that's currently what `narwhals._pandas_like.dataframe.PandasLikeDataFrame` does. So, if you're stuck,
take a look at the source code to see how it's done!

Note that this "extension" mechanism is still experimental. If anything is not clear, or
doesn't work, please do raise an issue or contact us on Discord (see the link on the README).
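
To make the list of hooks above more concrete, here is a minimal sketch of a frame class that defines them, written in plain Python with no Narwhals import. Everything except the dunder names is hypothetical (`MyFrame`, `MyNamespace`, `MyNative`, `defined_hooks`), the methods are stubs rather than a working backend, and the `hasattr` check is only an illustration: Narwhals' own detection logic may differ.

```python
from __future__ import annotations

from typing import Any

# Hook names taken from the list in the documentation above.
FRAME_HOOKS = (
    "__narwhals_dataframe__",
    "__narwhals_lazyframe__",
    "__narwhals_namespace__",
    "__native_namespace__",
)


class MyNative:
    """Hypothetical native namespace; the docs require a `from_dict` method."""

    @staticmethod
    def from_dict(data: dict[str, list[Any]]) -> MyFrame:
        return MyFrame(data)


class MyNamespace:
    """Hypothetical stand-in for the object exposing `col`, `concat`, ..."""


class MyFrame:
    """Hypothetical frame that does not distinguish lazy from eager."""

    def __init__(self, data: dict[str, list[Any]]) -> None:
        self._data = data

    # Implementing both hooks is allowed when lazy and eager are the same.
    def __narwhals_dataframe__(self) -> MyFrame:
        return self

    def __narwhals_lazyframe__(self) -> MyFrame:
        return self

    def __narwhals_namespace__(self) -> MyNamespace:
        return MyNamespace()

    def __native_namespace__(self) -> type[MyNative]:
        return MyNative


def defined_hooks(obj: Any) -> list[str]:
    """List which of the documented hooks an object defines."""
    return [name for name in FRAME_HOOKS if hasattr(obj, name)]


frame = MyFrame.__native_namespace__(MyFrame({}))().from_dict({"a": [1, 2, 3]})
print(defined_hooks(frame))
print(defined_hooks(object()))
```

A real implementation would also need the public `DataFrame`/`LazyFrame` methods and a `Series` type exposing `__narwhals_series__`, which this stub leaves out.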
5 changes: 5 additions & 0 deletions docs/how_it_works.md
@@ -75,6 +75,7 @@ from narwhals.utils import parse_version
pn = PandasLikeNamespace(
implementation=Implementation.PANDAS,
backend_version=parse_version(pd.__version__),
dtypes=nw.dtypes,
)
print(nw.col("a")._call(pn))
```
@@ -101,13 +102,15 @@ import pandas as pd
pn = PandasLikeNamespace(
implementation=Implementation.PANDAS,
backend_version=parse_version(pd.__version__),
dtypes=nw.dtypes,
)

df_pd = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df = PandasLikeDataFrame(
df_pd,
implementation=Implementation.PANDAS,
backend_version=parse_version(pd.__version__),
dtypes=nw.dtypes,
)
expression = pn.col("a") + 1
result = expression._call(df)
@@ -196,6 +199,7 @@ import pandas as pd
pn = PandasLikeNamespace(
implementation=Implementation.PANDAS,
backend_version=parse_version(pd.__version__),
dtypes=nw.dtypes,
)

df_pd = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
@@ -210,6 +214,7 @@ backend, and it does so by passing a Narwhals-compliant namespace to `nw.Expr._c
pn = PandasLikeNamespace(
implementation=Implementation.PANDAS,
backend_version=parse_version(pd.__version__),
dtypes=nw.dtypes,
)
expr = (nw.col("a") + 1)._call(pn)
print(expr)