DataFrame.explode() errors on string columns #14284

Closed
2 tasks done
etiennebacher opened this issue Feb 5, 2024 · 1 comment · Fixed by #14285
Labels: accepted (Ready for implementation), documentation (Improvements or additions to documentation), python (Related to Python Polars)

Comments

etiennebacher commented Feb 5, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

df = pl.DataFrame({"letters": ["aa", "bbb"]})
df.explode("letters")

Log output

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python311\Lib\site-packages\polars\dataframe\frame.py", line 7268, in explode  
    return self.lazy().explode(columns, *more_columns).collect(_eager=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\polars\lazyframe\frame.py", line 1940, in collect  
    return wrap_df(ldf.collect())
                   ^^^^^^^^^^^^^
polars.exceptions.InvalidOperationError: `explode` operation not supported for dtype `str`

Issue description

The documentation for DataFrame.explode() describes the columns argument as:

Column names, expressions, or a selector defining them. The underlying columns being exploded must be of List or String datatype.

However, in the example above, exploding a string column throws an error.

Expected behavior

Not sure if this is a bug or a mistake in the docs.

Installed versions

--------Version info---------
Polars:               0.20.7
Index type:           UInt32
Platform:             Windows-10-10.0.19044-SP0
Python:               3.11.0 (main, Oct 24 2022, 18:26:48) [MSC v.1933 64 bit (AMD64)]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          <not installed>
connectorx:           <not installed>
deltalake:            <not installed>
fsspec:               2023.6.0
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           3.7.1
numpy:                1.24.3
openpyxl:             <not installed>
pandas:               2.0.3
pyarrow:              12.0.1
pydantic:             <not installed>
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
etiennebacher added the bug, needs triage, and python labels on Feb 5, 2024
stinodego added the documentation label and removed the bug and needs triage labels on Feb 5, 2024
stinodego (Member) commented Feb 5, 2024

This was recently changed - it should mention List or Array.

Though I guess we do have the str.explode function, so it could work - but it doesn't make much sense anymore now that we have changed the string type.
