bucket: NotImplementedError: Could not find signature for execute_node: <Bucket, Series, tuple, bool, bool, bool> #4939

MarcSkovMadsen opened this issue Dec 1, 2022 · 7 comments
Labels
feature Features or general enhancements pandas The pandas backend

MarcSkovMadsen commented Dec 1, 2022

What happened?

I'm trying to get Ibis support working in hvPlot and HoloViews, currently for histograms.

Running some code, I see NotImplementedError: Could not find signature for execute_node: <Bucket, Series, tuple, bool, bool, bool>.

I've reduced it to this small example:

import ibis
import pandas as pd

df = pd.DataFrame({
    "y": [1, 2, 3, 4, 5]
})
con = ibis.pandas.connect({"df": df})
table = con.table("df")
bins = [1.0, 3.0, 5.0]
expr = table
expr = expr.y
expr.bucket(bins).execute()

What version of ibis are you using?

3.2.0

What backend(s) are you using, if any?

pandas.

My problem started with DuckDB, though. The example above was just simpler to provide.

Relevant log output

$ python script3.py
Traceback (most recent call last):
  File "C:\repos\private\hvplot\.venv\lib\site-packages\multipledispatch\dispatcher.py", line 269, in __call__
    func = self._cache[types]
KeyError: (<class 'ibis.expr.operations.histograms.Bucket'>, <class 'pandas.core.series.Series'>, <class 'tuple'>, <class 'bool'>, <class 'bool'>, <class 'bool'>)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\repos\private\hvplot\script3.py", line 12, in <module>
    expr.bucket(bins).execute()
  File "C:\repos\private\hvplot\.venv\lib\site-packages\ibis\expr\types\core.py", line 291, in execute
    return self._find_backend(use_default=True).execute(
  File "C:\repos\private\hvplot\.venv\lib\site-packages\ibis\backends\pandas\__init__.py", line 216, in execute
    return execute_and_reset(query, params=params, **kwargs)
  File "C:\repos\private\hvplot\.venv\lib\site-packages\ibis\backends\pandas\core.py", line 491, in execute_and_reset
    result = execute(
  File "C:\repos\private\hvplot\.venv\lib\site-packages\multipledispatch\dispatcher.py", line 278, in __call__
    return func(*args, **kwargs)
  File "C:\repos\private\hvplot\.venv\lib\site-packages\ibis\backends\pandas\trace.py", line 137, in traced_func
    return func(*args, **kwargs)
  File "C:\repos\private\hvplot\.venv\lib\site-packages\ibis\backends\pandas\core.py", line 436, in main_execute
    return execute_with_scope(
  File "C:\repos\private\hvplot\.venv\lib\site-packages\ibis\backends\pandas\core.py", line 222, in execute_with_scope
    result = execute_until_in_scope(
  File "C:\repos\private\hvplot\.venv\lib\site-packages\ibis\backends\pandas\trace.py", line 137, in traced_func
    return func(*args, **kwargs)
  File "C:\repos\private\hvplot\.venv\lib\site-packages\ibis\backends\pandas\core.py", line 363, in execute_until_in_scope
    result = execute_node(
  File "C:\repos\private\hvplot\.venv\lib\site-packages\multipledispatch\dispatcher.py", line 273, in __call__
    raise NotImplementedError(
NotImplementedError: Could not find signature for execute_node: <Bucket, Series, tuple, bool, bool, bool>

@MarcSkovMadsen MarcSkovMadsen added the bug Incorrect behavior inside of ibis label Dec 1, 2022
@MarcSkovMadsen MarcSkovMadsen changed the title from "bucket error: NotImplementedError: Could not find signature for execute_node: <Bucket, Series, tuple, bool, bool, bool>" to "bucket: NotImplementedError: Could not find signature for execute_node: <Bucket, Series, tuple, bool, bool, bool>" Dec 1, 2022
@cpcloud
Member

cpcloud commented Dec 1, 2022

@MarcSkovMadsen Thanks for reporting the issue.

This operation currently isn't implemented for the pandas backend.

Is it a requirement for you to have this work for the pandas backend or was it just a thing you came across while working through the holoviews support?

Ideally, we wouldn't add operations for completeness's sake but if you're using this or planning to, we can add it.
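
For readers unfamiliar with the error text: the traceback shows a dispatcher looking up an implementation by the tuple of argument types and raising NotImplementedError when nothing is registered for that tuple. The following is a minimal stand-alone sketch of that pattern. It is not the actual multipledispatch or ibis code, and TinyDispatcher, Literal and Bucket are hypothetical stand-ins. Unlike the real library, it matches exact types only.

```python
class TinyDispatcher:
    """Toy type-keyed dispatcher (illustrative only, not multipledispatch)."""

    def __init__(self, name):
        self.name = name
        self._registry = {}

    def register(self, *types):
        """Register the decorated function for an exact tuple of types."""
        def decorator(func):
            self._registry[types] = func
            return func
        return decorator

    def __call__(self, *args):
        types = tuple(type(a) for a in args)
        try:
            func = self._registry[types]
        except KeyError:
            # No implementation was registered for this combination of types.
            signature = ", ".join(t.__name__ for t in types)
            raise NotImplementedError(
                f"Could not find signature for {self.name}: <{signature}>"
            ) from None
        return func(*args)


class Literal:
    """Hypothetical operation that has an implementation."""


class Bucket:
    """Hypothetical operation that has no implementation."""


execute_node = TinyDispatcher("execute_node")


@execute_node.register(Literal, int)
def execute_literal(op, value):
    return value


print(execute_node(Literal(), 1))
try:
    execute_node(Bucket(), [1, 2, 3], (1.0, 3.0, 5.0))
except NotImplementedError as exc:
    print(exc)
```

In this toy version, supporting Bucket would mean registering a function for the matching type tuple, which mirrors what "not implemented for the pandas backend" means here.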

@cpcloud cpcloud added feature Features or general enhancements pandas The pandas backend and removed bug Incorrect behavior inside of ibis labels Dec 1, 2022
@MarcSkovMadsen
Author

MarcSkovMadsen commented Dec 1, 2022

Hi @cpcloud

I am new to the ibis universe. Ideally, I am looking for a way to calculate a histogram that works for any ibis backend. Our users would expect to be able to use any ibis backend to create a histogram if we announce that ibis is supported.

For reference, I am also hitting the problem described in #4940.

@MarcSkovMadsen
Author

If I try the code proposed in #4940, it also fails with the error message NotImplementedError: Could not find signature for execute_node: <Bucket, Series, tuple, bool, bool, bool>.

import ibis
import pandas as pd

df = pd.DataFrame({
    "y": [1, 2, 3, 4, 5]
})
con = ibis.pandas.connect({"df": df})
table = con.table("df")
expr = table.y

def to_histogram(expr):
    bins = [1.0, 3.0, 5.0]
    df = expr.to_projection()
    # Bucket the column, then count rows per bucket (approach from #4940).
    return df.mutate(bucket=df.y.bucket(bins)).bucket.value_counts().sort_by('bucket').execute()

to_histogram(expr)
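
As a point of reference for what the snippet above is meant to produce, here is a pure-Python sketch of the bucketing and counting, with no ibis involved. It assumes the documented defaults of bucket (left-closed bins, the last bin also closed on the right, out-of-range values mapped to null). The helper names bucket_index and histogram are hypothetical, and dropping out-of-range values from the counts is a choice made for this sketch.

```python
from bisect import bisect_right
from collections import Counter


def bucket_index(value, bins):
    """Return the 0-based bin index for value, or None if out of range.

    Assumed semantics: bins are left-closed, and the last bin is also
    closed on the right so that value == bins[-1] lands in the final bin.
    """
    if value < bins[0] or value > bins[-1]:
        return None
    if value == bins[-1]:
        return len(bins) - 2
    return bisect_right(bins, value) - 1


def histogram(values, bins):
    """Count values per bin index, sorted by bin index.

    Out-of-range values (index None) are dropped in this sketch.
    """
    counts = Counter(bucket_index(v, bins) for v in values)
    counts.pop(None, None)
    return sorted(counts.items())


print(histogram([1, 2, 3, 4, 5], [1.0, 3.0, 5.0]))  # [(0, 2), (1, 3)]
```

With bins [1.0, 3.0, 5.0], the values 1 and 2 fall in bin 0 and the values 3, 4 and 5 fall in bin 1.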

@gforsyth
Member

gforsyth commented Dec 1, 2022

Hey @MarcSkovMadsen -- the workaround in #4940 is specific to the (probable) bug in DuckDB -- it won't help with the lack of implementation in the pandas backend.

I understand wanting to offer users the ability to choose any backend so that whatever data the user brings, they can make use of holoviews.

In the case of pandas dataframes, you aren't limited to the pandas backend because there is an option to create a memtable, which loads an in-memory dataframe into duckdb.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame(
   ...:     {
   ...:         "value": [1, 2, 3, 4, 5],
   ...:         "group": ["a", "b", "b", "b", "a"],
   ...:     }
   ...: )

In [3]: import ibis

In [4]: con = ibis.memtable(df)

In [5]: con
Out[5]: 
PandasInMemoryTable
  data:
    DataFrameProxy:
         value group
      0      1     a
      1      2     b
      2      3     b
      3      4     b
      4      5     a

@MarcSkovMadsen
Author

Thanks so much for proposing solutions. The challenge is that the user is not providing a pandas DataFrame to hvPlot/HoloViews. They are just providing data, where data is an ibis Table. Right now we are testing against these three backends:

[image: the three backends currently under test]

Now I will add

[image: the additional fixture]

to the list of fixtures. There are many other combinations that I should add, because the user might be using any backend that ibis supports.

@gforsyth
Member

gforsyth commented Dec 2, 2022

Ok, so given that, I think that you can probably "just" offer duckdb and sqlite to start.
A user might have data in an existing sqlite db, or an existing duckdb file, but there isn't a pandas representation on disk, and any stored data they might want to load (like a csv or a parquet dataset) can be loaded in duckdb instead.

Short of the situation where you have existing data in a specific on-disk format, I don't think there's any benefit to using a local backend that isn't duckdb -- it's far and away the most performant (of the three above) and there are fast paths from DuckDB to arrow.
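
One way a downstream library could act on this advice is to check the backend name up front and fail with a clear message for anything it does not test against. The sketch below is hypothetical: SUPPORTED_BACKENDS and check_backend are made-up names, and how the backend name is obtained from an ibis expression is left out because it depends on the ibis version.

```python
# Hypothetical allow-list of backends covered by the test fixtures.
SUPPORTED_BACKENDS = frozenset({"duckdb", "sqlite"})


def check_backend(backend_name):
    """Return backend_name if it is supported, else raise a descriptive error."""
    if backend_name not in SUPPORTED_BACKENDS:
        supported = ", ".join(sorted(SUPPORTED_BACKENDS))
        raise NotImplementedError(
            f"Histograms are not supported for the {backend_name!r} backend; "
            f"supported backends: {supported}."
        )
    return backend_name


check_backend("duckdb")  # passes silently
try:
    check_backend("pandas")
except NotImplementedError as exc:
    print(exc)
```

Failing early this way gives users a readable message instead of a dispatch error from deep inside the backend.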

@cpcloud
Member

cpcloud commented Jan 30, 2023

Closing this out for now. We would happily accept a PR for the pandas backend to implement this!

@cpcloud cpcloud closed this as completed Jan 30, 2023