Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup how args and kwargs are passed in _fast_slow_function_call #16266

Open
wants to merge 2 commits into
base: branch-24.12
Choose a base branch
from

Conversation

Matt711
Copy link
Contributor

@Matt711 Matt711 commented Jul 12, 2024

Description

I think kwargs are being subsumed by args in _fast_slow_function_call when they shouldn't.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@Matt711 Matt711 added bug Something isn't working non-breaking Non-breaking change cudf.pandas Issues specific to cudf.pandas labels Jul 12, 2024
@Matt711 Matt711 added this to the Proxying - cudf.pandas milestone Jul 12, 2024
@Matt711 Matt711 self-assigned this Jul 12, 2024
@github-actions github-actions bot added the Python Affects Python cuDF API. label Jul 12, 2024
@Matt711 Matt711 marked this pull request as ready for review July 12, 2024 02:50
@Matt711 Matt711 requested a review from a team as a code owner July 12, 2024 02:50
Copy link
Contributor

@galipremsagar galipremsagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Matt711 Could you explain what the issue is your are seeing?

@Matt711
Copy link
Contributor Author

Matt711 commented Jul 12, 2024

@Matt711 Could you explain what the issue is your are seeing?

In _fast_slow_function_call the kwargs are being included in args. I think this makes it so that kwargs (an therefore slow_kwargs and fast_kwargs) are always empty.

@mroeschke
Copy link
Contributor

It's still a little unclear how this bug would manifest. For example, this still raises

In [1]: %load_ext cudf.pandas

In [2]: import pandas as pd

In [3]: pd.Timestamp(2020, 1, 1, year=2021)
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:896, in _fast_slow_function_call(func, *args, **kwargs)
    891 with nvtx.annotate(
    892     "EXECUTE_FAST",
    893     color=_CUDF_PANDAS_NVTX_COLORS["EXECUTE_FAST"],
    894     domain="cudf_pandas",
    895 ):
--> 896     fast_args, fast_kwargs = _fast_arg(args), _fast_arg(kwargs)
    897     result = func(*fast_args, **fast_kwargs)

File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:1046, in _fast_arg(arg)
   1045 seen: set[int] = set()
-> 1046 return _transform_arg(arg, "_fsproxy_fast", seen)

File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:973, in _transform_arg(arg, attribute_name, seen)
    971 if type(arg) is tuple:
    972     # Must come first to avoid infinite recursion
--> 973     return tuple(_transform_arg(a, attribute_name, seen) for a in arg)
    974 elif hasattr(arg, "__getnewargs_ex__"):
    975     # Partial implementation of to reconstruct with
    976     # transformed pieces
    977     # This handles scipy._lib._bunch._make_tuple_bunch

File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:973, in <genexpr>(.0)
    971 if type(arg) is tuple:
    972     # Must come first to avoid infinite recursion
--> 973     return tuple(_transform_arg(a, attribute_name, seen) for a in arg)
    974 elif hasattr(arg, "__getnewargs_ex__"):
    975     # Partial implementation of to reconstruct with
    976     # transformed pieces
    977     # This handles scipy._lib._bunch._make_tuple_bunch

File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:958, in _transform_arg(arg, attribute_name, seen)
    957 if typ is _Unusable:
--> 958     raise Exception("Cannot transform _Unusable")
    959 return typ

Exception: Cannot transform _Unusable

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 pd.Timestamp(2020, 1, 1, year=2021)

File ~/mroeschke-cudf/python/cudf/cudf/pandas/_wrappers/pandas.py:131, in Timestamp_Timedelta__new__(cls, *args, **kwargs)
    124 def Timestamp_Timedelta__new__(cls, *args, **kwargs):
    125     # Call fast/slow constructor
    126     # This takes care of running __init__ as well, but must be paired
   (...)
    129     # Timestamp & Timedelta don't always return same types as self,
    130     # hence this method is needed.
--> 131     self, _ = _fast_slow_function_call(
    132         lambda cls, args, kwargs: cls(*args, **kwargs),
    133         cls,
    134         args,
    135         kwargs,
    136     )
    137     return self

File ~/mroeschke-cudf/python/cudf/cudf/pandas/fast_slow_proxy.py:941, in _fast_slow_function_call(func, *args, **kwargs)
    939         slow_args, slow_kwargs = _slow_arg(args), _slow_arg(kwargs)
    940         with disable_module_accelerator():
--> 941             result = func(*slow_args, **slow_kwargs)
    942 return _maybe_wrap_result(result, func, *args, **kwargs), fast

File ~/mroeschke-cudf/python/cudf/cudf/pandas/_wrappers/pandas.py:132, in Timestamp_Timedelta__new__.<locals>.<lambda>(cls, args, kwargs)
    124 def Timestamp_Timedelta__new__(cls, *args, **kwargs):
    125     # Call fast/slow constructor
    126     # This takes care of running __init__ as well, but must be paired
   (...)
    129     # Timestamp & Timedelta don't always return same types as self,
    130     # hence this method is needed.
    131     self, _ = _fast_slow_function_call(
--> 132         lambda cls, args, kwargs: cls(*args, **kwargs),
    133         cls,
    134         args,
    135         kwargs,
    136     )
    137     return self

File timestamps.pyx:1725, in pandas._libs.tslibs.timestamps.Timestamp.__new__()

TypeError: __new__() got multiple values for keyword argument 'year'

@lithomas1 lithomas1 removed their request for review July 17, 2024 18:44
@Matt711
Copy link
Contributor Author

Matt711 commented Jul 18, 2024

pd.Timestamp(2020, 1, 1, year=2021)

Yeah I would expect that example to still raise. This PR just addresses the fact that kwargs are always empty.

@galipremsagar
Copy link
Contributor

pd.Timestamp(2020, 1, 1, year=2021)

Yeah I would expect that example to still raise. This PR just addresses the fact that kwargs are always empty.

@Matt711 Could you please add a couple of tests that are currently failing and pass with this fix?

@Matt711
Copy link
Contributor Author

Matt711 commented Jul 18, 2024

@Matt711 Could you please add a couple of tests that are currently failing and pass with this fix?

I don't think there's a good way to test this PR because all it does is change how we call _fast_slow_function_call

  1. _fast_slow_function_call(func, args, kwargs)
  2. _fast_slow_function_call(func, *args, **kwargs)

With the second option, we unpack the args and kwargs so that kwargs are not always empty (so this is more of a general python thing).

I labeled this a bug but there's nothing really incorrect here in the sense of an incorrect result from _fast_slow_function_call(I can edit the description). I think it makes more sense to unpack the args and kwargs, so that kwargs aren't included in args. If we don't, kwargs and therefore fast_kwargs and slow_kwargs will always be empty. So _slow_arg(kwargs) is never doing anything meaningful. What do you think?

My apologies for any confusion.
cc. @mroeschke

@mroeschke
Copy link
Contributor

mroeschke commented Jul 18, 2024

If there's not strict bug here, IMO I would just treat this as a "cleanup" and just remove **kwargs from _fast_slow_function_call if we never pass kwargs to it internally. Also minimizes a chance of another bug cropping up if splatting the kwargs has some unintended consequence we don't know about.

@vyasr
Copy link
Contributor

vyasr commented Jul 19, 2024

I agree with @mroeschke. No need to label this a bug, but let's simplify the API to reduce the possibility of error for devs in the future.

@Matt711 Matt711 changed the title [BUG]: Fix how args and kwargs are passed in _fast_slow_function_call Cleanup how args and kwargs are passed in _fast_slow_function_call Jul 24, 2024
_slow_arg(args),
_slow_arg(kwargs),
)
slow_args = (_slow_arg(args),)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
slow_args = (_slow_arg(args),)
slow_args = _slow_arg(args)

@Matt711 Matt711 changed the base branch from branch-24.08 to branch-24.10 July 24, 2024 23:42
@@ -874,7 +874,6 @@ def _fast_slow_function_call(
func: Callable,
/,
*args,
**kwargs,
) -> Any:
"""
Call `func` with all `args` and `kwargs` converted to their
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update this docstring to note that args contains func's args and kwargs?

@vyasr
Copy link
Contributor

vyasr commented Aug 16, 2024

Looks like this needs some conflict resolution, and there are some discussions to be addressed @Matt711.

@Matt711
Copy link
Contributor Author

Matt711 commented Sep 24, 2024

TODO: Check if this improvement PR is still relevant.

@Matt711 Matt711 changed the base branch from branch-24.10 to branch-24.12 September 24, 2024 03:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cudf.pandas Issues specific to cudf.pandas non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

4 participants