-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix dask.dataframe
import error for Python 3.11.9
#11035
Conversation
Is this only on 3.11.9? |
Seems to be the case. We have had CI failures in RAPIDS since 3.11.9 was released yesterday. Using an earlier version of CPython works fine. |
That's interesting and weird that this is included in a bugfix release... |
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 15 files ±0 15 suites ±0 3h 21m 44s ⏱️ + 2m 43s For more details on these failures, see this check. Results for commit b6b01ad. ± Comparison against base commit 9246df0. ♻️ This comment has been updated with latest results. |
I think this was a bad interaction between a fix to correctly traverse Perhaps this is a more principled fix? diff --git a/dask/dataframe/accessor.py b/dask/dataframe/accessor.py
index 9a73eb6c..0bb10e98 100644
--- a/dask/dataframe/accessor.py
+++ b/dask/dataframe/accessor.py
@@ -1,5 +1,6 @@
from __future__ import annotations
+import functools
import warnings
import numpy as np
@@ -28,8 +29,15 @@ def _bind_property(cls, pd_cls, attr, min_version=None):
func.__name__ = attr
func.__qualname__ = f"{cls.__name__}.{attr}"
+ original_prop = getattr(pd_cls, attr)
+ if isinstance(original_prop, property):
+ method = original_prop.fget
+ elif isinstance(original_prop, functools.cached_property):
+ method = original_prop.func
+ else:
+ raise TypeError("bind_property expects original class to provide a property")
try:
- func.__wrapped__ = getattr(pd_cls, attr)
+ func.__wrapped__ = method
except Exception:
pass
setattr(cls, attr, property(derived_from(pd_cls, version=min_version)(func))) |
Relevant cpython change: python/cpython#116197 (backing this out removes the error) |
Okay, that fix seems intuitive to me and works locally. Do you have thoughts @phofl ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
This PR makes a number of significant changes to the patching infrastructure: 1. This PR reorganizes the patching logic to be based on a more principled approach. Rather than maintaining lists of patch functions that are each responsible for filtering modules to apply themselves to, patches are organized in the patches directory in a tree structure matching dask itself. Patches are found and run by importing the same relative paths within the `patches` directory corresponding to a particular dask or distributed module. 2. It adds proper support for patching submodules. Previously the loader was being disabled whenever a real dask module was being imported, but this is problematic because if some dask modules import others they will pre-populate `sys.modules` with the real modules and therefore the loader will never be used for loading a patched version of the submodule. 3. Patches are no longer just functions applied to modules, they are arbitrary functions executed when a module is imported. As a result, a wider range of modifications is possible than was previously allowed. In particular: 4. The more general functions allow the entire module import to be hijacked and redirected to other modules. 5. The new framework is used to vendor a patched version of the accessor.py module in dask, which resolves the issues observed in dask/dask#11035. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Richard (Rick) Zamora (https://github.com/rjzamora) URL: #39
This PR makes a number of significant changes to the patching infrastructure: 1. This PR reorganizes the patching logic to be based on a more principled approach. Rather than maintaining lists of patch functions that are each responsible for filtering modules to apply themselves to, patches are organized in the patches directory in a tree structure matching dask itself. Patches are found and run by importing the same relative paths within the `patches` directory corresponding to a particular dask or distributed module. 2. It adds proper support for patching submodules. Previously the loader was being disabled whenever a real dask module was being imported, but this is problematic because if some dask modules import others they will pre-populate `sys.modules` with the real modules and therefore the loader will never be used for loading a patched version of the submodule. 3. Patches are no longer just functions applied to modules, they are arbitrary functions executed when a module is imported. As a result, a wider range of modifications is possible than was previously allowed. In particular: 4. The more general functions allow the entire module import to be hijacked and redirected to other modules. 5. The new framework is used to vendor a patched version of the accessor.py module in dask, which resolves the issues observed in dask/dask#11035. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Richard (Rick) Zamora (https://github.com/rjzamora) URL: rapidsai#39
The change in dask#11035 to adapt _bind_property to Python 3.11.9 moved attribute access on the pandas class object around and therefore dropped exception handling. While it is not the case that we currently wrap attributes that don't exist, let's avoid an accidental breakage in the future by reinstating the old exception handling behaviour.
Using the 4/2/2024 release of Python 3.11.9, calling
import dask.dataframe
producesTypeError
when dask tries to "bind" docstrings for theDatetimeAccessor
properties. The problem is that 3.11.9 will throw a newTypeError
when you try to inspect the arguments of a property. This PR updates thetry
/except
block to includeTypeError
.Error without this fix