MAINT: Rewrite can-cast logic in terms of NEP 42 #17401
OK, I realize that 4000 lines of new code is too big, but we have to start somewhere on this. Now that it is working as a drop in replacement for I don't like merging almost unused code, but one thing I could do is create a PR which only adds So the question is what do you think, the only serious thing (aside from being a huge chunk of code) is probably the comment I put above around
Would it be possible to get some feedback on |
I probably will need gh-17706 to fix all the test failures (some new tests create problems). I removed the draft status, just in case that put anyone off from looking at this. This PR includes super important new infrastructure for new DTypes, and I really need some review; without it, any improvements here are mainly randomly kicking things and not too helpful. But to be clear: some followup is probably necessary, but overall it should be far enough along that most of that followup can happen later. With two asides:
I did a high-level look through. I didn't see a way to break this into smaller PRs, except maybe the resolvers (which is not much code). On the other hand, being able to toggle this on and off will be helpful.
I think we should aim for making sure the API (and NEP 43) is correct since that will be harder to change in future releases, and merge this so it can make it into the 1.20 release. There should be more documentation, or places where this points to parts of NEP 43.
numpy/core/setup.py
@@ -23,6 +23,11 @@
NPY_RELAXED_STRIDES_DEBUG = (os.environ.get('NPY_RELAXED_STRIDES_DEBUG', "0") != "0")
NPY_RELAXED_STRIDES_DEBUG = NPY_RELAXED_STRIDES_DEBUG and NPY_RELAXED_STRIDES_CHECKING

# Set to True to use the new casting implementation as much as implemented.
# This allows running the full test suit and testing with the new
# implementation. By default, use the new implementation only in release mode.
In the changelog entry you say something a little different:
- # implementation. By default, use the new implementation only in release mode.
+ # implementation. By default, this is None for this release of NumPy
numpy/core/setup.py
@@ -468,6 +473,11 @@ def generate_config_h(ext, build_dir):
    if NPY_RELAXED_STRIDES_DEBUG:
        moredefs.append(('NPY_RELAXED_STRIDES_DEBUG', 1))

    # Use the new experimental casting implementation in NumPy 1.20:
    if NPY_USE_NEW_CASTINGIMPL != "0" or (
            NPY_USE_NEW_CASTINGIMPL is None and not is_released(config)):
The dependence on is_released is confusing. There should be only one way to turn this on and off.
- NPY_USE_NEW_CASTINGIMPL is None and not is_released(config)):
+ NPY_USE_NEW_CASTINGIMPL is None:
OK, I will just set it to always off for now; we can just as well switch it to always on after branching 1.20. Hopefully we will delete the whole old branches fairly soon in any case.
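The "only one way to turn this on and off" request can be sketched as a single environment lookup, following the same pattern the quoted setup.py already uses for NPY_RELAXED_STRIDES_DEBUG. This is a minimal sketch, not the code that was merged: the helper name `use_new_casting_impl` and its `environ` parameter are hypothetical, and the default of "0" (off) is taken from the reply above.

```python
import os


def use_new_casting_impl(environ=None):
    """Hypothetical helper: decide the build flag from one environment variable.

    Unset or "0" means off; any other value means on. No second condition
    (such as a release check) takes part in the decision.
    """
    if environ is None:
        environ = os.environ
    return environ.get("NPY_USE_NEW_CASTINGIMPL", "0") != "0"


if __name__ == "__main__":
    print("new casting implementation enabled:", use_new_casting_impl())
```

Passing a plain dict as `environ` makes the switch easy to exercise in a test without touching the real environment.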
}

/* We find the common dtype of all inputs, and use it for the blanks */
assert(nin > 0); /* this function is not used */
better to raise an error than to crash-only-on-debug
Made it an error. It should be rejected at registration time (in the FromSpec function); otherwise it indicates a misuse where a DType was defined but is then missing from the context (if a user defined an ArrayMethod with all of these dtypes, they must also be passed into the context).
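The same trade-off exists in Python, where `assert` statements are removed when the interpreter runs with `-O`, much as C's `assert` disappears when NDEBUG is defined. The following is only an analogy to the C change discussed above; the function name `require_inputs` and the error message are hypothetical.

```python
def require_inputs(nin):
    """Hypothetical check: fail loudly in every build mode, not only in debug.

    An `assert nin > 0` here would vanish under `python -O`, so the explicit
    check keeps the failure visible to the caller.
    """
    if nin <= 0:
        raise ValueError("at least one input is required to find a common dtype")
    return nin


if __name__ == "__main__":
    try:
        require_inputs(0)
    except ValueError as exc:
        print("rejected:", exc)
```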
/*
 * Describes casting within datetimes or timedelta
 */
This is the heart of why the new design is so much better. Nice.
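For readers less familiar with why datetime casting needs descriptor-level logic: whether a cast is safe depends on the unit carried by each descriptor, not only on the DType. A small sketch of the user-visible behaviour through the public `np.can_cast` function (the variable names are illustrative only, and this shows the Python-level result rather than the C resolver in this PR):

```python
import numpy as np

seconds = np.dtype("M8[s]")
millis = np.dtype("M8[ms]")

# Coarser to finer unit: allowed under "safe" casting.
coarse_to_fine_safe = np.can_cast(seconds, millis, casting="safe")

# Finer to coarser unit: precision may be lost, so "safe" refuses it,
# while "same_kind" still allows it.
fine_to_coarse_safe = np.can_cast(millis, seconds, casting="safe")
fine_to_coarse_same_kind = np.can_cast(millis, seconds, casting="same_kind")

print(coarse_to_fine_safe, fine_to_coarse_safe, fine_to_coarse_same_kind)
```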
I had to squash everything, so the history is lost for now (I have a backup). The As a reminder, the only actual new public API is the change to
Tests are all passing now. The doctest failure is real; it is because my code rejects Other than that, I currently only expect to add one or two tests based on what codecov says (although the bad coverage is largely due to untestable error code and the legacy code that is simply never used).
doctest is failing
@mattip yes, that is intentional right now, because I wanted the full test suite to run with the changes. But there is this one small change (e.g. code coverage). But will change it later hopefully (or as soon as you show intention of no/few further fixups). EDIT: Based on codecov, I also wanted to add 1-2 tests, although a lot of what it complains about seems to be hard-to-trigger error paths.
Not quite ready yet?
Was just doing the last update, then run through tests and change the flag. Will finish tonight.
Casting from object uses inspection logic, so doesn't actually end up in this path, and thus will not (arguably incorrectly) reuse the itemsize of the object dtype in any case.
Let's defer further touch-ups to later... One more run, since the last one errored (hopefully due to an old failure not merged correctly)
OK, should have flipped the switch back and one random azure run hopefully including it. (So I think it should be OK to go in, I am sure there will be smaller changes, but those might as well happen later)
OK, tests are fine. Please ignore the coverage, I manually looked that it was pretty good (aside from some of the code in
@mattip Ready?
I guess we should put this in, even though the chances it gets used during the 1.20 release cycle are slim. I am still not 100% happy with the API but as @seberg says it is all internal so we are free to modify it as we go.
PyArray_DTypeMeta **dtypes;
/* Operand descriptors, filled in by adjust_desciptors */
PyArray_Descr **descriptors;
It would be nice to give dtypes and descriptors less generic names. It still bothers me that both are input when really only one or the other is needed. In any case, in NEP 43, the dtypes field is capitalized Dtypes.
Oh, I missed changing it in this PR already, the intention is to modify this to descriptors now, since it is not passed to resolve_descriptors.
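As background for the two fields discussed here: PyArray_DTypeMeta entries are DType classes, while PyArray_Descr entries are concrete descriptor instances. The difference can be seen from Python in a minimal sketch, assuming NumPy 1.20 or later (where `type(np.dtype(...))` is a per-dtype class); the variable names are illustrative only.

```python
import numpy as np

desc_s = np.dtype("M8[s]")    # a descriptor (instance) with unit seconds
desc_ms = np.dtype("M8[ms]")  # a descriptor (instance) with unit milliseconds

# Both descriptors belong to the same DType class ...
same_class = type(desc_s) is type(desc_ms)

# ... but they are different descriptors, because the unit differs.
same_descriptor = desc_s == desc_ms

print(type(desc_s), same_class, same_descriptor)
```

The class alone is enough to pick a method, while the instances are needed whenever parameters such as the datetime unit matter.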
OK, in it goes. If nothing else, later patches should be smaller...
Thanks Sebastian.
Yes, there will be API wiggling needed before exposure... But, on the up-side it is probably easier to get a feel for those things once the next step (and maybe the ufunc changes) are in the pipeline.
Thanks Matti! I know this is tough to move forward, and a large tricky project.
This PR is pretty big, we could probably split up some parts of it, e.g. the loops are not always used (although in large part tested if they are implemented)
The goal for this PR is currently to define all casts using the new machinery and add fairly thorough tests. Then we can put this in as functionality that is de-facto unused, but tested. A followup will then:
np.can_cast
Both of which should be limited changes after this is done (at least if we defer some optimization), but do change very central code in NumPy, so in the last dev meeting the preliminary plan was that we may defer changing this after the 1.20 release.
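For reference, this is the public behaviour of np.can_cast that the followup has to preserve. The sketch below only exercises the existing Python API with a few well-known cases; it does not show the new C machinery, and the variable names are illustrative only.

```python
import numpy as np

# Default casting rule is "safe": widening an integer is allowed.
int_widening = np.can_cast(np.int32, np.int64)

# float64 -> int32 can lose information, so "safe" refuses it.
float_to_int_safe = np.can_cast(np.float64, np.int32)

# "same_kind" permits downcasting within a kind (float64 -> float32).
float_downcast_same_kind = np.can_cast(np.float64, np.float32, casting="same_kind")

# "unsafe" permits any data conversion.
float_to_int_unsafe = np.can_cast(np.float64, np.int32, casting="unsafe")

print(int_widening, float_to_int_safe, float_downcast_same_kind, float_to_int_unsafe)
```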
There is a lot to dissect in this PR. The basic design is that everything is stored on ArrayMethod objects (much like a ufunc loop+dtype resolver).
Some aspects to draw attention to (although some of these should be clarified in NEP 43):
np.can_cast (only for scalars probably).
move_references flag for handling references together with buffers. Right now, we can just ignore that (keep the "flag" around), but when making this public we may have to think about it, e.g. add additional flags to the ArrayMethod to signal that it can move references.
Checklist before merging: