gh-125916: Allow functools.reduce 'initial' to be a keyword argument #125917
Please also benchmark your implementation.
Kwargs handling will affect performance even if the keyword is not actually used (e.g. calls like reduce(f, seq, init)). IIUC, PyArg_ParseTupleAndKeywords is much slower in general than PyArg_UnpackTuple.
CC @sobolevn, as you have added AC comment. |
Co-authored-by: Sergey B Kirpichev <[email protected]>
Now with AC (patch2).
Patch with AC (a draft).
diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c
index 802b1cf792..8faa8ad1ac 100644
--- a/Modules/_functoolsmodule.c
+++ b/Modules/_functoolsmodule.c
@@ -932,15 +932,30 @@ _functools_cmp_to_key_impl(PyObject *module, PyObject *mycmp)
/* reduce (used to be a builtin) ********************************************/
-// Not converted to argument clinic, because of `args` in-place modification.
-// AC will affect performance.
+/*[clinic input]
+_functools.reduce
+
+ function as func: object
+ iterable as seq: object
+ /
+ initial as result: object(c_default="NULL") = None
+
+Apply a function of two arguments cumulatively.
+
+Apply it to the items of a sequence or iterable, from left to right, so as to
+reduce the iterable to a single value. For example, reduce(lambda x, y: x+y,
+[1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). If initial is present, it is
+placed before the items of the iterable in the calculation, and serves as a
+default when the iterable is empty.
+[clinic start generated code]*/
+
static PyObject *
-functools_reduce(PyObject *self, PyObject *args)
+_functools_reduce_impl(PyObject *module, PyObject *func, PyObject *seq,
+ PyObject *result)
+/*[clinic end generated code: output=30d898fe1267c79d input=b7082b8b1473fdc2]*/
{
- PyObject *seq, *func, *result = NULL, *it;
+ PyObject *args, *it;
- if (!PyArg_UnpackTuple(args, "reduce", 2, 3, &func, &seq, &result))
- return NULL;
if (result != NULL)
Py_INCREF(result);
@@ -1006,16 +1021,6 @@ functools_reduce(PyObject *self, PyObject *args)
return NULL;
}
-PyDoc_STRVAR(functools_reduce_doc,
-"reduce(function, iterable[, initial], /) -> value\n\
-\n\
-Apply a function of two arguments cumulatively to the items of a sequence\n\
-or iterable, from left to right, so as to reduce the iterable to a single\n\
-value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates\n\
-((((1+2)+3)+4)+5). If initial is present, it is placed before the items\n\
-of the iterable in the calculation, and serves as a default when the\n\
-iterable is empty.");
-
/* lru_cache object **********************************************************/
/* There are four principal algorithmic differences from the pure python version:
@@ -1720,7 +1725,7 @@ PyDoc_STRVAR(_functools_doc,
"Tools that operate on functions.");
static PyMethodDef _functools_methods[] = {
- {"reduce", functools_reduce, METH_VARARGS, functools_reduce_doc},
+ _FUNCTOOLS_REDUCE_METHODDEF
_FUNCTOOLS_CMP_TO_KEY_METHODDEF
{NULL, NULL} /* sentinel */
};
You should run
I did some benchmarks.
# a.py
import pyperf
from functools import reduce
f = lambda x, y: x + y
lst = list(range(10))
init = 123
runner = pyperf.Runner()
runner.bench_func('reduce(f, lst)', reduce, f, lst)
runner.bench_func('reduce(f, lst, init)', reduce, f, lst, init)
Run e.g. with:
with results:
It looks like the patch with AC is even slightly faster than main. |
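For readers without pyperf installed, a rough standard-library variant of the same comparison could look like the sketch below. This is not the script used in this thread: time_reduce is a hypothetical helper, and the lambda wrappers add a constant per-call overhead, so only the relative difference between two builds is meaningful.

```python
import timeit
from functools import reduce


def add(x, y):
    return x + y


def time_reduce(number=10_000, repeat=5):
    """Return the best observed seconds-per-call for both call shapes."""
    lst = list(range(10))
    init = 123
    cases = {
        "reduce(f, lst)": lambda: reduce(add, lst),
        "reduce(f, lst, init)": lambda: reduce(add, lst, init),
    }
    return {
        # min() of several repeats is the least noisy estimate timeit offers.
        label: min(timeit.repeat(call, number=number, repeat=repeat)) / number
        for label, call in cases.items()
    }


if __name__ == "__main__":
    for label, seconds in time_reduce().items():
        print(f"{label}: {seconds * 1e9:.0f} ns per call")
```

Running the same file once against each interpreter build gives numbers that can be compared side by side; pyperf remains the better tool for anything that needs statistical rigor.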
@skirpichev Is |
Thanks for the PR! I think that if we want to add keyword support for functools.reduce, it should be done for all parameters, not just initial. If so, this would match the behavior of the pure Python version of functools.
No, it doesn't. Someone can use |
Taken from patch by Sergey B Kirpichev <[email protected]>
That might slow down patch v2.
That was a draft ;) I think we could use the same trick as for the Python version. BTW, it seems that PEP 661 doesn't cover this at all.
Edit: Updated AC patch with a sentinel value.
diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c
index 802b1cf792..00b4a5e6cc 100644
--- a/Modules/_functoolsmodule.c
+++ b/Modules/_functoolsmodule.c
@@ -932,15 +932,31 @@ _functools_cmp_to_key_impl(PyObject *module, PyObject *mycmp)
/* reduce (used to be a builtin) ********************************************/
-// Not converted to argument clinic, because of `args` in-place modification.
-// AC will affect performance.
+/*[clinic input]
+_functools.reduce
+
+ function as func: object
+ iterable as seq: object
+ /
+ initial as result: object(c_default="NULL") = _functools._initial_missing
+
+Apply a function of two arguments cumulatively to an iterable, from left to right.
+
+This efficiently reduces the iterable to a single value. If initial is present,
+it is placed before the items of the iterable in the calculation, and serves as
+a default when the iterable is empty.
+
+For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
+calculates ((((1+2)+3)+4)+5).
+[clinic start generated code]*/
+
static PyObject *
-functools_reduce(PyObject *self, PyObject *args)
+_functools_reduce_impl(PyObject *module, PyObject *func, PyObject *seq,
+ PyObject *result)
+/*[clinic end generated code: output=30d898fe1267c79d input=40be8069bcbc1a75]*/
{
- PyObject *seq, *func, *result = NULL, *it;
+ PyObject *args, *it;
- if (!PyArg_UnpackTuple(args, "reduce", 2, 3, &func, &seq, &result))
- return NULL;
if (result != NULL)
Py_INCREF(result);
@@ -1006,16 +1022,6 @@ functools_reduce(PyObject *self, PyObject *args)
return NULL;
}
-PyDoc_STRVAR(functools_reduce_doc,
-"reduce(function, iterable[, initial], /) -> value\n\
-\n\
-Apply a function of two arguments cumulatively to the items of a sequence\n\
-or iterable, from left to right, so as to reduce the iterable to a single\n\
-value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates\n\
-((((1+2)+3)+4)+5). If initial is present, it is placed before the items\n\
-of the iterable in the calculation, and serves as a default when the\n\
-iterable is empty.");
-
/* lru_cache object **********************************************************/
/* There are four principal algorithmic differences from the pure python version:
@@ -1720,7 +1726,7 @@ PyDoc_STRVAR(_functools_doc,
"Tools that operate on functions.");
static PyMethodDef _functools_methods[] = {
- {"reduce", functools_reduce, METH_VARARGS, functools_reduce_doc},
+ _FUNCTOOLS_REDUCE_METHODDEF
_FUNCTOOLS_CMP_TO_KEY_METHODDEF
{NULL, NULL} /* sentinel */
};
@@ -1789,6 +1795,10 @@ _functools_exec(PyObject *module)
// lru_list_elem is used only in _lru_cache_wrapper.
// So we don't expose it in module namespace.
+ if (PyModule_Add(module, "_initial_missing", _PyObject_New(&PyBaseObject_Type)) < 0) {
+ return -1;
+ }
+
return 0;
}
|
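The sentinel trick referred to above for the pure-Python version can be sketched roughly as follows. This is a simplified illustration, not the actual Lib/functools.py source: reduce_sketch is a hypothetical name, and only the _initial_missing identifier is taken from the thread.

```python
# Private marker object: no caller can pass it by accident, so "argument
# omitted" stays distinguishable from any real value, including None.
_initial_missing = object()


def reduce_sketch(function, iterable, /, initial=_initial_missing):
    it = iter(iterable)
    if initial is _initial_missing:
        # No initial value given: seed the accumulator with the first item.
        try:
            value = next(it)
        except StopIteration:
            raise TypeError(
                "reduce() of empty iterable with no initial value"
            ) from None
    else:
        value = initial
    for element in it:
        value = function(value, element)
    return value
```

With this shape, reduce_sketch(f, [], initial=None) returns None instead of raising, which is why None itself cannot serve as the "missing" marker.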
@skirpichev should the default be specified at all? I think EDIT: Ah scratch that. That makes |
No. The current code in main can more accurately be described as a function with multiple signatures. Funny notation
AC can't represent multiple signatures yet. The only way is to use a sentinel value |
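One way to picture "a function with multiple signatures" on the Python side is typing.overload, which is how type stubs usually describe such functions. The sketch below is an illustration only: reduce_overloaded and _MISSING are hypothetical names, and the annotations are an assumption rather than a copy of any existing stub.

```python
import functools
from typing import Callable, Iterable, TypeVar, overload

_T = TypeVar("_T")
_S = TypeVar("_S")
_MISSING = object()  # hypothetical private sentinel for the runtime body


@overload
def reduce_overloaded(
    function: Callable[[_T, _T], _T], iterable: Iterable[_T], /
) -> _T: ...
@overload
def reduce_overloaded(
    function: Callable[[_T, _S], _T], iterable: Iterable[_S], /, initial: _T
) -> _T: ...
def reduce_overloaded(function, iterable, /, initial=_MISSING):
    # A single runtime body backs both declared signatures.
    if initial is _MISSING:
        return functools.reduce(function, iterable)
    return functools.reduce(function, iterable, initial)
```

Type checkers see two clean signatures with no default shown, while the runtime still needs the sentinel to tell the two call shapes apart.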
Lib/functools.py
@@ -236,7 +236,7 @@ def __ge__(self, other):

def reduce(function, sequence, initial=_initial_missing):
"""
- reduce(function, iterable[, initial], /) -> value
+ reduce(function, iterable, /, initial=None) -> value
Maybe use ellipsis:
- reduce(function, iterable, /, initial=None) -> value
+ reduce(function, iterable, /, initial=...) -> value
See PEP 661:)
But the sentinel is private and doesn't even exist in the C implementation. Ellipsis is frequently used for unspecified default values in typeshed. We could use multiple signatures though.
> But the sentinel is private and doesn't even exist in the C implementation.
It's easy to add, see #125917 (comment)
> Ellipsis is frequently used for unspecified default values in typeshed.
> We could use multiple signatures though.
Yes, I think it's fine for the Sphinx docs. But help will look like this (as for the pure-Python version):
>>> help(functools.reduce)
Help on built-in function reduce in module _functools:
reduce(function, iterable, /,
initial=_functools._initial_missing)
Apply a function of two arguments cumulatively to an iterable, from left to right.
[...]
I don't think I've ever seen =... in the docs. Do we have precedent for that?
It seems like the signature is giving inspect a hard time. But it is autogenerated by AC. Did I do something wrong?
reduce(function, iterable, /,
initial=_functools._initial_missing)
> But the sentinel is private and doesn't even exist in the C implementation. Ellipsis is frequently used for unspecified default values in typeshed. We could use multiple signatures though.
Multiple signatures for the docs sounds like a good solution.
Using ... for default values is essentially the same as using None, and it's just wrong since users can pass ... as the initial value.
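A short check makes the point concrete: both None and ... are ordinary objects that reduce() accepts as an initial value today, so neither can double as the "not provided" marker. The initial value is passed positionally here so the snippet also runs on versions without the keyword; keep_left is a hypothetical helper name.

```python
from functools import reduce


def keep_left(acc, _item):
    """Reducer that ignores every item and returns the accumulator unchanged."""
    return acc


# With an empty iterable, reduce() returns the initial value untouched.
result_none = reduce(keep_left, [], None)
result_ellipsis = reduce(keep_left, [], ...)

# With a non-empty iterable the initial value still seeds the accumulator.
result_seeded = reduce(keep_left, [1, 2, 3], ...)
```

All three results are the objects that were passed in, which a signature advertising initial=None or initial=... would misleadingly present as "no initial value".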
> I don't think I've ever seen =... in the docs. Do we have precedent for that?
Yeah, e.g. for int.from_bytes.
> It seems like the signature is giving inspect a hard time. But it is autogenerated by AC. Did I do something wrong?
First, note that reduce() has no correct signature in the current main.
Now AC adds one, but it can't be parsed by inspect._signature_fromstr(): this helper has its own opinion on what can be specified as a default value (e.g. it can't be a complex number).
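Whether a particular build exposes a parsable signature for reduce() can be checked with the public inspect.signature() API, which raises ValueError when no signature can be provided for a callable. The helper below is a hypothetical convenience wrapper, and what it reports for functools.reduce depends on the interpreter version and on which patch is applied.

```python
import functools
import inspect


def describe_signature(func):
    """Return the signature as text, or a short note if inspect cannot provide one."""
    try:
        return str(inspect.signature(func))
    except ValueError as exc:
        return f"<no signature: {exc}>"


def example(a, /, b=1):
    """Plain Python function used as a control case."""


if __name__ == "__main__":
    print("example:", describe_signature(example))
    print("functools.reduce:", describe_signature(functools.reduce))
```

The control case always yields a signature; the reduce() line is the one to watch when comparing main against a patched build.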
* Apply patch by Sergey B Kirpichev <[email protected]> - fix typo * Update docs
Co-authored-by: Peter Bierma <[email protected]>
Co-authored-by: Sergey B Kirpichev <[email protected]>
Did you update this test?
cpython/Lib/test/test_inspect/test_inspect.py, lines 5699 to 5701 in ad6110a
|
I support the opinion that having all parameters as pos-and-keyword is not very useful, but if we only make |
Differences between |
@erlend-aasland since the AC changes come from my patch, point 1) is probably my job. I think that @sayandipdutta can keep this PR as-is for a while. I'll make a separate PR with the AC-related changes. |
PR that switches to AC: #125999 |
Thanks a lot @skirpichev! It seems all I have to do is wait for your PR to be merged and then merge main into my PR. Will do so. |
Meta: I'll mark this as draft until Sergey's PR has landed. |
Sergey's Argument Clinic adaptation landed just now. Please resolve conflicts, regenerate clinic, and mark the PR ready for review again. |
Misc/NEWS.d/next/Library/2024-10-24-13-40-20.gh-issue-126916.MAgz6D.rst
Can you post updated benchmarks vs. current |
I have made the requested changes; please review again |
…Agz6D.rst Co-authored-by: Erlend E. Aasland <[email protected]>
Checked against debug build. Followed script from #125917 (comment)
EDIT: On release:
|
Thanks!
BTW, you need a What's New entry for this. |
Before:
After:
Issue: gh-125916
📚 Documentation preview 📚: https://cpython-previews--125917.org.readthedocs.build/