Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-119127: functools.partial placeholders #119827

Merged
merged 78 commits into from
Sep 26, 2024

Conversation

dg-pb
Copy link
Contributor

@dg-pb dg-pb commented May 31, 2024

As I already had implementation I though PR might be helpful for others to see and evaluate.

From all the different extensions of functools.partial I think this one is the best. It is relatively simple and exposes all missing functionality. Other partial extensions that I have seen lack functionality and would not provide complete argument ordering capabilities and/or are too complicated in relation to what they offer.

Implementation can be summarised as follows:
a) Trailing placeholders are not allowed. (Makes things simpler)
b) Throws exception if not all placeholders are filled on call
c) retains optimization benefits of application on other partial instances.

Performance penalty compared to current functools.partial is minimal for extension class. + 20-30 ns for initialisation and <4 ns when called with or without placeholders.

To put it simply, new functionality extends functools.partial so that it has flexibility of lambda / def approach (in terms of argument ordering), but call overhead is 2x smaller.

The way I see it is that this could only be justified if this extension provided completeness and no new functionality is going to be needed anywhere near in the future. I have thought about it and tried various alternatives and I think there is a good chance that this is the case. Personally, I don't think I would ever need anything more from partial class.

Current implementation functions reliably.

Benchmark

There is nothing new here in terms of performance. The performance after this PR will be (almost) the same as the performance of partial until now. Placeholders only provide flexibility for taking advantage of performance benefits where it is important.

So far I have identified 2 such cases:

  1. More flexible predicate construction for functions in operator module. This allows for new strategies in making performant iterator recipes.
  2. Partializing input target function. Examples of this are optimizers and similar. I.e. cases where the function will be called over and over within the routine with number of arguments. But the input target function needs partial substitution for positionals and keywords.

Good example of this is scipy.optimize.minimize.

Its signature is: scipy.optimize.minimize(fun, x0, args=(), ...)

Note, it does not have kwds. Why? I don't know. But good reason for it could be:

fun = lambda x: f(x, **kwds)

will need to expand **kwds on every call (even if it is empty), while partial will make the most optimal call. (see benchmarks below). So the minimize function can leave out kwds given there is a good way to source callable with already substituted keywords.

This extension allows pre-substituting both positionals and keywords. This allows optimizer signature to leave out both kwds and args resulting in simpler interface scipy.optimize.minimize(fun, x0, ...) and gaining slightly better performance - function calls are at the center of such problems after all.

Benchmark Results for __call__

Code for Cases

dct = {'a': 1}
kwds = {'c': 1, 'd': 2}
kwds_empty = {}
args1 = (1,)
args3 = (1, 2, 4)

opr_sub = opr.sub
opr_contains = opr.contains

opr_sub_lambda = lambda b: opr_sub(1, b)
opr_sub_partial = ftl.partial(opr_sub, 1)

opr_contains_lambda = lambda b: opr_contains(dct, b)
opr_contains_partial = ftl.partial(opr_contains, dct)


def pos2(a, b):
    pass

def pos6(a, b, c, d, e, f):
    pass

def pos2kw2(a, b, c=1, d=2):
    pass


pos2_lambda = lambda b: pos2(1, b)
pos2_partial = ftl.partial(pos2, 1)

pos6_lambda = lambda b, c, d: pos6(1, 2, 3, b, c, d)
pos6_partial = ftl.partial(pos6, 1, 2, 3)

pos2kw2_kw_lambda = lambda b: pos2kw2(1, b, **kwds)
pos2kw2_kw_partial = ftl.partial(pos2kw2, 1, **kwds)

pos2kw2_kwe_lambda = lambda b: pos2kw2(1, b, **kwds_empty)
pos2kw2_kwe_partial = ftl.partial(pos2kw2, 1, **kwds_empty)

opr_sub_partial_ph = ftl.partial(opr_sub, PH, 1)
opr_contains_partial_ph = ftl.partial(opr_contains, PH, 'a')
pos2_partial_ph = ftl.partial(pos2, PH, 1)
pos6_partial_ph = ftl.partial(pos6, PH, 2, PH, 4, PH, 6)
pos2kw2_kw_partial_ph = ftl.partial(pos2kw2, PH, 1, **kwds)
pos2kw2_kwe_partial_ph = ftl.partial(pos2kw2, PH, 1, **kwds_empty)

# Placeholder versions
from functools import Placeholder as PH
opr_sub_partial_ph = ftl.partial(opr_sub, PH, 1)
opr_contains_partial_ph = ftl.partial(opr_contains, PH, 'a')
pos2_partial_ph = ftl.partial(pos2, PH, 1)
pos6_partial_ph = ftl.partial(pos6, PH, 2, PH, 4, PH, 6)
pos2kw2_kw_partial_ph = ftl.partial(pos2kw2, PH, 1, **kwds)
pos2kw2_kwe_partial_ph = ftl.partial(pos2kw2, PH, 1, **kwds_empty)

CPython Results

C Implementation
----------------
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃   5 repeats, 1,000,000 times     ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃    Units: ns    lambda   partial ┃
    ┃              ┏━━━━━━━━━━━━━━━━━━━┫
    ┃      opr_sub ┃  50 ± 4    40 ± 2 ┃
    ┃ opr_contains ┃  53 ± 3    43 ± 3 ┃
    ┃         pos2 ┃  50 ± 1    64 ± 1 ┃
    ┃ pos2(*args1) ┃  69 ± 5    73 ± 5 ┃
    ┃         pos6 ┃  58 ± 1   103 ± 5 ┃
    ┃ pos6(*args3) ┃  77 ± 3    99 ± 5 ┃
    ┃   pos2kw2_kw ┃ 240 ± 4   259 ± 7 ┃
    ┃  pos2kw2_kwe ┃ 134 ± 6    69 ± 3 ┃
    ┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━┛
    With Placeholders
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃           5 repeats, 1,000,000 times              ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃    Units: ns     lambda    partial   Placeholders ┃
    ┃              ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃      opr_sub ┃  50 ±  2    39 ±  1        44 ±  4 ┃
    ┃ opr_contains ┃  61 ±  2    44 ±  2        49 ±  2 ┃
    ┃         pos2 ┃  54 ±  2    58 ±  3        64 ±  2 ┃
    ┃ pos2(*args1) ┃  67 ±  3    72 ±  9        69 ±  3 ┃
    ┃         pos6 ┃  63 ±  3   102 ±  3        99 ±  2 ┃
    ┃ pos6(*args3) ┃  75 ±  3   101 ±  2        94 ±  4 ┃
    ┃   pos2kw2_kw ┃ 242 ±  7   259 ± 10       260 ±  7 ┃
    ┃  pos2kw2_kwe ┃ 131 ±  4    64 ±  1        69 ±  2 ┃
    ┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛


Python Implementation
---------------------
    Current
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃    5 repeats, 1,000,000 times      ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃    Units: ns     lambda    partial ┃
    ┃              ┏━━━━━━━━━━━━━━━━━━━━━┫
    ┃      opr_sub ┃  48 ±  1   373 ± 13 ┃
    ┃ opr_contains ┃  51 ±  1   377 ± 12 ┃
    ┃         pos2 ┃  51 ±  4   378 ±  5 ┃
    ┃ pos2(*args1) ┃  63 ±  5   354 ±  7 ┃
    ┃         pos6 ┃  59 ±  1   437 ±  5 ┃
    ┃ pos6(*args3) ┃  75 ±  2   410 ±  7 ┃
    ┃   pos2kw2_kw ┃ 239 ±  4   517 ±  5 ┃
    ┃  pos2kw2_kwe ┃ 133 ±  3   408 ± 49 ┃
    ┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━┛
    With Placeholders
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃           5 repeats, 1,000,000 times              ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃    Units: ns     lambda    partial   Placeholders ┃
    ┃              ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃      opr_sub ┃  49 ±  1   392 ± 13       547 ± 12 ┃
    ┃ opr_contains ┃  54 ±  2   393 ±  9       605 ± 78 ┃
    ┃         pos2 ┃  55 ±  9   398 ±  7       544 ±  5 ┃
    ┃ pos2(*args1) ┃  66 ±  2   373 ±  5       533 ±  8 ┃
    ┃         pos6 ┃  58 ±  5   462 ±  4       652 ±  3 ┃
    ┃ pos6(*args3) ┃  74 ±  2   428 ± 11       635 ±  9 ┃
    ┃   pos2kw2_kw ┃ 240 ±  5   533 ±  4       696 ± 10 ┃
    ┃  pos2kw2_kwe ┃ 134 ±  2   406 ±  4       555 ±  3 ┃
    ┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

PyPy Results

PyPy
----
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃     5 repeats, 10,000 times        ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃    Units: ns     lambda    partial ┃
    ┃              ┏━━━━━━━━━━━━━━━━━━━━━┫
    ┃      opr_sub ┃ 122 ± 15   266 ± 70 ┃
    ┃ opr_contains ┃ 147 ±  7   248 ± 64 ┃
    ┃         pos2 ┃ 114 ± 17   204 ± 49 ┃
    ┃ pos2(*args1) ┃ 156 ± 24   202 ± 28 ┃
    ┃         pos6 ┃ 124 ± 14   268 ± 39 ┃
    ┃ pos6(*args3) ┃ 147 ± 36   225 ± 21 ┃
    ┃   pos2kw2_kw ┃ 259 ± 17   436 ± 66 ┃
    ┃  pos2kw2_kwe ┃ 180 ± 14   243 ± 43 ┃
    ┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━┛
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃   5 repeats, 1,000,000 times     ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃     Units: ns   lambda   partial ┃
    ┃               ┏━━━━━━━━━━━━━━━━━━┫
    ┃       opr_sub ┃  1 ± 0     3 ± 1 ┃
    ┃  opr_contains ┃ 13 ± 0    16 ± 2 ┃
    ┃          pos2 ┃  1 ± 0     3 ± 1 ┃
    ┃  pos2(*args1) ┃  2 ± 0     2 ± 0 ┃
    ┃          pos6 ┃  1 ± 0     2 ± 0 ┃
    ┃  pos6(*args3) ┃  2 ± 0     2 ± 0 ┃
    ┃    pos2kw2_kw ┃ 42 ± 1    72 ± 2 ┃
    ┃   pos2kw2_kwe ┃  2 ± 0     2 ± 0 ┃
    ┗━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━┛

PyPy Placeholder
----------------
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃             5 repeats, 10,000 times                 ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃    Units: ns      lambda     partial   Placeholders ┃
    ┃              ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃      opr_sub ┃ 114 ±   5   256 ±  82      719 ± 170 ┃
    ┃ opr_contains ┃ 142 ±   7   538 ± 536      787 ± 145 ┃
    ┃         pos2 ┃ 125 ±  19   239 ±  54      679 ± 116 ┃
    ┃ pos2(*args1) ┃ 130 ±  30   199 ±  17      638 ±  48 ┃
    ┃         pos6 ┃ 115 ±  16   237 ±  43      785 ± 176 ┃
    ┃ pos6(*args3) ┃ 138 ±  25   214 ±  14      703 ±  19 ┃
    ┃   pos2kw2_kw ┃ 260 ±  24   382 ±  67      850 ±  92 ┃
    ┃  pos2kw2_kwe ┃ 179 ±  28   223 ±  44      661 ±  32 ┃
    ┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃          5 repeats, 1,000,000 times             ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃    Units: ns    lambda   partial   Placeholders ┃
    ┃              ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃      opr_sub ┃  1 ±  0    3 ±  1       156 ±  4 ┃
    ┃ opr_contains ┃ 13 ±  0   15 ±  1       173 ±  3 ┃
    ┃         pos2 ┃  2 ±  0    3 ±  1       154 ±  7 ┃
    ┃ pos2(*args1) ┃  2 ±  0    2 ±  0       148 ±  3 ┃
    ┃         pos6 ┃  2 ±  0    3 ±  1       200 ±  2 ┃
    ┃ pos6(*args3) ┃  2 ±  0    3 ±  0       217 ± 39 ┃
    ┃   pos2kw2_kw ┃ 43 ±  1   71 ±  1       240 ±  2 ┃
    ┃  pos2kw2_kwe ┃  2 ±  0    2 ±  0       149 ±  2 ┃
    ┗━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

Setup:

  • First 2 columns are identical calls - one using lambda other partial.
  • 3rd column is using placeholder to expose 1st argument as opposed to 2nd (or different places for 6-arg case).

CPython:

  • There is negligible impact on __call__. Run times are very close of current and new version with Placeholders.
  • It can be seen that run times are not impacted by placeholder usage in any significant way.
  • pos2kw2_kwe (empty kwds) is much faster of partial call. pos2kw2_kw (non-empty kwds) is currently slower, however gh-119109: functool.partial vectorcall supports pto->kw & fallback to tp_call removed #120783 will likely to improve its speed so that it outperforms lambda.

PyPy:

  • Usage of Placeholders results in very poor performance. However, this has no material implication as lambda is more performant than partial in all cases and is an optimal choice.

Benchmark Results for __new__

INIT='import functools as ftl; g = lambda a, b, c, d, e, f: (a, b, c, d, e, f);'

# CURRENT
$PY_MAIN -m timeit -s $INIT 'ftl.partial(g, 0, 1, 2)'   # 160

# PLACEHOLDERS
INIT2="$INIT PH=ftl.Placeholder;"
$PY_MAIN -m timeit -s $INIT2 'ftl.partial(g, 0, 1, 2)'              # 170
$PY_MAIN -m timeit -s $INIT2 'ftl.partial(g, 0, 1, 2, 3, 4, 5)'     # 190
$PY_MAIN -m timeit -s $INIT2 'ftl.partial(g, PH, 1, PH, 3, PH, 5)'  # 200
  • There is small performance decrease for initialization without placeholders.
  • Initializing it with placeholders is slower for the same number of arguments (excluding placeholders).
  • But it is not much slower if placeholders are counted as arguments.

To sum up

This extension:

  1. allows extracting current performance benefits of partial to few more important (at least from my POV) cases.
  2. seems to allow for certain simplifications to happen by bringing it more in line with lambda/def behaviour. Thus, allowing partial to be used for partialmethod application which allows for some simplifications in handling these in other parts of the library - i.e. inspect.

📚 Documentation preview 📚: https://cpython-previews--119827.org.readthedocs.build/

Lib/functools.py Outdated Show resolved Hide resolved
Lib/functools.py Outdated Show resolved Hide resolved
@rhettinger rhettinger self-assigned this Jun 6, 2024
Doc/library/functools.rst Outdated Show resolved Hide resolved
Doc/library/functools.rst Outdated Show resolved Hide resolved
Doc/library/functools.rst Outdated Show resolved Hide resolved
Modules/_functoolsmodule.c Outdated Show resolved Hide resolved
Modules/_functoolsmodule.c Outdated Show resolved Hide resolved
Modules/_functoolsmodule.c Outdated Show resolved Hide resolved
Modules/_functoolsmodule.c Outdated Show resolved Hide resolved
Modules/_functoolsmodule.c Outdated Show resolved Hide resolved
Modules/_functoolsmodule.c Outdated Show resolved Hide resolved
Modules/_functoolsmodule.c Outdated Show resolved Hide resolved
Copy link
Contributor

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would advocate for PLACEHOLDER instead of Placeholder to stress that 1) this is not a global constant, just a module constant, 2) this is not a class, 3) this is named similarly to dataclasses.KW_ONLY. (see #119827 (comment))

Doc/library/functools.rst Outdated Show resolved Hide resolved
Doc/library/functools.rst Outdated Show resolved Hide resolved
Doc/library/functools.rst Outdated Show resolved Hide resolved
Lib/functools.py Outdated Show resolved Hide resolved
Modules/_functoolsmodule.c Outdated Show resolved Hide resolved
Doc/library/functools.rst Outdated Show resolved Hide resolved
Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for more nitpicks.

Lib/test/test_functools.py Outdated Show resolved Hide resolved
Lib/test/test_functools.py Outdated Show resolved Hide resolved
Lib/test/test_functools.py Outdated Show resolved Hide resolved
Lib/test/test_functools.py Outdated Show resolved Hide resolved
Doc/library/functools.rst Outdated Show resolved Hide resolved
@dg-pb
Copy link
Contributor Author

dg-pb commented Aug 14, 2024

Not very convenient to match exact exception message string, which, at least from my experience, is more common case. #123013

Doc/library/functools.rst Outdated Show resolved Hide resolved
Doc/library/functools.rst Outdated Show resolved Hide resolved
@rhettinger
Copy link
Contributor

I just looked at the Placeholder singleton logic and think it is fine because it mirrors the None singleton. It is a little complicated but not in a way that will impair maintenance.

@rhettinger rhettinger enabled auto-merge (squash) September 26, 2024 00:52
@rhettinger rhettinger merged commit d929652 into python:main Sep 26, 2024
37 checks passed
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot AMD64 CentOS9 NoGIL Refleaks 3.x has failed when building commit d929652.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1610/builds/60) and take a look at the build logs.
  4. Check if the failure is related to this commit (d929652) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1610/builds/60

Failed tests:

  • test__interpchannels
  • test_capi
  • test__interpreters
  • test_ast
  • test_importlib

Test leaking resources:

  • test_importlib: references
  • test_ast: references
  • test__interpchannels: references
  • test__interpreters: memory blocks
  • test_capi: memory blocks
  • test_capi: references
  • test__interpreters: references
  • test_importlib: memory blocks
  • test_ast: memory blocks
  • test__interpchannels: memory blocks

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "<string>", line 10, in <module>
  File "<frozen importlib._bootstrap>", line 813, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1046, in create_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
ImportError: module _test_module_state_shared does not support loading in subinterpreters
xpected failure

@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot ARM64 MacOS M1 Refleaks NoGIL 3.x has failed when building commit d929652.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/1368/builds/1855) and take a look at the build logs.
  4. Check if the failure is related to this commit (d929652) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/1368/builds/1855

Failed tests:

  • test__interpchannels
  • test_capi
  • test__interpreters
  • test_ast
  • test_importlib

Test leaking resources:

  • test_importlib: references
  • test_ast: references
  • test__interpchannels: references
  • test__interpreters: memory blocks
  • test_capi: memory blocks
  • test_capi: references
  • test__interpreters: references
  • test_importlib: memory blocks
  • test_ast: memory blocks
  • test__interpchannels: memory blocks

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "<string>", line 10, in <module>
  File "<frozen importlib._bootstrap>", line 813, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1046, in create_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
ImportError: module _test_module_state_shared does not support loading in subinterpreters
xpected failure

@rhettinger
Copy link
Contributor

@dg-pb Congratulations on landing a new feature.

@dg-pb
Copy link
Contributor Author

dg-pb commented Sep 26, 2024

Thank you @rhettinger.

And thank you everyone for your help. Especially @serhiy-storchaka.

I hope this will prove to be a useful feature.

Was pleasure working with you all.

@dg-pb dg-pb deleted the implement-119127-again branch September 26, 2024 05:02
@vstinner
Copy link
Member

The deallocator which marks the Placeholder as an immortal object is causing memory leaks: #124586. I propose to use a regular singleton by just storing a strong reference in the module state: #124601. My PR fix the leak. I don't think that immortal object is needed here.

@bswck
Copy link
Contributor

bswck commented Oct 3, 2024

I'm a very often user of partial. This looks so useful, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

10 participants