Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: Fix performance regression due to CoW ref tracking #51390

Merged
merged 2 commits into from
Feb 15, 2023

Conversation

phofl
Copy link
Member

@phofl phofl commented Feb 14, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
       before           after         ratio
     [74d5c195]       [ef1734f1]
-      3.53±0.6ms      2.36±0.06ms     0.67  frame_ctor.FromArrays.time_frame_from_arrays_sparse
-        3.22±2ms      2.03±0.01ms     0.63  frame_ctor.FromArrays.time_frame_from_arrays_int

WeakSet caused a slowdown in these two benchmarks. Switching to a list for now, we will try to opimize this

@phofl phofl requested a review from lithomas1 February 14, 2023 21:41
@lithomas1
Copy link
Member

Can we also gate the handling of refs behind the CoW mode to avoid perf regressions like these when CoW is off (not for this PR but for the future)?

@phofl
Copy link
Member Author

phofl commented Feb 14, 2023

2 things are not behind CoW:

  • creating this object
  • adding the ref

With this implementation, the more costly part is checking the ref, which is hidden behind CoW. There is no real efficient way of hiding everything (at least I can't see one right now)

@lithomas1
Copy link
Member

Cool, as long as adding the ref doesn't take too much fine(I don't think it would), this is fine then.

@phofl
Copy link
Member Author

phofl commented Feb 14, 2023

This was cheap with the WeakSet when I checked where we spent the most time, since we are basically doing the same, I think we are good

@jorisvandenbossche jorisvandenbossche added Copy / view semantics Performance Memory or execution speed performance labels Feb 15, 2023
@jorisvandenbossche jorisvandenbossche added this to the 2.0 milestone Feb 15, 2023
@jorisvandenbossche
Copy link
Member

Certainly fine to merge this (it's basically just reverting a change that was done in the original PR, AFAIU?)

One thing I am wondering: do we know if we have (sufficient) benchmarks that hit the has_reference() call? Because I might assume that it's this part that will now get slower with this PR (I don't know if you timed that specifically?).
Although even if it is slower, it might not be significantly slower in practice, given that in the majority of use cases the list (or set) of referenced_blocks is not a huge list.

@phofl
Copy link
Member Author

phofl commented Feb 15, 2023

Yes this is how it was implemented initially

You are correct, that it slows down the has_reference check. Two points why I think this is a better idea right now:

  • it only hits in case of CoW is enabled, e.g. the default is not touched by this
  • the list is small most of the time, what you said basically

I'll investigate this further, but wanted to get this into the rc

@phofl phofl merged commit 82a595b into pandas-dev:main Feb 15, 2023
@phofl phofl deleted the perf_cow branch February 15, 2023 21:24
@wangwillian0
Copy link
Contributor

Hi @phofl, I think this commit may be related to the problem at #54352. Each weakref.ref ocupies 72 bytes of memory even after it's a dead , which accumulates invisibly on tons of operations (e.g. Index.view called during df.copy). The gap is probably a little closer too if you consider the extreme case of the weakref.WeakSet() having always length 1 when the current list version keeps allocating memory. Do you think it's worth it to revert this commit?

@phofl
Copy link
Member Author

phofl commented Sep 5, 2023

We can clear the list more often, that would be an approach worth considering as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Copy / view semantics Performance Memory or execution speed performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants