Improve BindableList performance #6405

Open · wants to merge 3 commits into base: master

smoogipoo (Contributor) commented on Oct 31, 2024

RFC.

I'm still looking for ways to quantify this, and it may turn out not to be worth it if the comprehension overhead is too large... Basically, I started optimising diffcalc and reached a point where BindableList.AddRange() and BindableList.Clear() were the two biggest hotspots. This is on top of various other (potentially upcoming) optimisations.

A few optimisations are applied here:

  • Avoid allocating callback lists and callback objects if nothing is bound to the collection-changed callback.
  • Use ICollection<T> instead of IList to avoid a .Cast<> enumeration.
  • Remove the HashSet allocation used for cycle detection. It now uses up to 64B of stack space (16 bindables) before falling back to a HashSet. In practice (in game), I haven't found BindableList invocation chains longer than 10 calls, so this should cover the majority of cases.

Over a 20s osu-difficulty-calculator (16T) run period...

Before:
[Screenshot, 2024-10-31 at 20:58:24]

After:
[Screenshot, 2024-10-31 at 20:58:30]

If you think it doesn't look like much, I would agree with you. Diffcalc seems to be one of those things where you optimise one part and it moves the hotspot down the chain.

Master:

|    Method | NumBindings |        Mean | Error |     StdDev | Ratio | RatioSD |   Gen0 |   Gen1 | Allocated | Alloc Ratio |
|---------- |------------ |------------:|------:|-----------:|------:|--------:|-------:|-------:|----------:|------------:|
|    Create |           0 |    81.33 ns |    NA |   1.539 ns |  1.00 |    0.00 | 0.0930 |      - |     584 B |        1.00 |
|       Add |           0 |   135.18 ns |    NA |  21.728 ns |  1.66 |    0.24 | 0.1364 |      - |     856 B |        1.47 |
|    Remove |           0 |    96.40 ns |    NA |   1.789 ns |  1.19 |    0.04 | 0.1147 |      - |     720 B |        1.23 |
|     Clear |           0 |   106.17 ns |    NA |   0.695 ns |  1.31 |    0.02 | 0.1211 |      - |     760 B |        1.30 |
|  AddRange |           0 |   131.20 ns |    NA |   2.461 ns |  1.61 |    0.06 | 0.1390 |      - |     872 B |        1.49 |
|  SetIndex |           0 |   126.74 ns |    NA |   1.458 ns |  1.56 |    0.05 | 0.1440 |      - |     904 B |        1.55 |
| Enumerate |           0 |    83.12 ns |    NA |   0.102 ns |  1.02 |    0.02 | 0.0930 |      - |     584 B |        1.00 |
|           |             |             |       |            |       |         |        |        |           |             |
|    Create |           1 |   232.07 ns |    NA |  11.904 ns |  1.00 |    0.00 | 0.1299 |      - |     816 B |        1.00 |
|       Add |           1 |   337.81 ns |    NA |   9.747 ns |  1.46 |    0.03 | 0.1884 |      - |    1184 B |        1.45 |
|    Remove |           1 |   238.13 ns |    NA |   1.806 ns |  1.03 |    0.06 | 0.1450 |      - |     912 B |        1.12 |
|     Clear |           1 |   256.45 ns |    NA |   0.311 ns |  1.11 |    0.06 | 0.1578 |      - |     992 B |        1.22 |
|  AddRange |           1 |   358.29 ns |    NA |   0.575 ns |  1.55 |    0.08 | 0.1936 |      - |    1216 B |        1.49 |
|  SetIndex |           1 |   344.04 ns |    NA |   1.081 ns |  1.48 |    0.08 | 0.2036 |      - |    1280 B |        1.57 |
| Enumerate |           1 |   230.97 ns |    NA |   3.169 ns |  1.00 |    0.06 | 0.1299 |      - |     816 B |        1.00 |
|           |             |             |       |            |       |         |        |        |           |             |
|    Create |          10 | 1,454.40 ns |    NA |  18.072 ns |  1.00 |    0.00 | 0.6485 |      - |    4072 B |        1.00 |
|       Add |          10 | 2,080.50 ns |    NA |   1.270 ns |  1.43 |    0.02 | 0.9384 |      - |    5888 B |        1.45 |
|    Remove |          10 | 1,369.26 ns |    NA |   4.166 ns |  0.94 |    0.01 | 0.6065 |      - |    3808 B |        0.94 |
|     Clear |          10 | 1,447.20 ns |    NA |   2.785 ns |  1.00 |    0.01 | 0.6771 |      - |    4248 B |        1.04 |
|  AddRange |          10 | 2,223.23 ns |    NA |   6.040 ns |  1.53 |    0.01 | 0.9651 |      - |    6064 B |        1.49 |
|  SetIndex |          10 | 2,176.85 ns |    NA |  20.808 ns |  1.50 |    0.03 | 1.0223 |      - |    6416 B |        1.58 |
| Enumerate |          10 | 1,423.40 ns |    NA |   0.864 ns |  0.98 |    0.01 | 0.6485 |      - |    4072 B |        1.00 |
|           |             |             |       |            |       |         |        |        |           |             |
|    Create |          20 | 2,960.90 ns |    NA |  75.523 ns |  1.00 |    0.00 | 1.2703 | 0.0038 |    7976 B |        1.00 |
|       Add |          20 | 4,214.74 ns |    NA |  11.631 ns |  1.42 |    0.04 | 1.8387 |      - |   11544 B |        1.45 |
|    Remove |          20 | 3,661.51 ns |    NA |   4.385 ns |  1.24 |    0.03 | 1.1635 |      - |    7312 B |        0.92 |
|     Clear |          20 | 3,072.85 ns |    NA | 158.676 ns |  1.04 |    0.08 | 1.2970 | 0.0038 |    8152 B |        1.02 |
|  AddRange |          20 | 4,493.25 ns |    NA |   3.102 ns |  1.52 |    0.04 | 1.8921 | 0.0076 |   11880 B |        1.49 |
|  SetIndex |          20 | 4,488.73 ns |    NA |  48.956 ns |  1.52 |    0.02 | 1.9989 |      - |   12552 B |        1.57 |
| Enumerate |          20 | 3,041.22 ns |    NA | 160.420 ns |  1.03 |    0.03 | 1.2703 | 0.0038 |    7976 B |        1.00 |

After commit 1:

|    Method | NumBindings |        Mean | Error |     StdDev | Ratio | RatioSD |   Gen0 | Allocated | Alloc Ratio |
|---------- |------------ |------------:|------:|-----------:|------:|--------:|-------:|----------:|------------:|
|    Create |           0 |    49.91 ns |    NA |   0.813 ns |  1.00 |    0.00 | 0.0561 |     352 B |        1.00 |
|       Add |           0 |    74.59 ns |    NA |   1.424 ns |  1.49 |    0.00 | 0.0842 |     528 B |        1.50 |
|    Remove |           0 |    75.69 ns |    NA |   0.761 ns |  1.52 |    0.04 | 0.0842 |     528 B |        1.50 |
|     Clear |           0 |    74.06 ns |    NA |   0.168 ns |  1.48 |    0.02 | 0.0842 |     528 B |        1.50 |
|  AddRange |           0 |    82.64 ns |    NA |   1.324 ns |  1.66 |    0.05 | 0.0842 |     528 B |        1.50 |
|  SetIndex |           0 |    74.48 ns |    NA |   0.211 ns |  1.49 |    0.02 | 0.0842 |     528 B |        1.50 |
| Enumerate |           0 |    50.59 ns |    NA |   0.092 ns |  1.01 |    0.02 | 0.0561 |     352 B |        1.00 |
|           |             |             |       |            |       |         |        |           |             |
|    Create |           1 |   161.05 ns |    NA |   1.265 ns |  1.00 |    0.00 | 0.0560 |     352 B |        1.00 |
|       Add |           1 |   241.94 ns |    NA |   1.801 ns |  1.50 |    0.00 | 0.0839 |     528 B |        1.50 |
|    Remove |           1 |   193.39 ns |    NA |   1.447 ns |  1.20 |    0.02 | 0.0842 |     528 B |        1.50 |
|     Clear |           1 |   190.15 ns |    NA |   1.779 ns |  1.18 |    0.00 | 0.0842 |     528 B |        1.50 |
|  AddRange |           1 |   262.38 ns |    NA |   0.329 ns |  1.63 |    0.01 | 0.0839 |     528 B |        1.50 |
|  SetIndex |           1 |   244.24 ns |    NA |   0.791 ns |  1.52 |    0.01 | 0.0839 |     528 B |        1.50 |
| Enumerate |           1 |   163.61 ns |    NA |   5.447 ns |  1.02 |    0.03 | 0.0560 |     352 B |        1.00 |
|           |             |             |       |            |       |         |        |           |             |
|    Create |          10 | 1,623.94 ns |    NA |  80.347 ns |  1.00 |    0.00 | 0.2422 |    1520 B |        1.00 |
|       Add |          10 | 1,588.68 ns |    NA |  36.029 ns |  0.98 |    0.07 | 0.3624 |    2280 B |        1.50 |
|    Remove |          10 | 1,713.21 ns |    NA | 524.839 ns |  1.05 |    0.27 | 0.2689 |    1696 B |        1.12 |
|     Clear |          10 | 1,309.85 ns |    NA |  90.409 ns |  0.81 |    0.10 | 0.2689 |    1696 B |        1.12 |
|  AddRange |          10 | 1,710.17 ns |    NA |   4.467 ns |  1.05 |    0.05 | 0.3624 |    2280 B |        1.50 |
|  SetIndex |          10 | 1,629.28 ns |    NA |  17.520 ns |  1.00 |    0.04 | 0.3624 |    2280 B |        1.50 |
| Enumerate |          10 | 1,269.36 ns |    NA |  58.352 ns |  0.78 |    0.07 | 0.2422 |    1520 B |        1.00 |
|           |             |             |       |            |       |         |        |           |             |
|    Create |          20 | 2,184.72 ns |    NA |   4.066 ns |  1.00 |    0.00 | 0.4921 |    3104 B |        1.00 |
|       Add |          20 | 3,328.71 ns |    NA |  15.992 ns |  1.52 |    0.00 | 0.7401 |    4656 B |        1.50 |
|    Remove |          20 | 2,346.84 ns |    NA |   9.699 ns |  1.07 |    0.01 | 0.5226 |    3280 B |        1.06 |
|     Clear |          20 | 2,206.18 ns |    NA |   5.316 ns |  1.01 |    0.00 | 0.5226 |    3280 B |        1.06 |
|  AddRange |          20 | 4,812.98 ns |    NA |  75.557 ns |  2.20 |    0.04 | 0.7401 |    4656 B |        1.50 |
|  SetIndex |          20 | 3,316.43 ns |    NA |  60.392 ns |  1.52 |    0.02 | 0.7401 |    4656 B |        1.50 |
| Enumerate |          20 | 2,168.55 ns |    NA |   4.329 ns |  0.99 |    0.00 | 0.4921 |    3104 B |        1.00 |

After commit 2:

|    Method | NumBindings |        Mean | Error |     StdDev | Ratio | RatioSD |   Gen0 | Allocated | Alloc Ratio |
|---------- |------------ |------------:|------:|-----------:|------:|--------:|-------:|----------:|------------:|
|    Create |           0 |    12.85 ns |    NA |   0.175 ns |  1.00 |    0.00 |      - |         - |          NA |
|       Add |           0 |    18.65 ns |    NA |   0.031 ns |  1.45 |    0.02 |      - |         - |          NA |
|    Remove |           0 |    20.99 ns |    NA |   0.265 ns |  1.63 |    0.00 |      - |         - |          NA |
|     Clear |           0 |    17.90 ns |    NA |   0.065 ns |  1.39 |    0.01 |      - |         - |          NA |
|  AddRange |           0 |    34.57 ns |    NA |   2.203 ns |  2.69 |    0.13 |      - |         - |          NA |
|  SetIndex |           0 |    19.52 ns |    NA |   0.358 ns |  1.52 |    0.01 |      - |         - |          NA |
| Enumerate |           0 |    14.10 ns |    NA |   1.078 ns |  1.10 |    0.10 |      - |         - |          NA |
|           |             |             |       |            |       |         |        |           |             |
|    Create |           1 |    92.19 ns |    NA |   1.322 ns |  1.00 |    0.00 |      - |         - |          NA |
|       Add |           1 |   140.82 ns |    NA |   3.627 ns |  1.53 |    0.06 |      - |         - |          NA |
|    Remove |           1 |   101.08 ns |    NA |   0.263 ns |  1.10 |    0.02 |      - |         - |          NA |
|     Clear |           1 |    98.41 ns |    NA |   1.996 ns |  1.07 |    0.01 |      - |         - |          NA |
|  AddRange |           1 |   163.20 ns |    NA |   0.219 ns |  1.77 |    0.03 |      - |         - |          NA |
|  SetIndex |           1 |   193.82 ns |    NA |   0.154 ns |  2.10 |    0.03 |      - |         - |          NA |
| Enumerate |           1 |    92.95 ns |    NA |   0.667 ns |  1.01 |    0.02 |      - |         - |          NA |
|           |             |             |       |            |       |         |        |           |             |
|    Create |          10 |   591.32 ns |    NA |   3.298 ns |  1.00 |    0.00 |      - |         - |          NA |
|       Add |          10 |   882.85 ns |    NA |   6.561 ns |  1.49 |    0.00 |      - |         - |          NA |
|    Remove |          10 |   625.85 ns |    NA |   0.365 ns |  1.06 |    0.01 |      - |         - |          NA |
|     Clear |          10 |   586.27 ns |    NA |   8.371 ns |  0.99 |    0.02 |      - |         - |          NA |
|  AddRange |          10 |   988.85 ns |    NA |   1.424 ns |  1.67 |    0.01 |      - |         - |          NA |
|  SetIndex |          10 |   887.21 ns |    NA |   0.809 ns |  1.50 |    0.01 |      - |         - |          NA |
| Enumerate |          10 |   592.77 ns |    NA |   0.737 ns |  1.00 |    0.01 |      - |         - |          NA |
|           |             |             |       |            |       |         |        |           |             |
|    Create |          20 | 1,653.57 ns |    NA |   1.273 ns |  1.00 |    0.00 | 0.4177 |    2624 B |        1.00 |
|       Add |          20 | 2,500.44 ns |    NA |  36.704 ns |  1.51 |    0.02 | 0.6256 |    3936 B |        1.50 |
|    Remove |          20 | 1,768.74 ns |    NA |  10.309 ns |  1.07 |    0.01 | 0.4177 |    2624 B |        1.00 |
|     Clear |          20 | 1,714.45 ns |    NA |   4.237 ns |  1.04 |    0.00 | 0.4177 |    2624 B |        1.00 |
|  AddRange |          20 | 2,866.00 ns |    NA | 111.233 ns |  1.73 |    0.07 | 0.6256 |    3936 B |        1.50 |
|  SetIndex |          20 | 2,517.65 ns |    NA |  30.558 ns |  1.52 |    0.02 | 0.6256 |    3936 B |        1.50 |
| Enumerate |          20 | 1,706.27 ns |    NA |   1.348 ns |  1.03 |    0.00 | 0.4177 |    2624 B |        1.00 |

In particular I'm interested in the alloc savings, because diffcalc is mostly GC-limited as far as I can tell (allocates ~1GB/s). Keep in mind the above results are per call.

Remove usages of `Cast<T>` by using `ICollection<T>`. `List<T>`
internally optimises across `ICollection<T>`, so doing this removes an
enumeration.
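
Editor's sketch of the point in this commit message, not code from the PR: `List<T>.AddRange` bulk-copies when its argument implements `ICollection<T>`, whereas a sequence produced by `Cast<T>()` over untyped items has to be walked one element at a time. `ProbeCollection<T>` and `AddRangeDemo` are hypothetical names introduced only to make the two paths observable.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical probe: records how List<T>.AddRange consumes it.
public sealed class ProbeCollection<T> : ICollection<T>
{
    private readonly List<T> items;

    public int CopyToCalls { get; private set; }
    public int EnumerationCalls { get; private set; }

    public ProbeCollection(IEnumerable<T> source) => items = new List<T>(source);

    public int Count => items.Count;
    public bool IsReadOnly => true;

    public void CopyTo(T[] array, int arrayIndex)
    {
        CopyToCalls++;
        items.CopyTo(array, arrayIndex);
    }

    public IEnumerator<T> GetEnumerator()
    {
        EnumerationCalls++;
        return items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public bool Contains(T item) => items.Contains(item);
    public void Add(T item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Remove(T item) => throw new NotSupportedException();
}

public static class AddRangeDemo
{
    // Returns how often the probe was bulk-copied and how often it was enumerated.
    public static (int copyToCalls, int enumerations) Run()
    {
        var probe = new ProbeCollection<int>(new[] { 1, 2, 3 });

        // Typed path: AddRange detects ICollection<T> and copies in one go.
        var direct = new List<int>();
        direct.AddRange(probe);

        // Untyped path, for contrast: Cast<int>() over boxed items yields lazily,
        // so AddRange has to enumerate item by item.
        IEnumerable untyped = new ArrayList { 1, 2, 3 };
        var viaCast = new List<int>();
        viaCast.AddRange(untyped.Cast<int>());

        return (probe.CopyToCalls, probe.EnumerationCalls);
    }

    public static void Main()
    {
        var (copyToCalls, enumerations) = Run();
        Console.WriteLine($"CopyTo calls: {copyToCalls}, enumerations: {enumerations}");
    }
}
```

The probe only demonstrates the dispatch inside `List<T>`; how much this saves in `BindableList` itself is what the benchmark tables above measure.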

Replaced `notifyCollectionChanged()` with local invocations, allowing a
null check to happen at each call site.

Only capture previous states (e.g. the cleared items during `Clear()`)
if there's a subscription to `CollectionChanged`.
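
Editor's sketch of the idea in the two commit messages above, under loud assumptions: `NotifyingList<T>` is a hypothetical, heavily simplified stand-in and not the framework's `BindableList<T>`. It shows a null check at the call site and a `Clear()` that snapshots the removed items only when something is subscribed to `CollectionChanged`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Collections.Specialized;

// Hypothetical simplified list; illustrates the pattern only.
public sealed class NotifyingList<T>
{
    private readonly List<T> items = new List<T>();

    public event NotifyCollectionChangedEventHandler? CollectionChanged;

    public int Count => items.Count;

    public void Add(T item)
    {
        items.Add(item);

        // With ?.Invoke the event args are only constructed when a handler exists.
        CollectionChanged?.Invoke(this,
            new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Add, item, items.Count - 1));
    }

    public void Clear()
    {
        if (items.Count == 0)
            return;

        // Read the delegate once so the check and the invocation agree.
        var handler = CollectionChanged;

        if (handler == null)
        {
            // Nobody is listening: no snapshot array, no event args.
            items.Clear();
            return;
        }

        // Someone is listening: capture the removed items before clearing.
        T[] removed = items.ToArray();
        items.Clear();
        handler(this, new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Remove, removed, 0));
    }
}

public static class ClearDemo
{
    // Returns the number of removed items reported to a subscriber.
    public static int Run()
    {
        var list = new NotifyingList<int>();
        list.Add(1);
        list.Add(2);
        list.Clear(); // no subscriber yet: nothing is captured

        list.Add(3);
        int reported = 0;
        list.CollectionChanged += (sender, e) => reported = e.OldItems?.Count ?? 0;
        list.Clear(); // subscriber present: the removed item is reported

        return reported;
    }

    public static void Main() => Console.WriteLine(Run());
}
```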
Comment on lines +49 to +50
// This will return the unique runtime identity for the object.
instanceId = RuntimeHelpers.GetHashCode(this);
Collaborator

I immediately don't like whatever this is.

To quote docs:

Although the RuntimeHelpers.GetHashCode method returns identical hash codes for identical object references, you should not use this method to test for object identity, because this hash code does not uniquely identify an object reference. To test for object identity (that is, to test that two objects reference the same object in memory), call the Object.ReferenceEquals method. Nor should you use GetHashCode to test whether two strings represent equal object references, because the string is interned. To test for string interning, call the String.IsInterned method.

I do not want to ever debug a situation wherein a one-in-a-trillion hash collision happened and suddenly two bindable lists are freakishly joined at the hip. The hash code isn't even 64 bits, it's a plain int.

I'm not even sure I can bring myself to read the rest of this diff after that, because it looks mighty complicated and all of it seems predicated on the (in my opinion) flawed premise that this hash code never collides.
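
Editor's sketch of the distinction drawn in this comment, not code from the diff: `IdentityCheck`, `ContainsReference` and `ContainsHash` are hypothetical names. The first helper tests identity with `ReferenceEquals`, which cannot confuse two distinct instances. The second compares 32-bit values from `RuntimeHelpers.GetHashCode`, which is roughly the pattern the excerpt above appears to rely on and which can in principle report a false match.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class IdentityCheck
{
    // True only if `candidate` is the very same instance as some entry in `visited`.
    public static bool ContainsReference(ReadOnlySpan<object> visited, object candidate)
    {
        foreach (var entry in visited)
        {
            if (ReferenceEquals(entry, candidate))
                return true;
        }

        return false;
    }

    // Compares identity hash codes instead of references. Two distinct live
    // objects are allowed to share a hash code, so a hit here is not proof of identity.
    public static bool ContainsHash(ReadOnlySpan<int> visitedHashes, object candidate)
    {
        int hash = RuntimeHelpers.GetHashCode(candidate);

        foreach (int entry in visitedHashes)
        {
            if (entry == hash)
                return true;
        }

        return false;
    }

    public static void Main()
    {
        var a = new object();
        var b = new object();
        object[] visited = { a };

        Console.WriteLine(ContainsReference(visited, a)); // same instance
        Console.WriteLine(ContainsReference(visited, b)); // different instance
    }
}
```

A hash-based lookup can still be used as a fast pre-filter, provided every hit is then confirmed with `ReferenceEquals` against the actual object.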

Contributor Author

@smoogipoo smoogipoo Nov 1, 2024

Dang, my understanding of this method was incorrect. What you said makes sense in hindsight...

That said, I'm somewhat happy this blocks the PR because I believe the cycle detection can be done better. More tricks to pull from my sleeve :)

@peppy added the blocked label on Nov 12, 2024