Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize concurrent collection's shrink logic #3417

Merged
merged 2 commits into from
Jul 26, 2022

Conversation

wenbingshen
Copy link
Member

@wenbingshen wenbingshen commented Jul 21, 2022

Motivation

Optimize concurrent collection's shrink and clear logic

Changes

  1. Reduce the repeated Arrays.fill in the clear process
  2. When capacity is already equal to initCapacity,rehash should not be executed
  3. Reduce the rehash logic in the clear process
  4. Shrinking must at least ensure initCapacity, so as to avoid frequent shrinking and expansion near initCapacity, frequent shrinking and expansion, additionally opened arrays will consume more memory and affect GC.

If this PR is accepted, I will optimize the same concurrent collection's shrink and clear logic defined in pulsar.

Related to #3061 and #3074

@wenbingshen
Copy link
Member Author

Ping @eolivelli @lordcheng10 @hangc0276 PTAL, Thanks.

@wenbingshen
Copy link
Member Author

rerun failure checks

@wenbingshen
Copy link
Member Author

ping @eolivelli PTAL.

@wenbingshen
Copy link
Member Author

ping @dlg99 PTAL.

@wenbingshen
Copy link
Member Author

ping @shoothzj PTAL.

Copy link
Member

@StevenLuMT StevenLuMT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

// so as to avoid frequent shrinking and expansion near initCapacity,
// frequent shrinking and expansion,
// additionally opened arrays will consume more memory and affect GC
int newCapacity = Math.max(alignToPowerOfTwo((int) (capacity / shrinkFactor)), initCapacity);
Copy link
Member

@zymap zymap Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just skip shrink when the size is smaller than the initCapacity? That would save a lot of computing of alignToPowerOfTwo((int) (capacity / shrinkFactor))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that initCapacity is 256,mapFillFactor, mapIdleFactor, expandFactor, shrinkFactor keep the default values:

initCapacity capacity mapFillFactor mapIdleFactor resizeThresholdUp resizeThresholdBelow expandFactor shrinkFactor
256 256 0.66f 0.15f 168 38 2 2
256 512 0.66f 0.15f 337 76 2 2
256 1024 0.66f 0.15f 675 153 2 2
256 2048 0.66f 0.15f 1351 307 2 2

When the capacity is expanded to 1024, the size needs to be less than 153 to shrink, and size=153 is less than initCapacity=256, causing at least 1024-153 of space wasted. This waste is worse when initCapacity is larger.

Therefore, we cannot use size to judge that it will not shrink when it is smaller than initCapacity.

@zymap PTAL

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This calculation will only happen if size < resizeThresholdBelow, it is worthwhile to use the occasional calculation to relieve the free space that is not being used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your detailed explanation!

@wenbingshen wenbingshen requested a review from zymap July 26, 2022 06:30
@zymap zymap merged commit a580547 into apache:master Jul 26, 2022
@wenbingshen wenbingshen deleted the OptimizeMapShrink branch July 29, 2022 10:15
zymap pushed a commit that referenced this pull request Aug 2, 2022
### Motivation

Optimize concurrent collection's shrink and clear logic

### Changes
1. Reduce the repeated `Arrays.fill` in the clear process
2. When `capacity` is already equal to `initCapacity`,`rehash` should not be executed
3. Reduce the `rehash` logic in the `clear` process
4. Shrinking must at least ensure `initCapacity`, so as to avoid frequent shrinking and expansion near `initCapacity`, frequent shrinking and expansion, additionally opened `arrays` will consume more memory and affect GC.

If this PR is accepted, I will optimize the same `concurrent collection's shrink and clear logic ` defined in pulsar.

Related to #3061 and #3074

(cherry picked from commit a580547)
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024
### Motivation

Optimize concurrent collection's shrink and clear logic

### Changes
1. Reduce the repeated `Arrays.fill` in the clear process
2. When `capacity` is already equal to `initCapacity`,`rehash` should not be executed
3. Reduce the `rehash` logic in the `clear` process
4. Shrinking must at least ensure `initCapacity`, so as to avoid frequent shrinking and expansion near `initCapacity`, frequent shrinking and expansion, additionally opened `arrays` will consume more memory and affect GC.

If this PR is accepted, I will optimize the same `concurrent collection's shrink and clear logic ` defined in pulsar.

Related to apache#3061 and apache#3074
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants