folly::ConcurrentHashMap crashes under high contention #2097
Comments
Thanks for the repro! This issue seems to be due to ConcurrentHashMap not being able to reclaim deleted entries fast enough. As for the second error: your benchmarking code most likely runs without debug compilation, which causes some invariant checks to be skipped, leading to double reclamation and the second error.
Summary:

PROBLEM

Folly ConcurrentHashMaps use hazard pointers to ensure that map entries that were recently removed (using `erase`, `insert_or_assign`, etc.) aren't cleaned up while there are readers for those objects. Instead, they are removed as part of a reclamation process, which typically happens asynchronously.

Moreover, within ConcurrentHashMap, entries are linked to one another, and this linkage needs to be known to the hazard pointer logic to ensure we don't clean up an object that has no direct hazard pointers itself but is referenced by another object that might have hazard pointers. That logic is within `HazptrObjLinked`.

Under high contention (see facebook#2097), the link counting logic can overflow, because a single object has too many dangling links. For example, consider a map that has 2 entries with the same hash code: `(A,0)` and `(B,0)`. Let's assume that `A` is stored before `B` internally within the `ConcurrentHashMap`'s `BucketTable`. `B` initially records that it has 1 link (to `A`). Now, let's assume that we replace `(A,0)` with `(A,1)`. While `(A,0)` is erased from the `ConcurrentHashMap`, it's not immediately reclaimed/deleted. During this interim, `B` has a link count of 2, one for each of the 2 entries of `A`.

This link count is stored as a 16-bit unsigned integer. If the above operation is repeated very quickly, we end up in a situation where `B`'s link count overflows past 65535 and wraps around. This situation is caught in debug compilation (due to `DCHECK`), but in opt builds it results in bad retirements. For example, if `B`'s link count goes past 65535 to 65537 (i.e. wraps to `1`), then when 1 object of `A` is reclaimed, `B`'s link count is decremented from `1` to `0`, causing `B` to be incorrectly retired. If we then actually end up removing all of `A`, the link count underflows from `0` back to `65535` and then counts down to `0` again, causing a double retirement, which is a sign of corruption.
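The wrap-around arithmetic described above can be replayed in isolation. The following is a minimal single-threaded sketch, not folly's code: `LinkCount16` and `retirementsAfter` are hypothetical names, and the counter is a plain `uint16_t` rather than the atomic used by the real hazard pointer logic. It only illustrates how an unchecked 16-bit count reaches zero twice once enough stale links pile up.

```cpp
#include <cstdint>

// Hypothetical stand-in for an unchecked 16-bit link counter (as in an opt build).
struct LinkCount16 {
  std::uint16_t count = 1;  // B starts with one link (to A)

  void acquire() { ++count; }  // silently wraps past 65535

  // Returns true when the count hits zero, i.e. the owner would be retired.
  bool release() { return --count == 0; }
};

// Accumulates `pending` extra links (stale versions of A awaiting reclamation),
// then releases every link, including the original one.
// Returns how many times the owner would be retired.
inline int retirementsAfter(std::uint32_t pending) {
  LinkCount16 links;
  for (std::uint32_t i = 0; i < pending; ++i) {
    links.acquire();
  }
  int retirements = 0;
  for (std::uint32_t i = 0; i < pending + 1; ++i) {
    if (links.release()) {
      ++retirements;
    }
  }
  return retirements;
}
```

With one pending link the owner is retired once, as expected. With 65536 pending links the count wraps to 1, so the first release already reports zero, and the final release reports zero a second time.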
SOLUTION

While the situation is rare, it can arise for skewed data with a lot of contention. There are 3 options to "solve" this:

1. Increase the link count data structure size from 16 bits to something higher. Simple, but a work-around. Eventually, high enough contention would cause bugs to show up there as well.
2. Crash the process when there is very high contention. Maintains the current performance guarantees, and when ConcurrentHashMap cannot meet those guarantees, it causes a fatal error.
3. Slow ConcurrentHashMap erasures under high contention (this diff). Very high contention would cause ConcurrentHashMap to slow down and give reclamation time to act. Functionally `ConcurrentHashMap` remains the same, but it does exhibit different perf characteristics.

In this change, the `HazptrObjLinked` code is changed to disallow overflows, since they lead to corruption, and the callers are responsible for handling cases where links cannot be created. For `ConcurrentHashMap`, we keep waiting until we can acquire a link, which means erasures under high contention are lock-free but not wait-free.

For reclamation, there are buffers within the cohort to store both retired objects (aka `list`) and reclaimed objects (aka `safe list`). In cases where `ConcurrentHashMap` is unable to acquire a link, it's imperative that it tries to initiate a reclamation cycle to make progress. I therefore added a `cleanup()` method within the cohort to flush any existing retired objects to the hazard pointer domain for retirement-evaluation, kick off a reclamation cycle, and also retire any retired objects pending within the cohort.

Differential Revision: D51647789
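The approach in option 3 can be sketched as a counter that refuses to wrap, plus a caller-side retry loop. This is only an illustration under assumptions: `BoundedLinkCount`, `tryAcquire`, and `acquireWithCleanup` are hypothetical names rather than the `HazptrObjLinked` interface, and the `cleanup` callback merely stands in for a cohort reclamation cycle.

```cpp
#include <atomic>
#include <cstdint>
#include <limits>

// Hypothetical bounded link counter; not the real HazptrObjLinked interface.
class BoundedLinkCount {
 public:
  // Tries to add one link. Returns false instead of wrapping when saturated.
  bool tryAcquire() {
    std::uint16_t cur = count_.load(std::memory_order_acquire);
    while (cur != std::numeric_limits<std::uint16_t>::max()) {
      // On failure, compare_exchange_weak reloads `cur` and the loop re-checks it.
      if (count_.compare_exchange_weak(
              cur,
              static_cast<std::uint16_t>(cur + 1),
              std::memory_order_acq_rel,
              std::memory_order_acquire)) {
        return true;
      }
    }
    return false;
  }

  void release() { count_.fetch_sub(1, std::memory_order_acq_rel); }

  std::uint16_t value() const { return count_.load(std::memory_order_acquire); }

 private:
  std::atomic<std::uint16_t> count_{0};
};

// Caller-side handling: when no link can be created, run `cleanup`
// (assumed to trigger reclamation that eventually releases links) and retry.
// Returns the number of acquire attempts made.
template <typename Cleanup>
int acquireWithCleanup(BoundedLinkCount& links, Cleanup&& cleanup) {
  int attempts = 1;
  while (!links.tryAcquire()) {
    cleanup();
    ++attempts;
  }
  return attempts;
}
```

The retry loop only terminates if `cleanup` eventually frees a link, which is why the caller can be slowed down under contention while the counter itself never overflows.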
Summary:

PROBLEM

For linked objects, the ref count and link count are stored in 16-bit integers. It's easily possible to overflow either count.

SOLUTION

Increase the counts to be stored using 32 bits. While this theoretically increases the footprint of the `hazptr_obj_base_linked` struct from 28 bytes to 32 bytes, the increase is small, and in many cases struct padding would cause `hazptr_obj_base_linked` to be 32 bytes anyway.

Note: This is a mitigation for #2097. Theoretically it's possible to increase contention further and cause the issue to recur, but in practice that's hard to do. See #2107 for a potential fix to the underlying issue.

Reviewed By: ot
Differential Revision: D51829892
fbshipit-source-id: f3ef8f7cf245dd7ff0e1ba6f6ee7bb15ead532ef
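The padding argument can be illustrated with two stand-in structs. The layout below is an assumption made for illustration (three pointer-sized header fields followed by the two counts); it is not the actual definition of `hazptr_obj_base_linked`, and the struct names are hypothetical.

```cpp
#include <cstdint>

// Assumed layout: three pointer-sized header fields, then the two counts.
struct LinkedNarrow {
  void* reclaim;
  void* next;
  void* cohortTag;
  std::uint16_t linkCount;  // 16-bit counts: 24 + 4 = 28 bytes of payload
  std::uint16_t refCount;
};

struct LinkedWide {
  void* reclaim;
  void* next;
  void* cohortTag;
  std::uint32_t linkCount;  // 32-bit counts: 24 + 8 = 32 bytes of payload
  std::uint32_t refCount;
};

// On a typical 64-bit target, pointer alignment rounds the narrow struct
// up to the same size as the wide one.
inline bool wideningIsFree() {
  return sizeof(LinkedNarrow) == sizeof(LinkedWide);
}
```

Under this assumed layout and 8-byte pointers, both structs occupy 32 bytes, so widening the counts costs nothing in that configuration.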
When under significant contention and with a specific key distribution, `folly::ConcurrentHashMap` fails checks inside `HazptrObjLinked.h`. I've reproduced this on multiple machines.
OS: Ubuntu 20.04.6 LTS (reproduced on Ubuntu 22.04)
Compiler: GCC 13.1.0 (reproduced on GCC 11.4.0)
The following minimal reproducible example consistently reproduces the bug.
This outputs the following:
We initially encountered this bug while running our own benchmarking code here, in which we consistently get a different crash, but also coming from the hazard pointer code: