-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix] [broker] Fix broker deadlock by making blocking bk-create async during ml-creation #22842
Conversation
… during ml-creation
return future; | ||
}).thenAccept(ml -> callback.openLedgerComplete(ml, ctx)).exceptionally(exception -> { | ||
callback.openLedgerFailed((ManagedLedgerException) exception.getCause(), ctx); | ||
return null; | ||
}); | ||
} | ||
|
||
private CompletableFuture<BookKeeper> createBookKeeperClient(ManagedLedgerConfig config) { | ||
CompletableFuture<BookKeeper> future = new CompletableFuture<>(); | ||
scheduledExecutor.execute(() -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the culprit of the issue is in the BookkeeperFactoryForCustomEnsemblePlacementPolicy
interface, because it effectively masks a potentially blocking operation behind a method that does not look blocking, and doesn't even declare an exception.
I think a better solution here would to make the interface BookkeeperFactoryForCustomEnsemblePlacementPolicy
to return a CompletableFuture of a BK client instead of using an executor here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the culprit of the issue is in the BookkeeperFactoryForCustomEnsemblePlacementPolicy interface
Yes. But that's something that depends on other system which is BookKeeper in this case, Upgrading bk version will not be possible for all Pulsar stable version.
Let's merge this patch and I can make bookkeeper enhancement to fix this issue for future version.
@rdhabalia I posted a fix in #22846. PTAL |
I added a comment on #22846, that fix may not work when we want to strong bookie affinity. |
also, please let us know if you want us to just create an issue with info so, we don't waste our time in creating PR. I don't think it's a correct way to review and merge something without any discussion. I feel it's not a good practice. |
Fixes #22840
Main Issue: #22840
Motivation
This PR addresses the second issue discussed in #22840
BookKeeper client creation path has a blocking call and blocks managed-ledger creation, and that eventually causes the deadlock. therefore, managed-ledger creation must create bk-client asynchronously.
This PR has unit-test which fails with single thread and it can be fixed by providing multiple threads into callback processing.
Modifications
Allow metadata-store to use multiple callback threads to avoid deadlock situation in broker.
Verifying this change
(Please pick either of the following options)
This change is a trivial rework / code cleanup without any test coverage.
(or)
This change is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: