Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve] Do not process acks in the Netty thread #22793

Closed
wants to merge 2 commits into from

Conversation

dlg99
Copy link
Contributor

@dlg99 dlg99 commented May 28, 2024

Motivation

Cherry-picked from internal @eolivelli work.
Do not block netty threads if cursor has a lot of individuallyDeletedMessages or needs to compress the PositionInfo (separate change)

Modifications

Run acks on pulsar executor

Verifying this change

  • Make sure that the change passes the CI checks.

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

NO

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: dlg99#16

@dlg99 dlg99 requested a review from eolivelli May 28, 2024 23:22
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label May 28, 2024
// which is the best thread ?
return CompletableFuture.runAsync(() -> {
acknowledgeMessage(positions, ackType, properties);
}, topic.getBrokerService().pulsar().getExecutor());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that an ordered executor (or pinned single thread executor) should be used so that all acknowledgements for a single subscription/cursor or topic get processed in a single thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lhotari something like topic.getBrokerService().getTopicOrderedExecutor().chooseThread(topicName)?
I'd love to avoid adding yet another executor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dlg99 yes makes sense. I guess that the thread selection should involve the subscription name so that acknowledgements for a single subscription get handled by a single thread so that there would be less contention.

@lhotari
Copy link
Member

lhotari commented May 29, 2024

Do not block netty threads if cursor has a lot of individuallyDeletedMessages or needs to compress the PositionInfo (separate change)

@dlg99 @eolivelli Do you have any performance results to share? What is the improvement when this PR is applied? In what type of use cases does this have an impact?

@eolivelli
Copy link
Contributor

I don't have numbers to share because we did this work in the scope of a case in which an application generates huge lists of individuallyDeletedMessages ranges (millions).

We have other patches to share that will allow compressing the cursor PositionInfo while writing to BookKeeper as well (and also chunk the payload if it doesn't fit 5MB).
We also have some other patches to improve the memory overhead caused by Protobuf (it generates tons of garbage on the heap)

As the serialization happens when you call "subscription.acknowledge(xx)" you can see the netty thread blocked for milliseconds.

@dlg99
Copy link
Contributor Author

dlg99 commented May 29, 2024

@lhotari @eolivelli any objections to merging this? (PR needs an approval)

@lhotari
Copy link
Member

lhotari commented May 29, 2024

@lhotari @eolivelli any objections to merging this? (PR needs an approval)

I do have concerns. With asynchronous methods, there's usually a problem with back pressure. (another example: #22541 (comment)) Perhaps it's not a real issue for acknowledgements.
Has this solution been proven with heavy testing and/or production usage?

Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the premise that the code is "blocking the IO thread" is not super accurate.
There is no blocking behavior here (eg: waiting for some other operation to complete). It might be doing some significant amount of work and consume non-trivial amount of CPU, though that would be happening anyway in the other thread.

At the same time, this change will introduce an extra context switch for everyone else, even when amount of work to do for the ack is 0.

@dlg99 dlg99 closed this Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc-not-needed Your PR changes do not impact docs ready-to-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants