[fix] [broker] Part-1: Replicator can not created successfully due to an orphan replicator in the previous topic owner #21946
@poorbarcode Does this PR fix the issue mentioned in #21203?
Yes, the current PR also fixes the issue that #21203 tries to fix.
Is it possible to add a test to cover this case?
It also looks like we can simplify the fix by adding a new method `terminate()` to the replicator, so that we don't need to mix the `closeProducer` and `closeReplicator` logic.
… an orphan replicator in the previous topic owner (apache#21946)
Because there are too many conflicts and there are no new releases for
… an orphan replicator in the previous topic owner (apache#21946) (cherry picked from commit 4924052) (cherry picked from commit 670aff0)
Motivation
There is a race condition that leaves an orphan replicator on the original owner of a topic. As a result, the new owner of the topic cannot start a replicator and fails with `org.apache.pulsar.broker.service.BrokerServiceException$NamingException Producer with name 'pulsar.repl.{local_cluster}-->{remote_cluster}' is already connected to topic`.
- Scenario 1
- Scenario 2: `replication_clusters`

The current PR focuses on Scenario 1.
Steps of Scenario 1
- `thread start replicator`
- `unload bundle`
- `pulsar.repl`
- closing
- `replicator.disconnect`
- `replicator.stat --> Stopped`
- `replicator.stat --> Starting`
- `replicator.stat --> Started`
- `readMoreEntries`: since there are no entries to read, this request just stays pending
- `pulsar.repl`
- `Producer with name 'pulsar.repl.{local_cluster}-->{remote_cluster}' is already connected to topic`
Modifications
- Split `Replicator.State.Stopped` into `Producer_Stopped` and `Closed`.
- Add a method `terminate` to close the Replicator; `disconnect` is only used to close the internal producer.

A case that hit this issue
Picture-1: An orphan producer was left in the old broker; it is not associated with any topic/replicator.
Picture-2: After the topic is transferred to the new broker, it cannot start a new Replicator successfully.
Since the scenario is too complex, I could not add a test, but I reproduced Scenario 1 locally.
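The state split described under Modifications can be illustrated with a minimal, hypothetical sketch. This is not the code in `AbstractReplicator.java`: the class name `ReplicatorStateSketch` and the methods `startProducer` and `onProducerReady` are assumptions made for illustration, and only the state names and the `disconnect`/`terminate` distinction come from the description above. The idea is that once the replicator is `Closed`, a late "producer started" callback can no longer move it back to `Started`, which is how the orphan in Scenario 1 would otherwise appear.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch only; names other than the states, disconnect and terminate are assumptions.
class ReplicatorStateSketch {
    // Producer_Stopped and Closed replace the single Stopped state;
    // Starting and Started appear in the Scenario 1 steps.
    enum State { Producer_Stopped, Starting, Started, Closed }

    private final AtomicReference<State> state =
            new AtomicReference<>(State.Producer_Stopped);

    State getState() {
        return state.get();
    }

    // Begin starting the internal producer. Only allowed from Producer_Stopped,
    // so a Closed replicator is never revived by a late start request.
    boolean startProducer() {
        return state.compareAndSet(State.Producer_Stopped, State.Starting);
    }

    // Producer creation finished. Fails if disconnect() or terminate() ran in the meantime.
    boolean onProducerReady() {
        return state.compareAndSet(State.Starting, State.Started);
    }

    // Close only the internal producer; the replicator may be started again later.
    boolean disconnect() {
        State current;
        do {
            current = state.get();
            if (current == State.Closed) {
                return false; // already terminated, nothing to do
            }
        } while (!state.compareAndSet(current, State.Producer_Stopped));
        return true;
    }

    // Close the replicator for good; no transition leaves Closed.
    void terminate() {
        state.set(State.Closed);
    }

    public static void main(String[] args) {
        ReplicatorStateSketch r = new ReplicatorStateSketch();
        r.startProducer(); // Producer_Stopped -> Starting
        r.terminate();     // topic is being unloaded: -> Closed
        // The late callback from producer creation cannot flip the state to Started.
        System.out.println(r.onProducerReady() + " " + r.getState());
    }
}
```

With a single `Stopped` state, the start path cannot tell "producer closed, may restart" from "replicator closed for good"; keeping them separate lets the start path refuse to run after `terminate`, while `disconnect` stays restartable.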
#21948 fixes the following issues:
- `topic.unfenceTopicToResume` after `topic.close` failed.

Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: x