[ISSUE #5989] Support unique broker-id as identification in controller mode #6100
broker/src/main/java/org/apache/rocketmq/broker/controller/ReplicasManager.java
broker/src/main/java/org/apache/rocketmq/broker/controller/ReplicasManager.java
store/src/main/java/org/apache/rocketmq/store/ha/autoswitch/TempBrokerMetadata.java
broker/src/main/java/org/apache/rocketmq/broker/controller/ReplicasManager.java
broker/src/main/java/org/apache/rocketmq/broker/controller/ReplicasManager.java
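Is there a way for this design to support a reasonably compatible upgrade? For example:
1. Upgrade the controller component first (this may require stopping all controllers and deleting their data before upgrading). Once that is done, brokers cannot take part in elections but still work normally (the minimum requirement).
2. Upgrade the Broker component, with a guarantee that brokers come back online normally after the upgrade and no data is lost (ideally the master-slave relationship is preserved as well).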
.../java/org/apache/rocketmq/remoting/protocol/header/namesrv/BrokerHeartbeatRequestHeader.java
broker/src/main/java/org/apache/rocketmq/broker/controller/ReplicasManager.java
controller/src/main/java/org/apache/rocketmq/controller/impl/manager/BrokerReplicaInfo.java
remoting/src/main/java/org/apache/rocketmq/remoting/protocol/RequestCode.java
tools/src/main/java/org/apache/rocketmq/tools/command/controller/ReElectMasterSubCommand.java
@TheR1sing3un As a follow-up, please update the corresponding documents (Chinese and English) under https://github.com/apache/rocketmq/tree/develop/docs/cn/controller
Codecov Report
@@ Coverage Diff @@
## develop #6100 +/- ##
=============================================
- Coverage 43.14% 43.02% -0.13%
- Complexity 8860 8889 +29
=============================================
Files 1094 1103 +9
Lines 77284 77718 +434
Branches 10085 10115 +30
=============================================
+ Hits 33347 33437 +90
- Misses 39771 40091 +320
- Partials 4166 4190 +24
... and 16 files with indirect coverage changes
...oller/src/main/java/org/apache/rocketmq/controller/processor/ControllerRequestProcessor.java
store/src/main/java/org/apache/rocketmq/store/config/MessageStoreConfig.java
broker/src/main/java/org/apache/rocketmq/broker/controller/ReplicasManager.java
broker/src/main/java/org/apache/rocketmq/broker/controller/ReplicasManager.java
LGTM
for (int retryTimes = 0; retryTimes < 5; retryTimes++) {
    if (register()) {
        LOGGER.info("First time register broker success");
        this.state = State.REGISTER_TO_CONTROLLER_DONE;
        break;
    }
}
Are consecutive retries really needed here? If registration fails, we could simply wait for startBasicService to be run again.
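A minimal sketch of what this suggestion could look like, assuming one registration attempt per `startBasicService` run and a caller that re-runs it after a short random pause (the commit list further down mentions a random sleep within one second on failure). `RegisterOnceSketch`, `runUntilStarted`, and the `BooleanSupplier` standing in for `register()` are hypothetical names for illustration, not the actual `ReplicasManager` code from this PR.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

/** Hypothetical illustration only; not the ReplicasManager implementation. */
final class RegisterOnceSketch {
    enum State { INITIAL, REGISTER_TO_CONTROLLER_DONE }

    private State state = State.INITIAL;
    // Stand-in for the real register() call (assumption for this sketch).
    private final BooleanSupplier register;

    RegisterOnceSketch(BooleanSupplier register) {
        this.register = register;
    }

    /** Makes at most one registration attempt; returns false so the caller can re-run later. */
    boolean startBasicService() {
        if (state == State.INITIAL) {
            if (!register.getAsBoolean()) {
                return false; // no inner retry loop
            }
            state = State.REGISTER_TO_CONTROLLER_DONE;
        }
        return true;
    }

    State state() {
        return state;
    }

    /**
     * Re-runs startBasicService up to maxRuns times, pausing for a random
     * duration below maxPauseMillis between failed runs.
     * Returns the number of runs needed, or -1 if it never succeeded.
     */
    static int runUntilStarted(RegisterOnceSketch service, int maxRuns, long maxPauseMillis) {
        for (int run = 1; run <= maxRuns; run++) {
            if (service.startBasicService()) {
                return run;
            }
            if (maxPauseMillis > 0) {
                try {
                    Thread.sleep(ThreadLocalRandom.current().nextLong(maxPauseMillis));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated register() that fails twice before succeeding.
        RegisterOnceSketch service = new RegisterOnceSketch(() -> ++calls[0] >= 3);
        int runs = runUntilStarted(service, 5, 1000);
        System.out.println("runs needed: " + runs + ", state: " + service.state());
    }
}
```

Compared with the fixed five-attempt loop, this keeps each run cheap and leaves the retry cadence to whatever schedules `startBasicService`.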
controller/src/main/java/org/apache/rocketmq/controller/impl/manager/ReplicasInfoManager.java
controller/src/main/java/org/apache/rocketmq/controller/impl/manager/ReplicasInfoManager.java
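### Notes on upgrading to the persistent BrokerID version

The current version supports a high-availability architecture built on the new persistent BrokerID. Note the following when upgrading to it from an earlier 5.x version.

Upgrades from 4.x only need to follow the process described above.
To upgrade from a 5.x version without persistent BrokerID to the persistent BrokerID version, follow this process:

**Upgrade the Controller**

1. Shut down the old Controller group.
2. Clear the Controller data, i.e. the data files located under `~/DLedgerController` by default.
3. Bring the new Controller group online.

> During the Controller upgrade above, Brokers keep running normally but cannot perform master-slave switching.

**Upgrade the Broker**

1. Shut down the Broker slave nodes.
2. Shut down the Broker master node.
3. Delete the epoch files of all Brokers, by default `~/store/epochFileCheckpoint` and `~/store/epochFileCheckpoint.bak`.
4. Bring the original master Broker online first and wait for it to be elected master. (You can observe this with `getSyncStateSet` from the `admin` command.)
5. Bring all of the original slave Brokers online.

> We recommend stopping the slaves before the master, and starting the original master before the original slaves; this preserves the original master-slave relationship.
> If you need to change the master-slave relationship across the upgrade, make sure the CommitLogs of the master and slaves are aligned at shutdown; otherwise data may be truncated and lost.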
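The epoch-file deletion step in the Broker procedure above could be scripted. The following is only a sketch under stated assumptions: the two file names come from the default paths quoted in that step, while `EpochFileCleanupSketch` and `deleteEpochFiles` are hypothetical helpers and not part of RocketMQ. It should only be run while the Broker is stopped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for the upgrade procedure; not shipped with RocketMQ. */
final class EpochFileCleanupSketch {
    // File names taken from the default paths quoted in the upgrade steps.
    static final String[] EPOCH_FILES = {"epochFileCheckpoint", "epochFileCheckpoint.bak"};

    /** Deletes the epoch files directly under storeDir and returns those that were removed. */
    static List<Path> deleteEpochFiles(Path storeDir) {
        List<Path> deleted = new ArrayList<>();
        for (String name : EPOCH_FILES) {
            Path file = storeDir.resolve(name);
            try {
                if (Files.deleteIfExists(file)) {
                    deleted.add(file);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        // The store directory is passed explicitly so nothing is deleted by accident.
        if (args.length != 1) {
            System.err.println("usage: EpochFileCleanupSketch <storeDir>");
            return;
        }
        System.out.println("Deleted: " + deleteEpochFiles(Paths.get(args[0])));
    }
}
```

Other files in the store directory, including the CommitLog, are left untouched; only the two named checkpoint files are removed.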
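I suggest writing a separate document that covers the background of the brokerId persistence design (the problem it solves), the design ideas, and the compatible upgrade plan, and then linking to it from the deployment document.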
6238611 to 77745b4
…ller mode (apache#5046)
* refactor(controller): refactor the register logic
  1. refactor the register logic
* refactor(controller): remove unused field in ElectMasterEvent
  1. remove unused field in ElectMasterEvent
* feat(controller): add a tryElectMaster request and process logic about it
  1. add a tryElectMaster request and process logic about it
* feat(controller): refactor ReplicasInfoManagerTest
  1. refactor ReplicasInfoManagerTest
* refactor(controller): refactor DLedgerControllerTest
  1. refactor DLedgerControllerTest
* refactor(controller): refactor ControllerManagerTest
  1. refactor ControllerManagerTest
* refactor(controller): refactor ReplicasInfoManagerTest
  1. refactor ReplicasInfoManagerTest
* refactor(controller): refactor register process and pass the junit test
  1. refactor ReplicasInfoManagerTest
* style(broker): rename a constant
  1. rename a constant
* feat(controller): update the DLedger dependency from v0.27 to v0.30
  1. update the DLedger dependency from v0.27 to v0.30
* style(controller): add a white-line just for trigger GitHub action again
  1. add a white-line just for trigger GitHub action again
* feat(controller): combine electMaster api and brokerTryElectMaster api
  1. combine electMaster api and brokerTryElectMaster api
* feat(controller): add a logic about verifying the broker id returned from registering
  1. add a logic about verifying the broker id returned from registering
* fix(controller): remove unused code and add a warning log in ControllerManager
  1. remove unused code and add a warning log in ControllerManager
* fix: resolve conflicts
  1. resolve conflicts
* fix(controller): remove unused class
  1. remove unused class
* fix(controller): Resolve conflicts after merging
  1. Resolve conflicts after merging
* refactor(controller): Refactor ReplicasInfoManager#elect
  1. Refactor ReplicasInfoManager#elect
* style(controller): remove unused imports
  1. remove unused imports
* style(controller): remove unused imports
  1. remove unused imports
* fix(controller): resolve conflicts after merging develop branch
  1. resolve conflicts after merging develop branch
* rerun
* fix(controller): resolve conflicts in ReplicasInfoManagerTest#testRegisterNewBroker after merging develop branch
  1. resolve conflicts in ReplicasInfoManagerTest#testRegisterNewBroker after merging develop branch
* style(controller): pass style check
  1. pass style check
…p address to broker id 1. refactor broker's information recording core from ip address to broker id
1. add protocols about new register flow
1. refactor code in module: store/ha for persistence broker-id
1. implement the general register to controller protocol
1. add docs about how to update to BrokerId version
…g#storePathTempMetadata` 1. remove meaningless attribute `MessageStoreConfig#storePathTempMetadata`
1. check metadata if valid when register
… to name server 1. set isolate's value to false to normally register broker to name server
1. Random sleep within one second when broker register failed
1. rename registerSuccess to registerBrokerToController
…ig back to default value 1. fix forgetting set the changed cluster name broker config back to default value
1. add more logs when broker register to controller
1. fix wrong log
1. fix wrong test
1. fix incompatible command: CleanBrokerMeta
1. fix forgetting initialize BrokerHeartbeatManager
…nagerRegisterTest 1. add more logs when register and refactor ReplicasManagerRegisterTest
1. optimize some test base store path
1. fix conflicts after rebase
1. fix conflicts in test after rebase
77745b4 to b2c7421
…id the controller to notify the brokers when their roles have been changed. 1. To pass `ControllerManagerTest` in Windows, we forbid the controller to notify the brokers when their roles have been changed.
Rebase and merge this PR because its commit log is clear and huge.
Make sure to set the target branch to `develop`.
What is the purpose of the change
fix #5989
Brief changelog
XX
Verifying this change
XXXX
Follow this checklist to help us incorporate your contribution quickly and easily. Notice, it would be helpful if you could finish the following 5 checklist items (the last one is not necessary) before requesting the community to review your PR.
`[ISSUE #123] Fix UnknownException when host config not exist`. Each commit in the pull request should have a meaningful subject line and body.
Run `mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle` to make sure basic checks pass. Run `mvn clean install -DskipITs` to make sure unit-test pass. Run `mvn clean test-compile failsafe:integration-test` to make sure integration-test pass.