fix: upgrades_sentinel #1949

Merged
merged 2 commits into from
Sep 7, 2023

@Mixficsol Mixficsol commented Aug 30, 2023

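Background

Pika's current sentinel promotes a slave to master after the master node goes down, but when the failed master comes back up it still reports itself as a master. This needs to be fixed. The current sentinel checks the state of every master and slave once every 5 seconds, and checks the state of pre-offline hosts once every second, to decide whether to perform an automatic master-slave switch.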
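
The offline decision described above can be sketched roughly as a failure counter with a threshold. This is a minimal illustration, not the Codis implementation: the `server` type, the state strings, and `recordProbe` are hypothetical names, and only `ReCallTimes` and the idea of a configured dead-check count (`SentinelMasterDeadCheckTimes` in the diff) come from this pull request.

```go
package main

import "fmt"

// Hypothetical state names for illustration only.
const (
	stateNormal     = "normal"
	statePreOffline = "pre-offline"
	stateOffline    = "offline"
)

// server is a simplified stand-in for a group server entry.
type server struct {
	Addr        string
	State       string
	ReCallTimes int // consecutive failed probes
}

// recordProbe applies the result of one health probe. It returns true only on
// the probe that moves the server into the offline state, which is the moment
// a failover would be triggered.
func recordProbe(s *server, alive bool, deadCheckTimes int) bool {
	if alive {
		s.State = stateNormal
		s.ReCallTimes = 0
		return false
	}
	s.ReCallTimes++
	if s.ReCallTimes >= deadCheckTimes {
		justWentOffline := s.State != stateOffline
		s.State = stateOffline
		return justWentOffline
	}
	s.State = statePreOffline
	return false
}

func main() {
	master := &server{Addr: "127.0.0.1:9221", State: stateNormal}
	// One successful probe followed by three failures, with a threshold of 3.
	for i, alive := range []bool{true, false, false, false} {
		failover := recordProbe(master, alive, 3)
		fmt.Printf("probe %d: state=%s failures=%d failover=%v\n",
			i+1, master.State, master.ReCallTimes, failover)
	}
}
```

In this sketch a pre-offline host would be re-probed on the faster 1-second cycle, while healthy hosts stay on the 5-second cycle; the scheduling itself is left out.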

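Proposed improvement

For a master node that has already gone offline, we perform a logical delete: its ReplicaGroup flag is set to false so that traffic is no longer routed to the old master node.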
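
A minimal sketch of the logical-delete idea, assuming a routing layer that only sends traffic to servers whose `ReplicaGroup` flag is true. The `groupServer` type and the `logicallyRemove` and `routable` helpers are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// groupServer is a simplified stand-in for one server entry in a group.
type groupServer struct {
	Addr         string
	ReplicaGroup bool // assumption: false means "do not route traffic here"
}

// logicallyRemove clears the ReplicaGroup flag of the server with the given
// address. The entry stays in the slice, so it can be re-enabled later.
func logicallyRemove(servers []groupServer, addr string) bool {
	for i := range servers {
		if servers[i].Addr == addr {
			servers[i].ReplicaGroup = false
			return true
		}
	}
	return false
}

// routable returns the addresses that are still eligible for traffic.
func routable(servers []groupServer) []string {
	var out []string
	for _, s := range servers {
		if s.ReplicaGroup {
			out = append(out, s.Addr)
		}
	}
	return out
}

func main() {
	servers := []groupServer{
		{Addr: "10.0.0.1:9221", ReplicaGroup: true}, // old master
		{Addr: "10.0.0.2:9221", ReplicaGroup: true},
		{Addr: "10.0.0.3:9221", ReplicaGroup: true},
	}
	logicallyRemove(servers, "10.0.0.1:9221")
	fmt.Println("still in group:", len(servers))
	fmt.Println("routable:", routable(servers))
}
```

The point of flipping a flag rather than removing the entry is that the old master remains visible in the group and can be brought back as a slave once it has resynchronized.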

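Questions

  • Without Codis, if there is a Pika cluster with one master and three slaves and the master is killed manually, is a slave automatically promoted to master?

    No.

  • With Codis, given a Pika cluster of one master and three slaves in one Group, if the master is killed manually, does the sentinel automatically promote a slave to master? If so, what state is the old master in when it reconnects? Do the other slaves need slaveof run manually before they can sync data?

    A slave is promoted automatically. The old master comes back in the Master state. Codis issues commands that make the other nodes run slaveof, so no manual step is needed.

  • If Codis performs the master switch automatically, and ignoring the case where new nodes have just been added to this Group, then when the slaves are triggered to sync from the new master, is the old master the only node that does not sync?

    Yes.

  • Is that because the role in the old master's info replication output is Master, so it cannot sync?

    Yes.

  • But the old master's role is still Master. Where should its role be changed (at the Pika level or the Codis level)?

    At the Codis level. Codis sends a slaveof command to Pika so that the old master's role switches to slave.

  • Why not use the approach behind the sentinel button on Codis's built-in fe?

    Codis's built-in sentinel is fairly passive, whereas the sentinel we have implemented actively checks the nodes in each Group every 5 seconds.

  • During the master switch, how is the new master chosen from the slaves? Is it the one with the largest offset?

    Node health is checked, and the healthy node with the largest offset becomes the new master.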
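
The selection rule in the last answer can be sketched as follows. This is an illustration under stated assumptions, not the code from this pull request: the `replica` type, its fields, and `pickNewMaster` are hypothetical, and ties on offset are broken by list order, which the discussion does not specify.

```go
package main

import "fmt"

// replica is a simplified view of one slave as seen by the sentinel.
type replica struct {
	Addr    string
	Healthy bool  // result of the most recent health check
	Offset  int64 // replication offset reported by the node
}

// pickNewMaster returns the healthy replica with the largest offset.
// The boolean is false when no healthy replica exists.
func pickNewMaster(replicas []replica) (replica, bool) {
	best := -1
	for i, r := range replicas {
		if !r.Healthy {
			continue
		}
		if best < 0 || r.Offset > replicas[best].Offset {
			best = i
		}
	}
	if best < 0 {
		return replica{}, false
	}
	return replicas[best], true
}

func main() {
	replicas := []replica{
		{Addr: "10.0.0.2:9221", Healthy: true, Offset: 1200},
		{Addr: "10.0.0.3:9221", Healthy: false, Offset: 5000}, // most data, but down
		{Addr: "10.0.0.4:9221", Healthy: true, Offset: 1500},
	}
	if m, ok := pickNewMaster(replicas); ok {
		fmt.Println("new master:", m.Addr)
	} else {
		fmt.Println("no healthy replica available")
	}
}
```

After a candidate is chosen, the remaining nodes, including the old master once it is reachable again, would be pointed at it with slaveof, as described in the answers above.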

close: #1850

@Mixficsol Mixficsol changed the title fix: upgrades_sentinel WIP: upgrades_sentinel Aug 30, 2023
@@ -72,6 +72,7 @@ func (s *Topom) CheckAndSwitchSlavesAndMasters(filter func(index int, g *models.
if g.Servers[0].ReCallTimes >= s.Config().SentinelMasterDeadCheckTimes {
// Mark enters objective offline state
g.Servers[0].State = models.GroupServerStateOffline
g.Servers[0].IsOnceGroupMaster = true

Set ReplicaGroup to false.

IsOnceGroupMaster is not needed.

done

@Mixficsol Mixficsol changed the title WIP: upgrades_sentinel fix: upgrades_sentinel Aug 31, 2023
@chejinge chejinge merged commit b87a861 into OpenAtomFoundation:unstable Sep 7, 2023
11 checks passed
@Mixficsol Mixficsol deleted the Upgrades_sentinel branch September 25, 2023 09:18
bigdaronlee163 pushed a commit to bigdaronlee163/pika that referenced this pull request Jun 8, 2024
cheniujh pushed a commit to cheniujh/pika that referenced this pull request Sep 24, 2024
Linked issue: Upgrade the Codis sentinel functionality
4 participants