[Doc] Update Meta Recovery (backport #51661) #51778

Merged 1 commit on Oct 11, 2024
47 changes: 43 additions & 4 deletions docs/en/administration/Meta_recovery.md
@@ -14,6 +14,7 @@ Generally, you may have to resort to metadata recovery when and only when one of

- [FE fails to restart](#fe-fails-to-restart)
- [FE fails to provide services](#fe-fails-to-provide-services)
- [Recover metadata on a new FE node with metadata backup](#recover-metadata-on-a-new-fe-node-with-metadata-backup)

Check the issue you encountered, follow the solution provided in the corresponding section, and perform any recommended actions.

@@ -124,7 +125,7 @@ You can follow these steps to fix this issue:
ALTER SYSTEM DROP FOLLOWER "<follower_host>:<follower_edit_log_port>";

-- To drop an Observer node, replace <observer_host> with the IP address (priority_networks)
-   -- of the Oberver node, and replace <observer_edit_log_port> (Default: 9010) with
+   -- of the Observer node, and replace <observer_edit_log_port> (Default: 9010) with
-- the Observer node's edit_log_port.
ALTER SYSTEM DROP OBSERVER "<observer_host>:<observer_edit_log_port>";
```
@@ -540,7 +541,7 @@ Follow these steps to recover the metadata:

</TabItem>

-   <TabItem value="oberserver" label="Proceed with Observer Node" >
+   <TabItem value="observer" label="Proceed with Observer Node" >

If an Observer node has the latest metadata, perform the following operations:

@@ -579,7 +580,7 @@ Follow these steps to recover the metadata:
ALTER SYSTEM DROP FOLLOWER "<follower_host>:<follower_edit_log_port>";

-- To drop an Observer node, replace <observer_host> with the IP address (priority_networks)
-   -- of the Oberver node, and replace <observer_edit_log_port> (Default: 9010) with
+   -- of the Observer node, and replace <observer_edit_log_port> (Default: 9010) with
-- the Observer node's edit_log_port.
ALTER SYSTEM DROP OBSERVER "<observer_host>:<observer_edit_log_port>";
```
@@ -649,7 +650,7 @@ Follow these steps to recover the metadata:
ALTER SYSTEM DROP FOLLOWER "<follower_host>:<follower_edit_log_port>";

-- To drop an Observer node, replace <observer_host> with the IP address (priority_networks)
-   -- of the Oberver node, and replace <observer_edit_log_port> (Default: 9010) with
+   -- of the Observer node, and replace <observer_edit_log_port> (Default: 9010) with
-- the Observer node's edit_log_port.
ALTER SYSTEM DROP OBSERVER "<observer_host>:<observer_edit_log_port>";
```
@@ -672,6 +673,44 @@ Follow these steps to recover the metadata:

After all nodes are added back to the cluster, the metadata is successfully recovered.

## Recover metadata on a new FE node with metadata backup

Follow these steps if you want to start a new FE node with the metadata backup:

1. Copy the backup metadata directory `meta_dir` to the new FE node.
2. In the configuration file of the FE node, set `bdbje_reset_election_group` to `true`.

```Properties
bdbje_reset_election_group = true
```

3. Start the FE node.

```Bash
# Replace <fe_ip> with the IP address (priority_networks)
# of the new FE node, and replace <fe_edit_log_port> (Default: 9010) with
# the new FE node's edit_log_port.
./fe/bin/start_fe.sh --helper <fe_ip>:<fe_edit_log_port> --daemon
```

4. Check whether the current FE node is the Leader FE node.

```SQL
SHOW FRONTENDS;
```

If the field `Role` is `LEADER`, this FE node is the Leader FE node. Make sure its IP address is that of the current FE node.

5. If the data and metadata are intact and the node's role is Leader, remove the configuration item `bdbje_reset_election_group` from the configuration file and restart the node.
6. You have now successfully started a new Leader FE node with the metadata backup. You can add new Follower nodes using the new Leader FE node as the helper.

```Bash
# Replace <leader_ip> with the IP address (priority_networks)
# of the Leader FE node, and replace <leader_edit_log_port> (Default: 9010) with
# the Leader FE node's edit_log_port.
./fe/bin/start_fe.sh --helper <leader_ip>:<leader_edit_log_port> --daemon
```
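
Steps 2 and 5 above come down to adding one line to the FE configuration file and later removing it. The following is a minimal sketch of that toggle, not part of the documented procedure: the default path `./fe/conf/fe.conf` and the helper names `enable_reset_flag` and `disable_reset_flag` are assumptions chosen for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: the FE_CONF default path and both function names are
# illustrative assumptions, not part of StarRocks.
FE_CONF="${FE_CONF:-./fe/conf/fe.conf}"

# Step 5: remove every line that sets bdbje_reset_election_group.
# Other keys and comment lines are left untouched.
disable_reset_flag() {
  # grep -v exits non-zero when it selects no lines, hence "|| true".
  grep -v -E '^[[:space:]]*bdbje_reset_election_group[[:space:]]*=' \
    "$FE_CONF" > "$FE_CONF.tmp" || true
  mv "$FE_CONF.tmp" "$FE_CONF"
}

# Step 2: make sure the file contains exactly one active
# "bdbje_reset_election_group = true" line.
enable_reset_flag() {
  disable_reset_flag
  printf '%s\n' 'bdbje_reset_election_group = true' >> "$FE_CONF"
}
```

Call `enable_reset_flag` before starting the node in step 3, and `disable_reset_flag` before the restart in step 5. Because `enable_reset_flag` removes any existing setting first, running it twice does not duplicate the line.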

## Metadata recovery-related configurations

:::tip
39 changes: 38 additions & 1 deletion docs/zh/administration/Meta_recovery.md
@@ -14,6 +14,7 @@ import TabItem from '@theme/TabItem';

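- [FE node fails to start](#fe-节点无法启动)
- [FE node fails to provide services](#fe-节点无法提供服务)
- [Recover metadata on a new FE node with metadata backup](#基于备份在新-fe-节点恢复元数据)

Identify the issue you encountered and follow the corresponding solution. We recommend performing the actions recommended in this document.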

@@ -538,7 +539,7 @@ jstat -gcutil pid 1000 1000

</TabItem>

-   <TabItem value="oberserver" label="Proceed with Observer Node" >
+   <TabItem value="observer" label="Proceed with Observer Node" >

If the node with the latest metadata is an Observer, perform the following operations:

@@ -664,6 +665,42 @@ jstat -gcutil pid 1000 1000

After all nodes are added back to the cluster, the metadata is successfully recovered.

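## Recover metadata on a new FE node with metadata backup

Follow these steps if you want to start a new FE node with the metadata backup:

1. Copy the backup metadata directory `meta_dir` to the new FE node.
2. In the configuration file of the FE node, add the configuration item `bdbje_reset_election_group` and set it to `true`.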

```Properties
bdbje_reset_election_group = true
```

3. Start the FE node.

```Bash
# Replace <fe_ip> with the IP address (priority_networks) of the new FE node,
# and replace <fe_edit_log_port> (Default: 9010) with the new FE node's edit_log_port.
./fe/bin/start_fe.sh --helper <fe_ip>:<fe_edit_log_port> --daemon
```

4. Check whether the current node is the Leader node.

```SQL
SHOW FRONTENDS;
```

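If the field `Role` is `LEADER`, this FE node is the Leader FE node. Make sure the IP address returned is that of the current FE node.

5. If the data and metadata are intact and the node's role is Leader, remove the previously added configuration item `bdbje_reset_election_group` and restart the node.
6. You have now successfully started a new Leader FE node from the metadata backup. You can add Follower nodes using the new Leader FE node as the helper.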

```Bash
# Replace <leader_ip> with the IP address (priority_networks) of the Leader FE node,
# and replace <leader_edit_log_port> (Default: 9010) with the Leader FE node's edit_log_port.
./fe/bin/start_fe.sh --helper <leader_ip>:<leader_edit_log_port> --daemon
```
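
The Leader check in step 4 can also be scripted. The sketch below is an illustration under stated assumptions, not a documented tool. It assumes that the `SHOW FRONTENDS` result is fetched with the `mysql` client in batch mode (tab-separated, with a header row), that the result has columns named `IP` and `Role`, and that the FE query port is 9030. The function name `leader_ip` and the variable `FE_IP` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: leader_ip is a hypothetical helper, and the column names
# "IP" and "Role" are assumed to appear in the header row.

# Read tab-separated SHOW FRONTENDS output (header row first) on stdin
# and print the IP of every row whose Role column is LEADER.
leader_ip() {
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }  # map header names to positions
    $(col["Role"]) == "LEADER" { print $(col["IP"]) }
  '
}

# Example invocation (assumes the mysql client is installed and FE_IP is
# set to the address of the new FE node):
#   mysql -h "$FE_IP" -P 9030 -u root --batch -e 'SHOW FRONTENDS;' | leader_ip
```

If the printed address matches the IP address of the new FE node, the node has come up as Leader and you can continue with step 5. Looking up columns by header name rather than by position keeps the sketch independent of the column order.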

## Metadata recovery-related configurations

:::tip