The CDC client is still using the old PD address #9584

Closed
jacktd9 opened this issue Aug 15, 2023 · 4 comments · Fixed by #9713
Labels: affects-6.5, affects-7.1, area/ticdc, severity/minor, type/bug

Comments


jacktd9 commented Aug 15, 2023

What did you do?

  1. Initially, there were 3 old PD nodes (pd1, pd2, pd3).
  2. A scale-out operation was performed, adding 3 new PD nodes (pd4, pd5, pd6).
  3. We waited for 5 minutes.
  4. A scale-in operation was executed, removing the 3 old PD nodes (pd1, pd2, pd3).
  5. The CDC changefeed was resumed (see the command sketch below).
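For reference, here is a minimal sketch of the steps above as tiup / cdc CLI commands. The cluster name, topology file, host names, and changefeed ID are placeholders assumed for illustration, not values from the original report:

```shell
# Scale out: add the new PD nodes (pd4, pd5, pd6) described in a topology file.
tiup cluster scale-out <cluster-name> scale-out-pd.yaml

# Scale in: remove the old PD nodes (pd1, pd2, pd3) after waiting a few minutes.
tiup cluster scale-in <cluster-name> --node <pd1-host>:2379
tiup cluster scale-in <cluster-name> --node <pd2-host>:2379
tiup cluster scale-in <cluster-name> --node <pd3-host>:2379

# Resume the changefeed, pointing the CLI at one of the remaining PD endpoints.
tiup cdc cli changefeed resume --changefeed-id=<changefeed-id> --pd=http://<pd4-host>:2379
```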

An error occurred while executing the command, and the CDC logs showed that CDC was still trying to access the old PD nodes (screenshots in the original report; one log line reads "current PD address is... 2479").

What did you expect to see?

no error

What did you see instead?

connect pd failed

Versions of the cluster

cluster version: v6.5.3

jacktd9 added the area/ticdc and type/bug labels on Aug 15, 2023
nongfushanquan (Contributor) commented:

/assign @asddongmen

asddongmen (Contributor) commented:

@jacktd9 May I ask whether the changefeed has resumed normal synchronization? In other words, was the error log you found temporary, or is the problem still unresolved?

jacktd9 (Author) commented Aug 17, 2023

  1. After we reloaded CDC and executed the same 'resume' command again, the command succeeded.

Following on from step 1 (with CDC already reloaded), we attempted to update the changefeed configuration, but the request returned a 500 error. This appeared to be caused by the old PD addresses still stored in the upstream info. After adding back (scaling out) one of the previously removed PD nodes, the same 'update' command succeeded (see the command sketch below).
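A rough sketch of the commands involved in this scenario; the changefeed ID, PD endpoint, and configuration file are placeholders assumed for illustration, not values from the report:

```shell
# After reloading CDC, resuming the changefeed succeeded.
tiup cdc cli changefeed resume --changefeed-id=<changefeed-id> --pd=http://<new-pd-host>:2379

# Updating the changefeed configuration then returned an HTTP 500 error,
# until one of the previously removed PD nodes was added back to the cluster.
tiup cdc cli changefeed update --changefeed-id=<changefeed-id> --pd=http://<new-pd-host>:2379 --config=changefeed.toml
```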

asddongmen (Contributor) commented Aug 18, 2023


So, if I understand correctly, TiCDC's pdClient is still using the old address and cannot update to the new one?

Based on our discussion and my understanding, here is a summary:

  1. There are warning logs indicating that pdClient cannot connect to the old PD address, but the changefeed can still advance.
  2. After restarting the cdc process, you can successfully execute cdc cli changefeed resume, but are unable to execute cdc cli changefeed update.
  3. After you scale one of the old PD nodes back into the cluster, TiCDC can correctly handle changefeed update requests.

Please correct me if I misunderstood anything. @jacktd9
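As a side note, one way to confirm which PD members the cluster itself currently reports (as opposed to the addresses cached in TiCDC's upstream info) is pd-ctl; the endpoint below is a placeholder:

```shell
# List the PD members currently registered in the cluster. Any address that
# TiCDC keeps logging but that is absent from this list is a stale cached endpoint.
tiup ctl:v6.5.3 pd -u http://<new-pd-host>:2379 member
```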
