RDS Not Updating after Long Periods of Kmesh Inactivity #964

Open
PerforMance308 opened this issue Oct 15, 2024 · 8 comments · May be fixed by #973
@PerforMance308

PerforMance308 commented Oct 15, 2024

What happened:

After running KMesh for a long period without any operations, RDS stops updating. The following observations were made:

  1. KMesh is left running overnight without any operations.
  2. The next day, modifications to the existing VirtualService do not take effect. For example, changing the match prefix in a VirtualService from /test to /test-echo does not update.
  3. Restarting KMesh resolves the issue temporarily.
  4. Debugging logs indicate that only CDS and EDS updates are happening, while RDS is not updating.

What you expected to happen:

Modifications to VirtualService should be applied without needing to restart KMesh.

How to reproduce it (as minimally and precisely as possible):

  1. Start KMesh and configure a VirtualService with match prefix /test.
  2. Allow KMesh to run overnight without any operations.
  3. Modify the VirtualService to match prefix /test-echo.
  4. Observe that the update does not take effect.

Anything else we need to know?:

Environment:

  • Kmesh version: 0.5
  • K8S Node OS: HCE 2.0 with enhanced kernel
  • Others:
@PerforMance308 PerforMance308 added the kind/bug Something isn't working label Oct 15, 2024
@hzxuzhonghu
Member

Can you enable kmesh debug log level and paste the log here?

@PerforMance308
Author

I manually added logs to every handleXdsResponse-style function (handleCdsResponse, handleRdsResponse, and so on):

image

image

I did the same for the handleEdsResponse and handleLdsResponse functions.

From the log we can see that only CDS and EDS responses were printed.

After restarting kmesh, however, all xDS types were updated again.

@hzxuzhonghu
Member

Which istiod version are you using? Also, @lec-bit and I fixed a similar bug in v0.5: #890

@PerforMance308
Author

istio v1.19

@PerforMance308
Author

log.txt

After line 283 of this log file, I performed an update operation on the VirtualService, only modifying the match prefix. The content printed shows that no RDS update was received.

@hzxuzhonghu
Member

I suspect this is caused directly by this part of the code:

    if !slices.EqualUnordered(p.Cache.routeNames, lastRouteNames) {
        // we cannot set the nonce here.
        // There is a race: when xds server has pushed rds, but kmesh hasn't a chance to receive and process
        // Then it will lead to this request been ignored, we will lose the new rds resource
        p.req = newAdsRequest(resource_v3.RouteType, p.Cache.routeNames, "")

We do not clean up p.Cache.routeNames even when the xDS connection is re-established, so after a reconnect this route-name check may still find the two sets equal (only the VirtualService match prefix was changed here), and the RDS request is skipped.

On the istiod side, a new xDS connection shares no state with the previous connection, so istiod has no record of which route names the client had subscribed to. From its point of view the client watches no routes at all, so there is nothing to push.
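The suspected failure mode can be sketched in isolation. This is a minimal illustration, not Kmesh code: `routeCache`, `needsRdsRequest`, `resetOnReconnect`, and `equalUnordered` are hypothetical names standing in for the cache and the comparison quoted above, and clearing the cached names on reconnect is only one possible remedy, not necessarily what the linked PR does.

```go
package main

import (
	"fmt"
	"slices"
)

// routeCache is a hypothetical stand-in for the route-name state held in p.Cache.
type routeCache struct {
	routeNames []string
}

// equalUnordered reports whether a and b contain the same names, ignoring order.
func equalUnordered(a, b []string) bool {
	x, y := slices.Clone(a), slices.Clone(b)
	slices.Sort(x)
	slices.Sort(y)
	return slices.Equal(x, y)
}

// needsRdsRequest models the quoted check: an RDS request is only built when
// the route names referenced by the latest LDS response differ from the cache.
func (c *routeCache) needsRdsRequest(latest []string) bool {
	changed := !equalUnordered(c.routeNames, latest)
	c.routeNames = latest
	return changed
}

// resetOnReconnect is an assumed remedy: forget the cached names so that the
// first LDS response on a new connection always triggers an RDS subscription.
func (c *routeCache) resetOnReconnect() {
	c.routeNames = nil
}

func main() {
	c := &routeCache{}

	// First connection: the cache is empty, so RDS is requested.
	fmt.Println("initial:", c.needsRdsRequest([]string{"80", "8080"}))

	// Reconnect without cleanup: same route names, so no RDS request is sent,
	// although the new connection on the server side has no subscription yet.
	fmt.Println("stale cache after reconnect:", c.needsRdsRequest([]string{"8080", "80"}))

	// Reconnect with cleanup: the request is sent again.
	c.resetOnReconnect()
	fmt.Println("reset cache after reconnect:", c.needsRdsRequest([]string{"80", "8080"}))
}
```

With the stale cache, a VirtualService change that only alters a match prefix leaves the set of route names unchanged, which is why the comparison never triggers a new subscription in this model.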

@hzxuzhonghu
Member

@lec-bit I do not see a fix filed yet. Can you open a PR for it?

@lec-bit lec-bit linked a pull request Oct 21, 2024 that will close this issue
@lec-bit
Contributor

lec-bit commented Oct 21, 2024

ok
