Question: Should a large get with WithSerializable() option block puts? #7719
Can you try with current master? We made reads nonblocking in master.
That's great news, will give it a go, thanks.
Why were nonblocking reads added only to master?
master tracks the next minor revision release; it's a development branch for major changes. Updates to 3.1.x are bug fixes.
The read will reflect the data from the revision when it was started. The write won't affect it.
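The revision-pinned semantics described above can be illustrated with a toy multi-version store (hypothetical types for illustration only, not etcd's actual `mvcc` package): a read started at revision N keeps seeing the value at N even after a later write bumps the revision.

```go
package main

import "fmt"

// store is a toy multi-version map: each key maps to a slice of
// (revision, value) pairs, newest last. Purely illustrative of
// revision-pinned reads; NOT etcd's real storage layer.
type store struct {
	rev  int64
	data map[string][]version
}

type version struct {
	rev   int64
	value string
}

func newStore() *store { return &store{data: map[string][]version{}} }

// Put records a new value under the next revision.
func (s *store) Put(key, value string) {
	s.rev++
	s.data[key] = append(s.data[key], version{s.rev, value})
}

// Get returns the newest value with revision <= rev.
func (s *store) Get(key string, rev int64) (string, bool) {
	vs := s.data[key]
	for i := len(vs) - 1; i >= 0; i-- {
		if vs[i].rev <= rev {
			return vs[i].value, true
		}
	}
	return "", false
}

func main() {
	s := newStore()
	s.Put("k", "v1") // revision 1
	readRev := s.rev // a long-running range "starts" here
	s.Put("k", "v2") // revision 2: a concurrent write

	old, _ := s.Get("k", readRev) // pinned read still sees v1
	cur, _ := s.Get("k", s.rev)   // a fresh read sees v2
	fmt.Println(old, cur)         // v1 v2
}
```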
Hi, I tried with master built today, and I didn't see any improvement over 3.1.5; it still appears my puts are being blocked when a large read occurs. I tried with and without WithSerializable() and it didn't make much difference. Is there anything I need to change to enable non-blocking reads?
@mcginne Maybe CPU or network I/O has a starvation issue, so no matter what etcd does internally the write still appears to block? You probably want to pull some system metrics to verify.
@mcginne kindly ping.
@xiang90 Sorry, I was on vacation. I don't believe I am CPU limited (running at ~50% utilisation), and I don't believe network would be an issue, as I am currently running against localhost in my test env.
The stacks of the Range vary between runs, but I see one here:
Is this all as expected with the non-blocking reads? I can attach the full thread dumps if they would be of use.
The first lock can be improved a bit by creating a new read buffer instead of modifying the shared one, but there'll be some extra copy/allocation overhead. The second lock will probably need a boltdb patch; the next best thing would be lock striping with several read txns.
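The lock-striping idea mentioned above can be sketched roughly as follows (a hypothetical illustration, not etcd's actual backend, which would stripe boltdb read transactions rather than a Go map): keys hash to one of several stripes, so a long-held read lock on one stripe does not block writers touching the others.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const stripes = 8

// stripedMap guards its data with several locks instead of one
// global lock, so a reader holding one stripe's lock only blocks
// writers whose keys hash to the same stripe.
type stripedMap struct {
	locks [stripes]sync.RWMutex
	data  [stripes]map[string]string
}

func newStripedMap() *stripedMap {
	m := &stripedMap{}
	for i := range m.data {
		m.data[i] = map[string]string{}
	}
	return m
}

// stripe picks the lock/shard responsible for a key.
func (m *stripedMap) stripe(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % stripes)
}

func (m *stripedMap) Put(key, value string) {
	i := m.stripe(key)
	m.locks[i].Lock()
	defer m.locks[i].Unlock()
	m.data[i][key] = value
}

func (m *stripedMap) Get(key string) (string, bool) {
	i := m.stripe(key)
	m.locks[i].RLock()
	defer m.locks[i].RUnlock()
	v, ok := m.data[i][key]
	return v, ok
}

func main() {
	m := newStripedMap()
	m.Put("foo", "bar")
	v, ok := m.Get("foo")
	fmt.Println(v, ok) // bar true
}
```

The trade-off is that striping only reduces contention when traffic spreads across stripes; a single huge range still holds its stripes for the duration.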
@heyitsanthony Thanks for looking. Just to confirm my understanding: am I using the "non-blocking read" path here, but hitting some further locks that block the puts whilst the read is ongoing? My interpretation of "non-blocking read" was that other transactions would be able to complete whilst the read was ongoing.
Yes, it's hitting locks. |
Moved from #8202:
My worry here is that this will become increasingly annoying in large Kubernetes clusters. While large lists are uncommon, they are a common way for components to resync. The worst case would be for large range reads to block lease acquisition on loaded clusters. The cluster I'm describing above is one of the largest Kube clusters likely in the near term, and has some pathological distributions (we have lots of secrets, which are very large in etcd), but even at 10k namespaces it has many fewer keys than we expect to eventually have. We aren't blocked by this, but I would expect it to be an issue for a subset of large deployers.
This is similar to what we are seeing in #8114.
etcd added pagination support for this exact reason. I hope k8s can adopt this pattern instead of trying to get ALL keys at the same time. This is not to say that we should not make reads non-blocking; if anyone wants to look into that problem, please do!
Yeah, I'm going to look at paging. @lavalamp and I had a quick discussion about it, and we have some very large clusters that want pagination to smooth out allocation curves for other reasons. I'll definitely be experimenting with pagination soon.
If pagination solves this issue, why doesn't etcd internally implement a large get as a series of gets?
There are two cases:
Assembling large range results into one gRPC response inside the etcd server will still cause memory blow-up. Moreover, large messages also break gRPC best practices; gRPC has a suggested 4MB max message size. To sum up, current pagination only solves part of the problem, and there are more things to think about.
Everything can be implemented inside etcd in theory, but every line of code is a liability. We need to balance the complexity and mind the budget left here. We are still uncertain how complicated this would be. And besides all that, we have limited human resources. As I mentioned, if anyone wants to look into the problem and come up with a proposal, that would be great!
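The client-side pagination pattern discussed above can be sketched as a limit-plus-start-key loop. The snippet below simulates paging against an in-memory sorted key list rather than a live etcd server; with `clientv3` the same shape would use range options such as `WithLimit`, with the next page starting just after the last key returned (the names here are a sketch, not a transcript of the real API).

```go
package main

import (
	"fmt"
	"sort"
)

// rangePage simulates one paginated Range call: return up to limit
// keys >= start, in sorted order, plus whether more keys remain.
func rangePage(keys []string, start string, limit int) (page []string, more bool) {
	i := sort.SearchStrings(keys, start) // first key >= start
	end := i + limit
	if end > len(keys) {
		end = len(keys)
	}
	return keys[i:end], end < len(keys)
}

func main() {
	// Pretend keyspace; in a real client these would live in etcd.
	var keys []string
	for i := 0; i < 10; i++ {
		keys = append(keys, fmt.Sprintf("key-%02d", i))
	}

	var all []string
	start, limit := "", 3
	for {
		page, more := rangePage(keys, start, limit)
		all = append(all, page...)
		if !more {
			break
		}
		// Next page starts just after the last key returned;
		// appending "\x00" yields the smallest strictly-greater key.
		start = page[len(page)-1] + "\x00"
	}
	fmt.Println(len(all)) // 10
}
```

Each page bounds the server-side result size, which is exactly what keeps any single response under the gRPC message-size guidance mentioned above.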
@jpbetz @gyuho @hexfusion @spzala I have heard from quite a few people hitting this issue in large deployments. We'd:
Maybe we should investigate how other databases behave for large responses: MySQL, PostgreSQL, MongoDB, Redis, etc.
Going to do some research here; if anyone is working on this, please ping me, I would like to collaborate.
Our workload does some periodic large reads from etcd (v3.1.5 using the v3 API), potentially reading 300,000 of 600,000 keys from etcd.
With 600,000 keys in etcd, the Get operation takes ~1.3 seconds to complete.
I notice that some puts that occur whilst the get is happening take ~700 milliseconds to complete (usually the puts take ~1 millisecond).
I had hoped that, as the clientv3.WithSerializable() option is a bit like a dirty read, it may not block the puts, but from my tests it seems like it does, as I still see 700ms delays with some puts. Is this expected?
Is there any way to perform a read that will not block puts?