All primary shards on the same node for every index in 6.x #29437
Labels
:Distributed Coordination/Allocation
ES 6.1.1
NEST 6.0.1
3-node cluster, 2 replicas for each index
While looking for the cause of a "hot" node, I read a lot about how updates can cause this, since they have to be coordinated through the primary shard. In my cluster I have 34 indexes, most with 5 shards and some "user" indexes with 20 shards. Every primary shard of every index is on the same node, which means every update request I make has to be handled by that node.
As I perused similar Q&A found via web searches and reviewed the Cluster Level Shard Allocation docs, I couldn't find any way to redistribute the primary shards when we're "fully replicated" as we are. So I tried an experiment: I set one of my indexes to 1 replica, and sure enough, a couple of its primary shards ended up on another node. I then set it back to 2 replicas, and the primaries stayed where they had been moved. That's a cheat, though. Is there a way for me to more explicitly distribute the "primary" designation for shards across the nodes in my cluster?
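Concretely, the "back and forth" I'm describing is just the index settings API (index name and host below are placeholders):

```sh
# Drop the index to 1 replica; in my case this left a couple of the
# primaries on a different node than before.
curl -s -XPUT 'localhost:9200/my_index/_settings' \
  -H 'Content-Type: application/json' \
  -d '{ "index": { "number_of_replicas": 1 } }'

# Once the shards have settled, restore the original replica count;
# the primaries stayed where they had moved.
curl -s -XPUT 'localhost:9200/my_index/_settings' \
  -H 'Content-Type: application/json' \
  -d '{ "index": { "number_of_replicas": 2 } }'
```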
Some of the most heavily updated indexes are ones where we use terms lookup, and ES best practices dictate having replicas so the file system cache can be utilized and ES doesn't have to request the terms from another replica. Ok, so we have 2 replicas, but I also need to balance out the update traffic, and I don't know whether my little "back and forth" trick is even persistent (e.g. our Windows VMs auto-update in a staggered way). FWIW, I've been referred to Elasticsearch Versioning Support, which we can consider and potentially implement (assuming the C# NEST library supports it), but I'm definitely not sure we'll even want to do that. Regardless, we also have a running production cluster that we need to keep performant in the meantime.
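For the versioning option, my understanding is that it amounts to optimistic concurrency control on each write. A minimal sketch, assuming external version numbers supplied by our own system (index name, the `_doc` mapping type, document id, and field are placeholders):

```sh
# Index the document only if 42 is greater than the version currently
# stored in Elasticsearch; otherwise the request fails with a 409 conflict.
curl -s -XPUT 'localhost:9200/my_index/_doc/1?version=42&version_type=external' \
  -H 'Content-Type: application/json' \
  -d '{ "field": "value" }'
```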