
Allow adjustment of index resource constraints in ILM phase transitions #44070

Closed
DaveCTurner opened this issue Jul 8, 2019 · 17 comments · Fixed by #76134, #76732, #76775, #76780 or #76794
Labels
:Data Management/ILM+SLM (Index and Snapshot lifecycle management), good first issue (low hanging fruit), help wanted (adoptme), Team:Data Management (Meta label for data/management team)

Comments

@DaveCTurner
Contributor

The shards allocator mostly does a good job of spreading the shards of each index out across the available nodes, but there are some corner cases where it might temporarily concentrate too many hot shards on a single node, leading to poor performance and possible resource exhaustion (#17213). It is occasionally useful to add the index.routing.allocation.total_shards_per_node constraint when indices are under heavy load, to prevent too many hot shards from being allocated to the same node. We are contemplating generalising this to a more nuanced set of per-index resource constraints too (#17213 (comment)).

If an index is managed by ILM then it is likely its resource requirements will change along with its lifecycle. In particular when the index leaves the hot phase it will no longer see such heavy indexing, and so any total_shards_per_node constraint is likely no longer necessary. It's important to keep the use of this sort of setting to a minimum to avoid over-constraining the allocator since this can lead to unassigned shards (#12273). I therefore think it'd be a good idea to allow this kind of setting to be changed in the appropriate ILM phase transitions.
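
For readers who haven't used it, this is roughly what applying the constraint by hand looks like today; the index name and the limit of 2 are placeholders, not a recommendation:

```
# Cap how many shards of this index may be allocated to any single node.
PUT my-hot-index/_settings
{
  "index.routing.allocation.total_shards_per_node": 2
}
```

The proposal here is for ILM to be able to change or remove this value automatically when the index transitions between phases, e.g. relaxing it once the index leaves the hot phase.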

@DaveCTurner DaveCTurner added the :Distributed Coordination/Allocation (All issues relating to the decision making around placing a shard, both master logic & on the nodes), :Data Management/ILM+SLM (Index and Snapshot lifecycle management), and team-discuss labels Jul 8, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-features

@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@DaveCTurner
Contributor Author

I raised this question with the distrib team today and although we're not super-keen on recommending the widespread use of the total_shards_per_node constraint, we think it's a good idea to allow it to be relaxed on ILM phase transitions, particularly the hot-to-warm one.

@vigyasharma
Contributor

Are per-index resource constraints being planned as prescriptive (i.e. the allocator makes a best effort to comply with the constraints but allocates the shards anyway if no node can fulfill them), or as a hard limit that leaves shards unallocated when they cannot be satisfied?

Resource requirements are often quite dynamic. Even in the hot phase, an index may have off-peak hours during which the reserved resources could be shared by other indices.

Requirement prediction itself is hard, with users almost always under- or over-provisioning in the first pass; it takes iterative fine-tuning to get the limits right. I would vote for a prescriptive, best-effort approach to allow for such flexibility.

Separately, +1 on revising constraints through ILM transitions.

@dakrone
Member

dakrone commented Jul 18, 2019

We discussed this and agreed on putting an option for setting index.routing.allocation.total_shards_per_node into the allocate action in ILM.
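
For illustration, here is a sketch of how that option reads in a policy once it exists (the field ultimately landed via #76134, referenced further down); the policy name, min_age, and the data: warm node attribute are placeholders, and a value of -1 removes the per-node limit:

```
PUT _ilm/policy/my-policy
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "total_shards_per_node": -1,
            "require": {
              "data": "warm"
            }
          }
        }
      }
    }
  }
}
```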

@DaveCTurner DaveCTurner removed the :Distributed Coordination/Allocation (All issues relating to the decision making around placing a shard, both master logic & on the nodes) label Jul 22, 2019
@dakrone dakrone added the help wanted (adoptme) and good first issue (low hanging fruit) labels Jul 23, 2019
@avneesh91
Contributor

@DaveCTurner would it be ok if I started working on this?

@dakrone
Member

dakrone commented Aug 5, 2019

@avneesh91 sure, if you'd like to contribute a PR for this, that would be great.

@shoaib4330

@avneesh91 I'd like to pick up this issue, if you're not actively working on it?

@epicvinny

It is impractical to use ILM at scale :(

@Aloshi

Aloshi commented Dec 9, 2020

Another +1 to this issue. We've been having problems when we scale up our larger clusters by 2-3 nodes: ES decides to allocate all of the shards for the next index on the new nodes, which causes abysmal performance and slows down ingestion. We've been guarding against it by setting index.routing.allocation.total_shards_per_node to 1 or 2 on our big indices until the new nodes are mostly full. We'd like to just leave this set all the time, but when these indices are moved to warm nodes there aren't enough warm nodes to satisfy the constraint (since having fewer warm nodes than hot nodes is kind of the point).

Is there any update on this? It sounds like the proposed solution would work great for us.
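
For reference, the workaround described above amounts to something like this, with the index name and limit as placeholders; setting the value to null clears the setting again once the cluster has rebalanced:

```
# While the freshly added nodes are still mostly empty, spread the hot shards out.
PUT big-index-000123/_settings
{
  "index.routing.allocation.total_shards_per_node": 2
}

# Later, once the new nodes have filled up, remove the limit again.
PUT big-index-000123/_settings
{
  "index.routing.allocation.total_shards_per_node": null
}
```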

@ppf2
Member

ppf2 commented Dec 20, 2020

+1. Shard imbalance necessitating the use of total_shards_per_node does not only apply to the heavy-indexing use case; we have seen it affect other cluster operations such as force merge, when a majority of the shards of an index being force merged ended up concentrated on a single warm node.

@JohnLyman

> We discussed this and agreed on putting an option for setting index.routing.allocation.total_shards_per_node into the allocate action in ILM.

Resetting total_shards_per_node is also required when using the shrink action, which necessitates that all primary shards be moved to the same node.

@dakrone - if this option is put in the allocate action, would that allow for the value to be reset prior to shrink? I'm not sure of the order of operations here (I don't think it's documented). In other words, does allocate come before shrink in the warm phase?
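
To make the interaction concrete: shrink needs a copy of every shard of the source index on a single node, so a low total_shards_per_node can block it outright. A manual (non-ILM) shrink therefore looks roughly like the following, with the index and node names as placeholders:

```
# Clear the per-node limit so a copy of every shard can be pinned to one node,
# and block writes, as shrink requires.
PUT my-index/_settings
{
  "index.routing.allocation.total_shards_per_node": null,
  "index.routing.allocation.require._name": "warm-node-1",
  "index.blocks.write": true
}

# Once relocation finishes, shrink into an index with fewer primaries
# and drop the temporary allocation requirement on the target.
POST my-index/_shrink/my-index-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.routing.allocation.require._name": null
  }
}
```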

@dakrone
Member

dakrone commented Jan 6, 2021

> if this option is put in the allocate action, would that allow for the value to be reset prior to shrink? I'm not sure of the order of operations here (I don't think it's documented). In other words, does allocate come before shrink in the warm phase?

allocate does come before shrink in the warm phase, though I think we should endeavor to make shrink run regardless of the setting: unsetting it during the shrink's allocation, then resetting it afterwards if necessary.

@JohnLyman

That seems reasonable @dakrone (and thanks for the quick reply). Should I open a new issue for your suggestion?

@dakrone
Member

dakrone commented Jan 6, 2021

@JohnLyman I think it can be subsumed into this issue for now, thanks for bringing it up!

@JohnLyman

I was thinking of shrink in general, whether it's initiated by ILM or not.

@dakrone
Member

dakrone commented Jan 6, 2021

I think that is one of the main goals of #63519

masseyke added a commit that referenced this issue Aug 11, 2021
Allow for setting the total shards per node in the Allocate ILM action (#76134)

This adds a new optional field to the allocate ILM action called "total_shards_per_node". If present, the value of this field is set as the value of "index.routing.allocation.total_shards_per_node" before the allocation takes place.
Relates to #44070
masseyke added a commit that referenced this issue Aug 20, 2021
… is too low (#76732)

We added configuration to AllocateAction to set the total shards per node property on the index. This makes it possible that a user could set this to a value lower than the total number of shards in the index that is about to be shrunk, meaning that all of the shards could not be moved to a single node in the ShrinkAction. This commit unsets the total shards per node property so that we fall back to the default value (-1, unlimited) in the ShrinkAction to avoid this.
Relates to #44070
masseyke added a commit to masseyke/elasticsearch that referenced this issue Aug 20, 2021
Allow for setting the total shards per node in the Allocate ILM action (elastic#76134)

This adds a new optional field to the allocate ILM action called "total_shards_per_node". If present, the value of this field is set as the value of "index.routing.allocation.total_shards_per_node" before the allocation takes place.
Relates to elastic#44070
masseyke added a commit to masseyke/elasticsearch that referenced this issue Aug 20, 2021
… is too low (elastic#76732)

We added configuration to AllocateAction to set the total shards per node property on the index. This makes it possible that a user could set this to a value lower than the total number of shards in the index that is about to be shrunk, meaning that all of the shards could not be moved to a single node in the ShrinkAction. This commit unsets the total shards per node property so that we fall back to the default value (-1, unlimited) in the ShrinkAction to avoid this.
Relates to elastic#44070
masseyke added a commit that referenced this issue Aug 20, 2021
… is too low (#76732) (#76780)

This is a backport of #76732. We added configuration to AllocateAction to set the total shards per node property on the index. This makes it possible that a user could set this to a value lower than the total number of shards in the index that is about to be shrunk, meaning that all of the shards could not be moved to a single node in the ShrinkAction. This commit unsets the total shards per node property so that we fall back to the default value (-1, unlimited) in the ShrinkAction to avoid this.
Relates to #44070
masseyke added a commit that referenced this issue Aug 23, 2021
#76775)

Allow for setting the total shards per node in the Allocate ILM action (#76134)

This is a backport of #76134. It adds a new optional field to the allocate ILM action called "total_shards_per_node". If present, the value of this field is set as the value of "index.routing.allocation.total_shards_per_node" before the allocation takes place.
Relates to #44070
masseyke added a commit that referenced this issue Aug 23, 2021
Updating the version where the total_shards_per_node parameter is supported, after backporting the feature to 7.16.
Relates to #76775 #44070