Adds max_shard_size parameter to shrink API #2519
Signed-off-by: Fanit Kolchina <[email protected]>
@gaobinlong: I've created a section for the max_shard_size parameter. Could you review it for technical accuracy? Thanks!
@kolchfa-aws here is my comment:
Signed-off-by: Fanit Kolchina <[email protected]>
@gaobinlong Done, thank you for the review!
The `max_shard_size` parameter specifies the maximum size of a primary shard in the target index. OpenSearch uses `max_shard_size` and the total storage for all primary shards in the source index to calculate the number of primary shards and their size for the target index.

The primary shard count of the target index is the lowest factor of the source index's primary shard count, for which the shard size does not exceed `max_shard_size`. Consider the following example. Let's say the source index has 8 primary shards and they occupy a total of 400 GB of storage. If `max_shard_size` is equal to 150 GB, OpenSearch calculates the number of primary shards in the target index using the following algorithm:
The primary shard count of the target index is the lowest factor of the source index's primary shard count, for which the shard size does not exceed `max_shard_size`. Consider the following example. Let's say the source index has 8 primary shards and they occupy a total of 400 GB of storage. If `max_shard_size` is equal to 150 GB, OpenSearch calculates the number of primary shards in the target index using the following algorithm:
The primary shard count of the target index is the lowest factor of the source index's primary shard count, for which the shard size does not exceed `max_shard_size`. As an example, the source index has eight primary shards and they occupy a total of 400 GB of storage. If `max_shard_size` is equal to 150 GB, OpenSearch calculates the number of primary shards in the target index using the following algorithm:
@natebower Could you provide guidance on numerals vs spelled out numbers? I thought that in technical texts we use numerals, but please let me know if it's not true.
Spell out cardinal numbers from 1 to 9. For example, one NAT instance. Use numerals for cardinal numbers 10 and higher. Spell out ordinal numbers: first, second, and so on. In a series that includes numbers 10 or higher, use numerals for all. In this case, we should use 8 because 400 also appears in the sentence.
Signed-off-by: Fanit Kolchina <[email protected]>
1. Calculate the minimum number of primary shards as 400/150, rounded up to the nearest whole integer. The minimum number of primary shards is 3.
1. Calculate the number of primary shards as the lowest factor of 8 that is greater than or equal to 3. The number of primary shards is 4.

The maximum number of primary shards for the target index is equal to the number of primary shards in the source index because the shrink operation is used to reduce the primary shard count. As an example, consider the source index with 5 primary shards that occupy a total of 600 GB of storage. If `max_shard_size` is 100 GB, the minimum number of primary shards is 600/100, which is 6. However, because the number of primary shards in the source index is lower than 6, the number of primary shards in the target index is set to 5.
The maximum number of primary shards for the target index is equal to the number of primary shards in the source index because the shrink operation is used to reduce the primary shard count. As an example, consider the source index with 5 primary shards that occupy a total of 600 GB of storage. If `max_shard_size` is 100 GB, the minimum number of primary shards is 600/100, which is 6. However, because the number of primary shards in the source index is lower than 6, the number of primary shards in the target index is set to 5.
The maximum number of primary shards for the target index is equal to the number of primary shards in the source index because the shrink operation is used to reduce the primary shard count. As an example, consider the source index with five primary shards that occupy a total of 600 GB of storage. If `max_shard_size` is 100 GB, the minimum number of primary shards is 600/100, which is six. However, because the number of primary shards in the source index is lower than six, the number of primary shards in the target index is set to five.
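For anyone checking the arithmetic in the two examples, here is a minimal Python sketch of the calculation as this section describes it. It is not OpenSearch's implementation: the function name and the GB-based parameters are hypothetical, and it assumes the quotient is rounded up and that a factor equal to the minimum is acceptable.

```python
import math


def target_primary_shard_count(source_shards: int, total_size_gb: float, max_shard_size_gb: float) -> int:
    """Hypothetical helper: estimate the target index's primary shard count."""
    if source_shards <= 0 or total_size_gb < 0 or max_shard_size_gb <= 0:
        raise ValueError("shard count and max shard size must be positive; total size must be non-negative")

    # Step 1: the fewest shards that keep every shard at or below max_shard_size.
    # Assumption: the quotient is rounded up, never down.
    min_shards = max(1, math.ceil(total_size_gb / max_shard_size_gb))

    # A shrink cannot increase the shard count, so the source count is the cap.
    if min_shards >= source_shards:
        return source_shards

    # Step 2: the smallest factor of the source count that is at least min_shards.
    for candidate in range(min_shards, source_shards + 1):
        if source_shards % candidate == 0:
            return candidate
    return source_shards


print(target_primary_shard_count(8, 400, 150))  # 4
print(target_primary_shard_count(5, 600, 100))  # 5
```

The first call reproduces the 8-shard, 400 GB example (minimum of 3, smallest qualifying factor 4), and the second reproduces the 5-shard, 600 GB example, where the minimum of 6 is capped at the source count.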
LGTM with very minor suggestions.
LGTM
Looks good, with a few questions/comments.
The `max_shard_size` parameter specifies the maximum size of a primary shard in the target index. OpenSearch uses `max_shard_size` and the total storage for all primary shards in the source index to calculate the number of primary shards and their size for the target index.

The primary shard count of the target index is the lowest factor of the source index's primary shard count, for which the shard size does not exceed `max_shard_size`. Consider the following example. Let's say the source index has 8 primary shards and they occupy a total of 400 GB of storage. If `max_shard_size` is equal to 150 GB, OpenSearch calculates the number of primary shards in the target index using the following algorithm:
- Are these paragraphs under the heading indented? Extra space here?
- "Smallest" factor sounds more natural to me than "lowest" factor. But this might be a convention in mathematics and widely accepted. Just asking.
- "... of the source index's primary shard count, whose shard size should not [will not?] exceed `max_shard_size`." Does that mess up the meaning?
- Removed extra space, thank you. Does not affect the rendering, but still a good call.
- Agreed. Smallest is better.
- I feel like "whose" makes it less clear because it's not clear what it refers to.
The primary shard count of the target index is the lowest factor of the source index's primary shard count, for which the shard size does not exceed `max_shard_size`. Consider the following example. Let's say the source index has 8 primary shards and they occupy a total of 400 GB of storage. If `max_shard_size` is equal to 150 GB, OpenSearch calculates the number of primary shards in the target index using the following algorithm:

1. Calculate the minimum number of primary shards as 400/150, rounded up to the nearest whole integer. The minimum number of primary shards is 3.
1. Calculate the number of primary shards as the lowest factor of 8 that is greater than or equal to 3. The number of primary shards is 4.
smallest versus lowest for factor. (again, I admit I may be out of the know on this)
I like having this example here.
1. Calculate the minimum number of primary shards as 400/150, rounded up to the nearest whole integer. The minimum number of primary shards is 3.
1. Calculate the number of primary shards as the lowest factor of 8 that is greater than or equal to 3. The number of primary shards is 4.

The maximum number of primary shards for the target index is equal to the number of primary shards in the source index because the shrink operation is used to reduce the primary shard count. As an example, consider the source index with 5 primary shards that occupy a total of 600 GB of storage. If `max_shard_size` is 100 GB, the minimum number of primary shards is 600/100, which is 6. However, because the number of primary shards in the source index is lower than 6, the number of primary shards in the target index is set to 5.
"... because the number of primary shards in the source index is smaller than 6, ..."
Number being smaller.
Signed-off-by: Fanit Kolchina <[email protected]>
@kolchfa-aws Just a few small changes and one comment. Thanks!
Co-authored-by: Nathan Bower <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.0 2.0
# Navigate to the new working tree
cd .worktrees/backport-2.0
# Create a new branch
git switch --create backport/backport-2519-to-2.0
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6e120ec4b8d6ff03b0706f56b0106a16f1ef9b42
# Push it to GitHub
git push --set-upstream origin backport/backport-2519-to-2.0
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.0
Then, create a pull request where the
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.1 2.1
# Navigate to the new working tree
cd .worktrees/backport-2.1
# Create a new branch
git switch --create backport/backport-2519-to-2.1
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6e120ec4b8d6ff03b0706f56b0106a16f1ef9b42
# Push it to GitHub
git push --set-upstream origin backport/backport-2519-to-2.1
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.1
Then, create a pull request where the
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.2 2.2
# Navigate to the new working tree
cd .worktrees/backport-2.2
# Create a new branch
git switch --create backport/backport-2519-to-2.2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6e120ec4b8d6ff03b0706f56b0106a16f1ef9b42
# Push it to GitHub
git push --set-upstream origin backport/backport-2519-to-2.2
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.2
Then, create a pull request where the
* Adds max_shard_size parameter to shrink API Signed-off-by: Fanit Kolchina <[email protected]> * Implemented tech review comment Signed-off-by: Fanit Kolchina <[email protected]> * One more rewording Signed-off-by: Fanit Kolchina <[email protected]> * Implemented doc review comments Signed-off-by: Fanit Kolchina <[email protected]> * Update _api-reference/index-apis/shrink-index.md Co-authored-by: Nathan Bower <[email protected]> * Update _api-reference/index-apis/shrink-index.md Co-authored-by: Nathan Bower <[email protected]> * Update _api-reference/index-apis/shrink-index.md Co-authored-by: Nathan Bower <[email protected]> * Implemented editorial comments Signed-off-by: Fanit Kolchina <[email protected]> --------- Signed-off-by: Fanit Kolchina <[email protected]> Co-authored-by: Nathan Bower <[email protected]> (cherry picked from commit 6e120ec)
* Adds max_shard_size parameter to shrink API Signed-off-by: Fanit Kolchina <[email protected]> * Implemented tech review comment Signed-off-by: Fanit Kolchina <[email protected]> * One more rewording Signed-off-by: Fanit Kolchina <[email protected]> * Implemented doc review comments Signed-off-by: Fanit Kolchina <[email protected]> * Update _api-reference/index-apis/shrink-index.md Co-authored-by: Nathan Bower <[email protected]> * Update _api-reference/index-apis/shrink-index.md Co-authored-by: Nathan Bower <[email protected]> * Update _api-reference/index-apis/shrink-index.md Co-authored-by: Nathan Bower <[email protected]> * Implemented editorial comments Signed-off-by: Fanit Kolchina <[email protected]> --------- Signed-off-by: Fanit Kolchina <[email protected]> Co-authored-by: Nathan Bower <[email protected]> (cherry picked from commit 6e120ec) Co-authored-by: kolchfa-aws <[email protected]>
* Adds max_shard_size parameter to shrink API Signed-off-by: Fanit Kolchina <[email protected]> * Implemented tech review comment Signed-off-by: Fanit Kolchina <[email protected]> * One more rewording Signed-off-by: Fanit Kolchina <[email protected]> * Implemented doc review comments Signed-off-by: Fanit Kolchina <[email protected]> * Update _api-reference/index-apis/shrink-index.md Co-authored-by: Nathan Bower <[email protected]> * Update _api-reference/index-apis/shrink-index.md Co-authored-by: Nathan Bower <[email protected]> * Update _api-reference/index-apis/shrink-index.md Co-authored-by: Nathan Bower <[email protected]> * Implemented editorial comments Signed-off-by: Fanit Kolchina <[email protected]> --------- Signed-off-by: Fanit Kolchina <[email protected]> Co-authored-by: Nathan Bower <[email protected]>
Adds max_shard_size parameter to shrink API
Expands #2352
Fixes #2044
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.