[Transform] do not fail checkpoint creation due to global checkpoint mismatch #48423

Merged: 2 commits, Oct 24, 2019


hendrikmuhs
Contributor

Take the max if global checkpoints mismatch instead of throwing an exception. It turns out that global checkpoints can mismatch by design.

fixes #48379

Severity: in rare cases checkpoint creation can fail due to this issue; however, checkpoint creation is retried.

@elasticmachine
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

Member

@benwtrent benwtrent left a comment

minor simplification suggestion

// It's possible that replica shards report a different/higher global checkpoint.
// This is by design and not a problem; take the max() in this case.
if (checkpoints.get(shard.getShardRouting().getId()) < globalCheckpoint) {
    checkpoints.put(shard.getShardRouting().getId(), globalCheckpoint);
}
Member

this whole if clause could then be changed to something similar to

checkpoints.compute(shard.getShardRouting().getId(), (shardId, cp) -> (cp == null) ? globalCheckpoint : Math.max(cp, globalCheckpoint));
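
For readers who want to try the suggestion in isolation, here is a minimal, self-contained sketch of the "take the max per shard" idea. It is not the code from this PR: the class name `CheckpointMaxSketch`, both method names, and the `Map<Integer, Long>` type are assumptions made for illustration. It shows the `compute` form suggested above next to an equivalent `merge` form.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; the names and the map type are assumptions,
// not taken from the Elasticsearch code base.
public class CheckpointMaxSketch {

    // compute() form: store the value if the shard is new, otherwise keep the larger one.
    static void recordWithCompute(Map<Integer, Long> checkpoints, int shardId, long globalCheckpoint) {
        checkpoints.compute(shardId, (id, cp) -> (cp == null) ? globalCheckpoint : Math.max(cp, globalCheckpoint));
    }

    // merge() form: same effect, the absent-key case is handled by merge itself.
    static void recordWithMerge(Map<Integer, Long> checkpoints, int shardId, long globalCheckpoint) {
        checkpoints.merge(shardId, globalCheckpoint, Math::max);
    }

    public static void main(String[] args) {
        Map<Integer, Long> checkpoints = new HashMap<>();
        // Primary and replica copies of shard 0 report different global checkpoints.
        recordWithCompute(checkpoints, 0, 5L);
        recordWithCompute(checkpoints, 0, 7L);
        recordWithCompute(checkpoints, 0, 6L); // a lower value does not overwrite the max
        recordWithMerge(checkpoints, 1, 3L);
        System.out.println(checkpoints);
    }
}
```

Either form removes the separate "is the key present" and "is the stored value lower" branches, so a mismatch between shard copies simply resolves to the highest reported value.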

@hendrikmuhs
Contributor Author

run elasticsearch-ci/1
run elasticsearch-ci/packaging-sample-matrix

@hendrikmuhs hendrikmuhs merged commit b3d9f0a into elastic:master Oct 24, 2019
hendrikmuhs pushed a commit that referenced this pull request Oct 24, 2019
…mismatch (#48423)

Take the max if global checkpoints mismatch instead of throwing an exception. It turns out that global checkpoints can mismatch by design.

fixes #48379
Successfully merging this pull request may close these issues.

[ML] Unnecessary transform warning message is logged very often