-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve simulation of bulk-indexing conflicts #477
Improve simulation of bulk-indexing conflicts #477
Conversation
With this commit we introduce a new property `conflict-probability` for the bulk-indexing parameter source. Previously we had a hard-codded probability of 25% but now the user can control it. We also use `update` now as bulk-indexing action when simulating a conflict (previously the action was always `index`). Closes elastic#422
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
tests/track/params_test.py
Outdated
am_handler = params.GenerateActionMetaData("test_index", "test_type", | ||
conflicting_ids=[100, 200, 300, 400], | ||
conflict_probability=25, | ||
rand=lambda: next(pseudo_random_conflicts), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clever use of iter()
and lambda
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hehe, thanks :)
With this commit we add a new parameter `on-conflict` which allows users to define whether the action-and-metadata line should use "index" or "update" on simulated id conflicts.
@dliappis I pushed another commit since your LGTM which adds a new parameter "on-conflict" that lets the user decide whether to use Can you please review that commit as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
tests/track/params_test.py
Outdated
@@ -104,6 +105,12 @@ def test_generate_action_meta_data_without_id_conflicts(self): | |||
next(params.GenerateActionMetaData("test_index", "test_type"))) | |||
|
|||
def test_generate_action_meta_data_with_id_conflicts(self): | |||
def idx(id): | |||
return "index", '{"index": {"_index": "test_index", "_type": "test_type", "_id": "%s"}}' % id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very minor comment, could use .format()
for consistency going forwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is on the hot-code path where we decided against #format()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D'oh yes!
return "index", '{"index": {"_index": "test_index", "_type": "test_type", "_id": "%s"}}' % id | ||
|
||
def conflict(action, id): | ||
return action, '{"%s": {"_index": "test_index", "_type": "test_type", "_id": "%s"}}' % (action, id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, could use .format()
here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is on the hot-code path where we decided against #format()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D'oh yes!
Thanks for the review @dliappis! |
With this commit we introduce a new property
conflict-probability
forthe bulk-indexing parameter source. Previously we had a hard-codded
probability of 25% but now the user can control it.
We also use
update
now as bulk-indexing action when simulating aconflict (previously the action was always
index
).Closes #422