[BIG-4969] Removing a bug in ByteArraySet chunk iteration #75

Merged
merged 41 commits into from
Nov 17, 2020

qlecorre
Contributor

@qlecorre qlecorre commented Oct 20, 2020

GOAL:

This pull request fixes a bug in the way we iterate over ByteArraySet chunks, which would cause data loss.

The bug arises during serialization when the last value of a chunk has been deleted. In that case, iteration stops at that chunk and every chunk after it is lost. The cause is that we increment currentOffset without updating currentChunk, so the subsequent hasNext() call returns false.
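
To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the actual ByteArraySet code: the class names, the String[][] chunk layout, the use of null to mark a deleted value, and the rollOver flag are all assumptions made for illustration. Only the field names currentChunk and currentOffset and the hasNext() behaviour come from the description above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the chunk iteration described above.
class ChunkIterationSketch {

    /** Iterates over live values stored in chunks; a null slot stands for a deleted value. */
    static final class ChunkIterator implements Iterator<String> {
        private final String[][] chunks;
        private final boolean rollOver; // false reproduces the bug, true applies the fix
        private int currentChunk = 0;
        private int currentOffset = 0;

        ChunkIterator(String[][] chunks, boolean rollOver) {
            this.chunks = chunks;
            this.rollOver = rollOver;
            rollOverIfAtChunkEnd();
            skipDeleted();
        }

        /** Moves to the start of the next chunk once the offset has passed the chunk end. */
        private void rollOverIfAtChunkEnd() {
            while (currentChunk < chunks.length
                    && currentOffset >= chunks[currentChunk].length) {
                currentChunk++;
                currentOffset = 0;
            }
        }

        /** Skips deleted slots. Without rollOver, the offset can run past the chunk end. */
        private void skipDeleted() {
            while (hasNext() && chunks[currentChunk][currentOffset] == null) {
                currentOffset++; // the buggy path stops here: currentChunk is never updated
                if (rollOver) {
                    rollOverIfAtChunkEnd(); // the fix: also advance currentChunk
                }
            }
        }

        @Override
        public boolean hasNext() {
            return currentChunk < chunks.length
                    && currentOffset < chunks[currentChunk].length;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String value = chunks[currentChunk][currentOffset];
            currentOffset++;
            rollOverIfAtChunkEnd(); // the normal end-of-chunk case always worked
            skipDeleted();
            return value;
        }
    }

    /** Drains the iterator into a list so the two behaviours can be compared. */
    static List<String> collect(String[][] chunks, boolean rollOver) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = new ChunkIterator(chunks, rollOver);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // The last value of the first chunk has been deleted.
        String[][] chunks = { {"a", null}, {"b", "c"} };
        System.out.println(collect(chunks, false)); // [a]
        System.out.println(collect(chunks, true));  // [a, b, c]
    }
}
```

With rollOver set to false, skipping the deleted slot leaves currentOffset equal to the chunk length while currentChunk still points at the first chunk, so hasNext() returns false and the second chunk is never visited. With rollOver set to true, the same skip also advances currentChunk and all live values are returned.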

The testEmptyLastValueChunkIteratorBug() test reproduces the bug, in case you are interested.

BUILDING BLOCKS:

Below are some branches and Docker images I created as part of the investigation:

Only Bytes:

  • Branch: qlecorre-BIG-4969-75
  • Image: quentin-only-bytes-latest

Only Front Set:

  • Branch: qlecorre-BIG-4969-53
  • Image: quentin-only-frontset-latest

Both Front Set and Bytes:

  • Branch: qlecorre-BIG-4969-51
  • Image: quentin-latest

MANUAL DEBUGGING:

Use the qlecorre-BIG-4969-51-local-debugging branch for debugging.

BIG-4969

@qlecorre qlecorre force-pushed the qlecorre-BIG-4969-51 branch 5 times, most recently from 860e663 to e4162aa Compare October 23, 2020 19:11
@qlecorre qlecorre changed the title [BIG-4969] Replace Set<ByteArray> by ByteArraySet [BIG-4969] Removing a bug on large re-indexes Oct 23, 2020
@qlecorre qlecorre force-pushed the qlecorre-BIG-4969-51 branch 6 times, most recently from 555a7a9 to 555ea71 Compare October 26, 2020 19:44
@qlecorre qlecorre changed the title [BIG-4969] Removing a bug on large re-indexes [BIG-4969] Removing a bug in ByteArraySet chunk iteration Oct 29, 2020
Contributor

@arron-green arron-green left a comment

nice work! left some suggestions to clean this up a bit

@qlecorre qlecorre changed the base branch from master to qlecorre-use-ByteArraySet October 29, 2020 19:07

List<ByteArray> vals = getRandomByteArrays(size);

for (ByteArray val: vals) {
Contributor

This can be instantiated via the constructor rather than by looping.

Contributor Author

ByteArraySet cannot be instantiated from a List, but I simplified this with addAll.

Base automatically changed from qlecorre-use-ByteArraySet to master November 17, 2020 15:11
@eric-weaver eric-weaver merged commit e13f174 into master Nov 17, 2020
@eric-weaver eric-weaver deleted the qlecorre-BIG-4969-51 branch November 17, 2020 19:39
4 participants