Add check of hashed data when writing new data #296
The same check should also go into Can you add a test where the hash function is mocked to be one of the previous data_version hash functions? The check should get triggered.
Okay, this is really strange. The error in the test that I mentioned above is happening because When I run
Looks like there must be a problem locally with Changes
Looking pretty good!
c3883d7 to 3c9272f
Okay, I think we have our bases covered with these tests.
Looks good to me!
This PR adds a check when writing data to a file. This check verifies that chunks of data that already exist in the hashtable correspond to arrays that match the data we are trying to write.
This is expected to add some performance overhead when writing datasets that reuse chunks, but it ensures we won't corrupt hashtables in the future. Closes #294.
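For readers who want the idea in concrete form, here is a minimal sketch of this kind of check. It is not the code from this PR: `ChunkStore`, `write_chunk`, and `sha256_digest` are hypothetical names, chunks are modeled as plain `bytes`, and the hashtable is an ordinary dict. The second half of `demo` swaps in a deliberately weak hash function to stand in for a mismatched hash, which is roughly the situation the check is meant to catch.

```python
import hashlib
from typing import Callable, Dict, List


def sha256_digest(chunk: bytes) -> bytes:
    """Default hash for this sketch (an assumption, not the project's hash)."""
    return hashlib.sha256(chunk).digest()


class ChunkStore:
    """Toy stand-in for a hashtable plus raw chunk storage (hypothetical)."""

    def __init__(self, hash_func: Callable[[bytes], bytes] = sha256_digest):
        self.hash_func = hash_func
        self.raw_chunks: List[bytes] = []      # stored chunk data, in write order
        self.hashtable: Dict[bytes, int] = {}  # digest -> index into raw_chunks

    def write_chunk(self, chunk: bytes) -> int:
        """Store a chunk, reusing an existing entry only if the data matches."""
        digest = self.hash_func(chunk)
        if digest in self.hashtable:
            index = self.hashtable[digest]
            # The check: a reused hash must point at identical stored data.
            if self.raw_chunks[index] != chunk:
                raise ValueError(
                    "hash matches an existing chunk, but the stored data differs"
                )
            return index
        self.raw_chunks.append(chunk)
        index = len(self.raw_chunks) - 1
        self.hashtable[digest] = index
        return index


def demo():
    # Normal case: writing identical data twice reuses the stored chunk.
    store = ChunkStore()
    first = store.write_chunk(b"\x00\x01\x02\x03")
    again = store.write_chunk(b"\x00\x01\x02\x03")

    # A weak hash (first byte only) makes different data share a digest,
    # so the check should be triggered on the second write.
    weak = ChunkStore(hash_func=lambda chunk: chunk[:1])
    weak.write_chunk(b"\x00\x01")
    try:
        weak.write_chunk(b"\x00\xff")
        triggered = False
    except ValueError:
        triggered = True
    return first, again, len(store.raw_chunks), triggered
```

The extra comparison only runs when a hash is already present, which is why the overhead is limited to writes that reuse chunks.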
Note: Currently I'm seeing a test failure locally with `test_delete_string_dataset` on `master`. The failure happens when trying to open the file after running `h5repack` on the `hdf5` file, with `OSError: Unable to open file (bad object header version number)`. I'm opening this PR to see if CI sees it.