Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Zip file output sometimes corrupted on Linux #36

Closed
tbirdso opened this issue Jul 31, 2023 · 7 comments · Fixed by #54
Closed

BUG: Zip file output sometimes corrupted on Linux #36

tbirdso opened this issue Jul 31, 2023 · 7 comments · Fixed by #54
Assignees

Comments

@tbirdso
Copy link
Collaborator

tbirdso commented Jul 31, 2023

Overview

Tests for reading from .zip file store sometimes fail. Exact cause is unclear at the moment.

First CI run (failed)
https://github.com/InsightSoftwareConsortium/ITKIOOMEZarrNGFF/actions/runs/5716549674/job/15488329023?pr=35

Second CI run, no changes (passing):
https://github.com/InsightSoftwareConsortium/ITKIOOMEZarrNGFF/actions/runs/5716549674/job/15490659684?pr=35

Workaround

Re-running CI seems to work for now.

@tbirdso
Copy link
Collaborator Author

tbirdso commented Aug 2, 2023

This is currently blocking the v0.1.3 PyPI release: https://github.com/InsightSoftwareConsortium/ITKIOOMEZarrNGFF/actions/runs/5739761493

@tbirdso tbirdso changed the title BUG: Zip file tests sometimes fail CI BUG: Zip file output sometimes corrupted on Linux Aug 9, 2023
@tbirdso
Copy link
Collaborator Author

tbirdso commented Aug 9, 2023

Following investigation in #44, it was observed that the memory.zip output of the inMemory_zip test is sometimes a corrupted archive. The exact cause for this is not clear. Behavior is observed regularly in CI and infrequently in local testing on Linux platforms.

The Tensorstore zip driver is maintained in the https://github.com/InsightSoftwareConsortium/tensorstore fork and is not maintained by Google. We will need to track this ourselves.

Following offline discussion, we will disable the failing test for now to prevent blocking PyPI releases and continue investigation in parallel with subsequent work.

tbirdso added a commit to tbirdso/ITKIOOMEZarrNGFF that referenced this issue Aug 9, 2023
Disable "IOOMEZarrNGFF_inMemory_zip" test which sometimes writes a
corrupted archive on Linux.

"zip" tensorstore driver will be investigated independently in the
InsightSoftwareConsortium Tensorstore fork and enabled when "zip" write
on Linux is supported again.

InsightSoftwareConsortium#36
tbirdso added a commit to tbirdso/ITKIOOMEZarrNGFF that referenced this issue Aug 9, 2023
Disable "IOOMEZarrNGFF_inMemory_zip" test which sometimes writes a
corrupted archive on Linux.

"zip" tensorstore driver will be investigated independently in the
InsightSoftwareConsortium Tensorstore fork and enabled when "zip" write
on Linux is supported again.

InsightSoftwareConsortium#36
@dzenanz
Copy link
Member

dzenanz commented Aug 30, 2023

After all the contents are added to in-memory zip file, the buffer size is 47447. After closeZip() is called, the buffer size increases to the correct value of 47685. This leads me to suspect that destructor is not invoked sometimes.

I will now try to fix this.

I originally wrote this in the wrong issue's discussion: #48 (comment)

@dzenanz dzenanz self-assigned this Aug 30, 2023
@dzenanz
Copy link
Member

dzenanz commented Aug 30, 2023

Looking at tensorstore's commit log, there were some fixes recently regarding caches and deep copies. I will rebase our changes on top of current master, to see whether that resolves the problem.

@dzenanz
Copy link
Member

dzenanz commented Aug 30, 2023

With rebasing, I ran into a snag: google/tensorstore#110.

@dzenanz
Copy link
Member

dzenanz commented Aug 31, 2023

With building cleared up, the rebase does not help with crashes. We should still use the newer version of tensorstore, so I created this PR:
InsightSoftwareConsortium/tensorstore#1
Please just review, do not click merge - this branch should become the new master.

@dzenanz
Copy link
Member

dzenanz commented Sep 1, 2023

#54 fixes the problem via a workaround. I still think that the original problem was caused by the ~ZipEncapsulator() destructor not always being called. But digging through tensorstore internals seemed daunting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants