More efficient consolidation check in dataslice.update_slice #49

Open
3 tasks
pont-us opened this issue Sep 9, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@pont-us
Member

pont-us commented Sep 9, 2021

dataslice.update_slice checks whether the data store is consolidated by calling zarr.open_consolidated on it and catching the exception that is raised if it is not. It then uses xarray.open_zarr to open the data store as an xarray.Dataset. As a result, if the store is consolidated, the consolidated metadata object is read twice, which is inefficient. We should try to improve on this.

  • Measure performance with current implementation, using a remote S3 data store.
  • Implement a more efficient method of checking for consolidation.
  • Measure performance with the new implementation to confirm and quantify the improvement.
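
A rough sketch of how the before and after measurements in the list above might be taken. The helper name time_calls and the repeat count are hypothetical, not part of the existing code; the callable passed in would wrap whichever opening logic is being measured.

```python
import time
from typing import Callable, List


def time_calls(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Return the wall-clock duration in seconds of each of `repeats` calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()  # e.g. a lambda that opens the remote store (assumed, not shown)
        durations.append(time.perf_counter() - start)
    return durations
```

With a remote S3 store, network variability and any caching can dominate individual runs, so comparing the median of several runs for each implementation is probably more informative than a single timing.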
pont-us added the enhancement label Sep 9, 2021
@pont-us
Member Author

pont-us commented Sep 9, 2021

On further investigation, I think there's a simple solution: xarray.open_zarr (at least as of version 0.19.0) also accepts an explicit consolidated parameter and raises an exception when asked to open unconsolidated data with this parameter set to True. So we can attempt xarray.open_zarr with consolidated=True and either use the resulting dataset directly or catch the exception and fall back to an unconsolidated open, recording the consolidation state for later reference.
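
A minimal sketch of that approach, under some assumptions: the function name, the injectable opener argument (there only so the logic can be exercised without a real store), and the choice of KeyError as the exception to catch are all illustrative. The exception type actually raised may differ between xarray and zarr versions and should be checked against the versions in use.

```python
from typing import Any, Callable, Optional, Tuple


def open_noting_consolidation(
    store: Any,
    opener: Optional[Callable[..., Any]] = None,
    not_consolidated_errors: Tuple[type, ...] = (KeyError,),  # assumption
) -> Tuple[Any, bool]:
    """Open a Zarr store once and return (dataset, is_consolidated)."""
    if opener is None:
        # Imported lazily so the sketch can be loaded without xarray installed.
        import xarray as xr

        opener = xr.open_zarr
    try:
        # Succeeds only when consolidated metadata is present, so the
        # metadata is read once rather than twice.
        return opener(store, consolidated=True), True
    except not_consolidated_errors:
        # No consolidated metadata: fall back to an unconsolidated open.
        return opener(store, consolidated=False), False
```

The returned flag is what update_slice could keep for later reference in place of the separate zarr.open_consolidated check.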
