-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TCTracks: improve hdf5 I/O #735
Conversation
Great addition, thank you @tovogt ! 🎉 Regarding backwards compatibility: The purpose of So I am fine with dropping it. However, I would like to have a useful error message when somebody tries to load an old-style file with the new reader. It should contain instructions on how to resolve the issue (load the original data, then store it with the new writer) and what to do if this does not work (raise an issue). In case somebody complains, we can still come up with a script that "translates" the old into the new format. Finally, I am a bit curious about the testing. Why do the tests still pass? Do you think it's justified to not adapt any test? |
Thanks for your feedback! I added a helpful error message in case users try to load a file in the legacy file format. To test this functionality, I also added a 90 KB test data file to the repository. I hope that's okay for a binary blob.
The tests only tested the cycling of |
@bguillod and I don't use the existing functionality (as far as we can remember), so I'm fine if this is a breaking change. And very excited for faster I/O. Thanks!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All changes look good to me, but there are a few remaining linter issues. As soon as they are fixed, this can be merged!
Thanks for double-checking. I pushed another commit that should fix all remaining points. |
Changes proposed in this PR:
TCTracks.from_hdf5
andTCTracks.write_hdf5
work internally. While it does not change the API, it changes the file structure and is not backwards compatible the way it is implemented here in the sense that files that have been stored with the old implementation cannot be read with the new implementation.I think that nobody currently uses the HDF5 I/O feature of
TCTracks
objects, and think that it's safe to drop backwards compatibility, but I might be wrong. I think, I also talked to @chahan about this a few months ago and he was quite optimistic that nobody actually uses this feature and we can easily change the file format at some point.@ThomasRoosli You modified the
TCForecast
class to be compatible with this feature back in the day (CLIMADA-project/climada_petals#33). The proposed new format won't require any changes to the wrapper you wrote forTCForecast
. However, did you end up using this HDF5 I/O feature in practice?@ChrisFairless @bguillod and everyone that uses CLIMADA's TC feature: Do you know anyone who actually uses the HDF5 IO features of the TCTracks class?
If you insist that backwards compatibility is important, I would have this proposition: We let
write_hdf5
write to the new format no matter what - this should be safe. But infrom_hdf5
, we implement a check that determines whether the user is trying to read from the legacy file format and falls back to the old implementation if necessary. I would prefer to drop backwards compatibility because it's quite a lot of lines of code, and I'm quite convinced that nobody uses the old format. But I might be wrong...PR Author Checklist
develop
)PR Reviewer Checklist