-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate treenode module. #8757
Merged
Merged
Migrate treenode module. #8757
Changes from all commits
Commits
Show all changes
25 commits
Select commit
Hold shift + click to select a range
065db0e
Update the formating tests
flamingbear f167b99
Migrate treenode module
flamingbear 6bd492f
Update NotFoundInTreeError description.
flamingbear b62a21a
Reformat some comments
flamingbear 32053b6
Updates whats-new.rst
flamingbear 32e7453
mypy typing. (terrible?)
flamingbear 7f2a178
Adds __repr__ to NamedNode and updates test
flamingbear b4fb773
Merge remote-tracking branch 'pydata/main' into mhs/migrate_treenode
flamingbear 548536b
Adds quotes to NamedNode __str__ representation.
flamingbear b9685d7
swaps " for ' in NamedNode __str__ representation.
flamingbear 59da654
Adding Tom in so he gets blamed properly.
flamingbear b821fca
Merge remote-tracking branch 'pydata/main' into mhs/migrate_treenode
flamingbear 4530973
resolve conflict whats-new.rst
flamingbear c03d373
Moves test_treenode.py to xarray/tests.
flamingbear ccd5374
Merge branch 'main' into mhs/migrate_treenode
flamingbear 4830cd4
refactors backend tests for datatree IO
flamingbear e8459bc
Add explicit engine back in test_to_zarr
flamingbear 4238ce2
Removes OrderedDict from treenode
flamingbear db264c0
Renames tests/test_io.py -> tests/test_backends_datatree.py
flamingbear b4acb0d
typo
flamingbear 0f4b38a
Add types
flamingbear e7ea2b4
Merge pull request #7 from flamingbear/mhs/migrate-datatree-tests
flamingbear f2f327f
Pass mypy for 3.9
flamingbear 9cbaf3b
Merge branch 'main' into mhs/migrate_treenode
flamingbear 9db7040
Merge branch 'main' into mhs/migrate_treenode
flamingbear File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're exposing it (and keep it as a pure path, which I think we should), should we move this class to a separate module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we exposing
NodePath
? I suggested it, but after talking with @etienneschalk decided maybe it wasn't particularly helpful xarray-contrib/datatree#205.What's the logic for making it a pure path? The posix path is to ensure the slashes are in a consistent direction (remember this isn't a real filepath so shouldn't change automatically on windows systems).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about this, but I believe the difference is that
PurePosixPath
doesn't have methods that require read/write access (liketouch
,remove
,mkdir
,iterdir
, ...), whilePosixPath
does.The main difference between filesystems and
datatree
is that there's only a single filesystem instance at a time, while you can have multipleDataTree
objects at the same time. This means that aNodePath
inheriting fromPosixPath
would have to be created from theDataTree
object and would require a lot of discipline to not be confusing.On the other hand, a
NodePath
that inherits fromPurePosixPath
would help a bit with path manipulations (name
,root
,joinpath()
,parts
,parent
,parents
, ...) that we would otherwise have to useposixpath
for. I didn't check ifposixpath
has the same functionality asPurePosixPath
, though, and they might also not be as convenient to use.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I wrote my above comment too quickly - we both agree it should stay as a pure path.
We could also consider adding extra methods to this
NodePath
class if it simplifies other parts of the code.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related issue: xarray-contrib/datatree#283 (Consistency between DataTree methods and pathlib.PurePath methods)
To me the graal would be to have datatree more and more compatible with
PurePosixPath
(whatNodePath
already does by inheriting from it, but we can imagine having an API accepting both strings andPurePosixPath
, without the need to exposeNodePath
publicly).As @keewis mentioned , there is a single instance of the filesystem whereas there can be multiple DataTrees. So a single DataTree could act like a "small filesystem on its own" and implement the
PosixPath
methods.The idea here is to delegate to
pathlib
the diffcult task of choosing the best names for methods that help manipulating trees. For an audience of developers that are already used to writing scripts withpathlib
, this would make working with datatree automatic! And this also gives plenty of ideas of features. Also, with hierarchical-formats such as Zarr that rely on the filesystem, openable by datatree, the proximity is innate.Concrete example: working with multi-resolution rasters (a usecase of datatree). By using f-strings and glob, you can write resolution independant code once, like you would have done by manipulating a data structure on the file system
The schema on the pathlib doc (arrows are inheritance):
How I see it:
(I removed
Path
for simplification)I am not 100% sure it fully makes sense... The nuance is that when you instanciate a Path, it is always implicit that it is bound to the filesystem, while each DataTree instantiation creates its own data representation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @etienneschalk for all your thoughts here!
I strongly feel that DataTree / NodePath objects should not be associated with concrete filesystem paths.
However this
seems like a reasonable idea, and I think we should continue the discussion of that in xarray-contrib/datatree#283.
For now I don't think there is anything we need to do in this PR specifically - we can revisit the API choices / accepting path-like types in a future PR.