This repository has been archived by the owner on Oct 24, 2024. It is now read-only.
A tree-like structure must have nodes, each of which can contain multiple children, and those children have to be selectable via some kind of name. However, those names can either be keys to access the child objects, or inherent properties of the child objects.
In the former case we would have a node.children=tuple(child1, child2), where child1.name = 'steve', child2.name = 'mary' etc. In the latter case we would have node.children=dict('steve': child1, 'mary': child2), where each child need not have a name. It's not clear to me which of these approaches is better in our case.
It's easy to ensure that all nodes have names (and if we make nodes inherit from `Dataset` they will inherit a name), but storing children in tuples leads to annoying code like `child_we_want = next(c for c in node.children if c.name == name_we_want)`, instead of just `child_we_want = node[name_we_want]`. A `DataTree` is also quite intuitively represented by a nested dictionary where keys are parts of a path and values are either datasets or child nodes, and in that description we would not say that the key is an inherent property of the value.
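To make the comparison concrete, here is a minimal sketch of both storage options. `NamedNode` and `KeyedNode` are hypothetical classes written only for illustration; they are not part of anytree, xarray, or any existing DataTree code.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class NamedNode:
    """Hypothetical node whose name is an inherent property."""

    name: str
    children: Tuple["NamedNode", ...] = ()

    def get(self, name: str) -> "NamedNode":
        # Children live in a tuple, so lookup is a linear scan by name.
        for child in self.children:
            if child.name == name:
                return child
        raise KeyError(name)


@dataclass
class KeyedNode:
    """Hypothetical node whose children are stored under dictionary keys."""

    children: Dict[str, "KeyedNode"] = field(default_factory=dict)

    def __getitem__(self, name: str) -> "KeyedNode":
        # Children live in a dict, so lookup is plain indexing.
        return self.children[name]


steve, mary = NamedNode("steve"), NamedNode("mary")
named_root = NamedNode("root", children=(steve, mary))
found_by_scan = named_root.get("mary")

keyed_root = KeyedNode(children={"steve": KeyedNode(), "mary": KeyedNode()})
found_by_key = keyed_root["mary"]
```

In the dictionary version the child objects carry no name at all; the name exists only as a key held by the parent.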
Using a dictionary also means that the path to an object is distinct from the name of that object.
This also means that a node doesn't need a name at all, and becomes defined only in terms of its parent and children. In effect, the name of the node would be the key for which `self.parent.children[key]` returns `self`. Parentless nodes would be nameless.
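A rough sketch of how a name could be derived instead of stored, assuming each node keeps a reference to its parent. `TreeNode`, `add_child`, `name` and `path` are hypothetical names chosen for this example, not an existing API.

```python
from typing import Dict, List, Optional


class TreeNode:
    """Hypothetical node that stores no name of its own."""

    def __init__(self) -> None:
        self.parent: Optional["TreeNode"] = None
        self.children: Dict[str, "TreeNode"] = {}

    def add_child(self, key: str, child: "TreeNode") -> None:
        child.parent = self
        self.children[key] = child

    @property
    def name(self) -> Optional[str]:
        # The name is the key under which the parent stores this node.
        if self.parent is None:
            return None  # parentless nodes are nameless
        for key, child in self.parent.children.items():
            if child is self:
                return key
        return None

    @property
    def path(self) -> str:
        # Walk up to the root, collecting the key at each level.
        parts: List[str] = []
        node = self
        while node.parent is not None:
            parts.append(str(node.name))
            node = node.parent
        return "/" + "/".join(reversed(parts))


root = TreeNode()
steve = TreeNode()
obs = TreeNode()
root.add_child("steve", steve)
steve.add_child("obs", obs)
```

Note that the name lookup here is a scan over the parent's children, so the cost of the tuple approach has moved rather than disappeared; it is only paid when a node asks for its own name.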
A disadvantage of this is that a stored Dataset object has no idea who its parent is.
None of the tree implementations I've seen work like this, and it appears to deviate from the way that a "tree" is defined mathematically.
The anytree library uses named nodes and stores the children in tuples, so to use dictionaries we would need to reimplement its `NodeMixin` class.
A related question is whether the set of child nodes is ordered or not. In the mathematical definition of a tree it is unordered, but I'm not sure whether the order of nodes matters for certain filetypes. By using a tuple or list to store children we are implicitly ordering the tree, compared to using a set (or a `dict` before Python 3.6).
Even if we stick with an ordered type for storing the children we still have to decide if our trees are ordered or not, because it matters when checking equivalence between trees.
It might make sense to just choose the more general option (i.e. ordered), and then have flags to treat trees as unordered when it matters.
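One possible shape for such a flag is sketched below, assuming children are stored in an insertion-ordered `dict`. `Node` and `trees_equal` are hypothetical, and the comparison only looks at tree structure (child keys), not at any data stored on the nodes.

```python
from typing import Dict, Optional


class Node:
    """Hypothetical node holding children in an insertion-ordered dict."""

    def __init__(self, children: Optional[Dict[str, "Node"]] = None) -> None:
        self.children: Dict[str, "Node"] = dict(children or {})


def trees_equal(a: Node, b: Node, ordered: bool = True) -> bool:
    """Compare two trees structurally, optionally ignoring child order."""
    if ordered:
        # Keys must match and appear in the same order.
        if list(a.children) != list(b.children):
            return False
    elif set(a.children) != set(b.children):
        # Keys must match, in any order.
        return False
    return all(
        trees_equal(a.children[key], b.children[key], ordered)
        for key in a.children
    )


tree_a = Node({"steve": Node(), "mary": Node()})
tree_b = Node({"mary": Node(), "steve": Node()})

same_when_ordered = trees_equal(tree_a, tree_b)
same_when_unordered = trees_equal(tree_a, tree_b, ordered=False)
```

Defaulting to `ordered=True` keeps the stricter check as the standard behaviour, with the looser one available on request.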