-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Basic DataView Implementation #543
Conversation
Codecov Report
@@ Coverage Diff @@
## master #543 +/- ##
==========================================
+ Coverage 37.08% 38.97% +1.88%
==========================================
Files 470 487 +17
Lines 26027 26800 +773
Branches 3961 4066 +105
==========================================
+ Hits 9652 10445 +793
+ Misses 15948 15899 -49
- Partials 427 456 +29
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do a very thorough review so there are quite a few comments. Some of them are minor, some of them are more serious. I wasn't actively checking for typos but I marked the ones that I did notice.
Some other things that I didn't write in the comments:
-
Quite a few docstrings and trait definitions are copy-pasted and are incorrect (some are incorrect only in parts)
-
Some flake8 warnings, mainly to do with unused imports
-
I think the core parts should have 100% test coverage as they are foundations of a big new feature. I pointed out some of the missing tests with in code comments, but not all
Coverage report
Name Stmts Miss Branch BrPart Cover Missing -------------------------------------------------------------------------------------------------------- pyface/data_view/__init__.py 0 0 0 0 100% pyface/data_view/abstract_data_model.py 28 2 10 0 95% 180, 206 pyface/data_view/abstract_value_type.py 21 0 2 0 100% pyface/data_view/data_models/__init__.py 0 0 0 0 100% pyface/data_view/data_models/api.py 1 1 0 0 0% 11 pyface/data_view/data_models/array_data_model.py 72 8 28 4 88% 136, 158-159, 208, 214, 240, 247, 261, 135->136, 207->208, 213->214, 258->261 pyface/data_view/data_view_widget.py 2 2 0 0 0% 11-13 pyface/data_view/i_data_view_widget.py 21 21 2 0 0% 11-58 pyface/data_view/index_manager.py 68 1 20 1 98% 267, 266->267 pyface/data_view/value_types/__init__.py 0 0 0 0 100% pyface/data_view/value_types/api.py 5 0 0 0 100% pyface/data_view/value_types/constant_value.py 8 0 0 0 100% pyface/data_view/value_types/editable_value.py 12 0 2 0 100% pyface/data_view/value_types/no_value.py 7 0 0 0 100% pyface/data_view/value_types/numeric_value.py 35 0 0 0 100% pyface/data_view/value_types/text_value.py 4 0 0 0 100%
Overall I think the API is good! There are some parts that confused me (as pointed out in the comments) but either with some fixes or additional documentation it should be easy to understand and easy to use.
Assuming that the tests pass, I think this is ready for a re-review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the changes! Now the test coverage of the new code is 100% 🎉
There are a few remaining comments from the previous review. This batch of comments is mostly about copy-paste docstrings. I added suggestions where I could to make them easy to address.
from pyface.data_view.index_manager import TupleIndexManager | ||
|
||
|
||
class ArrayDataModel(AbstractDataModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstrings of the methods in this class still need fixing.
OK, once more into the breach. It would be good to get this finished. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the interest of making my review visible, I am pushing the review comments I have so far. It is really taking me a long time to review this, and I still haven't got to the toolkit-specific implementation yet. Given I have not finished reviewing the whole thing, and it is possible that some of my comments are stupid / plainly wrong. Please excuse those...
I try to categorize my reviews into three types:
(1) Design
(2) Implementation details
(3) Styling / nitpicks.
In this first pass, I try to focus more on (1), and less on (2) and (3)
docs/source/data_view.rst
Outdated
================= | ||
|
||
The Pyface DataView API allows visualization of heirarchical and | ||
non-heirarchical tabular data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-heirarchical tabular data. | |
non-hierarchical tabular data. |
Styling / nit: There are a few more occurrences of the same misspelling which can be fixed by doing a find-and-replace-all from "heirarch"
to "hierarch"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I alwys get it wrong :(
if role == Qt.EditRole: | ||
if value_type.can_edit(self.model, row, column): | ||
return value_type.set_editable(self.model, row, column, value) | ||
elif role == Qt.TextRole: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no Qt.TextRole
, but there is Qt.DisplayRole
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Somewhat surprising that this is working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The next question is... why didn't the tests detect this? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests didn't detect it because this is UI scaffolding code that is likely to be replaced/re-written. And looking more closely, it probably never gets hit because the role is always Qt.EditRole
when called from existing Qt objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation details:
can_edit
is not checked before set_text
is called in this branch. If we are going to keep this branch, perhaps can_edit
should be checked early before all these branches.
I can't seem to respond to this comment directly:
I want to keep the name distinct from the |
I think this now addresses everything except the ui dispatch audit that needs doing (mainly to make sure it is consistent for now). There is also a bit of re-work needed for the documentation, as I haven't updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the new changes. I have gone through them again, they look pretty good.
Summary / To do:
- Agreed the “dispatch=‘ui’” should be audited, preferably moved out of the data model.
- The new _AtLeastTwoDArray: validate may be more strict. We should avoid overriding a private method in Array.
- Revisit whether set_value should also be responsible for firing an event to ‘values_changed’, or whether that responsibility can be moved down to lower level code.
- Like you said, a few of the docstring and returned values for the ‘set_something’ need to be updated (e.g. raise instead of returning a boolean)
There are a few more comments on the scaffolding code for demo purposes. But because we are definitely going to rewrite them, those don't need to be changed now as long as there are reminders for when they are rewritten.
def validate(self, object, name, value): | ||
from numpy import atleast_2d | ||
value = super().validate(object, name, value) | ||
return atleast_2d(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a first cut, we probably want to be strict first, and lenient later when we have use cases to support it. In other words, we should just raise TraitError
if the number of dimension is less than 2, and not bother changing its shape using atleast_2d
. e.g. it could be the application code's responsibility to ensure the number of dimensions is sufficient before assigning the value to ArrayDataModel.data
. Maybe a 3D array is more appropriate in an application. Maybe a 1D array of shape (10, )
should be turned into a shape (10, 1)
instead of (1, 10)
. By raising a validation error, we allow these issues to be presented to the developers early and this trait type does not have to guess what shape should be used instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with it not raising an exception right now: the case that I was most concerned about was actually if no data is provided (we get an empty view); and a single column is a sane default for a 1D array (way better than a single row if the array is large). If someone really doesn't like it, they can provide data in the shape that they really want without too much difficulty.
I think that these are the right default choices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a single column is a sane default for a 1D array (way better than a single row if the array is large).
Wait, given an array of shape (X, Y)
, isn't X the number of rows and Y the number of columns?
If the first dimension is the number of rows, I think atleast_2d
returns a single row, rather than a single column.
If someone really doesn't like it, they can provide data in the shape that they really want without too much difficulty.
The same can be applied to when a validation occurs, it is easy to put it right. Trying to guess an orientation when a 1D array is provided is "guessing-what-you-mean", which is not a good thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh. You're right, it does produce a single row. That's not optimal. I'll fix it one way or another.
In terms of APIs, I feel that there are different standards depending on who the user of the API is. In this case ArrayDataModel
is designed to be used by other programmers who want to display their data, so there is value in affordances where 90%+ of the time "what you mean" is clear, particularly if the other 10% is easy to make the meaning clear (ie. we aren't preventing the other 10% of uses). By saving some effort for the 90% case, you make the API more ergonomic.
There is an issue which this has raised though, which is that atleast_2d
doesn't modify in place (good!) but it's not the original array that is assigned (bad!) - I have to have another look at what implicit promises Array
makes in terms of identity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up on the last point - for better or worse, the validate method of the Array
trait does not guarantee that the object stored in the trait is the same object that was set, even in the non-coercing case. In particular there are places where asarray
is called, which may create new ndarray
instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And following up on the last point; traits.trait_numeric
really needs a clean-up or re-implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woops, I did not notice that the array may be a copy. Agreed that's bad, I think we want to keep the original array.
This is quite similar to the problems / surprises people often run into when they assign a list to a trait of List
type... the list is copied implicitly.
Maybe we just want an instance of array here for now. Developers should make sure they assign an array with at least two dimensions: If they don't, bad things will happen anyway, it just happens later (less ideal). We can introduce the validation logic later after cleaning up trait_numeric
. It seems more important to be making sure the original array is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think trait_numeric
is going to need a re-write. The name is a hint: Numeric was the predecessor to NumPy.
But even if we use Array
we aren't giving any guarantees for the user on identity of the array. I've simplified things out, but I'm also happy to revert to Array
with a shape (0, 0) default supplied by a default handler with the idea of fixing later.
pyface/data_view/index_manager.py
Outdated
Parameters | ||
---------- | ||
id : int | ||
An integer object id value. | ||
|
||
Returns | ||
------- | ||
index : index object | ||
The persistent index object associated with this id. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The returned value and parameters are still swapped here.
pyface/data_view/index_manager.py
Outdated
Parameters | ||
---------- | ||
id : int | ||
An integer object id value. | ||
|
||
Returns | ||
------- | ||
index : index object | ||
The persistent index object associated with this id. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The returned value and parameters are still swapped here.
return True | ||
|
||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did I understand correctly that the return True
can be replaced with return
(to terminate) and the return False
will be replaced with raise DataViewSetError()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed that method.
self.on_values_changed, | ||
'values_changed', | ||
remove=True, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a reminder of what has not been resolved: When DataViewWidget.destroy
is called, we want this DataViewItemModel
to go, but if it is somehow kept alive (e.g. by reference cycles), then the change handler would still be alive and be invoked when the values_changed
and structure_changed
change. We probably want to make sure that does not happen by making sure observe(..., remove=True)
is always called at the end.
I understand this scaffolding code will get rewritten. With that in mind, perhaps we can add a comment here or open an issue to remind us.
Returns | ||
------- | ||
success : bool | ||
Whether or not the value was successfully set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be raise DataViewSetError
I think.
Right, I think everything is now addressed. May have missed some things, but I think I got them all. Ready for any needed re-review assuming tests pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Most comments are surrounding the documentation and examples.
to represent the rows and columns, as illustrated below: | ||
|
||
.. figure:: images/data_view_indices.png | ||
:alt: an illustration of data view indices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! 👍
Co-authored-by: Kit Choi <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, with the caveats that some of the code being merged here are actually scaffolding meant to be removed later.
It wasn't obvious to me at all until the last iteration that the examples are not ready to be consumed more widely: They are going into master and look clean at the superficial level. #569 is open to remind us to clean them up before the release.
For future considerations: Perhaps this is why I tend to keep demo code in a separate branch instead of trying to get them into master: With clear instructions, reviewers are capable of checking out a demo branch for playing with demo code. If it is in a separate branch, it is obvious for the reviewer that they are draft and try to understand them in that context. If it is going into master, reviewers may think they are aiming for production, wasting some review cycles trying to fix up things that are not meant to be production ready. Even if it was made clear that it is not aiming for production, there is always this nag in the back of the head "what if we did not come back to clean this up".
Note: partially resolves #533 |
This aims to resolve #531.
This provides an abstract model class, with basic front-ends for Qt and Wx. It provides two concrete implementations: an n-dimensional array model, and a model which presents has traits classes as columns (the latter is an example).
Remaining tasks are:
Some tips for reading the code:
So when reviewing, what really matters is the design of the AbstractDataModel class. Does it seem good? Do you see problems with implementing your own version of it for, say, a pandas dataframe? Beyond that, it is a question of code cleanliness, working examples, and test coverage.