Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic DataView Implementation #543

Merged
merged 52 commits into from
Jul 14, 2020
Merged

Basic DataView Implementation #543

merged 52 commits into from
Jul 14, 2020

Conversation

corranwebster
Copy link
Contributor

@corranwebster corranwebster commented Jun 16, 2020

This aims to resolve #531.

This provides an abstract model class, with basic front-ends for Qt and Wx. It provides two concrete implementations: an n-dimensional array model, and a model which presents has traits classes as columns (the latter is an example).

Remaining tasks are:

  • tests for column model (or alternatively remove from this PR)
  • tests for a couple of value type classes (constant and none)
  • docstrings and basic documentation
  • wx data view needs some fixes

Some tips for reading the code:

  • there are two core ABCs in this PR: AbstractDataModel and AbstractValueType. The documentation gives more detail on these, but the idea of AbstractDataModel gives the shape and raw values for each cell; AbstractValueType handles converting raw values to formats which are useful to the UI
  • AbstractValueType is not complete: it needs to grow methods for Colors, Images, etc. in future PRs
  • the IndexManagers handle converting from toolkit indices to sequence-based row indices, plus ensuring that referenced things don't get garbage collected out from underneath the toolkit C++ code. This is the most opaque bit of the code if you haven't written low-level table models, so on initial reading you can consider it "magic".
  • the ArrayDataModel is a concrete implementation of AbstractDataModel that displays an n-dim array as a heirarchical view.
  • there are a bunch of concrete AbstractValueType classes to handle common types and support demos. This is not all the value types that will be needed, but it is enough to demonstrate capability.
  • the UI code is essentially scaffolding: enough to get things up and running. It will change or be replaced.

So when reviewing, what really matters is the design of the AbstractDataModel class. Does it seem good? Do you see problems with implementing your own version of it for, say, a pandas dataframe? Beyond that, it is a question of code cleanliness, working examples, and test coverage.

@codecov-commenter
Copy link

codecov-commenter commented Jun 16, 2020

Codecov Report

Merging #543 into master will increase coverage by 1.88%.
The diff coverage is 76.32%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #543      +/-   ##
==========================================
+ Coverage   37.08%   38.97%   +1.88%     
==========================================
  Files         470      487      +17     
  Lines       26027    26800     +773     
  Branches     3961     4066     +105     
==========================================
+ Hits         9652    10445     +793     
+ Misses      15948    15899      -49     
- Partials      427      456      +29     
Impacted Files Coverage Δ
pyface/__init__.py 82.60% <ø> (ø)
pyface/ui/qt4/init.py 62.50% <33.33%> (ø)
pyface/ui/qt4/data_view/data_view_item_model.py 49.06% <49.06%> (ø)
pyface/ui/qt4/workbench/split_tab_widget.py 12.47% <50.00%> (ø)
pyface/ui/wx/data_view/data_view_model.py 56.36% <56.36%> (ø)
pyface/qt/QtNetwork.py 75.00% <60.00%> (ø)
pyface/qt/QtOpenGL.py 75.00% <60.00%> (ø)
pyface/qt/QtSvg.py 75.00% <60.00%> (ø)
pyface/qt/QtTest.py 75.00% <60.00%> (ø)
pyface/qt/QtWebKit.py 68.42% <60.00%> (ø)
... and 67 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c98d1d3...516cc22. Read the comment docs.

Copy link
Contributor

@ievacerny ievacerny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do a very thorough review so there are quite a few comments. Some of them are minor, some of them are more serious. I wasn't actively checking for typos but I marked the ones that I did notice.

Some other things that I didn't write in the comments:

  • Quite a few docstrings and trait definitions are copy-pasted and are incorrect (some are incorrect only in parts)

  • Some flake8 warnings, mainly to do with unused imports

  • I think the core parts should have 100% test coverage as they are foundations of a big new feature. I pointed out some of the missing tests with in code comments, but not all

    Coverage report
    Name                                                         Stmts   Miss Branch BrPart  Cover   Missing
    --------------------------------------------------------------------------------------------------------
    pyface/data_view/__init__.py                                     0      0      0      0   100%
    pyface/data_view/abstract_data_model.py                         28      2     10      0    95%   180, 206
    pyface/data_view/abstract_value_type.py                         21      0      2      0   100%
    pyface/data_view/data_models/__init__.py                         0      0      0      0   100%
    pyface/data_view/data_models/api.py                              1      1      0      0     0%   11
    pyface/data_view/data_models/array_data_model.py                72      8     28      4    88%   136, 158-159, 208, 214, 240, 247, 261, 135->136, 207->208, 213->214, 258->261
    pyface/data_view/data_view_widget.py                             2      2      0      0     0%   11-13
    pyface/data_view/i_data_view_widget.py                          21     21      2      0     0%   11-58
    pyface/data_view/index_manager.py                               68      1     20      1    98%   267, 266->267
    pyface/data_view/value_types/__init__.py                         0      0      0      0   100%
    pyface/data_view/value_types/api.py                              5      0      0      0   100%
    pyface/data_view/value_types/constant_value.py                   8      0      0      0   100%
    pyface/data_view/value_types/editable_value.py                  12      0      2      0   100%
    pyface/data_view/value_types/no_value.py                         7      0      0      0   100%
    pyface/data_view/value_types/numeric_value.py                   35      0      0      0   100%
    pyface/data_view/value_types/text_value.py                       4      0      0      0   100%
    

Overall I think the API is good! There are some parts that confused me (as pointed out in the comments) but either with some fixes or additional documentation it should be easy to understand and easy to use.

docs/source/data_view.rst Outdated Show resolved Hide resolved
pyface/data_view/data_models/array_data_model.py Outdated Show resolved Hide resolved
pyface/data_view/data_models/array_data_model.py Outdated Show resolved Hide resolved
pyface/data_view/index_manager.py Outdated Show resolved Hide resolved
docs/source/data_view.rst Outdated Show resolved Hide resolved
docs/source/data_view.rst Outdated Show resolved Hide resolved
docs/source/data_view.rst Outdated Show resolved Hide resolved
docs/source/data_view.rst Outdated Show resolved Hide resolved
@corranwebster
Copy link
Contributor Author

Assuming that the tests pass, I think this is ready for a re-review.

Copy link
Contributor

@ievacerny ievacerny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes! Now the test coverage of the new code is 100% 🎉

There are a few remaining comments from the previous review. This batch of comments is mostly about copy-paste docstrings. I added suggestions where I could to make them easy to address.

pyface/data_view/abstract_data_model.py Outdated Show resolved Hide resolved
pyface/data_view/abstract_data_model.py Outdated Show resolved Hide resolved
pyface/data_view/abstract_data_model.py Outdated Show resolved Hide resolved
pyface/data_view/abstract_value_type.py Outdated Show resolved Hide resolved
pyface/data_view/abstract_value_type.py Outdated Show resolved Hide resolved
from pyface.data_view.index_manager import TupleIndexManager


class ArrayDataModel(AbstractDataModel):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstrings of the methods in this class still need fixing.

pyface/data_view/index_manager.py Outdated Show resolved Hide resolved
pyface/data_view/index_manager.py Outdated Show resolved Hide resolved
pyface/data_view/value_types/editable_value.py Outdated Show resolved Hide resolved
pyface/data_view/tests/test_data_view_widget.py Outdated Show resolved Hide resolved
@corranwebster
Copy link
Contributor Author

OK, once more into the breach. It would be good to get this finished.

Copy link
Contributor

@kitchoi kitchoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interest of making my review visible, I am pushing the review comments I have so far. It is really taking me a long time to review this, and I still haven't got to the toolkit-specific implementation yet. Given I have not finished reviewing the whole thing, and it is possible that some of my comments are stupid / plainly wrong. Please excuse those...

I try to categorize my reviews into three types:
(1) Design
(2) Implementation details
(3) Styling / nitpicks.
In this first pass, I try to focus more on (1), and less on (2) and (3)

=================

The Pyface DataView API allows visualization of heirarchical and
non-heirarchical tabular data.
Copy link
Contributor

@kitchoi kitchoi Jun 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
non-heirarchical tabular data.
non-hierarchical tabular data.

Styling / nit: There are a few more occurrences of the same misspelling which can be fixed by doing a find-and-replace-all from "heirarch" to "hierarch".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I alwys get it wrong :(

docs/source/data_view.rst Outdated Show resolved Hide resolved
pyface/data_view/abstract_data_model.py Show resolved Hide resolved
pyface/data_view/abstract_data_model.py Outdated Show resolved Hide resolved
examples/data_view/column_data_model.py Outdated Show resolved Hide resolved
pyface/data_view/abstract_value_type.py Outdated Show resolved Hide resolved
pyface/data_view/abstract_value_type.py Show resolved Hide resolved
pyface/data_view/abstract_value_type.py Outdated Show resolved Hide resolved
if role == Qt.EditRole:
if value_type.can_edit(self.model, row, column):
return value_type.set_editable(self.model, row, column, value)
elif role == Qt.TextRole:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no Qt.TextRole, but there is Qt.DisplayRole.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Somewhat surprising that this is working.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next question is... why didn't the tests detect this? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests didn't detect it because this is UI scaffolding code that is likely to be replaced/re-written. And looking more closely, it probably never gets hit because the role is always Qt.EditRole when called from existing Qt objects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation details:
can_edit is not checked before set_text is called in this branch. If we are going to keep this branch, perhaps can_edit should be checked early before all these branches.

@corranwebster
Copy link
Contributor Author

corranwebster commented Jun 30, 2020

I can't seem to respond to this comment directly:

Can this method be called get_value and not be concerned with whether the value can be edited or not?

I want to keep the name distinct from the get_value returned by the model, as the editable thing may be different from the value (eg. imaging if the model returns a HasTraits object for get_value but the thing being edited is actually a particular attribute on the object).

@corranwebster
Copy link
Contributor Author

I think this now addresses everything except the ui dispatch audit that needs doing (mainly to make sure it is consistent for now). There is also a bit of re-work needed for the documentation, as I haven't updated set_value there, and I want to pull out the code into a proper working example.

Copy link
Contributor

@kitchoi kitchoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the new changes. I have gone through them again, they look pretty good.

Summary / To do:

  • Agreed the “dispatch=‘ui’” should be audited, preferably moved out of the data model.
  • The new _AtLeastTwoDArray: validate may be more strict. We should avoid overriding a private method in Array.
  • Revisit whether set_value should also be responsible for firing an event to ‘values_changed’, or whether that responsibility can be moved down to lower level code.
  • Like you said, a few of the docstring and returned values for the ‘set_something’ need to be updated (e.g. raise instead of returning a boolean)

There are a few more comments on the scaffolding code for demo purposes. But because we are definitely going to rewrite them, those don't need to be changed now as long as there are reminders for when they are rewritten.

def validate(self, object, name, value):
from numpy import atleast_2d
value = super().validate(object, name, value)
return atleast_2d(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a first cut, we probably want to be strict first, and lenient later when we have use cases to support it. In other words, we should just raise TraitError if the number of dimension is less than 2, and not bother changing its shape using atleast_2d. e.g. it could be the application code's responsibility to ensure the number of dimensions is sufficient before assigning the value to ArrayDataModel.data. Maybe a 3D array is more appropriate in an application. Maybe a 1D array of shape (10, ) should be turned into a shape (10, 1) instead of (1, 10). By raising a validation error, we allow these issues to be presented to the developers early and this trait type does not have to guess what shape should be used instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with it not raising an exception right now: the case that I was most concerned about was actually if no data is provided (we get an empty view); and a single column is a sane default for a 1D array (way better than a single row if the array is large). If someone really doesn't like it, they can provide data in the shape that they really want without too much difficulty.

I think that these are the right default choices.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a single column is a sane default for a 1D array (way better than a single row if the array is large).

Wait, given an array of shape (X, Y), isn't X the number of rows and Y the number of columns?
If the first dimension is the number of rows, I think atleast_2d returns a single row, rather than a single column.

If someone really doesn't like it, they can provide data in the shape that they really want without too much difficulty.

The same can be applied to when a validation occurs, it is easy to put it right. Trying to guess an orientation when a 1D array is provided is "guessing-what-you-mean", which is not a good thing to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh. You're right, it does produce a single row. That's not optimal. I'll fix it one way or another.

In terms of APIs, I feel that there are different standards depending on who the user of the API is. In this case ArrayDataModel is designed to be used by other programmers who want to display their data, so there is value in affordances where 90%+ of the time "what you mean" is clear, particularly if the other 10% is easy to make the meaning clear (ie. we aren't preventing the other 10% of uses). By saving some effort for the 90% case, you make the API more ergonomic.

There is an issue which this has raised though, which is that atleast_2d doesn't modify in place (good!) but it's not the original array that is assigned (bad!) - I have to have another look at what implicit promises Array makes in terms of identity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up on the last point - for better or worse, the validate method of the Array trait does not guarantee that the object stored in the trait is the same object that was set, even in the non-coercing case. In particular there are places where asarray is called, which may create new ndarray instances.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And following up on the last point; traits.trait_numeric really needs a clean-up or re-implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woops, I did not notice that the array may be a copy. Agreed that's bad, I think we want to keep the original array.
This is quite similar to the problems / surprises people often run into when they assign a list to a trait of List type... the list is copied implicitly.

Maybe we just want an instance of array here for now. Developers should make sure they assign an array with at least two dimensions: If they don't, bad things will happen anyway, it just happens later (less ideal). We can introduce the validation logic later after cleaning up trait_numeric. It seems more important to be making sure the original array is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think trait_numeric is going to need a re-write. The name is a hint: Numeric was the predecessor to NumPy.

But even if we use Array we aren't giving any guarantees for the user on identity of the array. I've simplified things out, but I'm also happy to revert to Array with a shape (0, 0) default supplied by a default handler with the idea of fixing later.

Comment on lines 317 to 325
Parameters
----------
id : int
An integer object id value.

Returns
-------
index : index object
The persistent index object associated with this id.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The returned value and parameters are still swapped here.

Comment on lines 411 to 419
Parameters
----------
id : int
An integer object id value.

Returns
-------
index : index object
The persistent index object associated with this id.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The returned value and parameters are still swapped here.

Comment on lines 235 to 237
return True

return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did I understand correctly that the return True can be replaced with return (to terminate) and the return False will be replaced with raise DataViewSetError()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed that method.

self.on_values_changed,
'values_changed',
remove=True,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a reminder of what has not been resolved: When DataViewWidget.destroy is called, we want this DataViewItemModel to go, but if it is somehow kept alive (e.g. by reference cycles), then the change handler would still be alive and be invoked when the values_changed and structure_changed change. We probably want to make sure that does not happen by making sure observe(..., remove=True) is always called at the end.

I understand this scaffolding code will get rewritten. With that in mind, perhaps we can add a comment here or open an issue to remind us.

Comment on lines 176 to 179
Returns
-------
success : bool
Whether or not the value was successfully set.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be raise DataViewSetError I think.

@corranwebster
Copy link
Contributor Author

Right, I think everything is now addressed. May have missed some things, but I think I got them all.

Ready for any needed re-review assuming tests pass.

Copy link
Contributor

@kitchoi kitchoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Most comments are surrounding the documentation and examples.

docs/source/data_view.rst Outdated Show resolved Hide resolved
to represent the rows and columns, as illustrated below:

.. figure:: images/data_view_indices.png
:alt: an illustration of data view indices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! 👍

docs/source/data_view.rst Outdated Show resolved Hide resolved
docs/source/data_view.rst Outdated Show resolved Hide resolved
docs/source/data_view.rst Outdated Show resolved Hide resolved
pyface/data_view/abstract_data_model.py Outdated Show resolved Hide resolved
pyface/data_view/index_manager.py Outdated Show resolved Hide resolved
pyface/data_view/index_manager.py Outdated Show resolved Hide resolved
pyface/data_view/index_manager.py Outdated Show resolved Hide resolved
pyface/data_view/index_manager.py Outdated Show resolved Hide resolved
Copy link
Contributor

@kitchoi kitchoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with the caveats that some of the code being merged here are actually scaffolding meant to be removed later.

It wasn't obvious to me at all until the last iteration that the examples are not ready to be consumed more widely: They are going into master and look clean at the superficial level. #569 is open to remind us to clean them up before the release.

For future considerations: Perhaps this is why I tend to keep demo code in a separate branch instead of trying to get them into master: With clear instructions, reviewers are capable of checking out a demo branch for playing with demo code. If it is in a separate branch, it is obvious for the reviewer that they are draft and try to understand them in that context. If it is going into master, reviewers may think they are aiming for production, wasting some review cycles trying to fix up things that are not meant to be production ready. Even if it was made clear that it is not aiming for production, there is always this nag in the back of the head "what if we did not come back to clean this up".

@corranwebster corranwebster merged commit 1d63210 into master Jul 14, 2020
@corranwebster
Copy link
Contributor Author

Note: partially resolves #533

@rahulporuri rahulporuri deleted the feat/data-view-basics branch October 27, 2020 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New Tree/Table Editor: core functionality
5 participants