Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raise AttributeError for __asdf_traverse__ on NDArrayType #1572

Merged
merged 3 commits into from
Aug 7, 2023

Conversation

braingram
Copy link
Contributor

NDArrayType does not implement this method and if not intercepted (and an AttributeError raised) than AsdfFile.info will load all array data

Fixes: #1553

This PR adds a regression test for this bug.

@braingram braingram added the development No backport required label Jun 27, 2023
@github-actions github-actions bot modified the milestone: 3.0.0 Jun 27, 2023
@braingram braingram marked this pull request as ready for review June 27, 2023 17:50
@braingram braingram requested a review from a team as a code owner June 27, 2023 17:50
asdf/_tests/_issues/test_1553.py Outdated Show resolved Hide resolved
@braingram braingram force-pushed the info_array_loading branch 2 times, most recently from bbd9901 to 7cea2ad Compare July 14, 2023 20:26
@braingram
Copy link
Contributor Author

Following up on discussion with @perrygreenfield and comments by @WilliamJamieson and @pllim on test naming here.

It was discussed that it might be easiest to address the issue in this smaller PR compared instead of #1537 (which makes other large changes).

It was never my intention to discourage engagement or make this project 'hostile' to developers. If these changes did anything of the sort I apologize and am happy to not use the naming structure. I never had strong opinions about it but hopefully I can explain why I chose the structure and we can reach a happy compromise that addresses the valid and helpful concerns that everyone has raised.

During work on #1537 I encountered many issues (see the PR for the full list). Many of these were quite high level (similar to issues that an end-user might raise using the public API). The reproducers for these high level issues are often not good unit tests as they cover a wide surface area of the code. It occurred to me that it might be nice to separate the regression tests added alongside fixing these issues from unit tests (and in the case of asdf the mostly unorganized test suite to hopefully work towards a more organized structure). Separating the unit and regression tests would have a few benefits, a major one being that it would be relatively easy to run coverage on just the unit tests (to reveal obvious gaps). In addition to separating unit and regression tests, it occurred to me that the issues often contain discussion, external links, failed exploration, and other information that often does not make it into the code, commits and comments with the PR that fixes the issue. As this information can be useful if a regression tests starts failing it seems beneficial to make the issue number prominent in a failed test run (and not a comment in the regression test requiring one to open the source for the failed test). Given these reasons I put forward the structure that was proposed in #1537 that regression tests for issues be placed in _issues/test_<issue number> and be named test_<issue number.

As noted, there are issues with this approach:

  • the issue number, in itself, is not information
  • migration to a new system (not github) means that issue numbers are not permanent
  • no one memorizes issue numbers
  • what if the test span multiple modules?
  • what if a future issue is opened about the old issue?
  • additionally the issue number as show above is redundant if in the file name and test name

I like @WilliamJamieson suggestion of test classes. This seems like a way to link the issue number to the regression tests (and provides obvious grouping for tests related to the issue). I do not mean to dismiss this suggestion and would be happy to go with this approach. What I'm proposing in this PR is to:

  • change the test naming to something descriptive (for this PR it's currently test_asdf_info_loads_arrays)
  • include a brief description and link to the issue in the docstring
  • rename _issues to _regtests to further make these distinct from unit tests
  • keep the files as test_<issue number>.py. As pytest usually includes both the file name and test name in output this means that both the issue number and descriptive name are apparent on a failed test run

A few reasons for keeping the issue number as the filename are:

  • as noted above it is printed on a failed test run and can be useful for finding the wider context contained in the issue discussion
  • during work on the issue, where only one issue is in mind, it makes it easy to type/complete while working on the issue pytest asdf/_tests/_regtests/test_1572.py or pytest -k test_1572
  • this might help with migration if the code is moved from github. Each file can be edited to point to the new issue (if the issues can be migrated) or renamed based on whatever strategy is devised at the time.

I don't believe I have yet responded to @pllim question about what if a new issue is opened for the same problem (also please let me know if I've missed any other questions and concerns). If the original regression test failed to produce the same issue, it seems reasonable to delete it and replace it with a new test (linking back to the new issue, which links to the old issue, probably also including a description of the old and new issues in the docstring of the test). If the new failure is sufficiently distinct, adding a second regression test might be a sensible option.

I am happy to add to the contributing documentation describing whatever structure we agree upon.

I don't mean to belabor anything but I am not strongly committed to these changes I only found it useful to organize things by issue while attempting to tackle the numerous issues exposed during rewriting the block management. I am happy to adopt test classes as @WilliamJamieson suggested I only wanted to raise the option of keeping the issue name in the filename as I had to pick a file name for the new tests and this seemed like a way to insert the issue number in a way that should be easy to migrate.

@pllim
Copy link
Contributor

pllim commented Jul 14, 2023

If you want to separate unit test from regtest, sure, have sub-dirs. Though personally I don't find that distinction useful in practice. It is either run in CI or not. But maybe useful if you want users to only run pytest tests/unit_tests/ or something.

tests/
    unit_tests/
    regtests/

Issue number and links are only useful if you decide you want to revisit old discussions. If the test name is descriptive and purpose of the test is clear, I find this need rare. I think something like this would be good enough for most cases.

test_chunk_overflow():
    """Test such and such bug that happens when this and that.
    See <GitHub issue link> for more info.
    """
    # test_code_here

In the future, if someone else run into another edge case of this bug, it is immediately clear which function they have to add to. And they also have access to the link if they want to know more. Even if there is no link provided, git blame would take them to the original PR.

@pllim
Copy link
Contributor

pllim commented Jul 14, 2023

Thank you for the lengthy clarification! I appreciate that you took time to clarify things.

@braingram
Copy link
Contributor Author

devdeps will likely fail due to pyyaml and cython3 incompatibility: yaml/pyyaml#601

Copy link
Contributor

@eslavich eslavich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@braingram
Copy link
Contributor Author

@WilliamJamieson does the proposed test structure/naming in this PR address your concern? #1572 (comment)

NDArrayType does not implement this method and if not
intercepted (and an AttributeError raised) than AsdfFile.info
will load all array data

Fixes: asdf-format#1553
@braingram braingram dismissed WilliamJamieson’s stale review August 7, 2023 19:11

Discussion is out of date.

@braingram braingram merged commit bf1843e into asdf-format:main Aug 7, 2023
22 checks passed
@braingram braingram deleted the info_array_loading branch August 7, 2023 19:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development No backport required
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AsdfFile.info loads array data
4 participants