Renamed failing test, added prospective use case for NWBI warning. #1162

TheChymera · 2022-11-21T11:47:19Z

Addressing #1158

Properly rename the test which is now testing that we properly fail on a critical error
Formulate a new test for checking that we properly display non-critical errors (i.e. WARNINGs)

The issue with the latter is that for validation via DANDI there is only one none-critical type of error, namely check_data_orientation. Though going by the comment that too appears to be temporary.

@CodyCBakerPhD (1) do you know how I could create a test case for this if this will likely be long-term-stable behaviour or (2) is there anything else we could add as BEST_PRACTICE_VIOLATION (apparently the second-highest criterion for NWBI errors) so that we can check warning reporting functionality?

codecov · 2022-11-21T11:52:07Z

Codecov Report

Base: 88.22% // Head: 88.23% // Increases project coverage by +0.01% 🎉

Coverage data is based on head (d0ef688) compared to base (b11835a).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1162      +/-   ##
==========================================
+ Coverage   88.22%   88.23%   +0.01%     
==========================================
  Files          73       73              
  Lines        8804     8817      +13     
==========================================
+ Hits         7767     7780      +13     
  Misses       1037     1037

Flag	Coverage Δ
unittests	`88.23% <100.00%> (+0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
dandi/files/bases.py	`76.72% <ø> (ø)`
dandi/cli/tests/test_cmd_validate.py	`100.00% <100.00%> (ø)`
dandi/tests/fixtures.py	`98.25% <100.00%> (+0.06%)`	⬆️
dandi/download.py	`88.04% <0.00%> (-0.34%)`	⬇️
dandi/cli/cmd_validate.py	`73.86% <0.00%> (+2.27%)`	⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

CodyCBakerPhD · 2022-11-21T13:48:12Z

The code at L504-L509 was designed to be extremely minimal on the DANDI side; in particular, prior to the introduction of dandi warning levels (this new severity thing) the Inspector was only capable of returning items that ought to prevent validation of files (CRITICAL and above with the "dandi" config). But the Inspector is capable of running many more (40+) checks for warnings/hints.

In the relevant dandi-side code:

                for error in inspect_nwb(
                    nwbfile_path=self.filepath,
                    skip_validate=True,
                    config=load_config(filepath_or_keyword="dandi"),
                    importance_threshold=Importance.CRITICAL,
                ):

The idea being, previously, that someone could merely run nwbinspector prior to or separate from dandi validate in order to get complete feedback. But now that DANDI is going more in this direction, I guess the goal is to output all output from the Inspector.

So to achieve this, all you need to do is not specify the importance_threshold argument above (default is to include all levels), and you'll get the entire output stream.

For reference, you can find the config maintained here: https://github.com/NeurodataWithoutBorders/nwbinspector/blob/dev/src/nwbinspector/internal_configs/dandi.inspector_config.yaml

CodyCBakerPhD · 2022-11-21T13:51:10Z

Formulate a new test for checking that we properly display non-critical errors (i.e. WARNINGs)

Note that a bare-minimal NWBFile (one with only a sesssion_start_time and a couple other fields; see https://github.com/NeurodataWithoutBorders/nwbinspector/blob/dev/src/nwbinspector/tools.py#L16-L18 for a nice helper for generating...) will trigger many such 'hints' regarding various missing metadata (BEST_PRACTICE_SUGGESTIONS).

BEST_PRACTICE_VIOLATIONS (your 'warnings' I guess) would likely require adding a 'bad' type of data object at some point in the file; typically a TimeSeries in acquisition is used for simplicity. One example could be a mismatch between the zero-axis shape and the length of the timestamps field.

TheChymera · 2022-11-21T15:30:22Z

@CodyCBakerPhD thank you for the description of the mechanics, I think I understand what the main issue is now :) Sadly, as soon as I set importance_threshold to Importance.BEST_PRACTICE_SUGGESTION that also exposes additional errors (of higher severity), the following being the command outputs I get:

dandi/cli/tests/test_cmd_validate.py::test_validate_nwb_path_grouping BEST_PRACTICE_SUGGESTION
ERROR
/tmp/pytest-of-chymera/pytest-9/simple21/simple2.nwb: 2 issues detected.
  [NWBI.check_description] Description is missing.
  [NWBI.During io.read() - <class 'TypeError'>: 'InspectorMessage' object is not iterable] Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/nwbinspector/nwbinspector.py", line 538, in inspect_nwb
    for inspector_message in run_checks(nwbfile=nwbfile, checks=checks):
  File "/usr/lib/python3.10/site-packages/nwbinspector/nwbinspector.py", line 576, in run_checks
    for x in output:
TypeError: 'InspectorMessage' object is not iterable

/tmp/pytest-of-chymera/pytest-9/simple21/simple2.nwb: 2 issues detected.
  [NWBI.check_description] Description is missing.
  [NWBI.During io.read() - <class 'TypeError'>: 'InspectorMessage' object is not iterable] Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/nwbinspector/nwbinspector.py", line 538, in inspect_nwb
    for inspector_message in run_checks(nwbfile=nwbfile, checks=checks):
  File "/usr/lib/python3.10/site-packages/nwbinspector/nwbinspector.py", line 576, in run_checks
    for x in output:
TypeError: 'InspectorMessage' object is not iterable

Do you perhaps have an inkling what this might be about, or should I upload a separate branch with the actual code that produces it?

In any case I can't trigger any BEST_PRACTICE_VIOLATION errors thus far, since there appears to be only the one check that triggers it for DANDI. I'll go ahead and tackle that next, but the fact that higher-severity errors get exposed specifically when a lower-severity threshold is specified is in any case also worth exploring, though admittedly less urgent, from the DANDI end at least.

TheChymera · 2022-11-21T15:46:44Z

@CodyCBakerPhD I have added a50c4fe which should exemplify the status of the code that produces the issue mentioned above. I'll switch it back to BEST_PRACTICE_VIOLATION in the next one. Regarding that, do you perhaps know how I can create an NWB file which has a timeseries and therefore can produce the BEST_PRACTICE_VIOLATION importance level without at the same time missing a subject? The code examples I found for timeseries do not have a subject, and I wasn't able to add a timeseries to an NWB object with subject info:

@pytest.fixture(scope="session")
def simple4_nwb(
    simple1_nwb_metadata: Dict[str, Any], tmp_path_factory: pytest.TempPathFactory
) -> str:
    """
    With, subject, subject_id, species, but including data orientation ambiguity,
    the only currently non-critical issue in the dandi schema for nwbinspector validation:
    https://github.com/NeurodataWithoutBorders/nwbinspector/blob/
        54ac2bc7cdcb92802b9251e29f249f155fb1ff52
        /src/nwbinspector/internal_configs/dandi.inspector_config.yaml#L10
    """
    from datetime import datetime, timezone
    start_time = datetime(2017, 4, 3, 11, tzinfo=timezone.utc)
    create_date = datetime(2017, 4, 15, 12, tzinfo=timezone.utc)
    time_series=pynwb.TimeSeries(
            name="test_time_series",
            unit="test_units",
            data=np.zeros(shape=(2, 100)),
            rate=1.0,
    )
    nwbfile = NWBFile(
        session_description="demonstrate external files",
        identifier="NWBE4",
        session_start_time=start_time,
        file_create_date=create_date,
        age="P1D/",
        sex="O",
        species="Mus musculus",
    )
    nwbfile.add_acquisition(time_series)
    filename = str(tmp_path_factory.mktemp("simple4") / "simple4.nwb")
    with pynwb.NWBHDF5IO(filename, "w") as io:
        io.write(nwbfile, cache_spec=False)
    return filename

gives me:

E   TypeError: NWBFile.__init__: unrecognized argument: 'age', unrecognized argument: 'sex', unrecognized argument: 'species'

CodyCBakerPhD · 2022-11-21T16:03:22Z

Subjects are added as

    nwbfile = NWBFile(
        session_description="demonstrate external files",
        identifier="NWBE4",
        session_start_time=start_time,
        subject=Subject(
            age="P1D/",
            sex="O",
            species="Mus musculus",
       )
    )

I'm looking into those other issues now

also improved test function docstrings

there must be a better way of putting longer links in docstrings...

TheChymera · 2022-11-22T13:01:15Z

@CodyCBakerPhD thanks again for your help :) I think I finally got a test case nailed down. As for the aforementioned threshold issue, that's not strictly related to this PR, for now we're setting importance_threshold=Importance.BEST_PRACTICE_VIOLATION, though at some point in the future it would be cool if we could go even lower and get all the hints through as well. I added a comment to that effect.

dandi/tests/fixtures.py

Renamed failing test, added prospective use case for NWBI warning.

b97689e

TheChymera marked this pull request as draft November 21, 2022 11:48

importance_threshold and debugging print calls

a50c4fe

This was referenced Nov 21, 2022

Add test confirming expected config+threshold behavior NeurodataWithoutBorders/nwbinspector#305

Merged

Add downstream DANDI CI NeurodataWithoutBorders/nwbinspector#306

Merged

TheChymera added 2 commits November 22, 2022 07:56

Created and using new fixture for non-critical NWBI error

ede4a20

also improved test function docstrings

Indentation fix

b6ee44a

there must be a better way of putting longer links in docstrings...

TheChymera marked this pull request as ready for review November 22, 2022 13:43

TheChymera requested review from jwodder and yarikoptic November 22, 2022 13:43

jwodder requested changes Nov 22, 2022

View reviewed changes

dandi/tests/fixtures.py Outdated Show resolved Hide resolved

Imports at top of module

d0ef688

TheChymera requested a review from jwodder November 23, 2022 12:19

jwodder approved these changes Nov 23, 2022

View reviewed changes

TheChymera merged commit 0439caf into master Nov 24, 2022

TheChymera deleted the error_fix branch November 24, 2022 16:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Renamed failing test, added prospective use case for NWBI warning. #1162

Renamed failing test, added prospective use case for NWBI warning. #1162

TheChymera commented Nov 21, 2022 •

edited

Loading

codecov bot commented Nov 21, 2022 •

edited

Loading

CodyCBakerPhD commented Nov 21, 2022

CodyCBakerPhD commented Nov 21, 2022 •

edited

Loading

TheChymera commented Nov 21, 2022 •

edited

Loading

TheChymera commented Nov 21, 2022

CodyCBakerPhD commented Nov 21, 2022

TheChymera commented Nov 22, 2022

Renamed failing test, added prospective use case for NWBI warning. #1162

Renamed failing test, added prospective use case for NWBI warning. #1162

Conversation

TheChymera commented Nov 21, 2022 • edited Loading

codecov bot commented Nov 21, 2022 • edited Loading

Codecov Report

CodyCBakerPhD commented Nov 21, 2022

CodyCBakerPhD commented Nov 21, 2022 • edited Loading

TheChymera commented Nov 21, 2022 • edited Loading

TheChymera commented Nov 21, 2022

CodyCBakerPhD commented Nov 21, 2022

TheChymera commented Nov 22, 2022

TheChymera commented Nov 21, 2022 •

edited

Loading

codecov bot commented Nov 21, 2022 •

edited

Loading

CodyCBakerPhD commented Nov 21, 2022 •

edited

Loading

TheChymera commented Nov 21, 2022 •

edited

Loading