Tidies up Images.plot_and_save() #810

ns-rse · 2024-03-04T17:56:50Z

Closes #806

Whilst #806 was meant to address why no Matplotlib image was generated and tested/compared to a target image when plotting without axes or colorbar I discovered this was because the image is saved with Images.save_array_figure().

This in turn uses matplotlib.pyplot.imsave() to save the image and the test was using skimage.io.imread() to read the image back as an array and test the np.sum() and img.shape. The other method available is Images.save_fig() which in turn calls Matplotlibs matplot.pylig.savefig().

We no longer need to save arrays nor images as readable arrays (see #804 / #802) as the arrays are saved in the HDF5 .topostats files and so the need to save the image in this manner seemed redundant. The Images.save_figure() already has logic which excludes the axes and scale bars (see lines 321 of current changeset) although these have been further tweaked to use tight_layout.

Further the logic for deciding what to save within Image.plot_and_save() seemed overly complicated and if Images.save_array_figure() were being called Images.plot_and_save() did not return a fig (Matplotlib Image) that could be saved and tested using pytest-mpl extension, another reason for switching and in turn removal.

To this end I have...

Removed Images.save_array_figure() if people need the arrays they can use the HDF5 files).
Tweaked the plotting options under Images.savefig() starting at line 321 to use a tight
layout which ensures there is no border (see note below).
Simplified the logic in Images.plot_and_save() controlling whether images are saved so that ValueError are raised if Images.savefig == False or if Images.image_set in ["all", "core"] or self.core_set: evaluates to False. Ensures images are explicitly closed to reduce memory usage.
Added tests for the raising of ValueError exceptions in Images.plot_and_save()
Tidied up some test names to be consistent (resulted in image name change).
Tidied up passing of plotting_config into these tests.
Added a pytest-mpl test for test_plot_and_save_no_axes_no_colorbar(), adding the target image as required by the issue that started this.
Added an explanation of what image_set and core_set are to the Images() class definition as this confused me having not touched plotting in a number of months, hopefully useful for others who look at this code.

Note on Borders

The image generated by pytest-mpl against which comparisons are made does have a border because it is saving the returned fig object itself. The actual generated image doesn't have a border (manual checks have been made).

NB It also appears that an old image was included in the repository for this test using the full minicircle image rather than the cropped area that @SylviaWhittle slimmed the test suite down to in order to speed up tests.

An unexpected consequence of this change was that some target directories were not being created. I couldn't see why as I hadn't touched the code that did this. In fixing it I found that some other tests broke where images that were meant to be in the all image_set were apparently being created when the image_set == "core". See specific commit message for more details.

Closes #806 Whislt #806 was meant to address why no Matplotlib image was generated and tested/compared to a target image when plotting without axes or colorbar I discovered this was because the image is saved with `Images.save_array_figure()`. This in turn uses [`matplotlib.pyplot.imsave()`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.imsave.html) to save the image and the test was using [`skimage.io.imread()`](https://scikit-image.org/docs/stable/api/skimage.io.html#skimage.io.imread) to read the image back as an array and test the `np.sum()` and `img.shape`. We no longer need to save arrays nor images as readable arrays (see #804 / #802) and so the need to save the image in this manner seemed redudant. The `Images.save_figure()` already has logic which excludes the axes and scale bars (see lines 321 of current changeset). Further the logic for deciding what to save within `Image.plot_and_save()` seemed overly complicated and if `Images.save_array_figure()` were being called `Images.plot_and_save()` did not return a `fig` (Matplotlib Image) that could be saved and tested using `pytest-mpl` extension. To this end I have... 1. Removed `Images.save_array_figure()`. 2. Tweaked the plotting options under `Images.savefig()` starting at line 321 to use a [tight layout](https://matplotlib.org/2.0.2/users/tight_layout_guide.html) which ensures there is no border (see note below). 3. Simplified the logic in `Images.plot_and_save()` controlling whether images are saved so that `ValueError` are raised if `Images.savefig == False` or `if Images.image_set in ["all", "core"] or self.core_set:` evaluates to `False`. Ensures images are explicitly closed to reduce memory usage. 4. Added tests for the raising of `ValueError` exceptions in `Images.plot_and_save()` 5. Tidied up some test names to be consistent (resulted in image name change). 6. Tidied up passing of `plotting_config` into these tests. 7. Added a `pytest-mpl` test for `test_plot_and_save_no_axes_no_colorbar()`, adding the target image as required by #806 Note on Borders The image generated by `pytest-mpl` against which comparisons are made _does_ have a border because it is saving the returned `fig` object itself. The actual generated image doesn't have a border (manual checks have been made).

In #806 and refactoring (see 07db718) I found an unintended side-effect was that some of the target directories (specifically `filter_out_path` and `grain_out_path`) were no longer present and the tests (`tests/test_processing.py`) failed. These are meant to be explicitly created by `processing.get_out_paths()` but for some reason were now no longer. For filters the solution seemed counter-intuitive as the images are saved to `filter_out_path` if the image is part of the `core_set`, otherwise they are saved to `core_out_path`. For grains the `direction` is appended to the `grain_out_path` once (rather than three times) and the `.mkdir(parents = True, exists_ok=True)` pathlib method is used to ensure the directory exists prior to attempting to plot.

In #806 and refactoring (see 07db718) I found an unintended side-effect was that some of the target directories (specifically `filter_out_path` and `grain_out_path`) were no longer present and the tests (`tests/test_processing.py`) failed. These are meant to be explicitly created by `processing.get_out_paths()` but for some reason were now no longer. For filters the solution was to ensure the `filter_out_path` is created _and_ use either the `core_out_path` or `filter_out_path` depending on which image is being saved. For grains the `direction` is appended to the `grain_out_path` once (rather than three times) and the `.mkdir(parents = True, exists_ok=True)` pathlib method is used to ensure the directory exists prior to attempting to plot. In simplifying the logic I found that some of the processing tests failed, for some reason images that were meant to be part of the `all` set were being plotted by mistake and the tests were looking for these images in the `caplog`. These have been updated.

codecov · 2024-03-04T18:00:38Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.78%. Comparing base (7901d14) to head (f2d0b78).
Report is 3 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #810      +/-   ##
==========================================
+ Coverage   84.72%   84.78%   +0.05%     
==========================================
  Files          21       21              
  Lines        3195     3200       +5     
==========================================
+ Hits         2707     2713       +6     
+ Misses        488      487       -1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

MaxGamill-Sheffield · 2024-03-06T11:04:12Z

Hey @ns-rse before I review this I just want to check with the experimentalists that this is the desired intent of the tool and aligns with their goals as users. I'll mention in the group meeting tomorrow.

It is my belief that people will want to use the images generated by TS that appear in the directories, and having an output that saved the image pixel-for-pixel (without interpolation and dpi limitations) is useful for publications / presentations (and has helped me save on storage), all without having to run more code to save individual arrays from the .topostats files which I don't think is available yet without prior coding knowledge.

Regardless, the test logic and fixes using the cropped region, and the updated info in the doc strings are helpful too so I'll get back to you on this :)

MaxGamill-Sheffield · 2024-03-06T11:09:29Z

topostats/processing.py

+            if plotting_config["image_set"] == "all":
+                filter_out_path.mkdir(parents=True, exist_ok=True)
+                LOGGER.debug(f"[{filename}] : Target filter directory created : {filter_out_path}")
            # Generate plots
            for plot_name, array in filters.images.items():
                if plot_name not in ["scan_raw"]:
                    if plot_name == "extracted_channel":
                        array = np.flipud(array.pixels)
-                    plotting_config["plot_dict"][plot_name]["output_dir"] = filter_out_path
+                    plotting_config["plot_dict"][plot_name]["output_dir"] = (
+                        core_out_path if plotting_config["plot_dict"][plot_name]["core_set"] else filter_out_path
+                    )
                    try:
                        Images(array, **plotting_config["plot_dict"][plot_name]).plot_and_save()
                        Images(array, **plotting_config["plot_dict"][plot_name]).plot_histogram_and_save()


The only other thing is that you've got an if image_set=='all' statement which is then used again within the plot_and_save() function, and I'm thinking would it make sense just to use it the once in processing?

We could pop it and re-add it like the plotting.run param and keep all the non-core images within this if statement, and the core image outside like it is? What do you think?

The logic here though is to create the sub-directory filter as held in filter_out_path (the same is true for grains). We then need to set the parameter in the current images plot_dict to whatever that is, hence the if/else.

When I simplified the logic within plot_and_save() for some reason I couldn't understand I got error messages about target directories not being present so I figured I needed to create them.

Also having plot_and_save() know about image_set might be useful should anyone use the class/method interactively e.g. in a Notebook.

Ok makes sense, thanks for explaining :)

MaxGamill-Sheffield · 2024-03-12T12:53:50Z

Experimentalists have confirmed that they only use the mpl image and not the pure array saved by the save_array function. Therefore this PR can go ahead in its entirety and depreciate the save_array function for consistency

ns-rse added 3 commits March 4, 2024 15:46

ns-rse requested review from SylviaWhittle, MaxGamill-Sheffield and llwiggins March 4, 2024 21:40

Filter and grain image directory creation conditional on image_set

f2d0b78

MaxGamill-Sheffield requested changes Mar 6, 2024

View reviewed changes

MaxGamill-Sheffield approved these changes Mar 12, 2024

View reviewed changes

ns-rse added this pull request to the merge queue Mar 12, 2024

Merged via the queue into main with commit 75e6ec4 Mar 12, 2024
16 checks passed

ns-rse deleted the ns-rse/806-mpl-test-image-no-axes-no-colorbar branch March 12, 2024 15:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tidies up Images.plot_and_save() #810

Tidies up Images.plot_and_save() #810

ns-rse commented Mar 4, 2024 •

edited

Loading

codecov bot commented Mar 4, 2024 •

edited

Loading

MaxGamill-Sheffield commented Mar 6, 2024

MaxGamill-Sheffield Mar 6, 2024

ns-rse Mar 6, 2024 •

edited

Loading

ns-rse Mar 6, 2024

MaxGamill-Sheffield Mar 12, 2024

MaxGamill-Sheffield commented Mar 12, 2024

Tidies up Images.plot_and_save() #810

Tidies up Images.plot_and_save() #810

Conversation

ns-rse commented Mar 4, 2024 • edited Loading

codecov bot commented Mar 4, 2024 • edited Loading

Codecov Report

MaxGamill-Sheffield commented Mar 6, 2024

MaxGamill-Sheffield Mar 6, 2024

Choose a reason for hiding this comment

ns-rse Mar 6, 2024 • edited Loading

Choose a reason for hiding this comment

ns-rse Mar 6, 2024

Choose a reason for hiding this comment

MaxGamill-Sheffield Mar 12, 2024

Choose a reason for hiding this comment

MaxGamill-Sheffield commented Mar 12, 2024

ns-rse commented Mar 4, 2024 •

edited

Loading

codecov bot commented Mar 4, 2024 •

edited

Loading

ns-rse Mar 6, 2024 •

edited

Loading