cleanup and bugfix in viz dl2 #279

vuillaut · 2020-02-04T13:35:38Z

Fixes issue #278

I took this as an opportunity to do some cleanup in viz functions .

rlopezcoto

Hi @vuillaut, thanks for the fixes.
Again, nothing against ctaplot, but for me it is an overkill using external libraries to plot either roc curves, energy/angular resolutions or sensitivities. Since this is a personal package, I would like to hear the opinion from @kosack whether it is recommended or not to use it (why is it not used anywhere in ctapipe?)

kosack · 2020-02-04T13:54:17Z

Again, nothing against ctaplot, but for me it is an overkill using external libraries to plot either roc curves, energy/angular resolutions or sensitivities. Since this is a personal package, I would like to hear the opinion from @kosack whether it is recommended or not to use it (why is it not used anywhere in ctapipe?)

I disagree there - the way in which sensitivity curves, IRFs, etc are computed and plotted, is quite complex and can be done in many ways with various assumptions that hide what the user is doing and make it impossible to do a true comparison. We've had many problems with that in the past - not everyone agrees even what "PSF" is (90% containment? RMS? etc).

However, we've been discussing with @vuillaut and others that sensitivity (and other IRFs) must be in a common package, and will start to move toward that (and perhaps ctaplot is not the right place).

The idea of ctaplot is to have a nice toolbox for making benchmark plots for code generated by ctapipe-based or other pipelines (that use a common data format and model), in a common way, with common assumptions, so that we can generate a set of notebooks with all the required benchmarks, without having a ton of boilerplate code pasted in each one.

vuillaut · 2020-02-04T13:54:33Z

Hi @vuillaut, thanks for the fixes.
Again, nothing against ctaplot, but for me it is an overkill using external libraries to plot either roc curves, energy/angular resolutions or sensitivities. Since this is a personal package, I would like to hear the opinion from @kosack whether it is recommended or not to use it (why is it not used anywhere in ctapipe?)

I am actually in the process of moving ctaplot under cta-observatory from @kosack request.
The whole purpose of ctaplot is actually to not reimplant such simple but specific plots in each reconstruction pipeline of parallel projects. It would also ensure that we all use the same definitions of metrics everywhere (clearly not obvious) and also reduce a lot code maintenance.
ctaplot is already used in lstchain (and I was not the one to introduce it) but if you don't see the use of it, I won't be pushing for it (but won't reimplant these plots either in lstchain and make double maintenance when there are bugs in things already existing as the example above)
Concerning the use in ctapipe, I think the answer is simple, ctapipe does not provide a pipeline.

kosack · 2020-02-04T13:55:23Z

We will also use ctaplot in the Protopipe benchmarks

rlopezcoto · 2020-02-05T06:47:51Z

@kosack:

sensitivity (and other IRFs) must be in a common package, and will start to move toward that...

The idea of ctaplot is to have a nice toolbox for making benchmark plots

I think these are two different things: shouldn't the "toolbox for making benchmark plots" be the purpose of cta-benchmarks? ctaplot is in principlye a library to plot energy/angular resolution and sensitivity with the outputs (the points) that have been given to you.

@vuillaut:

ctaplot is already used in lstchain (and I was not the one to introduce it)

and I had already discussed about it at least in 3/4 PRs. We are calculating everything in lstchain and a simple plot should take 3 lines. My proposal has always been that we write the output (the points) that can be handled by ctaplot or other plotting library.

ctapipe does not provide a pipeline.

the definition of ctapipe is "CTA Low-level Data Processing Pipeline Framework Prototype"

vuillaut · 2020-02-05T08:53:11Z

@vuillaut:

ctaplot is already used in lstchain (and I was not the one to introduce it)

and I had already discussed about it at least in 3/4 PRs. We are calculating everything in lstchain and a simple plot should take 3 lines. My proposal has always been that we write the output (the points) that can be handled by ctaplot or other plotting library.

I thought the problem was adding it as a dependency but was fine to have it for control plots, which is exactly the case here.
But no problem, I'll close the PR.

ctapipe does not provide a pipeline.

the definition of ctapipe is "CTA Low-level Data Processing Pipeline Framework Prototype"

And yet... 🙄

vuillaut · 2020-02-05T08:58:16Z

I think these are two different things: shouldn't the "toolbox for making benchmark plots" be the purpose of cta-benchmarks? ctaplot is in principlye a library to plot energy/angular resolution and sensitivity with the outputs (the points) that have been given to you.

For the record, cta-benchmarks is aiming at providing a framework to automatically produce benchmarks, version them and control the evolution of pipelines. It does not aim at being used by users, nor at providing a way to compute IRFs, but will use an external library able to do so.

kosack · 2020-02-05T09:26:01Z

the definition of ctapipe is "CTA Low-level Data Processing Pipeline Framework Prototype"

I think you're ignoring the word "framework" - its a library for making pipelines, not a pipeline

rlopezcoto · 2020-02-27T11:39:50Z

Ok, sorry it took so long to get back to this. If we want to use ctaplot to perform these comparisons, I would suggest that that it is clearly explained what are the inputs of every of the functions that are being called on lstchain. Additionally, I would suggest that on output should be in any case written with the final results (points and errors), so they can be handled with other plotting options if desired, does that sound ok?

rlopezcoto · 2020-04-14T06:09:29Z

any news on this @vuillaut? do you agree with the latest proposal?

vuillaut · 2020-05-03T22:43:59Z

Sorry, I completely over-regarded this issue and then forgot.

If we want to use ctaplot to perform these comparisons, I would suggest that that it is clearly explained what are the inputs of every of the functions that are being called on lstchain.

I am not sure to understand what you mean here, apart from improving the docstring?

Additionally, I would suggest that on output should be in any case written with the final results (points and errors), so they can be handled with other plotting options if desired, does that sound ok?

The computation and the plotting can be separated for some of the metrics (actually ctaplot is built that way and the functions exist).
But to write them, what you are asking is basically to build the DL3 files. As there is an on-going task to do exactly that outside of lstchain, I think that would effectively double the work and/or define a different output dataformat, which would be counter-effective.

rlopezcoto · 2020-05-04T08:02:48Z

Connected to #391, ctaplot was already moved into the cta-observatory organization:
https://github.com/cta-observatory/ctaplot

I am not sure to understand what you mean here, apart from improving the docstring?

I think there have been some documentation updates since this PR was opened, I'll have a look at that.

But to write them, what you are asking is basically to build the DL3 files. As there is an on-going task to do exactly that outside of lstchain, I think that would effectively double the work and/or define a different output dataformat, which would be counter-effective.

what I was proposing here is that this is actually written by ctaplot. If I do not understand wrong, DL3 files should be photon lists, what I'm saying is that the plot that is outputted by ctaplot comes together with a file (.txt?) with the points that are plotted (angular/energy resolution) or the histograms.

Final question, are we compatible with the latest ctaplot release (v0.5.0)? We should probably follow some released version to avoid breaking of the code in the future.

maxnoe · 2020-05-04T08:40:57Z

what I was proposing here is that this is actually written by ctaplot. If I do not understand wrong, DL3 files should be photon lists, what I'm saying is that the plot that is outputted by ctaplot comes together with a file (.txt?) with the points that are plotted (angular/energy resolution) or the histograms

No. These file will be the input to ctaplot. Not its output.

DL3 is photon lists AND the IRFs already included. See https://gamma-astro-data-formats.readthedocs.io/en/latest/

Reconstructed event lists used as input for the IRF calculation would be considered DL2.

Work as also started on a common package for calculation of IRFs: https://github.com/cta-observatory/pyrf

ctaplot should read input and plot, not make irf calculations or complicated IO.
If you are talking about serialization of plots to something other than image formats, this would probably be nice. But the important information is stored in the DL3 files, not in something ctaplot produces.

vuillaut · 2020-05-04T11:43:46Z

what I was proposing here is that this is actually written by ctaplot. If I do not understand wrong, DL3 files should be photon lists, what I'm saying is that the plot that is outputted by ctaplot comes together with a file (.txt?) with the points that are plotted (angular/energy resolution) or the histograms

No. These file will be the input to ctaplot. Not its output.

DL3 is photon lists AND the IRFs already included. See https://gamma-astro-data-formats.readthedocs.io/en/latest/

Reconstructed event lists used as input for the IRF calculation would be considered DL2.

Work as also started on a common package for calculation of IRFs: https://github.com/cta-observatory/pyrf

ctaplot should read input and plot, not make irf calculations or complicated IO.
If you are talking about serialization of plots to something other than image formats, this would probably be nice. But the important information is stored in the DL3 files, not in something ctaplot produces.

Sorry catching up with all the weekend exchanges.
That is probably not the right place for this conversation but let's clarify some things.
ctaplot provides metrics, benchmarks and plot for DL0, DL1 and DL2 data.
I think this is what we want here.

The definitive resolution should come at level 3 (DL3), once all cuts have been optimised, with the complete IRFs. pyrf is an attempt to do that (very preliminary) and the output should be DL3 FITS IRFs as defined in the gamma astro data format.

I'm saying is that the plot that is outputted by ctaplot comes together with a file (.txt?) with the points that are plotted (angular/energy resolution) or the histograms.

Yes, no problem to have the values and plots (maybe not for all).

Final question, are we compatible with the latest ctaplot release (v0.5.0)? We should probably follow some released version to avoid breaking of the code in the future.

That I think was the subject of this PR.
If used, stating it as a dependency with a version would definitely help.

Let me know, I can either update this PR or close it.

rlopezcoto · 2020-05-05T15:16:50Z

btw, we agreed in #391 to add ctaplot as a dependency, but without any fixed version because it is likely not working with the latest one. I would prefer not to keep it like this in the setup.py because ctaplot is quickly evolving and the compatibility is likely to break.
Ideally, here or in a new PR, the code should be adapted to the latest release and freeze that version.

rlopezcoto

Just a couple of comments, also travis is complaining about lstchain_mc_rfperformance.py, otherwise it looks good to go.

lstchain/scripts/lstchain_mc_rfperformance.py

codecov · 2020-05-08T10:59:41Z

Codecov Report

Merging #279 into master will increase coverage by 2.12%.
The diff coverage is 73.49%.

@@            Coverage Diff             @@
##           master     #279      +/-   ##
==========================================
+ Coverage   43.95%   46.07%   +2.12%     
==========================================
  Files          71       71              
  Lines        5192     5200       +8     
==========================================
+ Hits         2282     2396     +114     
+ Misses       2910     2804     -106

Impacted Files	Coverage Δ
lstchain/scripts/lstchain_mc_rfperformance.py	`0.00% <0.00%> (ø)`
lstchain/visualization/plot_dl2.py	`55.07% <76.02%> (+36.93%)`	⬆️
lstchain/scripts/tests/test_lstchain_scripts.py	`98.43% <100.00%> (+0.02%)`	⬆️
lstchain/tests/test_lstchain.py	`100.00% <100.00%> (ø)`
lstchain/visualization/tests/test_plot_dl2.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9c8662a...7c8793f. Read the comment docs.

vuillaut added 2 commits February 4, 2020 14:31

cleanup and bugfix in viz dl2

a21c2cb

add docstring for energy results

4d2cbc0

vuillaut requested a review from misabelber February 4, 2020 13:35

rlopezcoto reviewed Feb 4, 2020

View reviewed changes

rlopezcoto mentioned this pull request May 1, 2020

Fixes for Syntax Errors and other broken stuff #391

Merged

vuillaut added 5 commits May 6, 2020 15:53

ours

fdc986e

fix ctaplot version

20b9d8d

fix and improve some dl2 plots

95e5363

simple unit test for dl2 plots

c9c3a4e

change rfperf script accordingly

db25eca

rlopezcoto reviewed May 8, 2020

View reviewed changes

lstchain/scripts/lstchain_mc_rfperformance.py Show resolved Hide resolved

lstchain/scripts/lstchain_mc_rfperformance.py Show resolved Hide resolved

vuillaut added 4 commits May 8, 2020 12:23

produce fake dl1 proton file

1e02519

test with fake dl1 proton file

f32c5d5

fix typo bug

131d58f

fix test

7c8793f

rlopezcoto approved these changes May 8, 2020

View reviewed changes

vuillaut merged commit a106fae into cta-observatory:master May 8, 2020

vuillaut deleted the plot_dl2 branch May 8, 2020 13:48

rlopezcoto mentioned this pull request Aug 28, 2020

Wrong argument gammaness for plot ROC Curve on lstchain.visualization.plot_dl2 #278

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cleanup and bugfix in viz dl2 #279

cleanup and bugfix in viz dl2 #279

vuillaut commented Feb 4, 2020

rlopezcoto left a comment

kosack commented Feb 4, 2020

vuillaut commented Feb 4, 2020

kosack commented Feb 4, 2020

rlopezcoto commented Feb 5, 2020

vuillaut commented Feb 5, 2020

vuillaut commented Feb 5, 2020

kosack commented Feb 5, 2020

rlopezcoto commented Feb 27, 2020

rlopezcoto commented Apr 14, 2020

vuillaut commented May 3, 2020

rlopezcoto commented May 4, 2020

maxnoe commented May 4, 2020 •

edited

Loading

vuillaut commented May 4, 2020

rlopezcoto commented May 5, 2020

rlopezcoto left a comment

codecov bot commented May 8, 2020 •

edited

Loading

cleanup and bugfix in viz dl2 #279

cleanup and bugfix in viz dl2 #279

Conversation

vuillaut commented Feb 4, 2020

rlopezcoto left a comment

Choose a reason for hiding this comment

kosack commented Feb 4, 2020

vuillaut commented Feb 4, 2020

kosack commented Feb 4, 2020

rlopezcoto commented Feb 5, 2020

vuillaut commented Feb 5, 2020

vuillaut commented Feb 5, 2020

kosack commented Feb 5, 2020

rlopezcoto commented Feb 27, 2020

rlopezcoto commented Apr 14, 2020

vuillaut commented May 3, 2020

rlopezcoto commented May 4, 2020

maxnoe commented May 4, 2020 • edited Loading

vuillaut commented May 4, 2020

rlopezcoto commented May 5, 2020

rlopezcoto left a comment

Choose a reason for hiding this comment

codecov bot commented May 8, 2020 • edited Loading

Codecov Report

maxnoe commented May 4, 2020 •

edited

Loading

codecov bot commented May 8, 2020 •

edited

Loading