Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleanup and bugfix in viz dl2 #279

Merged
merged 11 commits into from
May 8, 2020
Merged

Conversation

vuillaut
Copy link
Member

@vuillaut vuillaut commented Feb 4, 2020

Fixes issue #278

I took this as an opportunity to do some cleanup in viz functions .

Copy link
Contributor

@rlopezcoto rlopezcoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vuillaut, thanks for the fixes.
Again, nothing against ctaplot, but for me it is an overkill using external libraries to plot either roc curves, energy/angular resolutions or sensitivities. Since this is a personal package, I would like to hear the opinion from @kosack whether it is recommended or not to use it (why is it not used anywhere in ctapipe?)

@kosack
Copy link

kosack commented Feb 4, 2020

Again, nothing against ctaplot, but for me it is an overkill using external libraries to plot either roc curves, energy/angular resolutions or sensitivities. Since this is a personal package, I would like to hear the opinion from @kosack whether it is recommended or not to use it (why is it not used anywhere in ctapipe?)

I disagree there - the way in which sensitivity curves, IRFs, etc are computed and plotted, is quite complex and can be done in many ways with various assumptions that hide what the user is doing and make it impossible to do a true comparison. We've had many problems with that in the past - not everyone agrees even what "PSF" is (90% containment? RMS? etc).

However, we've been discussing with @vuillaut and others that sensitivity (and other IRFs) must be in a common package, and will start to move toward that (and perhaps ctaplot is not the right place).

The idea of ctaplot is to have a nice toolbox for making benchmark plots for code generated by ctapipe-based or other pipelines (that use a common data format and model), in a common way, with common assumptions, so that we can generate a set of notebooks with all the required benchmarks, without having a ton of boilerplate code pasted in each one.

@vuillaut
Copy link
Member Author

vuillaut commented Feb 4, 2020

Hi @vuillaut, thanks for the fixes.
Again, nothing against ctaplot, but for me it is an overkill using external libraries to plot either roc curves, energy/angular resolutions or sensitivities. Since this is a personal package, I would like to hear the opinion from @kosack whether it is recommended or not to use it (why is it not used anywhere in ctapipe?)

I am actually in the process of moving ctaplot under cta-observatory from @kosack request.
The whole purpose of ctaplot is actually to not reimplant such simple but specific plots in each reconstruction pipeline of parallel projects. It would also ensure that we all use the same definitions of metrics everywhere (clearly not obvious) and also reduce a lot code maintenance.
ctaplot is already used in lstchain (and I was not the one to introduce it) but if you don't see the use of it, I won't be pushing for it (but won't reimplant these plots either in lstchain and make double maintenance when there are bugs in things already existing as the example above)
Concerning the use in ctapipe, I think the answer is simple, ctapipe does not provide a pipeline.

@kosack
Copy link

kosack commented Feb 4, 2020

We will also use ctaplot in the Protopipe benchmarks

@rlopezcoto
Copy link
Contributor

@kosack:

sensitivity (and other IRFs) must be in a common package, and will start to move toward that...

The idea of ctaplot is to have a nice toolbox for making benchmark plots

I think these are two different things: shouldn't the "toolbox for making benchmark plots" be the purpose of cta-benchmarks? ctaplot is in principlye a library to plot energy/angular resolution and sensitivity with the outputs (the points) that have been given to you.

@vuillaut:

ctaplot is already used in lstchain (and I was not the one to introduce it)

and I had already discussed about it at least in 3/4 PRs. We are calculating everything in lstchain and a simple plot should take 3 lines. My proposal has always been that we write the output (the points) that can be handled by ctaplot or other plotting library.

ctapipe does not provide a pipeline.

the definition of ctapipe is "CTA Low-level Data Processing Pipeline Framework Prototype"

@vuillaut
Copy link
Member Author

vuillaut commented Feb 5, 2020

@vuillaut:

ctaplot is already used in lstchain (and I was not the one to introduce it)

and I had already discussed about it at least in 3/4 PRs. We are calculating everything in lstchain and a simple plot should take 3 lines. My proposal has always been that we write the output (the points) that can be handled by ctaplot or other plotting library.

I thought the problem was adding it as a dependency but was fine to have it for control plots, which is exactly the case here.
But no problem, I'll close the PR.

ctapipe does not provide a pipeline.

the definition of ctapipe is "CTA Low-level Data Processing Pipeline Framework Prototype"

And yet... 🙄

@vuillaut
Copy link
Member Author

vuillaut commented Feb 5, 2020

I think these are two different things: shouldn't the "toolbox for making benchmark plots" be the purpose of cta-benchmarks? ctaplot is in principlye a library to plot energy/angular resolution and sensitivity with the outputs (the points) that have been given to you.

For the record, cta-benchmarks is aiming at providing a framework to automatically produce benchmarks, version them and control the evolution of pipelines. It does not aim at being used by users, nor at providing a way to compute IRFs, but will use an external library able to do so.

@kosack
Copy link

kosack commented Feb 5, 2020

the definition of ctapipe is "CTA Low-level Data Processing Pipeline Framework Prototype"

I think you're ignoring the word "framework" - its a library for making pipelines, not a pipeline

@rlopezcoto
Copy link
Contributor

Ok, sorry it took so long to get back to this. If we want to use ctaplot to perform these comparisons, I would suggest that that it is clearly explained what are the inputs of every of the functions that are being called on lstchain. Additionally, I would suggest that on output should be in any case written with the final results (points and errors), so they can be handled with other plotting options if desired, does that sound ok?

@rlopezcoto
Copy link
Contributor

any news on this @vuillaut? do you agree with the latest proposal?

@vuillaut
Copy link
Member Author

vuillaut commented May 3, 2020

Sorry, I completely over-regarded this issue and then forgot.

If we want to use ctaplot to perform these comparisons, I would suggest that that it is clearly explained what are the inputs of every of the functions that are being called on lstchain.

I am not sure to understand what you mean here, apart from improving the docstring?

Additionally, I would suggest that on output should be in any case written with the final results (points and errors), so they can be handled with other plotting options if desired, does that sound ok?

The computation and the plotting can be separated for some of the metrics (actually ctaplot is built that way and the functions exist).
But to write them, what you are asking is basically to build the DL3 files. As there is an on-going task to do exactly that outside of lstchain, I think that would effectively double the work and/or define a different output dataformat, which would be counter-effective.

@rlopezcoto
Copy link
Contributor

Connected to #391, ctaplot was already moved into the cta-observatory organization:
https://github.com/cta-observatory/ctaplot

I am not sure to understand what you mean here, apart from improving the docstring?

I think there have been some documentation updates since this PR was opened, I'll have a look at that.

But to write them, what you are asking is basically to build the DL3 files. As there is an on-going task to do exactly that outside of lstchain, I think that would effectively double the work and/or define a different output dataformat, which would be counter-effective.

what I was proposing here is that this is actually written by ctaplot. If I do not understand wrong, DL3 files should be photon lists, what I'm saying is that the plot that is outputted by ctaplot comes together with a file (.txt?) with the points that are plotted (angular/energy resolution) or the histograms.

Final question, are we compatible with the latest ctaplot release (v0.5.0)? We should probably follow some released version to avoid breaking of the code in the future.

@maxnoe
Copy link
Member

maxnoe commented May 4, 2020

what I was proposing here is that this is actually written by ctaplot. If I do not understand wrong, DL3 files should be photon lists, what I'm saying is that the plot that is outputted by ctaplot comes together with a file (.txt?) with the points that are plotted (angular/energy resolution) or the histograms

No. These file will be the input to ctaplot. Not its output.

DL3 is photon lists AND the IRFs already included. See https://gamma-astro-data-formats.readthedocs.io/en/latest/

Reconstructed event lists used as input for the IRF calculation would be considered DL2.

Work as also started on a common package for calculation of IRFs: https://github.com/cta-observatory/pyrf

ctaplot should read input and plot, not make irf calculations or complicated IO.
If you are talking about serialization of plots to something other than image formats, this would probably be nice. But the important information is stored in the DL3 files, not in something ctaplot produces.

@vuillaut
Copy link
Member Author

vuillaut commented May 4, 2020

what I was proposing here is that this is actually written by ctaplot. If I do not understand wrong, DL3 files should be photon lists, what I'm saying is that the plot that is outputted by ctaplot comes together with a file (.txt?) with the points that are plotted (angular/energy resolution) or the histograms

No. These file will be the input to ctaplot. Not its output.

DL3 is photon lists AND the IRFs already included. See https://gamma-astro-data-formats.readthedocs.io/en/latest/

Reconstructed event lists used as input for the IRF calculation would be considered DL2.

Work as also started on a common package for calculation of IRFs: https://github.com/cta-observatory/pyrf

ctaplot should read input and plot, not make irf calculations or complicated IO.
If you are talking about serialization of plots to something other than image formats, this would probably be nice. But the important information is stored in the DL3 files, not in something ctaplot produces.

Sorry catching up with all the weekend exchanges.
That is probably not the right place for this conversation but let's clarify some things.
ctaplot provides metrics, benchmarks and plot for DL0, DL1 and DL2 data.
I think this is what we want here.

The definitive resolution should come at level 3 (DL3), once all cuts have been optimised, with the complete IRFs. pyrf is an attempt to do that (very preliminary) and the output should be DL3 FITS IRFs as defined in the gamma astro data format.

I'm saying is that the plot that is outputted by ctaplot comes together with a file (.txt?) with the points that are plotted (angular/energy resolution) or the histograms.

Yes, no problem to have the values and plots (maybe not for all).

Final question, are we compatible with the latest ctaplot release (v0.5.0)? We should probably follow some released version to avoid breaking of the code in the future.

That I think was the subject of this PR.
If used, stating it as a dependency with a version would definitely help.

Let me know, I can either update this PR or close it.

@rlopezcoto
Copy link
Contributor

btw, we agreed in #391 to add ctaplot as a dependency, but without any fixed version because it is likely not working with the latest one. I would prefer not to keep it like this in the setup.py because ctaplot is quickly evolving and the compatibility is likely to break.
Ideally, here or in a new PR, the code should be adapted to the latest release and freeze that version.

Copy link
Contributor

@rlopezcoto rlopezcoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of comments, also travis is complaining about lstchain_mc_rfperformance.py, otherwise it looks good to go.

@codecov
Copy link

codecov bot commented May 8, 2020

Codecov Report

Merging #279 into master will increase coverage by 2.12%.
The diff coverage is 73.49%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #279      +/-   ##
==========================================
+ Coverage   43.95%   46.07%   +2.12%     
==========================================
  Files          71       71              
  Lines        5192     5200       +8     
==========================================
+ Hits         2282     2396     +114     
+ Misses       2910     2804     -106     
Impacted Files Coverage Δ
lstchain/scripts/lstchain_mc_rfperformance.py 0.00% <0.00%> (ø)
lstchain/visualization/plot_dl2.py 55.07% <76.02%> (+36.93%) ⬆️
lstchain/scripts/tests/test_lstchain_scripts.py 98.43% <100.00%> (+0.02%) ⬆️
lstchain/tests/test_lstchain.py 100.00% <100.00%> (ø)
lstchain/visualization/tests/test_plot_dl2.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9c8662a...7c8793f. Read the comment docs.

@vuillaut vuillaut merged commit a106fae into cta-observatory:master May 8, 2020
@vuillaut vuillaut deleted the plot_dl2 branch May 8, 2020 13:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants