Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add algorithm to compute image parameters via fitting the light distribution #2275

Open
wants to merge 35 commits into
base: main
Choose a base branch
from

Conversation

clara-escanuela
Copy link
Contributor

No description provided.

@clara-escanuela clara-escanuela linked an issue Mar 1, 2023 that may be closed by this pull request
@clara-escanuela
Copy link
Contributor Author

This is my first step towards this new image-fitting method. Although I would like to get suggestions about:

  1. How to make ellipsoid.py more general and include an option for PDFs
  2. I want to now integrate this into HillasReconstructor. How would you proceed?

@clara-escanuela
Copy link
Contributor Author

I also think that the parameter boundaries should be calculated in ellipsoid.py and not being optional. So maybe I could include another function in ellipsoid.py that finds the limits of each parameter

@maxnoe
Copy link
Member

maxnoe commented Mar 2, 2023

Let's do one thing after another. Adding new image parameters is already a large enough pull request, let's not mix it with changing the HillasReconstructor

@@ -243,6 +243,86 @@ class HillasParametersContainer(BaseHillasParametersContainer):
width_uncertainty = Field(nan * u.deg, "uncertainty of width", unit=u.deg)
psi = Field(nan * u.deg, "rotation angle of ellipse", unit=u.deg)

class BaseImageFitParametersContainer(Container):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are planning to drop the camera frame stuff altogether, for new stuff I wouldn't at it. So just having the TelescopeFrame parameter container is fine.

@@ -369,3 +371,72 @@ def pdf(self, x, y):
r = np.sqrt((x - self.x) ** 2 + (y - self.y) ** 2)

return norm(self.radius.to_value(u.m), self.sigma.to_value(u.m)).pdf(r)


class SkewedCauchy(ImageModel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a SkewedCauchy, is it? It's a SkewedGaussian for the longitudinal and a normal cauchy for trans, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I will clarify

Copy link
Contributor

@kosack kosack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a complete review, but a few initial comments

ctapipe/image/ellipsoid.py Outdated Show resolved Hide resolved
Camera geometry
cleaned_image: ndarray
Charge for each pixel, cleaning mask should be applied
pdf: str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be useful to use an Enum here and elsewhere for pdf (for speed and correctness and to make it easier to include in a config file later).

x0 : dict
seeds of the fit
pdf: str
name of the PDF, options = "Gaussian", "Laplace", "Skewed"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, use an Enum

0.99, x0["skewness"] + 0.3
)

if pdf == "Gaussian":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this then becomes, e.g. PDFType.Gaussian

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that what you were asking for?

@clara-escanuela clara-escanuela marked this pull request as ready for review May 5, 2023 17:46
@clara-escanuela
Copy link
Contributor Author

Here are some examples of the performance of the fit when applying a 2D Gaussian:
image

@clara-escanuela
Copy link
Contributor Author

clara-escanuela commented May 9, 2023

Maybe this is also interesting. For these tests, I am using an array of 14 MSTs and the HillasReconstructor, telescope multiplicity cut >= 5 triggered telescopes. Although these plots were made with a slightly older version of this code.
image

@maxnoe
Copy link
Member

maxnoe commented May 9, 2023

@clara-escanuela Could you also include please the standard hillas parameters based approach without applying a centroid cut?

@maxnoe
Copy link
Member

maxnoe commented May 9, 2023

As discussed via slack, I don't think that the centroid is a good criterion, the leakage_* are better parameters.

Thinking about it: since you get a predicted image from the model, and the model pdf is normalized, you should very easily be able to compute the expected containment from the image fit as (predicted_image.sum() / normalization) , which might be an interesting parameter.

@gaia-verna
Copy link

Maybe this is also interesting. For these tests, I am using an array of 14 MSTs and the HillasReconstructor, telescope multiplicity cut >= 5 triggered telescopes. Although these plots were made with a slightly older version of this code. image

Just for clarity, you considered:

  • the MST sub-array of CTA-South

  • MC gammas from Prod5

  • an on-axis observation @ 20° ZA

and the 2D fit is applied to all the shower images, right?

@maxnoe
Copy link
Member

maxnoe commented May 9, 2023

And when you do the cut on the centroid: is it the "new" centroid from the image fit for the fit or the same hillas centroid for both?

@clara-escanuela
Copy link
Contributor Author

clara-escanuela commented May 10, 2023

And when you do the cut on the centroid: is it the "new" centroid from the image fit for the fit or the same hillas centroid for both?

The centroid from Hillas in both cases. I wanted to compare the exact same events.

@clara-escanuela
Copy link
Contributor Author

@clara-escanuela Could you also include please the standard hillas parameters based approach without applying a centroid cut?

Here I am plotting Hillas when I do not apply any cut on the centroid distance (light blue), Hillas when I apply a cut on centroid (need to change definition) at 0.8m (dark blue), and 2D Fit with no cut on centroid. I fit all images and not only truncated. It would be interesting to see how this plot looks like when fitting only truncated images to check what it is the actual impact on only truncated images.

image

goodness_of_fit = Field(
nan, "measure of goodness of fit, mean likelihood subtracted to the likelihood"
)
n_pix_fit = Field(nan, "number of pixels used in the fit")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid such unneeded abbreviations.

@@ -216,6 +217,37 @@ class CameraHillasParametersContainer(BaseHillasParametersContainer):
psi = Field(nan * u.deg, "rotation angle of ellipse", unit=u.deg)


class ImageFitParametersContainer(CameraHillasParametersContainer):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We added support for the toy models in telescope frame now, so it should be possible (and should be the default) that this happens also in telescope frame (i.e. using fov_lon / fov_lat in degrees instead of x, y in m)

Seed parameters of the fit.
"""
unit = geometry.pix_x.unit
hillas = hillas_parameters(geometry, image) # compute Hillas parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why compute hillas parameters again instead of just taking them as input?

return initial_guess


def dilation(n, cleaned_mask, geometry):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this already exists in ctapipe.image.cleaning, no need to redefine

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, we only have dilate which does a single dilation. THe loop for multiple is not there, but this code should in any case go into cleaning.py (maybe call it dilate_multiple)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or use functools.reduce to do the dilate(dilate(dilate()) call chaining

-------
list of boundaries
"""
hillas = hillas_parameters(geometry, cleaned_image)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again a recomputation of parameters...

hillas = hillas_parameters(geometry, cleaned_image)

unit = geometry.pix_x.unit
camera_radius = geometry.guess_radius()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this memoized? If not it might be expensive to call this in the event loop

Copy link
Contributor

@kosack kosack Sep 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is current not memoized, so it will be quite slow to call this frequently. I didn't add a @lazyproperty to this call, since this is only one way to compute the edge of the camera (not sure it works well in all cases). It was just to give a quick way of computing a reasonable value. Ideally we would have a geom.radius property that is supplied externally like the pixel positions or telescope focal length, etc, and only guessed if none is provided.

In the mean-time, we could just add:

@lazyproperty
def radius(self):
    return self.guess_radius()

camera_radius = geometry.guess_radius()

cogx_min, cogx_max = np.sign(hillas.x) * min(
np.abs(hillas.x - u.Quantity(0.2, unit)), camera_radius
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

magic number 0.2?


from ..containers import ImageFitParametersContainer

PED_TABLE = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do these come from? Why these keys?

Copy link
Contributor

@kosack kosack Sep 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem to be used in the likelihood to provide the pedestal, so this is not the right way to load them: it will break if you use different simulation productions, or even sims with different cameras (there is no "SST-Camera" for example, it's call "SSTCam"), and clearly won't work for real data. Note that even between prod5 and prod6, the pedestal values are not the same

However, I would have two questions:

  1. since the waveforms and images derived from them are already pedestal-subtracted, do you really need the pedestal? Maybe the likelihood function needs modification for this case?
  2. if you need the pedestal, you should get it from the calibration container (but have to check if it is actually filled for simulations)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think the pedestal width is available at the moment. I used those values that I found in Dan's code as he also had to load the pedestal width like that.

@@ -369,3 +399,88 @@ def pdf(self, x, y):
r = np.sqrt((x - self.x) ** 2 + (y - self.y) ** 2)

return norm(self.radius.to_value(u.m), self.sigma.to_value(u.m)).pdf(r)


class SkewedGaussianLaplace(ImageModel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a physical motivation for this? Does it fit better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We thought this could better estimate the width. But, after some tests, I do not think it works better than the 2D Skewed Gaussian so I think I could remove it

"DUMMY": 0,
"testcam": 0,
}
SPE_WIDTH = 0.5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is also not the correct value, and you should get it from the appropriate Container.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think this is not available. At least I could not find the SPE width

@maxnoe maxnoe changed the title Image-fitting algorithm Add algorithm to compute image parameters via fitting the light distribution Oct 22, 2023
Copy link
Member

@maxnoe maxnoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some more inline comments.

One main one, that also should simplify the implementation a lot is that we shouldn't add new features using the CameraFrame, since we plan to remove image parameters in CameraFrame anyways very soon.

Second comment is that there are seemingly multiple PDFType choices allowed, but I see several parts of the code that should only work with the skewed version.

goodness_of_fit = Field(
nan, "measure of goodness of fit, mean likelihood subtracted to the likelihood"
)
n_pixels = Field(nan, "number of pixels used in the fit")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Positive integer values should have defaults of the correct type, e.g. -1

)
n_pixels = Field(nan, "number of pixels used in the fit")
free_parameters = Field(nan, "number of free parameters")
is_valid = Field(False, "True if the fit is valid")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between is_valid and is_accurate?


def create_initial_guess(geometry, hillas, size):
"""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment below was only for the "Parameters" and "Returns" sections, not for the initial line. So this empty line should be removed.


initial_guess["skewness"] = hillas.skewness

if (hillas.width.to_value(unit) == 0) or (hillas.length.to_value(unit) == 0):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For checking against 0, the unit doesn't matter, so just doing .value != 0 is fine and faster.


def boundaries(geometry, image, dilated_mask, hillas, pdf):
"""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No empty line here

ctapipe/image/ellipsoid.py Outdated Show resolved Hide resolved
ctapipe/image/ellipsoid.py Show resolved Hide resolved
like_array[like_array > 0]
- mean_poisson_likelihood_gaussian(
pdf_dict[pdf](
pars[0] * unit,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work for different kinds of PDFs that have different numbers of parameters, does it?

ctapipe/image/ellipsoid.py Show resolved Hide resolved
ctapipe/image/ellipsoid.py Outdated Show resolved Hide resolved
Copy link

codecov bot commented Dec 15, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (main@3c5b4f3). Click here to learn what that means.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2275   +/-   ##
=======================================
  Coverage        ?   92.54%           
=======================================
  Files           ?      236           
  Lines           ?    20288           
  Branches        ?        0           
=======================================
  Hits            ?    18776           
  Misses          ?     1512           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Alternative image parametrization based on gaussian fitting
4 participants