Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ucts fix to account for wrongly tagged pedestal events due to skipping events #497

Merged
merged 19 commits into from
Sep 14, 2020

Conversation

moralejo
Copy link
Collaborator

@moralejo moralejo commented Jul 31, 2020

This introduces and offline fix for the shifting of the UCTS info by 1-event jumps.

It also makes that the DL1 output contains now all events, not just those with valid image parameters. I think this is much better to allow study e.g. of interleaved events in the standard DL1 files.

@moralejo
Copy link
Collaborator Author

By the way, in order to work, the fix needs to have the proper dragon timestamps, so the r0 to dl1 step has to be launched with the command-line options to build them.

@contrera
Copy link

Is this correction compatible with running the subruns in parallel ?

@DirkHoffmann
Copy link

Is this correction compatible with running the subruns in parallel ?

No, the comparisons and corrections need strictly incremented event order N → N+1 (and later N → N+2 etc., if the skip happends multiple times in the same run).

Unless someone has found a brilliant solution for that.

@moralejo moralejo marked this pull request as ready for review July 31, 2020 13:08
@moralejo
Copy link
Collaborator Author

moralejo commented Jul 31, 2020

The correction is indeed compatible with running the subruns in parallel. But one needs the same that is needed to build a proper dragon time stamp: pass some synchronization info through the command line. This is done now automatically in the onsite analysis, so no problem there.

@DirkHoffmann
Copy link

Sorry, first of all I must admit that I mis-read "subruns" and understood streams (sub-files 1-4).

But splitting a run into several parts still seems not obvious to me:

The correction is indeed compatible with running the subruns in parallel.

How do you decide to shift the event fragments of later subruns (and by how much), if you have not yet processed all previous subruns of the same run, in order to know if there was a shift (or more)?

@moralejo
Copy link
Collaborator Author

Sorry, first of all I must admit that I mis-read "subruns" and understood streams (sub-files 1-4).

But splitting a run into several parts still seems not obvious to me:

The correction is indeed compatible with running the subruns in parallel.

How do you decide to shift the event fragments of later subruns (and by how much), if you have not yet processed all previous subruns of the same run, in order to know if there was a shift (or more)?

The trick is that the process which launches each subrun gets information from the daily check which has already gone through the whole run. It somehow identifies in each subrun how to build the absolute dragon timestamp for the first event in the subrun (or the first with valid ucts info to be precise), and this is passed to the r0_to_dl1 process through two command-line arguments. With valid dragon time stamps we can compare the times with ucts and search for the jumps.

@moralejo
Copy link
Collaborator Author

Ok, tests fail because I now write out all events to the DL1 output, and it is comparing one so-produced dl1 file, with a fixed test one which contains only survivors of the image parametrization...

@contrera
Copy link

OK. I think right now the onsite analysis is not yet ready for this, we only consider one correction per run. Adapting the code to it may take time.

@moralejo
Copy link
Collaborator Author

OK. I think right now the onsite analysis is not yet ready for this, we only consider one correction per run. Adapting the code to it may take time.

No, I think it is ready. The dragon time stamps are built with one correction per subrun, I believe, at least Isidro provides them like that. And I think we could not have built at all the dragon timestamp (and hence detect the Crab pulsar) without that.

@moralejo
Copy link
Collaborator Author

No, I think it is ready. The dragon time stamps are built with one correction per subrun, I believe, at least Isidro provides them like that. And I think we could not have built at all the dragon timestamp (and hence detect the Crab pulsar) without that.

After discussion with @contrera: it is not correct that it is 'one correction per subrun'; since the Dragon counter is reliable, we only need one fixed point in the whole run which connects Dragon counter and absolute time. And so the same fixed point (from subrun 0) is used for all subruns.

This however does not mean the UCTS event shift correction introduced in this PR will not work on a subrun-wise basis: only the first N events of a subrun cannot be corrected, with N being the number of jumps in the UCTS info that have happened in previous subruns. Since at least in current data these jumps are rare, I think this is a very minor limitation.

@moralejo
Copy link
Collaborator Author

moralejo commented Jul 31, 2020

Ok, tests fail because I now write out all events to the DL1 output, and it is comparing one so-produced dl1 file, with a fixed test one which contains only survivors of the image parametrization...

The test that fails is actually not against a fixed file, but against a file produced with lstchain_mc_dl1ab (i.e. opening the DL1 file and recalculating image parameters).

…mtel,

with those in one produced from the very same DL1 (redoing the cleaning and
parametrization) is now done only for events in which intensity is not nan,
since the former file may now contain events in which the Hillas
parametrization was unsuccessful
@moralejo moralejo marked this pull request as draft July 31, 2020 15:52
@contrera
Copy link

No, I think it is ready. The dragon time stamps are built with one correction per subrun, I believe, at least Isidro provides them like that. And I think we could not have built at all the dragon timestamp (and hence detect the Crab pulsar) without that.

After discussion with @contrera: it is not correct that it is 'one correction per subrun'; since the Dragon counter is reliable, we only need one fixed point in the whole run which connects Dragon counter and absolute time. And so the same fixed point (from subrun 0) is used for all subruns.

This however does not mean the UCTS event shift correction introduced in this PR will not work on a subrun-wise basis: only the first N events of a subrun cannot be corrected, with N being the number of jumps in the UCTS info that have happened in previous subruns. Since at least in current data these jumps are rare, I think this is a very minor limitation.

You are right. Good. Then it can run in parallel and modifications are minimal. :-)

…ning

and parametrization to the DL1 file is too difficult because the
(deprecated) DL1ParametersContainer is a mess. Basically, a reset() leaves
it in a state (initialized with None's, no units) which is incompatible
with writing it later. And I do not think it is worth to fill manually all
the values for empty events, given the container is deprecated.
@moralejo
Copy link
Collaborator Author

moralejo commented Jul 31, 2020

Turns out that writing out the events which do not survive cleaning and parametrization to the DL1 file is pretty difficult, because the (deprecated) DL1ParametersContainer is a mess. Basically, a reset() leaves it in a state (initialized with None's, no units) which is incompatible with writing it later (that seems to me an ugly feature - anyone can think of an easy fix, wirth applying to a deprecated container?).
Found a way of doing that, so we now write out to DL1 file all events, whether or not they survived cleaning. Now tests are passing, for that I had to slightly modify lstchain_mc_dl1ab so that the default values for non-parametrized images are the same.

@codecov
Copy link

codecov bot commented Jul 31, 2020

Codecov Report

Merging #497 into master will decrease coverage by 0.02%.
The diff coverage is 32.25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #497      +/-   ##
==========================================
- Coverage   41.58%   41.56%   -0.03%     
==========================================
  Files          77       77              
  Lines        6428     6448      +20     
==========================================
+ Hits         2673     2680       +7     
- Misses       3755     3768      +13     
Impacted Files Coverage Δ
lstchain/scripts/lstchain_mc_dl1ab.py 0.00% <0.00%> (ø)
lstchain/reco/r0_to_dl1.py 63.66% <43.47%> (-1.31%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9815214...5fedbd4. Read the comment docs.

@moralejo moralejo marked this pull request as ready for review July 31, 2020 19:14
@moralejo moralejo requested a review from rlopezcoto July 31, 2020 19:15
@moralejo moralejo marked this pull request as draft August 1, 2020 07:15
…eaning,

and still keep copnsistency between r0_to_dl1 and dl1ab
@moralejo moralejo marked this pull request as ready for review August 1, 2020 08:27
@moralejo
Copy link
Collaborator Author

moralejo commented Aug 1, 2020

Note: as of now, the correction of the UCTS info is done quite late in the loop, after calibration and parametrization, just because the needed dragon time is calculated there. Obviously this is inconvenient if one wants to use the info from interleaved pedestals e.g. in the cleaning. Eventually we have to move both the dragon time calculation and the ucts info shifting to a function called right after an event is read from the event source. Or even putting it directly into the event source, if it is possible... For now it is still very useful like it is, because we will have all calibrated images properly tagged in the DL1 output, so one can re-do the cleaning starting from DL1 files, for optimizing cleaning strategies.

@moralejo moralejo marked this pull request as draft August 1, 2020 10:46
@moralejo moralejo marked this pull request as ready for review August 1, 2020 13:07
…in case

of more than one UCTS jump. Also: set -1 as ucts_trigger_type of events with
no valid UCTS info.
lstchain/reco/r0_to_dl1.py Show resolved Hide resolved
lstchain/reco/r0_to_dl1.py Show resolved Hide resolved
lstchain/scripts/lstchain_mc_dl1ab.py Show resolved Hide resolved
lstchain/scripts/lstchain_mc_dl1ab.py Show resolved Hide resolved
# integer parameters.
#
for key in dl1_container.keys():
dl1_container[key] = u.Quantity(0, dl1_container.fields[key].unit)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting in values that might be in the valid range for parameters as missing value indicators is never good and should be avoided.

The right thing todo is to use the correct defaults in the containers.

For floats, that is nan, for quantitites it is u.Quantity(np.nan, unit), for positive integers it could be -1, for general integers there is no general case and care must be taken in chossing a missing value indicator.

We did all this for the ctapipe containers in Version 0.8, we only missed one container for muon results which is fixed in the current master.

The same approach should be adapted in lstchain.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you from a theoretical point of view. But here none of the parameters will be used for anything if intensity is 0. And we are anyway moving to the ctapipe DL1 containers as soon as we can, so why bother about something in the deprecated container, which even now has no impact? What we badly need now is proper event tagging, and hopefully this provide it for most data.

@rlopezcoto rlopezcoto changed the title Ucts fix Ucts fix to account for wrongly tagged pedestal events due to skipping events Sep 14, 2020
@rlopezcoto rlopezcoto merged commit 27ca0db into cta-observatory:master Sep 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants