Impute pointing values with linear interpolation #295
Conversation
This is very necessary and a very good way of handling the problem of missing ucts/pointing info. Just a couple of comments; I would wait for @morcuended to test it and then merge right away.
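For reference, a minimal sketch of the kind of imputation this PR is about (not the actual lstchain implementation): events whose pointing is missing (nan) get their alt/az filled by linear interpolation over the event timestamps. The dataframe, column names and values below are made up for illustration.

import numpy as np
import pandas as pd

# Hypothetical event table: two events have lost their pointing info (nan).
events = pd.DataFrame({
    'dragon_time': [1.0, 2.0, 3.0, 4.0, 5.0],
    'alt_tel':     [1.20, np.nan, np.nan, 1.23, 1.24],
    'az_tel':      [0.50, np.nan, np.nan, 0.53, 0.54],
})

# Fill the missing pointing values by linear interpolation over the timestamps.
for col in ('alt_tel', 'az_tel'):
    missing = np.isnan(events[col].values)
    events.loc[missing, col] = np.interp(
        events.loc[missing, 'dragon_time'],   # times of the events to impute
        events.loc[~missing, 'dragon_time'],  # times with valid pointing
        events.loc[~missing, col],            # valid pointing values
    )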
Hi @vuillaut, when running the cloned scripts, for example for sub-run 1840.0033, like:
I'm having this error after processing some of the events:
However, I can process sub-run 1840.0034, which in principle did not contain … I will investigate a few more problematic subruns to see if I can reproduce the error. I don't know if it is related, but did you finally change the 'None' to 'nan' in
? I saw the commit in the other closed PR but not here. However, I see that you are setting nan in
Hey @morcuended, from your comment, I would imagine that the difference between these subruns is that:
@vuillaut, I think this is exactly the case. I'm running some more tests with the new changes.
Hi @vuillaut, this is what I'm getting for the entire run 1840: you can see that now there are … However, I realized that …
I will check the dl1->dl2 step right away in order to verify the interpolation of missing values.
I am not sure I understand why
Could you elaborate, please?
Yes, they are negative. I just filtered them out in the plot.
This is what is happening. After dropping the negative values, the remaining ones are constant for all the events in the affected subruns. The same happens to the interpolated pointing values. I hope it is clearer in this zoomed plot:
Related to this problem of missing ucts time info, Isidro has been checking that even if timestamps are recovered after the loss, the values may not be correct even when the external device presence flag says they are. This is still under investigation by the people involved in those subsystems. I think there is nothing we can do but rely on that flag saying whether the data is OK or not.
Thank you @morcuended for the explanation.
Yes, I would say so. Let's fix that in another PR. Concerning the dl1->dl2 step, this is the result of the interpolation of the missing values:
Additionally, I noticed that for some subruns the same unit error we were getting before appears again:
Strange, because the values of the coordinates seem to be OK. Could it be related to this assignment?
In this case, do you apply
My bad, I assigned a
alt_tel = - np.pi/2. * np.ones(len(dl2))
az_tel = - np.pi/2. * np.ones(len(dl2))
Oh well spotted
data.alt_tel = - np.pi/2.
data.az_tel = - np.pi/2.
I'm running some more tests with this definition and it seems they work fine now.
I'm running
OK, so for these subruns you'll now get the pointing set to -90 deg.
Frankly, at this point, I am not sure what is best with such messy data.
That's what I'm getting
I agree with you, some runs are a real pain. At least, these very useful changes will allow going up to dl2 regardless of missing time/pointing info. Thanks!
Now it works fine
Hi again @vuillaut,
Seeing the problems you have implementing this solution, I wonder whether it would be better to take the nominal RA,Dec pointing direction and then (assuming the pointing is exact) use the interpolated time, which looks good enough, to obtain alt, az event-wise. Obviously this would result in the weird situation that, if one calculates back the pointing RA/Dec, the exact value will be returned. But that seems like a minor side effect.
Since you have to reconstruct alt, az coordinates, at the end you will not recover exactly the same RA/Dec coordinates but a cloud of points around the true values, right?
Should we leave it as it is now and merge this PR? Add a warning somewhere?
I do not know what you mean: if for a given event one "fakes" the alt/az pointing direction by calculating the values that correspond to its (known, nominal) RA/Dec values at the interpolated ucts time, then inverting the transformation, i.e. starting from alt/az (and using the interpolated ucts time), should result in exactly the same RA/Dec.
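For illustration, the transformation being suggested could be sketched with astropy roughly as below; the source coordinates, site location and timestamps are made up, and this is not code from the PR.

import numpy as np
import astropy.units as u
from astropy.coordinates import SkyCoord, AltAz, EarthLocation
from astropy.time import Time

# Assumed nominal pointing (RA/Dec) and approximate LST-1 site coordinates.
nominal_pointing = SkyCoord(ra=83.633 * u.deg, dec=22.014 * u.deg, frame='icrs')
site = EarthLocation(lon=-17.89 * u.deg, lat=28.76 * u.deg, height=2200 * u.m)

# Hypothetical interpolated ucts times for a handful of events.
interpolated_times = Time('2020-01-28T04:00:00') + np.linspace(0, 10, 5) * u.s

# Transform the fixed RA/Dec pointing to alt/az at each interpolated time.
altaz_frame = AltAz(obstime=interpolated_times, location=site)
pointing_altaz = nominal_pointing.transform_to(altaz_frame)
alt_tel = pointing_altaz.alt.rad
az_tel = pointing_altaz.az.rad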
I was thinking of going from these
I am sorry, I fail to see how that would be easier in RA,Dec. We have the pointing (either alt,az or RA,Dec) every ~second. The problem is that some subruns do not have a single valid timestamp. How do you get the pointing then?
Hi @vuillaut, I was thinking of the image below: it is not clear to me from the thread whether you managed to fix that behaviour or not, but that is what triggered my suggestion. It just replaces the interpolation of alt & az by a coordinate transformation. But of course, you still need to have an interpolated time. If a bad time estimate was behind the ugly plot above, then it will not solve anything.
Indeed, there is no other way than assigning "uniformly" distributed times to the events, unless one of the other times is available.
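For a subrun with no valid timestamp at all, "uniformly" distributed times could be assigned along these lines; the boundaries and event count below are made up, and where they would actually come from (run metadata, neighbouring subruns, ...) is exactly the open question in this thread.

import numpy as np

# Assumed subrun boundaries and event count (hypothetical values).
t_start = 1580182000.0   # subrun start, unix time in seconds
t_stop = 1580182010.0    # subrun stop, unix time in seconds
n_events = 53000

# Spread the events uniformly between the two boundaries.
fake_timestamps = np.linspace(t_start, t_stop, n_events)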
@morcuended
Do you mean run 1840? That is what I did to obtain that 'messy' plot. First, merge the DL1 files (including those subruns with no time info at all) and then run dl1_to_dl2, which interpolated taking the -pi/2 values into account.
else:
    data.alt_tel = - np.pi/2.
    data.az_tel = - np.pi/2.
I think that, in order to avoid interpolating between -pi/2 and correct values, the condition here to interpolate missing values should also include those values which were set to -pi/2 in the dl1_to_dl2.py script:
if 'mc_alt_tel' in dl2.columns:
    alt_tel = dl2['mc_alt_tel'].values
    az_tel = dl2['mc_az_tel'].values
elif 'alt_tel' in dl2.columns:
    alt_tel = dl2['alt_tel'].values
    az_tel = dl2['az_tel'].values
else:
    alt_tel = - np.pi/2. * np.ones(len(dl2))
    az_tel = - np.pi/2. * np.ones(len(dl2))
In this way, those subruns with no alt/az coordinates at all can be properly interpolated. Does this make sense to you, @vuillaut?
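One way to express that condition, sketched with made-up data and an assumed dragon_time column rather than the actual patch: treat both nan and the -pi/2 sentinel as missing before interpolating.

import numpy as np
import pandas as pd

# Hypothetical dl2-like table where two events carry the -pi/2 sentinel.
dl2 = pd.DataFrame({
    'dragon_time': [0., 1., 2., 3., 4.],
    'alt_tel': [1.20, -np.pi/2., -np.pi/2., 1.23, 1.24],
    'az_tel':  [0.50, -np.pi/2., -np.pi/2., 0.53, 0.54],
})

# Flag both nan and the -pi/2 sentinel as missing, then interpolate
# over the timestamps of the events with valid pointing.
for col in ('alt_tel', 'az_tel'):
    values = dl2[col].values
    missing = np.isnan(values) | np.isclose(values, -np.pi/2.)
    dl2.loc[missing, col] = np.interp(dl2['dragon_time'][missing],
                                      dl2['dragon_time'][~missing],
                                      values[~missing])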
Ha, sorry, I thought you had merged DL2 files. This is what I get:
x-axis = event_id
This looks much better indeed :) Let me check what I'm doing wrong then.
@morcuended
Ouch! Probably yes. Will try it with that. More generally, I was thinking about sorting the file list in:
by adding … But again, there is no single way of merging files. You can merge only the subruns corresponding to a given run, or even several runs, so sorting does not make sense anymore. Maybe there should be an option in
I like the idea of the run option if you can implement that. I implemented the sort by event_id here. But this means that the method will fail when analysing a DL1 file with several runs inside (the sort by event_id will actually shuffle the events from different runs).
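A minimal illustration of the sorting point; the obs_id/event_id column names are assumptions here, and the two-key sort is only a possible way around the multi-run caveat, not something implemented in this PR.

import pandas as pd

# Hypothetical merged DL1 table containing events from two runs.
dl1 = pd.DataFrame({
    'obs_id':   [1840, 1840, 1841, 1841],
    'event_id': [2, 1, 2, 1],
})

# Sorting by event_id alone interleaves events from different runs.
shuffled = dl1.sort_values('event_id')

# Sorting by (obs_id, event_id) would keep each run contiguous instead.
per_run = dl1.sort_values(['obs_id', 'event_id'])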
Sure, will do that.
OK, I saw you solved the problem with the weird interpolation, and the across-the-run interpolation also seems to work well enough; I withdraw my suggestion then!
@morcuended
Confirmed! Let's merge it.
Thank you so much for all the testing and feedback @morcuended.
Fixes #285: nan instead of None in case of wrong timing info.
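As a brief, generic illustration (plain numpy behaviour, not lstchain code) of why nan is the more convenient filler: it keeps the column as a float array, so missing entries can be detected and interpolated directly, while None pushes the array to object dtype.

import numpy as np

# With nan the array stays float and missing entries are easy to detect.
with_nan = np.array([1.0, np.nan, 3.0])
print(with_nan.dtype)        # float64
print(np.isnan(with_nan))    # [False  True False]

# With None the array falls back to object dtype, and np.isnan would raise.
with_none = np.array([1.0, None, 3.0])
print(with_none.dtype)       # object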