Impute pointing values with linear interpolation #295

Merged — 14 commits merged into cta-observatory:master on Feb 28, 2020

Conversation

@vuillaut (Member)

Fixes #285

  • The pointing values are now nan instead of None in case of wrong timing info.
  • I propose a linear interpolation between pointing values when they are missing, under the hypothesis that events are in order and arrive at a regular rate (a rough sketch of the idea is shown below). Note that if a lot of values are missing, this might not be the best approach, and feedback is welcome.
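A rough sketch of that interpolation on a DL1 parameters table (assuming a pandas DataFrame whose alt_tel/az_tel columns use NaN for missing values; the actual helper added in this PR may differ in name and details):

import numpy as np
import pandas as pd

def interpolate_pointing(dl1: pd.DataFrame) -> pd.DataFrame:
    """Fill NaN alt_tel/az_tel by linear interpolation over the event index,
    assuming events are in order and arrive at a roughly constant rate."""
    for col in ('alt_tel', 'az_tel'):
        values = dl1[col].to_numpy(dtype=float)
        valid = np.isfinite(values)
        if valid.any():
            # np.interp clamps to the nearest valid value at the edges.
            dl1[col] = np.interp(np.arange(len(dl1)), np.flatnonzero(valid), values[valid])
    return dl1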

@rlopezcoto (Contributor) left a comment:

This is very necessary and a very good way of handling the problem of missing ucts/pointing info. Just a couple of comments; I would wait for @morcuended to test it and then merge right away.

Two review comments on lstchain/scripts/lstchain_mc_dl1_to_dl2.py (outdated, resolved).
@morcuended (Member)

Hi @vuillaut, when running the cloned script, for example for sub-run 1840.0033, like:

srun python lstchain/scripts/lstchain_data_r0_to_dl1.py \
    -f /fefs/aswg/data/real/R0/20200118/LST-1.1.Run01840.0033.fits.fz \
    -o ~/data/DL1 \
    -pedestal /fefs/aswg/data/real/calibration/20200118/v00/drs4_pedestal.Run1830.0000.fits \
    -calib /fefs/aswg/data/real/calibration/20200118/v00/calibration.Run1831.0000.hdf5 \
    -time_calib /fefs/aswg/data/real/calibration/20200118/v00/time_calibration.Run1831.0000.hdf5 \
    -pointing /home/lapp/DrivePositioning/drive_log_20_01_18.txt

I'm having this error after processing some of the events:

...
11900
12000
12100
Traceback (most recent call last):
  File "tables/tableextension.pyx", line 1590, in tables.tableextension.Row.__setitem__
TypeError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "lstchain/scripts/lstchain_data_r0_to_dl1.py", line 78, in <module>
    main()
  File "lstchain/scripts/lstchain_data_r0_to_dl1.py", line 73, in main
    pointing_file_path=args.pointing_file_path
  File "/fefs/home/daniel.morcuende/software/to_remove/cta-lstchain/lstchain/reco/dl0_to_dl1.py", line 380, in r0_to_dl1
    containers=[dl1_container, extra_im])
  File "/home/daniel.morcuende/.local/miniconda3/envs/lst-dev/lib/python3.7/site-packages/ctapipe/io/hdf5tableio.py", line279, in write
    self._append_row(table_name, containers)
  File "/home/daniel.morcuende/.local/miniconda3/envs/lst-dev/lib/python3.7/site-packages/ctapipe/io/hdf5tableio.py", line255, in _append_row
    row[colname] = value
  File "tables/tableextension.pyx", line 1595, in tables.tableextension.Row.__setitem__
TypeError: invalid type (<class 'astropy.units.quantity.Quantity'>) for column ``alt_tel``

However, I can process sub-run 1840.0034, which in principle did not contain the alt_tel column before (with the existing lstchain version), without problems, and the alt_tel and az_tel columns are now created with NaN values.

I will investigate a few more problematic subruns to see if I can reproduce the error.

I don't know if it is related, but did you finally change the None to nan in

    alt_tel = Field(None, 'Telescope altitude pointing', unit=u.rad)                                                    
    az_tel = Field(None, 'Telescope azimuth pointing', unit=u.rad) 

?

I saw the commit in the other closed PR but not here. However, I see that you are setting nan in cal_pointingposition when ucts_time is missing.
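For reference, the nan-default fields being asked about would look roughly like this (a sketch only; the container class shown here is illustrative and the thread does not confirm exactly where the change was made):

import numpy as np
import astropy.units as u
from ctapipe.core import Container, Field

class DL1ParametersContainer(Container):
    # np.nan instead of None keeps the columns plain floats and avoids
    # mixed None/Quantity types when pointing information is missing.
    alt_tel = Field(np.nan, 'Telescope altitude pointing', unit=u.rad)
    az_tel = Field(np.nan, 'Telescope azimuth pointing', unit=u.rad)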

@vuillaut (Member, Author)

Hi @vuillaut, when running the cloned script, for example for sub-run 1840.0033 [...] I'm having this error after processing some of the events: [...] TypeError: invalid type (<class 'astropy.units.quantity.Quantity'>) for column ``alt_tel`` [...] However I can process sub-run 1840.0034 [...] without problems [...]

Hey @morcuended
Thank you for testing!

From your comment, I would imagine that the difference between these subruns is that:

  • sub-run 1840.0034 does not have any valid pointing. In this case there is no unit conflict, every value is nan.
  • sub-run 01840.0033 has a mix of valid and invalid pointings, so it works up to event ~12100, then the pointing becomes invalid and there is a unit conflict.
    Could you confirm? I'll push a fix for this.

@morcuended (Member)

From your comment, I would imagine that the difference between these subruns is that:

  • sub-run 1840.0034 does not have any valid pointing. In this case there is no unit conflict, every value is nan.
  • sub-run 01840.0033 has a mix of valid and invalid pointings, so it works up to event ~12100, then the pointing becomes invalid and there is a unit conflict.
    Could you confirm? I'll push a fix for this.

@vuillaut, I think this is exactly the case. I'm running some more tests with the new changes.

@morcuended (Member)

Hi @vuillaut, this is what I'm getting for the entire run 1840:

[image]

You can see that now there are alt_tel values for those sub-runs with some missing ucts_time info. Good!

However, I realized that the ucts_time values remain constant across events with ucts info for those subruns with a mix of valid and non-valid pointings, as you can see in the plot above. Thus the alt_tel values stay constant as well.
Any ideas?

@morcuended (Member)

I will check the dl1->dl2 step right away to verify the interpolation of the missing values.

@vuillaut (Member, Author)

However, I realized that the ucts_time values remain constant across events with ucts info for those subruns with a mix of valid and non-valid pointings, as you can see in the plot above. Thus the alt_tel values stay constant as well.
Any ideas?

I am not sure I understand why ucts_time values remain constant.

  • I thought invalid ucts_time values were negative
  • If ucts_time is constant but with a valid value, the derived pointing should be valid as well (and constant, yes, as expected)

Could you elaborate, please?

@morcuended (Member)

  • I thought invalid ucts_time values were negative

Yes, they are negative. I just filtered them out in the plot.

  • If ucts_time is constant but with a valid value, the derived pointing should be valid as well (and constant, yes, as expected)

This is what is happening. After dropping the negative values, the remaining ones are constant for all the events in the affected subruns. The same happens to the interpolated pointing values. I hope it is clearer in this zoomed plot:

[image]

@morcuended (Member)

Related to this problem of missing ucts time info, Isidro has been checking that even when timestamps are recovered after the loss, the values may not be correct even if the external device presence flag says they are. This is still under investigation by the people involved in those subsystems. I think there is nothing we can do but rely on that flag to say whether the data are OK or not.

@vuillaut (Member, Author)

  • I thought invalid ucts_time values were negative

Yes, they are negative. I just filtered them out in the plot.

  • If ucts_time is constant but with a valid value, the derived pointing should be valid as well (and constant, yes, as expected)

This is what is happening. After dropping the negative values, the remaining ones are constant for all the events in the affected subruns. The same happens to the interpolated pointing values. I hope it is clearer in this zoomed plot:

Thank you @morcuended for the explanation.
If I understand correctly, these constant values need separate handling, unrelated to this PR, right?

@morcuended (Member)

Thank you @morcuended for the explanation.
If I understand correctly, these constant values need separate handling, unrelated to this PR, right?

Yes, I would say so. Let's fix that in another PR.

Concerning dl1->dl2 step, this is the result of the interpolation of the missing values:

  • Since the coordinates remain constant, the new values replicate this behaviour, interpolating values for all the events in the affected subrun.
  • Whenever there are no valid alt_tel values in a given subrun, there are no interpolated values for any of the events in that particular subrun.

[image]

Additionally, I noticed that for some subruns the same unit error we were getting before appears again:

Traceback (most recent call last):                                                                                                                                                                                  
  File "/home/daniel.morcuende/software/to_remove/cta-lstchain/lstchain/scripts/lstchain_mc_dl1_to_dl2.py", line 99, in <module>                                                                                    
    main()                                                                                                                                                                                                          
  File "/home/daniel.morcuende/software/to_remove/cta-lstchain/lstchain/scripts/lstchain_mc_dl1_to_dl2.py", line 89, in main                                                                                        
    dl2 = dl1_to_dl2.apply_models(data, cls_gh, reg_energy, reg_disp_vector, custom_config=config)                                                                                                                  
  File "/fefs/home/daniel.morcuende/software/to_remove/cta-lstchain/lstchain/reco/dl1_to_dl2.py", line 410, in apply_models                                                                                         
    az_tel * u.rad)                                                                                                                                                                                                 
  File "/fefs/home/daniel.morcuende/software/to_remove/cta-lstchain/lstchain/reco/utils.py", line 197, in reco_source_position_sky                                                                                  
    return camera_to_sky(src_x, src_y, focal_length, pointing_alt, pointing_az)                                                                                                                                     
  File "/fefs/home/daniel.morcuende/software/to_remove/cta-lstchain/lstchain/reco/utils.py", line 227, in camera_to_sky 
    pointing_direction = SkyCoord(alt=pointing_alt, az=pointing_az, frame=horizon_frame)                                                                                                                            
  File "/home/daniel.morcuende/.local/miniconda3/envs/lst-dev/lib/python3.7/site-packages/astropy/coordinates/sky_coordinate.py", line 257, in __init__
    frame_cls(**frame_kwargs), args, kwargs)                                                                                                                                                                        
  File "/home/daniel.morcuende/.local/miniconda3/envs/lst-dev/lib/python3.7/site-packages/astropy/coordinates/sky_coordinate_parsers.py", line 244, in _parse_coordinate_data
    valid_components.update(_get_representation_attrs(frame, units, kwargs))                                                                                                                                        
  File "/home/daniel.morcuende/.local/miniconda3/envs/lst-dev/lib/python3.7/site-packages/astropy/coordinates/sky_coordinate_parsers.py", line 589, in _get_representation_attrs
    valid_kwargs[frame_attr_name] = repr_attr_class(value, unit=unit)                                                                                                                                               
  File "/home/daniel.morcuende/.local/miniconda3/envs/lst-dev/lib/python3.7/site-packages/astropy/coordinates/angles.py", line 514, in __new__
    self._validate_angles()                                                                                                                                                                                         
  File "/home/daniel.morcuende/.local/miniconda3/envs/lst-dev/lib/python3.7/site-packages/astropy/coordinates/angles.py", line 531, in _validate_angles
    'got {0}'.format(angles.to(u.degree)))                                                                                                                                                                          
ValueError: Latitude angle(s) must be within -90 deg <= angle <= 90 deg, got [-5156.62015618 -5156.62015618 -5156.62015618 ... -5156.62015618
 -5156.62015618 -5156.62015618] deg 

Strange because the values of the coordinates seem to be OK:
[image]

Could it be related to this assignment?

            data.alt_tel = u.Quantity(-90, u.deg)
            data.az_tel = u.Quantity(-90, u.deg)

@vuillaut (Member, Author)

Concerning dl1->dl2 step, this is the result of the interpolation of the missing values:

  • Since the coordinates remain constant, the new values replicate this behaviour, interpolating values for all the events in the affected subrun.
  • Whenever there are no valid alt_tel values in a given subrun, there are no interpolated values for any of the events in that particular subrun.

[image]

In this case, do you apply lstchain_mc_dl1_to_dl2 per subrun or to a merged file for the whole run?
If the former, you get the expected behaviour: no valid pointing value, nothing to interpolate on.
If the latter, I don't understand.

Additionally, I noticed that for some subruns the same unit error we were getting before appears again:

ValueError: Latitude angle(s) must be within -90 deg <= angle <= 90 deg, got [-5156.62015618 -5156.62015618 -5156.62015618 ... -5156.62015618
 -5156.62015618 -5156.62015618] deg 

Could it be related to this assignment?

            data.alt_tel = u.Quantity(-90, u.deg)
            data.az_tel = u.Quantity(-90, u.deg)

My bad, I assigned a Quantity to a pandas column 🤦‍♂️
Corrected.

Comment on lines +400 to +401
alt_tel = - np.pi/2. * np.ones(len(dl2))
az_tel = - np.pi/2. * np.ones(len(dl2))
A Member commented:

Oh well spotted

Comment on lines +73 to +74
data.alt_tel = - np.pi/2.
data.az_tel = - np.pi/2.
A Member commented:

I'm running some more tests with this definition and it seems to work fine now.

@morcuended (Member)

In this case, do you apply lstchain_mc_dl1_to_dl2 per subrun or to a merged file for the whole run?
If the former, you get the expected behaviour: no valid pointing value, nothing to interpolate on.
If the latter, I don't understand.

I'm running lstchain_mc_dl1_to_dl2 per subrun.

@vuillaut (Member, Author) commented Feb 25, 2020

In this case, do you apply lstchain_mc_dl1_to_dl2 per subrun or to a merged file for the whole run?
If the former, you get the expected behaviour: no valid pointing value, nothing to interpolate on.
If the latter, I don't understand.

I'm running lstchain_mc_dl1_to_dl2 per subrun.

OK, so for these subruns you'll get the pointing set to -90 deg now.
On a merged file, you should get a linear interpolation between subruns with valid pointing values, so in this case, straight lines joining constant lines:

__     __
   \__/  \__/

Frankly, at this point, I am not sure what is best with such messy data.
As a user, I would probably get rid of the pointing for the whole run and set it constant. But this is a peculiar case (I hope) and should be dealt with specifically by the analyzer.

@morcuended (Member)

In this case, do you apply lstchain_mc_dl1_to_dl2 per subrun or to a merged file for the whole run?
If the former, you get the expected behaviour: no valid pointing value, nothing to interpolate on.
If the latter, I don't understand.

I'm running lstchain_mc_dl1_to_dl2 per subrun.

OK, so for these subruns you'll get the pointing set to -90 deg now.

That's what I'm getting

On a merged file, you should get a linear interpolation between subruns with valid pointing values, so in this case, straight lines joining constant lines

__     __
   \__/  \__/

Frankly, at this point, I am not sure what is best with such messy data.
As a user, I would probably get rid of the pointing for the whole run and set it constant. But this is a peculiar case (I hope) and should be dealt with specifically by the analyzer.

I agree with you, some runs are a real pain. At least, these very useful changes will allow going up to dl2 regardless of missing time/pointing info. Thanks!

@morcuended (Member) left a comment:

Now it works fine

@morcuended (Member)

Hi again @vuillaut,
I tried to run lstchain_mc_dl1_to_dl2 over the whole merged run 1840. This is the result:
[image]
I don't understand what I'm doing wrong this time... it should look as you mentioned earlier.

@moralejo (Collaborator)

Seeing the problems you have implementing this solution, I wonder whether it would be better to take the nominal RA/Dec pointing direction and then (assuming the pointing is exact) use the interpolated time, which looks good enough, to obtain alt, az event-wise. Obviously this would result in the weird situation that, if one calculates back the pointing RA/Dec, the exact value will be returned. But that seems like a minor side effect.
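For illustration, the suggested alternative would amount to something like the following astropy transformation (a sketch; the nominal RA/Dec target, the site coordinates and the per-event interpolated times below are placeholder values):

import astropy.units as u
from astropy.coordinates import AltAz, EarthLocation, SkyCoord
from astropy.time import Time

# Hypothetical inputs: a nominal RA/Dec pointing and per-event interpolated timestamps.
nominal_pointing = SkyCoord(ra=83.633 * u.deg, dec=22.014 * u.deg, frame='icrs')
site = EarthLocation(lat=28.762 * u.deg, lon=-17.892 * u.deg, height=2200 * u.m)
event_times = Time(['2020-01-18T22:00:00', '2020-01-18T22:00:01'])

# Event-wise alt/az obtained by transforming the (assumed exact) nominal pointing.
altaz = nominal_pointing.transform_to(AltAz(obstime=event_times, location=site))
alt_tel = altaz.alt.to_value(u.rad)
az_tel = altaz.az.to_value(u.rad)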

@morcuended (Member)

Seeing the problems you have implementing this solution, I wonder whether it would be better to take the nominal RA/Dec pointing direction and then (assuming the pointing is exact) use the interpolated time, which looks good enough, to obtain alt, az event-wise. Obviously this would result in the weird situation that, if one calculates back the pointing RA/Dec, the exact value will be returned. But that seems like a minor side effect.

Since you have to reconstruct the alt, az coordinates, at the end you will not recover exactly the same RA/Dec coordinates but a cloud of points around the true values, right?

@morcuended (Member) commented Feb 26, 2020

So, most cases are dealt with.
The problem appears when entire subruns are missing pointing values.
But the solution differs if we analyse per subrun or per run.

  • per subrun, we can set a default value (as now) so the next step of the analysis runs
  • per run, we should not set a default value; we should keep the nan values and interpolate over the entire run

As this decision is made outside of lstchain, I don't see any obvious way to implement a solution other than leaving it to the user.
Hopefully, such issues only concern a few runs and happen only now in the early phases.

Should we leave it as it is now and merge this PR? Add a warning somewhere?

@moralejo (Collaborator)

Seeing the problems you have implementing this solution, I wonder whether it would be better to take the nominal RA/Dec pointing direction and then (assuming the pointing is exact) use the interpolated time, which looks good enough, to obtain alt, az event-wise. Obviously this would result in the weird situation that, if one calculates back the pointing RA/Dec, the exact value will be returned. But that seems like a minor side effect.

Since you have to reconstruct the alt, az coordinates, at the end you will not recover exactly the same RA/Dec coordinates but a cloud of points around the true values, right?

I do not know what you mean: if for a given event one "fakes" the alt/az pointing direction, just calculating the values that would correspond to its (known, nominal) RA/Dec values at the interpolated ucts time, then inverting the transformation, i.e. starting from alt/az (and using the interpolated ucts time), should result in exactly the same RA/Dec.

@morcuended (Member)

I was thinking of going from these alt_tel, az_tel coordinates (even if they are faked using timestamps) at DL1 level to reco_alt, reco_az values at DL2, and then transforming them back to RA/Dec.

@vuillaut (Member, Author) commented Feb 26, 2020

Seeing the problems you have implementing this solution, I wonder whether it would be better to take the nominal RA/Dec pointing direction and then (assuming the pointing is exact) use the interpolated time, which looks good enough, to obtain alt, az event-wise. Obviously this would result in the weird situation that, if one calculates back the pointing RA/Dec, the exact value will be returned. But that seems like a minor side effect.

I am sorry, I fail to see how that would be easier in RA/Dec.
The solution is not actually complicated here; it's rather a question of how to approach the problem and of dealing with multiple use cases.

We have pointing (either alt,az or RA,DEC) every ~second.
Then we interpolate the pointing for each event based on timestamp.
As timestamps are not always there, pointing interpolation fails.
This can easily be solved (this PR) if it concerns a limited fraction of events.

The problem is that some subruns do not have a single valid timestamp. How do you get the pointing then?
Well, we could interpolate across the whole run to guess the pointing of the invalid subruns.
The problem there is the workflow (which is not implemented in lstchain so we have no control here). There is no way to know if the data is or will be analysed per subrun or in a merged way.

@moralejo (Collaborator)

Hi @vuillaut ,

I was thinking of the image below:
[image]

It is not clear to me from the thread if you managed to fix that behaviour or not, but that is what triggered my suggestion. It just replaces the interpolation of alt & az by a coordinate transformation. But of course, you still need to have an interpolated time. If a bad time estimate was behind the ugly plot above, then it will not solve anything.

The problem is that some subruns do not have a single valid timestamp. How do you get the pointing then? Well, we could interpolate across the whole run to guess the pointing of the invalid subruns.

Indeed, there is no other way than assigning "uniformly" distributed times to the events, unless one of the other times is available.

@vuillaut (Member, Author)

@morcuended
To deal with such a run, I think the strategy would be to merge at the DL1 level and then run dl1_to_dl2, which should interpolate the missing values.

@morcuended (Member)

@morcuended
To deal with such a run, I think the strategy would be to merge at the DL1 level and then run dl1_to_dl2, which should interpolate the missing values.

Do you mean run 1840? That is what I did to obtain that 'messy' plot. First, merge the DL1 files (including those subruns with no time info at all) and then run dl1_to_dl2, which interpolated taking the -pi/2 values into account.

Comment on lines +72 to +74
else:
data.alt_tel = - np.pi/2.
data.az_tel = - np.pi/2.
A Member commented:

I think that, in order to avoid the interpolation between -pi/2 and correct values, the condition here to interpolate missing values should also include those values which were set to -pi/2 in the dl1_to_dl2.py script:

    if 'mc_alt_tel' in dl2.columns:
        alt_tel = dl2['mc_alt_tel'].values
        az_tel = dl2['mc_az_tel'].values
    elif 'alt_tel' in dl2.columns:
        alt_tel = dl2['alt_tel'].values
        az_tel = dl2['az_tel'].values
    else:
        alt_tel = - np.pi/2. * np.ones(len(dl2))
        az_tel = - np.pi/2. * np.ones(len(dl2))

In this way, those subruns with no alt/az coordinates at all can be properly interpolated. Does this make sense to you, @vuillaut?
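One possible way to do that (a minimal sketch, assuming the data are in a pandas DataFrame and that -pi/2 is only ever used as the missing-value sentinel) is to turn the sentinel back into NaN before running the interpolation:

import numpy as np
import pandas as pd

def sentinel_to_nan(dl1: pd.DataFrame, sentinel: float = -np.pi / 2.) -> pd.DataFrame:
    """Treat the -pi/2 default pointing values as missing so they get interpolated too."""
    for col in ('alt_tel', 'az_tel'):
        dl1.loc[np.isclose(dl1[col], sentinel), col] = np.nan
    return dl1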

@vuillaut (Member, Author) commented Feb 27, 2020

@morcuended
To deal with such a run, I think the strategy would be to merge at the DL1 level and then run dl1_to_dl2, which should interpolate the missing values.

Do you mean run 1840? That is what I did to obtain that 'messy' plot. First, merge the DL1 files (including those subruns with no time info at all) and then run dl1_to_dl2, which interpolated taking the -pi/2 values into account.

Ah, sorry, I thought you merged DL2 files.

This is what I get:

import os

import matplotlib.pyplot as plt
import pandas as pd

# Import paths may differ slightly depending on the lstchain version.
from lstchain.io.io import dl1_params_lstcam_key
from lstchain.reco.utils import impute_pointing

dl1_dir = '/home/daniel.morcuende/data/real/DL1/20200118/pointing/'
file_list = sorted(os.path.join(dl1_dir, f) for f in os.listdir(dl1_dir)
                   if f.endswith('h5') and '1.1' in f)
dfs = {}
for i, f in enumerate(file_list):
    dfs[i] = pd.read_hdf(f, key=dl1_params_lstcam_key)
merged = pd.concat([df for i, df in dfs.items()])
impute_pointing(merged)
plt.figure(figsize=(12, 7))
plt.plot(merged.event_id, merged.alt_tel, label='interpolated on all run', color='black')
for i, df in dfs.items():
    plt.plot(df.event_id, df.alt_tel)
plt.legend()

[image]

x-axis=event_id
y-axis=alt_tel

@morcuended (Member)

This looks much better indeed :) Let me check what I'm doing wrong then

@vuillaut (Member, Author)

This looks much better indeed :) Let me check what I'm doing wrong then

@morcuended
I just thought that when you merge with the script, the data is not sorted by event_id across the run, so the interpolation goes berserk. I think that's what happened in your case.
I'll push a fix.
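In terms of the snippet shown above, the fix would essentially amount to sorting before interpolating (a sketch; the actual commit may differ):

# Sort the merged DL1 table by event_id so the linear interpolation
# sees the events of the run in order.
merged = merged.sort_values('event_id').reset_index(drop=True)
impute_pointing(merged)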

@morcuended (Member)

Ouch! Probably yes, I will try it with that. More generally, I was thinking about sorting the file list in:

file_list = [args.srcdir + '/' + f for f in os.listdir(args.srcdir) if f.endswith('.h5')]

by adding file_list.sort()

But again, there is no single way of merging files. You can merge only the subruns corresponding to a given run, or even several runs, hence sorting does not make sense anymore. Maybe there should be an option in lstchain_merge_hdf5_files.py to allow for merging on a run basis (parsing the run_number as an input option).
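A sketch of what such a run-based selection could look like (the --run-number option and the file-name pattern here are hypothetical):

import argparse
import glob
import os

parser = argparse.ArgumentParser()
parser.add_argument('--srcdir', required=True)
parser.add_argument('--run-number', type=int, default=None,
                    help='If given, merge only the subruns of this run.')
args = parser.parse_args()

file_list = sorted(glob.glob(os.path.join(args.srcdir, '*.h5')))
if args.run_number is not None:
    # Assumed file-name convention, e.g. ...Run01840.0033... in the DL1 file names.
    tag = 'Run{:05d}'.format(args.run_number)
    file_list = [f for f in file_list if tag in os.path.basename(f)]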

@vuillaut (Member, Author)

But again, there is no single way of merging files. You can merge only the subruns corresponding to a given run, or even several runs, hence sorting does not make sense anymore. Maybe there should be an option in lstchain_merge_hdf5_files.py to allow for merging on a run basis (parsing the run_number as an input option).

I like the idea of the run option if you can implement that.

I implemented the sort by event_id here. But this means that the method will fail when analysing a DL1 file with several runs inside (the sort by event_id will actually shuffle the events from different runs).

@morcuended (Member)

I like the idea of the run option if you can implement that.

Sure, will do that

@moralejo (Collaborator) commented Feb 28, 2020

The problem is that some subruns do not have a single valid timestamp. How do you get the pointing then? Well, we could interpolate across the whole run to guess the pointing of the invalid subruns.

Indeed, there is no other way than assigning "uniformly" distributed times to the events, unless one of the other times is available.

Ok, I saw you solved the problem with the weird interpolation, and the across-the-run interpolation also seems to work well enough - I withdraw my suggestion then!

@vuillaut (Member, Author)

@morcuended
If you can confirm that we now have the correct behaviour with the merged files, I'll merge.

@morcuended (Member)

@morcuended
If you can confirm that we now have the correct behaviour with the merged files, I'll merge.

Confirmed!

[image]

Let's merge it

@vuillaut (Member, Author)

Thank you so much for all the testing and feedback, @morcuended.
Before merging I will add a warning in case of identical event_id values in the file, so this is not forgotten.
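Such a check could look roughly like this (a sketch; the warning actually added before the merge may differ):

import warnings
import pandas as pd

def warn_on_duplicated_event_ids(dl1: pd.DataFrame):
    """Warn if event_id values repeat (e.g. several runs merged into one file),
    since sorting and interpolating by event_id is then unreliable."""
    n_duplicated = int(dl1['event_id'].duplicated().sum())
    if n_duplicated:
        warnings.warn('{} duplicated event_id values found; pointing interpolation '
                      'by event_id may be incorrect.'.format(n_duplicated))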

@vuillaut merged commit 4b46b7c into cta-observatory:master on Feb 28, 2020
@vuillaut deleted the impute_poin branch on February 28, 2020 13:44
Merging this pull request may close this issue: Error in dl1 to dl2 step when pointing info is missing.