Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add possibility to copy and not merge some keys by lstchain_merge_hdf5_files #1174

Merged
merged 7 commits into from
Nov 30, 2023

Conversation

FrancaCassol
Copy link
Collaborator

This is in particular necessary for the calibration keys in DL1 and DL2 files, which are the same for all subruns and should not be merged but simply copied from the first file of the list.
I wonder if we can already add them as default in the script...

@vuillaut
Copy link
Member

Hi
Should these keys be added to merging_checks to make sure they are indeed the same in all files? (open question)

@codecov
Copy link

codecov bot commented Oct 13, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (e82bd68) 74.39% compared to head (fd7aa94) 74.40%.

Files Patch % Lines
lstchain/io/io.py 85.71% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1174      +/-   ##
==========================================
+ Coverage   74.39%   74.40%   +0.01%     
==========================================
  Files         129      129              
  Lines       13195    13206      +11     
==========================================
+ Hits         9816     9826      +10     
- Misses       3379     3380       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

parser.add_argument(
'--keys-to-copy',
nargs="+", default=[''],
help='List of duplicated keys to be copied and not to be merged'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be a command line argument. We know which keys are event-wise and should be merged and which are global so shouldn't be merged.

Let's define this in the code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this would be the best, but since this seems a general merging function I was not sure we wanted to specify it inside the function. If we agree on that I prefer it also.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a general merging function for our data, we do know all possible keys.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm reopening this conversation because I think it is not resolved yet. Having this command line argument makes it a bit confusing at the same time as having a set of default keys to be copied. Do we really need this additional option? I think they can be simply defined in lstchain/io/io.py

@maxnoe
Copy link
Member

maxnoe commented Oct 19, 2023

This special handling of some keys is also needed to fix #1152

So we should just make a set of keys that are global and are copied from the first file (potentially checking that they are equal in all following files) and keys that are "event-like" so need to be merged.

parser.add_argument(
'--keys-to-copy',
nargs="+", default=[''],
help='List of duplicated keys to be copied and not to be merged'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm reopening this conversation because I think it is not resolved yet. Having this command line argument makes it a bit confusing at the same time as having a set of default keys to be copied. Do we really need this additional option? I think they can be simply defined in lstchain/io/io.py

lstchain/io/io.py Outdated Show resolved Hide resolved
lstchain/io/io.py Outdated Show resolved Hide resolved
@@ -332,15 +337,33 @@ def auto_merge_h5files(
else:
keys = set(nodes_keys)

copy_keys = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be this simple one liner:

keys_to_copy = set() if keys_to_copy is None else keys_to_copy.intersection(keys)

lstchain/io/io.py Outdated Show resolved Hide resolved
@FrancaCassol
Copy link
Collaborator Author

Hi @maxnoe , @morcuended , I added the default keys, but I prefer to keep the trailet in case of tests on new implementations. Do not hesitate to add changes if you have strong opinions on this. We need to merge this PR before next release, which should be done soon.

@rlopezcoto
Copy link
Contributor

@maxnoe , @morcuended are you ok with the changes so we can move on with this PR as well?

Copy link
Member

@morcuended morcuended left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me

@rlopezcoto rlopezcoto merged commit 38da653 into main Nov 30, 2023
9 checks passed
@rlopezcoto rlopezcoto deleted the do_not_merge_calibration_info branch November 30, 2023 10:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants