Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dl1 file naming to get rid of the stream number and "fits" extension #344

Merged
merged 15 commits into from
Apr 20, 2020

Conversation

rlopezcoto
Copy link
Contributor

@rlopezcoto rlopezcoto commented Apr 16, 2020

Fixes #301

To get rid of the .1. and fits I had to include two different paths depending on the name of the file. Another option is to read the input file and treat it differently if it is mc or real_data, but this one will work as long as we do not modify the naming of the files.

@rlopezcoto
Copy link
Contributor Author

rlopezcoto commented Apr 16, 2020

Also, I had some troubles with the __init__ because the file r0_to_dl1.py contains the function r0_to_dl1 (amongst others), then it cannot be included in the init as:
from .r0_to_dl1 import *
not sure what is the best solution here...

@codecov
Copy link

codecov bot commented Apr 16, 2020

Codecov Report

Merging #344 into master will increase coverage by 0.00%.
The diff coverage is 66.66%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #344   +/-   ##
=======================================
  Coverage   43.42%   43.43%           
=======================================
  Files          68       68           
  Lines        4638     4644    +6     
=======================================
+ Hits         2014     2017    +3     
- Misses       2624     2627    +3     
Impacted Files Coverage Δ
lstchain/reco/utils.py 65.83% <ø> (ø)
lstchain/reco/r0_to_dl1.py 69.23% <20.00%> (-0.73%) ⬇️
lstchain/reco/__init__.py 100.00% <100.00%> (ø)
lstchain/reco/disp.py 92.10% <100.00%> (ø)
lstchain/reco/volume_reducer.py 94.28% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e13b0b7...e82dfd6. Read the comment docs.

from .r0_to_dl1 import get_dl1, add_disp_to_parameters_table
from .dl1_to_dl2 import *
from .volume_reducer import *
from .disp import *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please no star imports without providing __all__. Neither editors nor code linters will be able to work properly (autocompletion / linting)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what do you mean, all files provide __all__, but some functions were missing though, this is what you mean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file does not have an __all__.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I get it. We need to modify all the __init__ files because none have __all__ and are making wildcard imports. Also modify some of the calls to avoid

Also, I had some troubles with the init because the file r0_to_dl1.py contains the function r0_to_dl1 (amongst others), then it cannot be included in the init as:
from .r0_to_dl1 import *

can we merge this one and I open an Issue/PR right afterwards to solve that in a different thread?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so @maxnoe shall we move on with this PR and I'll correct the different __init__ files afterwards?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

lstchain/reco/r0_to_dl1.py Outdated Show resolved Hide resolved
lstchain/reco/r0_to_dl1.py Outdated Show resolved Hide resolved
@rlopezcoto rlopezcoto mentioned this pull request Apr 16, 2020
@rlopezcoto
Copy link
Contributor Author

@maxnoe are you ok with the changes?

@rlopezcoto rlopezcoto changed the title Fix dl1 stream naming Fix dl1 file naming to get rid of the stream number and "fits" extension Apr 20, 2020
@rlopezcoto rlopezcoto merged commit 686c702 into cta-observatory:master Apr 20, 2020
@rlopezcoto rlopezcoto deleted the fix_dl1_stream_naming branch April 20, 2020 08:50
'dl1_' + os.path.basename(input_filename).rsplit('.', 1)[0] + '.h5'
)
if (input_filename.startswith('LST')):
output_filename = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like you are expecting a certain filename pattern here. Not just that the file starts with LST.

What is the format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, names are e.g.:
LST-1.4.Run01975.0005.fits.fz
so:
'LST-' + tel_id + '.' + stream_number + '.Run' + run_number + '.' + subrun_number + '.fits.fz'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think we should use a regex for this and check if this matches, I'll prepare a PR.

Copy link
Member

@maxnoe maxnoe Apr 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And you want to remove the stream here? So dl1_LST-{tel_id}.{run_number}.{subrun_number}.h5'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that is the misleading thing about streams, see #301 .
The reader, if all 4 streams are in the same folder, reads the 4 of them and aligns the event numbers from the four files. So when you process the data in the standard way, event though the input string may be LST-1.1.Run01975.0005.fits.fz, the dl1 output has all 4 streams, that was the reason to get rid of the stream_number

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this may be problematic if you process only one of the streams, because you would lose the stream number in the output, but this is related to how the reader processes the data and in any case, you process all the 4 streams in most of the cases

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THis is what I wrote here, or not? The intention of the code is to add the dl1_ prefix, replace the extension with .h5 and to remove the stream number, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that is right, but this is what is done in the code right now, right?
sorry, I read the original message that contained some stream_number in the output name

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AH yeah sorry, that was a typo I corrected immediately, but you probably read it in the mail

@morcuended
Copy link
Member

Is the stream number already removed in the master? From my latest test I think I'm still getting them:
dl1_LST-1.1.Run01844.0131.fits.h5

@rlopezcoto
Copy link
Contributor Author

rlopezcoto commented May 2, 2020

Grrr, I think that I know what happened there. The output file name is actually changed in:

output_filename = os.path.join(

(addint the dl1_ already in the lstchain_r0_to_dl1 script). Then when it checks this:
if input_filename.startswith('LST'):

the file name does not start with 'LST' anymore... Will this get solved with #383 @maxnoe ? Otherwise I can work it out.

@maxnoe
Copy link
Member

maxnoe commented May 2, 2020

It should be, yes.

@rlopezcoto
Copy link
Contributor Author

then let's wait for the merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Naming of DL1 files and dealing with data streams
3 participants