Small improvements to modeling script #114

@HealthyPear (Member) commented Mar 19, 2021

This PR collects some improvements that came out of developing the integration tests for the modeling part.

  • the split between train and test data has been simplified and improved by
    • using the corresponding scikit-learn function,
    • splitting on images instead of obs_id (a more "democratic" choice),
    • shuffling the images to counter shower reuse in CORSIKA
  • the I/O of the script has been improved:
    • input and output information can also be supplied via the CLI, which has lower priority than the config file
    • camera IDs are resolved in a similar way, with the following priority (from highest to lowest):
      • from the config file
      • from the training file
      • from the CLI
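
The priority order for the camera IDs can be sketched as a simple fallback chain. This is only an illustration of the rule stated above, not the code in `build_model.py`: the function name `resolve_camera_ids`, its arguments, and the camera names in the usage lines are hypothetical.

```python
from typing import List, Optional, Sequence


def resolve_camera_ids(
    from_config: Optional[Sequence[str]],
    from_training_file: Optional[Sequence[str]],
    from_cli: Optional[Sequence[str]],
) -> List[str]:
    """Return the camera IDs from the highest-priority source that provides any.

    Hypothetical helper: the sources are checked in the order given in the
    PR description (config file, then training file, then CLI).
    """
    sources = (
        ("config file", from_config),
        ("training file", from_training_file),
        ("CLI", from_cli),
    )
    for _name, camera_ids in sources:
        if camera_ids:  # skip sources that are None or empty
            return list(camera_ids)
    raise ValueError(
        "No camera IDs found in the config file, the training file, or the CLI"
    )


# Illustrative values only.
print(resolve_camera_ids(["LSTCam"], ["NectarCam"], ["FlashCam"]))
print(resolve_camera_ids(None, None, ["FlashCam"]))
```

With this ordering a value passed on the command line is used only when neither the config file nor the training file provides one.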

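The train/test split described in the first bullet can be sketched as follows, assuming the scikit-learn function in question is `train_test_split` (the description does not name it). The toy array, the helper name `split_images`, and the 80/20 fraction are assumptions made for the example, not values taken from the PR.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def split_images(images: np.ndarray, train_fraction: float = 0.8, seed: int = 42):
    """Split a table with one row per image into train and test parts.

    Hypothetical helper: rows are shuffled before splitting, so images that
    share an obs_id (or come from a reused shower) are spread over both parts.
    """
    return train_test_split(
        images,
        train_size=train_fraction,
        shuffle=True,       # shuffle individual images before splitting
        random_state=seed,  # fixed seed for a reproducible split
    )


# Toy stand-in for real data: 10 observations with 10 images each.
rng = np.random.default_rng(0)
obs_id = np.repeat(np.arange(10), 10)
feature = rng.normal(size=obs_id.size)
images = np.column_stack([obs_id, feature])

train, test = split_images(images)
print(train.shape, test.shape)
```

Because the split is done per image, the same obs_id can appear in both the training and the test sample, which is the intended difference from a split on obs_id.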
@HealthyPear added the labels enhancement (New feature or request), machine learning, and input / output (new features or issues regarding input and output formats) Mar 19, 2021

codecov bot commented Mar 19, 2021

Codecov Report

Merging #114 (e87557c) into master (e708fe7) will decrease coverage by 0.61%.
The diff coverage is 9.43%.

@@            Coverage Diff             @@
##           master     #114      +/-   ##
==========================================
- Coverage   39.22%   38.61%   -0.62%     
==========================================
  Files          22       22              
  Lines        1912     1950      +38     
==========================================
+ Hits          750      753       +3     
- Misses       1162     1197      +35     
Impacted Files                      Coverage Δ
protopipe/mva/train_model.py        15.68% <ø> (ø)
protopipe/scripts/build_model.py    11.29% <2.50%> (-3.93%) ⬇️
protopipe/pipeline/utils.py         50.90% <22.22%> (-1.66%) ⬇️
protopipe/mva/utils.py              12.31% <50.00%> (+0.97%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@HealthyPear mentioned this pull request Mar 19, 2021
2 tasks
@HealthyPear marked this pull request as ready for review March 19, 2021 14:30
@HealthyPear requested a review from kosack March 19, 2021 14:30
@HealthyPear merged commit 7632ef2 into cta-observatory:master Mar 26, 2021
@HealthyPear deleted the feature-improve_build_models_and_mva branch March 26, 2021 08:56