Enhance logic/configuration to handle occasional missing input files #1656

Open
22 tasks
georgemccabe opened this issue Jun 13, 2022 · 0 comments
Labels
alert: NEED ACCOUNT KEY Need to assign an account key to this issue alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle alert: NEED MORE DEFINITION Not yet actionable, additional definition required METplus: Configuration requestor: NOAA/EMC NOAA Environmental Modeling Center type: enhancement Improve something that it is currently doing

georgemccabe commented Jun 13, 2022

This came up during the METplus NOAA Telecon on 6/13/2022.

To summarize, an error is reported whenever an input file is not found and therefore the entire METplus run reports an error (non-zero exit code). Missing files may occasionally be expected and users may not want the entire run to fail in that case. More discussion is needed to determine the best way to allow for this situation.

See discussion from this meeting:

In METplus when a file is missing it outputs “ERROR: (command_builder.py:723) Could not find FCST/OBS_INPUT file”. METplus logs this and adds it to an error count. When METplus is finished running, it will output the total number of errors, if there are any. If there are errors, METplus will exit with an exit code of 1 (https://github.com/dtcenter/METplus/blob/61fd319423c074464795dd14d49047207c293baa/metplus/util/met_util.py#L217).
I was running some parallel jobs on WCOSS2 that were using pcp_combine to create 24 hour accumulations for different variables. It was using the production GFS data on WCOSS2, so only 7 days of data are kept around. I have the last forecast hour set to 384 since this is what we will want for the GFS in EVS, so there were a lot of missing files, which was okay and expected. I noticed in my log file “nid001007.cactus.wcoss2.ncep.noaa.gov: rank 0 exited with code 1”. I’m guessing this is from METplus exiting with exit code 1 because there were missing files. However, METplus did run successfully for the files that were available, so it doesn’t really feel like anything “failed”.
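
For readers unfamiliar with the behavior described above, here is a minimal Python sketch of the general pattern: every missing file is logged as an error and counted, and any non-zero count makes the run exit with code 1. This is not the METplus implementation; `ErrorCounter`, `find_inputs`, and `exit_code` are hypothetical names used only for illustration.

```python
import logging


class ErrorCounter:
    """Hypothetical helper that logs errors and keeps a running tally."""

    def __init__(self, logger):
        self.logger = logger
        self.errors = 0

    def log_error(self, message):
        self.logger.error(message)
        self.errors += 1


def find_inputs(paths, exists, counter):
    """Return the paths that exist; log and count an error for each one that does not.

    `exists` is any callable that takes a path and returns True or False
    (for example os.path.exists), passed in so the sketch is easy to test.
    """
    found = []
    for path in paths:
        if exists(path):
            found.append(path)
        else:
            counter.log_error(f"Could not find input file: {path}")
    return found


def exit_code(counter):
    """Any logged error makes the whole run report failure."""
    return 1 if counter.errors else 0
```

With this pattern, a run that processes every available file still exits with 1 as soon as a single expected file is absent, which is the situation reported above.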

For operational vx, some international models are more likely to have some files missing.

Logan: Is the treatment of missing files on WCOSS and WCOSS2 the same? Mallory has only seen this behavior since testing on WCOSS2.

Perry: Has a history of this with NCO for VSDB vx. A missing obs file for air quality vx caused the job to fail. NCO required him to change the logic so that we log missing files but actually return good status. Failed jobs trigger alarms for NCO, and we need a way to avoid that.

George: METplus is designed to report an error for the entire run if anything goes wrong at any point. Users often want to know if it failed to find files, because a missing file often means there is a mistake in their filename templates or directory paths. We could potentially add a config option to report a warning instead of an error in some cases, but we would have to be careful in determining the criteria. For example, if no files are found at all (because of a typo in the path), or if the user has configured an invalid value for a MET config file, the user would likely still want the run to fail.

Could we define an allowable percentage of missing files… and only trigger an error when that threshold is reached?
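
One way the threshold idea could work is sketched below in Python. It is only an illustration of the proposal, not an existing METplus option: the function name `missing_file_status` and the parameter `max_missing_fraction` are assumptions. It also folds in the point raised above that finding no files at all should still fail, since that usually indicates a bad path or template.

```python
def missing_file_status(n_expected, n_found, max_missing_fraction=0.5):
    """Return an exit status (0 = OK, 1 = error) based on how many files are missing.

    Hypothetical rule:
      * no files found at all -> error (likely a typo in the path or template)
      * missing fraction reaches the allowed threshold -> error
      * otherwise -> OK (missing files would be logged as warnings)
    """
    if n_expected <= 0:
        raise ValueError("n_expected must be positive")
    if not 0 <= n_found <= n_expected:
        raise ValueError("n_found must be between 0 and n_expected")

    if n_found == 0:
        return 1

    missing_fraction = (n_expected - n_found) / n_expected
    return 1 if missing_fraction >= max_missing_fraction else 0
```

For example, with the default threshold of 0.5, a run that finds 6 of 10 expected files would return 0, while a run that finds only 5 of 10 would return 1. Whether the comparison should be "reaches" or "exceeds" the threshold is a design choice that would need to be settled.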

Logan: He does radar verification for 5 different models using the same configuration with the same set of forecast leads for all. That results in many errors about missing files because not all models have output for the same set of times. He could define more precise configuration files for each model, but doing it this way is easier for now.

George: There is an option to skip certain valid times: https://metplus.readthedocs.io/en/latest/Users_Guide/systemconfiguration.html#skipping-times
This could be added temporarily while files may be missing and removed once more than 7 days' worth of files are available.
John HG: When? Either METplus-5.0.0-beta2 on 8/3 or beta3 on 9/14.

Describe the Enhancement

Provide a description of the enhancement request here.

Time Estimate

Estimate the amount of work required here.
Issues should represent approximately 1 to 3 days of work.

Sub-Issues

Consider breaking the enhancement down into sub-issues.

  • Add a checkbox for each sub-issue here.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Repository and/or Organization level Project(s) or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

Enhancement Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Add any new Python packages to the METplus Components Python Requirements table.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Linked issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@georgemccabe georgemccabe added type: enhancement Improve something that it is currently doing requestor: NOAA/EMC NOAA Environmental Modeling Center alert: NEED MORE DEFINITION Not yet actionable, additional definition required alert: NEED ACCOUNT KEY Need to assign an account key to this issue alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle METplus: Configuration labels Jun 13, 2022
@georgemccabe georgemccabe self-assigned this Jun 13, 2022