Create the new Gen-Ens-Prod tool for ensemble product generation. #1904

Closed
10 of 23 tasks
JohnHalleyGotway opened this issue Sep 2, 2021 · 2 comments · Fixed by #1927, #1931 or #2087
Labels

  • MET: Ensemble Verification
  • MET: PreProcessing Tools (Grid)
  • priority: blocker (Blocker)
  • reporting: DTC NOAA R2O (NOAA Research to Operations DTC Project)
  • requestor: METplus Team (METplus Development Team)
  • required: FOR DEVELOPMENT RELEASE (Required to be completed in the development release for the assigned project)
  • type: new feature (Make it do something new)

Comments

Collaborator

JohnHalleyGotway commented Sep 2, 2021

Describe the New Feature

Create a new tool for ensemble product generation named Gen-Ens-Prod. This tool requires a configuration file and should perform the ensemble product generation that Ensemble-Stat currently performs for the fields in its "ens" dictionary. Consider renaming the "ens" dictionary to "data" to be consistent with the conventions of the Grid-Diag tool.

This tool does not process observations, but it must read climatology mean and standard deviation data so that climatological distribution percentile thresholds (e.g. >CDP75) can be used.
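
To make the role of the climatology data concrete, here is a minimal Python sketch of how a threshold such as >CDP75 could be resolved at a single grid point. It assumes the climatological distribution is normal with the given mean and standard deviation. That assumption, and the function names `cdp_threshold_value` and `exceeds_cdp`, are illustrative and not taken from the MET source.

```python
from statistics import NormalDist


def cdp_threshold_value(climo_mean, climo_stdev, percentile):
    """Return the value at the given percentile of the climo distribution.

    Assumption of this sketch: the climatological distribution at the
    grid point is normal with the supplied mean and standard deviation.
    """
    dist = NormalDist(mu=climo_mean, sigma=climo_stdev)
    return dist.inv_cdf(percentile / 100.0)


def exceeds_cdp(value, climo_mean, climo_stdev, percentile):
    """Evaluate a '>CDPnn' style threshold for one forecast value."""
    return value > cdp_threshold_value(climo_mean, climo_stdev, percentile)


# Hypothetical grid point: climo mean 10.0, climo standard deviation 2.0.
threshold_75 = cdp_threshold_value(10.0, 2.0, 75)
print(round(threshold_75, 3))
print(exceeds_cdp(12.0, 10.0, 2.0, 75))
```

Because the threshold depends on the climatology at each grid point, both the mean and the standard deviation fields are needed before any >CDP comparison can be made.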

Acceptance Testing

Ensure that other MET tools can read the NetCDF output created by this tool.

Time Estimate

2 weeks.

Sub-Issues

Consider breaking the new feature down into sub-issues.
Sub-issues will likely be required. Not sure what they are yet.

  • Add a checkbox for each sub-issue here.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Split 2793541, 2700041, 2799991

Define the Metadata

Assignee

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Repository and/or Organization level Project(s) or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

New Feature Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Linked issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@JohnHalleyGotway JohnHalleyGotway added type: new feature Make it do something new priority: blocker Blocker requestor: UK Met Office United Kingdom Met Office alert: NEED ACCOUNT KEY Need to assign an account key to this issue required: FOR DEVELOPMENT RELEASE Required to be completed in the development release for the assigned project MET: PreProcessing Tools (Grid) MET: Ensemble Verification labels Sep 2, 2021
@JohnHalleyGotway JohnHalleyGotway added this to the MET 10.1.0 milestone Sep 2, 2021
@JohnHalleyGotway JohnHalleyGotway self-assigned this Sep 2, 2021
@JohnHalleyGotway JohnHalleyGotway added the alert: NEED MORE DEFINITION Not yet actionable, additional definition required label Sep 2, 2021
JohnHalleyGotway added a commit that referenced this issue Sep 10, 2021
…f the existing Ensemble-Stat tool, just renamed to Gen-Ens-Prod, including the newly added documentation section.
JohnHalleyGotway added a commit that referenced this issue Sep 10, 2021
…S_ to _GEP_ to get rid of RTD documentation warnings.
JohnHalleyGotway added a commit that referenced this issue Sep 10, 2021
Collaborator Author

JohnHalleyGotway commented Sep 22, 2021

Progress. I have an initial version of gen_ens_prod working in my local feature_1904_gen_ens_prod branch. In Ensemble-Stat, the ensemble_flag dictionary can only be set once and is applied to all the fields in the "ens" dictionary, so all products are derived for all fields. I recommend that we allow ensemble_flag to be specified separately for each "ens" field so that users can control the output for each field individually. Here's the usage statement:

Usage: gen_ens_prod
	-ens file_1 ... file_n | ens_file_list
	-out file
	-config file
	[-ctrl file]
	[-log file]
	[-v level]

	where	"-ens file_1 ... file_n" are the gridded ensemble data files to be used (required).
		"ens_file_list" is an ASCII file containing a list of ensemble member file names (required).
		"-out file" is the NetCDF output file for the derived ensemble products (required).
		"-config file" is a GenEnsProdConfig file containing the desired configuration settings (required).
		"-ctrl file" is the gridded ensemble control data file included in mean but excluded from the spread (optional).
		"-log file" outputs log messages to the specified file (optional).
		"-v level" overrides the default level of logging (2) (optional).
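
As an illustration of the usage statement above, here is a minimal Python sketch that assembles a gen_ens_prod command line using only the options it lists. The helper name `build_gen_ens_prod_command` and every file name are hypothetical. The sketch checks the required arguments up front instead of relying on the tool to reject them.

```python
import shlex


def build_gen_ens_prod_command(ens_files, out_file, config_file,
                               ctrl_file=None, log_file=None, verbosity=None):
    """Build the argument list for a gen_ens_prod call.

    Only options from the usage statement are used. The helper itself
    is hypothetical and not part of MET.
    """
    # -ens, -out and -config are marked as required in the usage statement.
    if not ens_files:
        raise ValueError("at least one ensemble file (or a file list) is required")
    if not out_file:
        raise ValueError("-out requires a non-empty output file name")
    if not config_file:
        raise ValueError("-config requires a non-empty config file name")

    cmd = ["gen_ens_prod", "-ens", *ens_files,
           "-out", out_file, "-config", config_file]

    # Optional arguments are appended only when they are supplied.
    if ctrl_file:
        cmd += ["-ctrl", ctrl_file]
    if log_file:
        cmd += ["-log", log_file]
    if verbosity is not None:
        cmd += ["-v", str(verbosity)]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_gen_ens_prod_command(
    ["member_01.grb", "member_02.grb"],
    out_file="gen_ens_prod_out.nc",
    config_file="GenEnsProdConfig",
    ctrl_file="control.grb",
    verbosity=3,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where gen_ens_prod is installed.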

JohnHalleyGotway added a commit that referenced this issue Sep 22, 2021
…omplete. Continue edits in the configuration section.
@TaraJensen TaraJensen added reporting: DTC NOAA R2O NOAA Research to Operations DTC Project and removed alert: NEED ACCOUNT KEY Need to assign an account key to this issue labels Sep 23, 2021
JohnHalleyGotway added a commit that referenced this issue Sep 23, 2021
…data for use in defining climo cdp threshold types.
Collaborator Author

JohnHalleyGotway commented Sep 23, 2021

List of enhancements to make:

  • DONE 9/23/2021: Let ensemble_flag be specified separately for each ens.field array entry.

  • DONE 9/24/2021: Add support for climo and climo_cdp options in the ensemble_flag:

ensemble_flag = {
...
   climo     = TRUE;
   climo_cdp = TRUE;
...
}

Since this tool can read climo data and use them to define climo distribution percentile thresholds (e.g. >CDP50), we should be able to write these fields to the output, as Grid-Stat does.

  • DONE 9/24/2021: Update logic for handling the ensemble control member. If the -ctrl option is used, the tool reads all the ensemble fields from it. This assumes the control member uses the same file format as the other members. The only special logic is that the control member is excluded from the computation of the spread, but it is used for everything else.

  • DONE 9/28/21: Update the gen_ens_prod documentation.

  • DONE 9/27/21: Add unit_gen_ens_prod.xml unit tests. Call the tool twice, once with and once without the -ctrl option. Confirm that the ensemble spreads match but the means differ.

  • [XXX] Write NetCDF output following the CF convention rather than MET's internal format. Consider moving this into a new GitHub issue instead.

NOTE Will not make this change as part of this issue. The changes required would be mostly in the common library code and a piecemeal approach isn't a good idea here. Instead make these changes as part of #660.
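
The control member handling in the list above (included in the mean, excluded from the spread) can be sketched for a single grid point as follows. This is a minimal Python illustration, not the MET implementation. The function name `ensemble_mean_and_spread` is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption of the sketch.

```python
import math


def ensemble_mean_and_spread(members, control=None):
    """Return (mean, spread) for one grid point.

    The mean uses the members plus the control value, if given.
    The spread uses the members only.
    """
    n_spread = len(members)
    if n_spread < 2:
        raise ValueError("need at least two members to compute a spread")

    # Sums used for the spread never include the control member.
    sum_spread = sum(members)
    sum_sq_spread = sum(v * v for v in members)

    # Sums used for the mean include the control member when present,
    # so two versions of the count and sum are tracked.
    n_mean, sum_mean = n_spread, sum_spread
    if control is not None:
        n_mean += 1
        sum_mean += control

    mean = sum_mean / n_mean

    # Assumption: sample variance with an (n - 1) denominator.
    variance = (sum_sq_spread - sum_spread ** 2 / n_spread) / (n_spread - 1)
    spread = math.sqrt(max(variance, 0.0))
    return mean, spread


members = [1.0, 2.0, 3.0]
print(ensemble_mean_and_spread(members))
print(ensemble_mean_and_spread(members, control=6.0))
```

Running the function with and without a control value mirrors the unit test check described above: the spread is unchanged while the mean shifts.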

JohnHalleyGotway added a commit that referenced this issue Sep 24, 2021
…() followed by add_const(double, int). That saves code in gen_ens_prod.
JohnHalleyGotway added a commit that referenced this issue Sep 27, 2021
…the unit tests. Still need more work to get climo working in unit tests.
JohnHalleyGotway added a commit that referenced this issue Sep 27, 2021
JohnHalleyGotway added a commit that referenced this issue Sep 27, 2021
…climo data to the fractional_coverage() function.
JohnHalleyGotway added a commit that referenced this issue Sep 27, 2021
…ge() function. Also update the gen_ens_prod unit tests to call it once with/without a control member. I manually confirmed that the spread is the same in the output but the means differ.
JohnHalleyGotway added a commit that referenced this issue Sep 28, 2021
…move unused rng and tmp_dir config entries. Simplify variable names.
@JohnHalleyGotway JohnHalleyGotway removed the alert: NEED MORE DEFINITION Not yet actionable, additional definition required label Sep 28, 2021
@JohnHalleyGotway JohnHalleyGotway linked a pull request Sep 28, 2021 that will close this issue
12 tasks
@JohnHalleyGotway JohnHalleyGotway added requestor: METplus Team METplus Development Team and removed requestor: UK Met Office United Kingdom Met Office labels Sep 28, 2021
JohnHalleyGotway added a commit that referenced this issue Oct 1, 2021
…emble sum is used to compute both the mean and the standard deviation. Since we include control member in the mean but not the standard deviation, we need to track two different versions of that sum.
JohnHalleyGotway added a commit that referenced this issue Oct 1, 2021
…ude support was added to develop. Had to manually resolve some conflicts.
JohnHalleyGotway added a commit that referenced this issue Oct 1, 2021
… fractional_coverage() utility functions as their signatures were changed by recent enhancements in develop.
JohnHalleyGotway added a commit that referenced this issue Oct 2, 2021
* Per #1904, add gen_ens_prod tool. Note that this is a complete copy of the existing Ensemble-Stat tool, just renamed to Gen-Ens-Prod, including the newly added documentation section.

* Per #1904, update the gen-ens-prod documentation labels, switching _ES_ to _GEP_ to get rid of RTD documentation warnings.

* Per #1904, more renaming of EnsembleStat to GenEnsProd in the code and add the default configuration file.

* Per #1904, add updated version of the MET flowchart for version 10.1.0.

* Per #1904, add make directives for running a sample call to gen_ens_prod.

* Per #1904, updates to the default Gen-Ens-Prod config file.

* Per #1904, strip out the logic from Ensemble-Stat that does not apply to Gen-Ens-Prod.

* Per #1904, updates to the Gen-Ens-Prod documentation. These are not complete. Continue edits in the configuration section.

* Per #1904, fix command line in example to use -out instead of -outdir.

* Per #1904, removing lots of unneeded code. Update to read climatology data for use in defining climo cdp threshold types.

* Per #1904, add climo and climo_cdp options to ensemble_flag.

* Per #1904, update gen_ens_prod to actually write climo and climo_cdp outputs.

* Per #1904, add NumArray::set_const(double, int) to quickly call erase() followed by add_const(double, int). That saves code in gen_ens_prod.

* Per #1904, add support for the ensemble control member. Exclude it from the spread.

* Per #1904, update call to gen_ens_prod for make test and also update the unit tests. Still need more work to get climo working in unit tests.

* Per #1904, update the fractional_coverage() function to handle climo mean/stdev for CDP type thresholds.

* Per #1904, update Grid-Stat, Ensemble-Stat, and Gen-Ens-Prod to pass climo data to the fractional_coverage() function.

* Per #1904, fix logic for handling climo data in the fractional_coverage() function. Also update the gen_ens_prod unit tests to call it once with/without a control member. I manually confirmed that the spread is the same in the output but the means differ.

* Per #1904, add the make test script logic to unit_met_test_scripts.xml

* Per #1904, complete initial version of docs for gen_ens_prod tool. Remove unused rng and tmp_dir config entries. Simplify variable names.

* Per #1904, fix names to get it compiling and remove unneeded gsl include.

* Per #1904, simplify variable names.

* Per #1904, correct the logic for handling the control member. The ensemble sum is used to compute both the mean and the standard deviation. Since we include control member in the mean but not the standard deviation, we need to track two different versions of that sum.

* Per #1904, in gen_ens_prod.cc correct calls to the smooth_field() and fractional_coverage() utility functions as their signatures were changed by recent enhancements in develop.
@JohnHalleyGotway JohnHalleyGotway linked a pull request Oct 4, 2021 that will close this issue
12 tasks
JohnHalleyGotway added a commit that referenced this issue Mar 6, 2022
…cking the length of config_file instead of out_file. This became obvious when running gen_ens_prod without the -out option. That run segfaulted because it tried to create an output file using an empty string.
@JohnHalleyGotway JohnHalleyGotway linked a pull request Mar 6, 2022 that will close this issue
15 tasks
JohnHalleyGotway added a commit that referenced this issue Mar 7, 2022
…cking the length of config_file instead of out_file. This became obvious when running gen_ens_prod without the -out option. That run segfaulted because it tried to create an output file using an empty string. (#2087)
@JohnHalleyGotway JohnHalleyGotway changed the title Create new Gen-Ens-Prod tool for ensemble product generation. Create the new Gen-Ens-Prod tool for ensemble product generation. Mar 10, 2022