Add support for new point_weight_flag to the Point-Stat and Ensemble-Stat tools #2279

Open
6 of 23 tasks
JohnHalleyGotway opened this issue Sep 23, 2022 · 6 comments · May be fixed by #2993
Assignees
Labels
MET: PreProcessing Tools (Point) priority: high High Priority requestor: UK Met Office United Kingdom Met Office required: FOR OFFICIAL RELEASE Required to be completed in the official release for the assigned milestone type: new feature Make it do something new
Milestone

Comments

@JohnHalleyGotway
Collaborator

JohnHalleyGotway commented Sep 23, 2022

Describe the New Feature

The MET Grid-Stat tool supports a configuration option named grid_weight_flag to define weights for computing statistics aggregated over multiple grid points. The grid weighting is based on the area each grid point represents, giving larger weight to grid boxes with larger areas.

This task is to develop a method for weighting the aggregation of point observations. The same basic motivation applies: avoid overemphasizing areas with dense observations and underemphasizing areas with sparse observations.

Consider also whether point_weight_flag should be added to Stat-Analysis jobs when aggregating MPR lines to compute partial sums, contingency tables, and statistics.

This request originally arose when aggregating SEEPS for individual stations into a spatial summary. The UK Met Office defines weights for that aggregation based on the spatial density of those stations. However, those weights are pre-defined and static since the stations they use are consistent run-to-run.

Recommend that when implementing this in MET, the weights NOT be static, in general. Instead, compute them dynamically for each verification task based on the locations of the observations being processed. The number and location of point observations can change dramatically from run to run based on the masking region, variable, and data source. That said, recomputing the weights in each run would likely be slower. Recommend that when a mask.sid list of station IDs is specified, we provide an option to define a fixed station weight.
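The algorithm for computing weights on the fly is still to be clarified in this issue, so the following is only one possible scheme, shown to make the idea concrete. It assumes each observation is weighted by the inverse of the number of observations within a fixed great-circle radius, then normalized. The function names, the radius_km parameter, and the inverse-count rule are all hypothetical and are not taken from MET.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed spherical


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def density_weights(points, radius_km=100.0):
    """Hypothetical density weighting: inverse neighbor count, normalized to sum to 1.

    points is a list of (lat, lon) tuples. Each point counts itself as a
    neighbor, so the count is never zero.
    """
    if not points:
        return []
    raw = []
    for lat_i, lon_i in points:
        count = sum(
            1
            for lat_j, lon_j in points
            if great_circle_km(lat_i, lon_i, lat_j, lon_j) <= radius_km
        )
        raw.append(1.0 / count)
    total = sum(raw)
    return [w / total for w in raw]


# Three clustered stations and one isolated station (made-up coordinates).
obs = [(40.0, -105.0), (40.1, -105.1), (40.05, -104.95), (35.0, -97.0)]
weights = density_weights(obs)
```

With this rule the three clustered stations share the same total weight as the single isolated one. The pairwise loop is quadratic in the number of observations, which illustrates why recomputing weights in every run could be slower than using fixed station weights.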

The tasks for this issue include:

  • Collaborating with @RachelNorth and @mpm-meto to clarify the algorithm for defining these weights.
  • Add point_weight_flag configuration option for each verification task in Point-Stat and Ensemble-Stat with a default value of NONE, meaning apply a weight of 1 to all points.
  • Support setting point_weight_flag equal to DENSITY to define the weights on the fly based on station location.
  • When mask.sid is set to a file and station names are read from that file, add an option for a raw weight to be specified for that station (perhaps SID(weight) with a numeric weight?). When aggregating across multiple stations, note that the true weight should be computed as the raw station weight divided by the sum of the weights of all points.
  • Set the existing weights in the PairDataPoint and PairDataEnsemble classes and ensure that those weights are used correctly in the computation of statistics.
  • In particular, ensure that these weights are used in the aggregation of SEEPS_MPR data to compute the aggregated SEEPS data.
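The station-weight bullet above can be sketched as follows. This is a minimal illustration, assuming the tentative SID(weight) syntax floated in the list and assuming that a station listed without a weight gets a raw weight of 1. The regular expression, function names, and station IDs are hypothetical and do not reflect MET's parser.

```python
import re

# Hypothetical pattern for entries such as "KDEN(2.5)" or a bare "KDEN".
_SID_PATTERN = re.compile(r"^\s*([^\s()]+)\s*(?:\(\s*([^()]+?)\s*\))?\s*$")


def parse_sid_entry(entry, default_weight=1.0):
    """Split 'SID(weight)' into (sid, raw_weight); a bare 'SID' gets the default."""
    match = _SID_PATTERN.match(entry)
    if match is None:
        raise ValueError(f"cannot parse station entry: {entry!r}")
    sid, weight_text = match.groups()
    weight = default_weight if weight_text is None else float(weight_text)
    return sid, weight


def normalize_weights(raw_weights):
    """True weight = raw station weight / sum of the raw weights of all points."""
    total = sum(raw_weights.values())
    if total <= 0:
        raise ValueError("sum of raw weights must be positive")
    return {sid: w / total for sid, w in raw_weights.items()}


# Made-up station list mixing weighted and unweighted entries.
entries = ["KDEN(2.0)", "KBOU(1.0)", "KAPA"]
raw = dict(parse_sid_entry(e) for e in entries)
weights = normalize_weights(raw)
```

Normalizing at aggregation time, rather than when the file is read, matters because the set of stations that actually contribute pairs can differ from the set listed in the mask file.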

Acceptance Testing

List input data types and sources.
Describe tests required for new functionality.

Time Estimate

1 week.

Sub-Issues

Consider breaking the new feature down into sub-issues.

  • Add a checkbox for each sub-issue here.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Split between MetOffice and NOAA R2O keys.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Repository and/or Organization level Project(s) or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

New Feature Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Linked issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.
@JohnHalleyGotway
Collaborator Author

Related to Voroni Tesselations in MET #2661

@JohnHalleyGotway JohnHalleyGotway added the required: FOR OFFICIAL RELEASE Required to be completed in the official release for the assigned milestone label May 15, 2024
@JohnHalleyGotway JohnHalleyGotway self-assigned this Aug 22, 2024
@j-opatz
Contributor

j-opatz commented Oct 8, 2024

Given the time remaining in this beta, the time remaining for this project's completion, and the associated funding, now seems like a good time to coordinate with the UK Met Office POCs and determine how much can be done to consider this issue closed.

I will reach out to the necessary contacts later today.

The remaining material will be captured for future work and future funding sources.

@JohnHalleyGotway
Collaborator Author

JohnHalleyGotway commented Oct 9, 2024

Task checklist:

  • Update library code to parse and store lists of station IDs formatted as station_name(weight), where weight is the numeric weight to be applied.
  • Add point_weight_flag config options to Point-Stat and Ensemble-Stat config files.
  • Add point_weight_flag config options to Point-Stat and Ensemble-Stat code.
  • If point_weight_flag = SID, apply the station ID weights when computing statistics.
  • Add a unit test to demonstrate the application of point_weight_flag = SID.
  • Update the MET User's Guide documentation.
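As a rough illustration of the "apply the station ID weights when computing statistics" task, the sketch below computes a weighted mean error and weighted RMSE from matched pairs. It assumes weights are normalized over the pairs actually present and that stations without a weight default to 1. The function name, the choice of statistics, and the data are hypothetical, and MET's internal formulas may differ.

```python
import math


def weighted_stats(pairs, sid_weights, default_weight=1.0):
    """Weighted ME and RMSE for (sid, forecast, observation) pairs.

    sid_weights maps station ID to raw weight; weights are normalized
    over the pairs being aggregated before use.
    """
    pairs = list(pairs)
    raw = [sid_weights.get(sid, default_weight) for sid, _, _ in pairs]
    total = sum(raw)
    if total <= 0:
        raise ValueError("sum of raw weights must be positive")
    norm = [w / total for w in raw]
    mean_error = sum(w * (f - o) for w, (_, f, o) in zip(norm, pairs))
    mse = sum(w * (f - o) ** 2 for w, (_, f, o) in zip(norm, pairs))
    return {"ME": mean_error, "RMSE": math.sqrt(mse)}


# Made-up matched pairs: (station ID, forecast, observation).
pairs = [("KDEN", 12.0, 10.0), ("KBOU", 9.0, 10.0), ("KAPA", 11.0, 10.0)]
stats = weighted_stats(pairs, {"KDEN": 2.0, "KBOU": 1.0, "KAPA": 1.0})
unweighted = weighted_stats(pairs, {})
```

Comparing stats with unweighted shows how doubling one station's weight shifts the aggregate; a unit test for point_weight_flag = SID could check a similar difference.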

JohnHalleyGotway added a commit that referenced this issue Oct 9, 2024
@JohnHalleyGotway
Collaborator Author

@georgemccabe and @j-opatz, I initially wrote this issue up to let point_weight_flag be set separately for each verification task, but I'm wondering whether that level of control is actually needed. I'll note that for Grid-Stat and Ensemble-Stat, grid_weight_flag can only be set once, at the highest level of config file context. For consistency, I suspect we should do the same with point_weight_flag: only support setting it once, to be applied to the whole run. Practically speaking, in this initial implementation, it will only have an effect when a list of station IDs is provided with the weights pre-defined. In the future, we'll compute the weights on the fly based on the actual density of the observations for each verification task.

Any feedback on this detail?

JohnHalleyGotway added a commit that referenced this issue Oct 10, 2024
…mble-Stat config file and tweaking whitespace.
JohnHalleyGotway added a commit that referenced this issue Oct 10, 2024
…config class. Also remove sue unneeded wgt_dp argument for the add_point_obs() functions. Plan to add logic to set the point weights only AFTER all the observations have been collected for each verification task.
JohnHalleyGotway added a commit that referenced this issue Oct 10, 2024
@j-opatz
Contributor

j-opatz commented Oct 10, 2024

Any feedback on this detail?

@JohnHalleyGotway your new direction, keeping point_weight_flag restricted to a highest level config option, sounds like the reasonable path to me. Given that grid_weight_flag behaves in the same manner, I'd rather not have one allow individual verification process adjustments while the other does not.

@JohnHalleyGotway JohnHalleyGotway removed alert: NEED MORE DEFINITION Not yet actionable, additional definition required alert: NEED ACCOUNT KEY Need to assign an account key to this issue alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle labels Oct 10, 2024
JohnHalleyGotway added a commit that referenced this issue Oct 11, 2024
…it_point_weight.xml unit test to run Point-Stat on scalar and probability inputs weighting the stations by their elevation. Still need to add Ensemble-Stat calls.
@JohnHalleyGotway
Collaborator Author

@j-opatz note that I added a sanity check in the code. If point_weight_flag = SID and the user is doing station ID masking by setting the mask.sid config option, but that mask contains no numeric weights, this warning message is printed:

DEBUG 4: Applying point weights for the "SID_CONUS_ADPSFC_ELEV" station ID masking region.
WARNING: 
WARNING: PairBase::set_point_weight() -> station ID point weighting requested but no weights were defined in the "SID_CONUS_ADPSFC_ELEV" station ID mask. Using default weights of 1.
WARNING:

I suppose a user could specify multiple station ID masks and request weighting, but define weights for only one or two of the masks rather than all of them. In that case, they might not want to see this warning.

Should I keep it as a warning or switch to a debug log message?
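The sanity check described in this comment can be sketched as below. This is an illustration only, not the PairBase::set_point_weight() code: the function name, its arguments, and the rule that stations missing from a partially weighted mask default to 1 are assumptions. The message text is taken from the log output above.

```python
import logging

logger = logging.getLogger("point_weight_sketch")


def resolve_point_weights(mask_name, station_ids, mask_weights, point_weight_flag):
    """Return one raw weight per station for a single station ID mask.

    mask_weights maps station ID to weight as parsed from the mask and is
    empty when the mask defines no numeric weights.
    """
    if point_weight_flag != "SID":
        # NONE: every point gets a weight of 1.
        return [1.0] * len(station_ids)
    if not mask_weights:
        # Weighting was requested but this mask carries no weights.
        logger.warning(
            'station ID point weighting requested but no weights were defined '
            'in the "%s" station ID mask. Using default weights of 1.',
            mask_name,
        )
        return [1.0] * len(station_ids)
    # Assumption: stations without an explicit weight fall back to 1.
    return [mask_weights.get(sid, 1.0) for sid in station_ids]


no_weights = resolve_point_weights("SID_CONUS_ADPSFC_ELEV", ["A", "B"], {}, "SID")
some_weights = resolve_point_weights("SID_CONUS_ADPSFC_ELEV", ["A", "B"], {"A": 2.0}, "SID")
```

Because the check runs once per mask, a run with several masks where only some define weights would emit the message for each unweighted mask. Switching logger.warning to logger.debug is the only change needed in this sketch to demote it.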

@JohnHalleyGotway JohnHalleyGotway linked a pull request Oct 14, 2024 that will close this issue
17 tasks
Projects
Status: 🏗 In progress

2 participants