Feature #1371 series_analysis #2951

Merged
merged 47 commits into from
Aug 29, 2024
c6b4cab
Per #1371, add -input command line argument and add support for ALL f…
JohnHalleyGotway Jul 29, 2024
f49d9e9
Per #1371, rename the -input command line option as -aggregate instead
JohnHalleyGotway Jul 30, 2024
cb9410c
Merge remote-tracking branch 'origin/develop' into feature_1371_serie…
JohnHalleyGotway Jul 30, 2024
5f12f30
Per #1371, work in progress
JohnHalleyGotway Jul 30, 2024
5b5122d
Merge remote-tracking branch 'origin/develop' into feature_1371_serie…
JohnHalleyGotway Jul 30, 2024
ea1b00a
Per #1371, just comments
JohnHalleyGotway Aug 1, 2024
d92b2df
Per #1371, working on aggregating CTC counts
JohnHalleyGotway Aug 1, 2024
94abe8f
Per #1371, work in progress
JohnHalleyGotway Aug 1, 2024
8e33a49
Per #1371, update timing info using time stamps in the aggr file
JohnHalleyGotway Aug 1, 2024
6364c99
Per #1371, close the aggregate data file
JohnHalleyGotway Aug 1, 2024
fe26bf1
Per #1371, define set_event() and set_nonevent() member functions
JohnHalleyGotway Aug 7, 2024
7980c3e
Per #1371, add logic to aggregate MCTC and PCT counts
JohnHalleyGotway Aug 7, 2024
1891cea
Merging changes from develop
JohnHalleyGotway Aug 9, 2024
4c1ac83
Merge remote-tracking branch 'origin/develop' into feature_1371_serie…
JohnHalleyGotway Aug 9, 2024
a24485b
Per #1371, work in progress aggregating all the line statistics types…
JohnHalleyGotway Aug 9, 2024
ca3b2b1
Per #1371, switch to using get_stat() functions
JohnHalleyGotway Aug 12, 2024
48eb23c
Per #1371, work in progress. More consolidation
JohnHalleyGotway Aug 13, 2024
c0ae8ae
Per #1371, correct expected output file name
JohnHalleyGotway Aug 13, 2024
a6ffe06
Per #1371, consistent regridding log messages and fix the Series-Anal…
JohnHalleyGotway Aug 13, 2024
55e05ba
Per #1371, check the return status when opening the aggregate file.
JohnHalleyGotway Aug 13, 2024
0aec0ca
Per #1371, fix prc/pjc typo
JohnHalleyGotway Aug 13, 2024
2fba527
Per #1371, fix the series_analysis PCT aggregation logic and add a te…
JohnHalleyGotway Aug 13, 2024
8f7a80c
Per #1371, resolve a few SonarQube findings
JohnHalleyGotway Aug 13, 2024
b4b8bb0
Per #1371, make use of range-based for loop, as recommended by SonarQube
JohnHalleyGotway Aug 13, 2024
9c9e945
Merge remote-tracking branch 'origin/develop' into feature_1371_serie…
JohnHalleyGotway Aug 15, 2024
a864f29
Per #1371, update series-analysis to apply the valid data threshold p…
JohnHalleyGotway Aug 15, 2024
47db1ce
Per #1371, update series_analysis to buffer data and write it all at …
JohnHalleyGotway Aug 15, 2024
def857c
Per #1371, add useful error message when required aggregation variabl…
JohnHalleyGotway Aug 16, 2024
2ea28a0
Per #1371, print a Debug(2) message listing the aggregation fields be…
JohnHalleyGotway Aug 16, 2024
cbba99b
Per #1371, correct operator+= logic in met_stats.cc for SL1L2Info, VL…
JohnHalleyGotway Aug 19, 2024
073e9e4
Per #1371, the DataPlane for the computed statistics should be initia…
JohnHalleyGotway Aug 19, 2024
1fd1f91
Per #1371, update logic of the compute_cntinfo() function so that CNT…
JohnHalleyGotway Aug 21, 2024
24f9cfd
Merge remote-tracking branch 'origin/develop' into feature_1371_serie…
JohnHalleyGotway Aug 21, 2024
83bf7da
Per #1371, fix logic of climo log message.
JohnHalleyGotway Aug 22, 2024
3eabea8
Per #1371, this is actually related to MET #2924. In compute_pctinfo(…
JohnHalleyGotway Aug 22, 2024
1c7a03f
Per #1371, fix indexing bug (+i instead of +1) when check the valid d…
JohnHalleyGotway Aug 23, 2024
457c1ca
Per #1371, add logic to aggregate the PSTD BRIERCL and BSS statistics…
JohnHalleyGotway Aug 23, 2024
d625e06
Merge remote-tracking branch 'origin/develop' into feature_1371_serie…
JohnHalleyGotway Aug 23, 2024
c0a1a0b
Per #1371, switch to using string literals to satisfy SonarQube
JohnHalleyGotway Aug 23, 2024
38e7a67
Per #1371, update series_analysis tests in unit_climatology_1.0deg.xm…
JohnHalleyGotway Aug 23, 2024
f0a5eb7
Per #1371, remove extra comment
JohnHalleyGotway Aug 23, 2024
972f867
Per #1371, skip writing the PCT THRESH_i columns to the Series-Analys…
JohnHalleyGotway Aug 23, 2024
eecd22d
Per #1371, fix the R string literals to remove \t and \n escape seque…
JohnHalleyGotway Aug 26, 2024
12c1eec
Per #1371, update the read_aggr_data_plane() suggestion strings.
JohnHalleyGotway Aug 26, 2024
4f78b26
Per #1371, ignore unneeded PCT 'THRESH_' variables both when reading …
JohnHalleyGotway Aug 26, 2024
52589c9
Per #1371, update the test named series_analysis_AGGR_CMD_LINE to inc…
JohnHalleyGotway Aug 29, 2024
1560700
Per #1371, update the -aggr note to warn users about slow runtimes
JohnHalleyGotway Aug 29, 2024
15 changes: 11 additions & 4 deletions docs/Users_Guide/series-analysis.rst
@@ -33,6 +33,7 @@ The usage statement for the Series-Analysis tool is shown below:
-fcst file_1 ... file_n | fcst_file_list
-obs file_1 ... file_n | obs_file_list
[-both file_1 ... file_n | both_file_list]
[-aggr file]
[-paired]
-out file
-config file
@@ -58,13 +59,17 @@ Optional Arguments for series_analysis

5. To set both the forecast and observations to the same set of files, use the optional -both file_1 ... file_n | both_file_list option. This is useful when reading the NetCDF matched pair output of the Grid-Stat tool, which contains both forecast and observation data.

6. The -paired option indicates that the -fcst and -obs file lists are already paired, meaning there is a one-to-one correspondence between the files in those lists. This option affects how missing data is handled. When -paired is not used, missing or incomplete files result in a runtime error with no output file being created. When -paired is used, missing or incomplete files result in a warning with output being created using the available data.
6. The -aggr option specifies the path to an existing Series-Analysis output file. When computing statistics for the input forecast and observation data, Series-Analysis aggregates the partial sums (SL1L2, SAL1L2 line types) and contingency table counts (CTC, MCTC, and PCT line types) with data provided in the aggregate file. This option enables Series-Analysis to run iteratively and update existing partial sums, counts, and statistics with new data.

7. The -log file outputs log messages to the specified file.
.. note:: When the -aggr option is used, only statistics that are derivable from partial sums and contingency table counts can be requested. Runtimes are generally much slower when aggregating data since it requires many additional NetCDF variables containing the scalar partial sums and contingency table counts to be read and written.

8. The -v level overrides the default level of logging (2).
7. The -paired option indicates that the -fcst and -obs file lists are already paired, meaning there is a one-to-one correspondence between the files in those lists. This option affects how missing data is handled. When -paired is not used, missing or incomplete files result in a runtime error with no output file being created. When -paired is used, missing or incomplete files result in a warning with output being created using the available data.

9. The -compress level option indicates the desired level of compression (deflate level) for NetCDF variables. The valid level is between 0 and 9. The value of "level" will override the default setting of 0 from the configuration file or the environment variable MET_NC_COMPRESS. Setting the compression level to 0 disables compression for the NetCDF output. Lower levels compress faster, while higher levels compress more.
8. The -log file outputs log messages to the specified file.

9. The -v level overrides the default level of logging (2).

10. The -compress level option indicates the desired level of compression (deflate level) for NetCDF variables. The valid level is between 0 and 9. The value of "level" will override the default setting of 0 from the configuration file or the environment variable MET_NC_COMPRESS. Setting the compression level to 0 disables compression for the NetCDF output. Lower levels compress faster, while higher levels compress more.

An example of the series_analysis calling sequence is shown below:

@@ -179,3 +184,5 @@ The output_stats array controls the type of output that the Series-Analysis tool
11. PJC for Joint and Conditional factorization for Probabilistic forecasts (See :numref:`table_PS_format_info_PJC`)

12. PRC for Receiver Operating Characteristic for Probabilistic forecasts (See :numref:`table_PS_format_info_PRC`)

.. note:: When the -aggr option is used, all partial sum and contingency table count columns are required to aggregate statistics across multiple runs. To facilitate this, the output_stats entries for the CTC, SL1L2, SAL1L2, and PCT line types can be set to "ALL" to indicate that all available columns for those line types should be written.
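The aggregation described in the documentation change above can be illustrated with a minimal sketch. It is not the MET implementation (which is C++); the `PartialSums` class and its method names are hypothetical. It assumes that the SL1L2 columns at each grid point store means over TOTAL pairs, so two runs combine as TOTAL-weighted averages, and that RMSE is derived from the combined sums. The MAE column is omitted for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class PartialSums:
    """Hypothetical stand-in for the SL1L2 columns at one grid point."""
    total: int    # number of matched pairs
    fbar: float   # mean forecast
    obar: float   # mean observation
    fobar: float  # mean of forecast * observation
    ffbar: float  # mean of forecast squared
    oobar: float  # mean of observation squared

    @classmethod
    def from_pairs(cls, fcst, obs):
        """Compute the partial sums from matched forecast/observation values."""
        n = len(fcst)
        return cls(
            total=n,
            fbar=sum(fcst) / n,
            obar=sum(obs) / n,
            fobar=sum(f * o for f, o in zip(fcst, obs)) / n,
            ffbar=sum(f * f for f in fcst) / n,
            oobar=sum(o * o for o in obs) / n,
        )

    def __add__(self, other):
        """Aggregate two sets of partial sums, weighting each by its TOTAL."""
        n = self.total + other.total

        def wavg(a, b):
            return (a * self.total + b * other.total) / n

        return PartialSums(
            total=n,
            fbar=wavg(self.fbar, other.fbar),
            obar=wavg(self.obar, other.obar),
            fobar=wavg(self.fobar, other.fobar),
            ffbar=wavg(self.ffbar, other.ffbar),
            oobar=wavg(self.oobar, other.oobar),
        )

    def rmse(self):
        """Derive RMSE from the partial sums alone (no raw pairs needed)."""
        mse = self.ffbar - 2.0 * self.fobar + self.oobar
        return math.sqrt(max(mse, 0.0))


# Illustrative values for a single grid point across four series entries.
fcst = [1.0, 2.0, 4.0, 3.0]
obs = [1.5, 2.5, 3.0, 3.5]

earlier = PartialSums.from_pairs(fcst[:3], obs[:3])  # contents of the -aggr file
new = PartialSums.from_pairs(fcst[3:], obs[3:])      # data from the current run
aggregated = earlier + new
full = PartialSums.from_pairs(fcst, obs)             # single run over all data

print(aggregated.total, math.isclose(aggregated.rmse(), full.rmse()))
```

The same idea applies to contingency table counts, which would simply be summed cell by cell. This is also why the note asks for the "ALL" columns: a statistic such as RMSE cannot be updated from its previous value alone, only from the underlying sums.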
10 changes: 5 additions & 5 deletions internal/test_unit/config/SeriesAnalysisConfig_climo
@@ -132,13 +132,13 @@ vld_thresh = 0.5;
//
output_stats = {
fho = [ "TOTAL", "F_RATE", "H_RATE", "O_RATE" ];
ctc = [ ];
ctc = [ "ALL" ];
cts = [ ];
mctc = [ ];
mcts = [ "ACC" ];
cnt = [ "TOTAL", "RMSE", "ANOM_CORR" ];
sl1l2 = [ ];
sal1l2 = [ ];
mcts = [ ];
cnt = [ "TOTAL", "RMSE", "ANOM_CORR", "RMSFA", "RMSOA" ];
sl1l2 = [ "ALL" ];
sal1l2 = [ "ALL" ];
pct = [ ];
pstd = [ ];
pjc = [ ];
2 changes: 1 addition & 1 deletion internal/test_unit/config/SeriesAnalysisConfig_climo_prob
@@ -148,7 +148,7 @@ output_stats = {
cnt = [ ];
sl1l2 = [ ];
sal1l2 = [ ];
pct = [ ];
pct = [ "ALL" ];
pstd = [ "TOTAL", "ROC_AUC", "BRIER", "BRIERCL", "BSS", "BSS_SMPL" ];
pjc = [ ];
prc = [ ];
96 changes: 76 additions & 20 deletions internal/test_unit/xml/unit_climatology_1.0deg.xml
@@ -154,32 +154,28 @@
<stat>&OUTPUT_DIR;/climatology_1.0deg/stat_analysis_MPR_to_PSTD.stat</stat>
</output>
</test>

--!>
<test name="climatology_SERIES_ANALYSIS_1.0DEG">
<exec>&MET_BIN;/series_analysis</exec>
<env>
<pair><name>CLIMO_MEAN_FILE_LIST</name>
<value>"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cmean_1d.19590409",
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cmean_1d.19590410",
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cmean_1d.19590411"
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cmean_1d.19590410"
</value>
</pair>
<pair><name>CLIMO_STDEV_FILE_LIST</name>
<value>"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cstdv_1d.19590409",
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cstdv_1d.19590410",
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cstdv_1d.19590411"
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cstdv_1d.19590410"
</value>
</pair>
</env>
<param> \
-fcst &DATA_DIR_MODEL;/grib2/gfs/gfs_2012040900_F012.grib2 \
&DATA_DIR_MODEL;/grib2/gfs/gfs_2012040900_F024.grib2 \
&DATA_DIR_MODEL;/grib2/gfs/gfs_2012040900_F036.grib2 \
&DATA_DIR_MODEL;/grib2/gfs/gfs_2012040900_F048.grib2 \
-obs &DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120409_1200_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120410_0000_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120410_1200_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120411_0000_000.grb2 \
-paired \
-out &OUTPUT_DIR;/climatology_1.0deg/series_analysis_GFS_CLIMO_1.0DEG.nc \
-config &CONFIG_DIR;/SeriesAnalysisConfig_climo \
@@ -190,25 +186,84 @@
</output>
</test>

<test name="climatology_SERIES_ANALYSIS_1.0DEG_AGGR">
<exec>&MET_BIN;/series_analysis</exec>
<env>
<pair><name>CLIMO_MEAN_FILE_LIST</name>
<value>"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cmean_1d.19590411"
</value>
</pair>
<pair><name>CLIMO_STDEV_FILE_LIST</name>
<value>"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cstdv_1d.19590411"
</value>
</pair>
</env>
<param> \
-fcst &DATA_DIR_MODEL;/grib2/gfs/gfs_2012040900_F048.grib2 \
-obs &DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120411_0000_000.grb2 \
-paired \
-aggr &OUTPUT_DIR;/climatology_1.0deg/series_analysis_GFS_CLIMO_1.0DEG.nc \
-out &OUTPUT_DIR;/climatology_1.0deg/series_analysis_GFS_CLIMO_1.0DEG_AGGR.nc \
-config &CONFIG_DIR;/SeriesAnalysisConfig_climo \
-v 2
</param>
<output>
<grid_nc>&OUTPUT_DIR;/climatology_1.0deg/series_analysis_GFS_CLIMO_1.0DEG_AGGR.nc</grid_nc>
</output>
</test>

<test name="climatology_SERIES_ANALYSIS_PROB_1.0DEG">
<exec>echo "&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F003.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F009.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F015.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F021.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F027.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F033.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F039.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F045.grib2" \
> &OUTPUT_DIR;/climatology_1.0deg/input_fcst_file_list; \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F021.grib2" \
> &OUTPUT_DIR;/climatology_1.0deg/20120409_fcst_file_list; \
echo "&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120409_0000_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120409_0600_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120409_1200_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120409_1800_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120410_0000_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120409_1800_000.grb2" \
> &OUTPUT_DIR;/climatology_1.0deg/20120409_obs_file_list; \
&MET_BIN;/series_analysis</exec>
<env>
<pair><name>DAY_INTERVAL</name> <value>1</value></pair>
<pair><name>HOUR_INTERVAL</name> <value>6</value></pair>
<pair><name>CLIMO_MEAN_FILE_LIST</name>
<value>"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cmean_1d.19590409",
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cmean_1d.19590410",
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cmean_1d.19590411"
</value>
</pair>
<pair><name>CLIMO_STDEV_FILE_LIST</name>
<value>"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cstdv_1d.19590409",
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cstdv_1d.19590410",
"&DATA_DIR_CLIMO;/NCEP_NCAR_40YR_1.0deg/cstdv_1d.19590411"
</value>
</pair>
</env>
<param> \
-fcst &OUTPUT_DIR;/climatology_1.0deg/20120409_fcst_file_list \
-obs &OUTPUT_DIR;/climatology_1.0deg/20120409_obs_file_list \
-paired \
-out &OUTPUT_DIR;/climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG.nc \
-config &CONFIG_DIR;/SeriesAnalysisConfig_climo_prob \
-v 2
</param>
<output>
<grid_nc>&OUTPUT_DIR;/climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG.nc</grid_nc>
</output>
</test>

<test name="climatology_SERIES_ANALYSIS_PROB_1.0DEG_AGGR">
<exec>echo "&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F027.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F033.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F039.grib2 \
&DATA_DIR_MODEL;/grib2/sref_pr/sref_prob_2012040821_F045.grib2" \
> &OUTPUT_DIR;/climatology_1.0deg/20120410_fcst_file_list; \
echo "&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120410_0000_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120410_0600_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120410_1200_000.grb2 \
&DATA_DIR_MODEL;/grib2/gfsanl/gfsanl_4_20120410_1800_000.grb2" \
> &OUTPUT_DIR;/climatology_1.0deg/input_obs_file_list; \
> &OUTPUT_DIR;/climatology_1.0deg/20120410_obs_file_list; \
&MET_BIN;/series_analysis</exec>
<env>
<pair><name>DAY_INTERVAL</name> <value>1</value></pair>
@@ -227,15 +282,16 @@
</pair>
</env>
<param> \
-fcst &OUTPUT_DIR;/climatology_1.0deg/input_fcst_file_list \
-obs &OUTPUT_DIR;/climatology_1.0deg/input_obs_file_list \
-fcst &OUTPUT_DIR;/climatology_1.0deg/20120410_fcst_file_list \
-obs &OUTPUT_DIR;/climatology_1.0deg/20120410_obs_file_list \
-paired \
-out &OUTPUT_DIR;/climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG.nc \
-aggr &OUTPUT_DIR;/climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG.nc \
-out &OUTPUT_DIR;/climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG_AGGR.nc \
-config &CONFIG_DIR;/SeriesAnalysisConfig_climo_prob \
-v 2
</param>
<output>
<grid_nc>&OUTPUT_DIR;/climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG.nc</grid_nc>
<grid_nc>&OUTPUT_DIR;/climatology_1.0deg/series_analysis_PROB_CLIMO_1.0DEG_AGGR.nc</grid_nc>
</output>
</test>
