Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simulating ground motion fields conditioned on station data #8317

Closed
raoanirudh opened this issue Jan 3, 2023 · 0 comments · Fixed by #8318
Closed

Simulating ground motion fields conditioned on station data #8317

raoanirudh opened this issue Jan 3, 2023 · 0 comments · Fixed by #8318
Assignees
Labels
Milestone

Comments

@raoanirudh
Copy link
Member

Simulating ground motion fields conditioned on station data

Rationale

Given an earthquake rupture model, a site model, and a set of ground shaking intensity models (GSIMs), the OpenQuake scenario calculator simulates a set of ground motion fields (GMFs) at the target sites for the requested set of intensity measure types (IMTs). This set of GMFs can then be used in the scenario_damage and scenario_risk calculators to estimate the distribution of potential damage, economic losses, fatalities, and other consequences. The scenario calculators are useful for simulating both historical and hypothetical earthquakes.

For historical events, ground motion recordings and macroseismic intensity data ("ground truth") can provide additional constraints on the ground shaking estimates in the region affected by the earthquake. The U.S. Geological Survey ShakeMap system (Worden et al., 20201, Wald et al., 20222) provides near-real time estimates of ground shaking for earthquakes, conditioned on station and macroseismic data where available. Support to read USGS ShakeMaps and simulate GMFs based on the mean and standard deviation values estimated by the ShakeMap — for peak ground acceleration (PGA) and spectral acceleration (SA) at 0.3s, 1.0s, and 3.0s — was added to the OpenQuake-engine in #3545 and #3606, and a link to the scenario_damage and scenario_risk calculators was added in #3608, based on the methodology described by Silva & Horspool (2019)3. Further support for reading ShakeMaps from sources other than the USGS was added as discussed in #6572. Support for MMI was added in #6660, and some performance improvements were introduced in #6624.

Starting from USGS ShakeMap v4.1, the conditioning of the ground shaking is based on the methodology developed by Engler et al. (2022)4, which preserves the partition of the (conditioned) within-event and between-event residuals, thus enabling more accurate propagation of the uncertainties through a Monte Carlo simulation approach.

The ability to run OpenQuake scenario damage and loss calculations starting from ShakeMaps from multiple providers has proved to be a useful feature for the assessment of damage and losses for past earthquakes, development and calibration of fragility and vulnerability models, and testing of probabilistic seismic risk models56. However, there are still many instances where the user may wish to consider a different set of assumptions in the ground motion conditioning process compared to the ShakeMap service providers. For instance, users may be interested in:

  • testing the impact of using different rupture geometries and source parameters from published literature for an event7
  • including additional recording station data that they have obtained, or excluding certain outliers based on criteria they deem relevant
  • employing a different Ground-Motion–Intensity Conversion Equation (GMICE) to convert macroseismic data to ground motion intensities or choosing to ignore the macroseismic data altogether
  • testing the behavior of individual GMPEs compared to the station data or using a weighted average GMPE for the tectonic region based on a probabilistic hazard model
  • incorporating a custom vs30 grid or using custom site-amplification functions if microzonation studies are available for the affected area
  • obtaining estimates for IMT(s) that are not directly available from the ShakeMap provider(s)
  • employing different models for the spatial cross-correlation of the within-event residuals and cross-correlation of the IMs for the between-event residuals

For the reasons mentioned above, it would be beneficial to include within the OpenQuake-engine, a Conditioned GMF calculator tightly coupled with the existing scenario calculator8, that would:

  1. accept user-provided station data in addition to the usual inputs for a scenario calculation including an earthquake rupture model, a site model, and a single GSIM or a GSIM logic tree
  2. calculate the conditioned mean and conditioned between-event and within-event covariances for the target sites for the requested IMTs, for each GSIM given the station data, based on the procedure outlined in Engler et al. (2022)4
  3. simulate the requested number of GMFs at the target sites
  4. proceed to damage / loss / consequence calculations if desired

New inputs

Station data csv file

In addition to the usual inputs for a scenario calculation including an earthquake rupture model, a site model, and a single GSIM or a GSIM logic tree, the user would also need to provide a csv file containing the observed intensity values for one or more intensity measure types at a set of ground motion recording stations.

The path to this input file could be specified in the job file using a new parameter station_data_file, eg.:

[station_data]
station_data_file = stationlist_all.csv

The csv file snippet below illustrates the information that would be contained within such a station data file:

STATION_ID STATION_NAME LONGITUDE LATITUDE STATION_TYPE PGA_VALUE PGA_LN_SIGMA SA(0.3)_VALUE SA(0.3)_LN_SIGMA SA(1.0)_VALUE SA(1.0)_LN_SIGMA
VIGA LAS VIGAS -99.23326 16.75870 seismic 0.3550 0.0 0.5262 0.0 0.1012 0.0
VNTA LA VENTA -99.81885 16.91426 seismic 0.2061 0.0 0.3415 0.0 0.1051 0.0
COYC COYUCA -100.08996 16.99778 seismic 0.1676 0.0 0.2643 0.0 0.0872 0.0
UTM_14Q_041_186 NA -99.79820 16.86687 macroseismic 0.6512 0.8059 0.9535 1.0131 0.4794 1.0822
UTM_14Q_041_185 NA -99.79761 16.77656 macroseismic 0.5797 0.8059 0.8766 1.0131 0.4577 1.0822
UTM_14Q_040_186 NA -99.89182 16.86655 macroseismic 0.4770 0.8059 0.7220 1.0131 0.3223 1.0822

Mandatory fields

  • STATION_ID: string; subject to the same validity checks as the id fields in other input files
  • LONGITUDE, LATITUDE; or LON, LAT: floats; valid longitude and latitude values
  • STATION_TYPE: string; currently the only two valid options are 'seismic' and 'macroseismic'
  • <IMT>_VALUE, <IMT>_LN_SIGMA / <IMT>_STDDEV: floats; for each IMT observed at the recording stations, two values should be provided – for IMTs that are assumed to be lognormally distributed (eg. PGV, PGA, SA), these would be the median and lognormal standard deviation using the column headers <IMT>_VALUE, <IMT>_LN_SIGMA respectively; and for other IMTs (eg., MMI), these would simply be the mean and standard deviation using the column headers <IMT>_VALUE, <IMT>_STDDEV respectively

Optional fields

  • STATION_NAME: string; free form and not subject to the same constraints as the STATION_ID field, the optional STATION_NAME field can contain information that aids in identifying a particular station
  • Other fields: could contain notes about the station, flags indicating outlier status for the values reported by the station, site information, etc., but these optional fields will not be read by the station_data_file parser

Site model file for the station sites

All of the site parameters required by the GMMs used in the conditioned scenario calculation will also need to be provided for the set of sites in the station_data_file. This could be accomplished in various ways –

A: Use the existing site model csv file parser to read the site model for the station locations at the same time as the site model file for the hazard / exposure sites, eg:

[site_params]
site_model_file = site_model.csv  site_model_stations.csv

The advantage in this case would be that the already existing site model parser can be used directly, without needing to manage two separate site models under the hood. The drawback is that the two site models will get merged and it could become difficult to separate the stations from the hazard/exposure sites in downstream parts of the calculation.

B: Read the site model for the station locations separately from the site model file for the hazard / exposure sites, eg:

[site_params]
site_model_file = site_model.csv
station_site_model_file = site_model_stations.csv

The advantage in this case would be that the site model for the stations can be kept separate from the the site model file for the hazard / exposure sites. The site model for the stations will be used only during the conditioning process, whereas the site model file for the hazard / exposure sites will be used for the ground motion simulation. The drawback is the requirement of adding a new input parameter station_site_model_file and separate management of two site model files, which might be a non-trivial task.

Ground motion models

Users can choose to specify one or more GSIMs for the conditioned scenario calculation using any of the following options:

  • A single GMM, eg. gsim = ChiouYoungs2014
  • A GSIM logic tree file, eg. gsim_logic_tree_file = gsim_logic_tree.xml
  • A weighted average GSIM, eg. gsim_logic_tree_file = gsim_weighted_avg.xml, where the file gsim_weighted_avg.xml can be constructed using the modifiable GMPE structure for AvgGMPE as shown in the example below:
<?xml version="1.0" encoding="UTF-8"?>
<nrml xmlns:gml="http://www.opengis.net/gml" 
      xmlns="http://openquake.org/xmlns/nrml/0.4">
  <logicTree logicTreeID='lt1'>
    <logicTreeBranchingLevel branchingLevelID="bl1">
      <logicTreeBranchSet 
        branchSetID="bs1" 
        uncertaintyType="gmpeModel" 
        applyToTectonicRegionType="Active Shallow Crust">
        <logicTreeBranch branchID="br1">
          <uncertaintyModel>
            [AvgGMPE]
              b1.AbrahamsonEtAl2014.weight=0.22
              b2.BooreEtAl2014.weight=0.22
              b3.CampbellBozorgnia2014.weight=0.22
              b4.ChiouYoungs2014.weight=0.22
              b5.Idriss2014.weight=0.12
          </uncertaintyModel>
          <uncertaintyWeight>
              1.0
          </uncertaintyWeight>
        </logicTreeBranch>
      </logicTreeBranchSet>
    </logicTreeBranchingLevel>
  </logicTree>
</nrml>

For a regular GSIM logic tree, the conditioning steps will be undertaken for each of the GSIMs in the logic tree file separately, and GMFs will be generated for each of the GSIMs separately as well. This can be useful when the modeller is evaluating or comparing the predictions from different GSIMs for a scenario.

If a weighted average GSIM is provided, the conditioning and GMF simulation will occur only for the blended GSIM. This can be useful when the modeller is evaluating or trying to calibrate the performance of the set of GSIMs for the tectonic region as a whole against one or more scenarios, perhaps looking at the GSIM logic tree from a probabilistic hazard model for the region.

Note: Each of the GSIMs specified for a conditioned GMF calculation must provide the within-event and between-event standard deviations separately. If a GSIM of interest provides only the total standard deviation, a (non-ideal) workaround might be for the user to specify the ratio between the within-event and between-event standard deviations (example), which the engine will use to add the between and within standard deviations to the GSIM.

Calculation steps

The broad calculation steps are as follows, with more detailed descriptions of each step provided subsequently:

  1. Read the usual scenario/scenario_damage/scenario_risk inputs
  2. If the station_data_file is present in the job file,
    1. Read the station data input file
    2. Trigger the ConditionedGmfComputer instead of the regular GmfComputer
    3. Calculate and store the conditioned between-event residual mean at every station site
    4. Calculate and store the nominal event bias
    5. Calculate the conditioned mean and covariance of the ground motion at the target sites
    6. Simulate and store the requested number of ground motion fields at the target sites
  3. Proceed to damage / loss / consequence calculations if desired

Reading the station data input file

This would require adding a function get_station_data(..) to commonlib/readinput.py, to extract the following three pieces of information from the station data csv file:

  1. station_data: a dataframe containing the mean and standard deviation values, in lognormal space or otherwise, for the set of valid IMTs in the csv file
  2. station_sites: a dataframe containing {station_id, lon, lat} for the list of stations in the csv file
  3. imts: the set of valid intensity measure types for which intensity observations are available in the csv file

The ConditionedGmfComputer

The conditoned GMF calculator should essentially follow the conditioning steps outlined in Appendix B of Engler et al. (2022)4, see attached pdf of Appendix B for the equations. In short, the main steps are for each target IMT:

  1. From station_data, get the mean observations (recorded values at the stations, "yD") and any additional sigma ("var_addon_D") for the observations that are uncertain, which might arise if the values for this particular IMT were not directly recorded, but obtained by conversion equations or cross-correlation functions
  2. Compute the predicted mean intensity ("mu_yD") and uncertainty components ("phi_D" and "tau_D") at the observation points, from the specified GMM(s)
  3. Compute the raw residuals at the observation points, by subtracting the predicted intensity from the observed intensity (zeta_D = yD - mu_yD)
  4. Compute the station data within-event covariance matrix ("cov_WD_WD"), and add on the additional variance of the residuals for the cases where the station data is uncertain
  5. Compute the (pseudo)-inverse of the station data within-event covariance matrix ("cov_WD_WD_inv")
  6. Compute the cross-intensity measure correlations for the observed intensity measures ("corr_HD_HD")
  7. Using Bayes rule, compute the posterior distribution (i.e., mean "mu_HD_yD" and covariance "cov_HD_HD_yD") of the normalized between-event residual, employing Engler et al. (2022), eqns B8 and B9 (also eqns B18 and B19)
  8. Compute the distribution of the conditional between-event residual ("mu_BD_yD" and "cov_BD_BD_yD")
  9. Compute the nominal bias and its variance as the means of the conditional between-event residual mean and covariance, display this information in the calculation log, and also store it as one of the outputs of the calculation
  10. From the GMMs, compute the predicted mean ("mu_Y") and stddevs ("phi_Y" and "tau_Y") of the intensity at the target sites
  11. Compute the mean of the conditional between-event residual for the target sites ("mu_BY_yD"), eqn. B18 and B19a
  12. Compute the within-event covariance matrices for the target sites and observation sites ("cov_WY_WD" and "cov_WD_WY")
  13. Compute the within-event covariance matrix for the target sites (apriori, "cov_WY_WY")
  14. Compute the regression coefficient matrix ("RC" = cov_WY_WD × cov_WD_WD_inv)
  15. Compute the scaling matrix "C" for the conditioned between-event covariance matrix, eqn. B22
  16. Compute the conditioned within-event covariance matrix for the target sites ("cov_WY_WY_wD"), eqn. B21
  17. Compute the "conditioned between-event" covariance matrix for the target sites ("cov_BY_BY_yD"), last term of eqn. B17
  18. Compute the conditioned mean of the ground motion at the target sites ("mu_Y_yD"), eqn. B16
  19. Compute the conditional covariance of the ground motion at the target sites ("cov_Y_Y_yD"), eqn. B17
  20. Use the conditioned mean vector (mu_Y_yD) and standard deviation matrices (cov_WY_WY_wD and cov_BY_BY_yD) computed in the preceding two steps to simulate the requested number of ground motion fields at the target sites

Outputs

  1. Nominal event bias (from step 9 above): one bias value for each IMT, for every GSIM used in the calculation
  2. Conditioned between-event residual mean (from step 8 above): one bias value for each station site, for each IMT, for every GSIM used in the calculation (useful if the between-event uncertainties are heteroscedastic, depending upon either Vs30, the median estimated ground motions, or both)
  3. Simulated ground motion fields
  4. All other existing outputs of the scenario calculator

Verification tests

Since the implementation is expected to closely follow Engler et al. (2022), the results should match for the set of eleven simplified one-dimensional verification tests devised by Worden et al. (2020) to check the USGS ShakeMap implementation of the Engler et al. (2022) equations.

Future improvements

The conditioning process requires the specification of a spatial correlation model of the within-event residuals between two points for the intensity measures involved in the calculation, a model for the cross-measure correlation of the within-event residuals, and a model for the cross-measure correlation of the between-event residuals. Assuming a conditional independence of the cross-measure correlation and the spatial correlation of a given intensity measure, the spatial cross-correlation of two different IMs at two different sites can be obtained as the product of the cross-correlation of two IMs at the same location and the spatial correlation due to the distance between sites. Given the limited set of correlation models available in the engine currently, the choice of the three aforementioned correlation models could be hardcoded in the initial implementation of the conditioned GMF calculator, using JB2009CorrelationModel as the spatial correlation model of the within-event residuals, BakerJayaram2008 as the cross-measure correlation model for the within-event residuals, and GodaAtkinson2009 as the cross-measure correlation model for the between-event residuals.

Ideally, the choice of the different correlation models should be exposed to the user through parameters in the job file. Direct spatial cross-correlation models for the within-event residuals910 could also be used instead of separate models for the spatial correlation and cross-measure correlation. Supporting these choices would entail a substantial refactoring of the existing correlation.py and cross_correlation.py modules of the engine, and is thus left for a future issue.

References

Footnotes

  1. Worden, C. B., Thompson, E. M., Hearne, M. G., & Wald, D. J. (2020). ShakeMap Manual Online: technical manual, user’s guide, and software guide, U. S. Geological Survey. URL: http://usgs.github.io/shakemap/. DOI: https://doi.org/10.5066/F7D21VPQ.

  2. Wald, D. J., Worden, C. B., Thompson, E. M., & Hearne, M. G. (2022). ShakeMap operations, policies, and procedures. Earthquake Spectra, 38(1), 756–777. DOI: https://doi.org/10.1177/87552930211030298

  3. Silva, V., & Horspool, N. (2019). Combining USGS ShakeMaps and the OpenQuake-engine for damage and loss assessment. Earthquake Engineering and Structural Dynamics, 48(6), 634–652. DOI: https://doi.org/10.1002/eqe.3154

  4. Engler, D. T., Worden, C. B., Thompson, E. M., & Jaiswal, K. S. (2022). Partitioning Ground Motion Uncertainty When Conditioned on Station Data. Bulletin of the Seismological Society of America, 112(2), 1060–1079. DOI: https://doi.org/10.1785/0120210177 2 3 4

  5. Riga, E., Karatzetzou, A., Apostolaki, S., Crowley, H., & Pitilakis, K. D. (2021). Verification of seismic risk models using observed damages from past earthquake events. Bulletin of Earthquake Engineering (Vol. 19). DOI: https://doi.org/10.1007/s10518-020-01017-5

  6. Crowley, H., Silva, V., Kalakonas, P., Martins, L., Weatherill, G. A., Pitilakis, K. D., Riga, E., Borzi, B., & Faravelli, M. (2020). Verification of the European Seismic Risk Model (ESRM20). In 17th World Conference on Earthquake Engineering. Sendai, Japan.

  7. de Pro‐Díaz, Y., Vilanova, S., & Canora, C. (2022). Ranking Earthquake Sources Using Spatial Residuals of Seismic Scenarios: Methodology Application to the 1909 Benavente Earthquake. Bulletin of the Seismological Society of America. DOI: https://doi.org/10.1785/0120220067

  8. The GMPE Strong Motion Modeller's Toolkit (gmpe-smtk) includes an example of conditional simulation of ground motion fields using hazardlib in conditional_simulation.py, though the current issue envisages a much tighter coupling with the engine's scenario calculator architecture, using the formulation from Appendix B of Engler et al. (2022)4 to compute the conditioned mean and within-event and between-event residuals, and permitting users to input station data directly.

  9. Loth, C., & Baker, J. W. (2013). A spatial cross‐correlation model of spectral accelerations at multiple periods. Earthquake Engineering & Structural Dynamics, 42(3), 397–417. DOI: https://doi.org/10.1002/eqe2212

  10. Du, W., & Ning, C. L. (2021). Modeling spatial cross-correlation of multiple ground motion intensity measures (SAs, PGA, PGV, Ia, CAV, and significant durations) based on principal component and geostatistical analyses. Earthquake Spectra, 37(1), 486–504. DOI: https://doi.org/10.1177/8755293020952442

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants