Several redundant functions for fitting exceedance freq to return periods #904

ValentinGebhart · 2024-06-27T13:05:53Z

Is your feature request related to a problem? Please describe.
There are several functions that do the same computation (fitting the relation between exceedance frequency and return periods) but with different interpolation methods. It might be worth combining them. The ones I can find now are:

Impact.calc_freq_curve
impact exceedance frequency curve (aggregated over centroids)
method: np.interp(freq_cum, imp)
Impact.local_exceedance_imp using loc_return_imp using _cen_return_imp
impact exceedance frequency per centroid for several return periods
method: np.polyfit( np.log(freq_cum), imp, deg = 1)
Hazard.local_exceedance_inten using _loc_return_inten using _cen_return_inten
hazard exceedance frequency per centroid for several return periods
method: np.polyfit( np.log(freq_cum), haz, deg = 1)
Hazard.local_return_period using _loc_return_period
return period per centroid for several threshold intensities
method: np.searchsorted() (i.e. fitting a step function between haz and freq_cum)
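For comparison, the three approaches listed above can be sketched side by side on toy data. This is only an illustration under assumptions: the arrays and variable names are made up, and the interpolation is done on exceedance frequency, which may differ in detail from what the listed methods do internally.

```python
import numpy as np

# Hypothetical toy data: one value and one annual frequency per event.
values = np.array([10.0, 50.0, 20.0, 100.0])
frequency = np.array([0.5, 0.1, 0.2, 0.02])

# Sort values in descending order and accumulate the frequencies:
# freq_cum[i] is the frequency of reaching or exceeding val_desc[i].
order = np.argsort(values)[::-1]
val_desc = values[order]
freq_cum = np.cumsum(frequency[order])

target_freq = 1.0 / np.array([5.0, 25.0])  # return periods of 5 and 25 years

# (a) Linear interpolation between the sorted points (freq_cum is increasing).
val_interp = np.interp(target_freq, freq_cum, val_desc)

# (b) Straight-line fit of value against log(frequency).
slope, intercept = np.polyfit(np.log(freq_cum), val_desc, deg=1)
val_fit = slope * np.log(target_freq) + intercept

# (c) Step function: return period for given thresholds, no interpolation.
thresholds = np.array([30.0, 60.0, 500.0])
val_asc = val_desc[::-1]
# Pad with NaN so thresholds above the largest value get no return period.
freq_exc_asc = np.append(freq_cum[::-1], np.nan)
idx = np.searchsorted(val_asc, thresholds, side="left")
rp_step = 1.0 / freq_exc_asc[idx]
```

All three start from the same sorted values and cumulative frequencies; they differ only in how they read values off that curve.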

Describe the solution you'd like
We could write one or two flexible functions that do the computation for all above cases, and maybe some wrapper functions.

Describe alternatives you've considered
None

Additional context
Related to issue #209


peanutfun · 2024-06-27T13:59:11Z

We can adapt Impact.calc_freq_curve to become a "kernel" for exceedance frequency curves that can be applied to Impact.at_event, to an impact time series of a specific exposure point, or to a hazard intensity time series of a specific centroid. Suitable wrapper functions can then apply this kernel to the appropriate data. It can even be adapted to "flip" the axes and return the return period for specific values of the time series (i.e., the inverse of the problem it currently solves).
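A minimal sketch of what such a kernel might look like, assuming plain 1-D arrays of values and frequencies. The function name `exceedance_kernel` and the `invert` flag are hypothetical and not part of any existing API.

```python
import numpy as np

def exceedance_kernel(values, frequency, query, invert=False):
    """Hypothetical kernel working on any 1-D series of values and frequencies.

    invert=False: query holds return periods, the result holds values.
    invert=True:  query holds values, the result holds return periods.
    Assumes positive frequencies and distinct values.
    """
    values = np.asarray(values, dtype=float)
    frequency = np.asarray(frequency, dtype=float)
    query = np.asarray(query, dtype=float)

    order = np.argsort(values)  # ascending values
    val_asc = values[order]
    # Frequency of reaching or exceeding each sorted value.
    freq_exc = np.cumsum(frequency[order][::-1])[::-1]
    return_period = 1.0 / freq_exc  # increases together with val_asc

    if invert:
        return np.interp(query, val_asc, return_period)
    # Note: np.interp clamps queries outside the available range.
    return np.interp(query, return_period, val_asc)

# Toy data standing in for e.g. an at-event impact series (made up).
values = [10.0, 50.0, 20.0, 100.0]
frequency = [0.5, 0.1, 0.2, 0.02]
vals = exceedance_kernel(values, frequency, [5.0, 25.0])
rps = exceedance_kernel(values, frequency, vals, invert=True)
```

The same function could be handed a total impact series, a per-exposure-point series, or a per-centroid intensity series, with the wrappers only responsible for selecting the data.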

chahank · 2024-06-30T08:20:02Z

There is also a reason for having two methods. One (Impact.calc_freq_curve) is just an ordering of the values with a linear interpolation between them. The other (Impact.local_exceedance_imp) fits a curve, because the number of values can be very small. Thus, I think there remains a good reason to keep both. I would maybe make one single method as a util function that can either interpolate or fit. Then, as @peanutfun suggested, one can use it in the classes and flip the axes if needed.
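One way such a util function could switch between the two behaviours is sketched below. The name `values_at_exceedance_freq` and the `method` argument are assumptions made for illustration only.

```python
import numpy as np

def values_at_exceedance_freq(freq_cum, values, target_freq, method="interpolate"):
    """Hypothetical util: read values off an exceedance frequency curve.

    method="interpolate": linear interpolation between the given points.
    method="fit":         straight-line fit of values against log(frequency).
    """
    freq_cum = np.asarray(freq_cum, dtype=float)
    values = np.asarray(values, dtype=float)
    target_freq = np.asarray(target_freq, dtype=float)

    if method == "interpolate":
        order = np.argsort(freq_cum)  # np.interp needs increasing x
        return np.interp(target_freq, freq_cum[order], values[order])
    if method == "fit":
        if freq_cum.size < 2:
            raise ValueError("need at least two points to fit a line")
        slope, intercept = np.polyfit(np.log(freq_cum), values, deg=1)
        return slope * np.log(target_freq) + intercept
    raise ValueError(f"unknown method: {method!r}")

# Made-up points that lie exactly on a line in log(frequency).
freq_cum = np.array([0.01, 0.1, 0.5])
values = 5.0 - 10.0 * np.log(freq_cum)
fitted = values_at_exceedance_freq(freq_cum, values, [0.05], method="fit")
interpolated = values_at_exceedance_freq(freq_cum, values, [0.1])
```

Unlike the interpolation branch, the fit branch extrapolates beyond the available frequencies, which is the main behavioural difference a caller would need to be aware of.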

bguillod · 2024-07-02T07:40:50Z

I personally find the return period calculations rather obscure to the user, so I would welcome improvements.

Personally, in addition to the above points, I would consider three options and leave it to the user to specify which one (with a good default of course):

1. The interpolated case, albeit with a clear definition in which space (e.g. log-log) it is performed (or the space being an additional optional argument).
2. The non-interpolated case.
3. The fitting of an extreme value distribution.

In my view the default should probably be (1) or (3).
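To make the "space" of the interpolation concrete, here is a small sketch of interpolating in log-log space with NumPy. The helper name `interp_loglog` is hypothetical, and the sketch assumes strictly positive inputs.

```python
import numpy as np

def interp_loglog(x, xp, fp):
    """Interpolate linearly in log-log space (hypothetical helper).

    Requires x, xp and fp to be strictly positive and xp to be increasing.
    """
    x, xp, fp = (np.asarray(a, dtype=float) for a in (x, xp, fp))
    return np.exp(np.interp(np.log(x), np.log(xp), np.log(fp)))

# A power law is a straight line in log-log space, so it is reproduced
# exactly between the sample points (made-up data).
return_periods = np.array([1.0, 10.0, 100.0])
values = 2.0 * return_periods**1.5
query = np.array([3.0, 30.0])
loglog_result = interp_loglog(query, return_periods, values)
linear_result = np.interp(query, return_periods, values)
```

On this data the log-log result matches the power law, while plain linear interpolation overestimates the values between the sample points, which is why the choice of space would need to be stated explicitly.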

In addition, I would suggest the following:

If the user asks for a return period beyond the maximum one available in cases 1 and 2 above, a warning is raised and np.nan is returned, rather than returning the maximum value silently. Perhaps also in case (3) if the return period is way beyond the max available empirically, although extreme value distributions are designed for extrapolation...
The uncertainty of the estimate should ideally be optionally provided.
In case of multiple identical values, these should be grouped and their frequencies summed first.
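Two of these suggestions (grouping identical values, and warning plus np.nan beyond the largest available return period) could look roughly like this. The function names are hypothetical, and this is not the implementation mentioned in the comment.

```python
import warnings
import numpy as np

def group_identical(values, frequency):
    """Sum the frequencies of events that share the same value."""
    values = np.asarray(values, dtype=float).ravel()
    frequency = np.asarray(frequency, dtype=float).ravel()
    unique_vals, inverse = np.unique(values, return_inverse=True)
    freq_sum = np.bincount(inverse, weights=frequency)
    return unique_vals, freq_sum

def values_at_return_periods(values, frequency, return_periods):
    """Interpolate values at return periods; NaN beyond the largest one."""
    unique_vals, freq_sum = group_identical(values, frequency)
    # Frequency of reaching or exceeding each unique value (ascending values).
    freq_exc = np.cumsum(freq_sum[::-1])[::-1]
    rp_available = 1.0 / freq_exc  # increasing
    return_periods = np.atleast_1d(np.asarray(return_periods, dtype=float))
    if np.any(return_periods > rp_available[-1]):
        warnings.warn(
            "Some return periods exceed the largest available one; "
            "returning NaN for them.",
            stacklevel=2,
        )
    # right=np.nan replaces the silent clamping to the maximum value.
    return np.interp(return_periods, rp_available, unique_vals, right=np.nan)

# Made-up data with a repeated value of 20.
values = [10.0, 20.0, 20.0, 100.0]
frequency = [0.5, 0.1, 0.1, 0.02]
unique_vals, freq_sum = group_identical(values, frequency)
```

Return periods below the smallest available one are still clamped in this sketch; whether those should also yield np.nan is a separate design choice.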

Some of these points we've implemented on our end already, so if you undertake a PR, happy to contribute.

bguillod · 2024-07-02T07:42:50Z

(Just to be clear: when I write three options, I mean an input parameter to the function(s) rather than separate functions in the top-level API. That doesn't prevent the options from being split into individual private functions.)

ValentinGebhart added the enhancement label Jun 27, 2024

ValentinGebhart assigned chahank and peanutfun Jun 27, 2024

This was referenced Jul 11, 2024

Add functions to compute, plot, store the local hazard exceedence intensity and RP maps (new branch) #898

Merged

Combining several interpolation functions using a new util function #918

Merged

ValentinGebhart mentioned this issue Jul 31, 2024

New interpolation util functionality #930

Merged

13 tasks
