Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 72: Split why it works vignette into intro and analytic solutions vignette #106

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@

This is the development version of `primarycensoreddist` and is not yet ready for release.

## Package

* Split "Why it works" vignette into two separate vignettes, "Why it works" and "Analytic solutions for censored delay distributions".

# primarycensoreddist 0.5.0

This release adds a new `{touchstone}` based benchmark suite to the package. It also adds a new "How it works" vignette which aims to give the reader more details into how the primary censored distributions work.
Expand Down
2 changes: 2 additions & 0 deletions _pkgdown.yml
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,8 @@ navbar:
href: articles/fitting-dists-with-fitdistrplus.html
- text: Why it works
href: articles/why-it-works.html
- text: Analytic solutions for censored delay distributions
href: articles/analytic-solutions.html

authors:
Sam Abbott:
Expand Down
4 changes: 4 additions & 0 deletions inst/WORDLIST
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ dist
dp
dprimary
dprimarycensoreddist
dx
dy
dz
eq
Expand All @@ -38,6 +39,7 @@ fitdistrplus
frac
gammapartexp
geq
gh
hexsticker
int
kz
Expand All @@ -47,6 +49,7 @@ lim
ln
lognormpartexp
mathbb
mathrm
modelhelpers
modeling
nF
Expand All @@ -71,6 +74,7 @@ sim
speedups
stan
stantools
startpoint
survgammaunifprim
survivalfunc
swindow
Expand Down
265 changes: 265 additions & 0 deletions vignettes/analytic-solutions.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,265 @@
---
title: "Analytic solutions for censored delay distributions"
format: html
bibliography: library.bib
output:
bookdown::html_document2:
fig_caption: yes
code_folding: show
pkgdown:
as_is: true
csl: https://raw.githubusercontent.com/citation-style-language/styles/master/apa-numeric-superscript-brackets.csl
link-citations: true
vignette: >
%\VignetteIndexEntry{Analytic solutions}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---


# Analytic solutions for exponentially tilted primary event times

In epidemiological analysis, it is common for primary events to be arriving at exponentially increasing or decreasing rates, for example, incidence of new infections in a growing epidemic. In this case, the distribution of the primary event time within its censoring window is biased by the exponential growth or decay. If we assume a reference uniform distribution within a primary censoring window $[k, k + w_P)$ then the distribution of the primary event time within the censoring window is the exponential tilted uniform distribution:

$$
f_P(t) \propto \exp(r t) \mathbb{1}_{[k, k + w_P]}(t).
$$

In this case, the distribution function for $C_P$, that is the length of time left in the primary censor window _after_ the primary event time, is given by:

$$
F_{C_P}(p; r) = { 1 - \exp(-r p) \over 1 - \exp(-r w_P)}, \qquad p \in [0, w_P]. (\#eq:fcp)
$$
Note that taking the limit $\lim_r \rightarrow 0$ in equation \@ref(eq:fcp) gives the uniform distribution function $F_{C_P}(p, 0) = p / w_P$.

In the following it is convenient to use a (linear) difference operator defined as:

$$
\Delta_{w}f(t) = f(t + w) - f(t).
$$

# Uniform primary event time ($r=0$)

Applying a uniform primary event time distribution to equation 3.1 from ["Why it works"](why-it-works.html) gives:

$$
Q_{S_+}(t) = Q_T(t + w_P) + { 1 \over w_P} \int_0^{w_P} f_T(t+p) p~ dp.
$$

This is analytically solvable whenever the upper distribution function of $T$ is known and the mean of $T$ is analytically solvable from its integral definition.

In each case considered below it is easier to change the integration variable:

$$
\begin{aligned}
Q_{S_+}(t) &= Q_T(t + w_P) + { 1 \over w_P} \int_t^{t+w_P} f_T(z) (z-t)~ dz \\
&= Q_T(t + w_P) + { 1 \over w_P} \Big[ \int_t^{t+w_P} f_T(z) z~ dz - t \Delta_{w_P}F_T(t) \Big].
\end{aligned} (\#eq:unifprim)
$$

**General partial expectation**

Note that for any distribution with an analytically available distribution function $F_T$ equation \@ref(eq:unifprim) can be solved so long as the _partial expectation_

$$
\int_t^{t+w_P} f_T(z) z~ dz (\#eq:partexp)
$$

can be reduced to an analytic expression.

The insight here is that this will be possible for any distribution where the average of the distribution can be calculated analytically, which includes commonly used non-negative distributions such as the Gamma, Log-Normal and Weibull distributions.

**General Discrete censored delay distribution**

First, we note that equation 3.3 from ["Why it works"](why-it-works.html) can be written using the difference operator: $f_n = -\Delta_1 Q_{S_+}(n-1)$. We can insert this expression into equation \@ref(eq:unifprim) to give the discrete censored delay distribution for uniformly distributed primary event times:

$$
\begin{aligned}
f_n &= \Delta_1\Big[(n-1) \Delta_1F_T(n-1)\Big] - \Delta_1Q_T(n) - \Delta_1\Big[ \int_{n-1}^n f_T(z) z ~dz \Big] \\
&= (n+1)F_T(n+1) + (n-1)F_T(n-1) - 2nF_T(n) - \Delta_1\Big[ \int_{n-1}^n f_T(z) z ~dz \Big].
\end{aligned} (\#eq:disccensunifprim)
$$

We now consider some specific delay distributions.

## Gamma distributed delay times

The Gamma distribution has the density function:

$$
f_T(z;k, \theta) = {1 \over \Gamma(k) \theta^k} z^{k-1} \exp(-z/\theta).
$$
Where $\Gamma$ is the Gamma function.

The Gamma distribution has the distribution function:

$$
\begin{aligned}
F_T(z;k, \theta) &= {\gamma(k, z/\theta) \over \Gamma(k)}, \qquad z\geq 0,\\
F_T(z;k, \theta) &= 0, \qquad z < 0.
\end{aligned}
$$
Where $\gamma$ is the lower incomplete gamma function.

**Gamma partial expectation**

We know that the full expectation of the Gamma distribution is $\mathbb{E}[T] = k\theta$, which can be calculated as a standard integral. Doing the same integral for the partial expectation gives:

$$
\begin{aligned}
\int_t^{t+w_P} f_T(z) z~ dz &= {1 \over \Gamma(k) \theta^k} \int_t^{t+w_P}z z^{k-1} \exp(-z/\theta)~dz \\
&= {\Gamma(k+1) \theta^{k+1} \over \Gamma(k) \theta^k} {1 \over \Gamma(k+1) \theta^{k+1}} \int_t^{t+w_P}z^{k} \exp(-z/\theta)~dz\\
&= k\theta \Delta_{w_P} F_T(t; k + 1, \theta).
\end{aligned} (\#eq:gammapartexp)
$$

**Survival function of $S_{+}$ for Gamma distribution**

By substituting equation \@ref(eq:gammapartexp) into equation \@ref(eq:disccensunifprim) we can solve for the survival function of $S_+$ in terms of analytically available functions:

$$
\begin{aligned}
Q_{S_+}(t; k, \theta) &= Q_T(t + w_P; k, \theta) + { 1 \over w_P} \left( {1 \over \Gamma(k) \theta^k} \int_t^{t+w_P} z^{k-1} \exp(-z/\theta) (z-t)~ dz \right), \\
&= Q_T(t + w_P; k, \theta) + { 1 \over w_P} \big[ k \theta \Delta_{w_P}F_T(t; k+1, \theta) - t \Delta_{w_P}F_T(t; k, \theta) \big].
\end{aligned} (\#eq:survgammaunifprim)
$$

**Gamma discrete censored delay distribution**

By substituting \@ref(eq:survgammaunifprim) into \@ref(eq:disccensunifprim) we get the discrete censored delay distribution in terms of analytically available functions:
$$
\begin{aligned}
f_n &= (n+1) F_T(n+1; k, \theta) + (n-1) F_T(n-1; k, \theta) - 2 n F_T(n; k, \theta) - k \theta \Delta_1^{(2)}F_T(n-1; k+1, \theta)\\
&= (n+1) F_T(n+1; k, \theta) + (n-1) F_T(n-1; k, \theta) - 2 n F_T(n; k, \theta) \\
&+ k \theta \Big( 2 F_T(n; k+1, \theta) - F_T(n-1; k+1, \theta) - F_T(n+1; k+1,\theta) \Big) \qquad n = 0, 1, \dots.
\end{aligned}
$$

Which was also found by Cori _et al_ [@cori2013new].

## Log-Normal distribution

The Log-Normal distribution has the density function:

$$
f_T(z;\mu, \sigma) = {1 \over z \sigma \sqrt{2\pi}} \exp\left( - {(\log(z) - \mu)^2 \over 2 \sigma^2} \right).
$$
And distribution function:

$$
F_T(z;\mu, \sigma) = \Phi\left( {\log(z) - \mu \over \sigma} \right).
$$
Where $\Phi$ is the standard normal distribution function.

**Log-Normal partial expectation**

We know that the full expectation of the Log-Normal distribution is $\mathbb{E}[T] = e^{\mu + \frac{1}{2} \sigma^2}$, which can be calculated by integration with the integration substitution $y = (\ln z - \mu) / \sigma$. This has transformation Jacobian:

$$
\frac{dz}{dy} = \sigma e^{\sigma y + \mu}.
$$

Doing the same integral for the partial expectation, and using the same integration substitution gives:

$$
\begin{aligned}
\int_t^{t+w_P} z~ f_T(z; \mu, \sigma) dz &= {1 \over \sigma \sqrt{2\pi}} \int_t^{t+w_P} \exp\left( - {(\log(z) - \mu)^2 \over 2 \sigma^2} \right) dz \\
&= {1 \over \sqrt{2\pi}} \int_{(\ln t - \mu)/\sigma}^{(\ln(t+w_P) - \mu)/\sigma} e^{\sigma y + \mu} e^{-y^2/2} dy\\
&= {e^{\mu + \frac{1}{2} \sigma^2} \over \sqrt{2 \pi} } \int_{(\ln t - \mu)/\sigma}^{(\ln(t+w_P) - \mu)/\sigma} e^{-(y- \sigma)^2/2} dy \\
&= e^{\mu + \frac{1}{2} \sigma^2} \Big[\Phi\Big({\ln(t+w_P) - \mu \over \sigma} - \sigma\Big) - \Phi\Big({\ln(t) - \mu \over \sigma} - \sigma\Big) \Big]\\
&= e^{\mu + \frac{1}{2} \sigma^2} \Delta_{w_P}F_T(t; \mu + \sigma^2, \sigma).
\end{aligned} (\#eq:lognormpartexp)
$$

**Survival function of $S_{+}$ for Log-Normal distribution**

By substituting equation \@ref(eq:lognormpartexp) into equation \@ref(eq:disccensunifprim) we can solve for the survival function of $S_+$ in terms of analytically available functions:

$$
\begin{aligned}
Q_{S+}(t ;\mu, \sigma) &= Q_T(t + w_P;\mu, \sigma) + { 1 \over w_P} \int_t^{t+w_P} {1 \over z \sigma \sqrt{2\pi}} \exp\left( - {(\log(z) - \mu)^2 \over 2 \sigma^2} \right) (z-t)~ dz, \\
&= Q_T(t + w_P;\mu, \sigma) + { 1 \over w_P} \Big[ e^{\mu + \frac{1}{2} \sigma^2} \Delta_{w_P}F_T(t; \mu + \sigma^2, \sigma) - t\Delta_{w_P}F_T(t; \mu, \sigma) \Big]
\end{aligned}
$$

**Log-Normal discrete censored delay distribution**

By substituting \@ref(eq:lognormpartexp) into \@ref(eq:disccensunifprim) we get the discrete censored delay distribution in terms of analytically available functions:

$$
\begin{aligned}
f_n &= (n+1) F_T(n+1; \mu, \sigma) + (n-1) F_T(n-1; \mu, \sigma) - 2 n F_T(n; \mu, \sigma) \\
&- e^{\mu + \frac{1}{2} \sigma^2} \Delta_1^{(2)}F_T(n-1;\mu + \sigma^2, \sigma) \\
&= (n+1) F_T(n+1; \mu, \sigma) + (n-1) F_T(n-1; \mu, \sigma) - 2 n F_T(n; \mu, \sigma) \\
&+ e^{\mu + \frac{1}{2} \sigma^2} \Big( 2 F_T(n; \mu + \sigma^2, \sigma) - F_T(n+1; \mu + \sigma^2, \sigma) - F_T(n-1; \mu + \sigma^2, \sigma) \Big)\qquad n = 0, 1, \dots.
\end{aligned}
$$

## Weibull distribution

The Weibull distribution has the density function:

$$
f_T(z;\lambda,k) =
\begin{cases}
\frac{k}{\lambda}\left(\frac{z}{\lambda}\right)^{k-1}e^{-(z/\lambda)^{k}}, & z\geq0 ,\\
0, & z<0,
\end{cases}
$$
And distribution function:

$$
F_T(z;\lambda,k))=\begin{cases}1 - e^{-(z/\lambda)^k}, & z\geq0,\\ 0, & z<0.\end{cases}
$$
Where $\Phi$ is the standard normal distribution function.

**Weibull partial expectation**

We know that the full expectation of the Weibull distribution is $\mathbb{E}[T] = \lambda \Gamma(1 + 1/k)$, which can be calculated by integration using the integration substitution $y = (z / \lambda)^k$, which has transformation Jacobian:

$$
\frac{dz}{dy} = \frac{\lambda}{k}y^{1/k - 1}.
$$

Doing the same integral for the partial expectation, and using the same integration substitution gives:

$$
\begin{aligned}
\int_t^{t+w_P} z~ f_T(z; \lambda,k) dz &= \int_t^{t+w_P} \frac{kz}{\lambda}\left(\frac{z}{\lambda}\right)^{k-1}e^{-(z/\lambda)^{k}} dz \\
&= k\int_t^{t+w_P} \left(\frac{z}{\lambda}\right)^{k}e^{-(z/\lambda)^{k}} dz \\
&= \lambda k \int_{(t / \lambda)^k}^{((t + w_P) / \lambda)^k} y y^{1/k - 1} e^{-y} dy \\
&= \lambda\int_{(t / \lambda)^k}^{((t + w_P) / \lambda)^k} y^{1/k} e^{-y} dy\\
&= \lambda \Delta_{w_P} g(t; \lambda,k)
\end{aligned} (\#eq:weibullpartexp)
$$

Where

$$
g(t; \lambda, k) = \gamma(1 + 1/k, (t/\lambda)^k) = \frac{1}{k}\gamma(1/k, (t/\lambda)^k) - \frac{t}{\lambda}e^{-(t/\lambda)^k}.
$$
is a reparametrisation of the lower incomplete gamma function.

**Survival function of $S_{+}$ for Weibull distribution**

By substituting equation \@ref(eq:weibullpartexp) into equation \@ref(eq:disccensunifprim) we can solve for the survival function of $S_+$ in terms of analytically available functions:

$$
Q_{S+}(t ;\lambda,k) = Q_T(t + w_P; \lambda,k) + { 1 \over w_P} \Big[ \lambda \Delta_{w_P} g(t; \lambda,k) - t\Delta_{w_P}F_T(t; \lambda,k)\Big].
$$

**Weibull discrete censored delay distribution**

By substituting \@ref(eq:weibullpartexp) into \@ref(eq:disccensunifprim) we get the discrete censored delay distribution in terms of analytically available functions:

$$
\begin{aligned}
f_n &= (n+1)F_T(n+1) + (n-1)F_T(n-1) - 2nF_T(n) - \Delta_1\Big[ \int_{n-1}^n f_T(z) z ~dz \Big] \\
&= (n+1)F_T(n+1) + (n-1)F_T(n-1) - 2nF_T(n) - \lambda \Delta_1^{(2)} g(n-1; \lambda,k) \\
&= (n+1)F_T(n+1) + (n-1)F_T(n-1) - 2nF_T(n) \\
&+ \lambda [2 g(n; \lambda,k) - g(n+1; \lambda,k) - g(n-1; \lambda,k)] \qquad n = 0, 1, \dots.
\end{aligned}
$$

Which was also found by Cori _et al_ [@cori2013new].
seabbs marked this conversation as resolved.
Show resolved Hide resolved
# References
Loading