twinstim
twinstim_iaf.Rd
A twinstim
model as described in Meyer et al. (2012) requires
the specification of the spatial and temporal interaction functions
(\(f\) and \(g\), respectively),
i.e. how infectivity decays with increasing spatial and temporal
distance from the source of infection.
Own such functions can be specified (see
siaf
and tiaf
, respectively), but the
package already predefines some common dispersal kernels returned by
the constructor functions documented here.
See Meyer and Held (2014) for various spatial interaction functions,
and Meyer et al. (2017, Section 3, available as vignette("twinstim")
)
for an illustration of the implementation.
# predefined spatial interaction functions
siaf.constant()
siaf.step(knots, maxRange = Inf, nTypes = 1, validpars = NULL)
siaf.gaussian(nTypes = 1, logsd = TRUE, density = FALSE,
F.adaptive = FALSE, F.method = "iso",
effRangeMult = 6, validpars = NULL)
siaf.exponential(nTypes = 1, validpars = NULL, engine = "C")
siaf.powerlaw(nTypes = 1, validpars = NULL, engine = "C")
siaf.powerlaw1(nTypes = 1, validpars = NULL, sigma = 1)
siaf.powerlawL(nTypes = 1, validpars = NULL, engine = "C")
siaf.student(nTypes = 1, validpars = NULL, engine = "C")
# predefined temporal interaction functions
tiaf.constant()
tiaf.step(knots, maxRange = Inf, nTypes = 1, validpars = NULL)
tiaf.exponential(nTypes = 1, validpars = NULL)
numeric vector of distances at which the step function
switches to a new height. The length of this vector determines the
number of parameters to estimate. For identifiability, the step
function has height 1 in the first interval \([0,knots_1)\). Note
that the implementation is right-continuous, i.e., intervals are
\([a,b)\).
An initial choice of knots could be based on quantiles of the
observed distances between events and their potential source events.
For instance, an identifiable spatial step function could be
siaf.step(quantile(getSourceDists(myepi, "space"), c(1,2,4)/10))
,
where myepi
is the "epidataCS"
data to be modelled.
a scalar larger than any of knots
.
Per default (maxRange=Inf
), the step function
never drops to 0 but keeps the last height for any distance larger
than the last knot. However, this might not work in some cases,
where the last parameter value would become very small and lead to
numerical problems. It is then possible to truncate
interaction at a distance maxRange
(just like what the
variables eps.s
and eps.t
do in the
"epidataCS"
object).
determines the number of parameters ((log-)scales or (log-)shapes)
of the kernels. In a multitype epidemic, the different types may
share the same spatial interaction function, in which case
nTypes=1
. Otherwise nTypes
should equal the number of
event types of the epidemic, in which case every type has its own
(log-)scale or (log-)shape, respectively.
Currently, nTypes > 1
is only implemented for
siaf.gaussian(F.adaptive = TRUE)
,
tiaf.step
, and tiaf.exponential
.
logicals affecting the parametrization of the Gaussian kernel.
Settings different from the defaults are deprecated.
The default is to use only the kernel of the bivariate, isotropic
normal distribution (density=FALSE
, see Details below),
parametrized with the log-standard deviation (logsd=TRUE
) to
avoid constrained optimisation (L-BFGS-B) or validpars
.
The power-law kernels always employ the log-scale for their scale
and shape parameters.
If F.adaptive = TRUE
, then an adaptive bandwidth of
adapt*sd
will be used in the midpoint-cubature
(polyCub.midpoint
in package polyCub)
of the Gaussian interaction
kernel, where adapt
is an extra parameter of the returned
siaf$F
function and defaults to 0.1. It can be customized
either by the control.siaf$F
argument list of
twinstim
, or by a numeric specification of F.adaptive
in the constructing call, e.g., F.adaptive = 0.05
to achieve
higher accuracy.
Otherwise, if F.adaptive = FALSE
, the F.method
argument determines which polyCub
method to
use in siaf$F
. The accuracy (controlled via, e.g.,
nGQ
, rel.tol
, or eps
, depending on the cubature
method) can then be adjusted in twinstim
's
control.siaf$F
argument.
determines the effective range for numerical integration
in terms of multiples of the standard deviation \(\sigma\) of the
Gaussian kernel, i.e. with effRangeMult=6
the \(6 \sigma\) region around the event is considered as
the relevant integration domain instead
of the whole observation region W
.
Setting effRangeMult=NULL
will disable
the integral approximation with an effective integration range.
function taking one argument, the parameter vector, indicating if it
is valid (see also siaf
).
If logsd=FALSE
and one prefers not to use
method="L-BFGS-B"
for fitting the twinstim
, then
validpars
could be set to function (pars) pars > 0
.
character string specifying the implementation to use.
Prior to surveillance 0.14.0, the intrfr
functions for
polyCub.iso
were evaluated in R (and this
implementation is available via engine = "R"
).
The new C-implementation, LinkingTo the newly exported
polyCub_iso
C-implementation in polyCub 0.6.0,
is considerably faster.
Fixed value of \(\sigma\) for the one-parameter power-law kernel.
Evaluation of twinstim
's likelihood involves cubature of the
spatial interaction function over polygonal domains. Various
approaches have been compared by Meyer (2010, Section 3.2) and a new
efficient method, which takes advantage of the assumed isotropy, has
been proposed by Meyer and Held (2014, Supplement B, Section 2) for
evaluation of the power-law kernels.
These cubature methods are available in the dedicated R package
polyCub and used by the kernels implemented in surveillance.
The readily available spatial interaction functions are defined as follows:
siaf.constant
:\(f(s) = 1\)
siaf.step
:\(f(s) = \sum_{k=0}^K \exp(\alpha_k) I_k(||s||)\),
where \(\alpha_0 = 0\), and \(\alpha_1, \dots, \alpha_K\) are
the parameters (heights) to estimate. \(I_k(||s||)\) indicates
if distance \(||s||\) belongs to the \(k\)th interval
according to c(0,knots,maxRange)
, where \(k=0\) indicates
the interval c(0,knots[1])
.
Note that siaf.step
makes use of the memoise package
if it is available -- and that is highly recommended to speed up
calculations. Specifically, the areas of the intersection of a
polygonal domain (influence region) with the “rings” of the
two-dimensional step function will be cached such that they are
only calculated once for every polydomain
(in the first
iteration of the twinstim
optimization). They are used in
the integration components F
and Deriv
.
See Meyer and Held (2014) for a use case and further details.
siaf.gaussian
:\(f(s|\kappa) = \exp(-||s||/2/\sigma_\kappa^2)\)
If nTypes=1
(single-type epidemic or type-invariant
siaf
in multi-type epidemic), then
\(\sigma_\kappa = \sigma\) for all types \(\kappa\).
If density=TRUE
(deprecated), then the kernel formula above is
additionally divided by \(2 \pi \sigma_\kappa^2\), yielding the
density of the bivariate, isotropic Gaussian distribution with
zero mean and covariance matrix \(\sigma_\kappa^2 I_2\).
The standard deviation is optimized on the log-scale
(logsd = TRUE
, not doing so is deprecated).
siaf.exponential
:\(f(s) = exp(-||s||/sigma)\)
The scale parameter \(sigma\) is estimated on the log-scale,
i.e., \(\sigma = \exp(\tilde{\sigma})\), and \(\tilde{\sigma}\)
is the actual model parameter.
siaf.powerlaw
:\(f(s) = (||s|| + \sigma)^{-d}\)
The parameters are optimized on the log-scale to ensure positivity, i.e.,
\(\sigma = \exp(\tilde{\sigma})\) and \(d = \exp(\tilde{d})\),
where \((\tilde{\sigma}, \tilde{d})\) is the parameter vector.
If a power-law kernel is not identifiable for the dataset at hand,
the exponential kernel or a lagged power law are useful alternatives.
siaf.powerlaw1
:\(f(s) = (||s|| + 1)^{-d}\),
i.e., siaf.powerlaw
with fixed \(\sigma = 1\).
A different fixed value for \(sigma\) can be specified via the
sigma
argument of siaf.powerlaw1
.
The decay parameter \(d\) is estimated on the log-scale.
siaf.powerlawL
:\(f(s) = (||s||/\sigma)^{-d}\), for \(||s|| \ge \sigma\), and
\(f(s) = 1\) otherwise,
which is a Lagged power-law kernel featuring uniform
short-range dispersal (up to distance \(\sigma\)) and a
power-law decay (Pareto-style) from distance \(\sigma\) onwards.
The parameters are optimized on the log-scale to ensure positivity, i.e.
\(\sigma = \exp(\tilde{\sigma})\) and \(d = \exp(\tilde{d})\),
where \((\tilde{\sigma}, \tilde{d})\) is the parameter vector.
However, there is a caveat associated with this kernel: Its
derivative wrt \(\tilde{\sigma}\) is mathematically undefined at
the threshold \(||s||=\sigma\). This local non-differentiability
makes twinstim
's likelihood maximization sensitive wrt
parameter start values, and is likely to cause false convergence
warnings by nlminb
. Possible workarounds are to use
the slow and robust method="Nelder-Mead"
, or to just ignore
the warning and verify the result by sets of different start values.
siaf.student
:\(f(s) = (||s||^2 + \sigma^2)^{-d}\),
which is a reparametrized \(t\)-kernel.
For \(d=1\), this is the kernel of the Cauchy density with scale
sigma
. In Geostatistics, a correlation function of this
kind is known as the Cauchy model.
The parameters are optimized on the log-scale to ensure
positivity, i.e. \(\sigma = \exp(\tilde{\sigma})\) and
\(d = \exp(\tilde{d})\), where \((\tilde{\sigma}, \tilde{d})\)
is the parameter vector.
The predefined temporal interaction functions are defined as follows:
tiaf.constant
:\(g(t) = 1\)
tiaf.step
:\(g(t) = \sum_{k=0}^K \exp(\alpha_k) I_k(t)\),
where \(\alpha_0 = 0\), and \(\alpha_1, \dots, \alpha_K\) are
the parameters (heights) to estimate. \(I_k(t)\) indicates
if \(t\) belongs to the \(k\)th interval
according to c(0,knots,maxRange)
, where \(k=0\) indicates
the interval c(0,knots[1])
.
tiaf.exponential
:\(g(t|\kappa) = \exp(-\alpha_\kappa t)\),
which is the kernel of the exponential distribution.
If nTypes=1
(single-type epidemic or type-invariant
tiaf
in multi-type epidemic), then
\(\alpha_\kappa = \alpha\) for all types \(\kappa\).
The specification of an interaction function, which is a list.
See siaf
and tiaf
, respectively, for a
description of its components.
Meyer, S. (2010):
Spatio-Temporal Infectious Disease Epidemiology based on Point Processes.
Master's Thesis, Ludwig-Maximilians-Universität
München.
Available as https://epub.ub.uni-muenchen.de/11703/
Meyer, S., Elias, J. and Höhle, M. (2012): A space-time conditional intensity model for invasive meningococcal disease occurrence. Biometrics, 68, 607-616. doi: 10.1111/j.1541-0420.2011.01684.x
Meyer, S. and Held, L. (2014): Power-law models for infectious disease spread. The Annals of Applied Statistics, 8 (3), 1612-1639. doi: 10.1214/14-AOAS743
Meyer, S., Held, L. and Höhle, M. (2017): Spatio-temporal analysis of epidemic phenomena using the R package surveillance. Journal of Statistical Software, 77 (11), 1-55. doi: 10.18637/jss.v077.i11
Sebastian Meyer
# constant temporal dispersal
tiaf.constant()
# step function kernel
tiaf.step(c(3,7), maxRange=14, nTypes=2)
# exponential temporal decay
tiaf.exponential()
# Type-dependent Gaussian spatial interaction function using an adaptive
# two-dimensional midpoint-rule to integrate it over polygonal domains
siaf.gaussian(2, F.adaptive=TRUE)
# Single-type Gaussian spatial interaction function (using polyCub.iso)
siaf.gaussian()
# Exponential kernel
siaf.exponential()
# Power-law kernel
siaf.powerlaw()
# Power-law kernel with fixed sigma = 1
siaf.powerlaw1()
# "lagged" power-law
siaf.powerlawL()
# (reparametrized) t-kernel
siaf.student()
# step function kernel
siaf.step(c(10,20,50), maxRange=100)