Temporal and Spatial Interaction Functions for twinstim

A twinstim model as described in Meyer et al. (2012) requires the specification of the spatial and temporal interaction functions ($f$ and $g$, respectively), i.e. how infectivity decays with increasing spatial and temporal distance from the source of infection. Own such functions can be specified (see siaf and tiaf, respectively), but the package already predefines some common dispersal kernels returned by the constructor functions documented here. See Meyer and Held (2014) for various spatial interaction functions, and Meyer et al. (2017, Section 3, available as vignette("twinstim")) for an illustration of the implementation.

Usage

# predefined spatial interaction functions
siaf.constant()
siaf.step(knots, maxRange = Inf, nTypes = 1, validpars = NULL)
siaf.gaussian(nTypes = 1, logsd = TRUE, density = FALSE,
              F.adaptive = FALSE, F.method = "iso",
              effRangeMult = 6, validpars = NULL)
siaf.exponential(nTypes = 1, validpars = NULL, engine = "C")
siaf.powerlaw(nTypes = 1, validpars = NULL, engine = "C")
siaf.powerlaw1(nTypes = 1, validpars = NULL, sigma = 1)
siaf.powerlawL(nTypes = 1, validpars = NULL, engine = "C")
siaf.student(nTypes = 1, validpars = NULL, engine = "C")

# predefined temporal interaction functions
tiaf.constant()
tiaf.step(knots, maxRange = Inf, nTypes = 1, validpars = NULL)
tiaf.exponential(nTypes = 1, validpars = NULL)

Arguments

knots: numeric vector of distances at which the step function switches to a new height. The length of this vector determines the number of parameters to estimate. For identifiability, the step function has height 1 in the first interval $[0,knots_1)$. Note that the implementation is right-continuous, i.e., intervals are $[a,b)$.
An initial choice of knots could be based on quantiles of the observed distances between events and their potential source events. For instance, an identifiable spatial step function could be siaf.step(quantile(getSourceDists(myepi, "space"), c(1,2,4)/10)), where myepi is the "epidataCS" data to be modelled.
maxRange: a scalar larger than any of knots. Per default (maxRange=Inf), the step function never drops to 0 but keeps the last height for any distance larger than the last knot. However, this might not work in some cases, where the last parameter value would become very small and lead to numerical problems. It is then possible to truncate interaction at a distance maxRange (just like what the variables eps.s and eps.t do in the "epidataCS" object).
nTypes: determines the number of parameters ((log-)scales or (log-)shapes) of the kernels. In a multitype epidemic, the different types may share the same spatial interaction function, in which case nTypes=1. Otherwise nTypes should equal the number of event types of the epidemic, in which case every type has its own (log-)scale or (log-)shape, respectively.
Currently, nTypes > 1 is only implemented for siaf.gaussian(F.adaptive = TRUE), tiaf.step, and tiaf.exponential.
logsd,density: logicals affecting the parametrization of the Gaussian kernel. Settings different from the defaults are deprecated. The default is to use only the kernel of the bivariate, isotropic normal distribution (density=FALSE, see Details below), parametrized with the log-standard deviation (logsd=TRUE) to avoid constrained optimisation (L-BFGS-B) or validpars.
The power-law kernels always employ the log-scale for their scale and shape parameters.
F.adaptive,F.method: If F.adaptive = TRUE, then an adaptive bandwidth of adapt*sd will be used in the midpoint-cubature (polyCub.midpoint in package polyCub) of the Gaussian interaction kernel, where adapt is an extra parameter of the returned siaf$F function and defaults to 0.1. It can be customized either by the control.siaf$F argument list of twinstim, or by a numeric specification of F.adaptive in the constructing call, e.g., F.adaptive = 0.05 to achieve higher accuracy.
Otherwise, if F.adaptive = FALSE, the F.method argument determines which polyCub method to use in siaf$F. The accuracy (controlled via, e.g., nGQ, rel.tol, or eps, depending on the cubature method) can then be adjusted in twinstim's control.siaf$F argument.
effRangeMult: determines the effective range for numerical integration in terms of multiples of the standard deviation $\sigma$ of the Gaussian kernel, i.e. with effRangeMult=6 the $6 \sigma$ region around the event is considered as the relevant integration domain instead of the whole observation region W. Setting effRangeMult=NULL will disable the integral approximation with an effective integration range.
validpars: function taking one argument, the parameter vector, indicating if it is valid (see also siaf). If logsd=FALSE and one prefers not to use method="L-BFGS-B" for fitting the twinstim, then validpars could be set to function (pars) pars > 0.
engine: character string specifying the implementation to use. Prior to surveillance 0.14.0, the intrfr functions for polyCub.iso were evaluated in R (and this implementation is available via engine = "R"). The new C-implementation, LinkingTo the newly exported polyCub_iso C-implementation in polyCub 0.6.0, is considerably faster.
sigma: Fixed value of $\sigma$ for the one-parameter power-law kernel.

Details

Evaluation of twinstim's likelihood involves cubature of the spatial interaction function over polygonal domains. Various approaches have been compared by Meyer (2010, Section 3.2) and a new efficient method, which takes advantage of the assumed isotropy, has been proposed by Meyer and Held (2014, Supplement B, Section 2) for evaluation of the power-law kernels. These cubature methods are available in the dedicated R package polyCub and used by the kernels implemented in surveillance.

The readily available spatial interaction functions are defined as follows:

siaf.constant:: $f(s) = 1$
siaf.step:: $f(s) = \sum_{k=0}^K \exp(\alpha_k) I_k(||s||)$,
where $\alpha_0 = 0$, and $\alpha_1, \dots, \alpha_K$ are the parameters (heights) to estimate. $I_k(||s||)$ indicates if distance $||s||$ belongs to the $k$th interval according to c(0,knots,maxRange), where $k=0$ indicates the interval c(0,knots[1]).
Note that siaf.step makes use of the memoise package if it is available – and that is highly recommended to speed up calculations. Specifically, the areas of the intersection of a polygonal domain (influence region) with the “rings” of the two-dimensional step function will be cached such that they are only calculated once for every polydomain (in the first iteration of the twinstim optimization). They are used in the integration components F and Deriv. See Meyer and Held (2014) for a use case and further details.
siaf.gaussian:: $f(s|\kappa) = \exp(-||s||/2/\sigma_\kappa^2)$
If nTypes=1 (single-type epidemic or type-invariant siaf in multi-type epidemic), then $\sigma_\kappa = \sigma$ for all types $\kappa$. If density=TRUE (deprecated), then the kernel formula above is additionally divided by $2 \pi \sigma_\kappa^2$, yielding the density of the bivariate, isotropic Gaussian distribution with zero mean and covariance matrix $\sigma_\kappa^2 I_2$. The standard deviation is optimized on the log-scale (logsd = TRUE, not doing so is deprecated).
siaf.exponential:: $f(s) = exp(-||s||/sigma)$
The scale parameter $sigma$ is estimated on the log-scale, i.e., $\sigma = \exp(\tilde{\sigma})$, and $\tilde{\sigma}$ is the actual model parameter.
siaf.powerlaw:: $f(s) = (||s|| + \sigma)^{-d}$
The parameters are optimized on the log-scale to ensure positivity, i.e., $\sigma = \exp(\tilde{\sigma})$ and $d = \exp(\tilde{d})$, where $(\tilde{\sigma}, \tilde{d})$ is the parameter vector. If a power-law kernel is not identifiable for the dataset at hand, the exponential kernel or a lagged power law are useful alternatives.
siaf.powerlaw1:: $f(s) = (||s|| + 1)^{-d}$,
i.e., siaf.powerlaw with fixed $\sigma = 1$. A different fixed value for $sigma$ can be specified via the sigma argument of siaf.powerlaw1. The decay parameter $d$ is estimated on the log-scale.
siaf.powerlawL:: $f(s) = (||s||/\sigma)^{-d}$, for $||s|| \ge \sigma$, and $f(s) = 1$ otherwise,
which is a Lagged power-law kernel featuring uniform short-range dispersal (up to distance $\sigma$) and a power-law decay (Pareto-style) from distance $\sigma$ onwards. The parameters are optimized on the log-scale to ensure positivity, i.e. $\sigma = \exp(\tilde{\sigma})$ and $d = \exp(\tilde{d})$, where $(\tilde{\sigma}, \tilde{d})$ is the parameter vector. However, there is a caveat associated with this kernel: Its derivative wrt $\tilde{\sigma}$ is mathematically undefined at the threshold $||s||=\sigma$. This local non-differentiability makes twinstim's likelihood maximization sensitive wrt parameter start values, and is likely to cause false convergence warnings by nlminb. Possible workarounds are to use the slow and robust method="Nelder-Mead", or to just ignore the warning and verify the result by sets of different start values.
siaf.student:: $f(s) = (||s||^2 + \sigma^2)^{-d}$,
which is a reparametrized $t$-kernel. For $d=1$, this is the kernel of the Cauchy density with scale sigma. In Geostatistics, a correlation function of this kind is known as the Cauchy model.
The parameters are optimized on the log-scale to ensure positivity, i.e. $\sigma = \exp(\tilde{\sigma})$ and $d = \exp(\tilde{d})$, where $(\tilde{\sigma}, \tilde{d})$ is the parameter vector.

The predefined temporal interaction functions are defined as follows:

tiaf.constant:: $g(t) = 1$
tiaf.step:: $g(t) = \sum_{k=0}^K \exp(\alpha_k) I_k(t)$,
where $\alpha_0 = 0$, and $\alpha_1, \dots, \alpha_K$ are the parameters (heights) to estimate. $I_k(t)$ indicates if $t$ belongs to the $k$th interval according to c(0,knots,maxRange), where $k=0$ indicates the interval c(0,knots[1]).
tiaf.exponential:: $g(t|\kappa) = \exp(-\alpha_\kappa t)$,
which is the kernel of the exponential distribution. If nTypes=1 (single-type epidemic or type-invariant tiaf in multi-type epidemic), then $\alpha_\kappa = \alpha$ for all types $\kappa$.

Value

The specification of an interaction function, which is a list. See siaf and tiaf, respectively, for a description of its components.

References

Meyer, S. (2010): Spatio-Temporal Infectious Disease Epidemiology based on Point Processes. Master's Thesis, Ludwig-Maximilians-Universität München.
Available as https://epub.ub.uni-muenchen.de/11703/

Meyer, S., Elias, J. and Höhle, M. (2012): A space-time conditional intensity model for invasive meningococcal disease occurrence. Biometrics, 68, 607-616. doi:10.1111/j.1541-0420.2011.01684.x

Meyer, S. and Held, L. (2014): Power-law models for infectious disease spread. The Annals of Applied Statistics, 8 (3), 1612-1639. doi:10.1214/14-AOAS743

Meyer, S., Held, L. and Höhle, M. (2017): Spatio-temporal analysis of epidemic phenomena using the R package surveillance. Journal of Statistical Software, 77 (11), 1-55. doi:10.18637/jss.v077.i11

Author

Sebastian Meyer

Examples

# constant temporal dispersal
tiaf.constant()
# step function kernel
tiaf.step(c(3,7), maxRange=14, nTypes=2)
# exponential temporal decay
tiaf.exponential()

# Type-dependent Gaussian spatial interaction function using an adaptive
# two-dimensional midpoint-rule to integrate it over polygonal domains
siaf.gaussian(2, F.adaptive=TRUE)

# Single-type Gaussian spatial interaction function (using polyCub.iso)
siaf.gaussian()

# Exponential kernel
siaf.exponential()

# Power-law kernel
siaf.powerlaw()

# Power-law kernel with fixed sigma = 1
siaf.powerlaw1()

# "lagged" power-law
siaf.powerlawL()

# (reparametrized) t-kernel
siaf.student()

# step function kernel
siaf.step(c(10,20,50), maxRange=100)

Temporal and Spatial Interaction Functions for `twinstim`