Bayesian Outbreak Detection Algorithm (BODA)
boda.RdThe function takes range values of a univariate surveillance time
series sts and for each time point uses a negative binomial
regression model to compute the predictive posterior distribution for
the current observation. The
\((1-\alpha)\cdot 100\%\)
quantile of this predictive distribution is then
used as bound: If the actual observation is above the bound an alarm
is raised.
The Bayesian Outbreak Detection Algorithm (boda) is due to
Manitz and Höhle (2013) and its implementation is
illustrated in Salmon et al. (2016).
However, boda should be considered as an experiment, see the
Warning section below!
Arguments
- sts
object of class sts (including the
observedand thestatetime series)- control
Control object given as a
listcontaining the following components:rangeSpecifies the index of all timepoints which should be tested. If range is
NULLall possible timepoints are used.XData frame (or matrix) of covariates with as many rows as there are time points, i.e.,
nrow(sts).trendBoolean indicating whether a linear trend term should be included in the model for the expectation the log-scale
seasonBoolean to indicate whether a cyclic spline should be included.
priorModel for temporal smoothing using an
f(time, model)term in theinlaformula.alphaThe threshold for declaring an observed count as an aberration is the \((1-\alpha)\cdot 100\%\) quantile of the predictive posterior.
mc.munumc.yNumber of samples of \(y\) to generate for each par of the mean and size parameter. A total of \(mc.munu \times mc.y\) samples are generated.
verboseArgument sent to the inla call. When using ESS it might be necessary to force verbose mode for INLA to work.
samplingMethodShould one sample from the parameters joint distribution (joint) or from their respective marginal posterior distribution (marginals)?
- quantileMethod
Character, either
MCorMM. Indicates how to compute the quantile based on the posterior distribution (no matter the inference method): either by samplingmc.munuvalues from the posterior distribution of the parameters and then for each sampled parameters vector samplingmc.yresponse values so that one gets a vector of response values based on which one computes an empirical quantile (MC method, as explained in Manitz and Höhle 2013); or by samplingmc.munufrom the posterior distribution of the parameters and then compute the quantile of the mixture distribution using bisectioning, which is faster.
Note
This function requires the R package INLA, which is currently
not available from CRAN. It can be obtained from INLA's own
repository via
install.packages("INLA", repos="https://inla.r-inla-download.org/R/stable").
Warning
This function is currently experimental!! It also heavily depends on the INLA package so changes there might affect the operational ability of this function. Since the computations for the Bayesian GAM are quite involved do not expect this function to be particularly fast.
Results are not reproducible if INLA uses parallelization (as by default);
set INLA::inla.setOption(num.threads = "1:1") to avoid that,
then do set.seed as usual.
References
Manitz, J. and Höhle, M. (2013): Bayesian outbreak detection algorithm for monitoring reported cases of campylobacteriosis in Germany. Biometrical Journal, 55(4), 509-526.
Salmon, M., Schumacher, D. and Höhle, M. (2016): Monitoring count time series in R: Aberration detection in public health surveillance. Journal of Statistical Software, 70 (10), 1-35. doi:10.18637/jss.v070.i10
Examples
if (FALSE) { # \dontrun{
## running this example takes a couple of minutes
#Load the campylobacteriosis data for Germany
data("campyDE")
#Make an sts object from the data.frame
cam.sts <- sts(epoch=campyDE$date,
observed=campyDE$case, state=campyDE$state)
#Define monitoring period
# range <- which(epoch(cam.sts)>=as.Date("2007-01-01"))
# range <- which(epoch(cam.sts)>=as.Date("2011-12-10"))
range <- tail(1:nrow(cam.sts),n=2)
control <- list(range=range, X=NULL, trend=TRUE, season=TRUE,
prior='iid', alpha=0.025, mc.munu=100, mc.y=10,
samplingMethod = "joint")
#Apply the boda algorithm in its simples form, i.e. spline is
#described by iid random effects and no extra covariates
library("INLA") # needs to be attached
cam.boda1 <- boda(cam.sts, control=control)
plot(cam.boda1, xlab='time [weeks]', ylab='No. reported', dx.upperbound=0)
} # }