Bayesian Outbreak Detection Algorithm (BODA)

The function takes range values of a univariate surveillance time series sts and for each time point uses a negative binomial regression model to compute the predictive posterior distribution for the current observation. The \((1-\alpha)\cdot 100\%\) quantile of this predictive distribution is then used as bound: If the actual observation is above the bound an alarm is raised. The Bayesian Outbreak Detection Algorithm (boda) is due to Manitz and Höhle (2013) and its implementation is illustrated in Salmon et al. (2016). However, boda should be considered as an experiment, see the Warning section below!

Usage

boda(sts, control = list(
    range=NULL, X=NULL, trend=FALSE, season=FALSE,
    prior=c('iid','rw1','rw2'), alpha=0.05, mc.munu=100, 
    mc.y=10, verbose=FALSE,
    samplingMethod=c('joint','marginals'),
    quantileMethod=c("MC","MM")
))

Arguments

sts

object of class sts (including the observed and the state time series)

control

Control object given as a list containing the following components:

range: Specifies the index of all timepoints which should be tested. If range is NULL all possible timepoints are used.
X: Data frame (or matrix) of covariates with as many rows as there are time points, i.e., nrow(sts).
trend: Boolean indicating whether a linear trend term should be included in the model for the expectation the log-scale
season: Boolean to indicate whether a cyclic spline should be included.
prior: Model for temporal smoothing using an f(time, model) term in the inla formula.
alpha: The threshold for declaring an observed count as an aberration is the \((1-\alpha)\cdot 100\%\) quantile of the predictive posterior.
mc.munu
mc.y: Number of samples of \(y\) to generate for each par of the mean and size parameter. A total of \(mc.munu \times mc.y\) samples are generated.
verbose: Argument sent to the inla call. When using ESS it might be necessary to force verbose mode for INLA to work.
samplingMethod: Should one sample from the parameters joint distribution (joint) or from their respective marginal posterior distribution (marginals)?
quantileMethod: Character, either MC or MM. Indicates how to compute the quantile based on the posterior distribution (no matter the inference method): either by sampling mc.munu values from the posterior distribution of the parameters and then for each sampled parameters vector sampling mc.y response values so that one gets a vector of response values based on which one computes an empirical quantile (MC method, as explained in Manitz and Höhle 2013); or by sampling mc.munu from the posterior distribution of the parameters and then compute the quantile of the mixture distribution using bisectioning, which is faster.

Note

This function requires the R package INLA, which is currently not available from CRAN. It can be obtained from INLA's own repository via install.packages("INLA", repos="https://inla.r-inla-download.org/R/stable").

Warning

This function is currently experimental!! It also heavily depends on the INLA package so changes there might affect the operational ability of this function. Since the computations for the Bayesian GAM are quite involved do not expect this function to be particularly fast.

Results are not reproducible if INLA uses parallelization (as by default); set INLA::inla.setOption(num.threads = "1:1") to avoid that, then do set.seed as usual.

Author

J. Manitz, M. Höhle, M. Salmon

References

Manitz, J. and Höhle, M. (2013): Bayesian outbreak detection algorithm for monitoring reported cases of campylobacteriosis in Germany. Biometrical Journal, 55(4), 509-526.

Salmon, M., Schumacher, D. and Höhle, M. (2016): Monitoring count time series in R: Aberration detection in public health surveillance. Journal of Statistical Software, 70 (10), 1-35. doi:10.18637/jss.v070.i10

Examples

if (FALSE) { # \dontrun{
  ## running this example takes a couple of minutes

  #Load the campylobacteriosis data for Germany
  data("campyDE")
  #Make an sts object from the data.frame
  cam.sts <-  sts(epoch=campyDE$date,
                  observed=campyDE$case, state=campyDE$state)

  #Define monitoring period
#  range <- which(epoch(cam.sts)>=as.Date("2007-01-01"))
#  range <- which(epoch(cam.sts)>=as.Date("2011-12-10"))
  range <- tail(1:nrow(cam.sts),n=2)

  control <- list(range=range, X=NULL, trend=TRUE, season=TRUE,
                  prior='iid', alpha=0.025, mc.munu=100, mc.y=10,
                  samplingMethod = "joint")

  #Apply the boda algorithm in its simples form, i.e. spline is
  #described by iid random effects and no extra covariates
  library("INLA")  # needs to be attached
  cam.boda1 <- boda(cam.sts, control=control)

  plot(cam.boda1, xlab='time [weeks]', ylab='No. reported', dx.upperbound=0)
} # }