# Count Data Regression Charts

`algo.glrnb.Rd`

Count data regression charts for the monitoring of surveillance time series as proposed by Höhle and Paul (2008). The implementation is described in Salmon et al. (2016).

## Usage

```
algo.glrnb(disProgObj, control = list(range=range, c.ARL=5,
mu0=NULL, alpha=0, Mtilde=1, M=-1, change="intercept",
theta=NULL, dir=c("inc","dec"),
ret=c("cases","value"), xMax=1e4))
algo.glrpois(disProgObj, control = list(range=range, c.ARL=5,
mu0=NULL, Mtilde=1, M=-1, change="intercept",
theta=NULL, dir=c("inc","dec"),
ret=c("cases","value"), xMax=1e4))
```

## Arguments

- disProgObj
object of class

`disProg`

to do surveillance for. For new`sts`

-class data, use the`glrnb`

wrapper, or the`sts2disProg`

converter.- control
A list controlling the behaviour of the algorithm

`range`

vector of indices in the observed vector to monitor (should be consecutive)

`mu0`

A vector of in-control values of the mean of the Poisson / negative binomial distribution with the same length as

`range`

. If`NULL`

the observed values in`1:(min(range)-1)`

are used to estimate the beta vector through a generalized linear model. To fine-tune the model one can instead specify`mu0`

as a list with two components:`S`

integer number of harmonics to include (typically 1 or 2)

`trend`

A Boolean indicating whether to include a term

`t`

in the GLM model

The fitting is controlled by the

`estimateGLRNbHook`

function. The in-control mean model is re-fitted after every alarm. The fitted models can be found as a list`mod`

in the`control`

slot after the call.Note: If a value for

`alpha`

is given, then the inverse of this value is used as fixed`theta`

in a`negative.binomial`

`glm`

. If`is.null(alpha)`

then the parameter is estimated as well (using`glm.nb`

) -- see the description of this parameter for details.`alpha`

The (known) dispersion parameter of the negative binomial distribution, i.e. the parametrization of the negative binomial is such that the variance is \(mean + alpha*mean^2\). Note: This parametrization is the inverse of the shape parametrization used in R -- for example in

`dnbinom`

and`glr.nb`

. Hence, if`alpha=0`

then the negative binomial distribution boils down to the Poisson distribution and a call of`algo.glrnb`

is equivalent to a call to`algo.glrpois`

. If`alpha=NULL`

the parameter is calculated as part of the in-control estimation. However, the parameter is estimated only once from the first fit. Subsequent fittings are only for the parameters of the linear predictor with`alpha`

fixed.`c.ARL`

threshold in the GLR test, i.e. \(c_{\gamma}\)

`Mtilde`

number of observations needed before we have a full rank the typical setup for the "

`intercept`

" and "`epi`

" charts is`Mtilde=1`

`M`

number of time instances back in time in the window-limited approach, i.e. the last value considered is \(\max{1,n-M}\). To always look back until the first observation use

`M=-1`

.`change`

a string specifying the type of the alternative. Currently the two choices are

`intercept`

and`epi`

. See the SFB Discussion Paper 500 for details.`theta`

if

`NULL`

then the GLR scheme is used. If not`NULL`

the prespecified value for \(\kappa\) or \(\lambda\) is used in a recursive LR scheme, which is faster.`dir`

a string specifying the direction of testing in GLR scheme. With

`"inc"`

only increases in \(x\) are considered in the GLR-statistic, with`"dec"`

decreases are regarded.`ret`

a string specifying the type of

`upperbound`

-statistic that is returned. With`"cases"`

the number of cases that would have been necessary to produce an alarm or with`"value"`

the GLR-statistic is computed (see below).`xMax`

Maximum value to try for x to see if this is the upperbound number of cases before sounding an alarm (Default: 1e4). This only applies for the GLR using the NegBin when

`ret="cases"`

-- see details.

## Value

`algo.glrpois`

simply calls `algo.glrnb`

with

`control$alpha`

set to 0.

`algo.glrnb`

returns a list of class

`survRes`

(surveillance result), which includes the alarm
value for recognizing an outbreak (1 for alarm, 0 for no alarm),
the threshold value for recognizing the alarm and the input object
of class disProg. The `upperbound`

slot of the object are
filled with the current \(GLR(n)\) value or with the number of
cases that are necessary to produce an alarm at any time point

\(\leq n\). Both lead to the same alarm timepoints, but

`"cases"`

has an obvious interpretation.

## Details

This function implements the seasonal count data chart based on
generalized likelihood ratio (GLR) as described in the Höhle and Paul
(2008) paper. A moving-window generalized likelihood ratio
detector is used, i.e. the detector has the form
$$N = \inf\left\{ n : \max_{1\leq k \leq
n} \left[ \sum_{t=k}^n \log \left\{
\frac{f_{\theta_1}(x_t|z_t)}{f_{\theta_0}(x_t|z_t)} \right\}
\right] \geq c_\gamma \right\} $$
where instead of \(1\leq k \leq n\) the GLR statistic is
computed for all \(k \in \{n-M, \ldots, n-\tilde{M}+1\}\). To
achieve the typical behaviour from \(1\leq k\leq n\) use
`Mtilde=1`

and `M=-1`

.

So \(N\) is the time point where the GLR statistic is above the
threshold the first time: An alarm is given and the surveillance is
reset starting from time \(N+1\). Note that the same
`c.ARL`

as before is used, but if `mu0`

is different at
\(N+1,N+2,\ldots\) compared to time \(1,2,\ldots\) the run length
properties differ. Because `c.ARL`

to obtain a specific ARL can
only be obtained my Monte Carlo simulation there is no good way to
update `c.ARL`

automatically at the moment. Also, FIR GLR-detectors
might be worth considering.

In case `is.null(theta)`

and `alpha>0`

as well as
`ret="cases"`

then a brute-force search is conducted for each time
point in range in order to determine the number of cases necessary
before an alarm is sounded. In case no alarm was sounded so far by time
\(t\), the function increases \(x[t]\) until an alarm is sounded any
time before time point \(t\). If no alarm is sounded by `xMax`

, a return value
of 1e99 is given. Similarly, if an alarm was sounded by time \(t\) the
function counts down instead. Note: This is slow experimental code!

At the moment, window limited ```intercept`

'' charts have not been
extensively tested and are at the moment not supported. As speed is
not an issue here this doesn't bother too much. Therefore, a value of
`M=-1`

is always used in the intercept charts.

## References

Höhle, M. and Paul, M. (2008): Count data regression charts for the monitoring of surveillance time series. Computational Statistics and Data Analysis, 52 (9), 4357-4368.

Salmon, M., Schumacher, D. and Höhle, M. (2016):
Monitoring count time series in R: Aberration detection in public
health surveillance. *Journal of Statistical Software*,
**70** (10), 1-35. doi:10.18637/jss.v070.i10

## Examples

```
##Simulate data and apply the algorithm
S <- 1 ; t <- 1:120 ; m <- length(t)
beta <- c(1.5,0.6,0.6)
omega <- 2*pi/52
#log mu_{0,t}
base <- beta[1] + beta[2] * cos(omega*t) + beta[3] * sin(omega*t)
#Generate example data with changepoint and tau=tau
tau <- 100
kappa <- 0.4
mu0 <- exp(base)
mu1 <- exp(base + kappa)
## Poisson example
#Generate data
set.seed(42)
x <- rpois(length(t),mu0*(exp(kappa)^(t>=tau)))
s.ts <- sts(observed=x, state=(t>=tau))
#Plot the data
plot(s.ts, xaxis.labelFormat=NULL)
#Run
cntrl = list(range=t,c.ARL=5, Mtilde=1, mu0=mu0,
change="intercept",ret="value",dir="inc")
glr.ts <- glrpois(s.ts,control=cntrl)
plot(glr.ts, xaxis.labelFormat=NULL, dx.upperbound=0.5)
lr.ts <- glrpois(s.ts,control=c(cntrl,theta=0.4))
plot(lr.ts, xaxis.labelFormat=NULL, dx.upperbound=0.5)
#using the legacy interface for "disProg" data
lr.ts0 <- algo.glrpois(sts2disProg(s.ts), control=c(cntrl,theta=0.4))
stopifnot(upperbound(lr.ts) == lr.ts0$upperbound)
## NegBin example
#Generate data
set.seed(42)
alpha <- 0.2
x <- rnbinom(length(t),mu=mu0*(exp(kappa)^(t>=tau)),size=1/alpha)
s.ts <- sts(observed=x, state=(t>=tau))
#Plot the data
plot(s.ts, xaxis.labelFormat=NULL)
#Run GLR based detection
cntrl = list(range=t,c.ARL=5, Mtilde=1, mu0=mu0, alpha=alpha,
change="intercept",ret="value",dir="inc")
glr.ts <- glrnb(s.ts, control=cntrl)
plot(glr.ts, xaxis.labelFormat=NULL, dx.upperbound=0.5)
#CUSUM LR detection with backcalculated number of cases
cntrl2 = list(range=t,c.ARL=5, Mtilde=1, mu0=mu0, alpha=alpha,
change="intercept",ret="cases",dir="inc",theta=1.2)
glr.ts2 <- glrnb(s.ts, control=cntrl2)
plot(glr.ts2, xaxis.labelFormat=NULL)
```