# Predictive Model Assessment for `hhh4`

Models

`hhh4_validation.Rd`

The function `oneStepAhead`

computes successive one-step-ahead
predictions for a (random effects) HHH model fitted by `hhh4`

.
These can be inspected using the `quantile`

, `confint`

or
`plot`

methods.
The associated `scores`

-method computes a number of (strictly) proper
scoring rules based on such one-step-ahead predictions;
see Paul and Held (2011) for details.
There are also `calibrationTest`

and `pit`

methods for `oneStepAhead`

predictions.

Scores, calibration tests and PIT histograms can also be
computed for the fitted values of an `hhh4`

model
(i.e., in-sample/training data evaluation).

## Usage

```
oneStepAhead(result, tp, type = c("rolling", "first", "final"),
which.start = c("current", "final"),
keep.estimates = FALSE, verbose = type != "final",
cores = 1)
# S3 method for oneStepAhead
quantile(x, probs = c(2.5, 10, 50, 90, 97.5)/100, ...)
# S3 method for oneStepAhead
confint(object, parm, level = 0.95, ...)
# S3 method for oneStepAhead
plot(x, unit = 1, probs = 1:99/100,
start = NULL, means.args = NULL, ...)
## assessment of "oneStepAhead" predictions
# S3 method for oneStepAhead
scores(x, which = c("logs", "rps", "dss", "ses"),
units = NULL, sign = FALSE, individual = FALSE, reverse = FALSE, ...)
# S3 method for oneStepAhead
calibrationTest(x, units = NULL, ...)
# S3 method for oneStepAhead
pit(x, units = NULL, ...)
## assessment of the "hhh4" model fit (in-sample predictions)
# S3 method for hhh4
scores(x, which = c("logs", "rps", "dss", "ses"),
subset = x$control$subset, units = seq_len(x$nUnit), sign = FALSE, ...)
# S3 method for hhh4
calibrationTest(x,
subset = x$control$subset, units = seq_len(x$nUnit), ...)
# S3 method for hhh4
pit(x, subset = x$control$subset, units = seq_len(x$nUnit), ...)
```

## Arguments

- result
fitted

`hhh4`

model (class`"hhh4"`

).- tp
numeric vector of length 2 specifying the time range in which to compute one-step-ahead predictions (for the time points

`tp[1]+1`

, ...,`tp[2]+1`

). If a single time index is specified, it is interpreted as`tp[1]`

, and`tp[2]`

is set to the penultimate time point of`result$control$subset`

.- type
The default

`"rolling"`

procedure sequentially refits the model up to each time point in`tp`

and computes the one-step-ahead predictions for the respective next time point. The alternative`type`

s are no true one-step-ahead predictions but much faster:`"first"`

will refit the model for the first time point`tp[1]`

only and use this specific fit to calculate all subsequent predictions, whereas`"final"`

will just use`result`

to calculate these. The latter case thus gives nothing else than a subset of`result$fitted.values`

if the`tp`

's are part of the fitted subset`result$control$subset`

.- which.start
Which initial parameter values should be used when successively refitting the model to subsets of the data (up to time point

`tp[1]`

, up to`tp[1]+1`

, ...) if`type="rolling"`

? Default (`"current"`

) is to use the parameter estimates from the previous time point, and`"final"`

means to always use the estimates from`result`

as initial values. Alternatively,`which.start`

can be a list of`start`

values as expected by`hhh4`

, which then replace the corresponding estimates from`result`

as initial values. This argument is ignored for “non-rolling”`type`

s.- keep.estimates
logical indicating if parameter estimates and log-likelihoods from the successive fits should be returned.

- verbose
non-negative integer (usually in the range

`0:3`

) specifying the amount of tracing information to output. During`hhh4`

model updates, the following verbosity is used:`0`

if`cores > 1`

, otherwise`verbose-1`

if there is more than one time point to predict, otherwise`verbose`

.- cores
the number of cores to use when computing the predictions for the set of time points

`tp`

in parallel (with`mclapply`

). Note that parallelization is not possible in the default setting`type="rolling"`

and`which.start="current"`

(use`which.start="final"`

for this to work).- object
an object of class

`"oneStepAhead"`

.- parm
unused (argument of the generic).

- level
required confidence level of the prediction interval.

- probs
numeric vector of probabilities with values in [0,1].

- unit
single integer or character selecting a unit for which to produce the plot.

- start
x-coordinate of the first prediction. If

`start=NULL`

(default), this is derived from`x`

.- means.args
if a list (of graphical parameters for

`lines`

), the point predictions (from`x$pred`

) are added to the plot.- x
an object of class

`"oneStepAhead"`

or`"hhh4"`

.- which
character vector determining which scores to compute. The package surveillance implements the following proper scoring rules: logarithmic score (

`"logs"`

), ranked probability score (`"rps"`

), Dawid-Sebastiani score (`"dss"`

), and squared error score (`"ses"`

). The normalized SES (`"nses"`

) is also available but it is improper and hence not computed by default.

It is possible to name own scoring rules in`which`

. These must be functions of`(x, mu, size)`

, vectorized in all arguments (time x unit matrices) except that`size`

is`NULL`

in case of a Poisson model. See the available scoring rules for guidance, e.g.,`dss`

.- subset
subset of time points for which to calculate the scores (or test calibration, or produce the PIT histogram, respectively). Defaults to the subset used for fitting the model.

- units
integer or character vector indexing the units for which to compute the scores (or the calibration test or the PIT histogram, respectively). By default, all units are considered.

- sign
logical indicating if the function should also return

`sign(x-mu)`

, i.e., the sign of the difference between the observed counts and corresponding predictions. This does not really make sense when averaging over multiple`units`

with`individual=FALSE`

.- individual
logical indicating if the individual scores of the

`units`

should be returned. By default (`FALSE`

), the individual scores are averaged over all`units`

.- reverse
logical indicating if the rows (time points) should be reversed in the result. The long-standing but awkward default was to do so for the

`oneStepAhead`

-method. This has changed in version 1.16.0, so time points are no longer reversed by default.- ...
Unused by the

`quantile`

,`confint`

and`scores`

methods.

The`plot`

-method passes further arguments to the`fanplot`

function, e.g.,`fan.args`

,`observed.args`

, and`key.args`

can be used to modify the plotting style.

For the`calibrationTest`

-method, further arguments are passed to`calibrationTest.default`

, e.g.,`which`

to select a scoring rule.

For the`pit`

-methods, further arguments are passed to`pit.default`

.

## Value

`oneStepAhead`

returns a list (of class `"oneStepAhead"`

)
with the following components:

- pred
one-step-ahead predictions in a matrix, where each row corresponds to one of the time points requested via the argument

`tp`

, and which has`ncol(result$stsObj)`

unit-specific columns. The rownames indicate the predicted time points and the column names are identical to`colnames(result$stsObj)`

.- observed
matrix with observed counts at the predicted time points. It has the same dimensions and names as

`pred`

.- psi
in case of a negative-binomial model, a matrix of the estimated overdispersion parameter(s) at each time point on the internal -log-scale (1 column if

`"NegBin1"`

,`ncol(observed)`

columns if`"NegBinM"`

or shared overdispersion). For a`"Poisson"`

model, this component is`NULL`

.- allConverged
logical indicating if all successive fits converged.

If `keep.estimates=TRUE`

, there are the following additional elements:

- coefficients
matrix of estimated regression parameters from the successive fits.

- Sigma.orig
matrix of estimated variance parameters from the successive fits.

- logliks
matrix with columns

`"loglikelihood"`

and`"margll"`

with their obvious meanings.

The `quantile`

-method computes quantiles of the one-step-ahead
forecasts. If there is only one unit, it returns a tp x prob matrix,
otherwise a tp x unit x prob array.
The `confint`

-method is a convenient wrapper with `probs`

set
according to the required confidence level.

The function `scores`

computes the scoring rules specified in the
argument `which`

.
If multiple `units`

are selected and `individual=TRUE`

, the
result is an array of dimensions
`c(nrow(pred),length(units),5+sign)`

(up to surveillance

1.8-0, the first two dimensions were collapsed to give a matrix).
Otherwise, the result is a matrix with `nrow(pred)`

rows and
`5+sign`

columns. If there is only one predicted time point, the
first dimension is dropped in both cases.

The `calibrationTest`

- and `pit`

-methods are
just convenient wrappers around the respective default methods.

## References

Czado, C., Gneiting, T. and Held, L. (2009):
Predictive model assessment for count data.
*Biometrics*, **65** (4), 1254-1261.
doi:10.1111/j.1541-0420.2009.01191.x

Paul, M. and Held, L. (2011):
Predictive assessment of a non-linear random effects model for
multivariate time series of infectious disease counts.
*Statistics in Medicine*, **30** (10), 1118-1136.
doi:10.1002/sim.4177

## Examples

```
### univariate salmonella agona count time series
data("salmonella.agona")
## convert from old "disProg" to new "sts" class
salmonella <- disProg2sts(salmonella.agona)
## generate formula for temporal and seasonal trends
f.end <- addSeason2formula(~1 + t, S=1, period=52)
model <- list(ar = list(f = ~1), end = list(f = f.end), family = "NegBin1")
## fit the model
result <- hhh4(salmonella, model)
## do sequential one-step-ahead predictions for the last 5 weeks
pred <- oneStepAhead(result, nrow(salmonella)-5, type="rolling",
which.start="final", verbose=FALSE)
pred
quantile(pred)
confint(pred)
## simple plot of the 80% one-week-ahead prediction interval
## and point forecasts
if (requireNamespace("fanplot"))
plot(pred, probs = c(.1,.9), means.args = list())
# \dontshow{
## test equivalence of parallelized version
if (.Platform$OS.type == "unix" && isTRUE(parallel::detectCores() > 1))
stopifnot(identical(pred,
oneStepAhead(result, nrow(salmonella)-5, type="rolling",
which.start="final", verbose=FALSE, cores=2)))
# }
## note: oneStepAhead(..., type="final") just means fitted values
stopifnot(identical(
unname(oneStepAhead(result, nrow(salmonella)-5, type="final")$pred),
unname(tail(fitted(result), 5))))
## compute scores of the one-step-ahead predictions
(sc <- scores(pred))
## the above uses the scores-method for "oneStepAhead" predictions,
## which is a simple wrapper around the default method:
scores(x = pred$observed, mu = pred$pred, size = exp(pred$psi))
## scores with respect to the fitted values are similar
(scFitted <- scores(result, subset = nrow(salmonella)-(4:0)))
# \dontshow{
## test that scFitted is equivalent to scores(oneStepAhead(..., type = "final"))
stopifnot(all.equal(
scFitted,
scores(oneStepAhead(result, nrow(salmonella)-5, type="final")),
check.attributes = FALSE))
# }
## test if the one-step-ahead predictions are calibrated
calibrationTest(pred) # p = 0.8746
## the above uses the calibrationTest-method for "oneStepAhead" predictions,
## which is a simple wrapper around the default method:
calibrationTest(x = pred$observed, mu = pred$pred, size = exp(pred$psi))
## we can also test calibration of the fitted values
## using the calibrationTest-method for "hhh4" fits
calibrationTest(result, subset = nrow(salmonella)-(4:0))
## plot a (non-randomized) PIT histogram for the predictions
pit(pred)
## the above uses the pit-method for "oneStepAhead" predictions,
## which is a simple wrapper around the default method:
pit(x = pred$observed, pdistr = "pnbinom", mu = pred$pred, size = exp(pred$psi))
### multivariate measles count time series
## (omitting oneStepAhead forecasts here to keep runtime low)
data("measlesWeserEms")
## simple hhh4 model with random effects in the endemic component
measlesModel <- list(
end = list(f = addSeason2formula(~0 + ri(type="iid"))),
ar = list(f = ~1),
family = "NegBin1")
measlesFit <- hhh4(measlesWeserEms, control = measlesModel)
## assess overall (in-sample) calibration of the model, i.e.,
## if the observed counts are from the fitted NegBin distribution
calibrationTest(measlesFit) # default is DSS (not suitable for low counts)
calibrationTest(measlesFit, which = "logs") # p = 0.7238
## to assess calibration in the second year for a specific district
calibrationTest(measlesFit, subset = 53:104, units = "03452", which = "rps")
pit(measlesFit, subset = 53:104, units = "03452")
### For a more sophisticated multivariate analysis of
### areal time series of influenza counts - data("fluBYBW") -
### see the (computer-intensive) demo("fluBYBW") script:
demoscript <- system.file("demo", "fluBYBW.R", package = "surveillance")
demoscript
#file.show(demoscript)
```