Proper Scoring Rules for Poisson or Negative Binomial Predictions

Proper scoring rules for Poisson or negative binomial predictions of count data are described in Czado et al. (2009). The following scores are implemented: logarithmic score (logs), ranked probability score (rps), Dawid-Sebastiani score (dss), squared error score (ses).
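As a rough illustration of the definitions in Czado et al. (2009), the three closed-form scores can be sketched in base R as follows. This is not the package's implementation: the helper names (nb_var, logs_sketch, dss_sketch, ses_sketch) are hypothetical, x denotes the observed counts and mu the predictive means, and the negative binomial is assumed to be parametrized as in dnbinom with variance mu*(1+mu/size). All three scores are negatively oriented, i.e., smaller values indicate better predictions.

```r
## Sketch only: hypothetical helpers, not the functions documented here.

## Predictive variance: mu for Poisson (size = NULL),
## mu * (1 + mu/size) for the negative binomial (dnbinom parametrization).
nb_var <- function(mu, size = NULL) {
  if (is.null(size)) mu else mu * (1 + mu / size)
}

## Logarithmic score: minus the log predictive probability of x.
logs_sketch <- function(x, mu, size = NULL) {
  if (is.null(size)) {
    -dpois(x, lambda = mu, log = TRUE)
  } else {
    -dnbinom(x, mu = mu, size = size, log = TRUE)
  }
}

## Dawid-Sebastiani score: ((x - mu)/sigma)^2 + 2*log(sigma).
dss_sketch <- function(x, mu, size = NULL) {
  sigma2 <- nb_var(mu, size)
  (x - mu)^2 / sigma2 + log(sigma2)
}

## Squared error score: depends on the predictive mean only.
ses_sketch <- function(x, mu, size = NULL) {
  (x - mu)^2
}

x  <- c(0, 2, 5)
mu <- c(1, 2, 3)
logs_sketch(x, mu)              # Poisson predictions
dss_sketch(x, mu, size = 0.5)   # negative binomial predictions
ses_sketch(x, mu)
```

The ranked probability score has no such simple closed form for count distributions, which is why it is treated separately via a finite sum.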

Usage

scores(x, ...)

# Default S3 method
scores(x, mu, size = NULL,
       which = c("logs", "rps", "dss", "ses"),
       sign = FALSE, ...)

logs(x, mu, size = NULL)
rps(x, mu, size = NULL, k = 40, tolerance = sqrt(.Machine$double.eps))
dss(x, mu, size = NULL)
ses(x, mu, size = NULL)

Arguments

x: the observed counts. All functions are vectorized and also accept matrices or arrays. Dimensions are preserved.
mu: the means of the predictive distributions for the observations x.
size: either NULL (default), indicating Poisson predictions with mean mu, or dispersion parameters of negative binomial forecasts for the observations x, parametrized as in dnbinom with variance mu*(1+mu/size).
which: a character vector specifying which scoring rules to apply. By default, all four proper scores are calculated. The normalized squared error score ("nses") is also available, but it is improper and hence not computed by default.
sign: a logical indicating if the function should also return sign(x-mu), i.e., the sign of the difference between the observed counts and corresponding predictions.
...: unused (argument of the generic).
k: scalar argument controlling the finite sum approximation for the rps with truncation at max(x, ceiling(mu + k*sd)).
tolerance: absolute tolerance for the finite sum approximation employed in the rps calculation. A warning is produced if the approximation, truncated according to k, is insufficient for the specified tolerance. In this case, increase k for higher precision (or use a larger tolerance).
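The role of k and tolerance can be illustrated with a minimal base R sketch of the truncated sum for a single observation. The helper name rps_sketch is hypothetical, and the convergence check on the neglected upper-tail mass is an assumption made for illustration; the package may use a different criterion.

```r
## Sketch only: a hypothetical single-observation helper, not the package code.
rps_sketch <- function(x, mu, size = NULL, k = 40,
                       tolerance = sqrt(.Machine$double.eps)) {
  ## predictive standard deviation (Poisson or negative binomial)
  sd <- sqrt(if (is.null(size)) mu else mu * (1 + mu / size))
  ## truncation point as described for the argument k
  kmax <- max(x, ceiling(mu + k * sd))
  j <- 0:kmax
  ## predictive CDF at 0, ..., kmax and the neglected upper-tail mass
  if (is.null(size)) {
    Fj <- ppois(j, lambda = mu)
    tail_mass <- ppois(kmax, lambda = mu, lower.tail = FALSE)
  } else {
    Fj <- pnbinom(j, mu = mu, size = size)
    tail_mass <- pnbinom(kmax, mu = mu, size = size, lower.tail = FALSE)
  }
  ## ASSUMPTION: crude convergence check, for illustration only
  if (tail_mass > tolerance) {
    warning("finite sum may be inaccurate; consider increasing 'k'")
  }
  ## sum of squared differences between the predictive CDF
  ## and the step function of the observation
  sum((Fj - (x <= j))^2)
}

rps_sketch(3, mu = 2)               # Poisson prediction
rps_sketch(3, mu = 2, size = 0.5)   # negative binomial prediction
```

Because the sum runs over 0:kmax, very wide predictive distributions lead to very long sums, so a larger k trades computation time for precision.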

Value

The scoring functions return the individual scores for the predictions of the observations in x (maintaining their dimension attributes).

The default scores-method applies the selected (which) scoring functions (and, if sign = TRUE, also computes sign(x-mu)) and returns the results in an array (via simplify2array), where the last dimension corresponds to the different scores.
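The resulting layout can be illustrated with a small base R sketch that combines two hand-computed scores for a matrix of counts via simplify2array. It only mimics the structure described above and assumes Poisson predictions; it is not the package's code.

```r
## Sketch only: mimics the described layout with two simple base R scores.
x  <- matrix(c(0, 2, 5, 1, 3, 4), nrow = 2)   # observed counts (2 x 3)
mu <- matrix(2, nrow = 2, ncol = 3)           # Poisson means

scorelist <- list(
  logs = -dpois(x, lambda = mu, log = TRUE),  # logarithmic score
  ses  = (x - mu)^2                           # squared error score
)
res <- simplify2array(scorelist, higher = TRUE)

dim(res)             # 2 3 2: dimensions of x, then one slice per score
dimnames(res)[[3]]   # "logs" "ses"
res[, , "ses"]       # squared error scores, same shape as x
```

Indexing the last dimension by score name thus recovers an object with the same dimensions as x.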

References

Czado, C., Gneiting, T. and Held, L. (2009): Predictive model assessment for count data. Biometrics, 65 (4), 1254-1261. doi:10.1111/j.1541-0420.2009.01191.x

Author

Sebastian Meyer and Michaela Paul

Examples

mu <- c(0.1, 1, 3, 6, 3*pi, 100)
size <- 0.5
set.seed(1)
y <- rnbinom(length(mu), mu = mu, size = size)
scores(y, mu = mu, size = size)
scores(y, mu = mu, size = 1)  # ses ignores the variance
scores(y, mu = 1, size = size)

## apply a specific scoring rule
scores(y, mu = mu, size = size, which = "rps")
rps(y, mu = mu, size = size)

## rps() gives NA (with a warning) if the NegBin is too wide
rps(1e5, mu = 1e5, size = 1e-5)