Proper Scoring Rules for Poisson or Negative Binomial Predictions
Proper scoring rules for Poisson or negative binomial predictions
of count data are described in Czado et al. (2009).
The following scores are implemented: the logarithmic score, the ranked probability score, the Dawid-Sebastiani score, and the squared error score.
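As a rough illustration of what these scores measure, the following base R sketch evaluates three of them directly from their usual definitions (Czado et al., 2009). The helper names (pred_sd, logs_sketch, dss_sketch, ses_sketch) are hypothetical and this is not the package's implementation. The negative binomial variance mu * (1 + mu/size) is assumed from R's mu/size parametrization of the negative binomial distribution.

```r
# Hypothetical helpers written from the textbook definitions of the scores.
# size = NULL means a Poisson prediction, otherwise a negative binomial one.
pred_sd <- function(mu, size = NULL) {
  if (is.null(size)) sqrt(mu) else sqrt(mu * (1 + mu / size))
}

# logarithmic score: minus the log predictive probability of the observation
logs_sketch <- function(x, mu, size = NULL) {
  if (is.null(size)) {
    -dpois(x, lambda = mu, log = TRUE)
  } else {
    -dnbinom(x, mu = mu, size = size, log = TRUE)
  }
}

# Dawid-Sebastiani score: squared standardized error plus 2 * log(sd)
dss_sketch <- function(x, mu, size = NULL) {
  sigma <- pred_sd(mu, size)
  ((x - mu) / sigma)^2 + 2 * log(sigma)
}

# squared error score: depends on the predictive mean only
ses_sketch <- function(x, mu, size = NULL) (x - mu)^2

x <- c(0, 2, 5)
mu <- c(1, 2, 3)
logs_sketch(x, mu)             # Poisson predictions
dss_sketch(x, mu, size = 0.5)  # negative binomial predictions
ses_sketch(x, mu)
```

Lower values indicate better predictions for all of these scores.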
x: the observed counts. All functions are vectorized and also accept matrices or arrays. Dimensions are preserved.
mu: the means of the predictive distributions for the observations x.
size: either NULL (default), indicating Poisson predictions with mean mu, or dispersion parameters of negative binomial forecasts for the observations x, parametrized as in R's negative binomial functions (e.g., rnbinom with arguments mu and size).
which: a character vector specifying which scoring rules to apply. By default, all four proper scores are calculated. The normalized squared error score ("nses") is also available, but it is improper and hence not computed by default.
sign: a logical indicating whether the function should also return sign(x-mu), i.e., the sign of the difference between the observed counts and the corresponding predictions.
...: unused (argument of the generic).
k: scalar argument controlling the finite sum approximation for the rps, with truncation at max(x, ceiling(mu + k*sd)).
tolerance: absolute tolerance for the finite sum approximation employed in the rps calculation. A warning is produced if the approximation with k summands is insufficient for the specified tolerance. In this case, increase k for higher precision (or use a larger tolerance).
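The truncated sum behind the ranked probability score can be sketched in base R as follows. This is a minimal scalar illustration under stated assumptions, not the package's code: the name rps_sketch is hypothetical, the value k = 40 is only an illustrative default, and the tolerance check with its warning is not reproduced. The score is taken as the sum over j = 0, 1, ... of the squared differences between the predictive distribution function F(j) and the indicator of x <= j.

```r
# Hypothetical scalar sketch of a truncated ranked probability score.
# size = NULL means a Poisson prediction, otherwise a negative binomial one.
rps_sketch <- function(x, mu, size = NULL, k = 40) {
  sigma <- if (is.null(size)) sqrt(mu) else sqrt(mu * (1 + mu / size))
  kmax <- max(x, ceiling(mu + k * sigma))  # truncation point of the sum
  j <- 0:kmax
  cdf <- if (is.null(size)) {
    ppois(j, lambda = mu)
  } else {
    pnbinom(j, mu = mu, size = size)
  }
  # squared distance between the predictive CDF and the step function at x
  sum((cdf - (x <= j))^2)
}

rps_sketch(3, mu = 3)                # well-centred Poisson prediction
rps_sketch(3, mu = 30)               # prediction far from the observation
rps_sketch(3, mu = 3, size = 0.5)    # overdispersed prediction

# a larger k adds summands that are already negligible here
rps_sketch(3, mu = 1, k = 60) - rps_sketch(3, mu = 1, k = 40)
```

For very wide predictive distributions the neglected tail may no longer be negligible, which is the situation the tolerance argument is meant to detect.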
The scoring functions return the individual scores for the predictions of the observations in x (maintaining their dimension attributes).
The scores-method applies the scoring functions selected via which (and, if requested, calculates sign(x-mu)) and returns the results in an array (via simplify2array), where the last dimension corresponds to the different scores.
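The layout of such a result can be illustrated with base R alone. In this sketch the list score_list is a hypothetical stand-in for the per-score results (here only the squared error and the sign), so it shows how simplify2array stacks equally shaped matrices rather than what the package computes internally.

```r
# Observations and predictive means arranged as 2 x 3 matrices
x  <- matrix(c(0, 2, 5, 1, 3, 4), nrow = 2)
mu <- matrix(2, nrow = 2, ncol = 3)

# Hypothetical stand-in for the list of per-score results
score_list <- list(
  ses  = (x - mu)^2,
  sign = sign(x - mu)
)

# Stack the matrices along a new last dimension named after the list entries
res <- simplify2array(score_list)
dim(res)            # 2 3 2
dimnames(res)[[3]]  # "ses" "sign"
res[, , "ses"]      # the squared errors, in the shape of x
```

For plain vectors the same call yields a matrix with one column per score, so the scores still occupy the last dimension.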
Czado, C., Gneiting, T. and Held, L. (2009): Predictive model assessment for count data. Biometrics, 65 (4), 1254-1261. doi:10.1111/j.1541-0420.2009.01191.x
The R package scoringRules implements the logarithmic score and the (continuous) ranked probability score for many distributions.
library("surveillance")

mu <- c(0.1, 1, 3, 6, 3*pi, 100)
size <- 0.5
set.seed(1)
y <- rnbinom(length(mu), mu = mu, size = size)
scores(y, mu = mu, size = size)
scores(y, mu = mu, size = 1)  # ses ignores the variance
scores(y, mu = 1, size = size)

## apply a specific scoring rule
scores(y, mu = mu, size = size, which = "rps")
rps(y, mu = mu, size = size)

# failed in surveillance <= 1.19.1
stopifnot(!is.unsorted(rps(3, mu = 10^-(0:8)), strictly = TRUE))

if (FALSE) {
  # rps() gives NA (with a warning) if the NegBin is too wide
  rps(1e5, mu = 1e5, size = 1e-5)
}