Proper Scoring Rules for Poisson or Negative Binomial Predictions
Proper scoring rules for Poisson or negative binomial predictions
of count data are described in Czado et al. (2009).
The following scores are implemented:
logarithmic score (logs),
ranked probability score (rps),
Dawid-Sebastiani score (dss),
squared error score (ses).
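For orientation, the four scores can be written as below. This is a summary of the definitions as given in Czado et al. (2009), not text from this help page. The notation is an assumption made for this summary: P is the predictive distribution with probability mass function p_k, cumulative distribution function P_k, mean \mu and standard deviation \sigma, and x is the observed count.

```latex
\begin{align*}
\mathrm{logs}(P, x) &= -\log p_x, \\
\mathrm{rps}(P, x)  &= \sum_{k=0}^{\infty} \bigl\{ P_k - \mathbf{1}(x \le k) \bigr\}^2, \\
\mathrm{dss}(P, x)  &= \left( \frac{x - \mu}{\sigma} \right)^2 + 2 \log \sigma, \\
\mathrm{ses}(P, x)  &= (x - \mu)^2 .
\end{align*}
```

All four are negatively oriented, so smaller values indicate better predictions. For Poisson predictions \sigma^2 = \mu. The infinite sum in the rps is the quantity that the k and tolerance arguments approximate by a finite sum.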
Arguments
- x
the observed counts. All functions are vectorized and also accept matrices or arrays. Dimensions are preserved.
- mu
the means of the predictive distributions for the observations x.
- size
either NULL (default), indicating Poisson predictions with mean mu, or dispersion parameters of negative binomial forecasts for the observations x, parametrized as in dnbinom with variance mu*(1+mu/size).
- which
a character vector specifying which scoring rules to apply. By default, all four proper scores are calculated. The normalized squared error score ("nses") is also available but it is improper and hence not computed by default.
- sign
a logical indicating if the function should also return sign(x-mu), i.e., the sign of the difference between the observed counts and corresponding predictions.
- ...
unused (argument of the generic).
- k
scalar argument controlling the finite sum approximation for the rps with truncation at max(x, ceiling(mu + k*sd)).
- tolerance
absolute tolerance for the finite sum approximation employed in the rps calculation. A warning is produced if the approximation with k summands is insufficient for the specified tolerance. In this case, increase k for higher precision (or use a larger tolerance).
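The roles of k and tolerance can be illustrated with a minimal base-R sketch of a truncated-sum ranked probability score for a single observation. The function name rps_sketch, the default k = 40, and the way the tolerance is checked (against the last summand) are assumptions made for this illustration; the package's rps() may differ in all of these details.

```r
# Hypothetical helper, NOT the package's rps(): truncated-sum approximation
# of the ranked probability score for one observed count x.
rps_sketch <- function(x, mu, size = NULL, k = 40,
                       tolerance = sqrt(.Machine$double.eps)) {
  # predictive standard deviation: Poisson if size is NULL, else NegBin
  # with variance mu * (1 + mu/size)
  sd <- if (is.null(size)) sqrt(mu) else sqrt(mu * (1 + mu / size))
  # truncation point as described for the k argument
  kmax <- max(x, ceiling(mu + k * sd))
  ks <- 0:kmax
  # predictive CDF evaluated at 0, ..., kmax
  Fk <- if (is.null(size)) {
    ppois(ks, lambda = mu)
  } else {
    pnbinom(ks, mu = mu, size = size)
  }
  # assumed stand-in for the tolerance check: the last summand should be
  # negligible, otherwise the truncated tail may still matter
  if ((1 - Fk[length(Fk)])^2 > tolerance)
    warning("finite sum may be too short; increase k")
  # squared distance between predictive CDF and the step function at x
  sum((Fk - (x <= ks))^2)
}

rps_sketch(2, mu = 3)              # Poisson prediction
rps_sketch(2, mu = 3, size = 0.5)  # overdispersed NegBin prediction
```

The sketch allocates a vector of length kmax + 1, so it is only practical when mu + k*sd is moderate.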
Value
The scoring functions return the individual scores for the predictions
of the observations in x (maintaining their dimension attributes).
The default scores method applies the selected (which) scoring functions
(and calculates sign(x-mu)) and returns the results in an array
(via simplify2array), where the last dimension corresponds to the
different scores.
References
Czado, C., Gneiting, T. and Held, L. (2009): Predictive model assessment for count data. Biometrics, 65 (4), 1254-1261. doi:10.1111/j.1541-0420.2009.01191.x
See also
The R package scoringRules implements the logarithmic score and the (continuous) ranked probability score for many distributions.
Examples
mu <- c(0.1, 1, 3, 6, 3*pi, 100)
size <- 0.5
set.seed(1)
y <- rnbinom(length(mu), mu = mu, size = size)
scores(y, mu = mu, size = size)
scores(y, mu = mu, size = 1) # ses ignores the variance
scores(y, mu = 1, size = size)
## apply a specific scoring rule
scores(y, mu = mu, size = size, which = "rps")
rps(y, mu = mu, size = size)
## rps() gives NA (with a warning) if the NegBin is too wide
rps(1e5, mu = 1e5, size = 1e-5)
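The closed-form scores can also be reproduced with base R alone, which makes the comment "ses ignores the variance" easy to verify. The helper names below (pred_var, logs_sketch, dss_sketch, ses_sketch) are hypothetical and written only for this illustration; they follow the definitions in Czado et al. (2009) and the variance parametrization stated for size, and are not the package functions.

```r
# Hypothetical helpers for illustration; NOT the package functions.
pred_var <- function(mu, size = NULL) {
  # Poisson variance is mu; NegBin variance is mu * (1 + mu/size)
  if (is.null(size)) mu else mu * (1 + mu / size)
}
logs_sketch <- function(x, mu, size = NULL) {
  # negative log predictive probability of the observed count
  if (is.null(size)) {
    -dpois(x, lambda = mu, log = TRUE)
  } else {
    -dnbinom(x, mu = mu, size = size, log = TRUE)
  }
}
dss_sketch <- function(x, mu, size = NULL) {
  sigma2 <- pred_var(mu, size)
  # equals ((x - mu)/sigma)^2 + 2*log(sigma)
  (x - mu)^2 / sigma2 + log(sigma2)
}
ses_sketch <- function(x, mu, size = NULL) {
  # size is accepted but unused: the squared error depends on the mean only
  (x - mu)^2
}

x  <- c(0, 2, 7)
mu <- c(0.5, 3, 6)
cbind(logs = logs_sketch(x, mu, size = 0.5),
      dss  = dss_sketch(x, mu, size = 0.5),
      ses  = ses_sketch(x, mu, size = 0.5))
```

Changing size in the last call alters the logs and dss columns but leaves the ses column unchanged.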