The calibration tests implemented here for Poisson or negative binomial predictions of count data are based on proper scoring rules and are described in detail in Wei and Held (2014). The following proper scoring rules are available: the Dawid-Sebastiani score ("dss"), the logarithmic score ("logs"), and the ranked probability score ("rps").
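To make the three scoring rules concrete, the following base-R sketch evaluates them for Poisson forecasts using their standard definitions. The helper names (logs_pois, dss_pois, rps_pois) and the truncation point kmax are assumptions made for this illustration; they are not functions of the package and may differ from its internal implementation.

```r
# Standard definitions of three proper scoring rules, written out for
# Poisson forecasts with mean mu (lower scores are better).
# Helper names and 'kmax' are hypothetical, chosen for this sketch only.

# logarithmic score: minus the log predictive probability of x
logs_pois <- function(x, mu) -dpois(x, lambda = mu, log = TRUE)

# Dawid-Sebastiani score: depends only on the predictive mean and variance
dss_pois <- function(x, mu) {
  sigma2 <- mu                      # Poisson: variance equals mean
  (x - mu)^2 / sigma2 + log(sigma2)
}

# ranked probability score: sum over k of (F(k) - 1(x <= k))^2,
# truncated at kmax (assumed to be large enough for the given mu and x)
rps_pois <- function(x, mu, kmax = 1000) {
  k <- 0:kmax
  mapply(function(xi, mi) sum((ppois(k, mi) - (xi <= k))^2), x, mu)
}

x  <- c(0, 2, 5)                    # made-up observations
mu <- c(1, 2, 3)                    # made-up predictive means
cbind(logs = logs_pois(x, mu), dss = dss_pois(x, mu), rps = rps_pois(x, mu))
```

For negative binomial forecasts the same definitions apply with dnbinom and pnbinom in place of dpois and ppois, and with the negative binomial variance in the Dawid-Sebastiani score.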

calibrationTest(x, ...)

# S3 method for default
calibrationTest(x, mu, size = NULL,
                which = c("dss", "logs", "rps"),
                tolerance = 1e-4, method = 2, ...)



x: the observed counts. All involved functions are vectorized and also accept matrices or arrays.


mu: the means of the predictive distributions for the observations x.


size: either NULL (default), indicating Poisson predictions with mean mu, or the dispersion parameters of negative binomial forecasts for the observations x, parametrized as in dnbinom with variance mu*(1+mu/size).
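The variance formula of this parametrization can be checked numerically in base R. The sketch below sums over a truncated support; the values of mu and size and the truncation point are arbitrary choices for illustration.

```r
# Check numerically that dnbinom(mu =, size =) has variance mu * (1 + mu/size).
# The values of mu and size are arbitrary, chosen for illustration.
mu <- 3
size <- 0.5

k <- 0:5000                          # support, truncated far out in the tail
p <- dnbinom(k, mu = mu, size = size)

m <- sum(k * p)                      # mean of the distribution
v <- sum((k - m)^2 * p)              # variance of the distribution

c(mean = m, variance = v, formula = mu * (1 + mu / size))

# As size grows, the variance approaches the Poisson variance mu
mu * (1 + mu / c(0.5, 5, 50, 5000))
```

Small values of size therefore correspond to strong overdispersion, while large values bring the forecast close to a Poisson distribution.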


which: a character string indicating which proper scoring rule to apply.


tolerance: absolute tolerance for the null expectation and variance of "logs" and "rps". For the latter, see the note below. Unused for which = "dss" (closed form).


method: selection of the \(z\)-statistic: method = 2 refers to the alternative test statistic \(Z_s^*\) of Wei and Held (2014, Discussion), which has been recommended for low counts. method = 1 corresponds to Equation 5 in Wei and Held (2014).
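The general idea of a score-based \(z\)-test can be sketched in base R for Poisson forecasts and the logarithmic score: compare the observed scores with their expectation under the null hypothesis of calibration, and standardize by the null variance. The helper names, the truncation point kmax, and the pooled form of the statistic are assumptions of this sketch; it is not claimed to reproduce either method of calibrationTest exactly.

```r
# Sketch of a score-based calibration z-test for Poisson forecasts and the
# logarithmic score. All helper names are hypothetical; the pooled statistic
# is an assumed form and need not match calibrationTest() output.

# null mean and variance of the log score for one Poisson forecast,
# obtained by summing over a truncated support 0:kmax
logs_null_moments <- function(mu, kmax = 1000) {
  k <- 0:kmax
  p <- dpois(k, mu)
  s <- -dpois(k, mu, log = TRUE)     # score that each outcome k would receive
  e <- sum(p * s)
  c(E = e, V = sum(p * (s - e)^2))
}

pooled_z_logs <- function(x, mu) {
  score <- -dpois(x, mu, log = TRUE)                 # observed log scores
  mom <- vapply(mu, logs_null_moments, c(E = 0, V = 0))
  z <- (sum(score) - sum(mom["E", ])) / sqrt(sum(mom["V", ]))
  c(z = z, p.value = 2 * pnorm(-abs(z)))             # two-sided normal p-value
}

mu <- c(0.5, 1, 3, 6, 10)            # made-up predictive means
x  <- c(0, 1, 2, 7, 9)               # made-up observations
pooled_z_logs(x, mu)
```

A statistic of this kind relies on a normal approximation, which is why a variant suited to low counts is of practical interest.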


...: unused (argument of the generic).


an object of class "htest", which is a list with the following components:


method: a character string indicating the type of test performed (including which scoring rule).

data.name: a character string naming the supplied x argument.


statistic: the \(z\)-statistic of the test.


parameter: the number of predictions underlying the test, i.e., length(x).


p.value: the p-value for the test.
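The layout of such a return value can be illustrated with a hand-made object of class "htest", using the standard component names of that class. All numbers below are made up, and the labels "z" and "n" on the statistic and parameter are assumptions of this sketch.

```r
# Build a hand-made "htest" object to show the layout of the return value.
# All numbers are made up; the labels "z" and "n" are assumed.
z <- 1.2
res <- structure(
  list(method = "illustrative calibration test",
       data.name = "y",
       statistic = c(z = z),
       parameter = c(n = 6),
       p.value = 2 * pnorm(-abs(z))),
  class = "htest"
)

print(res)          # uses the standard print method for "htest" objects
res$p.value         # components are accessed like list elements
```

Because the result is an ordinary list, individual components such as the p-value can be extracted and collected across several tests.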


If the gsl package is installed, its implementations of the Bessel and hypergeometric functions are used when calculating the null expectation and variance of the rps. These functions are faster and yield more accurate results (especially for larger mu).
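Whether the optional gsl package is available in the current R installation can be checked as follows; this is a generic base-R idiom and not specific to this function.

```r
# Check whether the optional gsl package is available in this R installation.
has_gsl <- requireNamespace("gsl", quietly = TRUE)
if (has_gsl) {
  message("gsl is installed")
} else {
  message("gsl is not installed; install.packages(\"gsl\") would add it")
}
```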


Wei, W. and Held, L. (2014): Calibration tests for count data. Test, 23, 787-805.


Sebastian Meyer and Wei Wei


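library("surveillance")  # provides calibrationTest()

mu <- c(0.1, 1, 3, 6, pi, 100)
size <- 0.1
# y is a random draw, so the p-values noted below come from one particular
# sample and will differ between runs unless a seed is set beforehand
y <- rnbinom(length(mu), mu = mu, size = size)
calibrationTest(y, mu = mu, size = size) # p = 0.99 (forecast as simulated)
calibrationTest(y, mu = mu, size = 1) # p = 4.3e-05 (different dispersion)
calibrationTest(y, mu = 1, size = size) # p = 0.6959 (different mean)
calibrationTest(y, mu = 1, size = size, which = "rps") # p = 0.1286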