Bayesian Test of Digits against a Reference Distribution

This function extracts the (leading) digits from a numeric vector and performs a Bayesian test of their distribution against a reference distribution. By default, the distribution of leading digits is checked against Benford's law.

distr.btest(x, check = 'first', reference = 'benford', 
            alpha = NULL, BF10 = TRUE, log = FALSE)

Arguments

x: a numeric vector.
check: location of the digits to analyze. Can be first, firsttwo, or last.
reference: a character string giving the reference distribution for the digits, or a vector of probabilities for each digit. Can be benford for Benford's law or uniform for the uniform distribution. An error is given if any entry of reference is negative. Probabilities that do not sum to one are normalized.
alpha: a numeric vector containing the prior parameters for the Dirichlet distribution on the digit categories.
BF10: logical. Whether to compute the Bayes factor in favor of the alternative hypothesis (BF10) or the null hypothesis (BF01).
log: logical. Whether to return the logarithm of the Bayes factor.
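
To make the check and reference arguments more concrete, the following base R sketch shows one way leading digits could be extracted and a custom probability vector validated and normalized. The helpers first_digit and normalize_reference are hypothetical names used only for illustration; they are not part of the package, and its internal implementation may differ.

```r
# Hypothetical helper: first significant digit of each non-zero, finite value.
# Formatting in scientific notation avoids floating-point issues such as
# 0.3 / 0.1 evaluating to slightly less than 3.
first_digit <- function(x) {
  x <- abs(x[is.finite(x) & x != 0])
  as.integer(substr(formatC(x, format = "e", digits = 15), 1, 1))
}

# Hypothetical helper: reject negative entries, then rescale to sum to one.
normalize_reference <- function(reference) {
  if (any(reference < 0)) {
    stop("all entries of 'reference' must be non-negative")
  }
  reference / sum(reference)
}

digits <- first_digit(c(0.3, 123.45, -0.0071, 9))
counts <- table(factor(digits, levels = 1:9))  # observed counts per digit 1..9
reference <- normalize_reference(rep(2, 9))    # rescaled to 1/9 for each digit
```

Tabulating against the fixed levels 1:9 keeps digits with zero observations in the count vector, so the counts line up with the reference probabilities.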

Value

An object of class dt.distr containing:

observed: the observed counts.
expected: the expected counts under the null hypothesis.
n: the number of observations in x.
statistic: the value of the chi-squared test statistic.
parameter: the degrees of freedom of the approximate chi-squared distribution of the test statistic.
p.value: the p-value for the test.
check: checked digits.
digits: vector of digits.
reference: reference distribution.
data.name: a character string giving the name(s) of the data.

Details

Benford's law is defined as \(p(d) = \log_{10}(1 + 1/d)\). The uniform distribution is defined as \(p(d) = 1/k\), where \(k\) is the number of possible digits.
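
As a quick numerical check, the two reference distributions can be computed in base R. This sketch uses the standard form of Benford's law, log10(1 + 1/d), and assumes the nine first digits 1 to 9 and the ninety first-two-digit pairs 10 to 99 as categories; the variable names are illustrative only.

```r
d <- 1:9

# Benford probabilities for the first digit
benford <- log10(1 + 1 / d)

# Uniform probabilities over the same nine digit categories
uniform <- rep(1 / length(d), length(d))

# The same formula applied to the first two digits (10 to 99)
benford_two <- log10(1 + 1 / (10:99))

round(benford, 3)
#> [1] 0.301 0.176 0.125 0.097 0.079 0.067 0.058 0.051 0.046
```

Both Benford vectors sum to one because the terms telescope: the sum of log10((d + 1)/d) over 1 to 9 is log10(10), and over 10 to 99 it is log10(100/10).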

The Bayes factor \(BF_{10}\) quantifies how much more likely the observed data are under \(H_{1}\) (the digits are not distributed according to the reference distribution) than under \(H_{0}\) (the digits are distributed according to the reference distribution). Therefore, \(BF_{10}\) can be interpreted as the relative support in the observed data for \(H_{1}\) versus \(H_{0}\). If \(BF_{10}\) is 1, there is no preference for either \(H_{1}\) or \(H_{0}\). If \(BF_{10}\) is larger than 1, \(H_{1}\) is preferred. If \(BF_{10}\) is between 0 and 1, \(H_{0}\) is preferred. The Bayes factor is calculated using the Savage-Dickey density ratio.
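
To illustrate the Savage-Dickey density ratio, the sketch below assumes a multinomial likelihood for the digit counts with a Dirichlet prior, so that the posterior is Dirichlet with parameters alpha plus the counts. Under that assumption, \(BF_{10}\) is the prior density at the reference probabilities divided by the posterior density at the same point. The helper names, the made-up counts, and the default prior of all ones are assumptions for illustration; they are not taken from the package source and need not match its defaults.

```r
# Log density of a Dirichlet(alpha) distribution evaluated at probability vector p
log_ddirichlet <- function(p, alpha) {
  lgamma(sum(alpha)) - sum(lgamma(alpha)) + sum((alpha - 1) * log(p))
}

# Savage-Dickey ratio on the log scale:
# log BF10 = log prior density at p0 - log posterior density at p0
log_bf10 <- function(counts, p0, alpha = rep(1, length(counts))) {
  log_ddirichlet(p0, alpha) - log_ddirichlet(p0, alpha + counts)
}

p0 <- log10(1 + 1 / (1:9))                    # Benford's law as the null value
counts <- c(30, 18, 12, 10, 8, 7, 6, 5, 4)    # made-up first-digit counts, n = 100

log_bf <- log_bf10(counts, p0)
bf10 <- exp(log_bf)   # Bayes factor in favor of H1
bf01 <- exp(-log_bf)  # Bayes factor in favor of H0
```

Working on the log scale keeps the computation stable for large samples, and negating the log Bayes factor switches between \(BF_{10}\) and \(BF_{01}\).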

References

Benford, F. (1938). The law of anomalous numbers. Proceedings of the American Philosophical Society, 78(4), 551-572.

Author

Koen Derks, k.derks@nyenrode.nl

Examples

set.seed(1)
x <- rnorm(100)

# Bayesian digit analysis against Benford's law
distr.btest(x, check = 'first', reference = 'benford')
#> 
#> 	Digit distribution test
#> 
#> data:  x
#> n = 100, BF10 = 0.019696
#> alternative hypothesis: leading digit(s) are not distributed according to the benford distribution.

# Bayesian digit analysis against Benford's law, custom prior
distr.btest(x, check = 'first', reference = 'benford', alpha = 9:1)
#> 
#> 	Digit distribution test
#> 
#> data:  x
#> n = 100, BF10 = 0.56808
#> alternative hypothesis: leading digit(s) are not distributed according to the benford distribution.

# Bayesian digit analysis against custom distribution
distr.btest(x, check = 'last', reference = rep(1/9, 9))
#> 
#> 	Digit distribution test
#> 
#> data:  x
#> n = 100, BF10 = 0.00018458
#> alternative hypothesis: last digit(s) are not distributed according to the reference distribution.