This function extracts the digits at a chosen position (leading or trailing) from a numeric vector and tests their distribution against a reference distribution. By default, the leading digits are tested against Benford's law.
digit_test(
x,
check = c("first", "last", "firsttwo", "lasttwo"),
reference = "benford",
conf.level = 0.95,
prior = FALSE
)
x: a numeric vector.
check: the location of the digits to analyze. Can be first, last, firsttwo, or lasttwo.
reference: a character string giving the reference distribution for the digits, or a vector of probabilities for each digit. Can be benford for Benford's law or uniform for the uniform distribution. An error is given if any entry of reference is negative. Probabilities that do not sum to one are normalized.
conf.level: a numeric value between 0 and 1 specifying the confidence level (i.e., 1 - audit risk / detection risk).
prior: a logical specifying whether to use a prior distribution, or a numeric value equal to or larger than 1 specifying the prior concentration parameter, or a numeric vector containing the prior parameters for the Dirichlet distribution on the digit categories.
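The check and reference arguments can be illustrated with a small base R sketch. The helpers leading_digits() and normalize_reference() below are hypothetical and not part of the package; they only mirror the documented behavior (digits taken from a chosen position, negative reference entries rejected, remaining entries rescaled to sum to one). The package's own extraction rules may differ, for example in how zeros or single-digit values are handled.

```r
# Hypothetical helper (not from the package): extract the first n digits
# of each value by formatting it in scientific notation.
leading_digits <- function(x, n = 1) {
  x <- abs(x[is.finite(x) & x != 0])          # zeros have no leading digit
  s <- formatC(x, format = "e", digits = 14)  # e.g. "1.23450000000000e+02"
  s <- gsub(".", "", s, fixed = TRUE)         # drop the decimal point
  as.integer(substr(s, 1, n))                 # keep the first n digits
}

# Hypothetical helper (not from the package): validate and rescale a
# custom reference vector, following the documented rules.
normalize_reference <- function(reference) {
  if (any(reference < 0)) {
    stop("all entries of 'reference' must be non-negative")
  }
  reference / sum(reference)
}

leading_digits(c(123.45, 0.0456, -7.8, 9))         # first digits
leading_digits(c(123.45, 0.0456, -7.8, 9), n = 2)  # first two digits
normalize_reference(1:9)                           # weights rescaled to sum to 1
```

Note that this sketch pads single-digit values with a trailing zero when n = 2 (9 becomes 90), which is a choice made here for simplicity rather than documented package behavior.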
An object of class jfaDistr containing:
the specified data.
a numeric value between 0 and 1 giving the confidence level.
the observed counts.
the expected counts under the null hypothesis.
the number of observations in x.
the value of the chi-squared test statistic.
the degrees of freedom of the approximate chi-squared distribution of the test statistic.
the p-value for the test.
the checked digits.
a vector of digits.
the reference distribution.
a list containing the row numbers corresponding to the observations matching each digit.
a vector indicating which digits deviate from their expected relative frequency under the reference distribution.
a logical indicating whether a prior distribution was used.
a character string giving the name(s) of the data.
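The observed counts, expected counts, test statistic, degrees of freedom, and p-value listed above correspond to a chi-squared goodness-of-fit test. The following base R sketch computes these quantities by hand for the first digit under Benford's law. It is an illustration under the assumption that the first digit is read from scientific notation; it is not confirmed to match the package's internal computation, and the variable names are chosen here for illustration.

```r
set.seed(1)
x <- rnorm(100)

# First digit of each value, read from scientific notation
# (illustrative; the package may extract digits differently).
first <- as.integer(substr(formatC(abs(x), format = "e", digits = 14), 1, 1))

# Observed counts per digit 1..9, including digits that never occur
observed <- table(factor(first, levels = 1:9))

# Expected counts under Benford's law for the first digit
p_benford <- log10(1 + 1 / (1:9))
expected <- length(first) * p_benford

# Pearson chi-squared statistic, degrees of freedom, and p-value
statistic <- sum((observed - expected)^2 / expected)
df <- length(observed) - 1
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

# Cross-check against base R's goodness-of-fit test
# (it may warn that some expected counts are below 5)
chisq.test(observed, p = p_benford)
```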
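Benford's law is defined as \(p(d) = \log_{10}(1 + 1/d)\). The uniform distribution assigns the same probability to every possible digit, \(p(d) = 1/K\), where \(K\) is the number of possible digits (e.g., \(K = 9\) for the first digit).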
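As a quick numerical check, the reference probabilities can be computed directly in base R. This sketch uses the standard form of Benford's law, \(p(d) = \log_{10}(1 + 1/d)\), for the first digit (d = 1, ..., 9) and the first two digits (d = 10, ..., 99); the variable names are chosen here for illustration.

```r
# Benford probabilities for the first digit (d = 1, ..., 9)
benford_first <- log10(1 + 1 / (1:9))
round(benford_first, 3)
# 0.301 0.176 0.125 0.097 0.079 0.067 0.058 0.051 0.046

# Benford probabilities for the first two digits (d = 10, ..., 99)
benford_firsttwo <- log10(1 + 1 / (10:99))

# Uniform reference for the first digit: each of the 9 digits is equally likely
uniform_first <- rep(1 / 9, 9)

# Each set of probabilities sums to one
sum(benford_first)
sum(benford_firsttwo)
sum(uniform_first)
```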
Benford, F. (1938). The law of anomalous numbers. In Proceedings of the American Philosophical Society, 551-572.
set.seed(1)
x <- rnorm(100)
# First digit analysis against Benford's law
digit_test(x, check = "first", reference = "benford")
#> Warning: Some expected counts < 5, Chi-squared approximation may be incorrect
#>
#> Classical Digit Distribution Test
#>
#> data: x
#> n = 100, MAD = 0.033314, X-squared = 14.557, df = 8, p-value = 0.06836
#> alternative hypothesis: leading digit(s) are not distributed according to the benford distribution.
# Bayesian first digit analysis against Benford's law
digit_test(x, check = "first", reference = "benford", prior = TRUE)
#>
#> Bayesian Digit Distribution Test
#>
#> data: x
#> n = 100, MAD = 0.033314, BF₁₀ = 0.019696
#> alternative hypothesis: leading digit(s) are not distributed according to the benford distribution.
# Last digit analysis against the uniform distribution
digit_test(x, check = "last", reference = "uniform")
#>
#> Classical Digit Distribution Test
#>
#> data: x
#> n = 100, MAD = 0.01679, X-squared = 3.68, df = 8, p-value = 0.8848
#> alternative hypothesis: last digit(s) are not distributed according to the uniform distribution.
# Bayesian last digit analysis against the uniform distribution
digit_test(x, check = "last", reference = "uniform", prior = TRUE)
#>
#> Bayesian Digit Distribution Test
#>
#> data: x
#> n = 100, MAD = 0.01679, BF₁₀ = 0.00014198
#> alternative hypothesis: last digit(s) are not distributed according to the uniform distribution.
# Last digit analysis against a custom distribution
digit_test(x, check = "last", reference = 1:9)
#> Warning: Some expected counts < 5, Chi-squared approximation may be incorrect
#>
#> Classical Digit Distribution Test
#>
#> data: x
#> n = 100, MAD = 0.052346, X-squared = 76.864, df = 8, p-value =
#> 2.087e-13
#> alternative hypothesis: last digit(s) are not distributed according to the reference distribution.
# Bayesian last digit analysis against a custom distribution
digit_test(x, check = "last", reference = 1:9, prior = TRUE)
#>
#> Bayesian Digit Distribution Test
#>
#> data: x
#> n = 100, MAD = 0.052346, BF₁₀ = 252577
#> alternative hypothesis: last digit(s) are not distributed according to the reference distribution.
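The Bayesian examples above report a Bayes factor BF₁₀. One standard way to obtain such a quantity for digit counts is to compare the multinomial likelihood under the fixed reference probabilities with the marginal likelihood under a Dirichlet prior. The sketch below implements that textbook formula in base R. The function bf10_multinomial() is hypothetical, and the default prior (all Dirichlet parameters equal to 1) is an assumption; it is not confirmed that the package computes its Bayes factor this way or uses this default for prior = TRUE, so the numbers need not match the output shown above.

```r
# Log of the multivariate beta function B(a) = prod(gamma(a)) / gamma(sum(a))
log_multi_beta <- function(a) {
  sum(lgamma(a)) - lgamma(sum(a))
}

# Hypothetical helper (not from the package): Bayes factor in favor of
# "probabilities follow a Dirichlet(alpha) prior" (H1) over
# "probabilities equal p0" (H0), given observed counts per category.
bf10_multinomial <- function(counts, p0, alpha = rep(1, length(counts))) {
  # Log marginal likelihood under H1 (multinomial coefficient omitted)
  log_m1 <- log_multi_beta(alpha + counts) - log_multi_beta(alpha)
  # Log likelihood under H0 (same coefficient omitted, so it cancels)
  log_m0 <- sum(counts * log(p0))
  exp(log_m1 - log_m0)
}

# Toy data: 20 observations split evenly over two categories,
# tested against equal reference probabilities
bf10_multinomial(counts = c(10, 10), p0 = c(0.5, 0.5))
```

Values below 1 favor the reference distribution and values above 1 favor a deviation from it. Working on the log scale with lgamma() avoids overflow for larger counts.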