jfa is an R package that provides statistical methods for auditing. The package includes functions for planning, performing, and evaluating an audit sample compliant with international auditing standards, as well as functions for auditing data, such as testing the distribution of leading digits in the data against Benford's law. In addition to offering classical frequentist methods, jfa also provides a straightforward implementation of their Bayesian counterparts.
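The Bayesian counterparts are typically reached through an extra argument on the same functions. A minimal sketch, assuming the `prior` argument of `planning()` with its default prior (parameter values here are illustrative):

```r
library(jfa)

# Classical planning and, via prior = TRUE, its Bayesian counterpart
classical <- planning(materiality = 0.03, expected = 0.01)
bayesian  <- planning(materiality = 0.03, expected = 0.01, prior = TRUE)
summary(bayesian)
```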

The functionality of the jfa package and its intended workflow are implemented with a graphical user interface in the Audit module of JASP, a free and open-source software program for statistical analyses.

For documentation on jfa itself, including the package manual, the user guide, worked examples, and other tutorial material, visit the package website.

Author

Koen Derks (maintainer, author) <k.derks@nyenrode.nl>

Please use the citation provided by R when citing this package. A BibTeX entry is available from citation('jfa').

Examples


# Load the jfa package
library(jfa)

#################################
### Example 1: Audit sampling ###
#################################

# Load the BuildIt population
data('BuildIt')

# Stage 1: Planning
stage1 <- planning(materiality = 0.03, expected = 0.01)
summary(stage1)
#> 
#> 	Classical Audit Sample Planning Summary
#> 
#> Options:
#>   Confidence level:              0.95 
#>   Materiality:                   0.03 
#>   Hypotheses:                    H₀: Θ >= 0.03 vs. H₁: Θ < 0.03 
#>   Expected:                      0.01 
#>   Likelihood:                    poisson 
#> 
#> Results:
#>   Minimum sample size:           220 
#>   Tolerable errors:              2.2 
#>   Expected most likely error:    0.01 
#>   Expected upper bound:          0.02997 
#>   Expected precision:            0.01997 
#>   Expected p-value:              0.049761 
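The minimum sample size of 220 can be reproduced in base R. This sketch assumes the Poisson planner takes the smallest n whose expected 95% upper bound, a gamma quantile scaled by n, does not exceed materiality (consistent with the expected upper bound of 0.02997 shown above):

```r
# Smallest n such that the expected upper bound on the misstatement,
# qgamma(0.95, shape = 1 + n * expected) / n, stays below materiality
materiality <- 0.03
expected <- 0.01  # expected misstatement as a proportion

n <- 1
while (qgamma(0.95, shape = 1 + n * expected) / n > materiality) {
  n <- n + 1
}
n
#> [1] 220
```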

# Stage 2: Selection
stage2 <- selection(data = BuildIt, size = stage1,
                    units = 'values', values = 'bookValue',
                    method = 'interval', start = 1)
summary(stage2)
#> 
#> 	Audit Sample Selection Summary
#> 
#> Options:
#>   Requested sample size:         220 
#>   Sampling units:                monetary units 
#>   Method:                        fixed interval sampling 
#>   Starting point:                1 
#> 
#> Data:
#>   Population size:               3500 
#>   Population value:              1403221 
#>   Selection interval:            6378.3 
#> 
#> Results:
#>   Selected sampling units:       220 
#>   Proportion of value:           0.080554 
#>   Selected items:                220 
#>   Proportion of size:            0.062857 
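The fixed interval method above can be illustrated on a toy population: the interval is the total monetary value divided by the sample size, one monetary unit is selected at `start` and then every interval thereafter, and the item containing each selected unit enters the sample. A minimal sketch with hypothetical book values (not jfa's internals):

```r
# Hypothetical book values of a 10-item population (total 10000)
book  <- c(500, 1200, 300, 2500, 800, 150, 900, 2100, 400, 1150)
size  <- 4
start <- 1

interval <- sum(book) / size                      # 2500 monetary units
units <- start + (seq_len(size) - 1) * interval   # selected units: 1, 2501, 5001, 7501
items <- findInterval(units - 1, cumsum(book)) + 1  # map units to the items containing them
items
#> [1] 1 4 5 8
```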

# Stage 3: Execution
sample <- stage2[['sample']]

# Stage 4: Evaluation
stage4 <- evaluation(data = sample, method = 'stringer.binomial',
                     values = 'bookValue', values.audit = 'auditValue')
summary(stage4)
#> 
#> 	Classical Audit Sample Evaluation Summary
#> 
#> Options:
#>   Confidence level:               0.95 
#>   Method:                         stringer.binomial 
#> 
#> Data:
#>   Sample size:                    220 
#>   Number of errors:               5 
#>   Sum of taints:                  2.9999929 
#> 
#> Results:
#>   Most likely error:              0.013636 
#>   95 percent confidence interval: [0, 0.033724] 
#>   Precision:                      0.020087 
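The Stringer bound reported above follows the textbook construction: the binomial upper bound for zero errors, incremented for each observed taint taken in decreasing order. A minimal base-R sketch of that construction (not necessarily jfa's exact internals):

```r
# Stringer bound (binomial): p(j) = qbeta(conf, j + 1, n - j) is the
# upper bound on j errors; increments are weighted by the sorted taints
stringer <- function(taints, n, conf = 0.95) {
  t <- sort(taints[taints > 0], decreasing = TRUE)
  k <- length(t)
  p <- qbeta(conf, 1:(k + 1), n - 0:k)   # p(j) for j = 0..k
  p[1] + if (k > 0) sum((p[-1] - p[-(k + 1)]) * t) else 0
}

# With no errors the bound reduces to the zero-error binomial bound
stringer(numeric(0), n = 220)            # equals qbeta(0.95, 1, 220)
```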

#################################
### Example 2: Data auditing ####
#################################

# Load the sinoForest data set
data('sinoForest')

# Test first digits in the data against Benford's law
digit_test(sinoForest[["value"]], check = "first", reference = "benford")
#> 
#> 	Classical Digit Distribution Test
#> 
#> data:  sinoForest[["value"]]
#> n = 772, MAD = 0.0065981, X-squared = 7.6517, df = 8, p-value = 0.4682
#> alternative hypothesis: leading digit(s) are not distributed according to the benford distribution.
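The test above can be mimicked in base R: Benford's law gives the expected proportion of first digit d as log10(1 + 1/d), and observed counts are compared with a chi-squared test over the nine digits (df = 9 - 1 = 8, matching the output). A minimal sketch on hypothetical values:

```r
# First significant digit of positive numbers
first_digit <- function(x) floor(x / 10^floor(log10(x)))

benford_p <- log10(1 + 1 / (1:9))        # Benford proportions; sum to 1

x <- c(123, 9.8, 0.045, 1800, 31, 2.7)   # hypothetical values
obs <- table(factor(first_digit(x), levels = 1:9))
chisq.test(obs, p = benford_p)           # df = 8, as in the jfa output
```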

######################################
### Example 3: Algorithm auditing ####
######################################

# Load the compas data set
data('compas')

# Test algorithmic fairness against Caucasian ethnicity
model_fairness(compas, "Ethnicity", "TwoYrRecidivism", "Predicted",
               privileged = "Caucasian", positive = "yes")
#> 
#> 	Classical Algorithmic Fairness Test
#> 
#> data: compas
#> n = 6172, X-squared = 18.799, df = 5, p-value = 0.002095
#> alternative hypothesis: fairness metrics are not equal across groups
#> 
#> sample estimates:
#>   African_American: 1.1522 [1.1143, 1.1891], p-value = 5.4523e-05
#>   Asian: 0.86598 [0.11706, 1.6149], p-value = 1
#>   Hispanic: 1.0229 [0.87836, 1.1611], p-value = 0.78393
#>   Native_American: 1.0392 [0.25396, 1.6406], p-value = 1
#>   Other: 1.0596 [0.86578, 1.2394], p-value = 0.5621
#> alternative hypothesis: true odds ratio is not equal to 1
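Each group-level estimate above resembles the odds ratio from a 2x2 table comparing that group with the privileged group on the configured fairness metric. A hedged sketch of one such comparison on hypothetical counts (the exact metric and table jfa builds are not shown here):

```r
# Hypothetical counts: positive vs negative predictions for an
# unprivileged group and the privileged group
tab <- matrix(c(30, 70,    # unprivileged: positive, negative
                25, 75),   # privileged:   positive, negative
              nrow = 2, byrow = TRUE,
              dimnames = list(c("unprivileged", "privileged"),
                              c("positive", "negative")))

ft <- fisher.test(tab)     # odds ratio, confidence interval, and p-value
ft$estimate
ft$p.value
```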