Audit Sampling: Selection

selection() performs statistical selection of audit samples. It offers flexible implementations of the most common audit sampling algorithms for attributes sampling and monetary unit sampling. The function returns an object of class jfaSelection that can be used with the associated summary() method.

selection(
  data,
  size,
  units = c("items", "values"),
  method = c("interval", "cell", "random", "sieve"),
  values = NULL,
  order = NULL,
  decreasing = FALSE,
  randomize = FALSE,
  replace = FALSE,
  start = 1
)

Arguments

data: a data frame containing the population data.
size: an integer larger than 0 specifying the number of units to select. Can also be an object of class jfaPlanning.
units: a character specifying the type of sampling units. Possible options are items (default) for selection on the level of items (rows) or values for selection on the level of monetary units.
method: a character specifying the sampling algorithm. Possible options are interval (default) for fixed interval sampling, cell for cell sampling, random for random sampling, or sieve for modified sieve sampling.
values: a character specifying the name of a column in data containing the book values of the items.
order: a character specifying the name of a column in data containing the ranks of the items. The items in the data are ordered according to these values in the order indicated by decreasing.
decreasing: a logical specifying whether to order the items from largest to smallest (TRUE) instead of from smallest to largest (FALSE, the default). Only used if order is specified.
randomize: a logical specifying if items should be randomly shuffled prior to selection. Note that randomize = TRUE overrules order.
replace: a logical specifying if sampling units should be selected with replacement. Only used for method random when selecting items.
start: an integer larger than 0 specifying the index of the unit that should be selected in each interval. Only used for method interval.
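
The interaction between order, decreasing, and randomize can be pictured with a small base R sketch. This is only an illustration of the documented behaviour, not the package's internal code; the population data frame and its rank column are hypothetical.

```r
# Hypothetical population; the `rank` column plays the role of `order`.
population <- data.frame(
  id = c("A", "B", "C", "D"),
  rank = c(3, 1, 4, 2)
)

order_col <- "rank"   # assumed name of the ranking column
decreasing <- FALSE   # FALSE: smallest rank first
randomize <- FALSE    # TRUE would overrule the ranking column

if (randomize) {
  # Shuffle the rows; the ranking column is ignored in this case.
  rows <- sample(nrow(population))
} else {
  # Sort the rows by the ranking column in the requested direction.
  rows <- order(population[[order_col]], decreasing = decreasing)
}

# Population as it would be presented to the selection algorithm.
prepared <- population[rows, ]
```

With the settings shown, the rows are arranged by ascending rank before any units are selected.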

Value

An object of class jfaSelection containing:

data: a data frame containing the population data.
sample: a data frame containing the selected data sample.
n.req: an integer giving the requested sample size.
n.units: an integer giving the number of obtained sampling units.
n.items: an integer giving the number of obtained sample items.
N.units: an integer giving the number of sampling units in the population data.
N.items: an integer giving the number of items in the population data.
interval: if method = "interval", a numeric value giving the size of the selection interval.
units: a character indicating the type of sampling units.
method: a character indicating the sampling algorithm.
values: if values is specified, a character indicating the book value column.
start: if method = "interval", an integer giving the index of the selected unit in each interval.
data.name: a character indicating the name of the population data.

Details

This section elaborates on the possible options for the units argument:

items: In attributes sampling each item in the population is a sampling unit. An item with a book value of $5000 is therefore as likely to be selected as an item with a book value of $500.
values: In monetary unit sampling each monetary unit in the population is a sampling unit. An item with a book value of $5000 is therefore ten times as likely to be selected as an item with a book value of $500.
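
The difference between the two types of sampling units can be checked with a few lines of base R. This is a minimal sketch for a single random draw, using a hypothetical three-item population rather than any data shipped with the package.

```r
# Hypothetical population of three items (not the BuildIt data).
population <- data.frame(
  id = c("A", "B", "C"),
  bookValue = c(500, 5000, 4500)
)

# units = "items": every row is one sampling unit, so each item has the
# same chance of being hit by a single draw.
p_items <- rep(1 / nrow(population), nrow(population))

# units = "values": every monetary unit is one sampling unit, so an item's
# chance equals its share of the total book value.
p_values <- population$bookValue / sum(population$bookValue)

# Chance of selecting item B ($5000) relative to item A ($500).
ratio_items <- p_items[2] / p_items[1]
ratio_values <- p_values[2] / p_values[1]
```

Here ratio_items is 1 and ratio_values is approximately 10, which matches the two statements above.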

This section elaborates on the possible options for the method argument:

interval: In fixed interval sampling the sampling units in the population are divided into a number (equal to the sample size) of equally large intervals. In each interval, a single sampling unit is selected according to a fixed starting point (specified by start).
cell: In cell sampling the sampling units in the population are divided into a number (equal to the sample size) of equally large intervals. In each interval, a single sampling unit is selected randomly.
random: In random sampling all sampling units are drawn with equal probability.
sieve: In modified sieve sampling the items with the largest sieve ratios are selected (Hoogduin, Hall, & Tsay, 2010).
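
The following base R sketch illustrates the interval, cell, and sieve methods on the level of monetary units. It is a simplified illustration under stated assumptions, not the package's implementation: the book values are hypothetical, item_of() is a helper introduced here, cell positions are drawn on a continuous scale, and the sieve ratio is assumed to be the book value divided by a uniform random number, following the general idea in Hoogduin, Hall, & Tsay (2010).

```r
# Hypothetical book values (not the BuildIt data).
book_values <- c(500, 5000, 1200, 300, 3000)
n <- 2                            # requested sample size
total <- sum(book_values)         # number of monetary units in the population
interval <- total / n             # monetary units per interval
upper <- cumsum(book_values)      # position of the last monetary unit of each item

# Hypothetical helper: map a monetary-unit position to the row of the item
# that contains it.
item_of <- function(position) {
  findInterval(position, upper, left.open = TRUE) + 1
}

# Fixed interval: take the unit at position `start` in every interval.
start <- 1
pos_interval <- start + (seq_len(n) - 1) * interval
rows_interval <- item_of(pos_interval)

# Cell: take one uniformly random position inside every interval.
set.seed(1)
pos_cell <- (seq_len(n) - 1) * interval + runif(n, min = 0, max = interval)
rows_cell <- item_of(pos_cell)

# Modified sieve (assumed form): divide each book value by a uniform random
# number and keep the n items with the largest ratios.
sieve_ratio <- book_values / runif(length(book_values))
rows_sieve <- order(sieve_ratio, decreasing = TRUE)[seq_len(n)]
```

In this sketch the fixed interval method always lands on the same items for a given start, whereas the cell and sieve results depend on the random seed.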

References

Derks, K., de Swart, J., Wagenmakers, E.-J., Wille, J., & Wetzels, R. (2021). JASP for audit: Bayesian tools for the auditing practice. Journal of Open Source Software, 6(68), 2733. doi:10.21105/joss.02733

Hoogduin, L. A., Hall, T. W., & Tsay, J. J. (2010). Modified sieve sampling: A method for single- and multi-stage probability-proportional-to-size sampling. Auditing: A Journal of Practice & Theory, 29(1), 125-148. doi:10.2308/aud.2010.29.1.125

Leslie, D. A., Teitlebaum, A. D., & Anderson, R. J. (1979). Dollar-unit Sampling: A Practical Guide for Auditors. Copp Clark Pitman; Belmont, CA. ISBN: 9780773042780.

Author

Koen Derks, k.derks@nyenrode.nl

Examples

data("BuildIt")

# Select 100 items using random sampling
set.seed(1)
selection(data = BuildIt, size = 100, method = "random")
#> 
#> 	Audit Sample Selection
#> 
#> data:  BuildIt
#> number of sampling units = 100, number of items = 100
#> sample selected via method 'items' + 'random'

# Select 150 monetary units using fixed interval sampling
selection(
  data = BuildIt, size = 150, units = "values",
  method = "interval", values = "bookValue"
)
#> 
#> 	Audit Sample Selection
#> 
#> data:  BuildIt
#> number of sampling units = 150, number of items = 150
#> sample selected via method 'values' + 'interval'