This function takes a data frame and performs statistical selection according to one of four algorithms: fixed interval sampling, cell sampling, random sampling, and modified sieve sampling. Selection is done at the level of one of two possible sampling units: items (records / rows) or monetary units. The function returns an object of class `jfaSelection`, which can be used with the associated `summary()` and `plot()` methods.

For more details on how to use this function, see the package vignette:
`vignette('jfa', package = 'jfa')`

- data
a data frame containing the population of items the auditor wishes to sample from.

- size
an integer larger than 0 specifying the number of sampling units that need to be selected from the population. Can also be an object of class `jfaPlanning`.

- units
a character specifying the sampling units used. Possible options are `items` (default) for selection on the level of items (rows) or `values` for selection on the level of monetary units.

- method
a character specifying the sampling algorithm used. Possible options are `interval` (default) for fixed interval sampling, `cell` for cell sampling, `random` for random sampling, or `sieve` for modified sieve sampling.

- values
a character specifying the name of a column in `data` containing the book values of the items.

- order
a character specifying the name of a column in `data` containing the ranks of the items. The items in `data` are ordered according to these values, in the direction indicated by `decreasing`.

- decreasing
if `order` is specified, a logical specifying whether to order the items in decreasing order (from largest to smallest). Defaults to `FALSE`.

- randomize
a logical specifying whether the items in the data should be randomly shuffled before selection. Defaults to `FALSE`. Note that specifying `randomize = TRUE` overrules `order`.

- replace
if `method = 'random'`, a logical specifying whether sampling should be performed with replacement. Defaults to `FALSE`.

- start
if `method = 'interval'`, an integer larger than 0 specifying the starting point of the algorithm.

An object of class `jfaSelection` containing:

- data
a data frame containing the input data.

- sample
a data frame containing the selected sample of items.

- n.req
an integer indicating the requested sample size.

- n.units
an integer indicating the total number of obtained sampling units.

- n.items
an integer indicating the total number of obtained sample items.

- N.units
an integer indicating the total number of sampling units in the population.

- N.items
an integer indicating the total number of items in the population.

- interval
if `method = 'interval'`, a numeric value indicating the size of the selection interval.

- units
a character indicating the sampling units that were used to create the selection.

- method
a character indicating the algorithm that was used to create the selection.

- values
if `values` is specified, a character indicating the name of the book value column.

- start
if `method = 'interval'`, an integer indicating the starting point in the interval.

- data.name
a character string giving the name of the data.
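As a minimal sketch of how the elements listed above might be inspected, the snippet below stores the result of the random-sampling call from the examples and reads a few of its components. It assumes the jfa package is installed and that the elements are accessible with `$`, as is usual for list-based S3 objects; the object name `sel` is illustrative only.

```r
library(jfa)
data("BuildIt")

# Random sampling draws different items on every run, so fix the seed
set.seed(1)
sel <- selection(data = BuildIt, size = 100, method = "random")

sel$n.req         # requested sample size
sel$n.items       # number of sample items obtained
sel$N.items       # number of items in the population
head(sel$sample)  # first rows of the selected sample

# The associated summary() method gives a formatted overview
summary(sel)
```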

The first part of this section elaborates on the two possible options for the `units` argument:

- `items`: In record sampling, each item in the population is seen as a sampling unit. An item of $5000 is therefore equally likely to be selected as an item of $500.
- `values`: In monetary unit sampling, each monetary unit in the population is seen as a sampling unit. An item of $5000 is therefore ten times as likely to be selected as an item of $500.
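The difference between the two sampling units can be illustrated with a few lines of base R. This is a sketch on a hypothetical three-item population (the names `book_values`, `p_items`, and `p_values` are made up for illustration), not the package's internal code; it only computes the chance that a single randomly drawn sampling unit belongs to each item.

```r
# Hypothetical population of three items with book values in dollars
book_values <- c(A = 500, B = 5000, C = 4500)

# Record sampling: every item is one sampling unit,
# so each item has the same chance of being drawn
p_items <- rep(1 / length(book_values), length(book_values))
names(p_items) <- names(book_values)

# Monetary unit sampling: every dollar is one sampling unit, so the chance
# that a drawn dollar falls in an item is proportional to its book value
p_values <- book_values / sum(book_values)

# Relative chance of the $5000 item versus the $500 item
p_items[["B"]] / p_items[["A"]]    # equal chances under record sampling
p_values[["B"]] / p_values[["A"]]  # about ten times under monetary unit sampling
```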

The second part of this section elaborates on the four possible options for the `method` argument:

- `interval`: In fixed interval sampling, the sampling units in the population are divided into a number (equal to the sample size) of intervals. From each interval, one sampling unit is selected according to a fixed starting point (specified by `start`).
- `cell`: In cell sampling, the sampling units in the population are divided into a number (equal to the sample size) of intervals. From each interval, one sampling unit is selected with equal probability.
- `random`: In random sampling, each sampling unit in the population is drawn with equal probability.
- `sieve`: In modified sieve sampling, each item in the population is selected with probability proportional to its value (Hoogduin, Hall, & Tsay, 2010).
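To make the contrast between fixed interval and cell sampling concrete, here is a rough base R sketch on a hypothetical population of five items, using monetary units as sampling units. It is not the package's implementation: the helper `unit_to_item()`, the variable names, and the convention that `start` counts units from the beginning of each interval are assumptions made for illustration.

```r
set.seed(1)  # cell sampling is random, so fix the seed

# Hypothetical population: book values of five items (1000 monetary units in total)
book_values <- c(100, 400, 250, 50, 200)
n <- 4                              # sample size
interval <- sum(book_values) / n    # 250 monetary units per interval

# Map a monetary unit position (1..1000) to the item that contains it
upper <- cumsum(book_values)        # last unit of each item: 100, 500, 750, 800, 1000
unit_to_item <- function(u) findInterval(u, upper, left.open = TRUE) + 1

# Fixed interval sampling: the same position (start) is taken in every interval
start <- 1
interval_units <- start + (0:(n - 1)) * interval  # units 1, 251, 501, 751
interval_items <- unit_to_item(interval_units)    # items 1, 2, 3, 4

# Cell sampling: an independent, uniformly drawn position in every interval
offsets <- sample.int(interval, size = n, replace = TRUE)
cell_units <- (0:(n - 1)) * interval + offsets
cell_items <- unit_to_item(cell_units)
```

Both approaches take exactly one monetary unit per interval; they differ only in whether the position inside the interval is fixed or drawn at random. Modified sieve sampling is not sketched here; see Hoogduin, Hall, & Tsay (2010) for the algorithm.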

Hoogduin, L. A., Hall, T. W., & Tsay, J. J. (2010). Modified sieve sampling: A method for single- and multi-stage probability-proportional-to-size sampling. *Auditing: A Journal of Practice & Theory*, 29(1), 125-148.

Leslie, D. A., Teitlebaum, A. D., & Anderson, R. J. (1979). *Dollar-unit Sampling: A Practical Guide for Auditors*. Copp Clark Pitman; Belmont, Calif.: distributed by Fearon-Pitman.

Wampler, B., & McEacharn, M. (2005). Monetary-unit sampling using Microsoft Excel. *The CPA Journal*, 75(5), 36.

```
data("BuildIt")
# Select 100 items using random sampling
selection(data = BuildIt, size = 100, method = "random")
#>
#> Audit Sample Selection
#>
#> data: BuildIt
#> number of sampling units = 100, number of items = 100
#> sample selected via method 'items' + 'random'
# Select 150 monetary units using fixed interval sampling
selection(
  data = BuildIt, size = 150, units = "values",
  method = "interval", values = "bookValue"
)
#>
#> Audit Sample Selection
#>
#> data: BuildIt
#> number of sampling units = 150, number of items = 150
#> sample selected via method 'values' + 'interval'
```