vignettes/pkgdown/sampling.Rmd
sampling.Rmd
Use simple random sampling to select observations from a sampling frame
To use the sampling tool, select a data set where each row in the data set is unique (i.e., no duplicates). A dataset that fits these requirements is bundled with Radiant and is available through the Data > Manage tab (i.e., choose Examples
from the Load data of type
drop-down and press Load
). Select rndnames
from the Datasets
dropdown.
Names
is a unique identifier in this dataset. If we select this variable and choose the desired sample size, e.g., 10, list of names of the desired length will be created.
How does this work? Each person in the data is assigned a random number between 0 and 1 from a uniform distribution. Rows are then sorted on that random number and the \(n\) people from the list with the highest score are selected for the sample. By using a random number, every respondent has the same probability of being in the sample. For example, if we need a sample of 10 people from the 100 included in the rndnames
dataset, each individual has a 10% chances of being included in the sample. By default, the random seed is set to 1234
to ensure the sampling results are reproducible. If there is no input in Rnd. seed
, the selected rows will change every time we generate a sample.
The full list of 100 people is called the sampling frame
. Ideally, this is a comprehensive list of all sampling units (e.g., customers or companies) in your target market. To determine the appropriate value for n, use the sample size tools in the Design menu. To show the full sampling frame, click on the Show sampling frame
check box.
To download data for the generated sample in CSV format, click on the icon in the top-right of your screen. The created sample can also be stored in Radiant by providing a name for the dataset and then clicking on the Store
button.
Add code to Report > Rmd to (re)create the sample by clicking the icon on the bottom left of your screen or by pressing ALT-enter
on your keyboard.
For an overview of related R-functions used by Radiant for sampling and sample size calculations see Design > Sample
The key function from the stats
package used in the sampling
tool is runif
. This function is used to generate the random numbers assigned to each row in the available data.