Create pivot tables to explore your data

If you have used pivot-tables in Excel the functionality provided in
the *Data > Pivot* tab should be familiar to you. Similar to
the
*Data
> Explore* tab, you can generate summary statistics for
variables in your data. You can also generate frequency tables. Perhaps
the most powerful feature in *Data > Pivot* is that you can
easily describe the data *by* one or more other variables.

For example, with the `diamonds`

data loaded, select
`clarity`

and `cut`

from the
`Categorical variables`

drop-down. The categories for the
first variable will be the column headers but you can drag-and-drop the
selected variables to change their ordering. After selecting these two
variables, and clicking on the `Create pivot table`

button, a
frequency table of diamonds with different levels of clarity and quality
of cut is shown. Choose `Row`

, `Column`

, or
`Total`

from the `Normalize by`

drop-down to
normalize cell frequencies or create an index from a summary statistic
by the row, column, or overall total. If a normalize option is selected
it can be convenient to check the `Percentage`

box to express
the numbers as percentages. Choose `Color bar`

or
`Heat map`

from the `Conditional formatting`

drop-down to emphasize the highest frequency counts.

It is also possible to summarize numerical variables. Select
`price`

from the `Numeric variables`

drop-down.
This will create the table shown below. Just as in the
*Data
> View* tab you can sort the table by clicking on the column
headers. You can also use sliders (e.g., click in the input box below
`I1`

) to limit the view to values in a specified range. To
view only information for diamonds with a `Very good`

,
`Premium`

or `Ideal`

cut click in the input box
below the `cut`

header.

Below you will find a brief description of several functions
available from the `Apply function`

dropdown menu. Most
functions, however, will be self-explanatory.

`n`

calculates the number of observations, or rows, in the data or in a group if a`Group by`

variable has been selected (`n`

uses the`length`

function in R)`n_distinct`

calculates the number of distinct values`n_missing`

calculates the number of missing values`cv`

is the coefficient of variation (i.e., mean(x) / sd(x))`sd`

and`var`

calculate the sample standard deviation and variance for numeric data`me`

calculates the margin of error for a numeric variable using a 95% confidence level`prop`

calculates a proportion. For a variable with only values 0 or 1 this is equivalent to`mean`

. For other numeric variables it captures the occurrence of the maximum value. For a`factor`

it captures the occurrence of the first level.`sdprop`

and`varprop`

calculate the sample standard deviation and variance for a proportion`meprop`

calculates the margin of error for a proportion using a 95% confidence level`sdpop`

and`varpop`

calculate the population standard deviation and variance

You can also create a bar chart based on the generated table (see
image above). To download the table in *csv* format or the plot
in *png* format click the appropriate download icon on the
right.

Note that when a categorical variable (

`factor`

) is selected from the`Numeric variable(s)`

dropdown menu it will be converted to a numeric variable if required for the selected function(s). If the factor levels are numeric these will be used in all calculations. Since the mean, standard deviation, etc. are not relevant for non-binary categorical variables, these will be converted to 0-1 (binary) variables where the first level is coded as 1 and all other levels as 0.

Use the `Filter data`

box to select (or omit) specific
sets of rows from the data to tabulate. See the help file for
*Data
> View* for details.

The created pivot table can be stored in Radiant by clicking the
`Store`

button. This can be useful if you want do additional
analysis on the table or to create plots of the summarized data in
*Data
> Visualize*. To download the table to *csv* format
click the download icon on the top-right.

Add code to
*Report
> Rmd* to (re)create the pivot table by clicking the
icon on the bottom
left of your screen or by pressing `ALT-enter`

on your
keyboard.

If a plot was created it can be customized using `ggplot2`

commands (e.g.,
`plot(result) + labs(title = "Pivot graph")`

). See
*Data
> Visualize* for details.

For an overview of related R-functions used by Radiant to create
pivot tables see
*Data
> Pivot*