---
title: "Creating two-way frequency tables"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Creating two-way frequency tables}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(qacBase)
```

The **crosstab** function calculates and prints a two-way frequency table. Given a data frame, a row variable, a column variable, and a type (frequencies, cell percents, row percents, or column percents), the function returns a table with

* labeled rows and columns
* frequencies or percents
* a level labeled `<NA>` for missing values (if `na.rm = FALSE`)
* a level labeled Total (unless `total = FALSE`)
* a Chi-square test of independence (if `chisquare = TRUE`)

Tables are printed with 2 decimal places for percents (modifiable using `digits=#`). Variables are coerced to factors if necessary. Adding `plot=TRUE` produces a `ggplot2` graph instead of a table.

In the examples below, the number of car cylinders (*cyl*) is cross-tabulated with the number of gears (*gear*) for 32 automobiles in the *cars74* data frame.

### Frequencies

By default, the crosstab function reports frequency counts for each combination of the two categorical variables. The most common car type has 3 gears and 8 cylinders.

```{r}
crosstab(cars74, cyl, gear)
crosstab(cars74, cyl, gear, plot = TRUE)
```

### Cell percents

Cell percents add up to 100% over all the cells in the table. 25% of all cars in the data frame have 4 gears and 4 cylinders.

```{r}
crosstab(cars74, cyl, gear, type = "percent")
crosstab(cars74, cyl, gear, type = "percent", plot = TRUE)
```

### Row percents

Row percents sum to 100% for each row of the table. About 86% of 8-cylinder cars have 3 gears.

```{r}
crosstab(cars74, cyl, gear, type = "rowpercent")
crosstab(cars74, cyl, gear, type = "rowpercent", plot = TRUE)
```

### Column percents

Column percents sum to 100% for each column of the table. Only about 7% of 3-gear cars have 4 cylinders.

```{r}
crosstab(cars74, cyl, gear, type = "colpercent")
crosstab(cars74, cyl, gear, type = "colpercent", plot = TRUE)
```

### Chi-square test of independence

You can add a test of whether the two categorical variables are independent by setting the option `chisquare = TRUE`.

```{r}
crosstab(cars74, cyl, gear, type = "colpercent", chisquare = TRUE)
crosstab(cars74, cyl, gear, type = "colpercent", plot = TRUE, chisquare = TRUE)
```
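To see where the three kinds of percents and the test statistic come from, the figures quoted in this vignette can be cross-checked with base R. The chunk below is only a sketch: it assumes the built-in `mtcars` data frame holds the same `cyl` and `gear` values as *cars74*, and it makes no claim about how `crosstab` computes its results internally. The object names (`counts`, `cell_pct`, `row_pct`, `col_pct`) are illustrative.

```{r}
# Base R cross-check of the counts and percents quoted in this vignette.
# Assumption: mtcars has the same cyl/gear values as cars74.
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)

cell_pct <- 100 * prop.table(counts)              # sums to 100 over the whole table
row_pct  <- 100 * prop.table(counts, margin = 1)  # each row sums to 100
col_pct  <- 100 * prop.table(counts, margin = 2)  # each column sums to 100

round(cell_pct["4", "4"], 2)  # 4 cylinders and 4 gears, as a share of all cars
round(row_pct["8", "3"], 2)   # share of 8-cylinder cars that have 3 gears
round(col_pct["4", "3"], 2)   # share of 3-gear cars that have 4 cylinders

# Pearson chi-square test of independence on the same table.
# Some expected counts are small, so R would warn that the approximation
# may be inaccurate; the warning is suppressed here to keep the output short.
suppressWarnings(chisq.test(counts))
```

The only difference between the three percent tables is the `margin` argument: none for cell percents, `1` for rows, and `2` for columns.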