Dataset summary with summarizor

Required packages

library(flextable)
use_df_printer()

The CO2 dataset

This example uses datasets::CO2. The first column (Plant) is dropped.

CO2[-1]
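Before summarizing, it can help to check which of the remaining columns are categorical and which are numerical. The following is a minimal sketch using only base R; the names `dat` and `col_classes` are introduced here for illustration and are not part of the original example.

```r
# `dat` is a hypothetical name for the data used in this example
dat <- datasets::CO2[-1]  # drop the first column (Plant)

# column names that remain after dropping Plant
names(dat)

# class of each column: factors are categorical, numerics are continuous
col_classes <- sapply(dat, class)
col_classes
```

Type and Treatment are factors, while conc and uptake are numeric.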

Definition of a formatting function

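We will use this function to format counts and percentages as labels of the form "n (p %)". It expects percent as a proportion between 0 and 1, and it returns an empty string wherever n is missing.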

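count_format <- function(n, percent) {
  # start with empty strings so that missing counts stay blank
  z <- character(length = length(n))
  # positions where a count is available
  wcts <- !is.na(n)
  # percent is a proportion, so multiply by 100 for display
  z[wcts] <- sprintf(
    "%.0f (%.01f %%)",
    n[wcts], percent[wcts] * 100
  )
  z
}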
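As a quick check of the formatting function, here is a small sketch that calls it on made-up values. The function definition is repeated so that the snippet runs on its own; the input vectors and the name `labels` are illustrative only and do not come from the CO2 data.

```r
# same definition as above, repeated so the snippet is self-contained
count_format <- function(n, percent) {
  z <- character(length = length(n))
  wcts <- !is.na(n)
  z[wcts] <- sprintf(
    "%.0f (%.01f %%)",
    n[wcts], percent[wcts] * 100
  )
  z
}

# made-up counts and proportions, for illustration only
n <- c(21, NA, 3)
percent <- c(0.5, NA, 0.125)

labels <- count_format(n, percent)
labels
# "21 (50.0 %)" ""            "3 (12.5 %)"
```

The missing count yields an empty string, which leaves the corresponding cell blank in the final table.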

The summarizor object

The object returned by summarizor() is a data frame structured so that it can be used as input to tabulator().

obj <- summarizor(CO2[-1], by = "Treatment", overall_label = "Overall")
obj
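To get a feel for the kind of numbers such a summary holds for a categorical variable, the counts and proportions by group can be reproduced in base R. This sketch assumes that counts and percentages are computed within each Treatment group; it is not a description of how summarizor() works internally, and the names `dat`, `cts` and `percent` are chosen here only for illustration.

```r
dat <- datasets::CO2[-1]

# counts of each Type within each Treatment group
cts <- table(dat$Type, dat$Treatment)

# column-wise proportions: share of each Type within a Treatment group
percent <- prop.table(cts, margin = 2)

cts
percent
```

Each combination of Type and Treatment has 21 observations, so every proportion is 0.5.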

The flextable

Let’s now create a cross-table of this summary by calling tabulator() and as_flextable().

Two columns are defined here: one for the numerical variables, and one for the categorical variables that uses our count_format() function.

ft <- tabulator(obj,
  rows = c("variable", "stat"),
  columns = "Treatment",
  `Est.` = as_paragraph(value),
  `N` = as_paragraph(count_format(cts, percent))
) |>
  as_flextable(separate_with = "variable")
ft