Running many regressions in R

With map and broom

R
Author

Nick Twort

Published

14 Sep 2023

If you have a set of parameters and want to run the same regression on each subset of a dataset that those parameters define, one option is purrr's pmap_dfr(), with broom's tidy() to collect the results:

library(purrr)
library(broom)
library(dplyr)

# Example list of parameters to use as subsets
categories <- mtcars |> 
  as_tibble() |> 
  distinct(am)

# Run the regression
pmap_dfr(categories, function(am) {
  
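  # Subset data: the bare `am` is the mtcars column, while `!!am`
  # injects the value of the function argument of the same name
  df <- mtcars |> 
    as_tibble() |> 
    filter(am == !!am)
  
  # Run regression of fuel economy on horsepower for this subset
  reg <- lm(mpg ~ hp, data = df)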
  
  # Tidy output
  tidy(reg)
  
}, .id = "regression")

The result has one row per coefficient per subset. The regression column is the position of the corresponding row in categories, not the value of am: here 1 is am = 1 and 2 is am = 0, because distinct() keeps values in order of first appearance.

# A tibble: 4 × 6
  regression term        estimate std.error statistic  p.value
  <chr>      <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 1          (Intercept)  31.8      1.99        16.0  5.84e- 9
2 1          hp           -0.0587   0.0133      -4.43 1.01e- 3
3 2          (Intercept)  26.6      1.62        16.5  6.92e-12
4 2          hp           -0.0591   0.00958     -6.17 1.02e- 5
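
The same pattern extends to more than one parameter, and the parameter values themselves can be carried into the output so that rows do not have to be matched back by position. The following is a minimal sketch, not the only way to do it. It assumes purrr 1.0.0 or later (for list_rbind()) and uses vs as a second, purely illustrative, subsetting column. The names params and results are made up for this example.

```r
library(purrr)
library(broom)
library(dplyr)

# One row per combination of parameters (vs is an illustrative second parameter)
params <- mtcars |>
  as_tibble() |>
  distinct(am, vs)

# pmap() passes each row of params to the function as named arguments
results <- pmap(params, function(am, vs) {

  # .data$ refers to columns of mtcars, .env$ to the function arguments
  df <- mtcars |>
    as_tibble() |>
    filter(.data$am == .env$am, .data$vs == .env$vs)

  # Tidy the fit and attach the parameter values that produced it
  tidy(lm(mpg ~ hp, data = df)) |>
    mutate(am = am, vs = vs, .before = 1)
}) |>
  list_rbind()

results
```

Each block of rows in results now starts with the am and vs values used for that subset, so the output can be filtered or joined without referring back to params.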