How to Create an APA Style Interaction Graph in R
When running regression analyses, analyzing the interactions between predictor variables is an important tool for understanding how one predictor influences the relationship between another predictor and the outcome variable. Examining p-values and coefficients is not enough, though, we also need to plot the lines to really understand what is happening in the relationship.
This is very easy to do using the R programming language with a number of packages.
library(ggeffects)
model <- lm(Sepal.Length ~ Petal.Length * Petal.Width, data = iris)
plot <- ggeffects::ggpredict(model, c("Petal.Length",
"Petal.Width")) |> plot()
plot
The above interaction graph looks amazing from only two lines of code. What more do we really need?
Not much. When we’re deep in analysis exploring data, running models, and testing hypotheses, this is all we need to understand the interaction relationships. We can even make the plot look more professional with the interactions package.
library(tidyverse)
library(interactions)
model <- lm(Sepal.Length ~ Petal.Length * Petal.Width, data = iris)
interact_plot(model,
pred = Petal.Length,
modx = Petal.Width,
modx.values = "plus-minus",
) +
ylim(0,NA)
What’s Wrong With the Default Interaction Graphs in R?
The problem with the default interaction graphs in R only emerges when we are ready to drop our figures and tables into a manuscript. Many academic journals, white papers, and websites don’t have strict guidelines on a graph’s formating. Some variation of the two above graphs would be just fine.
The American Psychological Association (APA), on the other hand, publishes guidance on their preferred formatting for graphs and figures in their publication manual. You can also use Google Images to find examples.
A short story to illustrate the pain. I used R for the analyses in my dissertation project. When it came to formatting all these interaction graphs (and I had many) in APA style to drop into my manuscript, I couldn’t find any satisfactory way to do it in R. In desperation, I resorted to entering data into the spreadsheet template available from Jeremy Dawson.
Later, I did run into Kris Preacher’s website and this blog post. The post shows how to generate interaction graphs in APA format. It is very useful and you may need to do this if you have more complex requirements. But it involves both running R code and then plugging numbers into an online form. This is error-prone and time-consuming.
How to Create an APA Style Interaction Graph in R
The code below will allow you to create a simple APA-style interaction graph in R with two predictors (a 2-way interaction). It should, of course, be modified for your specific needs. This code requires a continuous DV, continuous IV, and a continuous moderator. See Part 2 of this topic if your data have a categorical moderator.
##
# Generate an APA compatible interaction plot
##
library(tidyverse)
library(ggeffects)
library(jtools)
library(forcats)
# Declare data frame
df <- iris
# Declare names of variables in data frame
dv_var <- "Sepal.Length"
iv_var <- "Petal.Length"
mod_var <- "Petal.Width"
# Declare plot labels
dv_label <- "Sepal Length"
iv_label <- "Petal Length"
mod_label <- "Petal Width"
# Declare location of legend
legend_pos <- "bottomright"
##
# The rest of the code should not be modified unless you want
# to customize
##
# Rename variables in df
df <- df |>
rename(dv = all_of(dv_var),
iv = all_of(iv_var),
modv = all_of(mod_var))
# Generate your model, using the new var names
model <- lm(dv ~ iv * modv, data = df)
# Calculate one standard deviation above and below
# the predictor and moderator
cond1 <- mean(df$iv) - sd(df$iv)
cond2 <- mean(df$iv) + sd(df$iv)
mod_low <- mean(df$modv) - sd(df$modv)
mod_high <- mean(df$modv) + sd(df$modv)
mods <- c(mod_low, mod_high)
conds <- c(cond1, cond2)
# Generate the dependent variable predictions. These
# will be our y-axis values
predictions <- ggpredict(model,
terms = c("iv [conds]",
"modv [mods]"))
# Update the data frame with the variables needed
# to build the plot
plot_vars <- predictions |>
mutate(mod_order = ifelse(group == mod_low, 1, 2)) |>
mutate(iv1_order = ifelse(x == cond1, 1, 2)) |>
mutate(mod = ifelse(group == mod_low,
paste("Low", mod_label),
paste("High", mod_label))) |>
mutate(iv1 = ifelse(x == cond1,
paste("Low", iv_label),
paste("High", iv_label)))
# Generate the plot
p <- ggplot(plot_vars,
aes(x = fct_reorder(iv1, iv1_order),
y = predicted,
shape = fct_reorder(mod, mod_order),
group = fct_reorder(mod, mod_order))) +
geom_point(size = 2) +
geom_line(aes(linetype = fct_reorder(mod, mod_order))) +
labs(x = '', y = dv_label) +
ylim(0,NA) +
theme_apa(legend.pos = all_of(legend_pos)) +
scale_colour_grey() +
theme(legend.background = element_rect(fill = "white",
colour = "black"),
axis.text = element_text(size=12, colour = "black"))
# Output the plot
p
For a simple 2-way interaction graph, the above code only requires declaring:
- the data frame,
- variable names,
- variable labels, and optionally,
- the legend position.
This code forces the y-axis to include zero, which is a good data visualization practice. Notice the graph has a solid white background and border. It also standardizes the fonts, among other things.
Have fun formatting your next interaction graph in R and please let me know in the comments below if you have questions or suggestions for improvement.
Hmm. I used this exact code (not changing anything, just to check) and something weird happened to the data frame for plot_vars, such that it ended up with 6 more rows labelled “NA”, “NA.1”, and so on, all full of NAs.
I also got a warning message:
“Warning message:
Using `all_of()` outside of a selecting function was deprecated in tidyselect 1.2.0.
ℹ See details at ?tidyselect::faq-selection-context”
I am not sure whether this is related, but I don’t think so, because renaming the variables in the original data frame seems to work fine.
Any clues?
Cheers
Deborah.
Hi Kevin,
Thank you very much for your code. It is incredibly helpful! I would like to create an interaction plot for a categorial IV (with two experimental conditions) and continuous moderator/DV. I have attempted to adapt your code, but I have not succeeded. Could you help me with that?
Thank you very much!
Cathy
Hello
I tried to run this code. But I encounter an error in row 52 :
# Update the data frame with the variables needed
# to build the plot
plot_vars
mutate(mod_order = ifelse(group == mod_low, 1, 2)) |>
mutate(iv1_order = ifelse(x == cond1, 1, 2)) |>
mutate(mod = ifelse(group == mod_low,
paste(“Low”, mod_label),
paste(“High”, mod_label))) |>
mutate(iv1 = ifelse(x == cond1,
paste(“Low”, iv_label),
paste(“High”, iv_label)))
I get the error message:
Error in `mutate()`:
ℹ In argument: `mod_order = ifelse(group == mod_low, 1, 2)`.
Caused by error:
! object ‘group’ not found
Backtrace:
1. dplyr::mutate(…)
11. base::ifelse(group == mod_low, 1, 2)
I do not know how to resolve this. I would be happy, if you could see if you detect a mistake?
Kind regards
I fixed this error. For my R version there were “” necessary around group and x. However, now the code works and I get a plot. But this plot does not look right. It only shows something for high IV and Mod. For low it only shows NA. I can not add a picture here,, but maybe you have encountered this problem before?
I just posted a new article that has code to generate an interaction plot if your moderator is a factor (categorical) variable. https://www.founderscholar.com/how-to-create-an-apa-style-interaction-graph-in-r-part-2/
One note is the current version of this code only works for a continuous IV and a continuous moderator. So, if you get errors about doubles or numbers, that is probably the issue. I hope that you can modify this code to work with a moderator that is a factor. I may update it in the future to handle this.
Hello! Thank you for the very helpful thread. I have run the code exactly as-is, except for changing the variable names to the names in my own dataframe. When I run the lines
predictions <- ggpredict(model,
terms = c("iv [conds]",
"modv [mods]"))
I receive the error message
Error: variables ‘iv’, ‘modv’ were specified with different types from the fit
the variables are of type "double". and I cannot reason why it isn't running. Do you have advice for this?
Best,
Everett
Hi Everett, can you post a few rows of your data? Maybe with just the three variables you are using in the analysis? Then I can run it on my computer.
Hi, I have some experience in Python and about an hour’s worth of class in R – but I’ve always been interested in learning more about R. Unfortunately I’ve never had time and now my capstone paper is due in a week, and I stumbled across your page on making interaction graphs in R, great! I just wanted to see how my data would look and thought your code would make the plot a breeze, and it would be my gateway into R. However, upon adding my data, I’m running into the error:
Error in UseMethod(“rename”) :
no applicable method for ‘rename’ applied to an object of class “function”
once the “# Rename variables in df” block runs. I did not change anything in the code other than the variables for my dataset, what’s the issue here? Thanks in advance!
I have the same issue
Happy to help you trouble shoot it if you provide more info.
Sorry it didn’t work for you. Can you post more information? For instance, can you paste e.g. 5 rows of your data here so I can run it and see the error? Let me know what the dv, iv, and moderator is, please. Also, paste your lm() function so I can see what you’re trying to plot.