Scatterplot with Linear Trend Demonstration

By Nathan Nguyen

May 15, 2022

Making a scatterplot with a linear trend

I’ll use the built-in Iris dataset for this demonstration.

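library(ggplot2)  # ggplot(), geom_point(), geom_smooth(), facet_wrap()
library(dplyr)    # provides the %>% pipe
library(ggsci)    # scale_color_d3()
library(ggpubr)   # ggscatter(), stat_cor(), stat_regline_equation()

# attach() is not needed: every column is reached through `data`
data <- iris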

data %>%
  ggplot(aes(x = Sepal.Width, y = Sepal.Length, group = Species)) +
  geom_point(aes(color = Species)) +
  geom_smooth(aes(color = Species), method = "lm", se = FALSE) +
  scale_color_d3() +
  labs(x = "Sepal Width",
       y = "Sepal Length",
       title = "Relationship between Sepal Length and Sepal Width by Species") +
  theme(legend.position = "bottom")
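
Because the color aesthetic is mapped to Species, geom_smooth(method = "lm") draws a separate straight line for each species. As a rough sketch of what those lines correspond to, the per-species fits can be reproduced in base R. This assumes a plain lm() of Sepal.Length on Sepal.Width matches what the plot draws; the names `fits` and `coefs` are only illustrative.

```r
# Fit one least-squares line per species (base R only; iris ships with R).
fits <- lapply(split(iris, iris$Species), function(d) {
  lm(Sepal.Length ~ Sepal.Width, data = d)
})

# Collect the intercept and slope for each species into a small table.
coefs <- t(sapply(fits, coef))
colnames(coefs) <- c("intercept", "slope")
print(round(coefs, 3))
```

Each row of `coefs` gives the intercept and slope of one of the colored lines in the plot.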

Although there is a roughly linear relationship between Sepal Length and Sepal Width within each species, the combined plot looks a little cluttered. Let’s draw one panel per species instead, using the facet_wrap() call with ggplot. Separate panels make the groups easier to compare: the relationship looks strongest for the setosa species and weakest for the virginica species.

data %>%
  ggplot(aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point(color = "dodgerblue3") +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(~Species)

# With library(ggpubr), we can easily add the correlation and regression equation to each panel
ggscatter(
  data = data, x = "Sepal.Width", y = "Sepal.Length",
  color = "Species", add = "reg.line"
) +
  facet_wrap(~Species) +
  stat_cor(label.y = 7.8) +
  stat_regline_equation(label.y = 7.5)
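
To check the visual impression numerically without any plotting packages, the correlation for each species can be computed directly. This is a minimal sketch that assumes the Pearson correlation (which I believe is the stat_cor() default); `cors` is just an illustrative name.

```r
# Pearson correlation between sepal width and sepal length, per species.
cors <- sapply(split(iris, iris$Species), function(d) {
  cor(d$Sepal.Width, d$Sepal.Length)
})

# Show the species ordered from strongest to weakest correlation.
print(round(sort(cors, decreasing = TRUE), 2))
```

The ordering should agree with the panels: setosa first, virginica last.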