
Coordinate systems in ggplot2: easily overlooked and rather underrated

Lea Waniek Blog, Data Science, Statistics

All plots have coordinate systems. Perhaps because they are such an integral element of plots, they are easily overlooked. ggplot2, however, offers several very useful options for customizing the coordinate system of a plot, and this blog post explores them.

Since it is spring, we will use a random subset of the famous iris data set. Plotting petal length against petal width, mapping species onto color, and playing around a little with the shape, color and size of the points yields this vernal plot:
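The post does not show how df_iris was created. A minimal sketch, assuming a simple random sample of rows; the seed and the sample size of 60 are arbitrary illustrative choices, not the values behind the plots shown here:

```r
# Hypothetical reconstruction of df_iris: the seed and the number of
# sampled rows are assumptions, not taken from the original analysis.
set.seed(2024)
n_rows  <- 60
df_iris <- iris[sample(nrow(iris), size = n_rows), ]

# Species is still a factor with all three levels after subsetting
table(df_iris$Species)
```

Any other subset works just as well for the plots that follow; only the exact point positions will differ.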

[Figure: base plot of the iris data]

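# Base plot
library(ggplot2)

# df_iris is the random subset of iris described above
plot_base <- ggplot(data = df_iris) +
  # star-shaped markers, colored by species
  geom_point(aes(x = Petal.Length, y = Petal.Width, color = Species), 
             size = 3, alpha = 0.9, shape = 8) +
  # small yellow dot in the center of each star
  geom_point(aes(x = Petal.Length, y = Petal.Width), 
             color = "yellow", size = 0.4) +
  scale_color_manual(values = c("#693FE9", "#A089F8", "#0000FF")) +
  theme_minimal()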

Cartesian coordinate system

Zooming in and out

The coordinate system can be manipulated by adding one of ggplot2's coord_* functions to a plot. When you are imagining a coordinate system, you are most likely thinking of a Cartesian one. The Cartesian coordinate system combines the x and y dimensions orthogonally and is ggplot2's default (coord_cartesian).

There are also several variations of the familiar Cartesian coordinate system in ggplot2, namely coord_fixed, coord_flip and coord_trans. For all of them, the displayed section of the data can be specified by defining the range shown on the x axis (xlim =) and on the y axis (ylim =), each given as a vector of two values, c(min, max). This makes it possible to "zoom in" or "zoom out" of a plot. It is a great advantage that all manipulations of the coordinate system only alter the depiction of the data but not the data itself.

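# Zooming in with xlim/ylim
# Limits are ranges of the form c(min, max); the lower bound of 0 is an assumption
plot_base +
  coord_cartesian(xlim = c(0, 5), ylim = c(0, 2)) +
  ggtitle("coord_cartesian with xlim = c(0, 5) and ylim = c(0, 2)")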

[Figure: zoomed scatter plot]
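That zooming leaves the data untouched is easiest to see in comparison with limiting the scales, which treats values outside the limits as missing. The sketch below uses the full iris data and a hypothetical helper, count_complete, so that it runs on its own; the limits are illustrative.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point()

# Zooming via the coordinate system: all rows stay in the plot data
p_zoom <- p + coord_cartesian(xlim = c(1, 5), ylim = c(0, 2))

# Limiting the scales instead: values outside the limits are treated as missing
p_clip <- p + xlim(1, 5) + ylim(0, 2)

# Hypothetical helper: number of rows whose x and y both survive plot building
count_complete <- function(plot) {
  built <- layer_data(plot)
  sum(complete.cases(built[, c("x", "y")]))
}

count_complete(p_zoom)  # every row of iris is still there
count_complete(p_clip)  # fewer rows, because some petals exceed the limits
```

The difference matters most for layers that compute statistics, such as smoothers or box plots: with scale limits they are fitted to the reduced data, with coord_cartesian to all of it.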

Specifying the “aspect ratio” of the axes

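Via coord_fixed, one can specify the exact ratio of the length of a y unit relative to the length of an x unit within the final visualization. With ratio = 1/2, for example, one unit on the y axis is drawn half as long as one unit on the x axis.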

# Setting the "aspect ratio" of y vs. x units
plot_base + 
  coord_fixed(ratio = 1/2) +
  ggtitle("coord_fixed with ratio = 1/2")

[Figure: plot with fixed aspect ratio]
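One practical use of the ratio argument is to choose it so that the panel comes out roughly square. A small sketch, assuming the full iris data for reproducibility; x_span, y_span and square_ratio are illustrative names:

```r
library(ggplot2)

# Data ranges of the two plotted variables
x_span <- diff(range(iris$Petal.Length))  # 6.9 - 1.0
y_span <- diff(range(iris$Petal.Width))   # 2.5 - 0.1

# ratio = (length of one y unit) / (length of one x unit).
# Setting it to x_span / y_span gives both axes about the same physical length.
square_ratio <- x_span / y_span

p_square <- ggplot(iris, aes(x = Petal.Length, y = Petal.Width)) +
  geom_point() +
  coord_fixed(ratio = square_ratio)
```

A ratio of 1, by contrast, is the natural choice when both axes share the same unit and distances should be comparable in every direction.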

Transforming the scales of the axes

Fixing the aspect ratio helps to emphasize the exact insight one wants to communicate. Another way to do so is coord_trans, which applies one of several transformations to the x and y axes (see the table below, taken from Wickham 2016, page 97). Let me stress this again: very conveniently, such transformations only pertain to the depicted scale of the data, not to the actual one. This is also the reason why, regardless of the transformation applied, the original values are used as axis labels.

| Name | Function f(x) | Inverse f^{-1}(y) |
|---|---|---|
| asn | tanh^{-1}(x) | tanh(y) |
| exp | e^x | log(y) |
| identity | x | y |
| log | log(x) | e^y |
| log10 | log_{10}(x) | 10^y |
| log2 | log_{2}(x) | 2^y |
| logit | log(\frac{x}{1-x}) | \frac{1}{1+e^{-y}} |
| pow10 | 10^x | log_{10}(y) |
| probit | \Phi^{-1}(x) | \Phi(y) |
| recip | x^{-1} | y^{-1} |
| reverse | -x | -y |
| sqrt | x^{1/2} | y^2 |

# Transforming the axes
plot_base + 
  coord_trans(x = "log", y = "log2") +
  ggtitle("coord_trans with x = \"log\" and y = \"log2\"")

[Figure: scatter plot with transformed axes]
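Each transformation is usable for an axis only because its inverse undoes it exactly. A quick base R check of a few function/inverse pairs, with illustrative input values; qlogis() and qnorm() are base R's logit and probit functions:

```r
# Illustrative inputs: positive values for log2, proportions for logit/probit
x <- c(0.5, 1, 2, 8)
p <- c(0.1, 0.5, 0.9)

# log2: f(x) = log2(x), inverse 2^y
y_log2 <- log2(x)
all.equal(2^y_log2, x)

# logit: f(p) = log(p / (1 - p)), inverse 1 / (1 + exp(-y))
y_logit <- qlogis(p)
all.equal(y_logit, log(p / (1 - p)))
all.equal(1 / (1 + exp(-y_logit)), p)

# probit: f(p) = qnorm(p), the inverse of the normal CDF; inverse pnorm(y)
y_probit <- qnorm(p)
all.equal(pnorm(y_probit), p)
```

Each all.equal() call returns TRUE when the round trip reproduces the input up to floating-point tolerance.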

Swapping the axes

The last of the Cartesian options, coord_flip, swaps the x and y axes. This can be useful, for example, when we intend to change the orientation of univariate plots such as histograms, or of plot types, like box plots, that visualize the distribution of a continuous variable over the categories of another variable. Nonetheless, coord_flip also works with all other plots. This multiplies the overall possibilities for designing plots, especially since xlim and ylim can be set in each of the Cartesian coordinate systems. Note that a plot has only one coordinate system at a time: adding a second coord_* function replaces the first.

# Swapping axes 
# base plot #2
p1 <- ggplot(data = df_iris) +
  geom_bar(aes(x = Species, fill = Species), alpha = 0.6) +
  scale_fill_manual(values = c("#693FE9", "#A089F8", "#4f5fb7")) +
  theme_minimal() 

# base plot & coord_flip()
p2 <- p1 +
  coord_flip() 

gridExtra::grid.arrange(p1, p2, top = "Bar plot without and with coord_flip")

[Figure: bar plot without and with flipped coordinate system]
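The box plot case mentioned above can be sketched the same way. This example uses the full iris data so that it runs on its own; box_vertical and box_horizontal are illustrative names.

```r
library(ggplot2)

# One box per species, drawn vertically by default
box_vertical <- ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_boxplot()

# Same layer and same statistics, but with species on the vertical axis
box_horizontal <- box_vertical + coord_flip()
```

Because only the coordinate system changes, the aesthetics keep their meaning: Species is still the x aesthetic, it is merely drawn along the vertical axis.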

Polar coordinate system

The customization of Cartesian coordinate systems allows for the fine-tuning of plots. However, coord_polar, the final coordinate system discussed here, changes the whole character of a plot. By using coord_polar, bar geoms are transformed to pie charts or "bullseye" plots, while line geoms are transformed to radar charts. This is done by mapping x and y to the angle and radius of the resulting plot. By default, the x variable is mapped to the angle, but this can be changed by setting the theta argument of coord_polar to "y".
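With theta = "y" and a single stacked bar, the stacked counts are wrapped around the circle, so the angle of each slice is proportional to its share of the total. A worked version of that arithmetic in base R, using the full iris data rather than the random subset (an assumption made so the numbers are reproducible):

```r
# Number of flowers per species in the full iris data
counts <- table(iris$Species)

# Share of the total, converted to degrees of the full circle
share     <- as.numeric(counts) / sum(counts)
angle_deg <- 360 * share
names(angle_deg) <- names(counts)

angle_deg  # 120 degrees per species, since iris has 50 flowers of each
```

For a random subset the counts, and with them the slice angles, will generally be unequal.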

[Figure: bullseye plot and pie chart]
[Figure: polygon plot and radar chart]

While such plots might shine with respect to novelty and looks, their perceptual properties are intricate, and their correct interpretation may be quite hard and rather unintuitive.

# Base plot 2 (long format; a constant x = 1 stacks all rows into a single bar of counts)
library(dplyr)  # provides the pipe operator %>%

plot_base_2 <- df_iris %>%
  dplyr::mutate(x = 1) %>%
  ggplot(.) +
  geom_bar(aes(x = x, fill = Species), alpha = 0.6) + 
  scale_fill_manual(values = c("#693FE9", "#A089F8", "#4f5fb7")) +
  theme_minimal() +
  # theme() has to come after theme_minimal(); a complete theme added
  # later would overwrite these settings
  theme(axis.text = element_blank(),
        axis.ticks = element_blank(),
        axis.title = element_blank()) +
  ggtitle("base plot")

# Bullseye plot 
# geom_bar & coord_polar(theta = "x")
p2 <- plot_base_2 +
  coord_polar(theta = "x") +
  ggtitle("theta = \"x\"")

# Pie chart 
# geom_bar & coord_polar(theta = "y")
p3 <- plot_base_2 +
  coord_polar(theta = "y") +
  ggtitle("theta = \"y\"")

gridExtra::grid.arrange(p2, p3, plot_base_2, top = "geom_bar & coord_polar", ncol = 2)

# Base plot 3 (long format, mean width/length of sepals/petals calculated)
# Note: this block uses the full iris data, not the subset df_iris
plot_base_3 <- iris %>%
  dplyr::group_by(Species) %>%
  dplyr::summarise(Petal.Length = mean(Petal.Length),
                   Sepal.Length = mean(Sepal.Length),
                   Sepal.Width = mean(Sepal.Width),
                   Petal.Width = mean(Petal.Width)) %>%
  # reshape to long format: one row per species and measurement
  reshape2::melt(id.vars = "Species") %>%
  ggplot() +
  geom_polygon(aes(group = Species, color = Species, y = value, x = variable), 
               fill = NA) +
  scale_color_manual(values = c("#693FE9", "#A089F8", "#4f5fb7")) +
  theme_minimal() +
  ggtitle("base plot")

# Radar plot
# geom_polygon & coord_polar
p2 <- plot_base_3 +
  theme_minimal() +
  coord_polar() +
  ggtitle("coord_polar")

gridExtra::grid.arrange(plot_base_3, p2, top = "geom_polygon & coord_polar", ncol = 2)
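The long-format step above relies on reshape2::melt. As an alternative, not the approach used in this post, the same reshaping can be sketched with tidyr::pivot_longer, assuming dplyr 1.0 or later and tidyr 1.0 or later are installed; iris_means_long is an illustrative name.

```r
library(dplyr)
library(tidyr)

iris_means_long <- iris %>%
  group_by(Species) %>%
  # mean of every measurement column per species
  summarise(across(everything(), mean)) %>%
  # one row per species and measurement
  pivot_longer(cols = -Species, names_to = "variable", values_to = "value") %>%
  # fix the axis order to the column order of iris instead of alphabetical order
  mutate(variable = factor(variable, levels = unique(variable)))
```

The result can be passed to ggplot() in place of the melted data. The measurements then appear in the column order of iris, which differs from the order used in the summarise() call above.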

 

References

  • Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer.

About the author

Lea Waniek

Lea is a member of the Data Science team and also provides support in the area of statistics.