Did you know that you can turn plain old static ggplot graphs into animated ones? Well, you can, with the help of the package gganimate by RStudio’s Thomas Lin Pedersen and David Robinson, and the results are amazing! My STATWORX colleagues and I are very impressed by how effortlessly all kinds of geoms are transformed into suuuper smooth animations. That’s why in this post I will provide a short overview of some of the wonderful functionalities of gganimate. I hope you’ll enjoy them as much as we do!
Since Valentine’s Day is just around the corner, we’re going to explore the Speed Dating Experiment dataset compiled by Columbia Business School professors Ray Fisman and Sheena Iyengar. Hopefully, we’ll learn about gganimate as well as how to find our Valentine. If you like, you can download the data from Kaggle.
Defining the basic animation: transition_*
How are static plots put into motion? Essentially, gganimate creates data subsets, which are plotted individually and constitute the key frames that, when played consecutively, create the basic animation. The results are so seamless because gganimate takes care of the so-called tweening for us: it calculates data points for the transition frames displayed in between the frames with actual input data.
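To make the idea of tweening concrete, here is a minimal base R sketch of linear interpolation between two key frames. The values and the helper `tween()` are made up for illustration; this is not how gganimate implements tweening internally, only the basic principle.

```r
# Two key frames: x positions of three points in state A and state B (made-up values)
state_a <- c(10, 20, 30)
state_b <- c(30, 20, 0)

# Hypothetical helper: linear tween, position at progress p in [0, 1]
tween <- function(from, to, p) from + (to - from) * p

# Five frames: the two key frames plus three in-between frames
progress <- seq(0, 1, length.out = 5)
frames <- t(sapply(progress, function(p) tween(state_a, state_b, p)))
frames  # one row per frame, one column per point
```

The first and last rows reproduce the input data; the rows in between are the calculated transition frames.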
The transition_* functions define how the data subsets are derived and thus define the general character of any animation. In this blog post we’re going to explore three types of transitions: transition_states(), transition_reveal() and transition_filter().
We’ll start with transition_states(). Here the data is split into subsets according to the categories of the variable provided to the states argument. If several rows of a dataset pertain to the same unit of observation and should be identifiable as such, a grouping variable defining the observation units needs to be supplied via the group aesthetic. Alternatively, an identifier can be mapped to any other aesthetic.
Please note: to ensure the readability of this post, all text concerning the interpretation of the speed dating data is written in italics. If you’re not interested in that part, you can simply skip those paragraphs. For the data prep, I’d like to refer you to my GitHub.
First, we’re going to explore what the participants of the Speed Dating Experiment look for in a partner. Participants were asked to rate the importance of attributes in a potential date by allocating a budget of 100 points to several characteristics, with higher values denoting a higher importance. The participants were asked to rate the attributes according to their own views. Further, the participants were asked to rate the same attributes according to the presumed wishes of their same-sex peers, meaning they allocated the points in the way they supposed their average same-sex peer would do.
We’re going to plot all of these ratings (x-axis) for all attributes (y-axis). Since we want to compare the individual wishes to the individually presumed wishes of peers, we’re going to transition between both sets of ratings. Color always indicates the personal wishes of a participant. A given bubble indicates the rating of one specific participant for a given attribute, switching between one’s own wishes and the wishes assumed for peers.
## Static Plot
# ...characteristic vs. (presumed) rating...
# ...color & size mapped to own rating, grouped by ID
library(ggplot2)   # plotting
library(gganimate) # transition_*, ease_aes, shadow_*, ...
library(viridis)   # scale_color_viridis()
plot1 <- ggplot(df_what_look_for,
                aes(x = value,
                    y = variable,
                    color = own_rating, # bubbles are always colored according to own wishes
                    size = own_rating,
                    group = iid)) + # identifier of observations across states
  geom_jitter(alpha = 0.5, # to reduce overplotting: jittering & alpha
              width = 5) +
  scale_color_viridis(option = "plasma", # use viridis' plasma scale
                      begin = 0.2, # limit range of used hues
                      name = "Own Rating") +
  scale_size(guide = "none") + # no legend for size
  labs(y = "", # no axis label
       x = "Allocation of 100 Points", # x-axis label
       title = "Importance of Characteristics for Potential Partner") +
  theme_minimal() + # apply minimal theme
  theme(panel.grid = element_blank(), # remove all lines of plot raster
        text = element_text(size = 16)) # increase font size
## Animated Plot
plot1 +
transition_states(states = rating) # animate contrast subsets acc. to variable rating
First off, if you’re a little confused about which state is which, please be patient, we’ll explore dynamic labels in the section about ‘frame variables’.
It’s apparent that different people look for different things in a partner. Yet attractiveness is often prioritized over other qualities. But of all attributes, the importance of attractiveness varies most strongly between individuals. Interestingly, people are quite aware that their peers’ ratings might differ from their own views. Further, especially the collective presumptions (= the mean values) about others are not completely off, but of higher variance than the actual ratings.
So there is hope for all of us that somewhere out there somebody is looking for someone just as ambitious or just as intelligent as ourselves. However, it’s not always the inner values that count.
gganimate allows us to tailor the details of the animation according to our wishes. With the argument transition_length we can define the relative length of the transition from one actual subset of data to the next, and with state_length how long, relatively speaking, each subset of original data is displayed. Only if the wrap argument is set to TRUE is the last frame morphed back into the first frame of the animation, creating an endless and seamless loop. Of course, the arguments of different transition functions may vary.
## Animated Plot
# ...replace default arguments
plot1 +
  transition_states(states = rating,
                    transition_length = 3, # transitions last 3x as long as states are displayed
                    state_length = 1, # relative time each subset of actual data is displayed
                    wrap = FALSE) # no endless loop
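Because both lengths are relative, only their ratio matters. The following sketch works through the arithmetic for the values used above. The helper `transition_share()` is hypothetical, and it assumes one transition per displayed state (as in a looping animation); with wrap = FALSE the final transition is dropped, so the share is somewhat smaller, and gganimate’s exact frame allocation may also differ by rounding.

```r
# Hypothetical helper: share of each state-plus-transition cycle spent tweening
transition_share <- function(transition_length, state_length) {
  transition_length / (transition_length + state_length)
}

share <- transition_share(transition_length = 3, state_length = 1)
share  # 0.75

# Approximate split of animate()'s default 100 frames under that assumption
nframes <- 100
c(transition = nframes * share, state = nframes * (1 - share))
```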
Styling transitions: ease_aes
As mentioned before, gganimate takes care of tweening and calculates additional data points to create smooth transitions between successively displayed points of actual input data. With ease_aes we can control which so-called easing function is used to ‘morph’ original data points into each other. The default argument is used to declare the easing function for all aesthetics in a plot. Alternatively, easing functions can be assigned to individual aesthetics by name. Amongst others, quadratic, cubic, sine and exponential easing functions are available, with the linear easing function being the default. These functions can be customized further by adding a modifier suffix: with -in the function is applied as-is, with -out the function is applied in reverse, and with -in-out the function is applied as-is in the first half of the transition and reversed in the second half.
Here I played around with an easing function that models the bouncing of a ball.
## Animated Plot
# ...add special easing function
plot1 +
transition_states(states = rating) +
ease_aes("bounce-in") # bouncy easing function, as-is
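To get a feeling for what the modifier suffixes do, here is a small base R sketch of a quadratic easing function in its three variants. These are the common textbook definitions on normalized time, written by hand for illustration; they are not taken from gganimate’s source code.

```r
# Quadratic easing on normalized time t in [0, 1] (illustrative, base R only)
ease_in  <- function(t) t^2            # applied as-is: slow start, fast finish
ease_out <- function(t) 1 - (1 - t)^2  # reversed: fast start, slow finish
ease_in_out <- function(t) {
  ifelse(t < 0.5,
         2 * t^2,            # first half: eased in
         1 - 2 * (1 - t)^2)  # second half: eased out
}

t <- seq(0, 1, by = 0.25)
data.frame(t,
           linear      = t,
           quad_in     = ease_in(t),
           quad_out    = ease_out(t),
           quad_in_out = ease_in_out(t))
```

All variants start at 0 and end at 1; they differ only in how quickly the transition progresses in between.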
Dynamic labelling: {frame variables}
To ensure that we, mesmerized by our animations, do not lose track of what is being shown, gganimate provides so-called frame variables that provide metadata about the animation as a whole or the previous/current/next frame. The frame variables, when wrapped in curly brackets, are available for string literal interpolation within all plot labels. For example, we can label each frame with the value of the states variable that defines the currently (or soon to be) displayed subset of actual data:
## Animated Plot
# ...add dynamic label: subtitle with current/next value of states variable
plot1 +
labs(subtitle = "{closest_state}") + # add frame variable as subtitle
transition_states(states = rating)
The set of available variables depends on the transition function. To get both the names and the values of the frame variables available for an animation (by default the last one rendered), call the frame_vars() function.
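The curly-bracket syntax works like string interpolation in the glue package. The sketch below mimics it outside of any animation, assuming glue is installed; the list `frame_info` and its values are made up, whereas in a real animation gganimate supplies the frame variables itself.

```r
library(glue)

# Hypothetical metadata for a single frame
frame_info <- list(closest_state = "Own Rating", frame = 12, nframes = 100)

# Expressions in curly brackets are replaced by the matching values
label <- glue_data(frame_info, "{closest_state} (frame {frame} of {nframes})")
as.character(label)
```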
Indicating previous data: shadow_*
To accentuate the interconnection of different frames, we can apply one of gganimate’s ‘shadows’. By default, shadow_null() is used, i.e. no shadow is added to animations. In general, shadows display data points of past frames in different ways: shadow_trail() creates a trail of evenly spaced data points, while shadow_mark() displays all raw data points.
We’ll use shadow_wake() to create a little ‘wake’ of past data points which gradually shrink and fade away. The argument wake_length allows us to set the length of the wake, relative to the total number of frames. Since the wakes overlap, the transparency of geoms might need adjustment. Obviously, for plots with lots of data points, shadows can impede intelligibility.
plot1B + # same as plot1, but with alpha = 0.1 in geom_jitter
labs(subtitle = "{closest_state}") +
transition_states(states = rating) +
shadow_wake(wake_length = 0.5) # adding shadow
The benefits of transition_*
While I simply love the visuals of animated plots, I think they also offer actual improvements. I feel transition_states, compared to faceting, has the advantage of making it easier to track individual observations through transitions. Further, no matter how many subplots we want to explore, we neither need lots of space nor have to clutter our document with thousands of plots or put up with tiny plots.
Similarly, transition_reveal, for example, holds additional value for time series by mapping a time variable not only to one of the axes but also to actual time: the transition length between the frames displaying actual input data corresponds to the relative time differences between the mapped events. To illustrate this, let’s take a quick look at the ‘success’ of all the speed dates across the different speed dating events:
## Static Plot
# ... date of event vs. interest in second date for women, men or couples
plot2 <- ggplot(data = df_match,
                aes(x = date, # date of speed dating event
                    y = count, # interest in 2nd date
                    color = info, # which group: women/men/reciprocal
                    group = info)) +
  geom_point(aes(group = seq_along(date)), # needed, otherwise transition doesn't work
             size = 4, # size of points
             alpha = 0.7) + # slightly transparent
  geom_line(aes(lty = info), # line type according to group
            alpha = 0.6) + # slightly transparent
  labs(y = "Interest After Speed Date",
       x = "Date of Event",
       title = "Overall Interest in Second Date") +
  scale_linetype_manual(values = c("Men" = "solid", # assign line types to groups
                                   "Women" = "solid",
                                   "Reciprocal" = "dashed"),
                        guide = "none") + # no legend for linetypes
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) + # y-axis in %
  scale_color_manual(values = c("Men" = "#2A00B6", # assign colors to groups
                                "Women" = "#9B0E84",
                                "Reciprocal" = "#E94657"),
                     name = "") +
  theme_minimal() + # apply minimal theme
  theme(panel.grid = element_blank(), # remove all lines of plot raster
        text = element_text(size = 16)) # increase font size
## Animated Plot
plot2 +
transition_reveal(along = date)
Displayed are the percentages of women and men who were interested in a second date after each of their speed dates as well as the percentage of couples in which both partners wanted to see each other again.
Most of the time, women were more interested in second dates than men. Further, the attraction between dating partners often didn’t go both ways: the instances in which both partners of a couple wanted a second date were always far less frequent than the general interest of either men or women. While it’s hard to identify the most romantic time of the year, according to the data there seemed to be a lull in romance in early autumn. Maybe everybody was still heartbroken over their summer fling? Fortunately, Valentine’s Day is in February.
Another very handy option is transition_filter(); it’s a great way to present selected key insights of your data exploration. Here the animation browses through data subsets defined by a series of filter conditions. It’s up to you which data subsets you want to stage. The data is filtered according to logical statements defined in transition_filter(). All rows for which a statement holds true are included in the respective subset. We can assign names to the logical expressions, which can be accessed as frame variables. If the keep argument is set to TRUE, the data of previous frames is permanently displayed in later frames.
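The filtering logic itself can be sketched in base R. The toy data and column values below are made up; the point is only that each named logical condition selects the rows for which it is TRUE, and that the resulting subsets may overlap.

```r
# Toy data: hypothetical partner ratings on a 1-10 scale
toy <- data.frame(id = 1:5,
                  Attractive = c(9, 4, 8, 6, 7),
                  Fun        = c(5, 8, 9, 3, 7))

# Named logical conditions, analogous to the expressions passed to transition_filter()
filters <- list("More Attractive" = toy$Attractive > 7,
                "Less Attractive" = toy$Attractive <= 7,
                "More Fun"        = toy$Fun > 7)

# Each condition keeps the rows for which it holds true
subsets <- lapply(filters, function(keep) toy[keep, ])
sapply(subsets, nrow)  # number of rows per named subset
```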
I want to explore whether one’s own characteristics relate to the attributes one looks for in a partner. Do opposites attract? Or do birds of a feather (want to) flock together?
Displayed below is the importance the speed dating participants assigned to different attributes of a potential partner. Contrasted are subsets of participants who were rated especially funny, attractive, sincere, intelligent or ambitious by their speed dating partners. The rating scale went from 1 = low to 10 = high, thus I assume values of >7 to be rather outstanding.
## Static Plot
# ...importance ratings for different attributes
plot3 <- ggplot(data = df_ratings,
                aes(x = variable, # different attributes
                    y = own_rating, # importance regarding potential partner
                    size = own_rating,
                    color = variable, # different attributes
                    fill = variable)) +
  geom_jitter(alpha = 0.3) +
  labs(x = "Attributes of Potential Partner", # x-axis label
       y = "Allocation of 100 Points (Importance)", # y-axis label
       title = "Importance of Characteristics of Potential Partner", # title
       subtitle = "Subset of {closest_filter} Participants") + # dynamic subtitle
  scale_color_viridis_d(option = "plasma", # use viridis scale for color
                        begin = 0.05, # limit range of used hues
                        end = 0.97,
                        guide = "none") + # don't show legend
  scale_fill_viridis_d(option = "plasma", # use viridis scale for filling
                       begin = 0.05, # limit range of used hues
                       end = 0.97,
                       guide = "none") + # don't show legend
  scale_size_continuous(guide = "none") + # don't show legend
  theme_minimal() + # apply minimal theme
  theme(panel.grid = element_blank(), # remove all lines of plot raster
        text = element_text(size = 16)) # increase font size
## Animated Plot
# ...show ratings for different subsets of participants
# (plot3 already contains the geom_jitter() layer, so it is not added again)
plot3 +
  transition_filter("More Attractive" = Attractive > 7, # adding named filter expressions
                    "Less Attractive" = Attractive <= 7,
                    "More Intelligent" = Intelligent > 7,
                    "Less Intelligent" = Intelligent <= 7,
                    "More Fun" = Fun > 7,
                    "Less Fun" = Fun <= 5)
Of course, the number of extraordinarily attractive, intelligent or funny participants is relatively low. Surprisingly, there seems to be little difference between what the average low vs. high scoring participants look for in a partner. Rather, the lower scoring group includes more people with outlying expectations regarding certain characteristics. Individual tastes seem to vary more or less independently of individual characteristics.
Styling the (dis)appearance of data: enter_* / exit_*
Especially if the displayed subsets of data do not overlap, or overlap only partially, it can be favorable to underscore this visually. A good way to do this is to use the enter_*() and exit_*() functions, which enable us to style the entry and exit of data points that do not persist between frames.
There are many combinable options: data points can simply (dis)appear (the default), fade (enter_fade()/exit_fade()), grow or shrink (enter_grow()/exit_shrink()), gradually change their color (enter_recolor()/exit_recolor()), fly (enter_fly()/exit_fly()) or drift (enter_drift()/exit_drift()) in and out.
We can use these stylistic devices to emphasize changes in the data underlying different frames. I used exit_fade() to let data points that are no longer included gradually fade away while flying them out of the plot area on a vertical route (y_loc = 100); data points re-entering the sample fly in vertically from the bottom of the plot (y_loc = 0):
## Animated Plot
# ...show ratings for different subsets of participants
# (plot3 already contains the geom_jitter() layer, so it is not added again)
plot3 +
  transition_filter("More Attractive" = Attractive > 7, # adding named filter expressions
                    "Less Attractive" = Attractive <= 7,
                    "More Intelligent" = Intelligent > 7,
                    "Less Intelligent" = Intelligent <= 7,
                    "More Fun" = Fun > 7,
                    "Less Fun" = Fun <= 5) +
  enter_fly(y_loc = 0) + # entering data: fly in vertically from bottom
  exit_fly(y_loc = 100) + # exiting data: fly out vertically to top...
  exit_fade() # ...while color is fading
Finetuning and saving: animate() & anim_save()
Fortunately, gganimate makes it very easy to finalize and save our animations. We can pass our finished gganimate object to animate() to, amongst other things, define the number of frames to be rendered (nframes) and/or the rate of frames per second (fps) and/or the number of seconds the animation should last (duration). We also have the option to define the device in which the individual frames are rendered (the default is device = "png", but all popular devices are available). Further, we can define arguments that are passed on to the device, e.g. width or height. Note that simply printing a gganimate object is equivalent to passing it to animate() with default arguments. If we plan to save our animation, the renderer argument is important: the function anim_save() lets us effortlessly save any gganimate object, but only if it was rendered using one of the functions magick_renderer() or the default gifski_renderer().
The function anim_save() is quite straightforward to use. We can define filename and path (defaults to the current working directory) as well as the animation object (defaults to the most recently created animation).
# create a gganimate object
gg_animation <- plot3 +
transition_filter("More Attractive" = Attractive > 7,
"Less Attractive" = Attractive <= 7)
# adjust the animation settings
animate(gg_animation,
width = 900, # 900px wide
height = 600, # 600px high
nframes = 200, # 200 frames
fps = 10) # 10 frames per second
# save the last created animation to the current directory
anim_save("my_animated_plot.gif")
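The three timing arguments are tied together by simple arithmetic, so specifying two of them determines the third. A quick check for the values passed to animate() above:

```r
nframes <- 200  # frames rendered, as in the animate() call above
fps     <- 10   # frames per second, as in the animate() call above

duration <- nframes / fps  # playback time in seconds
duration                   # 20
```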
Conclusion (and a Happy Valentine’s Day)
I hope this blog post gave you an idea of how to use gganimate to upgrade your own ggplots to beautiful and informative animations. I only scratched the surface of gganimate’s functionalities, so please do not mistake this post for an exhaustive description of the presented functions or the package. There is much out there for you to explore, so don’t wait any longer and get started with gganimate!
But even more important: don’t wait on love. The speed dating data shows that most likely there’s someone out there looking for someone just like you. So from everyone here at STATWORX: Happy Valentine’s Day!
## 8 bit heart animation
library(dplyr) # provides %>%, mutate() and row_number()
animation2 <- ggplot(data = df_eight_bit_heart %>% # includes color and x/y position of pixels
                       mutate(id = row_number()), # create row number as ID
                     aes(x = x,
                         y = y,
                         color = color,
                         group = id)) +
  geom_point(size = 18, # depends on height & width of animation
             shape = 15) + # square
  scale_color_manual(values = c("black" = "black", # map values of color to actual colors
                                "red" = "firebrick2",
                                "dark red" = "firebrick",
                                "white" = "white"),
                     guide = "none") + # do not include legend
  theme_void() + # remove everything but geom from plot
  transition_states(-y, # reveal from high to low y values
                    state_length = 0) +
  shadow_mark() + # keep all past data points
  enter_grow() + # new data grows
  enter_fade() # new data starts without color
animate(animation2,
        width = 250, # depends on size defined in geom_point
        height = 250, # depends on size defined in geom_point
        end_pause = 15) # pause at end of animation
Nearly one year ago, I analyzed how we use emojis in our Slack messages. Since then, STATWORX grew, and we are a lot more people now! So, I just wanted to check if something changed.
Last time, I did not show our custom emojis, since they are, of course, not available in the fonts I used. This time, I will incorporate them with geom_image(). It is part of the ggimage package from Guangchuang Yu, which you can find here on his GitHub. With geom_image() you can add images like .png files to your ggplot.
What changed since last year?
Let’s first have a look at the number of emojis we are using. In the plot below, you can see that since my last analysis in October 2018 (red line) the number of emojis has been rising. Not as much as I thought it would, but compared to the previous period, we now have more days with a usage of over 100 emojis per day!
Like last time, our top emoji is 👍, followed by 😂 and 😄. But sneaking in at number ten is one of our custom emojis: party_hat_parrot!
How to include custom images?
In my previous blog post, I hid all our custom emojis behind ❓ since they were not part of the font. It did not occur to me to use their images, even though the package is from the same creator! So, to make up for my ignorance, I grabbed the top 30 custom emojis and downloaded their images from our Slack servers, saved them as .png and made sure they are all roughly the same size.
To use geom_image() I just added the path of the images to my data (the … are just an abbreviation for the complete path).
   NAME              COUNT REACTION IMAGE
1: alnatura             25       63 .../custom/alnatura.png
2: blog                 19       20 .../custom/blog.png
3: dataiku              15       22 .../custom/dataiku.png
4: dealwithit_parrot     3      100 .../custom/dealwithit_parrot.png
5: deananddavid         31       18 .../custom/deananddavid.png
This would have been enough to add the images, but since I wanted the NAME attribute as a label, I included geom_text_repel from the ggrepel library. This makes handling of non-overlapping labels much simpler!
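library(ggplot2)
library(ggimage) # geom_image()
library(ggrepel) # geom_text_repel()
ggplot(custom_dt, aes(x = REACTION, y = COUNT, label = NAME)) +
  geom_image(aes(image = IMAGE), size = 0.04) + # draw the emoji images
  geom_text_repel(point.padding = 0.9, segment.alpha = 0) + # labels without connecting lines
  xlab("as reaction") +
  ylab("within message") +
  theme_minimal()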
Usually, if a label is "too far" away from the marker, geom_text_repel includes a line to indicate where the labels belong. Since these lines would overlap the images, I used segment.alpha = 0 to make them invisible. With point.padding = 0.9 I gave the labels a bit more space, so it looks nicer. Depending on the size of the plot, this needs to be adjusted. In the plot, one can see our usage of emojis within a message (y-axis) and as a reaction (x-axis).
To combine the emoji font and custom emojis, I used the following data and code. Really, why did I not do this last time? Since the UNICODE is NA when I want to use the IMAGE, there is no "double plotting".
    EMOJI                    REACTION COUNT  SUM PLACE UNICODE   IMAGE
 1: :+1:                         1090     0 1090     1 U0001f44d
 2: :joy:                         609   152  761     2 U0001f602
 3: :smile:                        91   496  587     3 U0001f604
 4: :-1:                          434     9  443     4 U0001f44e
 5: :tada:                        346    38  384     5 U0001f389
 6: :fire:                        274    17  291     6 U0001f525
 7: :slightly_smiling_face:         1   250  251     7 U0001f642
 8: :wink:                         27   191  218     8 U0001f609
 9: :clap:                        201    13  214     9 U0001f44f
10: :party_hat_parrot:            192     9  201    10 <NA>      .../custom/party_hat_parrot.png
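library(gridSVG) # provides grid.export()
quartz() # open a graphics device (macOS only)
ggplot(plotdata2, aes(x = PLACE, y = SUM, label = UNICODE)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  geom_text(family = "EmojiOne") + # emojis from the font (UNICODE is set)
  xlab("Most popular emojis") +
  ylab("Number of uses") +
  scale_fill_brewer(palette = "Paired") +
  geom_image(aes(image = IMAGE), size = 0.04) + # custom emojis (IMAGE is set)
  theme_minimal()
# export the plot as SVG; main_path is defined elsewhere
ps <- grid.export(paste0(main_path, "plots/top-10-used-emojis.svg"), addClass = TRUE)
dev.off()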
The meaning behind emojis
Now we know what our top emojis are. But what is the rest of the world doing? Thanks to Emojimore for providing me with this overview! On their site, you can find meanings for a lot more emojis.
Behind each of our custom emojis is a story as well. For example, all the food emojis are helping us every day to decide where to eat and provide information on what everyone is planning for lunch! And if you do not agree with the decision, just react with sadphan to let the others know about your feelings. If you want to know the whole stories behind all custom emojis or even help create new ones, then maybe you should join our team — check out our available job offers here!
In the last post of this series, we dealt with axis systems. In this post, we are also dealing with axes, but this time we are taking a look at the position scales of dates, times, and datetimes. Since we at STATWORX are often forecasting – and thus plotting – time series, this is an important issue for us. The choice of axis ticks and labels can make the message conveyed by a plot clearer. Oftentimes, some points in time are – e.g. due to their business implications – more important than others and should be easily identified. Unequivocal, yet parsimonious labeling is key to the readability of any plot. Luckily, ggplot2 enables us to do so for dates and times with almost no effort at all.
We are using ggplot’s economics data set. Our base plot looks like this:
library(ggplot2) # also provides the economics data set
base_plot <- ggplot(data = economics) +
  geom_line(aes(x = date, y = unemploy),
            color = "#09557f",
            alpha = 0.6,
            size = 0.6) +
  labs(x = "Date",
       y = "US Unemployed in Thousands",
       title = "Base Plot") +
  theme_minimal()
Scale Types
As of now, ggplot2 supports three date and time classes: Date, POSIXct and hms. Depending on the class at hand, axis ticks and labels can be controlled by using scale_*_date, scale_*_datetime or scale_*_time, respectively. Depending on whether one wants to modify the x or the y axis, scale_x_* or scale_y_* is to be employed. For the sake of simplicity, only scale_x_date is employed in the examples, but all discussed arguments work just the same for all mentioned scales.
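If you are unsure which scale applies to your data, checking the class of the variable settles it. The helper `suggest_scale()` below is hypothetical and only encodes the class-to-scale pairing; it is not part of ggplot2.

```r
library(ggplot2)  # provides the economics data set

# Hypothetical helper: suggest the x-axis scale for a date/time vector
suggest_scale <- function(x) {
  if (inherits(x, "Date"))    return("scale_x_date")
  if (inherits(x, "POSIXct")) return("scale_x_datetime")
  if (inherits(x, "hms"))     return("scale_x_time")
  stop("not a supported date/time class")
}

suggest_scale(economics$date)
suggest_scale(as.POSIXct("2019-02-14 18:30:00", tz = "UTC"))
```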
Minor Modifications
Let’s start easy. With the argument limits the range of the displayed dates or times can be set. Two values of the correct date or time class have to be supplied.
base_plot +
  scale_x_date(limits = as.Date(c("1980-01-01", "2000-01-01"))) +
  ggtitle('limits = as.Date(c("1980-01-01","2000-01-01"))') # single quotes around the title string
The expand argument ensures that there is some distance between the displayed data and the axes. It takes a multiplicative and an additive constant: the multiplicative constant is multiplied with the range of the displayed data, the additive constant with one unit of the depicted data (one day for dates). The sum of the two resulting distances is added to the axis limits as padding. The resulting empty space is added at the left and right end of the x-axis or the top and bottom of the y-axis.
base_plot +
scale_x_date(expand = c(0, 5000)) + #5000/365 = 13.69863 years
ggtitle("expand = c(0, 5000)")
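The padding arithmetic can be reproduced by hand. This sketch follows the description of expand given here for the economics data; it illustrates the calculation and makes no claim about how ggplot2 computes it internally.

```r
library(ggplot2)  # provides the economics data set

data_range <- range(economics$date)  # first and last observation
mult <- 0     # multiplicative constant
add  <- 5000  # additive constant, in days for Date scales

# Padding per side: share of the data range plus a fixed number of days
padding <- mult * as.numeric(diff(data_range)) + add
padded_limits <- c(data_range[1] - padding, data_range[2] + padding)

padding / 365  # roughly 13.7 years on each side
padded_limits
```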
The position argument defines where the labels are displayed: either "left" or "right" of the y-axis, or on the "top" or "bottom" of the x-axis.
base_plot +
  scale_x_date(position = "top") +
  ggtitle('position = "top"') # single quotes around the title string
Axis Ticks and Grid Lines
More essential than the cosmetic modifications discussed so far are the axis ticks. There are several ways to define the axis ticks of dates and times. There are the labeled major breaks and, in addition, the minor breaks, which are not labeled but marked by grid lines. These can be customized with the arguments breaks and minor_breaks, respectively. Both breaks and minor_breaks can be defined by a vector of exact positions (of the same date or time class as the data) or by a function with the axis limits as inputs and breaks as outputs. Alternatively, the arguments can be set to NULL to display no (minor) breaks at all. These options are especially handy if irregular intervals between breaks are desired.
base_plot +
scale_x_date(breaks = as.Date(c("1970-01-01", "2000-01-01")),
minor_breaks = as.Date(c("1975-01-01", "1980-01-01",
"2005-01-01", "2010-01-01"))) +
ggtitle("(minor_)breaks = fixed Dates")
base_plot +
scale_x_date(breaks = function(x) seq.Date(from = min(x),
to = max(x),
by = "12 years"),
minor_breaks = function(x) seq.Date(from = min(x),
to = max(x),
by = "2 years")) +
ggtitle("(minor_)breaks = custom function")
base_plot +
scale_x_date(breaks = NULL,
minor_breaks = NULL) +
ggtitle("(minor_)breaks = NULL")
Another very convenient way to define regular breaks is via the date_breaks and date_minor_breaks arguments. As input, both arguments take a string combining an integer and a time unit (either "sec", "min", "hour", "day", "week", "month" or "year"), e.g. "10 years", which together specify the interval between breaks.
base_plot +
  scale_x_date(date_breaks = "10 years",
               date_minor_breaks = "2 years") +
  ggtitle('date_(minor_)breaks = "x years"') # single quotes around the title string
If both are given, date_(minor_)breaks overrules (minor_)breaks.
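Base R’s seq() understands the same kind of interval strings, which makes it easy to preview what regularly spaced breaks look like. The start and end dates here are chosen for illustration; where exactly ggplot2 places its first break on a given axis may differ.

```r
# Regular breaks every 10 years between two hypothetical end points
breaks_10y <- seq(from = as.Date("1970-01-01"),
                  to   = as.Date("2010-01-01"),
                  by   = "10 years")
breaks_10y

# Minor breaks every 2 years over the same span
minor_2y <- seq(as.Date("1970-01-01"), as.Date("2010-01-01"), by = "2 years")
length(minor_2y)
```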
Axis Labels
Similar to the axis ticks, the format of the displayed labels can be defined via either the labels or the date_labels argument. The labels argument can be set to NULL if no labels should be displayed, or to a function with the breaks as inputs and the labels as outputs. Alternatively, a character vector with labels for all the breaks can be supplied to the argument. This can be very useful, since virtually any character vector can be used to label the breaks this way. The number of labels must be the same as the number of breaks. If the breaks are defined by a function, by date_breaks or by default, the labels must be defined by a function as well.
base_plot +
scale_x_date(date_breaks = "15 years",
labels = function(x) paste((x-365), "(+365 days)")) +
ggtitle("labels = custom function")
base_plot +
scale_x_date(breaks = as.Date(c("1970-01-01", "2000-01-01")),
labels = c("~ '70", "~ '00")) +
ggtitle("labels = character vector")
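A label function is just a function that receives the break positions and returns one string per break, so it can be tried out on its own first. The labeller `short_year()` is a made-up example.

```r
# Hypothetical labeller: two-digit year with a leading apostrophe
short_year <- function(breaks) paste0("'", format(breaks, "%y"))

short_year(as.Date(c("1970-01-01", "2000-01-01")))
```

Passed as labels = short_year, it works no matter how the breaks were defined, because the labels are computed from whatever breaks the scale ends up with.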
Furthermore, and very conveniently, the format of the labels can be controlled via the argument date_labels, set to a string of formatting codes defining the order, format and elements to be displayed:
Code | Meaning |
---|---|
%S | second (00-59) |
%M | minute (00-59) |
%l | hour, in 12-hour clock (1-12) |
%I | hour, in 12-hour clock (01-12) |
%H | hour, in 24-hour clock (00-23) |
%a | day of the week, abbreviated (Mon-Sun) |
%A | day of the week, full (Monday-Sunday) |
%e | day of the month (1-31) |
%d | day of the month (01-31) |
%m | month, numeric (01-12) |
%b | month, abbreviated (Jan-Dec) |
%B | month, full (January-December) |
%y | year, without century (00-99) |
%Y | year, with century (0000-9999) |
Source: Wickham 2009 p. 99
base_plot +
  scale_x_date(date_labels = "%Y (%b)") +
  ggtitle('date_labels = "%Y (%b)"') # single quotes around the title string
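The same formatting codes are understood by base R’s format(), which is a quick way to try out a format string before putting it into a scale. The date is arbitrary.

```r
d <- as.Date("2019-02-14")

format(d, "%Y-%m-%d")  # numeric codes only, independent of locale
format(d, "%d.%m.%y")  # day.month.two-digit year
format(d, "%Y (%b)")   # month abbreviation depends on the session locale
```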
The choice of axis ticks and labels might seem trivial. However, one should not underestimate the amount of confusion that can be caused by too many, too few or poorly positioned axis ticks and labels. Further, economical yet clear labeling of axis ticks can increase the readability and visual appeal of any time series plot immensely. Since it is so easy to tweak the date and time axes in ggplot2, there is simply no excuse not to do so.
References
- Wickham, H. (2009). ggplot2: elegant graphics for data analysis. Springer.