R for Stata users

Graphs can be made with ggplot2, whose syntax is based on Leland Wilkinson’s The Grammar of Graphics. Lots of tutorials are available on the Internet (the official documentation is here), so I’ll be brief.

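Basically, a graph is composed of three distinct parts: an aesthetic, which maps variables to visual properties such as the x-axis, the y-axis, or the color; a stat, which specifies how the data are transformed before plotting (for instance, binned or smoothed); and a geom, which specifies the kind of mark that is drawn (points, lines, bars).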

Given an aesthetic and a stat, different geoms correspond to different visualizations of the same data. They can be layered on top of each other with the + operator:

ggplot(DT, aes(x = v1, y = v2)) + 
geom_point() + geom_smooth(method = "loess")
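To make the layering concrete, here is a minimal sketch that reuses one aesthetic mapping with different geoms. The data frame DT and its contents are made up for illustration, and the object names base, p_points, p_line and p_both are hypothetical.

```r
library(ggplot2)

# Toy data standing in for DT (hypothetical values, for illustration only)
DT <- data.frame(v1 = 1:20, v2 = (1:20)^2)

# The aesthetic is declared once and shared by every layer
base <- ggplot(DT, aes(x = v1, y = v2))

# Same data, same aesthetic, different geoms
p_points <- base + geom_point()
p_line   <- base + geom_line()

# Geoms are stacked with +; each call adds one layer to the plot
p_both <- base + geom_point() + geom_line()
```

Printing any of these objects (for instance p_both) draws the corresponding graph.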

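As a rule of thumb, the dataset given to ggplot must be “longer” than the one you would use in Stata: instead of one column per series, ggplot expects one row per observation and per series. As an example, let’s start from the following dataset: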

#> year   p50  p25  p75
#> 1990   4.5  2.2  8.4
#> 1991   4.2  3.5  9.3
#> 1992   8.3  1.8 12.3
#> 1993   2.9  0.5  3.5

Stata:
twoway line p50 year || line p25 year || line p75 year

R:
gather(df, variable, value, p50:p75) %>%
ggplot(aes(x = year, y = value, color = variable)) + geom_line()
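For readers who want to run the example end to end, here is a self-contained sketch that rebuilds the dataset above and reshapes it to long format. It swaps in tidyr's pivot_longer() for the gather() call shown above, since pivot_longer() is the newer tidyr function for the same reshape. The object names df, long and p are assumptions made for illustration.

```r
library(tidyr)
library(ggplot2)

# Rebuild the wide dataset shown above: one column per percentile
df <- data.frame(
  year = c(1990, 1991, 1992, 1993),
  p50  = c(4.5, 4.2, 8.3, 2.9),
  p25  = c(2.2, 3.5, 1.8, 0.5),
  p75  = c(8.4, 9.3, 12.3, 3.5)
)

# Reshape to long format: one row per (year, percentile) pair
long <- pivot_longer(df, cols = p50:p75,
                     names_to = "variable", values_to = "value")

# One line per percentile, distinguished by color
p <- ggplot(long, aes(x = year, y = value, color = variable)) +
  geom_line()
```

The long dataset has three rows per year, one for each of p50, p25 and p75, which is what lets a single geom_line() call draw all three series.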