This exercise is inspired by a recent New York Times visualization of college enrollment rates by race/ethnicity. We will focus on public flagship universities.
Download enrollment.txt from our Canvas site and import the data set into an R data frame named enr.
(These data are from IPEDS, a survey conducted by the National Center for Education Statistics.)
Print the first few rows of enr in your R console. For each of the 50 public flagship universities, this data set contains the number (count) of new freshmen in each of five race/ethnicity categories (reth) for the years 1994–2015.
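As a rough sketch of the import-and-inspect step, the code below writes a tiny made-up stand-in file and reads it back. It assumes the file is tab-delimited with a header row, which may not match the real enrollment.txt. The name enr_toy, the year column, and all values are hypothetical; only School, reth, and count come from this handout.

```r
# Toy stand-in for enrollment.txt. The values are invented, and the real
# file's delimiter and column names may differ (assumption).
toy_path <- tempfile(fileext = ".txt")
writeLines(c(
  "School\tyear\treth\tcount",
  "University of Michigan-Ann Arbor\t1994\tWhite\t100",
  "University of Michigan-Ann Arbor\t1994\tBlack\t20",
  "University of Maine\t1994\tWhite\t50"
), toy_path)

# read.delim() assumes tab-separated values with a header row.
enr_toy <- read.delim(toy_path, stringsAsFactors = FALSE)

head(enr_toy)  # print the first few rows
str(enr_toy)   # check column names and types
```

If the real file uses a different separator, read.table() with an explicit sep argument would be the place to adjust.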
Reference material: statistical transformations
First we will focus on the University of Michigan. Filter enr so it only contains data for Michigan:
umenr <- filter(enr, School=="University of Michigan-Ann Arbor")
Now recreate the following three plots. In parts (a) and (b), map the count variable to the y aesthetic and specify the appropriate stat arguments to geom_bar. In part (c), map the pct variable to the y aesthetic.
These graphs use a color palette developed by Color Brewer:
Which plot do you think is most informative?
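To see how the stat and position arguments to geom_bar change a bar chart of pre-computed counts, here is a minimal sketch on a made-up data frame. The year column name and all numbers are assumptions for illustration, and which variant corresponds to which part of the exercise is left open.

```r
library(ggplot2)

# Made-up counts for illustration only; these are not the IPEDS data.
toy <- data.frame(
  year  = rep(c(1994, 1995), each = 2),
  reth  = rep(c("A", "B"), times = 2),
  count = c(30, 10, 20, 20)
)

# stat = "identity" plots the y values as given instead of counting rows.
p_stack <- ggplot(toy, aes(x = year, y = count, fill = reth)) +
  geom_bar(stat = "identity")                       # stacked (default position)

p_dodge <- ggplot(toy, aes(x = year, y = count, fill = reth)) +
  geom_bar(stat = "identity", position = "dodge")   # bars side by side

p_fill <- ggplot(toy, aes(x = year, y = count, fill = reth)) +
  geom_bar(stat = "identity", position = "fill") +  # each bar rescaled to sum to 1
  scale_fill_brewer(palette = "Set1")               # a Color Brewer palette
```

Printing any of the three objects (for example p_stack) draws the plot.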
Now recreate, as closely as possible, the NYT plot of all 50 public flagship universities, displaying the proportion of freshmen in each race/ethnicity category over time. Use the pct variable in the enr data frame. You can remove the “other” category for simplicity.
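For context, a percentage of this kind can be derived from counts by dividing each count by the total for its school and year. The sketch below does this with dplyr on made-up data. It is not necessarily how pct was computed in enr; the year column name and the 0–100 scale are assumptions.

```r
library(dplyr)

# Made-up counts for one hypothetical school; not the IPEDS data.
toy <- data.frame(
  School = "Example U",
  year   = rep(c(1994, 1995), each = 2),
  reth   = rep(c("A", "B"), times = 2),
  count  = c(30, 10, 20, 20)
)

# Within each school and year, express each category as a share of the total.
toy_pct <- toy %>%
  group_by(School, year) %>%
  mutate(pct = 100 * count / sum(count)) %>%
  ungroup()
```

Grouping by both School and year matters here: dropping either one would divide by a total pooled across schools or across years.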
Here is a code snippet to get you started. Fill in the ... with your own code.
ggplot(filter(enr, reth!="Other/unknown")) +
  ... + # add the appropriate geom
  facet_wrap(...) +
  scale_x_continuous(breaks=..., labels=...) +
  ... + # change the axis labels
  scale_color_brewer(palette='Set1',name='')
My attempt is below. There are clearly some problems with the way I am calculating these percentages (e.g. University of Maine). Perhaps a future lab exercise will consist of properly downloading and computing these enrollment percentages.