# Lab 3, September 26

## Exercise 1

### Background

This exercise is inspired by a recent New York Times visualization of college enrollment rates by race/ethnicity. We will focus on public flagship universities.

Download `enrollment.txt` from our Canvas site and import the data set into an R data frame named `enr`.

(These data are from IPEDS, a survey conducted by the National Center for Education Statistics.)

Print the first few rows of `enr` in your R console. For each of the 50 public flagship universities, this data set contains the number (`count`) of new freshmen in each of five race/ethnicity categories (`reth`) for the years 1994–2015.

### Assignment Part 1

Reference material: statistical transformations

First we will focus on the University of Michigan. Filter `enr` so it only contains data for Michigan:

``````umenr <- filter(enr, School=="University of Michigan-Ann Arbor")
``````

Now recreate the following three plots. In parts (a) and (b), map the `count` variable to the `y` aesthetic and specify the appropriate `position` and `stat` arguments to `geom_bar`. In part (c), map the `pct` variable to the `y` aesthetic.

These graphs use a color palette developed by Color Brewer:

`````` +  scale_fill_brewer(palette='Set2',name='')
``````

Which plot you think is more informative?

### Assignment Part 2

Now recreate, as closely as possible, the NYT plot of all 50 public flagship universities, displaying the proportion of freshmen in each race/ethnicity category over time.

Use the `pct` variable in the `enr` data frame. You can remove the “other” category for simplicity.

Here is a code snippet to get you started. Fill in the `...` with your own code.

``````ggplot(filter(enr, reth!="Other/unknown")) +
... # add the appropriate geom
facet_wrap(...) +
scale_x_continuous(breaks=...,
labels=...) +
... + # change the axis labels
scale_color_brewer(palette='Set1',name='')
``````

My attempt is below. There are clearly some problems with the way I am calculating these percentages (e.g. University of Maine). Perhaps a future lab exercise will consist of properly downloading and computing these enrollment percentages.

## Solutions

Solutions to exercise 1