### Problem 4: ddply() practice

This problem uses the Adult dataset, which we load below. The main variable of interest here is `high.income`, which indicates whether the individual's income was over $50K. Anyone for whom `high.income == 1` is considered a "high earner".

```{r}
library(plyr)  # provides mutate() and ddply()

adult.data <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data",
                       header = FALSE, fill = FALSE, strip.white = TRUE,
                       col.names = c("age", "type_employer", "fnlwgt", "education",
                                     "education_num", "marital", "occupation", "relationship",
                                     "race", "sex", "capital_gain", "capital_loss",
                                     "hr_per_week", "country", "income"))

# high.income is 1 when income is ">50K" and 0 otherwise
adult.data <- mutate(adult.data, high.income = as.numeric(income == ">50K"))
```

##### (a) Income by education level

Use the `ddply()` function to produce a summary table showing how many individuals there are in each `education_num` bin, and how the proportion of high earners varies across `education_num` levels. Your table should have column names: `education_num`, `count` and `high.earn.rate`.

```{r}
# Edit me
```

##### (b) Constructing a bar chart

Use the `ggplot` and `geom_bar` commands along with your data summary from part **(a)** to create a bar chart showing the high earning rate on the y axis and `education_num` on the x axis. Specify that the color of the bars should be determined by the number of individuals in each bin.

```{r}
# Edit me
```

##### (c) Summary table with multiple splitting variables

Use the `ddply()` function to produce a summary table showing how the proportion of high earners varies across all combinations of the following variables: `sex`, `race`, and `marital` (marital status). In addition to showing the proportion of high earners, your table should also show the number of individuals in each bin. Your table should have column names: `sex`, `race`, `marital`, `count` and `high.earn.rate`.
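
The summaries requested in parts (a) and (c) follow the same split-apply-combine pattern, which can be sketched on a small made-up data frame. The `toy` data and the names `by.edu` and `by.edu.sex` are assumptions introduced for illustration only; they are not part of the assignment, and this is one possible approach rather than the expected solution.

```r
library(plyr)  # provides ddply() and summarize()

# Toy stand-in for adult.data (hypothetical values, not the real dataset)
toy <- data.frame(
  education_num = c(9, 9, 9, 13, 13, 16),
  sex           = c("Male", "Female", "Male", "Female", "Male", "Female"),
  high.income   = c(0, 0, 1, 1, 1, 1)
)

# One splitting variable: count rows and average the 0/1 indicator per group
by.edu <- ddply(toy, "education_num", summarize,
                count = length(high.income),
                high.earn.rate = mean(high.income))

# Several splitting variables: pass a character vector of column names
by.edu.sex <- ddply(toy, c("education_num", "sex"), summarize,
                    count = length(high.income),
                    high.earn.rate = mean(high.income))

by.edu
by.edu.sex
```

Because `high.income` is coded as 0/1, its group mean is the proportion of high earners in that group, so no separate division is needed.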
```{r}
# Edit me
```

##### (d) Nicer table output using `kable()`

Use the `kable()` function from the `knitr` library to display the table from part **(c)** in nice formatting. You should use the `digits` argument to ensure that the values in your table are being rounded to a reasonable number of decimal places.

```{r}
# Edit me
```

### Problem 5: Getting the right plot

##### (a) A more complex bar chart.

Using the table you created in 4(c), use ggplot graphics to construct a plot that looks like [the one at this link](http://www.andrew.cmu.edu/user/achoulde/94842/homework/target_fig.png).

**Hint:** You may find it useful to use the following layers: `facet_grid`, `coord_flip` (for horizontal bar charts), `theme` (rotating x axis text) and `guides` (removing fill legend).

```{r, fig.height = 4, fig.width = 8}
# Edit me
```

##### (b) Hiding code with `echo`

Repeat part **(a)**, but this time set the `echo` argument of the code chunk in such a way that the code is not printed, but the plot is still displayed.

```{r, fig.height = 4, fig.width = 8}
# Edit me
```
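
The layers named in the hint, together with `kable()` from Problem 4(d), can be sketched on a small made-up summary table. The `toy.tbl` data frame and its numbers are hypothetical, and the plot makes no attempt to reproduce the linked target figure; it only shows how the pieces might fit together.

```r
library(ggplot2)
library(knitr)

# Toy summary table (hypothetical numbers) shaped like a reduced 4(c) output
toy.tbl <- data.frame(
  sex            = c("Female", "Female", "Male", "Male"),
  marital        = c("Divorced", "Never-married", "Divorced", "Never-married"),
  count          = c(120, 300, 90, 410),
  high.earn.rate = c(0.0612, 0.0254, 0.1187, 0.0449)
)

# kable() renders a data frame as a formatted table; digits controls rounding
kable(toy.tbl, digits = 3)

# stat = "identity" draws bar heights from high.earn.rate instead of row counts
p <- ggplot(toy.tbl, aes(x = marital, y = high.earn.rate, fill = marital)) +
  geom_bar(stat = "identity") +
  facet_grid(. ~ sex) +                                       # one panel per sex
  coord_flip() +                                              # horizontal bars
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +  # rotate x axis text
  guides(fill = "none")                                       # drop the fill legend
p
```

In an R Markdown chunk, adding `echo = FALSE` to the chunk header hides the source code while still displaying the chunk's output, such as a plot.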