Trusted by Students Everywhere
Why Choose Us?
0% AI Guarantee
Human-written only.
24/7 Support
Anytime, anywhere.
Plagiarism Free
100% Original.
Expert Tutors
Masters & PhDs.
100% Confidential
Your privacy matters.
On-Time Delivery
Never miss a deadline.
### Problem 4: ddply() practice This problem uses the Adult dataset, which we load below
### Problem 4: ddply() practice
This problem uses the Adult dataset, which we load below. The main variable of interest here is `high.income`, which indicates whether the individual's income was over $50K. Anyone for whom `high.income == 1` is considered a "high earner".
```{r}
adult.data <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data", header=FALSE, fill=FALSE, strip.white=T,
col.names=c("age", "type_employer", "fnlwgt", "education",
"education_num","marital", "occupation", "relationship", "race","sex",
"capital_gain", "capital_loss", "hr_per_week","country", "income"))
adult.data <- mutate(adult.data,
high.income = as.numeric(income == ">50K"))
```
##### (a) Income by education level
Use the `ddply()` function to produce a summary table showing how many individuals there are in each `education_num` bin, and how the proportion of high earners varies across `education_num` levels. Your table should have column names: `education_num`, `count` and `high.earn.rate`.
```{r}
# Edit me
```
##### (b) Constructing a bar chart
Using the `ggplot` and `geom_bar` commands along with your data summary from part **(a)** to create a bar chart showing the high earning rate on the y axis and `education_num` on the x axis. Specify that the color of the bars should be determined by the number of individuals in each bin.
```{r}
# Edit me
```
##### (c) summary table with multiple splitting variables
Use the `ddply()` function to produce a summary table showing how the proportion of high earners varies across all combinations of the following variables: `sex`, `race`, and `marital` (marital status). In addition to showing the proportion of high earners, your table should also show the number of individuals in each bin. Your table should have column names: `sex`, `race`, `marital`, `count` and `high.earn.rate`.
```{r}
# Edit me
```
##### (d) Nicer table output using `kable()`
Use the `kable()` function from the `knitr` library to display the table from part **(c)** in nice formatting. You should use the `digits` argument to ensure that the values in your table are being rounded to a reasonable number of decimal places.
```{r}
# Edit me
```
### Problem 5: Getting the right plot
##### (a) A more complex bar chart.
Using the table you created in 4(c), use ggplot graphics to construct a plot that looks like [the one at this link](http://www.andrew.cmu.edu/user/achoulde/94842/homework/target_fig.png)
**Hint** You may find it useful to use the following layers: `facet_grid`, `coord_flip` (for horizontal bar charts), `theme` (rotating x axis text) and `guides` (removing fill legend).
```{r, fig.height = 4, fig.width = 8}
# Edit me
```
##### (b) Hiding code with `echo`
Repeat part **(a)**, but this time set the `echo` argument of the code chunk in such a way that the code is not printed, but the plot is still displayed.
```{r, fig.height = 4, fig.width = 8}
# Edit me
Expert Solution
For detailed step-by-step solution, place custom order now.
Need this Answer?
This solution is not in the archive yet. Hire an expert to solve it for you.
Get a Quote





