STATS 330 Assignment 2
Write your assignment using R Markdown. Knit your report to either an HTML or a PDF document.
Create a section for each question. Include all relevant code and output in the final document.
Please keep your code tidy and your plots neat and professional. For example, it’s very useful for the reader if you use informative, readable axis labels rather than allowing the default behaviour of printing the R object name.
Please refer to the plots in-text when interpreting.
5 presentation marks are available.
Please remember to upload your HTML or PDF document to Canvas by the due date.
Please remember to upload your R Markdown file to Canvas before the deadline, too. If the markers identify an error in your work, being able to run the code you have written can help determine what you did wrong.
Due on 13th April 2022 at 11:00 AM (NZ time)
Question 1
In this exercise, we will explore whether a person’s frequency of prayer describes whether they believe divorce should be more difficult to obtain than it is now. The data set divorce contains information about:
divlaw : Should divorce in this country be easier or more difficult to obtain than it is now? (See Table 1)
pray : How often does respondent pray? (See Table 2)
age : Age of respondent
relig : What is your religious preference? (See Table 3)
1. You will need to do data preparation first before you start building models. 5 marks
a. Recode divlaw into a binary variable that suits the purpose of this study. (See Table 4)
b. Recode pray into the following categories. (See Table 5)
c. Recode relig into the following categories. (See Table 6)
d. Do you think it is reasonable to consider IAP and DK as NA for divlaw ?
2. Make appropriate plots of the response variable with age , pray and relig and comment on what you can observe. 12 marks
3. Fit a logistic regression model with all the predictors ( age , pray and relig ) as main
effects. 2 marks
4. Report the residual deviance and associated degrees of freedom. Can we use this information to determine the goodness of fit of the fitted model? Explain. 4 marks
5. Plot the deviance residuals against the fitted values on the logit scale. What can be concluded from this plot? 2 marks
6. Provide an appropriate residual plot to determine the adequacy of the fitted model and comment on what you can observe. 3 marks
7. Provide a Chi-squared test for each variable included in the model and comment on the results. 3 marks
8. Provide an interpretation of the significant coefficients. 5 marks
9. Which subset of people are predicted to have the highest probability of supporting stricter
divorce laws, and which subset of people are predicted to have the lowest probability of
supporting stricter divorce laws? 4 marks
40 marks
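The modelling steps in parts 3 to 9 can be sketched as follows. This is only a minimal illustration on simulated data: the data frame `toy`, its reduced set of factor levels, and the coefficients used to generate the response are all made up for the example and say nothing about the real divorce data. Note that for an ungrouped binary response the residual deviance does not follow a chi-squared distribution, which is relevant to part 4.

```r
# Simulated stand-in for the recoded divorce data (hypothetical values only)
set.seed(330)
n <- 500
toy <- data.frame(
  age   = round(runif(n, 18, 89)),
  pray  = factor(sample(c("Never", "Weekly", "Daily"), n, replace = TRUE),
                 levels = c("Never", "Weekly", "Daily")),
  relig = factor(sample(c("None", "Christian", "Catholic"), n, replace = TRUE),
                 levels = c("None", "Christian", "Catholic"))
)
# Made-up relationship, used only to generate a 0/1 response
eta <- -1.5 + 0.01 * toy$age + 0.4 * (toy$pray == "Daily")
toy$divlaw <- rbinom(n, 1, plogis(eta))

# Part 3: logistic regression with main effects only
fit <- glm(divlaw ~ age + pray + relig, family = binomial, data = toy)

# Part 4: residual deviance and its degrees of freedom
deviance(fit)
df.residual(fit)

# Part 5: deviance residuals against fitted values on the logit scale
plot(predict(fit, type = "link"), residuals(fit, type = "deviance"),
     xlab = "Fitted value (logit scale)", ylab = "Deviance residual")

# Part 7: chi-squared tests, each term adjusted for the others
drop1(fit, test = "Chisq")

# Part 9: predicted probabilities over a grid of predictor combinations
grid <- expand.grid(age = c(18, 89),
                    pray = levels(toy$pray),
                    relig = levels(toy$relig))
grid$p_hat <- predict(fit, newdata = grid, type = "response")
grid[which.max(grid$p_hat), ]  # combination with the highest predicted probability
grid[which.min(grid$p_hat), ]  # combination with the lowest predicted probability
```

`drop1()` tests each term given all the others, whereas `anova(fit, test = "Chisq")` adds terms sequentially, so the two can give different answers when predictors are related.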
Question 2
The dataset airbnb contains information about 1561 Airbnb listings in Chicago, collected in 2016. The variables in the dataset are:
price : the nightly price of the listing (in USD)
rating : the listing’s average rating, on a scale from 1 to 5
room_type : the type of listing (a shared room, an entire house/apartment, a private room)
bedrooms : the number of bedrooms the listing has
min_stay : the minimum number of nights to stay in the listing
district : the district in which the listing is located
transit_score : the neighborhood’s rating for access to public transit (0–100)
In this exercise, we are interested in building a model to describe the variation in price using
rating , room_type , bedrooms , min_stay , transit_score and district .
1. Plot the response variable and comment on what you can observe. Describe how we can use
this response variable to fit a linear model. 5 marks
2. Fit a linear model to predict the price using the given predictor variables. 2 marks
3. How well does this model fit the data in terms of R^2? Plot the residuals against the fitted values and comment on what you can observe. 4 marks
4. Based on the type of the response variable, what distribution is appropriate to use for the response variable? 3 marks
5. Write down the full model you would fit for the response and predictor variables based on
part 4. 2 marks
6. Fit the model specified in part 5 by using the data given. 2 marks
7. Perform and interpret a test of goodness-of-fit for the model fitted in part 6. 4 marks
8. Produce and comment on appropriate residual plots for the model fitted in part 5. 5 marks
9. Suggest two possible models that you can use to address the problems noted in part 8. 2 marks
10. Fit the models you have suggested in part 9. 3 marks
11. Produce and comment on appropriate residual plots for the models fitted in part 10. 4 marks
12. Do you observe any unusual observation(s) in the residual plots obtained in part 11? Investigate any possible reason(s) for the presence of such observation(s) and comment. 3 marks
13. Compare and comment on the coefficients for the two models you fitted above with the one
you fitted in part 2. Plot fitted values from these three models against the actual
observations and comment on what you can observe. 7 marks
14. Out of these three models, which model would you recommend? Give reasons for your
selection. 4 marks
50 marks
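One possible reading of parts 2 to 10 is sketched below, assuming the nightly price is treated as a non-negative count so that a Poisson model is the natural starting point. That reading is an assumption, not something the question states. The data frame `toy` is simulated with deliberately overdispersed prices and uses only two of the predictors; none of the numbers relate to the real airbnb data.

```r
# Simulated stand-in for the airbnb data (hypothetical values only)
set.seed(2016)
n <- 300
toy <- data.frame(
  bedrooms  = sample(1:4, n, replace = TRUE),
  room_type = factor(sample(c("Entire home/apt", "Private room", "Shared room"),
                            n, replace = TRUE))
)
# Made-up mean structure; rnbinom() gives counts with extra-Poisson variation
mu <- exp(3.8 + 0.3 * toy$bedrooms - 0.5 * (toy$room_type != "Entire home/apt"))
toy$price <- rnbinom(n, size = 5, mu = mu)

# Part 2: ordinary linear model
fit_lm <- lm(price ~ bedrooms + room_type, data = toy)
summary(fit_lm)$r.squared

# Part 6: Poisson regression with a log link
fit_pois <- glm(price ~ bedrooms + room_type, family = poisson, data = toy)

# Part 7: goodness of fit, comparing the residual deviance with a chi-squared
# distribution on the residual degrees of freedom (small p-value = poor fit)
p_gof <- pchisq(deviance(fit_pois), df.residual(fit_pois), lower.tail = FALSE)
p_gof

# Part 8: Pearson residuals against fitted values
plot(fitted(fit_pois), residuals(fit_pois, type = "pearson"),
     xlab = "Fitted price (USD)", ylab = "Pearson residual")

# Parts 9-10: a quasi-Poisson fit estimates a dispersion parameter
fit_quasi <- glm(price ~ bedrooms + room_type, family = quasipoisson, data = toy)
dispersion <- summary(fit_quasi)$dispersion
dispersion
```

The quasi-Poisson fit keeps the Poisson coefficient estimates and only rescales their standard errors. A negative binomial model, for example via `MASS::glm.nb()`, is another common option for overdispersed counts and can change the estimates as well.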
Presentation 5 marks
Total: 95 marks
Appendix
Table 1: Levels used in divlaw
Response Meaning
1 Easier
2 More difficult
3 Stay same
0 IAP, question is inapplicable to this person for some reason
8 DK, don't know
9 NA, question not asked to this individual
Table 2: Levels used in pray
Response Meaning
1 Several times a day
2 Once a day
3 Several times a week
4 Once a week
5 Less than once a week
6 Never
0 IAP, question is inapplicable to this person for some reason
8 DK, don't know
9 NA, question not asked to this individual
Table 3: Levels used in relig
Response Meaning
1 Protestant
2 Catholic
3 Jewish
4 None
5 Other
6 Buddhism
7 Hinduism
8 Other eastern religions
9 Muslim/islam
10 Orthodox-christian
11 Christian
12 Native american
13 Inter-nondenominational
98 Don't know
99 NA, question not asked to this individual
Table 4: Recoded levels in divlaw
Old code          New code
Easier            No
More difficult    Yes
Stay same         No
IAP/DK/NA         NA
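The recoding in Table 4 could be done along these lines. The vector `divlaw_raw` is a made-up example that assumes the variable is stored using the numeric codes from Table 1; the real data may be stored differently.

```r
# Hypothetical raw codes following Table 1 (not the real data)
divlaw_raw <- c(1, 2, 3, 0, 8, 9, 2)

# 2 (More difficult) -> "Yes"; 1 and 3 -> "No"; 0, 8 and 9 -> NA
divlaw_bin <- ifelse(divlaw_raw == 2, "Yes",
                     ifelse(divlaw_raw %in% c(1, 3), "No", NA))
divlaw_bin <- factor(divlaw_bin, levels = c("No", "Yes"))

# Check the result, including how many values became missing
table(divlaw_bin, useNA = "ifany")
```

Putting "No" first in `levels` makes it the reference level, so a logistic regression on this factor models the probability of "Yes".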
Table 5: Recoded levels in pray
Old code                 New code
Once a day               Daily
Several times a day      > Daily
Less than once a week    < Weekly
Once a week              Weekly
Several times a week     > Weekly
Never                    Never
IAP/DK/NA                NA
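A possible way to apply Table 5 in one step is `factor()` with matching `levels` and `labels`. The vector `pray_raw` is a made-up example assuming the numeric codes from Table 2, and the ordering of the levels from least to most frequent is a choice made here, not something the table prescribes.

```r
# Hypothetical raw codes following Table 2 (not the real data)
pray_raw <- c(1, 2, 3, 4, 5, 6, 0, 8, 9)

# Codes not listed in `levels` (0, 8, 9) become NA automatically
pray_cat <- factor(pray_raw,
                   levels = c(6, 5, 4, 3, 2, 1),
                   labels = c("Never", "< Weekly", "Weekly",
                              "> Weekly", "Daily", "> Daily"))

table(pray_cat, useNA = "ifany")
```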
Table 6: Recoded levels in relig
Old code                   New code
Protestant                 Christian
Orthodox-christian         Christian
Christian                  Christian
Inter-nondenominational    Christian
Catholic                   Catholic
Jewish                     Jewish
Buddhism                   Eastern
Hinduism                   Eastern
Other eastern religions    Eastern
Muslim/islam               Muslim
Native american            Native american
Other                      Other
None                       None
DK/NA                      NA
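Because Table 6 maps several old codes to the same new level, a named lookup vector is one convenient approach. The sketch below assumes relig is stored using the numeric codes from Table 3; `relig_map` and `relig_raw` are names invented for the example.

```r
# Lookup from Table 3 codes (as character names) to Table 6 levels
relig_map <- c("1" = "Christian", "10" = "Christian",
               "11" = "Christian", "13" = "Christian",
               "2" = "Catholic", "3" = "Jewish",
               "6" = "Eastern", "7" = "Eastern", "8" = "Eastern",
               "9" = "Muslim", "12" = "Native american",
               "5" = "Other", "4" = "None")

# Hypothetical raw codes (not the real data)
relig_raw <- c(1, 10, 2, 7, 9, 4, 98, 99)

# Codes missing from the lookup (98, 99) come back as NA
relig_cat <- factor(unname(relig_map[as.character(relig_raw)]))

table(relig_cat, useNA = "ifany")
```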