Logistic Regression

 

Assessment Description

 

It is important to have experience implementing one of the most common applications of regression currently used in business, finance, and healthcare.

Questions such as whether a loan should be approved, whether a driver is entitled to a discount, or whether a patient will survive are all answered with a form of logistic regression (i.e., with a Yes/No answer).

 

Using a dataset representing applications for a bank loan, the task will be to build a logistic regression model that can predict whether or not a loan will be approved.

 

Useful R functions for this assignment are:

 

  1. Data exploration: is.na(), summary()
  2. Split data into train/test: sample()
  3. Build the model: glm(), summary()
  4. Model performance evaluation: predict()
  5. Model validation: library(gains), gains(), plot(), lines(), dim(), library(car), vif(), glm(), summary(), predict(), ifelse()
  6. Validate prediction: table(), mean()
  7. Results interpretation: library(ROCR), predict(), prediction(), performance()

 

For this activity, perform the following:

 

Load the "application_record.csv" file, located in the topic Resources, into a data frame and perform initial exploratory tasks:

 

  1. Display representative portions of the data.
  2. Check for missing values and clean the data.
  3. Check for outliers and decide if and how to process them.
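
The three exploratory tasks above could be sketched as follows. This is only an illustration: the real file would be loaded with read.csv("application_record.csv"), but here a small simulated data frame stands in for it, and the column names (income, age, approved) are assumptions rather than the file's actual schema. Dropping incomplete rows and the 1.5 * IQR rule are one possible choice each, not the required approach.

```r
# Hypothetical stand-in for read.csv("application_record.csv");
# column names are assumed for illustration only.
set.seed(1)
apps <- data.frame(
  income   = c(round(rlnorm(99, meanlog = 11, sdlog = 0.4)), 5e6),  # last value is an extreme one
  age      = sample(21:65, 100, replace = TRUE),
  approved = rbinom(100, 1, 0.6)
)
apps$income[c(3, 17)] <- NA          # inject two missing values for the demo

head(apps)                           # representative portion of the data
summary(apps)                        # ranges, quartiles, and NA counts per column
na_counts <- colSums(is.na(apps))    # number of missing values in each column

# One simple cleaning choice: drop rows that contain any missing value
apps_clean <- na.omit(apps)

# Flag high outliers in income with the 1.5 * IQR rule
q <- quantile(apps_clean$income, c(0.25, 0.75))
upper_fence <- q[2] + 1.5 * (q[2] - q[1])
outliers <- apps_clean$income > upper_fence
apps_clean[outliers, ]               # inspect flagged rows before deciding how to treat them
```

Whether flagged rows are removed, capped, or kept is a modelling decision that should be justified in the write-up.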

 

Formally state what your model will predict using the variables in the data.

 

Split the data into a training set and a testing set with a split ratio of 70:30.
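
A minimal sketch of a 70:30 split with sample(), assuming a data frame called apps (simulated here, with hypothetical columns, in place of the real data):

```r
# Simulated placeholder data; the column names are assumptions.
set.seed(42)                                   # makes the random split reproducible
apps <- data.frame(
  income   = rnorm(200, mean = 60, sd = 15),
  approved = rbinom(200, 1, 0.5)
)

n <- nrow(apps)
train_idx <- sample(n, size = round(0.7 * n))  # row numbers for the 70% training set
train <- apps[train_idx, ]
test  <- apps[-train_idx, ]                    # the remaining 30%

dim(train)
dim(test)
```

Setting the seed before calling sample() keeps the split identical each time the R Markdown document is knitted.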

 

Build the Predictive Model:

 

  1. Define the formula for the glm().
  2. Run the model.
  3. Interpret the results, referring to the p-values.
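
The model-building steps above might look like the following sketch. The training data is simulated so that approval depends on income but not on age; both predictor names and the response name approved are assumptions for illustration.

```r
# Simulated training data: approval probability rises with income only.
set.seed(7)
n <- 500
train <- data.frame(
  income = rnorm(n, mean = 60, sd = 15),
  age    = sample(21:65, n, replace = TRUE)
)
train$approved <- rbinom(n, 1, plogis(-6 + 0.1 * train$income))

# 1. Define the formula
model_formula <- approved ~ income + age

# 2. Run the model; family = binomial gives logistic regression
fit <- glm(model_formula, data = train, family = binomial)

# 3. Inspect coefficients and their p-values
summary(fit)
p_values <- coef(summary(fit))[, "Pr(>|z|)"]
p_values
```

In this simulated case the p-value for income should be very small, while the one for age carries no real signal; on the real data, predictors with large p-values are candidates for removal.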

 

Evaluate the Model Performance:

 

  1. Compare the predicted versus actual values.
  2. Search for any predictions that differ significantly from the actual values.
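
One way to compare predicted and actual values is sketched below, again on simulated data with assumed column names. Using the absolute gap between the 0/1 outcome and the predicted probability to find the worst predictions is an illustrative choice, not a prescribed method.

```r
# Simulated data, 70:30 split, and a one-predictor logistic model.
set.seed(11)
n <- 400
apps <- data.frame(income = rnorm(n, mean = 60, sd = 15))
apps$approved <- rbinom(n, 1, plogis(-6 + 0.1 * apps$income))
idx   <- sample(n, round(0.7 * n))
train <- apps[idx, ]
test  <- apps[-idx, ]
fit   <- glm(approved ~ income, data = train, family = binomial)

# 1. Predicted probabilities next to the actual outcomes
test$prob <- predict(fit, newdata = test, type = "response")
comparison <- data.frame(actual = test$approved, predicted = round(test$prob, 3))
head(comparison)

# 2. Predictions that differ most from the actual value
test$gap <- abs(test$approved - test$prob)
worst <- test[order(-test$gap), ][1:5, ]
worst
```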

 

Validate the Model:

 

  1. Produce a Gain and Lift chart and use it to describe the performance of the model.
  2. Measure the Variance Inflation Factor (VIF) to test for multicollinearity. If changes to the model are necessary based on the VIF, state and implement them.
  3. Has the formula, as defined in the previous section, changed? Why or why not?
  4. If changes to the model occurred, repeat the validation steps on the new model.
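
The assignment lists gains() and vif() from the gains and car packages for these steps. The sketch below instead computes the underlying quantities in base R, so it runs without extra packages: cumulative gain and lift by decile, and the textbook VIF, 1 / (1 - R^2), from regressing each predictor on the others. The data is simulated, the variable names are assumptions, and debt is deliberately built to be collinear with income. Values reported by car::vif() on a glm object may differ somewhat from this textbook calculation.

```r
# Simulated data with one deliberately collinear predictor (debt).
set.seed(3)
n <- 600
income   <- rnorm(n, mean = 60, sd = 15)
debt     <- 0.5 * income + rnorm(n, mean = 0, sd = 2)
age      <- rnorm(n, mean = 40, sd = 10)
approved <- rbinom(n, 1, plogis(-6 + 0.1 * income))
apps <- data.frame(income, debt, age, approved)

fit  <- glm(approved ~ income + debt + age, data = apps, family = binomial)
prob <- predict(fit, type = "response")

# Cumulative gain and lift by decile of predicted probability
ord      <- order(prob, decreasing = TRUE)
decile   <- ceiling(seq_len(n) / (n / 10))
cum_gain <- cumsum(tapply(apps$approved[ord], decile, sum)) / sum(apps$approved)
lift     <- cum_gain / (seq_len(10) / 10)

plot(seq_len(10) / 10, cum_gain, type = "b",
     xlab = "Proportion of applications", ylab = "Cumulative gain")
lines(c(0, 1), c(0, 1), lty = 2)      # baseline: no model

# Textbook VIF: regress each predictor on the remaining predictors
predictors <- c("income", "debt", "age")
vif_manual <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  r2 <- summary(lm(reformulate(others, response = v), data = apps))$r.squared
  1 / (1 - r2)
})
vif_manual
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign of multicollinearity. Here income and debt should both exceed that range, which would motivate dropping one of them, changing the formula, and repeating the validation.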

 

Make Predictions:

 

  1. Demonstrate a few examples of predictions your model can make.
  2. Validate the predictions by calculating the misclassification error.
  3. Interpret the results.
  4. State a few suggestions for improving the model.
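
The prediction steps above could be sketched as follows, on simulated data with assumed column names. The 0.5 cutoff passed to ifelse() is a conventional default, not a requirement; the three new applicants are hypothetical.

```r
# Simulated data, 70:30 split, and a fitted logistic model.
set.seed(5)
n <- 500
apps <- data.frame(income = rnorm(n, mean = 60, sd = 15))
apps$approved <- rbinom(n, 1, plogis(-6 + 0.1 * apps$income))
idx   <- sample(n, round(0.7 * n))
train <- apps[idx, ]
test  <- apps[-idx, ]
fit   <- glm(approved ~ income, data = train, family = binomial)

# 1. Example predictions for three hypothetical applicants
new_apps <- data.frame(income = c(30, 60, 90))
new_prob <- predict(fit, newdata = new_apps, type = "response")
new_prob

# 2. Classify the test set and compute the misclassification error
test_prob <- predict(fit, newdata = test, type = "response")
test_pred <- ifelse(test_prob > 0.5, 1, 0)
conf_mat  <- table(actual = test$approved, predicted = test_pred)
conf_mat
misclass_error <- mean(test_pred != test$approved)
misclass_error
```

The misclassification error is the share of test rows whose predicted class differs from the actual class, so 1 minus this value is the accuracy.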

 

Submit a professionally written and formatted R Markdown document knitted as a PDF. Make sure the documentation contains the R code, relevant plots, your analysis, and the appropriate citations and references.

 

While APA style is not required for the body of this assignment, solid academic writing is expected, and documentation of sources should be presented using APA formatting guidelines, which can be found in the APA Style Guide, located in the Student Success Center.

 

This assignment uses a rubric. Review the rubric prior to beginning the assignment to become familiar with the expectations for successful completion.

 

You are not required to submit this assignment to LopesWrite. 
