

4-2 Exercise 4: Unsupervised Learning With Bubba Gump Data

 

Instructions

Bubba Gump Data

Using JMP, generate plots and graphs of the distribution of values for each of the variables in the Bubba Gump sample. Do the plots reveal where there might be missing or defective data? Generate pairwise correlations between the continuous numeric variables in the Bubba Gump data set. Perform a principal components analysis with the continuous numeric values in the Bubba Gump data set. Describe how the data set supports analyses that address the stated business problem, and also describe any shortcomings in the data set that might limit its usefulness in a data mining exercise.
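The check for missing or defective data is done in JMP for this exercise, but the logic behind it can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the column name `visits_per_month` and its values are invented, not taken from the Bubba Gump data, and the 1.5 × IQR rule is one common convention for flagging extreme values, not a method the assignment prescribes.

```python
import math
import statistics


def screen_column(values):
    """Summarize one numeric column: row count, missing count, IQR-based outliers."""
    # Treat both None and NaN as missing entries.
    present = [v for v in values if v is not None and not math.isnan(v)]
    missing = len(values) - len(present)

    # Quartiles of the non-missing values (statistics.quantiles, default method).
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    # Values outside the 1.5 * IQR fences are flagged for review, not deleted.
    outliers = [v for v in present if v < low or v > high]
    return {"n": len(values), "missing": missing, "outliers": outliers}


# Invented values for a hypothetical "visits_per_month" survey column.
visits_per_month = [2, 3, 3, 4, None, 2, 5, 3, 40, 4, float("nan"), 3]
report = screen_column(visits_per_month)
print(report)
```

A flagged value such as 40 visits per month is a prompt to inspect the record, since it may be a data-entry error or a genuine heavy customer.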

For additional details, please refer to the Module Four Exercise Four Guidelines and Rubric document.

DAT 220: Module Four Exercise 4 Guidelines and Rubric

 

Overview

This exercise is a continuation of the data mining project introduced in the Module Two Exercise.

 

Your Assignment

Open the Bubba Gump survey data in JMP. Examine the data set and prepare an analytics project plan that describes the survey data set and how it will be used to address the stated business problem. Specifically, the summary should:

 

  • Include a description of the population from which the sample was drawn, the sources of data that were combined to construct the sample, the number of customers in the sample, and descriptions of the variables that exist in the data set.
  • From plots and graphs (generated using JMP, with continuous variables appropriately binned) of the distribution of values for each of the variables in the Bubba Gump sample, describe instances where data may be missing or defective or where variables may contain extreme outliers that affect the usefulness of the survey in a data mining exercise.
  • Identify correlations and associations, using pairwise correlations and principal components analysis, that would be useful to measure as part of the preanalytics process, including descriptions of the benefits of each.
  • Describe how the data set supports analyses that address the stated business problem, and also describe any shortcomings in the data set that might limit its usefulness in a data mining exercise.
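For readers who want to see what the pairwise-correlation and principal components steps compute, here is a minimal sketch in Python with NumPy. It is not the JMP workflow the assignment asks for. The three variables (`spend`, `visits`, `age`) are synthetic stand-ins generated in the code, not columns from the Bubba Gump survey, and running PCA on the correlation matrix (that is, on standardized variables) is an assumption made for the illustration.

```python
import numpy as np

# Synthetic stand-in data: these columns are hypothetical, not from the survey.
rng = np.random.default_rng(0)
n = 200
spend = rng.normal(50, 10, n)
visits = 0.1 * spend + rng.normal(0, 0.5, n)  # built to correlate with spend
age = rng.normal(40, 12, n)
X = np.column_stack([spend, visits, age])
names = ["spend", "visits", "age"]

# Pairwise Pearson correlations between the continuous variables.
corr = np.corrcoef(X, rowvar=False)

# PCA on the correlation matrix: eigenvectors are the component loadings,
# eigenvalues are the variances captured by each component.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]  # sort components by variance, largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained = eigvals / eigvals.sum()

# Component scores: standardized data projected onto the loadings.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
scores = Z @ eigvecs

print(np.round(corr, 2))
print(np.round(explained, 3))
```

Strongly correlated pairs show up as large off-diagonal entries in `corr`, and a first component with a large share in `explained` suggests that several variables carry overlapping information, which is the kind of redundancy worth noting before a data mining exercise.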

 

Guidelines

Assignment must follow these formatting guidelines: double spacing, 12-point Times New Roman font, one-inch margins, and APA citations. Page length requirements: 2–3 pages. 

               

 

 

 

Rubric

 

Critical Elements

Survey Data Source (Value: 25)
  • Exemplary (100%): Meets “Proficient” criteria and includes a process that can be extended to determine if other data sources might have utility to a customer data mining exercise
  • Proficient (85%): A description is provided of all of the following: the population from which the sample was drawn, the source(s) from which the survey data was extracted, the sample size reflected in the survey data, and a description of the contents of the survey data set
  • Needs Improvement (55%): A description is provided of three of the following: the population from which the sample was drawn, the source(s) from which the survey data was extracted, the sample size reflected in the survey data, or a description of the contents of the survey data set
  • Not Evident (0%): No description of the survey data set, its source, or its lineage is given

Descriptive Visualization of Data Set Variables (Value: 25)
  • Exemplary (100%): Meets “Proficient” criteria and includes a description of the implications of the variable descriptions for the data mining exercise
  • Proficient (85%): Distributions from the survey data set variables are plotted or graphed. Descriptions of the distributions, including missing data and extreme values, are provided
  • Needs Improvement (55%): Distributions from the survey data set variables are plotted or graphed. Descriptions of the distributions, including missing data and extreme values, are not provided
  • Not Evident (0%): No distributions from the survey data set variables are plotted or graphed

Correlations and Associations (Value: 25)
  • Exemplary (100%): Meets “Proficient” criteria and includes a process for the evaluation of additional variables that might be added to the analysis
  • Proficient (85%): Pre-analysis correlations and associations are identified. Relevance to the Course Project hypothetical is identified
  • Needs Improvement (55%): Pre-analysis correlations and associations are identified, but no relevance to the Course Project hypothetical is identified
  • Not Evident (0%): No pre-analysis correlations or associations are identified

Strengths and Weaknesses of Survey Data Set (Value: 25)
  • Exemplary (100%): Meets “Proficient” criteria and contains a description of potential remedies for limitations in the survey data set
  • Proficient (85%): Usefulness of the data set in addressing the business problem and limits in the data set that may impair analysis are identified
  • Needs Improvement (55%): Usefulness of the data set in addressing the business problem OR limits in the data set that may impair analysis are identified, but not both
  • Not Evident (0%): No strengths or weaknesses in the survey data set are identified

Earned Total: 100%

 
