
#### Coursework Assignment 1: A company that manufactures riding mowers wants to identify the best sales prospects for an intensive sales campaign

###### Statistics

Coursework Assignment 1

A company that manufactures riding mowers wants to identify the best sales prospects for an intensive sales campaign. In particular, the manufacturer is interested in classifying households as prospective owners or nonowners on the basis of Income (in \$1000s) and Lot Size (in 1000 ft²). The marketing expert looked at a random sample of 34 households, given in the file Riding Mower. Use the data to fit a logistic regression of ownership on the two predictors in SPSS and answer the following questions.

I. Preparing the data sets:

a) Split the data into training and validation datasets using a 70%:30% ratio.

b) Why should the data be partitioned into training and validation sets? What will the training set be used for? What is the validation set used for?

II. Pre-processing:

c) Explore the data set by applying appropriate instruments depending on the data type: descriptive statistics, boxplots and histograms. Based on these methods, describe the data and draw relevant conclusions. Does the data need cleaning? Can you identify any missing values or outliers in the data? Why?

d) What percentage of households in the study were owners of a riding mower?

e) Create a scatterplot of Income versus Lot Size, using colour or symbol to differentiate owners from nonowners. From the scatterplot, which class seems to have the higher average income, owners or nonowners?

f) Perform a correlation analysis on the continuous independent variables. Are the independent variables highly correlated?

III. Logistic regression on the validation data set:

g) Use the logistic model obtained from the training data set to run the regression on the validation data set, using the ‘enter’ attribute selection method.

h) Report on the final logistic regression model, interpreting significance levels and coefficients/odds ratios, and citing the error rates.

i) How well does the logistic model fit?

j) Among nonowners, what is the percentage of households classified correctly?
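The assignment itself is to be carried out in SPSS, but the arithmetic behind the 70%:30% partition and behind the error rate and per-class accuracy can be illustrated with a minimal Python sketch. The function names, the fixed seed and the choice to round the training size to the nearest whole household are assumptions made for illustration only; SPSS's own random sampling may yield slightly different counts. No values from the Riding Mower file are used here.

```python
import random


def split_train_validation(n_rows, train_fraction=0.7, seed=0):
    """Randomly partition row indices 0..n_rows-1 into training and validation sets.

    Hypothetical helper: the seed and the rounding rule are illustrative assumptions.
    """
    indices = list(range(n_rows))
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    rng.shuffle(indices)
    n_train = round(n_rows * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def classification_summary(actual, predicted):
    """Summarise a classification given labels coded 1 (owner) and 0 (nonowner).

    Returns the overall error rate and the percentage classified correctly
    within each class (None when a class has no cases).
    """
    pairs = list(zip(actual, predicted))
    if not pairs:
        raise ValueError("no cases to summarise")

    def percent_correct(label):
        in_class = [(a, p) for a, p in pairs if a == label]
        if not in_class:
            return None
        correct = sum(1 for a, p in in_class if a == p)
        return 100 * correct / len(in_class)

    errors = sum(1 for a, p in pairs if a != p)
    return {
        "error_rate": errors / len(pairs),
        "owner_correct_pct": percent_correct(1),
        "nonowner_correct_pct": percent_correct(0),
    }


# With 34 households, a 70% share rounds to 24 training rows and 10 validation rows.
train_idx, valid_idx = split_train_validation(34)
print(len(train_idx), len(valid_idx))
```

Once SPSS has produced predicted classes for the validation rows, passing the actual and predicted labels to `classification_summary` would give the error rate and the percentage of nonowners classified correctly, the same quantities that appear in the SPSS classification table.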