Part 2 questions: Model building and interpretation

Statistics

Part 2 questions:

  • Model building and interpretation.

a. Build various models (You can choose to build models for either or all of descriptive, predictive or prescriptive purposes)

b. Test your predictive model against the test set using various appropriate performance metrics

c. Interpretation of the model(s) - 10 marks
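As a minimal sketch of item (b), the snippet below computes three common test-set metrics for a regression model. It assumes the target is continuous, as the Part 1 feedback suggests; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the project data.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for a set of test-set predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    mae = float(np.mean(np.abs(residuals)))

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical test-set values, for illustration only.
y_test = [100.0, 150.0, 200.0, 250.0]
y_hat = [110.0, 140.0, 210.0, 240.0]

metrics = regression_metrics(y_test, y_hat)
print(metrics)
```

Reporting more than one metric is usually worthwhile: RMSE penalises large errors more heavily than MAE, while R^2 expresses the fit relative to simply predicting the mean.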

 

  • Model Tuning

a. Ensemble modelling, wherever applicable

b. Any other model tuning measures (if applicable)

c. Interpretation of the optimal model and its implications for the business - 10 marks
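One possible way to approach the ensemble modelling item is sketched below with scikit-learn's `VotingRegressor`, which averages the predictions of its member models. The data here is synthetic and the choice of base models is an assumption made for illustration; the brief does not prescribe any particular ensemble.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the project dataset (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Average the predictions of a linear model and a random forest.
ensemble = VotingRegressor(
    estimators=[
        ("linear", LinearRegression()),
        ("forest", RandomForestRegressor(n_estimators=100, random_state=0)),
    ]
)
ensemble.fit(X_train, y_train)

pred = ensemble.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
print(f"Ensemble test RMSE: {rmse:.3f}")
```

Comparing this test RMSE with that of each base model fitted alone is a simple way to judge whether the ensemble is worth its extra complexity.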

 

 

  • Detailed analysis of project notes 1 and 2, along with business insights and recommendations - 20 marks

Standard Instructions for Business Report:

  • All pages must be numbered.
  • Tables/figures/charts/graphics (if any) must have number and title.
  • Groups must make sure visualizations are clearly readable at usual magnification and add value to the report.
    • All visualizations must be clearly labelled.
    • All axis labels and legends must be legible.
    • Tableau's default graphics mode does not always copy and paste cleanly. A proper adjustment may be required.

PART 1

FEEDBACK: Kindly go through this and implement it in Part 2.

1. Problem Understanding

a) Defining problem statement b) Need of the study/project c) Understanding business/social opportunity

 

 

FB: Need to elaborate more on the significance of demand planning, supply chain management and optimisation techniques, supported by facts and figures from industry sources.

3. Exploratory Data Analysis

a) Univariate analysis (distribution and spread for every continuous attribute, distribution of data in categories for categorical ones) b) Bivariate analysis (relationship between different variables, correlations) a) Removal of unwanted variables (if applicable) b) Missing value treatment (if applicable) d) Outlier treatment (if required) e) Variable transformation (if applicable) f) Addition of new variables (if required)

 

FB: The purpose of univariate analysis is to find out which variables show clear separation with respect to the target variable: how the mean and median of continuous variables separate, and how their skewness affects the target. Bivariate analysis establishes the relationships among the independent variables and between them and the dependent variable. Specific statistical and business insights are missing. Variables that show a linear relationship in the pairplots should have been identified and mentioned, and the correlation heatmap should have been used to identify the variables that exhibit multicollinearity. The heatmap and pairplot were not drawn properly.

wh_est_year should be converted to the age of the warehouse, since age plays a significant role in warehouse-related issues. A significance test should have been carried out on wh_est_year; if it turns out to be significant, KNN imputation could have been used. Missing values for certificates should be treated as "applied but not received from govt". wh_govt_certification is categorical, so mode imputation should have been used. A significance test (ANOVA) should have been carried out for the categorical variables against the target variable. Flood proof, flood impacted and electricity supply are categorical variables, so distribution plots, outlier treatment and correlation should not be applied to them. The variable transformation, that is, the encoding of categorical variables, should have been done and documented, with the methodology specified for each variable.
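Three of the steps named in this feedback (converting the establishment year to an age, mode imputation of a categorical column, and a one-way ANOVA of a categorical variable against the target) can be sketched as follows. The rows, the `flood_proof` and `target_demand` column names, and the reference year are hypothetical, chosen only to make the sketch runnable; `wh_est_year` and `wh_govt_certification` are spelled as in the feedback.

```python
import pandas as pd
from scipy.stats import f_oneway

# Hypothetical rows for illustration; not the project data.
df = pd.DataFrame(
    {
        "wh_est_year": [1996, 2005, None, 2015, 2010, None],
        "wh_govt_certification": ["A", "B", None, "A", "C", "A"],
        "flood_proof": [0, 1, 0, 1, 0, 1],
        "target_demand": [100.0, 140.0, 105.0, 150.0, 95.0, 145.0],
    }
)

# 1. Convert establishment year to warehouse age.
REFERENCE_YEAR = 2023  # assumption: use the year the data was collected
df["wh_age"] = REFERENCE_YEAR - df["wh_est_year"]  # missing years stay NaN

# 2. Mode imputation for a categorical column.
mode_value = df["wh_govt_certification"].mode().iloc[0]
df["wh_govt_certification"] = df["wh_govt_certification"].fillna(mode_value)

# 3. One-way ANOVA: does the target mean differ across flood_proof levels?
groups = [g["target_demand"].to_numpy() for _, g in df.groupby("flood_proof")]
f_stat, p_value = f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```

A small p-value would suggest the categorical variable is related to the target and is worth keeping as an encoded feature.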

 

 

 

4 . Business insights from EDA

a) Is the data unbalanced? If so, what can be done? Please explain in the context of the business b) Any business insights using clustering (if applicable) c) Any other business insights

 

 

FB: Checking for data imbalance is critical for a classification problem and much less so for linear regression, but that does not mean it is inapplicable to regression. It applies where observations are too few in a particular segment or class to support a regression, and to any segment where missing values are high or where there are many extreme values at either end. Any such imbalance (too few data points) specific to a variable or segment could have been checked and mentioned here; for example, wh_est_year has more than 40% missing data. The imbalance concept exists in regression but is rarely used at the machine learning stage, although it is critical for deep learning. Just as SMOTE is applied to address imbalance in classification, SMOTER is used for regression. Here we do not have to apply it or take any action, but conceptually these points have to be mentioned. I am sharing a few blogs for reference:

https://analyticsindiamag.com/deep-imbalanced-regression-complete-guide/
https://www.kaggle.com/questions-and-answers/34328

Warehouses should have been clustered based on demand. The business report should have a list of contents (index) with page numbering and a list of tables and figures with numbering.
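The kind of imbalance check described in this feedback can be sketched with pandas: the share of missing values per column, and the number of rows per segment. The `zone` column, the sample rows and the `MIN_ROWS` threshold are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical rows for illustration; not the project data.
df = pd.DataFrame(
    {
        "zone": ["North", "North", "North", "South", "South", "East"],
        "wh_est_year": [1996, None, None, 2015, None, 2010],
    }
)

# Share of missing values per column, highest first.
missing_share = df.isna().mean().sort_values(ascending=False)
print(missing_share)

# Row counts per segment; segments below the threshold are flagged as sparse.
MIN_ROWS = 2  # assumption: an arbitrary cut-off for this sketch
segment_counts = df["zone"].value_counts()
sparse_segments = segment_counts[segment_counts < MIN_ROWS].index.tolist()
print(sparse_segments)
```

Columns with a high missing share and segments with very few rows are the places where a regression is least reliable, so they are worth naming explicitly in the report.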

 

 

 

 
