

Assignment 11

  1. Decision Tree
  2. Random Forest
  3. tips dataset
  4. Adaboost (optional)

 

Implement a decision tree classifier. For each week, your feature set is (μ, σ) for that week. Use your labels from year 1 (you will have 52 labels per year, one for each week) to train your classifier and predict labels for year 2. Use "entropy" as the splitting criterion (this is the default).
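A minimal sketch of the train-on-year-1, predict-year-2 setup, assuming scikit-learn is the library in use (the assignment does not name one). The `make_year` helper and its synthetic labelling rule are hypothetical placeholders for the real weekly (μ, σ) features and labels. In scikit-learn the default criterion is "gini", so the sketch passes `criterion="entropy"` explicitly.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)


def make_year(n_weeks=52):
    """Hypothetical placeholder data: one (mu, sigma) row and one 0/1 label per week."""
    mu = rng.normal(0.0, 1.0, n_weeks)
    sigma = rng.uniform(0.5, 2.0, n_weeks)
    labels = (mu > 0).astype(int)  # synthetic rule, NOT the real weekly labels
    return np.column_stack([mu, sigma]), labels


X_year1, y_year1 = make_year()
X_year2, y_year2 = make_year()

# Train on year 1 only, then predict the 52 labels of year 2.
tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
tree.fit(X_year1, y_year1)
y_pred = tree.predict(X_year2)

accuracy = accuracy_score(y_year2, y_pred)
print(f"year-2 accuracy: {accuracy:.3f}")
```

With real data, `X_year1` and `X_year2` would be built from the weekly means and standard deviations instead of `make_year`.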

Questions:

  1. implement a decision tree and compute its accuracy for year 2
  2. compute the confusion matrix for year 2
  3. what are the true positive rate and the true negative rate for year 2?
  4. implement a trading strategy based on your labels for year 2 and compare its performance with the "buy-and-hold" strategy. Which strategy results in a larger amount at the end of the year?
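The confusion matrix and the two rates in questions 2 and 3 can be sketched in plain Python. This assumes labels are encoded as 1 (positive) and 0 (negative), which the assignment does not specify; the example label lists are made up. The rates follow the usual definitions TPR = TP / (TP + FN) and TNR = TN / (TN + FP).

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Return (TPR, TNR); NaN when a class is absent from the true labels."""
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr


# Made-up example labels (assumption: 1 = positive week, 0 = negative week).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
tpr, tnr = rates(tp, fp, tn, fn)
print(f"TP={tp} FP={fp} TN={tn} FN={fn} TPR={tpr:.2f} TNR={tnr:.2f}")
```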

Implement a random forest classifier. For each week, your feature set is (μ, σ) for that week. Use your labels from year 1 (you will have 52 labels per year, one for each week) to train your classifier and predict labels for year 2. Recall that there are two hyperparameters in the random forest classifier:

1. N - number of (sub)trees to use

2. d - max depth of each subtree

Questions:

  1. take N = 1,...,10 and d = 1,2,...,5. For each value of N and d, construct a random forest classifier (use "entropy" as the splitting criterion - this is the default), use your year 1 labels as the training set, and compute the error rate for year 2. Plot your error rates and find the best combination of N and d.
  2. using the optimal values from year 1, compute the confusion matrix for year 2
  3. what are the true positive rate and the true negative rate for year 2?
  4. implement a trading strategy based on your labels for year 2 and compare its performance with the "buy-and-hold" strategy. Which strategy results in a larger amount at the end of the year?
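The sweep over N and d in question 1 could be sketched as follows, assuming scikit-learn's `RandomForestClassifier`, where N maps to `n_estimators` and d to `max_depth`. The data comes from a hypothetical `make_year` placeholder with a synthetic labelling rule, so the resulting numbers say nothing about the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)


def make_year(n_weeks=52):
    """Hypothetical placeholder data: (mu, sigma) per week and a noisy 0/1 label."""
    mu = rng.normal(0.0, 1.0, n_weeks)
    sigma = rng.uniform(0.5, 2.0, n_weeks)
    noise = rng.normal(0.0, 0.3, n_weeks)
    labels = (mu + noise > 0).astype(int)  # synthetic rule, NOT the real labels
    return np.column_stack([mu, sigma]), labels


X_year1, y_year1 = make_year()
X_year2, y_year2 = make_year()

# Error rate on year 2 for every (N, d) combination.
error_rates = {}
for n_trees in range(1, 11):
    for depth in range(1, 6):
        forest = RandomForestClassifier(
            n_estimators=n_trees,
            max_depth=depth,
            criterion="entropy",  # scikit-learn's default is "gini"
            random_state=0,
        )
        forest.fit(X_year1, y_year1)
        y_pred = forest.predict(X_year2)
        error_rates[(n_trees, depth)] = float(np.mean(y_pred != y_year2))

best_n, best_d = min(error_rates, key=error_rates.get)
print(f"best N={best_n}, d={best_d}, error={error_rates[(best_n, best_d)]:.3f}")
```

The `error_rates` dictionary can then be plotted, for example as one line per value of d with N on the horizontal axis.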

Implement an Adaboost classifier. For each week, your feature set is (μ, σ) for that week. Use your labels from year 1 (you will have 52 labels per year, one for each week) to train your classifier and predict labels for year 2. Recall that there are three hyperparameters in the Adaboost classifier:

1. N - number of "weak" learners to use

2. d - base learner (base estimator)

3. learning rate λ
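A minimal sketch of how these three hyperparameters might be varied, assuming scikit-learn's `AdaBoostClassifier`. The base estimator is passed positionally because its keyword name differs between scikit-learn versions. The three base estimators here are one possible choice: scikit-learn's AdaBoost needs base estimators whose `fit` accepts `sample_weight`, which `KNeighborsClassifier` does not, so a depth-1 decision tree stands in for k-NN. The `make_year` data is a hypothetical placeholder.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)


def make_year(n_weeks=52):
    """Hypothetical placeholder data: (mu, sigma) per week and a noisy 0/1 label."""
    mu = rng.normal(0.0, 1.0, n_weeks)
    sigma = rng.uniform(0.5, 2.0, n_weeks)
    noise = rng.normal(0.0, 0.3, n_weeks)
    labels = (mu + noise > 0).astype(int)  # synthetic rule, NOT the real labels
    return np.column_stack([mu, sigma]), labels


X_year1, y_year1 = make_year()
X_year2, y_year2 = make_year()

# Illustrative choice of base estimators; all accept sample_weight in fit().
base_estimators = {
    "logistic regression": LogisticRegression(),
    "naive Bayes": GaussianNB(),
    "decision stump": DecisionTreeClassifier(max_depth=1),
}

# Year-2 error rate for each (learning rate, base estimator, N).
error_rates = {}
for lam in (0.5, 1.0):
    for name, base in base_estimators.items():
        for n in range(1, 16):
            model = AdaBoostClassifier(
                base, n_estimators=n, learning_rate=lam, random_state=0
            )
            model.fit(X_year1, y_year1)
            y_pred = model.predict(X_year2)
            error_rates[(lam, name, n)] = float(np.mean(y_pred != y_year2))

# Best N per base estimator at learning rate 0.5.
best_n = {
    name: min(range(1, 16), key=lambda n: error_rates[(0.5, name, n)])
    for name in base_estimators
}
print(best_n)
```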

Questions:

  1. take λ = 0.5 and λ = 1. For each λ, construct an Adaboost classifier with any three base estimators of your choice (e.g. logistic regression, naive Bayes, k-NN). Use your year 1 labels as the training set and compute the error rate for year 2. Plot your error rates as you change N from 1 to 15.
  2. for each base estimator, what is the best value of N for learning rate λ = 0.5?
  3. what is your accuracy for each base estimator choice (assuming the best N for that estimator)?
  4. what classifier is best to use as base estimator for your data?
  5. implement a trading strategy (using the Adaboost classifier with the best estimator) based on your labels for year 2 and compare its performance with the "buy-and-hold" strategy. Which strategy results in a larger amount at the end of the year?
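The trading-strategy comparison that closes each part can be sketched in plain Python. The assignment does not define the strategy, so this sketch assumes a simple rule: stay invested during weeks whose predicted label is 1 and hold cash (no return) during weeks labelled 0. The starting amount of 100 and the weekly returns are made-up values.

```python
def buy_and_hold(weekly_returns, start=100.0):
    """Stay invested every week; compound all weekly returns."""
    value = start
    for r in weekly_returns:
        value *= 1.0 + r
    return value


def label_strategy(weekly_returns, labels, start=100.0):
    """Assumed rule: invested only in weeks labelled 1, in cash otherwise."""
    value = start
    for r, label in zip(weekly_returns, labels):
        if label == 1:
            value *= 1.0 + r
    return value


# Made-up weekly returns and predicted labels for four weeks.
weekly_returns = [0.02, -0.01, 0.03, -0.02]
predicted_labels = [1, 0, 1, 0]

hold_value = buy_and_hold(weekly_returns)
strategy_value = label_strategy(weekly_returns, predicted_labels)
print(f"buy-and-hold: {hold_value:.2f}, label strategy: {strategy_value:.2f}")
```

For the real comparison, `weekly_returns` would hold the 52 year-2 weekly returns and `predicted_labels` the classifier's year-2 predictions.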

Read the "tips.csv" file into pandas and write Python code to answer the following questions.

Questions:

  1. what is the average tip (as a percentage of meal cost) for lunch and for dinner?
  2. what is the average tip for each day of the week (as a percentage of meal cost)?
  3. when are tips highest (which day and time)?
  4. compute the correlation between meal prices and tips
  5. is there any relationship between tips and size of the group?
  6. what percentage of people are smokers?
  7. assume that rows in the tips.csv file are arranged in time. Are tips increasing with time in each day?
  8. is there any difference in correlation between tip amounts from smokers and non-smokers?
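A few of these questions can be sketched with pandas as follows. The column names (`total_bill`, `tip`, `smoker`, `day`, `time`, `size`) are an assumption based on the commonly distributed tips dataset, and the small inline table is made-up data standing in for `pd.read_csv("tips.csv")`. The average tip percentage is taken here as the mean of per-row percentages, which is one of two reasonable readings of the question.

```python
import pandas as pd

# Made-up stand-in for: tips = pd.read_csv("tips.csv")
tips = pd.DataFrame({
    "total_bill": [10.0, 20.0, 40.0, 25.0, 50.0, 16.0],
    "tip": [2.0, 3.0, 4.0, 5.0, 5.0, 4.0],
    "smoker": ["No", "Yes", "No", "No", "Yes", "No"],
    "day": ["Thur", "Thur", "Sat", "Sat", "Sun", "Sun"],
    "time": ["Lunch", "Lunch", "Dinner", "Dinner", "Dinner", "Lunch"],
    "size": [2, 2, 4, 3, 4, 2],
})

# Tip as a percentage of the meal cost, per row.
tips["tip_pct"] = 100.0 * tips["tip"] / tips["total_bill"]

avg_by_time = tips.groupby("time")["tip_pct"].mean()           # question 1
avg_by_day = tips.groupby("day")["tip_pct"].mean()             # question 2
avg_by_day_time = tips.groupby(["day", "time"])["tip_pct"].mean()
best_day_time = avg_by_day_time.idxmax()                       # question 3
price_tip_corr = tips["total_bill"].corr(tips["tip"])          # question 4
size_tip_corr = tips["size"].corr(tips["tip"])                 # question 5
pct_smokers = 100.0 * (tips["smoker"] == "Yes").mean()         # question 6

# Question 8, read here as the price/tip correlation within each group.
corr_by_smoker = {
    key: group["total_bill"].corr(group["tip"])
    for key, group in tips.groupby("smoker")
}

print(avg_by_time, avg_by_day, best_day_time, pct_smokers, sep="\n")
```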
