Machine Learning 1, SS22 Homework 2

PCA. Neural Networks.

Contents

  1 Neural Networks [6 points + 3* points]
    1.1 PCA and Classification [16 points]
    1.2 Model selection using GridSearchCV from sklearn [3* points]
  2 Regression with Neural Networks [9 points]

 

General remarks

Your submission will be graded based on:

  • Correctness (Is your code doing what it should be doing? Is your derivation correct?)
  • The depth of your interpretations (Usually, only a couple of lines are needed.)
  • The quality of your plots (Is everything clearly visible in the print-out? Are axes labeled?)
  • Your submission should run with Python 3.5+.

For this assignment, we will be using an implementation of the Multilayer Perceptron from scikit-learn. The documentation is available on the scikit-learn website. The two relevant multi-layer perceptron classes are MLPRegressor for regression and MLPClassifier for classification.

For both classes (and all scikit-learn model implementations), calling the fit method trains the model, and calling the predict method with the training or testing data set gives the predictions for that data set, which you can use to calculate the training and testing errors.
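As a minimal illustration of this fit/predict workflow (the synthetic data and the split below are stand-ins for illustration only, not part of the assignment):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import accuracy_score

    # Synthetic stand-in data, used here only to show the workflow.
    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = MLPClassifier(max_iter=500, random_state=0)
    clf.fit(X_train, y_train)                                  # training

    # Predictions on the training and test sets give the respective accuracies/errors.
    train_acc = accuracy_score(y_train, clf.predict(X_train))
    test_acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")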


  1 Neural Networks [6 points + 3* points]
    1.1 PCA and Classification [16 points]

 

PCA can be used as a data preprocessing technique to reduce the dimensionality of data. In this task, we will use the Sign Language dataset. It consists of images of hands that should be classified into 10 different classes, as shown in Fig. 1. The images are of size (64, 64) pixels. This means that the input dimension of the neural network we want to train to classify these images would be quite large (64 × 64 = 4096), so we reduce their dimension by means of PCA. The benefit is that the training time of the network will be shorter.

Tasks:

      1. PCA for dimensionality reduction. Load the dataset (features and targets). Use PCA from sklearn.decomposition to reduce the dimensionality of the data: create an instance of the PCA class and choose n_components (the number of principal components) such that about 85% of the variance is explained. In the report, state the n_components you used and the exact percentage of variance explained that you obtain. (A minimal sketch of this step is given after this task list.)

Hints: You will need to fit the model and apply the dimensionality reduction to the original features in order to obtain the data with reduced dimension, of shape (n_samples, n_components). Check the Attributes of this class and find which one gives the percentage of variance explained. To narrow your search, the number of principal components should satisfy 100 < n_components < 200.

      2. Varying the number of hidden neurons. We will use the data with reduced dimension (from the previous step) to train a neural network to perform classification. We will vary the number of neurons in one hidden layer of the network: n_hidden ∈ {5, 100, 200}. For this task, we will use MLPClassifier from sklearn.neural_network. Create an instance of MLPClassifier. Set max_iter to 500, random_state to 0, hidden_layer_sizes to n_hid (one of the values from n_hidden), and leave all other parameters at their default values. (If a warning about the optimization not converging appears, you can ignore it; no need to do anything about it.) A sketch of this training loop is given after this task list.

For each n_hid, report the accuracy on the train and test set, and the current loss.

Answer the questions: How do we know (in general) if the model does not have enough capacity (the case of underfitting)? How do we know (in general) if the model starts to overfit? Does that happen with some of the architectures/models here? (If so, state with what number of neurons it happens.) Which model would you choose here, and why?

      3. To prevent overfitting, we could use a few approaches, for example introducing regularization and/or early stopping. Copy the code from the previous task. Try out (a) alpha = 1.0, (b) early_stopping = True, (c) alpha = 1.0 and early_stopping = True. Choose the option (a), (b), or (c) that you think works best (write a sentence stating your choice). Then report the train and test accuracy, and the loss, for n_hidden ∈ {5, 100, 200}. Does this improve the results from the previous step? Which model would you choose now?
      4. Variability of the performance. Choose the best-performing parameters that you found in the previous task, and vary the seed (parameter random_state) with 5 different values of your choice. (Note: When the seed is fixed, we can reproduce our results and avoid getting different results on every run.) When changing the seed, what exactly changes? Report the minimum and maximum accuracy, and the mean accuracy over the 5 different runs together with the standard deviation, i.e., (mean ± std).
      5. Using a model with any (fixed) seed of your choice, plot the loss curve (loss over iterations). Hint: Check the Attributes of the classifier.
      6. Using a model with any (fixed) seed of your choice, calculate predictions on the test set. In the code, print the classification report and the confusion matrix, and include either a screenshot of both of them or copy the values into the report. How could you calculate recall yourself from the support and the confusion matrix entries? Explain in words what recall is. What is the most misclassified image? State which class (digit) it was and how you concluded that. (A sketch of tasks 4-6 is given after this list.)
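A minimal sketch of task 1 (PCA), assuming the dataset comes as .npy arrays; the file names and the value n_components = 150 are placeholders, not prescribed by the assignment:

    import numpy as np
    from sklearn.decomposition import PCA

    # Placeholder file names -- replace with the actual dataset files.
    X = np.load("sign_language_features.npy").reshape(-1, 64 * 64)  # flatten (64, 64) images
    y = np.load("sign_language_targets.npy")

    n_components = 150                 # hypothetical value; adjust until ~85% of variance is explained
    pca = PCA(n_components=n_components)
    X_reduced = pca.fit_transform(X)   # shape: (n_samples, n_components)

    explained = pca.explained_variance_ratio_.sum()
    print(f"n_components = {n_components}, variance explained = {explained:.4f}")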
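A sketch of tasks 2 and 3, continuing from the PCA sketch above; the train/test split is an assumption (use whatever split the assignment provides):

    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X_train, X_test, y_train, y_test = train_test_split(X_reduced, y, random_state=0)

    for n_hid in (5, 100, 200):
        # For task 3, additionally pass alpha=1.0 and/or early_stopping=True here.
        clf = MLPClassifier(hidden_layer_sizes=(n_hid,), max_iter=500, random_state=0)
        clf.fit(X_train, y_train)
        print(f"n_hid={n_hid:4d}  "
              f"train acc={clf.score(X_train, y_train):.3f}  "
              f"test acc={clf.score(X_test, y_test):.3f}  "
              f"loss={clf.loss_:.4f}")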
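A sketch of tasks 4-6, again continuing from the split above; the five seeds and the hidden layer size (100,) are arbitrary choices for illustration:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.metrics import classification_report, confusion_matrix
    from sklearn.neural_network import MLPClassifier

    # Task 4: variability of the test accuracy over seeds.
    accs = []
    for seed in (0, 1, 2, 3, 4):
        clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500, random_state=seed)
        clf.fit(X_train, y_train)
        accs.append(clf.score(X_test, y_test))
    print(f"min={min(accs):.3f}  max={max(accs):.3f}  "
          f"mean +/- std = {np.mean(accs):.3f} +/- {np.std(accs):.3f}")

    # Task 5: the loss over iterations is stored in the loss_curve_ attribute.
    plt.plot(clf.loss_curve_)
    plt.xlabel("iteration")
    plt.ylabel("loss")
    plt.show()

    # Task 6: classification report and confusion matrix on the test set.
    y_pred = clf.predict(X_test)
    print(classification_report(y_test, y_pred))
    print(confusion_matrix(y_test, y_pred))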

 

 


    1.2 Model selection using GridSearchCV from sklearn [3* points]

Finding the best-performing model can be cumbersome. We can use, for example, GridSearchCV to find the best architecture by trying out all the different parameter combinations.

Tasks:

      1. We want to check all possible combinations of the parameters:
        • α ∈ {0.0, 0.001, 1.0}
        • activation ∈ {identity, logistic, relu}
        • solver ∈ {lbfgs, adam}
        • hidden_layer_sizes ∈ {(100,), (200,)}

Create a dictionary of these parameters in the form that GridSearchCV from sklearn.model_selection requires. How many different architectures will be checked? (State the number of architectures that will be checked and how you calculated it.) A sketch of the grid search is given after this task list.

      2. Set max_iter = 500, random_state = 0, and early_stopping = True as default parameters of MLPClassifier.
      3. What was the best score obtained? Hint: Check the Attributes of the classifier.
      4. What was the best parameter set? Hint: Check the Attributes of the classifier.
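A sketch of the grid search; X_train and y_train denote the reduced-dimension training data from the previous section, and the setup otherwise follows the tasks above:

    from sklearn.model_selection import GridSearchCV
    from sklearn.neural_network import MLPClassifier

    param_grid = {
        "alpha": [0.0, 0.001, 1.0],
        "activation": ["identity", "logistic", "relu"],
        "solver": ["lbfgs", "adam"],
        "hidden_layer_sizes": [(100,), (200,)],
    }

    mlp = MLPClassifier(max_iter=500, random_state=0, early_stopping=True)
    search = GridSearchCV(mlp, param_grid)
    search.fit(X_train, y_train)

    print("best score:", search.best_score_)        # best cross-validated score
    print("best parameters:", search.best_params_)  # best parameter combination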

 

  2 Regression with Neural Networks [9 points]

Neural Networks can be used for regression problems as well. In this task, we will train a neural network to approximate a function.

Tasks:

  1. Load the dataset (x-datapoints.npy are the features, y-datapoints.npy are the targets).
  2. Implement the function calculate_mse. In the report, include the code snippet. (A sketch of this step and of the MLPRegressor training is given after this list.)
  3. Train the network to solve the task (use MLPRegressor from sklearn). You can perform either a manual search or a random / grid search to find a good model. Vary at least 3 different numbers of neurons in the hidden layer.

If you use manual search: Describe how you chose the final model, e.g., how many neurons you tried out, one or two layers (hidden_layer_sizes), which optimizer, whether it was necessary to use early stopping (and if so, what percentage of the validation set was used), and which activation function you used for the neurons.

If you use random / grid search (i.e., GridSearchCV from sklearn): Report which dictionary of parameters you used and what the best model was. You have to try out at least different numbers of neurons, different optimizers, and some form of regularization. (In total, there should be at least 8 different combinations checked by GridSearch.)

  4. For the final choice of the model, report the final loss achieved.
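A minimal sketch for this section; the calculate_mse signature, the reshape to a single feature column, the split, and the hidden layer sizes tried are assumptions for illustration:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    def calculate_mse(targets, predictions):
        # Mean squared error between targets and predictions.
        targets = np.asarray(targets)
        predictions = np.asarray(predictions)
        return float(np.mean((targets - predictions) ** 2))

    x = np.load("x-datapoints.npy").reshape(-1, 1)   # assumes one-dimensional features
    y = np.load("y-datapoints.npy").ravel()
    x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)

    # Hypothetical manual search over the hidden layer size.
    for n_hid in (10, 50, 100):
        reg = MLPRegressor(hidden_layer_sizes=(n_hid,), max_iter=5000, random_state=0)
        reg.fit(x_train, y_train)
        print(f"n_hid={n_hid:4d}  "
              f"train MSE={calculate_mse(y_train, reg.predict(x_train)):.4f}  "
              f"test MSE={calculate_mse(y_test, reg.predict(x_test)):.4f}")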
