
Activity one

Exercise 1 Using your preferred editor (colab is recommended) to fill the snippet gaps.

The following is a simple demonstration of using the within-cluster sum of squares (WSS) to

decide on the number of clusters for the k-means algorithm and to plot the resulting clusters.

%% Import the necessary packages

%

import numpy as np

import pandas as pd

from matplotlib import pyplot as plt

from sklearn.datasets import make_blobs

from sklearn.cluster import KMeans

%% Generate 6 artificial clusters for illustration purposes

%% Hint: you may need to use the make_blobs and scatter functions; see the official

%% scikit-learn and matplotlib documentation for details on their usage

%

Insert your code block here

%% Implement the WSS method and check through the number of clusters from 1

%% to 12, and plot the figure of WSS vs. number of clusters.

%% Hint: reference the plots in the lecture slides;

%% You may need the inertia_ attribute of a fitted KMeans model (it holds the WSS, also called WCSS)

%

wcss = []

for i in range(1, 13):  # range() excludes its upper bound, so this covers 1 to 12 clusters

    kmeans = KMeans(n_clusters=i, init='k-means++', max_iter=300, n_init=10,

                    random_state=0)

Insert your code block here

%% Categorize the data using the optimum number of clusters (6)

%% we determined in the last step. Plot the fitting results

%% Hint: you may need to call fit_predict from kmeans; scatter

%

kmeans = KMeans(n_clusters=6, init='k-means++', max_iter=300, n_init=10,

random_state=0)

Insert your code block here

plt.scatter(X[:,0], X[:,1])

plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300,

c='red')

plt.show()
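
As background for the WSS hint in Exercise 1: the `inertia_` value of a fitted `KMeans` model is the sum of squared distances from each sample to its closest cluster center. The sketch below computes that quantity by hand with the standard library only, so the elbow plot has a concrete definition behind it. The toy points, the fixed centers, and the helper name `wcss` are assumptions made for illustration; they are not part of the exercise, and no k-means fitting is performed here.

```python
def wcss(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center."""
    total = 0.0
    for p in points:
        # Squared distance to every center; keep the smallest one.
        total += min(
            sum((pi - ci) ** 2 for pi, ci in zip(p, c))
            for c in centers
        )
    return total


# Hypothetical toy data: two tight groups, ten units apart.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

one_center = [(5.0, 1.0)]                 # k = 1: the overall mean
two_centers = [(0.0, 1.0), (10.0, 1.0)]   # k = 2: one mean per group

wcss_k1 = wcss(points, one_center)
wcss_k2 = wcss(points, two_centers)
print(wcss_k1, wcss_k2)  # 104.0 4.0
```

The sharp drop from k = 1 to k = 2 is the kind of "elbow" the WSS-versus-k plot is meant to reveal; adding further centers to this toy data could only lower the value slightly.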

 

Exercise 2 For the following code blocks and plots, run the code first; then provide your

interpretation/explanation for the required parts.

k-means on digits

We will use k-means to try to identify similar digits without using the original

label information; this is similar to a first step in extracting meaning from a new

dataset for which you have no a priori label information.

We will start by loading the digits and then finding the k-Means clusters. The digits

consist of 1,797 samples with 64 features, where each of the 64 features is the

brightness of one pixel in an 8×8 image.

import seaborn as sns; sns.set() # for plot styling

from sklearn.datasets import load_digits

digits = load_digits()

digits.data.shape

## Provide your interpretation/explanation for the following block

#

kmeans = KMeans(n_clusters=10, random_state=0)

clusters = kmeans.fit_predict(digits.data)

kmeans.cluster_centers_.shape

## Provide your interpretation/explanation for the following block

#

fig, ax = plt.subplots(2, 5, figsize=(8, 3))

centers = kmeans.cluster_centers_.reshape(10, 8, 8)

for axi, center in zip(ax.flat, centers):

    axi.set(xticks=[], yticks=[])

    axi.imshow(center, interpolation='nearest', cmap=plt.cm.binary)

 

 

 

from scipy.stats import mode

labels = np.zeros_like(clusters)

for i in range(10):

    mask = (clusters == i)

    labels[mask] = mode(digits.target[mask])[0]

from sklearn.metrics import accuracy_score

accuracy_score(digits.target, labels)

## Provide your interpretation/explanation for the following block

 

#

from sklearn.metrics import confusion_matrix

mat = confusion_matrix(digits.target, labels)

sns.heatmap(mat.T, square=True, annot=True, fmt='d', cbar=False,

 xticklabels=digits.target_names,

 yticklabels=digits.target_names)

plt.xlabel('true label')

plt.ylabel('predicted label');
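
The relabeling loop in Exercise 2 (the one that calls `mode`) exists because k-means numbers its clusters arbitrarily, so each cluster id has to be mapped to the most common true label among its members before accuracy can be measured. A minimal standard-library sketch of that idea follows; the toy cluster ids and labels and the helper name `relabel_by_majority` are hypothetical, and `collections.Counter` stands in for `scipy.stats.mode`.

```python
from collections import Counter


def relabel_by_majority(clusters, truth):
    """Map each cluster id to the most frequent true label inside that cluster."""
    mapping = {}
    for cid in set(clusters):
        members = [t for c, t in zip(clusters, truth) if c == cid]
        mapping[cid] = Counter(members).most_common(1)[0][0]
    return [mapping[c] for c in clusters], mapping


# Hypothetical toy data: six samples assigned to three clusters.
clusters = [0, 0, 1, 1, 1, 2]
truth = [3, 3, 7, 7, 1, 5]

predicted, mapping = relabel_by_majority(clusters, truth)
accuracy = sum(p == t for p, t in zip(predicted, truth)) / len(truth)
print(mapping, predicted, accuracy)
```

Only the sample with true label 1 is misclassified here, because it sits in a cluster dominated by 7s; the confusion matrix in the exercise shows the same kind of error for real digits.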

 

 

______________________________________________________

Activity two

 

Exercise 1 What is the Apriori property? Answer in one or two sentences and provide a simple

example.

Exercise 2 Following is a list of five transactions that include items A, B, C, and D:

T1: {A, B, C}

T2: {A, C}

T3: {B, C}

T4: {A, D}

T5: {A, C, D}

Which itemsets satisfy the minimum support of 0.5? Include your deduction

process.

Hint: Given an itemset L, the “support” of L is the fraction of transactions containing

L. To meet the support threshold of 0.5, you need to find the itemsets that appear

in at least 50% of the transactions.
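
The support definition in the hint can be checked mechanically. The sketch below counts support for every candidate itemset over the five transactions T1 to T5 and keeps those at or above 0.5. It is a brute-force enumeration intended as a cross-check for a manual deduction, not an implementation of the Apriori algorithm (which would prune any candidate that has an infrequent subset); the names `support`, `frequent`, and `MIN_SUPPORT` are illustrative.

```python
from itertools import combinations

# Transactions T1..T5 from the exercise.
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"B", "C"},
    {"A", "D"},
    {"A", "C", "D"},
]
MIN_SUPPORT = 0.5


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


items = sorted(set().union(*transactions))
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        s = support(set(combo), transactions)
        if s >= MIN_SUPPORT:
            frequent[combo] = s

for itemset, s in frequent.items():
    print(itemset, s)
```

Every subset of an itemset printed here is also printed, which is the Apriori property at work: a set cannot appear more often than any of its subsets.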

_________________________________________________________________________________

Activity three

 

Exercise 1 In the Income linear regression example, consider the distribution of the

outcome variable Income. Income values tend to be highly skewed to the right

(the distribution has a long tail to the right).

Does such a non-normally distributed outcome variable violate the general assumption

of a linear regression model? Provide your supporting arguments.

Exercise 2: Describe how logistic regression can be used as a classifier.
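
As a rough illustration for Exercise 2: a logistic regression model passes a linear score through the logistic (sigmoid) function to obtain a probability, and becomes a classifier once that probability is compared with a threshold. The sketch below assumes a single feature, hypothetical coefficients (`INTERCEPT`, `SLOPE`) that have not been fitted to any data, and a conventional threshold of 0.5.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, intercept, slope):
    """Estimated probability of the positive class for feature value x."""
    return sigmoid(intercept + slope * x)


def classify(x, intercept, slope, threshold=0.5):
    """Return class 1 when the estimated probability reaches the threshold."""
    return 1 if predict_proba(x, intercept, slope) >= threshold else 0


# Hypothetical coefficients, chosen only to make the example readable.
INTERCEPT, SLOPE = -4.0, 2.0

for x in (1.0, 2.0, 3.0):
    p = predict_proba(x, INTERCEPT, SLOPE)
    print(x, round(p, 3), classify(x, INTERCEPT, SLOPE))
```

Raising or lowering `threshold` trades false positives against false negatives without refitting the model.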

Exercise 3: If the probability of an event occurring is 0.4, then

a. What is the odds ratio?

b. What is the log odds ratio?

__________________________________________________________________________
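
For Exercise 3 of Activity three, the two quantities can be computed directly from the probability. This sketch assumes the exercise uses "odds ratio" to mean the odds of the event, p / (1 − p), and uses the natural logarithm for the log odds; strictly speaking, an odds ratio compares the odds of two different groups.

```python
import math

p = 0.4                    # probability of the event, from the exercise

odds = p / (1 - p)         # odds in favor of the event
log_odds = math.log(odds)  # natural log of the odds (the logit of p)

print(round(odds, 4), round(log_odds, 4))
```

Because p is below 0.5, the odds are below 1 and the log odds are negative.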

Activity four

Exercise 1 We have three observed points (23, 41), (67, 84), (78, 100).

Question: fit them to a linear model: Y = β0 + β1X.

For ease of computation, we use the residual sum of squares (RSS) as the loss function to

estimate the parameters:

RSS(β0, β1) = Σ (y_i − β0 − β1 x_i)^2, summed over the three observed points (x_i, y_i).

Set the learning rate λ = 0.00001 and the initial guess for the parameters to β0 = 7, β1 = 1.

You only need to provide the first three iterations using Gradient Descent.

Calculate the results manually.

Hint:

1. Compose the loss function based on the RSS formula

2. Take the partial derivatives of the RSS function

3. Substitute the initialized parameters to check the RSS

Iteration 1:

a) Plug β0 = 7, β1 = 1 into the partial derivative equations for RSS

b) Compute the step size, use the provided learning rate

c) Update the parameters (check the lecture slides for the equation)

d) Re-compute the RSS to check the loss change

Repeat the steps of Iteration 1 for two more iterations to see the trend in the parameters and the loss
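
The manual calculation in Activity four can be cross-checked with a few lines of code. The sketch below follows the listed steps for three iterations, assuming the standard gradient descent update (each parameter minus the learning rate times its partial derivative) and the unaveraged RSS as the loss; the equation in the lecture slides is not reproduced in this document, so treat the update rule as an assumption. The intercept and slope of the linear model appear as `b0` and `b1`.

```python
# Observed points (x, y) and settings from the exercise.
points = [(23, 41), (67, 84), (78, 100)]
LEARNING_RATE = 0.00001


def rss(b0, b1):
    """Residual sum of squares for the line y = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in points)


def gradient(b0, b1):
    """Partial derivatives of RSS with respect to b0 and b1."""
    d_b0 = sum(-2 * (y - (b0 + b1 * x)) for x, y in points)
    d_b1 = sum(-2 * x * (y - (b0 + b1 * x)) for x, y in points)
    return d_b0, d_b1


b0, b1 = 7.0, 1.0                    # initial guess from the exercise
history = [(b0, b1, rss(b0, b1))]    # entry 0 holds the starting state

for iteration in range(1, 4):
    d_b0, d_b1 = gradient(b0, b1)
    b0 -= LEARNING_RATE * d_b0       # step size = learning rate * derivative
    b1 -= LEARNING_RATE * d_b1
    history.append((b0, b1, rss(b0, b1)))
    print(iteration, round(b0, 5), round(b1, 5), round(history[-1][2], 3))
```

With the initial guess the residuals are 11, 10, and 15, so the starting RSS is 446 and the first update moves the parameters to about 7.00072 and 1.04186. The slope changes far more than the intercept because its derivative is weighted by the x values.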
