Statistics

What will be the output of tt <- sort(table(c("a", "b", "a", "a", "b", "c", "a1", "a1", "a1")), dec=T)?

SELECT THE CORRECT ANSWER

a a1 b c 3 3 2 1

3 3 2 1

a a1 b c 2 1 4 1

a a1 b c 7 3 2 1
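One way to check this is to run the expression and inspect the counts. The sketch below is a minimal illustration; it spells out decreasing = TRUE, which the quiz abbreviates as dec=T through R's partial argument matching. The variable name values is introduced here for readability only.

```r
# Count each distinct value, then sort the counts from highest to lowest.
values <- c("a", "b", "a", "a", "b", "c", "a1", "a1", "a1")
tt <- sort(table(values), decreasing = TRUE)
print(tt)
```

Here "a" and "a1" each occur three times, "b" twice, and "c" once, so the sorted counts are 3, 3, 2, 1.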

The frequency distribution of a categorical variable can be checked using the table function in R.

Given gender = factor(c('M','F','M','F','F','F')), what would be the reference level by default?

SELECT THE CORRECT ANSWER

Both 1 & 2 above

None of above
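As a rough check, factor() orders its levels by sorting the distinct values, and modeling functions treat the first level as the reference. The sketch below only inspects the levels; the name ref_level is introduced here for illustration.

```r
gender <- factor(c('M', 'F', 'M', 'F', 'F', 'F'))
print(levels(gender))            # levels in sorted order
ref_level <- levels(gender)[1]   # the first level acts as the reference
print(table(gender))             # frequency distribution of the factor
```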

What will be the result of multiplying two vectors of different lengths in R? x = c(1,56,8); y = c(8,2,11,8); x*y

SELECT THE CORRECT ANSWER

[1] 8 112 88 NA

[1] 8 112 88

[1] 8

[1] 8 112 88 Warning message:
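The behaviour in question is R's recycling rule: the shorter vector is reused from its start, and R warns when the longer length is not a multiple of the shorter one. A minimal sketch follows; suppressWarnings is used here only to keep the example quiet and is not part of the quiz.

```r
x <- c(1, 56, 8)
y <- c(8, 2, 11, 8)
# The shorter vector is recycled: the 4th product uses x[1] again (1 * 8).
product <- suppressWarnings(x * y)
print(product)
```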

What would be the output of this loop: for (year in 1:5) { Yr = print(year) }

SELECT THE CORRECT ANSWER

[1] 1 [1] 2 [1] 3 [1] 4 [1] 5

[1] 1

[1] 5

[1] 1 [1] 5

While running View(Boston), the system gives the error: Error in View : object 'Boston' not found. What could be a possible reason for this error?

SELECT THE CORRECT ANSWER

Library caTools is not loaded

Library MASS is not loaded

R needs to be rebooted

Is there any other package required which does not exist in CRAN

When evaluating the numeric expression x = 1/4000, the output is shown in scientific notation: [1] 1e-04. What is the appropriate syntax to display decimal places?

SELECT THE CORRECT ANSWER

informat(x,sci=FALSE)

format(x,sci=FALSE)

round(x)

format(x,format(.001))
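A small sketch of switching off scientific notation with format(). Note that 1/4000 equals 0.00025, whereas the displayed value 1e-04 corresponds to 1/10000, so the example below uses 1/10000 as an assumption about what the question intends. The name fixed is introduced here for illustration.

```r
x <- 1 / 10000                           # assumption: a value that prints as 1e-04 under default options
print(x)
fixed <- format(x, scientific = FALSE)   # returns a character string in fixed notation
print(fixed)
```

The quiz option writes sci=FALSE, which relies on partial matching of the scientific argument.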

To find the position (element number) of the maximum value in the data frame DataFrame1 = data.frame(v1 = c(2,4,12,3,6)), what should be the appropriate code?

SELECT THE CORRECT ANSWER

which(max(DataFrame1$v1))

max.pos(DataFrame1$v1))

max(DataFrame1$v1)

which(DataFrame1$v1==max(DataFrame1$v1))
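The comparison-based approach can be sketched as follows, alongside which.max() from base R as an alternative. The names pos_all and pos_first are introduced here for illustration.

```r
DataFrame1 <- data.frame(v1 = c(2, 4, 12, 3, 6))
pos_all <- which(DataFrame1$v1 == max(DataFrame1$v1))  # every position holding the maximum
pos_first <- which.max(DataFrame1$v1)                  # first position of the maximum only
print(pos_all)
print(pos_first)
```

The two differ only when the maximum occurs more than once: which() on the comparison returns all matching positions, while which.max() returns the first.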

Multiple objects are created in an R session, e.g. EmployeeData and Sales_recs. What is the code to save these objects?

SELECT THE CORRECT ANSWER

upload(EmployeeData, Sales_recs, file="Mydata.RData")

save(EmployeeData, Sales_recs, file="Mydata.RData")

write.csv(EmployeeData, Sales_recs, "Mydatasets.csv)

print(EmployeeData, Sales_recs)

Is R a case-sensitive language?

SELECT THE CORRECT ANSWER

TRUE

FALSE

A user-defined function is created: x = function(s) { if (s>0) { print("+ve") } else { print("-ve") } } What would be the output of x(-1)?

SELECT THE CORRECT ANSWER

"-ve"

"+ve"

NaN

R will throw error

What is the usage of the switch function?

SELECT THE CORRECT ANSWER

It can be used interchangeably with ifelse function

It is used to switch numeric values to any other class type

It is used to replace any character string

It is used to concatenate strings

What is the difference between the transform and mutate functions?

SELECT THE CORRECT ANSWER

There is no difference both of them can be used in place of each other

Both transform and mutate are used to create a new variable; however, in transform one cannot reuse a newly calculated variable within the same call. In mutate this can be done.

Transform is used to convert a variable type

Mutate is used to concatenate text string

Which statistical methods are used in R to predict a classification variable?

SELECT THE CORRECT ANSWER(S)

Linear regression

Decision Tree

Logistic Regression

SVM

In a two-tailed test on a normal distribution, what will be the CI limit range on the left and right side of the bell curve at a 5% significance level?

SELECT THE CORRECT ANSWER

2.5% to 97.5%

5% to 95%

36% on left to 36% of data on right

None of above

What is the significance of R-square in linear regression?

SELECT THE CORRECT ANSWER(S)

Explains the better prediction capability

Represents accuracy of model

Represents high likelihood estimates

Ranges from scale of 0 to 1

Suppose there is a CustomerOrder table with CustomerID, OrderDate and Amount_Paid. What will be the code to remove duplicate entries across CustomerID and OrderDate?

SELECT THE CORRECT ANSWER

Sort(CustomerOrder,unique(CustomerOrder [c("CustomerID "," OrderDate ")]))

Sort(CustomerOrder,arrange(CustomerOrder,CustomerID)[1,])

Sort(CustomerOrder,arrange(CustomerOrder,CustomerID,OrderDate)[1,])

None of above

What would be the class of this object? LogicalVector = c(TRUE,FALSE,0,1)

SELECT THE CORRECT ANSWER

Logical

Numeric

Character

list
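An atomic vector holds a single type, so mixing logical and numeric values coerces the logicals to numbers. A minimal sketch (the name cls is introduced here for illustration):

```r
LogicalVector <- c(TRUE, FALSE, 0, 1)
cls <- class(LogicalVector)   # the logicals are coerced to numeric
print(cls)
print(LogicalVector)          # TRUE becomes 1 and FALSE becomes 0
```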

Which of the following is an invalid assignment?

SELECT THE CORRECT ANSWER(S)

M1 <-matrix(nrow=2, ncol=3)

M1 <-matrix(nrow=2, ncol=3.5)

M1 <-mat(nrow=2, ncol=3)

M1 <-mat(nrow=3, ncol=3)

What would be the output of the code m = matrix(1:6,nrow=2,ncol=3,byrow=TRUE)?

SELECT THE CORRECT ANSWER

[,1] [,2] [,3] [1,] 1 2 3 [2,] 4 5 6

[,1] [,2] [,3] [1,] 1 3 5 [2,] 2 4 6

[,1] [,2] [1,] 1 4 [2,] 2 5 [3,] 3 6

[,1] [,2] [1,] 1 2 [2,] 3 4 [3,] 5 6
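The effect of byrow is easiest to see next to the default column-wise fill. The sketch below builds both; the names m_byrow and m_bycol are introduced here for illustration.

```r
m_byrow <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)  # fill row by row
m_bycol <- matrix(1:6, nrow = 2, ncol = 3)                # default: fill column by column
print(m_byrow)
print(m_bycol)
```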

What would the following code print? rep(1:10,2)

SELECT THE CORRECT ANSWER

[1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10

[1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10

[1] 1 1 2 2 3 4 5 6 7 8 9 10

[1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 1 1

What would the following code print? x <- 1:9; y <- 15:17; cbind(x, y)[1,]

SELECT THE CORRECT ANSWER

x y 1 15

x y 2 16

[1] 1 2 3 4 5 6 7 8 9

x y 9 17
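Because y is shorter than x, cbind() recycles it to fill all nine rows before the first row is extracted. A minimal sketch (the names combined and first_row are introduced here):

```r
x <- 1:9
y <- 15:17
combined <- cbind(x, y)       # y is recycled to length 9
first_row <- combined[1, ]    # named vector with elements x and y
print(first_row)
```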

The ____ function concatenates 2 or more character strings:

SELECT THE CORRECT ANSWER

paste()

bind()

mutate()

arrange()

Which of the following statements is true about character strings in R programming?

SELECT THE CORRECT ANSWER

Character strings are entered with either matching double (") or single (') quotes.

Character vectors may be appended into a vector by the c() function.

None of the above

Both a and b

A programmer writes code on a single line as x = 1 y=2. Which of the following statements applies to this code?

SELECT THE CORRECT ANSWER

This code is correct.

It should be written as x = 1; y=2

R is a free text language and codes can be written in single line

x = 1 > y=2

What will be the result of the code list(5, "John", TRUE, 1+9i)?

SELECT THE CORRECT ANSWER

[[1]] [1] 5 [[2]] [1] "John" [[3]] [1] TRUE [[4]] [1] 1+9i

[1] "5" "John" "TRUE" "1+9i"

[1] 5

Error: unexpected ','

Identify the correct statements in R

SELECT THE CORRECT ANSWER(S)

Matrices are 2-dimensional and can have elements of different data types

Vectors are 1-dimensional and can have elements of the same data type

Data frames are 2-dimensional

Lists can have elements of different data types

Which is valid syntax to convert a character vector to numeric? x = c("1","5","98","23")

SELECT THE CORRECT ANSWER

as.num(x)

as.numeric(x)

as.factor(x)

as.char(x)

What would be the output of the code below? x = list(78,"Tim",101,8i); length(x)

SELECT THE CORRECT ANSWER

What would be the class of the object below? as.character(factor(c("No", "yes", "no", "yes", "no")))

SELECT THE CORRECT ANSWER

character

factor

numeric

list

What would be the output of sum(2,8,9,NA)?

SELECT THE CORRECT ANSWER

Compilation error

None of above

What would be the output of c(T,F,TRUE,1)?

SELECT THE CORRECT ANSWER

[1] 1 0 1 1

[1] TRUE FALSE TRUE TRUE

[1] T,F,TRUE,1

[1] TRUE,FALSE,TRUE,1

How many entries will there be in this data frame? EmpRecs = data.frame(EmpID = c(101,102,103), Name = c("John","Theresa","Andy","Paul"))

SELECT THE CORRECT ANSWER

R will throw Error
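data.frame() requires columns whose lengths are equal or can be recycled evenly, and 3 and 4 cannot. The sketch below wraps the call in try() so the failure can be inspected without stopping the script; the names result and failed are introduced here for illustration.

```r
# The two columns have 3 and 4 elements, which data.frame() cannot reconcile.
result <- try(
  data.frame(EmpID = c(101, 102, 103),
             Name = c("John", "Theresa", "Andy", "Paul")),
  silent = TRUE
)
failed <- inherits(result, "try-error")
print(failed)
```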

How do you permanently convert a numeric vector into a factor? x = c(98,2,67,87)

SELECT THE CORRECT ANSWER

x = as.factor(x)

x = as.fac(x)

as.factor(x)

as.fac(x)

Which statement about R is correct?

SELECT THE CORRECT ANSWER

Character strings are imported as factors by default.

Factors are used to represent categorical data and can be unordered or ordered.

Neither a nor b is correct.

Both a and b are correct.

What would be the output of sum(2,8,9,NA,na.rm=T)?

SELECT THE CORRECT ANSWER

None of above
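The effect of na.rm can be checked directly by comparing the two calls. A minimal sketch (the names with_na and without_na are introduced here):

```r
with_na <- sum(2, 8, 9, NA)                    # any NA makes the total NA
without_na <- sum(2, 8, 9, NA, na.rm = TRUE)   # NA is dropped before summing
print(with_na)
print(without_na)
```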

Which statement is correct about loops in R?

SELECT THE CORRECT ANSWER

The way to exit a repeat loop is to call break.

Infinite loops should be avoided.

Neither a nor b is correct.

Both a and b are correct.

Which of the following code snippets stops a loop after 20 iterations?

SELECT THE CORRECT ANSWER

for(i in 1:100){ if(i >20){ break } print(i) }

for(i in 1:100){ if(i >19){ break } print(i) }

for(i in 1:100){ if(i <20){ break } print(i) }

for(i in 1:100){ if(i <19){ break } print(i) }
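To see how the break condition determines the number of completed iterations, the sketch below counts them instead of printing. The counter name iterations is introduced here for illustration.

```r
iterations <- 0
for (i in 1:100) {
  if (i > 20) {
    break                       # leave the loop before the 21st body runs
  }
  iterations <- iterations + 1  # stands in for print(i)
}
print(iterations)
```

With the condition i > 19 the loop would complete only 19 iterations, and with i < 20 or i < 19 it would break on the very first pass.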

What will be the output of the code below? for(i in seq(from=1,to=10,by=2)) { Variable1 = print(i) }

SELECT THE CORRECT ANSWER

[1] 1 [1] 3 [1] 5 [1] 7 [1] 9

[1] 1 [1] 2 [1] 3 [1] 4 [1] 5 [1] 6 [1] 7 [1] 8 [1] 9 [1] 10

[1] 1 [1] 2 [1] 3 [1] 4 [1] 5 [1] 6 [1] 7 [1] 8 [1] 9 [1] 6 [1] 7 [1] 8 [1] 9

[1] 1 [1] 4 [1] 7

What will be the output of this code? x <- 1; switch(x, 2+2, sum(1:10), max(1:10))

SELECT THE CORRECT ANSWER

What will be the output of this code? x = c(51,93,8,67,22); x[x>50 & x<90]

SELECT THE CORRECT ANSWER

51 68

51 67

numeric(0)

8 22
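Logical subsetting keeps the elements for which the condition is TRUE. The sketch below splits the expression into two steps; the names keep and selected are introduced here for illustration.

```r
x <- c(51, 93, 8, 67, 22)
keep <- x > 50 & x < 90   # element-wise test: TRUE FALSE FALSE TRUE FALSE
selected <- x[keep]
print(selected)
```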

Which function gives statistics like min and max values, quartiles, mean and median?

SELECT THE CORRECT ANSWER

Quantile

Mean

Median

Summary

What is the function of "describe.by" in this code: describe.by(iris,group=iris$Species)?

SELECT THE CORRECT ANSWER

Summary statistics for each species separately.

Overall summary statistics

Both a and b

Neither a nor b

Given CustomerID = c(1098,1099) and ProductOrdered = c('Fan','Washing Machine'), what should be the code to get the view below?

SELECT THE CORRECT ANSWER

cbind(CustomerID,ProductOrdered)

rbind(CustomerID,ProductOrdered)

data.frame(CustomerID,ProductOrdered)

Either a or c

Can this data frame named CustDetails be converted into a matrix?

SELECT THE CORRECT ANSWER

TRUE

FALSE

What will be the class of the variable CustomerID after conversion of CustDetails from data frame to matrix?

SELECT THE CORRECT ANSWER

Numeric

Character

Integer

None of above

If we want to take the mean of all variables in a data frame without writing the mean function for each variable separately, what should be the function?

SELECT THE CORRECT ANSWER

mean

apply

ifelse

ColSums

Which statement is true about the lapply function:

SELECT THE CORRECT ANSWER

lapply is a list apply which acts on a list or vector and returns a list

lapply returns a numeric vector

Which of the following statements is true about sapply:

SELECT THE CORRECT ANSWER

Works for all variables in a data frame, and the output is numeric

Works for only one variable

Is applicable to only numeric variables

None of above

What will be the output of this code? apply(iris[,1:4],2,mean)

SELECT THE CORRECT ANSWER

mean of all numeric columns

mean by rows for all observations

mean of selected columns i.e 1 to 4 only

mean of columns 1 to 3

What will be the output of this code? apply(iris[,1:4],1,mean)

SELECT THE CORRECT ANSWER

mean of all numeric columns

mean by rows for all observations for first 4 columns only

row means for all columns

None of above
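The second argument of apply() is the margin: 1 applies the function over rows and 2 over columns. The sketch below runs both on the first four columns of the built-in iris data; the names col_means and row_means are introduced here for illustration.

```r
col_means <- apply(iris[, 1:4], 2, mean)  # margin 2: one mean per selected column
row_means <- apply(iris[, 1:4], 1, mean)  # margin 1: one mean per observation (row)
print(col_means)
print(length(row_means))
```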

Which are the measures of central tendency?

SELECT THE CORRECT ANSWER

Mean, kurtosis and skewness

Mean, Median and Mode

Mode & Range

Standard Deviation and Range

What is the formula for standard deviation of a sample? (Please check the image and click on the correct answer)

SELECT THE CORRECT ANSWER

If a positively skewed distribution has a median of 75, which of the following statements is true?

SELECT THE CORRECT ANSWER

Mean greater than 75

Mode is less than 75

Mode is greater than 75

Both a and b

What does a high standard deviation represent?

SELECT THE CORRECT ANSWER

Data is less scattered

Data is highly scattered

What would be the critical values of Z for a 95% confidence interval for a two-tailed test?

SELECT THE CORRECT ANSWER

+/- 2.33

+/- 1.96

+/- 1.64

+/- 2.55
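The critical value can be computed from the standard normal quantile function. For a two-tailed test the 5% is split equally between the tails, so the upper cut-off sits at the 97.5th percentile. The names alpha and z_crit are introduced here for illustration.

```r
alpha <- 0.05                    # two-tailed significance level
z_crit <- qnorm(1 - alpha / 2)   # upper critical value; the lower one is its negative
print(round(z_crit, 2))
```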

In a normal distribution, what percentage of data lies between -1 and +1 standard deviation?

SELECT THE CORRECT ANSWER

68%

90%

34%

95%

Which graph has a strong positive correlation? (Please check the image and click on the correct answer)

SELECT THE CORRECT ANSWER

The overall average aptitude of Simplilearn participants is 75 out of 100 over the last 2 years. If we look at a particular batch, say the one starting in May'18, we may observe that the average aptitude of this batch is 72. Knowing the population standard deviation is 5, what will be the CI range?

SELECT THE CORRECT ANSWER

65.2 to 84.8

66.8 to 83.2

63.25 to 86.75

60 to 90
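The question gives no sample size or confidence level, so the sketch below rests on assumptions: a 95% level, an interval centered on the overall mean of 75, and the population standard deviation used directly rather than a standard error. Under those assumptions the arithmetic is mean ± z × SD.

```r
center <- 75                            # overall mean (assumed center of the interval)
sigma <- 5                              # population standard deviation
z <- qnorm(0.975)                       # assumed 95% level, two-tailed
ci <- center + c(-1, 1) * z * sigma     # lower and upper limits
print(round(ci, 1))
```

With a known batch size n, the usual interval for a batch mean would use sigma / sqrt(n) in place of sigma.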

Which statements are true about k-fold cross-validation:

SELECT THE CORRECT ANSWER(S)

It is an extension of cross-validation in which the original sample data is split into k random samples of equal size

Only Two datasets training and test in 70:30 ratio are divided

It is a model validation technique

It gives robust measure of equation/ model testing

What happens when we move from simple linear regression to multiple linear regression?

SELECT THE CORRECT ANSWER

The r squared will increase or remain constant

The r squared may decrease

Both r square and adjusted r square always increase

R square might increase or decrease depending on the variables significance.

Is this statement correct? In a normal distribution, Mean = Median = Mode

SELECT THE CORRECT ANSWER

TRUE

FALSE

Pearson correlation is used where there is a linear relationship between 2 variables, whereas Spearman correlation is used for a monotonic relationship. Is this correct?

SELECT THE CORRECT ANSWER

TRUE

FALSE

How are errors calculated in linear regression?

SELECT THE CORRECT ANSWER

Error = Predicted Y - Actual Y

Error = Actual Y - Predicted Y

Error = (Actual Y - Predicted Y)²

Error = ?(Actual Y - Predicted Y)2

Which equation is this? Y = B0 + B1*X1 + B2*X2 + B11*X1^2 + B22*X2^2 + B12*X1*X2 + Err

SELECT THE CORRECT ANSWER

Simple linear regression

Multiple linear regression

Logarithmic regression

Polynomial regression

The independent variable X is also called:

SELECT THE CORRECT ANSWER

Predictor variable

Predicted variable

Residual

Target variable

Which modeling techniques are used to predict a categorical variable? Select all that apply.

SELECT THE CORRECT ANSWER(S)

Linear regression

ANOVA

Logistic Regression

SVM

What is the syntax for a linear regression model?

SELECT THE CORRECT ANSWER

Aov

Glm

Rpart

Which statistical tests/distributions are non-parametric?

SELECT THE CORRECT ANSWER(S)

Chisquare

Normal distribution

Kruskal wallis

Mann Whitney

Is this statement correct: Decision trees can be used only to predict a binary or categorical variable.

SELECT THE CORRECT ANSWER

TRUE

FALSE

What are the methods to avoid overfitting in Decision Trees?

SELECT THE CORRECT ANSWER

Pre pruning

Post pruning

Both

None

How is information gain calculated in the case of a continuous variable? The steps are: 1. Find the middle point of the first 2 numbers 2. Sort the data in increasing order. Is the sequence correct?

SELECT THE CORRECT ANSWER

Yes the sequence is correct

No the sequence should be 2nd and then 1st

A support vector machine is a:

SELECT THE CORRECT ANSWER

Machine learning algorithm

Statistical Algorithm

Which types of nodes are there in a decision tree:

SELECT THE CORRECT ANSWER

Root node only

Root, Branch and leaf nodes

Branch and leaf nodes only

None of above

Which of the following are called disjoint clusters?

SELECT THE CORRECT ANSWER

Kmeans cluster

Hierarchical clusters

Agglomerative clusters

None of above

Is this statement correct: Clustering is an unsupervised learning algorithm.

SELECT THE CORRECT ANSWER

TRUE

FALSE

Below are the steps for k-means clustering. Arrange them in the right order:
1. The clusters are validated with actual centroids
2. Random seeds are joined with virtual lines and perpendicular bisectors are drawn
3. Random seeds are assigned
4. Areas separated by the perpendicular bisectors are the clusters

SELECT THE CORRECT ANSWER

3,2,4,1

1,3,4,2

1,2,3,4

4,1,2,3

Which of the following statements is/are true about clustering?

SELECT THE CORRECT ANSWER(S)

As we increase the number of clusters the R square increases

Clustering is unsupervised learning algorithm

Hierarchical clusters are more applicable to find out distance between different cities/ states.

Clustering is used to predict continuous variable

How will the Euclidean distance be calculated between 2 coordinates, viz. (23,30) and (22,31)?

SELECT THE CORRECT ANSWER

√((23-22)² + (30-31)²)

√((22-24)² + (30-31)²)

23-22 = 1

30-31 = -1
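The distance calculation can be worked through in a few lines. The sketch below also computes the Manhattan distance for the same two points for comparison; the names p, q, euclidean and manhattan are introduced here for illustration.

```r
p <- c(23, 30)
q <- c(22, 31)
euclidean <- sqrt(sum((p - q)^2))  # square root of the summed squared differences
manhattan <- sum(abs(p - q))       # sum of the absolute differences
print(euclidean)
print(manhattan)
```

For these points the squared differences are 1 and 1, so the Euclidean distance is the square root of 2, while the Manhattan distance is 2.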

Which of the following statements is true?

SELECT THE CORRECT ANSWER

The Manhattan distance between 2 x and y coordinates will be more than euclidean distance

The euclidean distance between 2 x and y coordinates will be more than Manhattan distance

What is true about the K-Means clustering algorithm?

SELECT THE CORRECT ANSWER

K-means is sensitive to cluster center initializations

Bad initialization can lead to more number of iterations

Bad initialization can lead to less robust clusters

All of the above

Is this code correct: KmeansCluster = kmeans(Cust_Segment_Data)

SELECT THE CORRECT ANSWER

Yes

No, the argument centers = 3 is missing

No, the distance measure needs to be specified as Manhattan

No, the distance measure needs to be specified as Euclidean
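The role of the centers argument in stats::kmeans() can be checked with a small experiment. Cust_Segment_Data is not available here, so the sketch uses the numeric columns of the built-in iris data as a stand-in (an assumption); the names features, no_centers and fit are likewise introduced for illustration.

```r
features <- iris[, 1:4]   # stand-in data; Cust_Segment_Data is not available here

# Calling kmeans() without centers fails, so wrap it in try() to inspect the outcome.
no_centers <- try(kmeans(features), silent = TRUE)
missing_centers_fails <- inherits(no_centers, "try-error")
print(missing_centers_fails)

set.seed(42)                           # initial centers are random, so fix the seed
fit <- kmeans(features, centers = 3)   # request 3 clusters
print(table(fit$cluster))
```

The seed matters because k-means is sensitive to its starting centers; the nstart argument is a common way to try several random starts.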

What are the prerequisites for hierarchical clustering?

SELECT THE CORRECT ANSWER(S)

Calculation of Distance matrix

Standardizing the variables

Removing missing values

All of above

What are the 2 major components of DBSCAN clustering?

SELECT THE CORRECT ANSWER

Eps and Minpts

Eps and Distance between centroids

Minpts and Within distance measure

Eps, Minpts, Distance between Centroids, Within distance measure

Which clustering algorithm is called a bottom-up approach?

SELECT THE CORRECT ANSWER

Kmeans

Fuzzy

Hierarchical

Dbscan

A retail chain store would like to roll out an offer on multiple products together based on the similarity of products. What will be the best approach to solve this business problem?

SELECT THE CORRECT ANSWER

Clustering

Regression

Market basket Analysis

Hypothesis testing

While reading/importing transactional data for market basket analysis, what should be kept in mind?

SELECT THE CORRECT ANSWER

The data has to be in transactional format and should be read with syntax read.transactions

The data should be imported with read.csv or read.xlsx

Either of above

None of above

Which algorithm is used to find associations between products that a customer can buy at a retail outlet?

SELECT THE CORRECT ANSWER

Apriori algorithm

Kmeans algorithm

Random forest algorithm

Bootstrapping

In market basket analysis, how is the support statistic derived? Choose the right answer.

SELECT THE CORRECT ANSWER

For a Product A support = Number of transactions in which A was present/ Total number of transactions in a retail store

For a Product A support = Number of transactions in which A was NOT present/ Total number of transactions in a retail store

1/Total number of distinct products

None of above

In market basket analysis, how is the confidence statistic derived between products A and B? Choose the right answer.

SELECT THE CORRECT ANSWER

Number of transactions in which A & B were bought/ Number of transactions in which A alone was bought.

Number of transactions in which A & B were bought/ Total number of transactions for all the products

Number of transactions in which A was bought/ Number of transactions in which A & B were bought

Number of transactions in which B was bought/ Number of transactions in which A & B were bought
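Support and confidence can be computed by hand on a toy example. The transactions below are hypothetical and invented purely for illustration, and the sketch takes the denominator of confidence to be every transaction containing A (whether or not other items are present), which is the usual definition for the rule A => B.

```r
# Hypothetical transactions, invented for illustration only.
transactions <- list(
  c("A", "B"),
  c("A", "C"),
  c("A", "B", "C"),
  c("B"),
  c("C")
)

has_a  <- sapply(transactions, function(items) "A" %in% items)
has_ab <- sapply(transactions, function(items) all(c("A", "B") %in% items))

support_a     <- sum(has_a) / length(transactions)  # share of all transactions containing A
confidence_ab <- sum(has_ab) / sum(has_a)           # of those with A, share that also contain B
print(support_a)
print(confidence_ab)
```

In this toy data A appears in 3 of 5 transactions and A with B in 2, giving a support of 0.6 for A and a confidence of 2/3 for A => B.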

What is the significance of the confidence statistic?

SELECT THE CORRECT ANSWER

Higher the confidence higher are the chances that the defined 2 or more products will be bought together

Higher the confidence lower are the chances that the defined 2 or more products will be bought together

There is an inverse relationship between confidence and the likelihood of purchase of the 2nd product.

What is the syntax used to view the number of rules derived by the apriori algorithm?

SELECT THE CORRECT ANSWER

Summary

Max

Inspect

itemFr

In the apriori algorithm, what is the syntax used to view the rules with lhs and rhs?

SELECT THE CORRECT ANSWER

Summary

Max

Inspect

itemFr

What is the applicability of association rules?

SELECT THE CORRECT ANSWER(S)

Market basket analysis

Purchase data analysis

Website traffic analysis

Predict cost of a product

What cutoffs are explicitly defined in the apriori algorithm while deriving association rules?

SELECT THE CORRECT ANSWER

Support and confidence

Support and lift

Support, confidence and lift

None of above

While using a clustering algorithm, how would you decide the ideal number of clusters:

SELECT THE CORRECT ANSWER

If the increase in clusters does not give a substantial increase in rsquare we should not increase the number of clusters.

We should keep on increasing number of clusters till the time we get maximum rsquare.

The rsquare is normally ideal at 5 number of clusters

What are the primary factors which influence the rsquare in clustering?

SELECT THE CORRECT ANSWER

Between distance - Distance between centroids

Within distance - Average Distance between observation in each cluster from its centroid.

Both a and b

None of above

Out of 3000 products in a retail store, Product A has a support statistic of .001. What does this represent?

SELECT THE CORRECT ANSWER

Product A is not very popular and does not have lot of transactional volume

Product A is running short in supply

Product A is popular and is bought frequently

None of above

In which circumstances is ANOVA used?

SELECT THE CORRECT ANSWER

Trying to predict a categorical variable

When Independent variables are categorical and dependent variable is continuous

When both independent and dependent variables are categorical

When both independent and dependent variables are continuous

In a campaign analysis, what does a probability value of .003 signify in an ANOVA test?

SELECT THE CORRECT ANSWER

Either of campaign is different from another campaign

One of campaign is better than other campaigns

Either of a or b

All the campaigns are alike and there is no statistically significant difference in campaigns.

Is this formula correct for calculating the F statistic?

SELECT THE CORRECT ANSWER

TRUE

FALSE

Which of the following is not correct?

SELECT THE CORRECT ANSWER

Logistic regression is used to predict binary variables

Market basket analysis is used to predict continuous variables

Decision trees and Random forest predicts classification variables

Clustering is unsupervised learning

Predicting the sales volume is what kind of statistical problem:

SELECT THE CORRECT ANSWER

Classification

Prediction

Market basket analysis

Segmentation
