###### Statistics

1

What will be the output of tt <- sort(table(c("a", "b", "a", "a", "b", "c", "a1", "a1", "a1")), dec=T)?

a a1 b c 3 3 2 1

3 3 2 1

a a1 b c 2 1 4 1

a a1 b c 7 3 2 1
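As a quick check, the call can be run directly. This is a minimal sketch in which `dec=T` is spelled out as `decreasing = TRUE` (the short form relies on partial argument matching) and the variable name `values` is only illustrative.

```r
# Count each distinct value, then sort the counts from largest to smallest.
values <- c("a", "b", "a", "a", "b", "c", "a1", "a1", "a1")
tt <- sort(table(values), decreasing = TRUE)
print(tt)   # names on the first row, counts on the second
```

The two values tied at 3 should stay in their original (alphabetical) order.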

2

The frequency distribution of a categorical variable can be checked using the table function in R. Given gender=factor(c('M','F','M','F','F','F')), what would be the reference level by default?

M

F

Both 1 & 2 above

None of above
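A short sketch of how the default reference level can be inspected. `relevel()` is shown only as one way to change it, and the names `ref` and `gender_m` are illustrative.

```r
gender <- factor(c("M", "F", "M", "F", "F", "F"))
print(levels(gender))          # levels are sorted alphabetically by default
ref <- levels(gender)[1]       # the first level acts as the reference level

# relevel() moves a chosen level to the front.
gender_m <- relevel(gender, ref = "M")
print(levels(gender_m))
```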

3

What will be the result of multiplying two vectors of different lengths in R? x = c(1,56,8); y = c(8,2,11,8); x*y

[1] 8 112 88 NA

[1] 8 112 88

[1] 8

[1] 8 112 88 Warning message:
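The recycling rule behind this question can be sketched as follows. `suppressWarnings()` is used only so the result can be captured quietly; without it R also warns that the longer length is not a multiple of the shorter one.

```r
x <- c(1, 56, 8)
y <- c(8, 2, 11, 8)

# The shorter vector is recycled, so x is treated as c(1, 56, 8, 1).
res <- suppressWarnings(x * y)
print(res)
```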

4

What would be the output of this loop? for (year in 1:5) { Yr=print(year) }

[1] 1 [1] 2 [1] 3 [1] 4 [1] 5

[1] 1

[1] 5

[1] 1 [1] 5

5

Running View(Boston) gives the error: Error in View : object 'Boston' not found. What could be a possible reason for this error?

R needs to be rebooted

Is there any other package required which does not exist in CRAN

6

When evaluating the numeric expression x = 1/10000, R prints the result in scientific notation: [1] 1e-04. What is the appropriate syntax to display it with decimal places?

informat(x,sci=FALSE)

format(x,sci=FALSE)

round(x)

format(x,format(.001))
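A minimal sketch of switching off scientific notation for a single value, using `1e-04` as the example input. Note that `format()` returns a character string, not a number.

```r
x <- 1e-04
print(x)                                 # printed in scientific notation by default
fixed <- format(x, scientific = FALSE)   # character string in fixed notation
print(fixed)
```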

7

To find the position (element number) of the maximum value in the data frame DataFrame1=data.frame(v1 = c(2,4,12,3,6)), what is the appropriate code?

which(max(DataFrame1\$v1))

max.pos(DataFrame1\$v1))

max(DataFrame1\$v1)

which(DataFrame1\$v1==max(DataFrame1\$v1))
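For illustration, here are two base R ways to locate the maximum. `which.max()` is not among the options; it is added as a common alternative that returns only the first matching position.

```r
DataFrame1 <- data.frame(v1 = c(2, 4, 12, 3, 6))

pos_all   <- which(DataFrame1$v1 == max(DataFrame1$v1))   # every position holding the maximum
pos_first <- which.max(DataFrame1$v1)                      # first such position only
print(pos_all)
```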

8

Multiple objects are created in an R session, e.g. EmployeeData and Sales_recs. What is the code to save these objects?

save(EmployeeData, Sales_recs, file="Mydata.RData")

write.csv(EmployeeData, Sales_recs, "Mydatasets.csv)

print(EmployeeData, Sales_recs)

9

Is R a case-sensitive language?

TRUE

FALSE

10

A user-defined function is created: x=function(s) { if (s>0) { print("+ve") } else { print("-ve") } } What would be the output of x(-1)?

"-ve"

"+ve"

NaN

R will throw error

11

What is the switch function used for?

It can be used interchangeably with ifelse function

It is used to switch numeric values to any other class type

It is used to replace any character string

It is used to concatenate strings

12

What is the difference between the transform and mutate functions?

There is no difference both of them can be used in place of each other

Both transform and mutate are used to create new variables; however, transform cannot reuse a newly calculated variable within the same call, whereas mutate can.

Transform is used to convert a variable type

Mutate is used to concatenate text string

13

Which statistical method is used in R to predict a classification variable?

Linear regression

Decision Tree

Logistic Regression

SVM

14

In a two-tailed test on a normal distribution, what are the CI limits on the left and right sides of the bell curve at a 5% significance level?

2.5% to 97.5%

5% to 95%

36% on left to 36% of data on right

None of above

15

What is the significance of R-squared in linear regression?

Explains the better prediction capability

Represents accuracy of model

Represents high likelihood estimates

Ranges from scale of 0 to 1

16

Suppose there is a CustomerOrder table with columns CustomerID, OrderDate and Amount_Paid. What is the code to remove duplicate entries across CustomerID and OrderDate?

Sort(CustomerOrder,unique(CustomerOrder [c("CustomerID "," OrderDate ")]))

Sort(CustomerOrder,arrange(CustomerOrder,CustomerID)[1,])

Sort(CustomerOrder,arrange(CustomerOrder,CustomerID,OrderDate)[1,])

None of above
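One base R way to drop duplicates on a pair of key columns is `duplicated()`. This is a sketch on made-up rows: only the column names come from the question, the data values are hypothetical.

```r
# Hypothetical sample data; the column names follow the question.
CustomerOrder <- data.frame(
  CustomerID  = c(1, 1, 2, 2),
  OrderDate   = c("2018-05-01", "2018-05-01", "2018-05-02", "2018-05-03"),
  Amount_Paid = c(100, 100, 250, 75)
)

# duplicated() flags rows whose key columns repeat an earlier row.
is_dup  <- duplicated(CustomerOrder[c("CustomerID", "OrderDate")])
deduped <- CustomerOrder[!is_dup, ]
print(deduped)
```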

17

What would be the class of this object? LogicalVector = c(TRUE,FALSE,0,1)

Logical

Numeric

Character

list
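The coercion rule can be checked directly: an atomic vector holds a single type, so mixing logical and numeric values promotes everything to numeric.

```r
LogicalVector <- c(TRUE, FALSE, 0, 1)
cls <- class(LogicalVector)   # logical values are coerced to numeric
print(cls)
print(LogicalVector)          # TRUE becomes 1, FALSE becomes 0
```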

18

Which of the following is an invalid assignment?

M1 <-matrix(nrow=2, ncol=3)

M1 <-matrix(nrow=2, ncol=3.5)

M1 <-mat(nrow=2, ncol=3)

M1 <-mat(nrow=3, ncol=3)

19

What would be the output of the code m = matrix(1:6,nrow=2,ncol=3,byrow=TRUE)?

[,1] [,2] [,3] [1,] 1 2 3 [2,] 4 5 6

[,1] [,2] [,3] [1,] 1 3 5 [2,] 2 4 6

[,1] [,2] [1,] 1 4 [2,] 2 5 [3,] 3 6

[,1] [,2] [1,] 1 2 [2,] 3 4 [3,] 5 6
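To see what `byrow` changes, the same values can be filled both ways. The names `by_row` and `by_col` are illustrative.

```r
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)   # fill across each row first
by_col <- matrix(1:6, nrow = 2, ncol = 3)                 # default: fill down each column
print(by_row)
print(by_col)
```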

20

What would the following code print? rep(1:10,2)

[1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10

[1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10

[1] 1 1 2 2 3 4 5 6 7 8 9 10

[1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 1 1
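The difference between repeating the whole vector and repeating each element can be sketched with the `times` (second positional) and `each` arguments of `rep()`.

```r
whole   <- rep(1:10, 2)          # the whole sequence, twice in a row
element <- rep(1:10, each = 2)   # every element doubled in place
print(whole)
print(element)
```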

21

What would the following code print? x <- 1:9; y <- 15:17; cbind(x, y)[1,]

x y 1 15

x y 2 16

[1] 1 2 3 4 5 6 7 8 9

x y 9 17
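A small sketch of the column binding involved. Because 9 is a multiple of 3, `y` is recycled without a warning.

```r
x <- 1:9
y <- 15:17
m <- cbind(x, y)       # y is recycled three times to fill the 9 rows
first_row <- m[1, ]    # named vector with elements x and y
print(first_row)
```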

22

The ____ function concatenates 2 or more character strings:

paste()

bind()

mutate()

arrange()

23

Which of the following statements is true about character strings in R programming?

Character strings are entered with either matching double (") or single (') quotes.

Character vectors may be appended into a vector by the c() function.

None of the above

Both a and b

24

A programmer writes code on a single line as x = 1 y=2. Which of the following statements applies to this code?

This code is correct.

It should be written as x = 1; y=2

R is a free text language and codes can be written in single line

x = 1 > y=2

25

What will be the result of the code list(5, "John", TRUE, 1+ 9i)?

[[1]] [1] 5 [[2]] [1] "John" [[3]] [1] TRUE [[4]] [1] 1+9i

[1] "5" "John" "TRUE" "1+9i"

[1] 5

Error: unexpected ','

26

Identify the correct statement in R

Matrices are 2-dimensional and can have elements of different data types

Vectors are 1-dimensional and can have elements of the same data type

Data frames are 2-dimensional

Lists can have elements of different data types

27

Which is the valid syntax to convert the character vector x = c("1","5","98","23") to numeric?

as.num(x)

as.numeric(x)

as.factor(x)

as.char(x)

28

What would be the output of the code below? x = list(78,"Tim",101,8i); length(x)

10

7

4

0

29

What would be the class of the object below? as.character(factor(c("No", "yes", "no", "yes", "no")))

character

factor

numeric

list

30

What would be the output of sum(2,8,9,NA)?

19

NA

Compilation error

None of above

31

What would be the output of c(T,F,TRUE,1)?

[1] 1 0 1 1

[1] TRUE FALSE TRUE TRUE

[1] T,F,TRUE,1

[1] TRUE,FALSE,TRUE,1

32

How many entries will this data frame have? EmpRecs = data.frame(EmpID = c(101,102,103) , Name = c("John","Theresa","Andy","Paul"))

3

4

0

R will throw Error
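The call can be tried safely inside `tryCatch()`, which captures the condition message instead of stopping the script. This is only a sketch for inspecting the behaviour; the name `res` is illustrative.

```r
res <- tryCatch(
  data.frame(EmpID = c(101, 102, 103),
             Name  = c("John", "Theresa", "Andy", "Paul")),
  error = function(e) conditionMessage(e)   # return the error text instead of failing
)
print(res)
```

Columns of length 3 and 4 cannot be recycled to a common length, so `res` holds an error message rather than a data frame.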

33

How do you permanently convert a numeric vector into a factor? x= c(98,2,67,87)

x = as.factor(x)

x = as.fac(x)

as.factor(x)

as.fac(x)

34

Which statement about R is correct?

Character strings are imported as factors by default.

Factors are used to represent categorical data and can be unordered or ordered.

Neither a nor b is correct.

Both a and b are correct.

35

What would be the output of sum(2,8,9,NA,na.rm=T)?

19

NA

None of above
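The effect of `na.rm` on `sum()` can be compared side by side. The variable names are illustrative.

```r
with_na    <- sum(2, 8, 9, NA)                 # a missing value propagates to the result
without_na <- sum(2, 8, 9, NA, na.rm = TRUE)   # missing values are dropped before summing
print(with_na)
print(without_na)
```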

36

Which statement is correct about loops in R?

A repeat loop is exited by calling break.

Infinite loops should be avoided.

Neither a nor b is correct.

Both a and b are correct.

37

Which of the following code snippets stops a loop after 20 iterations?

for(i in 1:100){ if(i >20){ break } print(i) }

for(i in 1:100){ if(i >19){ break } print(i) }

for(i in 1:100){ if(i <20){ break } print(i) }

for(i in 1:100){ if(i <19){ break } print(i) }

38

What will be the output of the code below? for(i in seq(from=1,to=10,by=2)) { Variable1=print(i) }

[1] 1 [1] 3 [1] 5 [1] 7 [1] 9

[1] 1 [1] 2 [1] 3 [1] 4 [1] 5 [1] 6 [1] 7 [1] 8 [1] 9 [1] 10

[1] 1 [1] 2 [1] 3 [1] 4 [1] 5 [1] 6 [1] 7 [1] 8 [1] 9 [1] 6 [1] 7 [1] 8 [1] 9

[1] 1 [1] 4 [1] 7

39

What will be the output of this code? x <- 1; switch(x, 2+2, sum(1:10), max(1:10))

55

4

10

NA
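With a numeric first argument, `switch()` evaluates the alternative at that position. A brief sketch, where the second call is added only for contrast:

```r
x <- 1
first  <- switch(x, 2 + 2, sum(1:10), max(1:10))   # position 1: evaluates 2 + 2
second <- switch(2, 2 + 2, sum(1:10), max(1:10))   # position 2: evaluates sum(1:10)
print(first)
print(second)
```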

40

What will be the output of this code? x =c(51,93,8,67,22); x[x>50 & x<90]

51 68

51 67

numeric(0)

8 22
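Logical subsetting can be unpacked into two steps, which makes the intermediate mask visible. The names `keep` and `selected` are illustrative.

```r
x <- c(51, 93, 8, 67, 22)
keep <- x > 50 & x < 90   # element-wise test: TRUE where both conditions hold
selected <- x[keep]       # keep only the elements flagged TRUE
print(keep)
print(selected)
```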

41

Which function gives statistics such as min and max values, quartiles, mean and median?

Quantile

Mean

Median

Summary

42

What does "describe.by" do in this code: describe.by(iris,group=iris\$Species)

Summary statistics for each species separately.

Overall summary statistics

Both a and b

Neither a nor b

43

CustomerID = c(1098,1099); ProductOrdered = c('Fan','Washing Machine'). What should be the code to get the view below?

cbind(CustomerID,ProductOrdered)

rbind(CustomerID,ProductOrdered)

data.frame(CustomerID,ProductOrdered)

Either a or c

44

Can this data frame named CustDetails be converted into a matrix?

TRUE

FALSE

45

What will be the class of the variable CustomerID after converting CustDetails from a data frame to a matrix?

Numeric

Character

Integer

None of above
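The CustDetails data frame itself is not shown in the text. The sketch below assumes it is built from the two vectors in question 43, which is a guess, and checks the column type after conversion.

```r
# Assumed stand-in for CustDetails, built from the vectors in question 43.
CustDetails <- data.frame(CustomerID     = c(1098, 1099),
                          ProductOrdered = c("Fan", "Washing Machine"))

m <- as.matrix(CustDetails)                # a matrix can hold only one type
col_type <- typeof(m[, "CustomerID"])      # type of the former numeric column
print(m)
print(col_type)
```

Under that assumption, the text column forces the whole matrix, including CustomerID, to character.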

46

To take the mean of all variables in a data frame without calling the mean function on each variable separately, which function should be used?

mean

apply

ifelse

ColSums

47

Which statement is true about the lapply function?

lapply is a list apply which acts on a list or vector and returns a list

lapply returns a numeric vector

48

Which of the following statements is true about sapply?

Works for all variables in a data frame, and the output is numeric

Works for only one variable

Is applicable to only numeric variables

None of above
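The return types of `lapply()` and `sapply()` can be compared on the built-in iris data. The sketch uses only the four numeric columns, since `mean()` is not meaningful for the Species factor.

```r
num_cols <- iris[, 1:4]

as_list   <- lapply(num_cols, mean)   # list with one element per column
as_vector <- sapply(num_cols, mean)   # simplified to a named numeric vector
print(class(as_list))
print(as_vector)
```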

49

What will be the output of this code? apply(iris[,1:4],2,mean)

mean of all numeric columns

mean by rows for all observations

mean of selected columns i.e 1 to 4 only

mean of columns 1 to 3

50

What will be the output of this code? apply(iris[,1:4],1,mean)

mean of all numeric columns

mean by rows for all observations for first 4 columns only

row means for all columns

None of above
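The second argument of `apply()` is the margin: 1 for rows and 2 for columns. A brief sketch comparing both on the first four iris columns:

```r
col_means <- apply(iris[, 1:4], 2, mean)   # MARGIN = 2: one mean per column
row_means <- apply(iris[, 1:4], 1, mean)   # MARGIN = 1: one mean per row (observation)
print(col_means)
print(length(row_means))
```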

51

Which are the measures of central tendency?

Mean, kurtosis and skewness

Mean, Median and Mode

Mode & Range

Standard Deviation and Range

52

What is the formula for standard deviation of a sample? (Please check the image and click on the correct answer)

D

C

B

A

53

If a positively skewed distribution has a median of 75, which of the following statements is true?

Mean greater than 75

Mode is less than 75

Mode is greater than 75

Both a and b

54

What does a high standard deviation represent?

Data is less scattered

Data is highly scattered

55

What would be the critical values of Z for a 95% confidence interval in a two-tailed test?

+/- 2.33

+/- 1.96

+/- 1.64

+/- 2.55

56

In a normal distribution, what percentage of the data lies between -1 and +1 standard deviations?

68%

90%

34%

95%
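Both the two-tailed critical value and the one-standard-deviation coverage can be computed with the standard normal functions `qnorm()` and `pnorm()`. The variable names below are illustrative.

```r
# Two-tailed test at 5%: 2.5% in each tail, so use the 97.5th percentile.
z_crit <- qnorm(0.975)

# Probability mass within one standard deviation of the mean.
within_1sd <- pnorm(1) - pnorm(-1)

print(round(z_crit, 2))
print(round(within_1sd, 4))
```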

57

Which graph shows a strong positive correlation? (Please check the image and click on the correct answer)

D

C

B

A

58

The overall average aptitude of Simplilearn participants is 75 out of 100 over the last 2 years. For a particular batch, say the one starting in May'18, the average aptitude is 72. Given that the population standard deviation is 5, what will be the CI range?

65.2 to 84.8

66.8 to 83.2

63.25 to 86.75

60 to 90
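The question states neither a confidence level nor a sample size. As a worked sketch only, assuming a 95% level and a range taken as the population mean plus or minus 1.96 population standard deviations, the arithmetic is:

```r
pop_mean <- 75
pop_sd   <- 5

# Assumption: 95% level (z = 1.96), range centred on the population mean.
ci <- pop_mean + c(-1, 1) * 1.96 * pop_sd
print(ci)
```

Under these assumptions the batch average of 72 falls inside the range.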

59

Which statement is true about k-fold cross-validation?

It is an extension of cross-validation in which the original sample data is split into k random samples of equal size

Only Two datasets training and test in 70:30 ratio are divided

It is a model validation technique

It gives robust measure of equation/ model testing

60

What happens when we move from simple linear regression to multiple linear regression?

The r squared will increase or remain constant

The r squared may decrease

Both r square and adjusted r square always increase

R square might increase or decrease depending on the variables significance.
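How R-squared behaves when a predictor is added can be observed on the built-in mtcars data. The choice of mpg, wt and hp is arbitrary and only for illustration.

```r
simple   <- lm(mpg ~ wt, data = mtcars)        # one predictor
multiple <- lm(mpg ~ wt + hp, data = mtcars)   # same model plus one more predictor

r2_simple    <- summary(simple)$r.squared
r2_multiple  <- summary(multiple)$r.squared
adj_multiple <- summary(multiple)$adj.r.squared   # penalised for the extra predictor

print(c(r2_simple, r2_multiple, adj_multiple))
```

For nested least-squares models, plain R-squared cannot go down when a predictor is added, whereas adjusted R-squared can.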

61

Is this statement correct? In a normal distribution, Mean = Median = Mode.

TRUE

FALSE

62

Pearson correlation is used where there is a linear relationship between 2 variables, whereas Spearman correlation is used for a monotonic relationship. Is this correct?

TRUE

FALSE

63

How are errors calculated in linear regression?

Error = Predicted Y - Actual Y

Error = Actual Y - Predicted Y

Error = (Actual Y - Predicted Y)^2

Error = ?(Actual Y - Predicted Y)^2

64

Which equation is this? Y = B0 + B1*X1 + B2*X2 + B11*X1^2 + B22*X2^2 + B12*X1*X2 + Err

Simple linear regression

Multiple linear regression

Logarithmic regression

Polynomial regression

65

The independent variable X is also called:

Predictor variable

Predicted variable

Residual

Target variable

66

Which modeling techniques are used to predict a categorical variable? Select all that apply.

Linear regression

ANOVA

Logistic Regression

SVM

67

What is the function for fitting a linear regression model?

lm

aov

glm

rpart

68

Which statistical test/distribution is non-parametric?

Chisquare

Normal distribution

Kruskal wallis

Mann Whitney

69

Is this statement correct: Decision trees can be used only to predict a binary or categorical variable.

TRUE

FALSE

70

What are the methods to avoid overfitting in Decision Trees?

Pre pruning

Post pruning

Both

None

71

How is information gain calculated for a continuous variable? The steps are: 1. Find the midpoint of the first 2 numbers. 2. Sort the data in increasing order. Is the sequence correct?

Yes the sequence is correct

No the sequence should be 2nd and then 1st

72

A support vector machine is a:

Machine learning algorithm

Statistical Algorithm

73

Which types of nodes are there in a decision tree?

Root node only

Root, Branch and leaf nodes

Branch and leaf nodes only

None of above

74

Which of the following are called disjoint clusters?

Kmeans cluster

Hierarchical clusters

Agglomerative clusters

None of above

75

Is this statement correct: Clustering is an unsupervised learning algorithm.

TRUE

FALSE

76

Below are the steps for k-means clustering. Arrange them in the right order: 1. The clusters are validated with actual centroids 2. Random seeds are joined with virtual lines and perpendicular bisectors are drawn 3. Random seeds are assigned 4. Areas separated by the perpendicular bisectors are the clusters

3,2,4,1

1,3,4,2

1,2,3,4

4,1,2,3

77

Which of the following statement(s) is true about clustering?

As we increase the number of clusters the R square increases

Clustering is unsupervised learning algorithm

Hierarchical clusters are more applicable to find out distance between different cities/ states.

Clustering is used to predict continuous variable

78

How is the Euclidean distance between the 2 coordinates (23,30) and (22,31) calculated?

√((23-22)^2 + (30-31)^2)

√((22-24)^2 + (30-31)^2)

23-22 = 1

30-31 = -1

79

Which of the following statements is true?

The Manhattan distance between 2 x and y coordinates will be more than euclidean distance

The euclidean distance between 2 x and y coordinates will be more than Manhattan distance
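Both distances can be computed with base R's `dist()`. The sketch below uses the two points from question 78 as an example; in general the Manhattan distance is at least as large as the Euclidean one, with equality when the points differ along a single axis.

```r
pts <- rbind(c(23, 30), c(22, 31))

euclid    <- as.numeric(dist(pts, method = "euclidean"))   # sqrt(1^2 + (-1)^2)
manhattan <- as.numeric(dist(pts, method = "manhattan"))   # |1| + |-1|

print(euclid)
print(manhattan)
```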

80

What is true about the k-means clustering algorithm?

K-means is sensitive to cluster center initializations

All of the above

81

Is this code correct: KmeansCluster = kmeans(Cust_Segment_Data)

Yes

No, syntax center = 3 is missing

No, The distance measure needs to be specified as Manhattan

No, The distance measure needs to be specified as Euclidean
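Whether `kmeans()` runs without a `centers` argument can be tested directly. In this sketch the numeric iris columns stand in for Cust_Segment_Data, which is an assumption since the real data is not shown, and the seed value is arbitrary.

```r
# iris measurements used as a stand-in for Cust_Segment_Data (assumption).
Cust_Segment_Data <- iris[, 1:4]

# Calling kmeans() without centers; capture the error instead of stopping.
no_centers <- tryCatch(kmeans(Cust_Segment_Data), error = function(e) e)
print(inherits(no_centers, "error"))

set.seed(42)   # k-means starts from random centers, so fix the seed
KmeansCluster <- kmeans(Cust_Segment_Data, centers = 3)
print(KmeansCluster$size)
```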

82

What is the prerequisite for hierarchical clustering?

Calculation of Distance matrix

Standardizing the variables

Removing missing values

All of above

83

What are the 2 major components of DBSCAN clustering?

Eps and Minpts

Eps and Distance between centroids

Minpts and Within distance measure

Eps, Minpts, Distance between Centroids, Within distance measure

84

Which clustering algorithm is called a bottom-up approach?

Kmeans

Fuzzy

Hierarchical

Dbscan

85

A retail chain would like to roll out offers on multiple products together based on product similarity. What is the best approach to solve this business problem?

Clustering

Regression

Hypothesis testing

86

While reading/importing transactional data for market basket analysis, what should be kept in mind?

The data has to be in transactional format and should be read with syntax read.transactions

Either of above

None of above

87

Which algorithm is used to find associations between products that a customer can buy at a retail outlet?

Apriori algorithm

Kmeans algorithm

Random forest algorithm

Bootstrapping

88

In market basket analysis, how is the support statistic derived? Choose the right answer.

For a Product A support = Number of transactions in which A was present/ Total number of transactions in a retail store

For a Product A support = Number of transactions in which A was NOT present/ Total number of transactions in a retail store

1/Total number of distinct products

None of above

89

In market basket analysis, how is the confidence statistic between products A and B derived? Choose the right answer.

Number of transactions in which A & B were bought/ Number of transactions in which A was bought.

Number of transactions in which A & B were bought/ Total number of transactions for all the products

Number of transactions in which A was bought/ Number of transactions in which A & B were bought

Number of transactions in which B was bought/ Number of transactions in which A & B were bought
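Support and confidence can be computed by hand on a toy set of transactions. The baskets below are hypothetical, and the sketch uses base R only rather than a dedicated association-rules package.

```r
# Hypothetical baskets; each list element is one transaction.
baskets <- list(c("A", "B"), c("A"), c("B", "C"), c("A", "B", "C"), c("C"))

has_a  <- sapply(baskets, function(b) "A" %in% b)               # transaction contains A
has_ab <- sapply(baskets, function(b) all(c("A", "B") %in% b))  # contains both A and B

support_a     <- sum(has_a) / length(baskets)   # share of all transactions containing A
confidence_ab <- sum(has_ab) / sum(has_a)       # of those with A, share that also contain B

print(support_a)
print(confidence_ab)
```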

90

What is the significance of the confidence statistic?

Higher the confidence higher are the chances that the defined 2 or more products will be bought together

Higher the confidence lower are the chances that the defined 2 or more products will be bought together

There is an inverse relationship between confidence and the likelihood of purchase of the 2nd product.

91

What is the syntax used to view the number of rules derived by the apriori algorithm?

Summary

Max

Inspect

itemFr

92

In the apriori algorithm, what is the syntax used to view the rules with lhs and rhs?

Summary

Max

Inspect

itemFr

93

What is the applicability of association rules?

Purchase data analysis

Website traffic analysis

Predict cost of a product

94

Which cutoffs are explicitly defined in the apriori algorithm when deriving association rules?

Support and confidence

Support and lift

Support, confidence and lift

None of above

95

When using a clustering algorithm, how would you decide the ideal number of clusters?

If the increase in clusters does not give a substantial increase in rsquare we should not increase the number of clusters.

We should keep on increasing number of clusters till the time we get maximum rsquare.

The rsquare is normally ideal at 5 number of clusters

96

What are the primary factors that influence the rsquare in clustering?

Between distance - Distance between centroids

Within distance - Average Distance between observation in each cluster from its centroid.

Both a and b

None of above

97

Out of 3000 products in a retail store, Product A has a support statistic of .001. What does this represent?

Product A is not very popular and does not have lot of transactional volume

Product A is running short in supply

Product A is popular and is bought frequently

None of above

98

In which circumstances is ANOVA used?

Trying to predict a categorical variable

When Independent variables are categorical and dependent variable is continuous

When both independent and dependent variables are categorical

When both independent and dependent variables are continuous

99

In a campaign analysis, what does a probability value of .003 signify in an ANOVA test?

At least one campaign is different from another campaign

One of the campaigns is better than the other campaigns

Either a or b

All the campaigns are alike and there is no statistically significant difference in campaigns.
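A one-way ANOVA in base R can be sketched with `aov()`. No campaign data is given, so the built-in iris data stands in here (an assumption): Species plays the role of the categorical factor and Sepal.Length the continuous response.

```r
# Continuous response explained by a categorical factor.
fit <- aov(Sepal.Length ~ Species, data = iris)

# Extract the p-value of the F test for the Species term.
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]
print(summary(fit))
```

A p-value below the chosen significance level, commonly 0.05, suggests that at least one group mean differs from the others. It does not say which group differs or which is better.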

100

Is this formula correct for calculating the F statistic?

TRUE

FALSE

101

Which of the following is not correct?

Logistic regression is used to predict binary variables

Market basket analysis is used to predict continuous variables

Decision trees and Random forest predicts classification variables

Clustering is unsupervised learning

102

Predicting sales volume is what kind of statistical problem?

Classification

Prediction

Segmentation
