
STAT 6031            Instructor: Emily L. Kang                Fall 2015

STAT 6031

Midterm Examination 2

Student name (Last, First):

 

Student ID:

 

This exam is closed book and notes. One sheet of paper (both sides but nothing photocopied) can be used. You will need a calculator.

There are four problems on this exam worth a total of 100 points. Please read each question carefully and ask the instructor if you have any questions. For Question 2 through Question 4, be sure to show all your work and explain all answers clearly and fully in proper English. All answers related to interpretation or conclusions must be in terms of the problem. Point values for each part are given in square brackets before the question.

Before turning in your examination, please acknowledge that no unauthorized aid has been received by signing and dating the following pledge (from the UC Code of Academic Integrity).

Signature:     Date:

 

 

  1. [30 × 7] Special instruction for this question: You DO NOT have to show your work for (a) through (g); only your answers in boxes will matter.

    1. Suppose the estimated regression equation is Ŷ = 3X + 2. If the analysis were rerun with X replaced by U = 0.5X + 2, give the new estimated regression equation.

Answer: 

    1. In a simple linear regression there are two intervals associated with X = 3.5: (14.5,16.5) and (13,18), one of which is the confidence interval for the mean response at this level of X, while the other is the prediction interval. Which is the prediction interval?

Answer: 

    1. Indicate whether the following statement is True (T) or False (F): If the value of R² is larger than 90%, then it is not necessary to do a residual analysis.

Answer: 

    1. Indicate whether the following statement is True (T) or False (F): A transformation of the explanatory variable, X, is required if the original X-values do not appear to follow a normal distribution.

Answer: 

    1. If we obtain 10 prediction intervals, each with an individual confidence level of 99%, then the joint confidence level for all 10 intervals is at least ____%.

Answer:

 

    1. Increased arterial blood pressure in the lungs can lead to heart failure in patients with chronic obstructive pulmonary disease (COPD). Determining arterial lung pressure is invasive, difficult, and can hurt the patient. A cardiologist measured X = ejection rate of blood pumped out of the heart into the lungs (from radionuclide imaging) and Y = invasive measure of arterial blood pressure on n = 19 COPD patients, and considered regressing Y on X. The following figure is the scatterplot of Y vs. X. Suggest a transformation of ejection rate to achieve a linear relationship if needed, or just say “No transformation needed”.

 

Answer:

 

    1. Let Y = Xβ + ε be the multiple regression model with ε ~ Nₙ(0, σ²Iₙ). Let β be p × 1. Give the least squares estimator of β and its distribution.

Answer: 
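
Two of the short-answer items above reduce to one-line formulas, which can be sketched in Python. This is only an illustrative sketch, not an official answer key: the function names `bonferroni_joint_level` and `refit_after_linear_change` are hypothetical, and the sketch assumes the Bonferroni inequality is the intended bound and that the new predictor is an exact linear function U = aX + c of the old one.

```python
def bonferroni_joint_level(individual_level: float, k: int) -> float:
    """Bonferroni lower bound on the joint confidence level of k intervals.

    If each interval has level 1 - alpha, all k hold simultaneously
    with probability at least 1 - k * alpha.
    """
    alpha = 1.0 - individual_level
    return max(0.0, 1.0 - k * alpha)


def refit_after_linear_change(b0: float, b1: float, a: float, c: float):
    """Rewrite Yhat = b0 + b1 * X in terms of U = a * X + c (a must be nonzero).

    Since X = (U - c) / a, the fitted line becomes
    Yhat = (b0 - b1 * c / a) + (b1 / a) * U.
    Returns (intercept, slope) on the U scale.
    """
    if a == 0:
        raise ValueError("a must be nonzero")
    slope = b1 / a
    intercept = b0 - slope * c
    return intercept, slope


if __name__ == "__main__":
    # 10 intervals, each at the 99% level
    print(bonferroni_joint_level(0.99, 10))
    # Yhat = 3X + 2 with U = 0.5X + 2
    print(refit_after_linear_change(b0=2.0, b1=3.0, a=0.5, c=2.0))
```

The second function relies on the fact that least squares fitted values do not change under an invertible linear change of the predictor, so the new coefficients follow by substitution rather than by refitting.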

  2. [290] Consider the diamond example we had in class, where the response variable Y = ring price (in Singapore dollars), and the explanatory variable X = weight of diamond (in carats). We got the following output by regressing Y on X in SAS:
    1. [80] Using the output above, comment in detail on the fit of the ring price data to the explanatory variable, diamond weight. Interpret the fitted coefficients. Suggest what should be done in the next step.
    2. [60] Suppose that you want to test whether the slope β₁ > 0, with the null hypothesis being H₀: β₁ = 0. Is it OK to use the F-test for this purpose? Explain.
    3. [70] Give 90% joint confidence intervals for β₀ and β₁, using the Bonferroni correction. (Exact values needed.)
    4. [70] Suppose that the average weight of all the diamonds in this dataset is 0.21 carats. Give the 95% confidence interval for the average price of diamonds whose weight is 0.31 carats. You don’t need to give exact values for this level of significance.
  3. [450] In a study of environmental effects on snow geese during the 1987/88 winter season, researchers at a wildlife refuge near the Texas coast explored weather conditions as predictors of the time at which the geese left their overnight roost sites to fly to their feeding areas. The variables they measured were:

Y: Time = minutes before (negative values) or after (positive values) sunrise that the geese departed

X₁: Temp = air temperature in degrees Celsius

X₂: Light = light intensity

X₃: Cloud = percent cloud cover

X₄: Humidity = relative humidity

Below are the results from a multiple regression analysis of the time the geese left their roost as a function of the four explanatory variables:

The REG Procedure

Model: MODEL1

Dependent Variable: Time

                      Analysis of Variance

                              Sum of        Mean
Source              DF       Squares      Square    F Value    Pr > F
Model              dfM        6382.6         MSM          F    <.0001
Error              dfE           SSE         MSE
Corrected Total     35        8412.3

                      Parameter Estimates

                   Parameter    Standard
Variable     DF     Estimate       Error    t Value    Pr > |t|
Intercept     1      -52.994       8.787      -6.03     <0.0001
Temp          1       0.9103      0.2646       3.45       0.002
Light         1       2.5160      0.7512       3.35       0.002
Cloud         1       0.0922      0.0439       2.10       0.044
Humidity      1       0.1425      0.1138       1.25       0.220

Variable     Type I SS
Intercept      12443.2
Temp            4996.6
Light            861.0
Cloud            422.3
Humidity         102.7

  1. [180] What are the missing values in the ANOVA table for dfM, dfE, SSE, MSM, MSE, and F?
  2. [70] Compute R² and R²ₐ (adjusted R²). Interpret R².
  3. [70] Use an appropriate procedure to check if it is worthwhile to regress the time the geese left their roost on at least one of air temperature, light intensity, percent cloud cover, and relative humidity.
  4. [30] Interpret b₁ = 0.9103 found in the above output. State your conclusion for the corresponding t-test.
  5. [50] The Type I SS listed in the output above were obtained from SAS. Based on them, compute SSM(2,3|1) and the coefficient of partial determination R²_{Y 2,3|1}.

  6. [50] For the model Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₄X₄ + ε, test H₀: β₃ = β₄ = 0 vs. Hₐ: not both β₃ and β₄ are 0. Give the test statistic and its distribution under the null hypothesis.
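
The ANOVA, R², and sequential sum-of-squares parts above use only arithmetic on numbers printed in the SAS output, so they can be checked with a short Python sketch. It is a hedged illustration rather than an official solution. It assumes the model contains an intercept plus the four predictors (so the model degrees of freedom equal 4) and that the Type I SS are sequential in the order Temp, Light, Cloud, Humidity. The function name `geese_anova_summary` and the dictionary keys are hypothetical.

```python
def geese_anova_summary() -> dict:
    """Fill in the ANOVA table and related quantities from the printed output."""
    # Values copied from the SAS output
    df_total = 35            # corrected total degrees of freedom (n - 1)
    sst = 8412.3             # corrected total sum of squares
    ssm = 6382.6             # model sum of squares
    type1_ss = {"Temp": 4996.6, "Light": 861.0, "Cloud": 422.3, "Humidity": 102.7}

    # Assumption: one model degree of freedom per predictor
    df_model = len(type1_ss)
    df_error = df_total - df_model
    sse = sst - ssm
    msm = ssm / df_model
    mse = sse / df_error

    # Extra sum of squares for Light and Cloud given Temp (sequential order)
    ssm_23_given_1 = type1_ss["Light"] + type1_ss["Cloud"]
    # Error sum of squares of the model containing Temp only
    sse_temp_only = sst - type1_ss["Temp"]

    # Partial F statistic for H0: beta3 = beta4 = 0 (Cloud and Humidity enter last)
    q = 2
    ss_34_given_12 = type1_ss["Cloud"] + type1_ss["Humidity"]

    return {
        "df_model": df_model,
        "df_error": df_error,
        "sse": sse,
        "msm": msm,
        "mse": mse,
        "f_overall": msm / mse,                   # compare with F(df_model, df_error)
        "r2": ssm / sst,
        "r2_adj": 1.0 - mse / (sst / df_total),
        "ssm_23_given_1": ssm_23_given_1,
        "r2_partial_23_given_1": ssm_23_given_1 / sse_temp_only,
        "f_partial_34": (ss_34_given_12 / q) / mse,  # compare with F(q, df_error)
    }


if __name__ == "__main__":
    for name, value in geese_anova_summary().items():
        print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```

As a consistency check, the four Type I SS add up to the model sum of squares, which is what one expects when they are sequential contributions in the listed order.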

4. Students in an undergraduate class at UC took their one-minute pulses at the start of class. Some time later, they took a second one-minute pulse. A simple linear regression analysis treating the second pulse count as the response variable and the first pulse count as the explanatory variable is carried out.

a. [40] The second pulse counts were actually collected after the class conducted an experiment. Each student tossed a coin. Students whose coins came up heads ran in place for 1 minute; the other students remained seated for the minute. Then the second pulse count was taken immediately after the runners finished running. Below is a plot of the residuals from the simple linear regression model plotted against the first pulse count. The points are marked to indicate which students ran and which did not. What is this residual plot telling you?
