STAT 6031 Instructor: Emily L. Kang Fall 2015
STAT 6031
Midterm Examination 2
Student name (Last, First):
Student ID:
This exam is closed book and notes. One sheet of paper (both sides but nothing photocopied) can be used. You will need a calculator.
There are four problems on this exam worth a total of 100 points. Please read each question carefully and ask the instructor if you have any questions. For Question 2 through Question 4, be sure to show all your work and explain all answers clearly and fully in proper English. All answers related to interpretation or conclusions must be in terms of the problem. Point values for each part are given in square brackets before the question.
Before turning in your examination, please acknowledge that no unauthorized aid has been received by signing and dating the following pledge (from the UC Code of Academic Integrity).
Signature: Date:
1. [30 × 7] Special instruction for this question: You DO NOT have to show your work for (a) through (g); only your answers in boxes will matter.
a. Suppose the estimated regression equation is Ŷ = 3X + 2. If the analysis was rerun with X replaced with U = 0.5X + 2, give the new estimated regression equation.
Answer:
b. In a simple linear regression there are two intervals associated with X = 3.5: (14.5, 16.5) and (13, 18), one of which is the confidence interval for the mean response at this level of X, while the other is the prediction interval. Which is the prediction interval?
Answer:
c. Indicate whether the following statement is True (T) or False (F): If the value of R² is larger than 90%, then it is not necessary to do a residual analysis.
Answer:
d. Indicate whether the following statement is True (T) or False (F): A transformation of the explanatory variable, X, is required if the original X-values do not appear to follow a normal distribution.
Answer:
e. If we obtain 10 prediction intervals, each with an individual confidence level of 99%, then the joint confidence level for all 10 intervals is at least ____ %.
Answer:
f. Increased arterial blood pressure in the lungs can lead to heart failure in patients with chronic obstructive pulmonary disease (COPD). Determining arterial lung pressure is invasive, difficult, and can hurt the patient. A cardiologist measured X = ejection rate of blood pumped out of the heart into the lungs (from radionuclide imaging) and Y = invasive measure of arterial blood pressure on n = 19 COPD patients, and considered regressing Y on X. The following figure is the scatterplot of Y vs. X. Suggest a transformation of ejection rate to achieve a linear relationship if needed, or just say “No transformation needed”.
Answer:
g. Let Y = Xβ + ε be the multiple regression model with ε ∼ N_n(0, σ²I_n). Let β be p × 1. Give the least squares estimator of β and its distribution.
Answer:
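The joint confidence level question in Problem 1 rests on the Bonferroni inequality: if each of g intervals has individual level 1 − α, the joint level is at least 1 − gα. The following is a minimal sketch of that bound; the function name `bonferroni_joint_level` is hypothetical and chosen only for illustration.

```python
def bonferroni_joint_level(individual_level: float, g: int) -> float:
    """Bonferroni lower bound on the joint confidence level of g intervals.

    Each interval is assumed to have the same individual confidence level.
    The bound is 1 - g * alpha, floored at 0 because a confidence level
    cannot be negative.
    """
    alpha = 1.0 - individual_level
    return max(0.0, 1.0 - g * alpha)


# Ten intervals, each at the 99% level.
joint = bonferroni_joint_level(0.99, 10)
print(f"Joint confidence level is at least {joint:.2f}")
```

The bound is conservative: it holds regardless of how the intervals depend on one another, which is why it only gives an "at least" statement.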
2. [290] Consider the diamond example we had in class, where the response variable Y = ring price (in Singapore dollars), and explanatory variable X = weight of diamond (in carats). We got the following output by regressing Y on X in SAS:
a. [80] Using the output above, comment in detail on the fit of the ring price data to the explanatory variable, diamond weight. Interpret the fitted coefficients. Suggest what should be done in the next step.
b. [60] Suppose that you want to test whether the slope β1 > 0 with the null hypothesis being H0: β1 = 0. Is it OK to use the F-test for this purpose? Explain.
c. [70] Give 90% joint confidence intervals for β0 and β1, using the Bonferroni correction. (Exact values needed)
d. [70] Suppose that the average weight of all the diamonds in this dataset is 0.21 carats. Give the 95% confidence interval for the average price of diamonds whose weight is 0.31 carats. You don’t need to give exact values for this level of significance.
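For reference, the interval forms that Problem 2 relies on can be written out as below. This is a sketch in one common textbook notation, which is an assumption here: b_k are the least squares estimates, s{·} denotes an estimated standard error, MSE is the error mean square, and t(q; n − 2) is the q-th quantile of the t distribution with n − 2 degrees of freedom. The SAS output is not reproduced in this copy, so no numerical values are filled in.

```latex
% Bonferroni joint intervals for g = 2 parameters at joint level 1 - \alpha:
b_k \pm t\!\left(1 - \frac{\alpha}{2g};\, n - 2\right) s\{b_k\},
\qquad k = 0, 1 .
% With \alpha = 0.10 and g = 2, the multiplier is t(0.975;\, n - 2).

% Confidence interval for the mean response at X = X_h:
\hat{Y}_h \pm t\!\left(1 - \frac{\alpha}{2};\, n - 2\right) s\{\hat{Y}_h\},
\qquad
s^2\{\hat{Y}_h\} = \mathrm{MSE}\left[\frac{1}{n}
  + \frac{(X_h - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}\right].
```

In part (d), X_h = 0.31 and the sample mean weight is 0.21, so the squared deviation in the second term is (0.31 − 0.21)² = 0.01.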
3. [450] In a study of environmental effects on snow geese during the 1987/88 winter season, researchers at a wildlife refuge near the Texas coast explored weather conditions as predictors of the time at which the geese left their overnight roost sites to fly to their feeding areas. The variables they measured were:
Y : Time = minutes before (negative values) or after (positive values) sunrise that the geese departed
X1: Temp = air temperature in degrees Celsius
X2: Light = light intensity
X3: Cloud = percent cloud cover
X4: Humidity = relative humidity
Below are the results from a multiple regression analysis of the time the geese left their roost as a function of the four explanatory variables:

The REG Procedure
Model: MODEL1
Dependent Variable: Time

Analysis of Variance

                           Sum of        Mean
Source              DF    Squares      Square    F Value    Pr > F
Model              dfM     6382.6         MSM          F    <.0001
Error              dfE        SSE         MSE
Corrected Total     35     8412.3

Parameter Estimates

                    Parameter    Standard
Variable      DF     Estimate       Error    t Value    Pr > |t|
Intercept      1      -52.994       8.787      -6.03     <0.0001
Temp           1       0.9103      0.2646       3.45       0.002
Light          1       2.5160      0.7512       3.35       0.002
Cloud          1       0.0922      0.0439       2.10       0.044
Humidity       1       0.1425      0.1138       1.25       0.220

Variable      Type I SS
Intercept       12443.2
Temp             4996.6
Light             861.0
Cloud             422.3
Humidity          102.7

a. [180] What are the missing values in the ANOVA table: dfM, dfE, SSE, MSM, MSE, and F?
b. [70] Compute R² and R²_a (adjusted R²). Interpret R².
c. [70] Use an appropriate procedure to check if it is worthwhile to regress the time the geese left their roost on at least one of air temperature, light intensity, percent cloud cover, and relative humidity.
d. [30] Interpret b1 = 0.9103 found in the above output. State your conclusion for the corresponding t-test.
e. [50] Using the Type I SS reported in the SAS output above, compute SSM(2,3|1) and R²_{Y23|1}.
f. [50] For the model Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε, test H0: β3 = β4 = 0 vs. Ha: not both β3 and β4 are 0. Give the test statistic and its distribution under the null hypothesis.
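The arithmetic behind the ANOVA quantities in Problem 3 can be sketched as follows. This is an illustrative calculation, not an official solution: the numbers are read from the SAS output above, while the function and variable names (`anova_summary`, `type1_ss`, and so on) are hypothetical. It assumes the Type I SS are sequential in the order Temp, Light, Cloud, Humidity, so that adjacent entries can be summed to obtain extra sums of squares.

```python
# Values read from the SAS output in the exam.
SS_MODEL = 6382.6
SS_TOTAL = 8412.3
DF_TOTAL = 35          # corrected total degrees of freedom, n - 1
type1_ss = {"Temp": 4996.6, "Light": 861.0, "Cloud": 422.3, "Humidity": 102.7}


def anova_summary(ss_model, ss_total, df_total, n_predictors):
    """Fill in the ANOVA table and R-squared values for a linear model."""
    df_model = n_predictors
    df_error = df_total - df_model
    ss_error = ss_total - ss_model
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "dfM": df_model,
        "dfE": df_error,
        "SSE": ss_error,
        "MSM": ms_model,
        "MSE": ms_error,
        "F": ms_model / ms_error,
        "R2": ss_model / ss_total,
        "R2_adj": 1 - ms_error / (ss_total / df_total),
    }


summary = anova_summary(SS_MODEL, SS_TOTAL, DF_TOTAL, n_predictors=len(type1_ss))

# Extra sum of squares for Light and Cloud given Temp, SSM(2,3|1).
ssm_23_given_1 = type1_ss["Light"] + type1_ss["Cloud"]

# Coefficient of partial determination: SSM(2,3|1) / SSE(1),
# where SSE(1) is the error sum of squares with only Temp in the model.
sse_1 = SS_TOTAL - type1_ss["Temp"]
partial_r2 = ssm_23_given_1 / sse_1

# General linear F statistic for H0: beta3 = beta4 = 0.
# Numerator: SSM(3,4|1,2) / 2; denominator: MSE of the full model.
ssm_34_given_12 = type1_ss["Cloud"] + type1_ss["Humidity"]
partial_f = (ssm_34_given_12 / 2) / summary["MSE"]

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
print(f"SSM(2,3|1): {ssm_23_given_1:.1f}")
print(f"Partial R2: {partial_r2:.4f}")
print(f"Partial F:  {partial_f:.4f} on (2, {summary['dfE']}) df")
```

Under the null hypothesis the last statistic follows an F distribution with 2 numerator degrees of freedom and dfE denominator degrees of freedom; comparing it to a critical value would need an F table or a statistics library, which this sketch leaves out.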
4. Students in an undergraduate class at UC took their one-minute pulses at the start of class. Some time later, they took a second one-minute pulse. A simple linear regression analysis treating the second pulse count as the response variable and the first pulse count as the explanatory variable is carried out.
a. [40] The second pulse counts were actually collected after the class conducted an experiment. Each student tossed a coin. Students whose coins came up heads ran in place for 1 minute; the other students remained seated for the minute. Then the second pulse count was taken immediately after the runners finished running. Below is a plot of the residuals from the simple linear regression model plotted against the first pulse count. The points are marked to indicate which students ran and which did not. What is this residual plot telling you?






