Lab 5: Hypothesis Testing Bonanza!
Stats 10: Introduction to Statistical Reasoning
Spring 2022
All rights reserved, Adam Chaffee, Michael Tsiang, and Maria Cha, 2017-2022.
Do not post, share, or distribute anywhere or with anyone without explicit permission.
Objectives
1. Learn how to conduct hypothesis tests for one proportion
2. Learn how to conduct hypothesis tests for one mean
Collaboration Policy
In Lab you are encouraged to work in pairs or small groups to discuss the concepts on the
assignments. However, DO NOT copy each other’s work as this constitutes cheating. The work
you submit must be entirely your own. If you have a question in lab, feel free to reach out to
other groups or talk to your TA if you get stuck.
Overview: The pnorm() function
The pnorm() function computes probabilities from a normal distribution with specified mean and
standard deviation. The function inputs a value in the q argument and computes the probability
that a value drawn from a normal distribution will be less than or equal to q. The exact normal
distribution to compare with is specified using the mean and sd arguments. The default mean is
mean = 0 and the default standard deviation is sd = 1.
The optional argument lower.tail inputs a logical value (TRUE or FALSE) and changes the
direction of the probability. The default is lower.tail = TRUE, so the pnorm() function will, as
noted above, compute the probability that a value drawn from the normal distribution will be less
than or equal to q. If we set lower.tail = FALSE, then the pnorm() function will compute the
probability that a value drawn from the normal distribution will be greater than or equal to q.
For example, if we want the probability of observing a person with a height of 70 inches or less
from a population that follows a normal distribution with a mean height of 68 inches and a
standard deviation of 1.9 inches:
pnorm(70, mean = 68, sd = 1.9)
If we want the probability of observing a person with a height of 70 inches or more from a
population that follows a normal distribution with a mean height of 68 inches and a standard
deviation of 1.9 inches:
pnorm(70, mean = 68, sd = 1.9, lower.tail = FALSE)
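As a quick check of the two examples above, the sketch below stores both probabilities and confirms that they add up to 1. It also shows that standardizing 70 by hand and relying on the default mean = 0 and sd = 1 gives the same lower-tail probability. The object names (p_lower, p_upper, z, p_lower_z) are chosen here for illustration only.

```r
# Lower-tail probability: P(height <= 70)
p_lower <- pnorm(70, mean = 68, sd = 1.9)

# Upper-tail probability: P(height >= 70)
p_upper <- pnorm(70, mean = 68, sd = 1.9, lower.tail = FALSE)

# The two tails together cover the whole distribution, so this prints 1
print(p_lower + p_upper)

# Standardize by hand, then use the default standard normal (mean = 0, sd = 1)
z <- (70 - 68) / 1.9
p_lower_z <- pnorm(z)

# Both approaches give the same lower-tail probability
print(c(direct = p_lower, standardized = p_lower_z))
```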
Overview: The One-Sample z-Test for proportions
To do a (theory-based) hypothesis test (test of significance) for the proportions, we follow
several steps:
1. State the null and alternative hypotheses about the population proportion, where p0 denotes its hypothesized value under the null.
2. From a sample of data, compute the sample proportion $\hat{p}$ and its standardized z-statistic
$z = (\hat{p} - p_0) / SE_{est}$, where $SE_{est} = \sqrt{p_0 (1 - p_0) / n}$ and n is the sample size.
3. Check the validity conditions to apply the Central Limit Theorem.
4. If the validity conditions hold, compute the p-value by comparing the observed z-statistic to a
standard normal distribution (using normal tables or pnorm()).
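The four steps above can be sketched in R using the coin-flip numbers that appear later in this lab (60 heads in 100 flips, testing whether the long-run proportion is greater than 0.5). This is only an illustration under stated assumptions: the object names are made up here, and the validity check uses a common rule of thumb (at least 10 expected successes and 10 expected failures), which may differ from the condition given in lecture.

```r
# Step 1: H0: p = 0.5 versus Ha: p > 0.5 (one-sided)
x  <- 60    # number of "successes" (heads)
n  <- 100   # sample size
p0 <- 0.5   # hypothesized proportion under the null

# Step 2: sample proportion, standard error under the null, and z-statistic
p_hat  <- x / n
se_est <- sqrt(p0 * (1 - p0) / n)
z      <- (p_hat - p0) / se_est

# Step 3: validity check (ASSUMED rule of thumb; use the condition from lecture)
conditions_ok <- (n * p0 >= 10) && (n * (1 - p0) >= 10)

# Step 4: one-sided p-value from the upper tail of the standard normal
p_value <- pnorm(z, lower.tail = FALSE)

print(c(p_hat = p_hat, se_est = se_est, z = z, p_value = p_value))
```

For a "less than" alternative the lower tail would be used instead, and for a two-sided alternative the one-tail probability beyond |z| would be doubled.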
Exercise 1 – Hypothesis testing with one proportion.
We will be working with a Flint dataset, which can be found under Week 9 on CCLE. Please
download this file and read it into R. A lead level is considered dangerous if the result is
greater than or equal to 8 PPB. We are interested in determining whether the proportion of dangerous
lead levels in Flint is greater than 8%. Assume the ‘flint_sample.csv’ data from CCLE Week 9
contains a random sample (from a population of size 5000) used to address this research
question.
a. We will conduct a hypothesis test for this research question. Using symbols from lecture,
what are the null and alternative hypotheses? Is this a one-sided or a two-sided test?
b. Calculate the sample proportion and sample standard deviation of the sample proportion
of dangerous lead levels.
c. Now, calculate the SE of sample proportions, and the z-value for this test. Consult the
above instructions and/or the lecture slides for guidance.
d. Using the z-statistic in (c), calculate the p-value associated with this test. You may use
R’s pnorm() function or a normal table, but please show all work.
e. Using a significance level of 0.05, do you reject the null hypothesis?
f. If greater than 8% of households in Flint contain dangerous lead levels, the EPA requires
remediation action to be taken. Based on your results, what should you tell the EPA?
g. Another way to run this test is to use the prop.test() function from the mosaic package.
You will need to know your sample size, and the number of “successes” in the sample.
Use this function to conduct the same hypothesis test in (a)-(d) and obtain a p-value from
the test. Using the same significance level of 0.05, do your results change? An example
of the prop.test() function is shown in the two lines below:
## Example: We flipped 100 coins and 60 were heads. Is the long-run proportion of heads greater than 0.5?
prop.test(x = 60, n = 100, p = 0.5, alt = "greater")
h. Notice that the prop.test() output produced a confidence interval. Try using the help
screen under the mosaic package for prop.test() to find the argument for the confidence
interval. Modify the argument and re-run the code in (g) to produce a 99% confidence
interval instead of a 95% interval.
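When comparing a hand-computed z-test with prop.test(), it helps to know that base R's prop.test() applies a continuity correction by default (correct = TRUE), so its p-value need not match the hand calculation exactly. The sketch below uses the coin-flip example with stats::prop.test() called explicitly; it is assumed here, not confirmed by this handout, that the mosaic version behaves the same way when given counts.

```r
# Default call: continuity correction is applied
res_corrected <- stats::prop.test(x = 60, n = 100, p = 0.5,
                                  alternative = "greater")

# Same test with the continuity correction turned off
res_plain <- stats::prop.test(x = 60, n = 100, p = 0.5,
                              alternative = "greater", correct = FALSE)

# Hand-computed z-test for comparison
z_by_hand <- (60 / 100 - 0.5) / sqrt(0.5 * (1 - 0.5) / 100)
p_by_hand <- pnorm(z_by_hand, lower.tail = FALSE)

# Without the correction, prop.test() reports X-squared = z^2
# and its p-value agrees with the hand calculation
print(c(corrected   = res_corrected$p.value,
        uncorrected = res_plain$p.value,
        by_hand     = p_by_hand))
```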
Overview: The One-Sample t-Test for means
To do a (theory-based) hypothesis test (test of significance) for the mean of a quantitative
variable, we follow several steps:
1. State the null and alternative hypotheses about the mean parameter μ.
2. From a sample of data, compute the sample mean $\bar{x}$ and its standardized t-statistic
$t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, where $\mu_0$ is the value of μ stated in the null hypothesis,
s is the sample standard deviation, and n is the sample size.
3. Check the validity conditions to apply the Central Limit Theorem.
4. If the validity conditions hold, compute the p-value by comparing the observed t-statistic to a
t-distribution with df = n − 1.
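The steps above can be sketched in R as follows. The data vector is hypothetical (it is NOT taken from flint_sample.csv), the null value of 50 is used only for illustration, and the object names are made up here. The p-value uses the pt() function, and the result is compared against stats::t.test() as a sanity check.

```r
# Hypothetical measurements, for illustration only
x   <- c(48.2, 51.5, 49.9, 53.1, 47.6, 50.8, 52.4, 49.3)
mu0 <- 50   # value of the mean under the null hypothesis

# Step 2: sample mean, sample standard deviation, and t-statistic
n      <- length(x)
x_bar  <- mean(x)
s      <- sd(x)
t_stat <- (x_bar - mu0) / (s / sqrt(n))

# Step 4: two-sided p-value from a t-distribution with df = n - 1
df      <- n - 1
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# Sanity check against the built-in test
check <- stats::t.test(x, mu = mu0, alternative = "two.sided")
print(c(t_by_hand = t_stat, t_builtin = unname(check$statistic)))
print(c(p_by_hand = p_value, p_builtin = check$p.value))
```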
Overview: The pt() Function
The pt() function computes probabilities from a t-distribution with specified degrees of freedom.
The syntax for pt() is very similar to the pnorm() function. The pt() function inputs a value in the
q argument and computes the probability that a value drawn from a t-distribution will be less
than or equal to q. The exact t-distribution to compare with is specified using the df argument (df
stands for degrees of freedom). There is no default value for the df, so you must input the df
argument for the command to work.
The optional argument lower.tail inputs a logical value (TRUE or FALSE) and changes the
direction of the probability. The default is lower.tail = TRUE, so the pt() function will, as noted
above, compute the probability that a value drawn from the t-distribution will be less than or
equal to q. If we set lower.tail = FALSE, then the pt() function will compute the probability that
a value drawn from the t-distribution will be greater than or equal to q.
For example, if we want the probability of observing a value of –1.5 or less from statistics that
follow a t-distribution with 29 degrees of freedom:
pt(-1.5, df = 29)
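Because the t-distribution is symmetric around 0, the lower tail below -1.5 equals the upper tail above 1.5, which is why a two-sided p-value can be found by doubling one tail. The sketch below illustrates this with the same 29 degrees of freedom and compares the result with pnorm() to show that the t-distribution puts slightly more probability in its tails. The object names are chosen here for illustration only.

```r
# Lower tail: P(T <= -1.5) with 29 degrees of freedom
p_lower <- pt(-1.5, df = 29)

# Upper tail: P(T >= 1.5), equal to the lower tail by symmetry
p_upper <- pt(1.5, df = 29, lower.tail = FALSE)

# Two-sided probability: P(T <= -1.5) + P(T >= 1.5)
p_two_sided <- 2 * pt(-abs(-1.5), df = 29)

# Same lower-tail probability under the standard normal, for comparison
p_normal <- pnorm(-1.5)

print(c(lower = p_lower, upper = p_upper,
        two_sided = p_two_sided, normal_lower = p_normal))
```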
Exercise 2 – Hypothesis testing with means
Copper is another metal that can be dangerous in high quantities. We believe the average
copper level in the state of Michigan’s drinking water is 50 PPM. We are interested in finding out whether
the copper level in Flint’s water differs from the Michigan average. Again, assume the
flint_sample data constitutes a random sample from a population of size 5000.
a. We will conduct a hypothesis test for this research question. What are the null and
alternative hypotheses? Is this a one-sided or a two-sided test?
b. Calculate the sample mean and sample standard deviation of the sample copper levels in
Flint.
c. Calculate the standard error of the sample mean for copper, $\bar{x}$.
d. Using the values in (b) and (c), calculate the t-test statistic and p-value associated with
this test. Use pt() to obtain the p-value. Hint: is this a two-sided test?
e. Using a significance level of 0.01, do you reject the null hypothesis? Interpret this result
in the context of our research question.
f. Another way to run this test is to use the t.test() function from the mosaic package. Use
this function to conduct the same hypothesis test in (a)-(d) and obtain a p-value from the
test, again using a significance level of 0.05. Do your results change? An example of the
t.test() function is shown in the two lines below:
## Using sample Flint lead data, do we believe the long run Flint lead average is not equal to 3?
t.test(flint$Pb, mu = 3, alt = "two.sided")
