Each of the three measures of central tendency (the mean, the median, and the mode) is more appropriate for certain populations than others

Statistics

1. Each of the three measures of central tendency (the mean, the median, and the mode) is more appropriate for certain populations than others. For each type of measure, give two additional examples of populations where it would be the most appropriate indication of central tendency.

2. Find the mean, median, and mode of the following data set:
5 15 9 22 67 42 2 72 81 53 6 70 41 9 42 23

3. Sometimes, we can take a weighted approach to calculating the mean. Take our example of high temperatures in July.

Suppose it was 98°F on 7 days, 96°F on 14 days, 88°F on 1 day, 100°F on 6 days and 102°F on 3 days.
Rather than adding up 31 numbers, we can find the mean by doing the following:
Mean = ( 1 x 88 + 14 x 96 + 7 x 98 + 6 x 100 + 3 x 102) / 31
...where 1, 14, 7, 6, and 3 are the weights or frequency of a particular temperature's occurrence.
Then we divide by the total number of occurrences.

Suppose we are tracking the number of home runs hit by the Boston Red Sox during the month of August:

Number of Games    HRs Hit Each Day
2                  3
5                  2
6                  1
7                  0

Using the weighted approach, calculate the average number of home runs per game hit by the Sox.

4. When a pair of dice is rolled, the total will range from 2 (1,1) to 12 (6,6). It is a fact that some numbers will occur more frequently than others as the dice are rolled over and over.

A. Why will some numbers come up more frequently than others?

B. Each die has six sides numbered from 1 to 6. How many possible ways can a number be rolled? In other words, we can roll (2,3) or (3,2) or (6,1) and so on. What are the total (x,y) outcomes that can occur?

C. How might you then estimate the percentage of the time a particular number will come up if the dice are rolled over and over?

D. Once these percentages have been calculated, how might the mean value of all the numbers thrown be determined?

E. If you have completed part 2 already, you have an idea of what a population distribution is.

There is a very famous distribution that describes the frequency of the number of times a number comes up in a series of dice rolls.

Use the Internet to see if you can find its name.

Part 2

By far the most frequently used of these measures is the mean of a population.
Remember that the data you want to analyze always come from what is called a population.

If you are interested in the average high temperature in your area for the month of July, then your population would be the 31 daily high temperatures in July, and the mean would be the total of these temperatures divided by 31.

Now, suppose you calculate a mean of a population and you want to know how representative that mean is of a random data point in that population.

In other words, is the data bunched tightly around the mean, or is it more loosely distributed over the possible range of values?

An example would be high temperatures in July versus high temperatures in April or October.

In general, the highs in April and October will vary more widely from the means in those months than the highs in July.

In summary, the mean alone does not adequately describe a population; there must also be some way to measure the dispersion, or distribution, of the data around the mean.

Problem -

1. Use the Internet to research the definition of what is called the distribution of a data population.

2. Also, find the statistic that measures the width of dispersion ("looseness" or "tightness") of the population data about its mean.

3. Give an example of the type of situation where this statistic might be critical to making good decisions about the population under study.

Answer Preview

BASIC STATISTICS

1. Each of the three measures of central tendency (the mean, the median, and the mode) is more appropriate for certain populations than others.

For each type of measure, give two additional examples of populations where it would be the most appropriate indication of central tendency.

The mean (average) of a population is most appropriate when the data are normal or symmetrically distributed, and there are no outliers. When there are outliers, the mean might be inappropriately high or low. You could use the mean when measuring things like height or weight, as long as there aren't any outliers.

The median (middle value) of a population is most appropriate when the data is not symmetrical or there are outliers. Outliers don't affect the value of the median as much as they affect the value of the mean. Often, you use the median for things like income or temperature, which tend to have outlying values.

The mode (the value that occurs most frequently) is used less often than the mean or the median. It is appropriate for categorical data, such as asking consumers which product they liked best: you can't average the results (out of products 1, 2, and 3, there is no product 1.5), but it is useful to say that people chose product 3 more often than they chose the others. It is also useful when one value shows up far more often than the others in a data set.

2. Find the mean, median, and mode of the following data set:
5 15 9 22 67 42 2 72 81 53 6 70 41 9 42 23

First put the data in order:
2 5 6 9 9 15 22 23 41 42 42 53 67 70 72 81

The average is 34.9375:

(2 + 5 + 6 + 9 + 9 + 15 + 22 + 23 + 41 + 42 + 42 + 53 + 67 + 70 + 72 + 81)/16 = 559/16 = 34.9375

The median is 32 (take the average of the two middle numbers):

2 5 6 9 9 15 22 23 41 42 42 53 67 70 72 81

(23 + 41)/2 = 32

The modes are 9 and 42 (they both show up twice):

2 5 6 9 9 15 22 23 41 42 42 53 67 70 72 81
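
The three results above can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module (`multimode` needs Python 3.8 or later); the variable names are illustrative.

```python
import statistics

# Data set from question 2, in the order given.
data = [5, 15, 9, 22, 67, 42, 2, 72, 81, 53, 6, 70, 41, 9, 42, 23]

mean = statistics.mean(data)        # sum of the values divided by their count
median = statistics.median(data)    # average of the two middle values (even count)
modes = statistics.multimode(data)  # every value tied for the highest frequency

print(mean, median, modes)
```

`multimode` is used instead of `mode` because this data set has two modes, and a function that returns a single value would hide one of them.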

3. Sometimes, we can take a weighted approach to calculating the mean. Take our example of high temperatures in July.

Suppose it was 98°F on 7 days, 96°F on 14 days, 88°F on 1 day, 100°F on 6 days and 102°F on 3 days.

Rather than adding up 31 numbers, we can find the mean by doing the following:
Mean = ( 1 x 88 + 14 x 96 + 7 x 98 + 6 x 100 + 3 x 102) / 31
...where 1, 14, 7, 6, and 3 are the weights or frequency of a particular temperature's occurrence.

Then we divide by the total number of occurrences.

Suppose we are tracking the number of home runs hit by the Boston Red Sox during the month of August:

Number of Games    HRs Hit Each Day
2                  3
5                  2
6                  1
7                  0

Using the weighted approach, calculate the average number of home runs per game hit by the Sox.

(2*3 + 5*2 + 6*1 + 7*0)/(2 + 5 + 6 + 7)
= (6 + 10 + 6 + 0)/20
= 22/20
= 1.1 home runs per game
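
The weighted approach can be sketched in a few lines of Python. The helper name `weighted_mean` is an assumption made for illustration; it takes (weight, value) pairs, where the weight is the number of days or games on which the value occurred.

```python
def weighted_mean(pairs):
    """Return the weighted mean of (weight, value) pairs."""
    total_weight = sum(weight for weight, _ in pairs)
    weighted_sum = sum(weight * value for weight, value in pairs)
    return weighted_sum / total_weight

# July high temperatures: (number of days, temperature in °F).
july_highs = [(1, 88), (14, 96), (7, 98), (6, 100), (3, 102)]

# Red Sox home runs in August: (number of games, home runs hit in each of those games).
home_runs = [(2, 3), (5, 2), (6, 1), (7, 0)]

print(round(weighted_mean(july_highs), 2))
print(weighted_mean(home_runs))
```

Each pair adds weight times value to the numerator and its weight to the denominator, so the 31 days or 20 games never have to be listed one by one.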

4. When a pair of dice is rolled, the total will range from 2 (1,1) to 12 (6,6). It is a fact that some numbers will occur more frequently than others as the dice are rolled over and over.

A. Why will some numbers come up more frequently than others?

Some numbers can be made in more ways than others. You can only roll a two in one way (1 and 1), but you can roll a six in multiple ways (1 and 5, 2 and 4, 3 and 3, 4 and 2, 5 and 1).

B. Each die has six sides numbered from 1 to 6. How many possible ways can a number be rolled? In other words, we can roll (2,3) or (3,2) or (6,1) and so on. What are the total (x,y) outcomes that can occur?

The first die can have 6 possibilities; same for the second die. Therefore there are (6)(6) = 36 total outcomes.

C. How might you then estimate the percentage of the time a particular number will come up if the dice are rolled over and over?

You would look at how many ways a number could be made rolling the dice, then divide that number by 36. For example, in part A we saw that a 2 can only be rolled in one way, so the probability of rolling a 2 is 1/36 = 0.0278. Same for 12, because it can only be made by rolling two 6's. But, a 6 can be rolled in five ways, so the probability of rolling a 6 is 5/36 = 0.1389.
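
One way to check these counts is to enumerate all 36 outcomes. The sketch below uses only the Python standard library; the names `ways` and `probability` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 ordered (x, y) outcomes produce each total.
ways = Counter(x + y for x, y in product(range(1, 7), repeat=2))

# Probability of each total = number of ways / 36, kept as an exact fraction.
probability = {total: Fraction(count, 36) for total, count in ways.items()}

for total in sorted(probability):
    print(total, ways[total], f"{float(probability[total]):.4f}")
```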

D. Once these percentages have been calculated, how might the mean value of all the numbers thrown be determined?

You use a version of the weighted average approach: you add together the values multiplied by their respective probabilities.

(2)(0.0278) + ... + (6)(0.1389) + ... + (12)(0.0278)

Filling in all eleven terms gives a mean of 7.
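
The full sum, including the terms elided above, can be carried out exactly. This sketch uses exact fractions rather than rounded decimals; `expected_total` is an illustrative name.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Number of ways each total from 2 to 12 can be rolled with two dice.
ways = Counter(x + y for x, y in product(range(1, 7), repeat=2))

# Weighted average: each total multiplied by its probability (ways / 36).
expected_total = sum(total * Fraction(count, 36) for total, count in ways.items())

print(expected_total)  # prints 7
```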

E. If you have completed part 2 already, you have an idea of what a population distribution is.

There is a very famous distribution that describes the frequency of the number of times a number comes up in a series of dice rolls.

Use the Internet to see if you can find its name.

It is the binomial distribution. It is built from Bernoulli trials, which are single trials with only two outcomes, "success" and "failure". The binomial distribution describes the probabilities of a "success" (for example, we could call rolling a total of 6 a success) happening different numbers of times in a fixed number of independent trials, given the probability of a success on a single trial. You would use it to answer the question: if you roll a pair of dice 100 times, what is the probability of rolling a 6 only once? Twice? 50 times?
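
The probabilities in that question can be computed directly. This is a minimal sketch, assuming a "success" means the two dice total 6 (probability 5/36, from part C) and that the rolls are independent; `binomial_pmf` is an illustrative helper, not a library function.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n = 100      # number of rolls of the pair of dice
p = 5 / 36   # probability that a single roll totals 6

for k in (1, 2, 50):
    print(k, binomial_pmf(k, n, p))
```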

You can read about it at these websites:

http://en.wikipedia.org/wiki/Bernoulli_trial
http://www.ds.unifi.it/VL/VL_EN/bernoulli/bernoulli2.html
http://www.mala.bc.ca/~johnstoi/maybe/maybe1.htm

Part 2

By far the most frequently used of these measures is the mean of a population.
Remember that the data you want to analyze always come from what is called a population.

If you are interested in the average high temperature in your area for the month of July, then your population would be the 31 daily high temperatures in July, and the mean would be the total of these temperatures divided by 31.

Now, suppose you calculate a mean of a population and you want to know how representative that mean is of a random data point in that population.

In other words, is the data bunched tightly around the mean, or is it more loosely distributed over the possible range of values?

An example would be high temperatures in July versus high temperatures in April or October.

In general, the highs in April and October will vary more widely from the means in those months than the highs in July.

In summary, the mean alone does not adequately describe a population; there must also be some way to measure the dispersion, or distribution, of the data around the mean.

Problem -

1. Use the Internet to research the definition of what is called the distribution of a data population.

The distribution is a function that describes the probabilities associated with each value. Look at the dice example from above. Each x value (2 through 12) has an associated probability (I only calculated the probabilities for 2, 6, and 12, but every value from 2 to 12 has a probability).

All distribution functions must satisfy two criteria: (1) each probability associated with each x value must be between 0 and 1, and (2) if you add all of the possible probabilities together, they add up to 1. For a continuous distribution, the equivalent of the second criterion is that the area under the graph of the function is 1.
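
Both criteria can be checked mechanically for the dice example. This is a sketch, with `is_valid_distribution` as a hypothetical helper name; it uses exact fractions so the sum can be compared to 1 without rounding error.

```python
from fractions import Fraction
from itertools import product

def is_valid_distribution(probabilities):
    """Check that every probability is in [0, 1] and that they sum to 1."""
    values = list(probabilities.values())
    return all(0 <= p <= 1 for p in values) and sum(values) == 1

# Build the distribution of the total of two dice from the 36 equally likely outcomes.
dice_total = {}
for x, y in product(range(1, 7), repeat=2):
    dice_total[x + y] = dice_total.get(x + y, 0) + Fraction(1, 36)

print(is_valid_distribution(dice_total))
```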

One of the most common distributions is the normal distribution. If you look at a graph of it, it is symmetrical and is high in the middle and low on either side.

You can read about distributions at these websites:

http://www.cas.lancs.ac.uk/glossary_v1.1/basicdef.html
http://en.wikipedia.org/wiki/Probability_distribution

2. Also, find the statistic that measures the width of dispersion ("looseness" or "tightness") of the population data about its mean.

The statistic that measures this is the standard deviation. It is calculated as:

Standard deviation = sqrt( [ (x1 - mean)^2 + (x2 - mean)^2 + ... + (xN - mean)^2 ] / N )

In words: subtract the mean from each value, square each difference, add the squares together, divide by the number of values N, then take the square root. The larger the standard deviation, the more spread out the data is.

If you square the standard deviation, you get a statistic called the variance. It is also used to measure the variability around the mean.
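
The calculation can be written out step by step and compared with Python's standard library. This is a sketch that treats the data set from question 2 as a whole population (dividing by the number of values); `population_std` is an illustrative helper name.

```python
import math
import statistics

def population_std(values):
    """Population standard deviation: square root of the mean squared deviation."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    variance = sum(squared_deviations) / len(values)
    return math.sqrt(variance)

# Data set from question 2.
data = [5, 15, 9, 22, 67, 42, 2, 72, 81, 53, 6, 70, 41, 9, 42, 23]

sigma = population_std(data)
print(sigma, sigma ** 2)          # standard deviation and variance
print(statistics.pstdev(data))    # standard-library equivalent
```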

3. Give an example of the type of situation where this statistic might be critical to making good decisions about the population under study.

The larger the standard deviation is, the more spread out the data. So, if the standard deviation is large, there will be lots of data points that aren't near the mean. That means the mean might not be a good representation of a typical data point. Outliers can make the standard deviation larger than it would be otherwise, so you have to check whether there are outliers in your data before using the mean and standard deviation.

You use the standard deviation a lot when doing statistical tests in order to see what the likelihood of an occurrence is.

Say that you have a population with a mean of 0, and you pick an object from that population with a value of 10. Now, if the population has a small standard deviation, with all the values clustered tightly around the mean, there would be an extremely small probability of picking something with a value of 10. On the other hand, if the population had a large standard deviation, it wouldn't be so rare to pick something with a value of 10.
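
To put rough numbers on this, the sketch below assumes the population is normally distributed, which the example above does not state. The standard deviations 2 and 10 are made-up values chosen only for illustration, and `probability_at_least` is a hypothetical helper name.

```python
from statistics import NormalDist

def probability_at_least(value, mean, std_dev):
    """P(X >= value) for a normally distributed population (an assumption)."""
    return 1 - NormalDist(mu=mean, sigma=std_dev).cdf(value)

# Same mean and same observed value, but two different spreads.
tight = probability_at_least(10, mean=0, std_dev=2)    # 10 is 5 standard deviations out
loose = probability_at_least(10, mean=0, std_dev=10)   # 10 is 1 standard deviation out

print(f"small spread: {tight:.2e}")
print(f"large spread: {loose:.4f}")
```

Under these assumptions, a value of 10 or more is a less than one-in-a-million event with the small spread, but turns up about 16% of the time with the large spread.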