Econ 221 – Midterm Project Fall 2020

The midterm project will require you to examine and model the distribution of sick days (the number of days an employee does not go to work in a year) in a population with N=100. The data is given in the Excel file “midterm project – data.xls”, with one column designating an index, and the second designating the number of sick days, called the variable x.

The midterm project is due on Friday, Oct 23rd at 4:30pm by submission to the Learn Dropbox. Please submit Questions 1-3 in an Excel file with clearly stated reasoning and results. Submit Question 4 as a separate text file (Word or PDF).

Question 1. (15 marks) Open the data file, “midterm project – data.xls”, perform the following tasks, and explain your answers in Excel.
i) (2 marks) Show using column-based calculations and sums that the average and variance of sick days in the population are 19.67 and 20.7611 respectively.
ii) (3 marks) Find the quartiles of the population data.
iii) (1 mark) If you sampled n=2 from the population of N=100 without replacement, how many distinct samples would be possible?
iv) (3 marks) Assuming samples with n=2 drawn from the population without replacement, make a table that illustrates all the possible values of the sample mean with the first individual’s value recorded vertically, and the second individual’s value recorded horizontally.
v) (2 marks) Use your table in iv) to determine how many possible values could arise for the sample mean when n=2.
vi) (3 marks) Assuming samples with n=2 drawn from the population without replacement, in how many samples would the sample mean be exactly 13?
vii) (1 mark) What is the probability of the event in vi)?

Question 2. (15 marks) In this question you will construct a discrete probability distribution from the relative frequency of values in the population presented in the dataset.
Continue in Excel using “midterm project – data.xls”, but clearly mark off the work for Question 2 from that for Question 1.
i) (3 marks) Divide the range of values in the population data into 10 equal-length bins. Construct a table that shows the relative frequency of individuals falling in each bin interval.
ii) (2 marks) Use your work in i) to construct a histogram with 10 equal-width bins to illustrate the relative frequency of values.
iii) (2 marks) Use the midpoint of each bin in i) to define the values of a random variable Y and the relative frequency of individuals falling within each bin to define the probability of a y-event, P(Y = y).
iv) (2 marks) Use the probability distribution in iii) to find P(Y < 16).
v) (2 marks) Find the expected value and variance of the discrete random variable defined in iii).
vi) (2 marks) Suppose you had two random variables Y1 and Y2 that have the probability distribution in iii). Assume these random variables are statistically independent. What is the probability P(Y1 = 11.95 and Y2 = 29.05)?
vii) (2 marks) In studying a real-world problem, why might the random variables Y1 and Y2 be correlated, rather than statistically independent? How might the number of sick days of two individuals be related?

Question 3. (10 marks) In this question you will construct a normal random variable by using the average and variance of the population we are studying. Thus, assume the random variable for sick days is X ~ N(19.67, 20.7611).
i) (2 marks) What is the probability that X takes a value less than 16, P(X < 16)?
ii) (3 marks) Find the quartiles of the distribution N(19.67, 20.7611).
iii) (3 marks) Suppose you have two statistically independent random variables X1 and X2 that both follow the N(19.67, 20.7611) distribution. Form a new random variable W from the average of the two random variables: W = (X1 + X2) / 2. What are the expected value and variance of the random variable W? What distribution does W have?
iv) (1 mark) What is the probability P(W = 14)?
v) (1 mark) What is the probability P(W < 14)?

Question 4. (10 marks) In a short paragraph (150-200 words) discuss the costs and benefits of modelling the empirical distribution given by the data set and described in Question 1 using:
a) the discrete distribution you have examined in Question 2;
b) the continuous distribution you have examined in Question 3.
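
The normal-distribution calculations in Question 3 can be cross-checked outside Excel. The following is a minimal sketch in Python, not the required submission format. It writes X for the N(19.67, 20.7611) sick-day variable and W for the average of two independent copies, and it assumes 20.7611 is the variance (as Question 1 states), so the standard deviation is its square root. The helper names `quartiles` and `mean_of_independent` are hypothetical, introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Population mean and variance as stated in the assignment.
MU = 19.67
VAR = 20.7611

# NormalDist is parameterised by the standard deviation, not the variance.
X = NormalDist(mu=MU, sigma=sqrt(VAR))


def quartiles(dist):
    """Return the 25th, 50th and 75th percentiles of a normal distribution."""
    return tuple(dist.inv_cdf(p) for p in (0.25, 0.50, 0.75))


def mean_of_independent(dist, n):
    """Distribution of the average of n independent copies of dist.

    The mean is unchanged and the variance is divided by n,
    so the standard deviation is divided by sqrt(n).
    """
    return NormalDist(mu=dist.mean, sigma=dist.stdev / sqrt(n))


# W = (X1 + X2) / 2 for two independent variables with the distribution of X.
W = mean_of_independent(X, 2)

if __name__ == "__main__":
    print("P(X < 16)      =", X.cdf(16))
    print("Quartiles of X =", quartiles(X))
    print("E[W], Var[W]   =", W.mean, W.variance)
    # For any continuous distribution, the probability of a single point is 0,
    # so only the cumulative probability is computed here.
    print("P(W < 14)      =", W.cdf(14))
```

In Excel, the corresponding built-in functions are `NORM.DIST` (with the cumulative flag set to TRUE) and `NORM.INV`, both of which also take the standard deviation rather than the variance.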