MAS8403 Assignment

 

This work should be submitted to Canvas before 23:45 on Sunday 24th October 2021.

You should upload 2 files: a PDF file containing your answers to the questions, and a .R file containing annotated R code which can be used to reproduce your answers. Marks will be deducted for missing or inadequate annotation.

 

  1. The Poisson Process

The ability to forecast patient arrivals can be very useful for hospital emergency departments in decision-making processes such as staffing the unit.

In the first part of this practical you will consider whether the Poisson process is a good model for patient arrivals in a particular hospital emergency department. For this emergency department, the arrival times of patients, in hours after 9am, have been recorded over a 12-hour period. The data can be read into R as follows:

url = "http://www.mas.ncl.ac.uk/~nseg4/teaching/MAS8380/practical2.dat"
arr.times = scan(url)

Part a:

Examine the arrivals process graphically by plotting each arrival time against its position in the sequence, i.e. plot i vs ti where ti is the arrival time of the i-th patient. In a Poisson process, events occur randomly in time at a constant rate, say λ. Does this seem compatible with your plot?

[5 marks]
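A minimal sketch of the kind of plot Part a describes, under stated assumptions. The vector `arr.times` here is stand-in data (sorted uniform times on a 12-hour window), not the hospital data set, and the dashed reference line is an optional addition. When the rate is constant, the cumulative number of arrivals should grow roughly linearly in time.

```r
# Stand-in data (assumption): 60 sorted uniform times on [0, 12] hours.
# For the real data use: arr.times = scan(url)
set.seed(1)
L = 12
arr.times = sort(runif(60, min = 0, max = L))

# Position of each arrival in the sequence: i = 1, ..., n
i = seq_along(arr.times)

# Arrival time t_i on the horizontal axis, position i on the vertical axis
plot(arr.times, i, type = "s",
     xlab = "Arrival time (hours after 9am)",
     ylab = "Arrival number i")

# Dashed reference line through the origin with slope n / L
abline(a = 0, b = length(arr.times) / L, lty = 2)
```

Systematic curvature away from the reference line would suggest that the arrival rate changes over the observation period.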

If the patients are arriving according to a Poisson process with rate λ then in an interval of length ℓ the number of arrivals will be a random variable X with

X ∼ Po(λℓ).

Note that E[X] = λℓ and Var(X) = λℓ. Suppose that the period over which arrivals are observed has length L and we split it into k equal-length intervals (so each interval has length L/k). If we count the number of arrivals in each of the k intervals, this gives rise to random variables X1,X2,...,Xk. If the arrivals occur according to a Poisson process, the Xi will be independent with Xi ∼ Po(λL/k) for each i = 1,...,k. If x1,...,xk are the observed counts, then we should have

mean{x1,...,xk} ≈ λL/k and variance{x1,...,xk} ≈ λL/k

so

mean{x1,...,xk} ≈ variance{x1,...,xk}.

You can count the number of arrivals in an interval of length L/k using the table and cut commands.

x = table(cut(arr.times, breaks=k))

The vector x will then contain the observations x1,...,xk.
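As a rough illustration of the counting step, the sketch below applies `table` and `cut` to stand-in data (an assumption: 60 sorted uniform times on a 12-hour window, not the hospital data) and compares the mean and variance of the counts for a single value of k. Note that `cut` with a single number for `breaks` divides the range of the observed data, not the interval [0, L]; the second call shows one way to fix the endpoints explicitly.

```r
# Stand-in data (assumption); replace with the real arr.times
set.seed(1)
L = 12
arr.times = sort(runif(60, min = 0, max = L))

k = 10

# k equal-width bins spanning the range of the observed times
x = table(cut(arr.times, breaks = k))
counts = as.vector(x)

# Under a Poisson process these two numbers should be similar
c(mean = mean(counts), variance = var(counts))

# Alternative: k bins of length L/k covering the whole period [0, L]
x.fixed = table(cut(arr.times,
                    breaks = seq(0, L, length.out = k + 1),
                    include.lowest = TRUE))
counts.fixed = as.vector(x.fixed)
```

Which binning to use is a judgment call; the two differ only in how the first and last intervals are placed.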

Part b:

For the patient arrivals data set, count the number of arrivals in k = 10 intervals of equal length. Record the mean and variance of the counts. Repeat this for k = 25,50,75,100,150. Prepare a graph of mean versus variance for these different values of k. Does this plot support the idea that the arrivals follow a Poisson process?

Provide an estimate of the rate parameter λ and explain the method you used to obtain it.

[12 marks]

A second emergency department believes it can model patient arrivals using a Poisson process with rate 10 arrivals per hour. It wants to simulate a sample of patient arrivals over a 12 hour period.

Part c:

You need to simulate a collection of points from a Poisson process with rate 10 arrivals per hour. As we saw in lectures, the times between successive arrivals (the inter-arrival times) in a Poisson process with rate λ are independent Exp(λ) random variables. You can therefore simulate points from a Poisson process as follows.

  1. Simulate a series of random quantities from an Exponential(λ) distribution (use rexp in R) where λ = 10, to act as the inter-arrival times (the first value representing the time until the first arrival, the second value representing the time between the first and second arrivals and so on)
  2. Take the cumulative sum of these inter-arrival times (e.g. using cumsum in R) to obtain the arrival time of each individual since the beginning of the experiment (so the first element of this new vector will be the first arrival time, the second element will be the first two inter-arrival times added together, the third element will be the first 3 times added together, and so on)
  3. Extract from this vector, all arrival times which were within L = 12 hours from the beginning of the experiment
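
The three steps above can be sketched as follows. The number of inter-arrival times drawn, `n.sim`, is an assumption: it is chosen to be well above the expected number of arrivals λL = 120 so that the simulated times run past the end of the window, and the code checks that this happened.

```r
set.seed(42)
lambda = 10   # arrivals per hour
L = 12        # length of the observation period in hours

# Step 1: inter-arrival times from Exp(lambda).
# n.sim is an assumed, deliberately generous number of draws.
n.sim = 3 * lambda * L
inter = rexp(n.sim, rate = lambda)

# Step 2: cumulative sums give the arrival times
arrivals = cumsum(inter)

# Check that the simulated times extend beyond the window
stopifnot(max(arrivals) > L)

# Step 3: keep only arrivals within the first L hours
arrivals = arrivals[arrivals <= L]

length(arrivals)   # number of simulated arrivals in [0, L]
```

If the check fails, increasing `n.sim` and rerunning is the simplest remedy.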

Produce a plot of the arrival times like the one you produced in Part a.

Verify graphically that your generated arrivals follow a Poisson process with rate λ = 10.

[8 marks]

Part d:

Now consider the case where the times between arrivals follow a Uniform(0,0.1) distribution. Repeat the procedure from Part c to simulate arrival times for this new process. By reproducing the plot from Part b, verify whether these new event times constitute a Poisson Process.

[5 marks]

  2. Maximum Likelihood

The number of text messages I receive per hour follows a Poisson distribution with rate θ. The probability I receive x texts per hour is given by the probability mass function,

Pr(X = x) = θ^x e^(−θ) / x!,   x = 0, 1, 2, ...

Observing how many texts I receive in each hour, over a period of n hours gives a sample of observations x1,x2,...,xn.
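
For reference, here is a sketch of the log-likelihood this sample gives rise to. It assumes the standard Poisson mass function Pr(X = x) = θ^x e^(−θ) / x! and that the hourly counts are independent; the symbol \mathcal{L} for the likelihood is notation introduced here.

```latex
\log \mathcal{L}(\theta)
  = \sum_{i=1}^{n} \log\!\left( \frac{\theta^{x_i} e^{-\theta}}{x_i!} \right)
  = \left( \sum_{i=1}^{n} x_i \right) \log\theta \;-\; n\theta \;-\; \sum_{i=1}^{n} \log(x_i!)
```

The last term does not depend on θ, so it shifts the curve vertically without changing where the maximum occurs.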

Part a:

Write an R function which generates the log-likelihood value for a sample of size n from a Poisson distribution, with a given value of θ. The function should take the sample and a value of θ as inputs, and return the corresponding log-likelihood value.

[5 marks]
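
One possible shape for such a function is sketched below. The name `pois.loglik` and the toy sample are hypothetical, and using `dpois` with `log = TRUE` is one choice among several; the explicit formula could be coded directly instead.

```r
# Hypothetical function name; x is the sample, theta the Poisson rate
pois.loglik = function(x, theta) {
  # dpois(..., log = TRUE) returns log Pr(X = x_i) for each observation;
  # summing gives the log-likelihood of an independent sample
  sum(dpois(x, lambda = theta, log = TRUE))
}

# Toy sample (illustrative only, not the assignment data)
x.toy = c(2, 0, 3, 1)
pois.loglik(x.toy, theta = 1.5)

# The same value from the explicit formula
sum(x.toy) * log(1.5) - length(x.toy) * 1.5 - sum(lfactorial(x.toy))
```

Working on the log scale avoids the numerical underflow that multiplying many small probabilities can cause.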

Part b:

Suppose we observe the following sample of size n = 10:

12  8  14  8  11  6  13  9  9  10

Use your function from Part a to produce a plot of log-likelihood against θ for this sample over a range of θ values. From your plot, what would be a sensible estimate θ̂ of θ?

[5 marks]

Part c:

Modify your function from Part a so that it takes only the sample of data as an input and returns the value of θ which maximises the log-likelihood for that sample (e.g. by using optimise in R). Using the sample from Part b, does the estimate θ̂ returned by this function agree with the estimate you obtained by inspecting the log-likelihood plot?

[5 marks]
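
A hedged sketch of how `optimise` might be used here. The function name `pois.mle` and the search interval (0 to one more than the largest observation) are assumptions; any interval that contains the maximiser would do.

```r
# Hypothetical function name; returns the theta maximising the log-likelihood
pois.mle = function(x) {
  loglik = function(theta) sum(dpois(x, lambda = theta, log = TRUE))
  # Assumed search interval: the maximiser cannot exceed the largest observation
  upper = max(x) + 1
  optimise(loglik, interval = c(0, upper), maximum = TRUE)$maximum
}

# Sample from Part b
x = c(12, 8, 14, 8, 11, 6, 13, 9, 9, 10)
theta.hat = pois.mle(x)
theta.hat

# For a Poisson sample the maximum likelihood estimate is the sample mean,
# so the numerical answer should be close to this value
mean(x)
```

The agreement is only up to the tolerance of `optimise`, which can be tightened through its `tol` argument.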

Part d:

Write an additional function which takes input n, and generates N = 1000 samples of size n from a Po(θ = 10) distribution and computes the maximum likelihood estimate of θ for each sample. Your function should then return the mean and variance of the N = 1000 maximum likelihood estimates.

[5 marks]
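
A minimal sketch of such a simulation function. The name `mle.sim` and its extra arguments `N` and `theta` are hypothetical. Note the swapped-in shortcut: instead of calling a numerical optimiser for every sample, this version uses the fact that the Poisson maximum likelihood estimate equals the sample mean.

```r
# Hypothetical function: mean and variance of N maximum likelihood estimates,
# each computed from a sample of size n drawn from Po(theta)
mle.sim = function(n, N = 1000, theta = 10) {
  # Shortcut (assumption): the Poisson MLE is the sample mean, so no
  # numerical optimisation is run inside the loop
  est = replicate(N, mean(rpois(n, lambda = theta)))
  c(mean = mean(est), variance = var(est))
}

set.seed(123)
res = mle.sim(25)
res
```

Replacing `mean(...)` inside `replicate` with a call to a numerical maximiser would follow the assignment wording more literally, at the cost of speed.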

Part e:

Call your function from Part d for values of n = 5,10,25,50,100,1000 and record the resulting means and variances either in a table or graphically. Based on this output, does your maximum likelihood estimator appear to be unbiased and/or consistent?

[5 marks]
