**BF550: Fall 2021 Problem Set 4**

Submit this assignment as a Python notebook that shows the code and the final plots.

**Problem 1**

Let’s simulate mutations during PCR. Assume you are amplifying a sequence of $L = 100$ base pairs, starting from a single sequence. In each PCR cycle the number of molecules doubles, and the entire amplification consists of $n = 13$ cycles. At each duplication event, every base pair is copied correctly with probability $1 - \mu$ or replaced by a different nucleotide with probability $\mu$; each of the three possible substitutions occurs with equal probability $\mu/3$. In this problem, we will explore three values of $\mu$: $10^{-4}$, $10^{-3}$, and $10^{-2}$. For simplicity, we assume that only the new strand can be mutated; the parent strand maintains its original state after duplication.
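
The duplication rule described above can be sketched in plain Python. This is only one possible way to set up the simulation, not a reference solution: the function names `copy_with_mutations` and `simulate_pcr`, the all-`A` starting sequence, and the fixed random seed are illustrative assumptions that do not come from the assignment.

```python
import random
from collections import Counter

BASES = "ACGT"


def copy_with_mutations(template, mu, rng):
    """Return a copy of `template` in which each base is substituted with probability mu."""
    new = list(template)
    for i, base in enumerate(template):
        if rng.random() < mu:
            # Pick one of the three other nucleotides uniformly (mu/3 each).
            new[i] = rng.choice([b for b in BASES if b != base])
    return "".join(new)


def simulate_pcr(L=100, n_cycles=13, mu=1e-3, seed=0):
    """Simulate PCR from one molecule; return a Counter mapping sequence -> copy number."""
    rng = random.Random(seed)
    pool = ["A" * L]  # arbitrary starting sequence (assumption)
    for _ in range(n_cycles):
        # Parent strands stay unchanged; each yields one possibly mutated daughter.
        pool += [copy_with_mutations(seq, mu, rng) for seq in pool]
    return Counter(pool)


if __name__ == "__main__":
    for mu in (1e-4, 1e-3, 1e-2):
        counts = simulate_pcr(mu=mu)
        print(mu, sum(counts.values()), len(counts))
```

Each run produces $2^{13} = 8192$ molecules. Because a single run gives only one realization of the number of distinct sequences, repeating the simulation with different seeds and averaging is one way to estimate the expected value.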

For each mutation rate, do the following. Determine the expected number of distinct sequences at the end of the PCR. Visualize the distribution of relative abundances of these sequences. Compute the effective number of sequences using the definition based on entropy.
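
As a minimal sketch of the last two tasks, the helpers below turn a mapping from sequence to copy number into sorted relative abundances and an effective number of sequences. The sketch assumes that the entropy-based definition is the exponential of the Shannon entropy, $N_{\text{eff}} = \exp\left(-\sum_i p_i \ln p_i\right)$; check this against the definition given in class. The names `relative_abundances` and `effective_number` are hypothetical.

```python
import math


def relative_abundances(counts):
    """Return relative abundances p_i, sorted from most to least common."""
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)


def effective_number(counts):
    """Effective number of sequences, assumed here to be exp(Shannon entropy)."""
    p = relative_abundances(counts)
    entropy = -sum(x * math.log(x) for x in p)
    return math.exp(entropy)


if __name__ == "__main__":
    # Toy input (made up for illustration): one dominant and three rare sequences.
    toy = {"AAAA": 7, "AAAC": 1, "AAGA": 1, "TAAA": 1}
    print(relative_abundances(toy))
    print(effective_number(toy))
```

With this definition, the effective number equals the number of distinct sequences when all are equally abundant and drops toward 1 as a single sequence dominates. The sorted abundances can be passed to any plotting call, for example as a rank-abundance curve on a logarithmic axis.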

**Optional** (you may do this part for 5 extra points)

Find the confidence interval for the expected number of distinct sequences at a confidence level c of your choice. Explain which distribution you used, standard normal or t, and justify your choice.
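
One way to compute such an interval from repeated simulation runs is sketched below, using only the standard library. The function name `mean_confidence_interval` and the sample values are made up for illustration. The sketch uses the standard normal quantile purely because it is available in `statistics.NormalDist`; with the t distribution, the quantile `z` would be replaced by the Student t quantile with `len(samples) - 1` degrees of freedom (available, for instance, as `scipy.stats.t.ppf`), which gives a wider interval for small samples. Which of the two is appropriate is the choice the problem asks you to justify.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, confidence=0.95):
    """Two-sided interval for the mean, using the standard normal quantile."""
    m = mean(samples)
    sem = stdev(samples) / math.sqrt(len(samples))  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return m - z * sem, m + z * sem


if __name__ == "__main__":
    # Hypothetical numbers of distinct sequences from eight independent runs.
    distinct = [412, 398, 405, 420, 391, 409, 415, 402]
    print(mean_confidence_interval(distinct, confidence=0.95))
```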