

Description: This project has two components: a UPGMA implementation and a Neighbor Joining implementation. (i) You are to implement the UPGMA clustering algorithm on distance matrix inputs to construct phylogenetic trees in Newick format.

 (ii) You are to implement the Neighbor Joining algorithm on distance matrix inputs to construct phylogenetic trees in Newick format, with branch distances shown.
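Both components emit Newick strings, the first without branch lengths and the second with them. The following is a minimal sketch of a serializer, assuming a hypothetical in-memory representation that is not part of the assignment: a leaf is a string, and an internal node is a list of `(child, branch_length)` pairs, where `branch_length` may be `None`.

```python
def to_newick(tree):
    """Serialize a nested tree into a Newick string.

    Assumed (hypothetical) representation:
      leaf          -> str
      internal node -> list of (child, branch_length_or_None) pairs
    """
    def render(node):
        if isinstance(node, str):
            return node
        parts = []
        for child, length in node:
            text = render(child)
            if length is not None:
                # Branch lengths follow the child label after a colon.
                text += f":{length:g}"
            parts.append(text)
        return "(" + ",".join(parts) + ")"

    # A Newick string is terminated by a semicolon.
    return render(tree) + ";"


# Topology only, as in the UPGMA component.
print(to_newick([("A", None), ("B", None)]))
# With branch distances, as in the Neighbor Joining component.
print(to_newick([([("A", 1), ("B", 2)], 3), ("C", 4.5)]))
```

The leaf names and lengths above are made-up illustration values, not taken from the sample files.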

Specifications: Both components, UPGMA and Neighbor Joining, take distance matrix inputs in the same format.

  1. UPGMA: Your inputs (sample files are DM-p127.txt and DM-p139.txt) are distance matrices. Both sample examples were worked through in the video lectures on UPGMA.

 You will run the centroid-linkage-based UPGMA algorithm on these distance matrix inputs, as described in the video lectures and textbook chapter 11.1. Calculate the average inter-cluster distances efficiently, as described in videos 3 and 4 on UPGMA, without going back to the individual pairwise distances in the original distance matrix. During execution, your program should report on the console, at each step, which two clusters are being merged and the average distance between them. Your final UPGMA console output should include the phylogenetic tree in the parentheses-based Newick format. An inefficient implementation of UPGMA is uploaded as pysip-DM.pl; its input and output conventions match what is expected here.
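The efficient update mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the course's reference solution: the function name `upgma`, the dictionary-of-pairs matrix representation, and the sample distances are all hypothetical, and the update uses the standard size-weighted average of the two merged clusters' distances, which equals the mean over all leaf pairs without revisiting the original matrix.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster with UPGMA and return (newick_string, merge_steps).

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) -> distance (assumed representation).
    """
    # Active clusters: label -> (Newick text, number of leaves).
    clusters = {n: (n, 1) for n in names}
    d = dict(dist)
    steps = []
    while len(clusters) > 1:
        # Closest pair; iterating sorted labels makes ties deterministic.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda p: d[frozenset(p)])
        dab = d[frozenset((a, b))]
        (ta, na), (tb, nb) = clusters.pop(a), clusters.pop(b)
        new = f"({ta},{tb})"
        steps.append((ta, tb, dab))
        print(f"Merging {ta} and {tb} at average distance {dab:g}")
        # Size-weighted update: only the current matrix is consulted.
        for c in clusters:
            d[frozenset((new, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[new] = (new, na + nb)
    tree, _ = next(iter(clusters.values()))
    return tree + ";", steps


# Made-up distances for illustration (not the contents of DM-p127.txt).
pairs = {("A", "B"): 2, ("A", "C"): 4, ("B", "C"): 6,
         ("A", "D"): 10, ("B", "D"): 12, ("C", "D"): 14}
matrix = {frozenset(p): v for p, v in pairs.items()}
tree, _ = upgma(["A", "B", "C", "D"], matrix)
print(tree)
```

In the last merge of this example, the distance from the three-leaf cluster to D is computed as (2 * 11 + 1 * 14) / 3 = 12, which matches the plain average of the original distances 10, 12, and 14.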

  2. Neighbor Joining: The Neighbor Joining algorithm is described in slide 20 of NeighborJoining.pptx and the corresponding video lectures. Your inputs (sample files are DM-p127.txt and DM-p139.txt) are distance matrices. You will run the Neighbor Joining algorithm on these inputs. For the initial distance matrix, and again after each merge step, you are to recompute the average distances r, the transition distance matrix, and the updated distance matrix, and output each of them to the console as it is computed. I have uploaded a Perl implementation of Neighbor Joining as oyop-DM-modGE.pl, which you are encouraged to run and test. Your output should be consistent with that program's output.
  3. What to turn in: You must turn in a single zipped file containing your source code, a Makefile if one is needed for compilation, and a README file explaining how to execute your program.

The attached Perl code is based on the (former) textbook implementation of Neighbor Joining, which I modified heavily, but it still does not incorporate any meaningful tie-breakers. It therefore returns a valid Neighbor Joining tree, but if you want tie-breakers you will need to add them yourself (this is not hard; it just needs slightly more bookkeeping). When the Neighbor Joining tree is unique, the Perl code returns the only valid tree. When it is not unique, the code returns one valid tree, but other valid outputs may exist. I am not restricting your output in the case of ties, as long as you return a valid Neighbor Joining tree.
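A Neighbor Joining loop with a simple deterministic tie-breaker might look like the sketch below. It assumes the common textbook formulation, which may differ in detail from slide 20 and from oyop-DM-modGE.pl: the average distance is r_i = (sum of d_ik) / (N - 2), the transition distance is d_ij - r_i - r_j, and the new node u gets d_uk = (d_ik + d_jk - d_ij) / 2. The function name, the matrix representation, the choice to stop at two nodes and place the last edge length on the first of them, and the sample distances are all hypothetical.

```python
from itertools import combinations


def neighbor_join(names, dist):
    """Return a Newick string with branch lengths (assumed formulation).

    dist: dict mapping frozenset({a, b}) -> distance.
    """
    d = {k: float(v) for k, v in dist.items()}
    nodes = list(names)
    while len(nodes) > 2:
        n = len(nodes)
        # Average distances r_i over the current matrix.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) / (n - 2)
             for i in nodes}
        print("r:", {k: round(v, 4) for k, v in r.items()})
        # Transition distances for every pair of active nodes.
        td = {(i, j): d[frozenset((i, j))] - r[i] - r[j]
              for i, j in combinations(nodes, 2)}
        print("transition:", {k: round(v, 4) for k, v in td.items()})
        # Tie-breaker: min() keeps the first minimal pair in list order.
        i, j = min(td, key=td.get)
        dij = d[frozenset((i, j))]
        li = (dij + r[i] - r[j]) / 2  # branch length from i to new node
        lj = dij - li                 # branch length from j to new node
        u = f"({i}:{li:g},{j}:{lj:g})"
        print(f"Merging {i} and {j}")
        rest = [k for k in nodes if k not in (i, j)]
        for k in rest:
            d[frozenset((u, k))] = (
                d[frozenset((i, k))] + d[frozenset((j, k))] - dij) / 2
        nodes = rest + [u]
        print("updated:", {tuple(sorted(p)): d[frozenset(p)]
                           for p in combinations(nodes, 2)})
    a, b = nodes
    # Assumption: the last edge length is attached to the first node.
    return f"({a}:{d[frozenset((a, b))]:g},{b});"


# Made-up additive distances for illustration (not from the sample files).
pairs = {("A", "B"): 3, ("A", "C"): 8, ("A", "D"): 9,
         ("B", "C"): 9, ("B", "D"): 10, ("C", "D"): 9}
matrix = {frozenset(p): v for p, v in pairs.items()}
print(neighbor_join(["A", "B", "C", "D"], matrix))
```

In this example the pairs (A, B) and (C, D) tie on the first step, and the tie-breaker picks (A, B) because it comes first in input order. Exact comparison works here because every value is exactly representable; with arbitrary floating-point inputs, comparing within a small tolerance would be a safer way to detect ties.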
