Below is the section on Algorithm Details from my newly updated Design Document. Once the entire document is posted on the DMD Senior Projects website, I will provide a link to it. For now, this is the part I've been working the hardest on lately, and it will outline my plan for the rest of the project, so I think it's worth posting about.
I have chosen to build on the existing PhyloWidget program. This will allow me to focus on combining images with trees and incorporating common names of organisms, without having to recreate a significant amount of code. I will take advantage of PhyloWidget's existing interface, tree rendering process, Newick parser, and many other features. My modifications and additions will add content fit for students and adapt the interface to better suit their needs.
The program begins by asking the user to input a Newick-format tree, a taxonomic name, or a common name of an organism (or of multiple organisms). The program's overall algorithm is as follows:
1) Search www.ubio.org for the taxonomic name or common name, whichever was not given
2) Search TreeBase by taxonomic name for relevant trees, prune them, and import them into PhyloWidget
3) Search online image databases such as www.morphbank.com for images of the organisms
4) Display those images alongside the leaf labels containing both the common and taxonomic names
Inputs that do not yield a tree will produce a dialogue box asking the user for more or different information. Organisms whose image search returns nothing will either be shown with a standard replacement image or be drawn without an image.
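The four steps and the no-tree fallback can be sketched roughly as follows. This is only an illustration in Python, not PhyloWidget code: the function and parameter names (`run_pipeline`, `to_taxonomic`, `to_common`, `find_tree_leaves`, `find_images`) are hypothetical, and the actual lookups against uBio, TreeBase, and the image databases are passed in as plain callables rather than implemented.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Leaf:
    """One leaf of the displayed tree with everything needed to draw it."""
    taxonomic_name: str
    common_name: Optional[str] = None
    image_urls: List[str] = field(default_factory=list)


def run_pipeline(
    query: str,
    is_common_name: bool,
    to_taxonomic: Callable[[str], Optional[str]],   # stand-in for the name search
    to_common: Callable[[str], Optional[str]],      # stand-in for the name search
    find_tree_leaves: Callable[[str], List[str]],   # stand-in for tree search + prune
    find_images: Callable[[str], List[str]],        # stand-in for the image search
) -> Optional[List[Leaf]]:
    """Return annotated leaves, or None when no tree could be found."""
    # Step 1: make sure we have a taxonomic name to search with.
    taxon = to_taxonomic(query) if is_common_name else query
    if taxon is None:
        return None  # caller should ask the user for different input

    # Step 2: find a tree and get the leaf names of its pruned form.
    leaf_names = find_tree_leaves(taxon)
    if not leaf_names:
        return None  # caller should ask the user for different input

    # Steps 3 and 4: attach a common name and any images to every leaf.
    # An empty image list means "use a replacement image or draw none".
    return [Leaf(name, to_common(name), find_images(name)) for name in leaf_names]
```

A `None` result corresponds to the dialogue box case; a leaf with an empty `image_urls` list corresponds to the missing-image case.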
3.1.1 Scientific and Common Name Search
I will interact with the website www.ubio.org, which searches based on keywords and can return the common or taxonomic name of an organism along with other related information. If the user inputs a common name, that name will become the search term, and the desired result will be the taxonomic name. The reverse is true if the user inputs a taxonomic name or imports a tree that contains taxonomic names as the node labels.
I hope to extend the functionality of this component to allow users to search for multiple organisms at once.
3.1.2 TreeBase Search/Prune
Using the taxonomic name from the uBio search, I will search the TreeBase database for phylogenetic trees containing the desired organism(s). I will determine which of the resulting trees is the best one to display and load it into PhyloWidget. Often, the tree will have more nodes than is suitable to display in an educational program. To avoid overwhelming the user, I will prune the tree before displaying it and search for images and common names only for organisms contained in the pruned tree.
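One possible way to prune is to restrict the tree to a chosen set of leaves, dropping empty branches and collapsing internal nodes left with a single child. The sketch below (in Python, not PhyloWidget code) assumes that the set of leaves to keep has already been decided by some other rule, and it ignores branch lengths. The `Node` class and the `prune` and `to_newick` helpers are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """Minimal tree node: a leaf has no children."""
    name: str = ""
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, keep: Set[str]) -> Optional[Node]:
    """Return a copy of the subtree restricted to leaves named in `keep`.

    Returns None when no leaf under `node` is kept.
    """
    if not node.children:  # leaf: keep it only if requested
        return Node(node.name) if node.name in keep else None

    kept = [p for p in (prune(c, keep) for c in node.children) if p is not None]
    if not kept:
        return None          # whole branch disappears
    if len(kept) == 1:
        return kept[0]       # collapse an internal node with one child left
    return Node(node.name, kept)


def to_newick(node: Node) -> str:
    """Serialize without branch lengths or the trailing semicolon."""
    if not node.children:
        return node.name
    return "(" + ",".join(to_newick(c) for c in node.children) + ")" + node.name
```

For the tree `((A,B),(C,D))`, keeping only `A`, `C`, and `D` removes `B` and collapses its parent, leaving `(A,(C,D))`.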
3.1.3 Image Search
Using the taxonomic name, I will search a series of image databases, including Morphbank, to find images of each organism in the tree. If many images are found, I will display the first and allow the user to view the additional images and change which one is displayed on the tree. At this stage the common name search is also repeated, using uBio to find the common names for all of the organisms in the downloaded tree.
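The "show the first image, let the user switch" behavior could be modeled along these lines. This is a sketch under assumptions: each image database is represented by a hypothetical callable that takes a taxonomic name and returns a list of image URLs, and `PLACEHOLDER` is a made-up file name for the standard replacement image.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

PLACEHOLDER = "placeholder.png"  # hypothetical replacement image


@dataclass
class ImageChoice:
    """All images found for one organism plus the one currently shown."""
    urls: List[str] = field(default_factory=list)
    selected: int = 0  # the first image is displayed by default

    def current(self) -> str:
        """URL to draw on the tree, or the placeholder if none were found."""
        return self.urls[self.selected] if self.urls else PLACEHOLDER

    def select(self, index: int) -> None:
        """Switch the displayed image; rejects indexes outside the list."""
        if not 0 <= index < len(self.urls):
            raise IndexError(f"no image at index {index}")
        self.selected = index


def search_images(
    taxonomic_name: str,
    sources: Iterable[Callable[[str], List[str]]],
) -> ImageChoice:
    """Query each source in order and merge results, skipping duplicates."""
    urls: List[str] = []
    for source in sources:
        for url in source(taxonomic_name):
            if url not in urls:
                urls.append(url)
    return ImageChoice(urls)
```

Querying the sources in a fixed order means the default image always comes from the first database that returns a result.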
3.1.4 Image Integration
The image and common name will be displayed on the tree's leaf nodes alongside the taxonomic name from TreeBase. The user will have the full range of controls for manipulating the tree's display parameters that are already part of PhyloWidget. In addition, the user will be able to select which image is displayed for each node when more than one image is available. Finally, the user will specify whether the taxonomic name or the common name is displayed more prominently.
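The name-prominence option amounts to choosing which name is the primary label and which is the secondary one. A minimal sketch, assuming a hypothetical `leaf_label` helper and assuming that a leaf with no common name simply falls back to its taxonomic name:

```python
from typing import Optional, Tuple


def leaf_label(
    taxonomic: str,
    common: Optional[str],
    prefer_common: bool,
) -> Tuple[str, Optional[str]]:
    """Return (primary, secondary) text for a leaf label.

    The primary name is the one drawn more prominently. When no common
    name is known, the taxonomic name is primary and there is no secondary.
    """
    if not common:
        return taxonomic, None
    return (common, taxonomic) if prefer_common else (taxonomic, common)
```

How the two names are styled (size, weight, placement) is left to the rendering code.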