Sunday, March 22, 2009
"Technical Difficulties"
But for the meantime, if it appears that I am not making due progress, this is why. I hope to have this resolved quite shortly, but since I have no idea what the problem is, all I can do is wait and see.
Friday, March 20, 2009
Common Name Search Begins
After the text string "Scientific Match", search for the "a href ='" string
Save the text that follows, up to the next "'", as a link string
Save a substring of the text after the next > and before <
Compare that to the search term.
If it's a match
    follow the saved link
    on the resulting page:
        Find the text "name = 'Common name'"
        After that point, find "namebankID"
        Save a substring of the text after the next > and before <
        Return the substring as the common name
Else
    repeat by searching for the next "a href = '" string
Stop when you've reached ".
Return either the search term or a null string as the common name.
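The steps above could be sketched in Python roughly as follows. This is only an illustration under several assumptions: the function names, the `fetch` callable (anything that takes a link and returns that page's HTML), and the optional `stop_marker` parameter are all hypothetical, and the marker strings are copied from the outline rather than checked against the live site. The sketch returns None when nothing is found, leaving it to the caller to fall back to the search term.

```python
from typing import Callable, Optional, Tuple


def text_between(html: str, start: int) -> Tuple[str, int]:
    """Return the text after the next '>' and before the following '<'.

    Also returns the index of that '<', or -1 if either bracket is missing.
    """
    begin = html.find(">", start)
    if begin == -1:
        return "", -1
    end = html.find("<", begin + 1)
    if end == -1:
        return "", -1
    return html[begin + 1:end].strip(), end


def extract_common_name(page_html: str) -> Optional[str]:
    """On a record page, read the text that follows the 'namebankID' link."""
    pos = page_html.find("name = 'Common name'")
    if pos == -1:
        return None
    pos = page_html.find("namebankID", pos)
    if pos == -1:
        return None
    name, _ = text_between(page_html, pos)
    return name or None


def find_common_name(
    results_html: str,
    search_term: str,
    fetch: Callable[[str], str],        # hypothetical: link -> page HTML
    stop_marker: Optional[str] = None,  # hypothetical: where scanning ends
) -> Optional[str]:
    """Scan the links after 'Scientific Match' for one whose text is the term."""
    link_marker = "a href ='"
    pos = results_html.find("Scientific Match")
    if pos == -1:
        return None
    limit = len(results_html)
    if stop_marker:
        stop = results_html.find(stop_marker, pos)
        if stop != -1:
            limit = stop
    while True:
        link_start = results_html.find(link_marker, pos)
        if link_start == -1 or link_start >= limit:
            return None  # no more candidate links
        href_start = link_start + len(link_marker)
        href_end = results_html.find("'", href_start)
        if href_end == -1:
            return None
        link = results_html[href_start:href_end]
        label, pos = text_between(results_html, href_end)
        if pos == -1:
            return None
        if label.lower() == search_term.lower():
            return extract_common_name(fetch(link))
```

Passing `fetch` in as a parameter keeps the string scanning separate from the network code, so the scanning can be tried out on saved pages before it is pointed at the real site.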
Friday, March 13, 2009
Wednesday, March 4, 2009
Algorithm Details
Below is the section on Algorithm Details from my newly updated Design Document. Once the entire document is posted on the DMD Senior Projects website, I will provide a link to it. For now, this is the part I've been working the hardest on lately, and it will outline my plan for the rest of the project, so I think it's worth posting about.
I have chosen to build on the existing PhyloWidget program. This will allow me to focus on combining images with trees and incorporating common names of organisms, without having to recreate a significant amount of code. I will take advantage of PhyloWidget's existing interface, tree rendering process, Newick parser, and many other features. Modifications and additions will serve the purpose of adding content fit for students and adapting the interface to better suit their needs.
The program begins by asking the user to input either a Newick-format tree, a taxonomic name, or a common name of an organism (or multiple organisms). The program's overall algorithm is as follows:
1) Search www.ubio.org for the taxonomic name or common name, whichever was not given
2) Search TreeBase by taxonomic name for relevant trees, prune them, and import them into PhyloWidget
3) Search online image databases such as www.morphbank.com for images of the organisms
4) Display those images alongside the leaf labels containing both the common and taxonomic names
Inputs that do not yield a tree will bring up a dialog box asking the user for more or different information. Organisms whose image search returns nothing will either be shown with a standard placeholder image or be drawn without an image.
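As a rough illustration of how the four steps and the two failure cases fit together, here is a minimal Python sketch. Every name in it is hypothetical: the four callables stand in for the uBio, TreeBase, and image-database lookups, and the tree is reduced to a list of leaf names to keep the example short. The actual program builds on PhyloWidget, so this only shows the control flow.

```python
from typing import Callable, Dict, List, Optional, Tuple


def build_display(
    user_input: str,
    resolve_names: Callable[[str], Tuple[Optional[str], Optional[str]]],
    find_tree_leaves: Callable[[str], List[str]],
    find_common_name: Callable[[str], Optional[str]],
    find_images: Callable[[str], List[str]],
    placeholder: Optional[str] = None,
) -> Dict:
    """Run steps 1-4 and report either the annotated leaves or a prompt."""
    # Step 1: fill in whichever name the user did not supply.
    _common, taxonomic = resolve_names(user_input)

    # Step 2: find and prune a tree (represented here by its leaf names).
    leaves = find_tree_leaves(taxonomic) if taxonomic else []
    if not leaves:
        # No tree: the caller would show this message in a dialog box.
        return {
            "ok": False,
            "message": f"No tree found for '{user_input}'. "
                       "Please enter more or different information.",
        }

    # Steps 3 and 4: collect images and both names for every leaf.
    annotated = []
    for leaf in leaves:
        images = find_images(leaf)
        if not images and placeholder is not None:
            images = [placeholder]  # standard replacement image
        annotated.append({
            "taxonomic": leaf,
            "common": find_common_name(leaf),
            "images": images,  # may be empty: leaf drawn without an image
        })
    return {"ok": True, "leaves": annotated}
```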
3.1.1 Scientific and Common Name Search
I will interact with the website www.ubio.org, which searches based on keywords and can return the common or taxonomic name of an organism along with other related information. If the user inputs a common name, that name will become the search term, and the desired result will be the taxonomic name. The reverse is true if the user inputs a taxonomic name or imports a tree which contains taxonomic names as the node labels.
I hope to extend the functionality of this component to allow users to search for multiple organisms at once.
3.1.2 TreeBase Search/Prune
Using the taxonomic name from the uBio search, I will search the TreeBase database for phylogenetic trees containing the desired organism(s). I will determine which of the resulting trees is the best one to display and load that into PhyloWidget. Often, the tree will have more nodes than are suitable for an educational program. To avoid overwhelming the user, I will prune the tree before displaying it and search for images and common names only for the organisms contained in the pruned tree.
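One simple way to prune is to choose a set of leaves to keep, drop every other leaf, and collapse any internal node left with a single child. The Python sketch below shows that approach only as an assumption; how the leaves to keep are chosen is not decided here, and the `Node`, `prune`, and `to_newick` names are made up for the example rather than taken from PhyloWidget.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A tree node; leaves have a name and no children."""
    name: str = ""
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, keep: Set[str]) -> Optional[Node]:
    """Return a copy of the tree containing only the leaves named in `keep`."""
    if not node.children:
        return Node(node.name) if node.name in keep else None
    kept = [c for c in (prune(ch, keep) for ch in node.children) if c is not None]
    if not kept:
        return None          # nothing below this node survives
    if len(kept) == 1:
        return kept[0]       # collapse a node left with one child
    return Node(node.name, kept)


def to_newick(node: Node) -> str:
    """Serialize a tree (without branch lengths) as a Newick fragment."""
    if not node.children:
        return node.name
    inner = ",".join(to_newick(c) for c in node.children)
    return "(" + inner + ")" + node.name


# Example: keep three of five leaves.
tree = Node(children=[
    Node(children=[Node("Canis_lupus"), Node("Vulpes_vulpes")]),
    Node(children=[
        Node("Felis_catus"),
        Node(children=[Node("Panthera_leo"), Node("Panthera_tigris")]),
    ]),
])
pruned = prune(tree, {"Canis_lupus", "Panthera_leo", "Panthera_tigris"})
print(to_newick(pruned) + ";")
```

Because `prune` builds a new tree instead of editing the original, the full TreeBase tree stays available if the user later wants to see more of it.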
3.1.3 Image Search
Using the taxonomic name, I will search a series of image databases including Morphbank to find images of each organism in the tree. If several images are found, I will display the first and allow the user to view the additional images and change which one is displayed on the tree. The common name search is also repeated at this stage, using uBio to find the common names for all of the organisms in the downloaded tree.
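The "show the first image, let the user switch" behavior could be held in a small per-leaf record like the one below. This is a sketch with invented names (`LeafImages`, `current`, `select`), not a description of how PhyloWidget stores node data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LeafImages:
    """All images found for one organism, plus which one is on display."""
    urls: List[str] = field(default_factory=list)
    selected: int = 0  # the first image is shown by default

    def current(self) -> Optional[str]:
        """Return the image to draw, or None if the search found nothing."""
        return self.urls[self.selected] if self.urls else None

    def select(self, index: int) -> None:
        """Switch the displayed image to another search result."""
        if not 0 <= index < len(self.urls):
            raise IndexError(f"no image at position {index}")
        self.selected = index
```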
3.1.4 Image Integration
The image and common name will be displayed on the tree leaf nodes alongside the taxonomic name from TreeBase. The user will have the full range of PhyloWidget's existing controls for manipulating the tree's display parameters. In addition, the user will be able to choose the displayed image for each node when more than one image is available. Finally, the user will specify whether the taxonomic name or the common name is displayed more prominently.
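The name-prominence option amounts to deciding which name is the primary label and which is secondary. A minimal Python sketch, assuming a hypothetical `leaf_label` helper and assuming that a leaf with no known common name always falls back to its taxonomic name:

```python
from typing import Optional, Tuple


def leaf_label(
    taxonomic: str,
    common: Optional[str] = None,
    prefer_common: bool = False,
) -> Tuple[str, Optional[str]]:
    """Return (primary, secondary) names for a leaf label."""
    if common and prefer_common:
        return common, taxonomic
    # Taxonomic name leads by default, and whenever no common name exists.
    return taxonomic, common


print(leaf_label("Canis lupus", "Gray wolf", prefer_common=True))
print(leaf_label("Canis lupus", None, prefer_common=True))
```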
Tuesday, March 3, 2009
Alpha Preparation
I spoke with Val to outline the four specific tasks I’ll need to complete in order to allow the user to input an organism’s name and end up with a tree complete with images and both the common and scientific names for each organism. I divided up the major coding tasks, estimated how long each would take and in which order it would make the most sense to do them, and used that information to revise my Gantt chart.
I’ve spent the rest of this week revising and updating my design document to reflect what I’ve accomplished and my goals for the rest of the semester.
At the recommendation of Greg Riccardi from Morphbank, I also contacted Anne Olsen at the National Biological Information Infrastructure (NBII) to ask how to search the NBII image database.
I'll be meeting with Joe on Thursday morning and Val on Thursday afternoon to review my progress and updated design document.
Monday, March 2, 2009
Revised Gantt Chart
