Thursday, April 30, 2009

NBII Image Search

In the midst of preparing for my presentation, I am trying to have as much of a finished product as possible, so I have been working on including NBII.gov in my image search. Here's a detailed description of the search process that I wrote out before writing the code, so that I was sure I knew what I was doing before I attempted getting it to work. 
Here's the html for the form I need to "post" to in my Java code. (Note: I have to post it as an image because otherwise blogspot will actually interpret the code and do what it says - crazy.)

First I see that hitting the "go" button executes search_results.php and that I need to use the POST method to send data to the script. The next thing I need to find is the data that the script expects. One field is called "txtSearch" and the other is called "Submit". The value of txtSearch is not given, but the value of Submit is "Go". So assuming that the value of txtSearch is the search term, let's call it "term", I should be sending the string "txtSearch=term&Submit=Go" to "http://life.nbii.gov/search_results.php".
When I get the response, I search for the term "Total images", since it indicates that I'm about to reach the place in the html where the images are listed. From then on, I check each line for the string "thumbnail" until I get to the line that indicates that I've gone past the section with the images I want. 
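
Here is a minimal Java sketch of that process, not the code that ended up in the project. The URL and the field names "txtSearch" and "Submit" come from the form described above. The class name, the method names, and the stop marker passed to linesInSection are my own placeholders, and the real page may need different markers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class NbiiSearchSketch {
    // Form target taken from the form's "action" attribute.
    static final String SEARCH_URL = "http://life.nbii.gov/search_results.php";

    /** Builds the url-encoded form body, e.g. "txtSearch=term&Submit=Go". */
    static String buildFormBody(String term) {
        // URLEncoder.encode(String, Charset) needs Java 10 or newer.
        return "txtSearch=" + URLEncoder.encode(term, StandardCharsets.UTF_8) + "&Submit=Go";
    }

    /** POSTs the search term and returns the response HTML, one entry per line. */
    static List<String> postSearch(String term) throws IOException {
        byte[] body = buildFormBody(term).getBytes(StandardCharsets.UTF_8);
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(SEARCH_URL).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true); // we are going to write a request body
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    /**
     * Skips everything up to the first line containing startMarker, then
     * collects lines containing wanted until a line contains stopMarker.
     */
    static List<String> linesInSection(List<String> html, String startMarker,
                                       String wanted, String stopMarker) {
        List<String> hits = new ArrayList<>();
        boolean inSection = false;
        for (String line : html) {
            if (!inSection) {
                inSection = line.contains(startMarker);
                continue;
            }
            if (line.contains(stopMarker)) {
                break; // past the image listing
            }
            if (line.contains(wanted)) {
                hits.add(line);
            }
        }
        return hits;
    }
}
```

Calling postSearch("Felis catus") and then linesInSection(lines, "Total images", "thumbnail", stopMarker) would give the candidate image lines, from which the image URLs still have to be pulled out.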

Monday, April 27, 2009

Posting to a .php site

I'm trying to expand my image search to include the nbii.gov image database, but their site is written using php instead of straight html. I don't know what makes it different, but I am pretty sure I'll need to interact with it differently than I did for my morphbank.org search. 

Here's the relevant line of source code: 
form action="search_results.php" method="post" enctype="application/x-www-form-urlencoded" name="frmSearch"
Anyone know about this and have suggestions for how I go about posting a search term to this form? Thanks! 

Friday, April 24, 2009

Displaying Images (2)

Morphbank image search and display is complete! Next: 
- fix ubio common name search bug
- NBII image search (which will most likely involve cleaning up some of the morphbank search code in favor of code reuse)
- common name display capabilities
- explore display settings for layouts other than rectangular

Wednesday, April 22, 2009

Displaying Images (1)

I added an option in the menu toolbar for a Morphbank image search. The callback function retrieves the tree, gets a list of all of its leaves, and then passes each of their labels to the search function one by one. The search works, but the images aren't displaying the way they used to. I'm still displaying in an external JFrame for now, and the frame pops up, correctly sized for the image, but it's empty (white) inside. I haven't changed that part of the code, so I'm not sure why that's happening, and I've stepped through the functions to make sure I'm not referencing something null somewhere. The answer remains to be found. 
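
For reference, the leaf-by-leaf part of that callback can be sketched roughly like this. The Node class here is a hypothetical stand-in and not PhyloWidget's actual node class, and the search function is passed in as a plain Consumer so the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class LeafSearchSketch {
    /** Hypothetical stand-in for a tree node; PhyloWidget's real class differs. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    /** Collects the leaves below (and including) node, left to right. */
    static void collectLeaves(Node node, List<Node> out) {
        if (node.isLeaf()) {
            out.add(node);
            return;
        }
        for (Node child : node.children) {
            collectLeaves(child, out);
        }
    }

    /** Passes each leaf label to the search function, one by one. */
    static void searchEachLeaf(Node root, Consumer<String> search) {
        List<Node> leaves = new ArrayList<>();
        collectLeaves(root, leaves);
        for (Node leaf : leaves) {
            search.accept(leaf.label);
        }
    }
}
```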

In the meantime I've also attempted to sift through the massive network of code involved in rendering, so as to find out where exactly I jump in and write my code to insert the images before the label names of each leaf. No clear spot as of yet, but I'm knee deep in rendering code and at least starting to learn my way around. 

Monday, April 20, 2009

Downloading and Storing Images

I've integrated the image search into PhyloWidget. The images for each node are stored in the class PhyloNode as a LinkedList. Currently, when loading a tree, a search is conducted as each new node is created. This means that the loading process is quite slow, when really it should be possible to conduct the search after drawing the initial text-only display, or in a separate thread while the file continues to be parsed. Not quite sure how to do that, but I know they're options.
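
One way the separate-thread option could look, as a rough sketch only: start one background task per name and let the tree finish loading while the searches run. This assumes Java 8 or newer for CompletableFuture. The class and method names are made up for illustration, and the search function is passed in as a parameter rather than being the actual Morphbank code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class BackgroundSearchSketch {
    /**
     * Starts one background search per name and returns immediately.
     * The caller can draw the text-only tree first and pick up each
     * result later through the returned futures.
     */
    static <T> Map<String, CompletableFuture<List<T>>> searchInBackground(
            List<String> names, Function<String, List<T>> search) {
        Map<String, CompletableFuture<List<T>>> pending = new LinkedHashMap<>();
        for (String name : names) {
            // supplyAsync runs the search on the shared ForkJoinPool.
            pending.put(name, CompletableFuture.supplyAsync(() -> search.apply(name)));
        }
        return pending;
    }
}
```

Each future could then store its images in the matching node when it completes, for example with thenAccept, which would trigger a redraw of just that node.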

Next step is to display the images within the program as opposed to outside it in tiny little JFrame windows. That'll require some UI manipulation, which I was holding off on until the rest of the code was working. I'm also hoping to search some other databases, because Morphbank has very few images available for some categories of organisms -- at least that's true for the ones in the tree files I've been testing with.

Also, the search I'm conducting returns thumbnail size images. This is useful for quick downloading, but I'd also like to download the full size image so that when a user clicks on the image next to a node, they can see it full-size. I'll need to determine how best to access the full size images from the website and then how/when to download it and store it in the node.  

Finally, one of the more amusing issues is the "Image Not Available" image. The Morphbank search only displays results with images, but unfortunately some of those images look like this: 
Obviously that isn't too useful, but I don't know how I'd filter those out. I don't know if Image objects are comparable, because I could save one of these "Not Available" ones and test whether the returned image is equal to it before displaying it. I'll save that to work on later, though, since it's not crucial to the general functioning of my code. 
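
A note on the comparison idea: java.awt.Image and BufferedImage don't override equals(), so equals() only tells you whether two variables point at the same object. A pixel-by-pixel check is one possible workaround. The sketch below assumes both images are available as BufferedImage, and the class and method names are mine. It only works if the site serves the exact same placeholder file every time.

```java
import java.awt.image.BufferedImage;

final class ImageCompareSketch {
    /** Returns true when both images have the same size and identical pixels. */
    static boolean samePixels(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            return false;
        }
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                // getRGB returns the pixel as a packed ARGB int.
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With a saved copy of the "Not Available" picture loaded once, each search result could be checked with samePixels(result, notAvailable) and skipped when it matches.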

Harvesting Images

I am now able to search Morphbank.net for any given taxonomic name and return a LinkedList of all the images associated with that search. I have not yet integrated the code into PhyloWidget but intend to do so later today. 

One of the reasons I have not done this is that many revisions were made to the code during March, and I am not interested in including them in my version of the code. Since I download the code using an SVN client, I need to figure out how to download the code as it was before those revisions took place. 

Once I am able to do that, the integration will consist of creating a variable to store these images within each node of the tree, determining when to call the search function with the node's taxonomic name, and displaying the resulting images within the current program. 

Of Note: This website listed a few short, easy examples about reading and writing images in Java, and as a novice at this I found it very useful.

Saturday, April 18, 2009

Pre-Parsing Label Names

Status Update Summary:
I have successfully coded the portion of my project which involves pre-parsing Newick tree files so as to display the scientific names of the organisms rather than the index numbers. 

Explanation:
The tree files I'm working with are Newick trees wrapped in a NEXUS file. They exist in the following format, and the structure of the tree itself is defined by commas and parentheses: 

#NEXUS
BEGIN TREES;

TRANSLATE
1 Alligator_mississippiensis,
2 Chinchilla_brevicaudata,
3 Felis_silvestris_catus,
4 Balaenoptera_borealis,
5 Oryctolagus_cuniculus,
6 Balaenoptera_physalus,
7 Mesocricetus_auratus,
8 Meleagris_gallopavo,
9 Lepisosteus_spatula,
10 Camelus_dromedarius,
11 Proechimys_guairae,
12 Anas_platyrhynchos,
13 Platichthys_flesus,
14 Goosefish_lophinus,
15 Hydrolagus_colliei,
16 Canis_familiaris,
17 Physeter_catodon,
18 Acomys_cahirinus,
19 Hystrix_cristata,
20 Myocastor_coypus,
21 Struthio_camelus,
22 Myxine_glutinosa,
23 Saimiri_sciureus,
24 Cavia_porcellus,
25 Elephas_maximus,
26 Gadus_callarias,
27 Cyprinus_carpio,
28 Thunnus_thynnus,
29 Equus_caballus,
30 Crotalus_atrox,
31 Batrachoididae,
32 Myoxocephalus,
33 Gallus_gallus,
34 Capra_hircus,
35 Oncorhynchus,
36 Mus_musculus,
37 Homo_sapiens,
38 Anser_anser,
39 Ovis_aires,
40 Sus_scrofa,
41 Bos_taurus,
42 Mammalia,
43 Macaca
;
TREE tree_0 =  [&R] (((((((11,20),24),19),2,7,18,(36,23),37,43,5,(16,17,6,40),29,25,41,3,4,(34,39)),10),(33,8,21),30,1,15,(38,12),9,((31,26),32,14,28,13,27,35)),22);
ENDBLOCK;


Currently, PhyloWidget only looks at the last line, where the structure of the tree is defined. As a result, the leaves are labeled with numbers rather than the scientific names. My pre-parser replaces all of the numbers in the last line with the corresponding scientific names, so that the true names become the label names stored in each leaf node. For the tree listed above, the following image is now the new output: 


Since PhyloWidget supports loading trees from files as well as from manual input, I've inserted the pre-parsing phase into both processes, of course with the expectation that both the file and the user-input be formatted according to my example above. 
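
For anyone curious, the core of the pre-parsing idea can be sketched in a few lines of Java. This is a simplified illustration and not the exact code in my project: the class and method names are made up, it assumes one TRANSLATE entry per line as in the example above, and it needs Java 9 or newer for the replaceAll call that takes a function.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TranslatePreParser {
    // One TRANSLATE entry, e.g. "16 Canis_familiaris," (trailing , or ; optional).
    private static final Pattern ENTRY =
            Pattern.compile("^(\\d+)\\s+([^\\s,;]+)\\s*[,;]?$");
    // A number used as a label: right after "(" or "," and right before "," ")" or ":".
    private static final Pattern LABEL_NUMBER =
            Pattern.compile("(?<=[(,])\\d+(?=[,):])");

    /** Reads the number-to-name table from the TRANSLATE block. */
    static Map<String, String> readTranslateTable(List<String> lines) {
        Map<String, String> table = new HashMap<>();
        boolean inBlock = false;
        for (String line : lines) {
            String trimmed = line.trim();
            if (!inBlock) {
                inBlock = trimmed.equalsIgnoreCase("TRANSLATE");
                continue;
            }
            Matcher m = ENTRY.matcher(trimmed);
            if (m.matches()) {
                table.put(m.group(1), m.group(2));
            }
            if (trimmed.endsWith(";")) {
                break; // a semicolon closes the TRANSLATE block
            }
        }
        return table;
    }

    /** Replaces each numeric label in the tree line with its name. */
    static String replaceLabels(String treeLine, Map<String, String> table) {
        // Numbers without a table entry are left as they are.
        return LABEL_NUMBER.matcher(treeLine).replaceAll(
                mr -> Matcher.quoteReplacement(table.getOrDefault(mr.group(), mr.group())));
    }
}
```

The lookbehind and lookahead in LABEL_NUMBER are there so that digits elsewhere on the line, such as the 0 in "tree_0", are not touched.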

Monday, April 13, 2009

Schedule Update

What follows is the beginning of a revised set of deadlines. I intend to complete the tasks listed here by Tuesday April 21st:

- Pre-parse Newick tree files to replace the given number-encoding with the appropriate scientific names, before sending the tree string to the regular PhyloWidget parser

- Retrieve images from Morphbank as MIME attachments, based on searches by scientific name, and cache the images locally

- Incorporate a thumbnail version of each image into the PhyloWidget tree display at its corresponding node, according to the diagram posted previously. (If time, allow multiple images per node and create arrows so that the user can scroll through the thumbnails for each.)

After this deadline, I plan to return to the current bug in my common name search to complete that portion of the code. I will also look at the way that PhyloWidget draws the trees with the alternative layouts, to determine how best to incorporate the images in those cases. Finally, I will consider the various ways to save a user's work and reload a saved file for later use and work on implementing that feature.

Sunday, April 12, 2009

Health and Interfaces

I'm baaaack! 

To update anyone who actually reads this (other than Joe Kider): 
I got sick last weekend and after an evening in the hospital and a week pretty much in bed, I'm actually pretty close to better now. (It was just a virus. The hospital thing was, as my Dad says, "in an abundance of caution.") My thinking is, my computer decided to crash, and once it was fixed my body followed suit, each setting me back at least a week. That will mean a little readjusting when it comes to procedures to reach my goals and deadlines originally outlined in my design document, but since I'm still pretty excited about the prospect of sitting and standing, I'm figuring the time will come to handle that. That time might be tomorrow.

In the meantime, I'm reading up on Processing so that I can add things like check-boxes and images and scroll arrows to the tree display. It'd be a shame to do all this planning and coding and then neglect to write the part that allows the users to actually see and interact with all the new features. 

Processing Tutorial Notes (via links): 

- Standard Image Loading
- Request Image: Loads in a separate thread so the program doesn't freeze while loading large images
- Check Boxes
- Custom UI Components

Additionally, Interfascia is a library for Processing that handles user interaction, and it looks simple enough that it may be worth using. 

Thursday, April 2, 2009

Up and Running

I've got a brand new 500GB hard drive, a reinstalled operating system, and a heck of a lot of catching up to do. Yesterday's project was to get Eclipse, Processing (UI), and Subclipse (SVN) installed so that I can run PhyloWidget and my own code as they were before the crash happened last Sat night (1.5 weeks ago). So that was an accomplishment because I had to remember how I'd installed everything in the past couple months, but I got it working. 

Today's issues are figuring out why my common name search is buggy and then proceeding onto dealing with searching multiple databases for images. At the same time I'll also need to make inroads into writing UI elements for each of these features and then integrating the code and the UI stuff into the PhyloWidget program.

The buggy-ness of the common name search refers to the fact that when I post data to the ubio.com search form and get the resulting HTML page, the page I get is not the same as the page I see when I do the same search in my browser. My code receives the HTML for an "Advance Search" page, rather than a results page. I've had my program print out the URL it's requesting, and it's the same as the results page URL in my browser, yet the HTML is for an advance search page. Totally strange. Any ideas, anyone? 

I'm also concerned about the SVN process, because whenever I want to update to a new revision of PhyloWidget, Eclipse overwrites the one I'm currently using. If I make changes and start including my own code, I want to be able to download revisions without deleting my own work. If you know anything about how to manage this, please share. 

Glad to be back on track, but it's an uphill battle as always!