Explore the Galaxy

Galaxy is a web-based application that lets you run computational analyses with a large number of tools on large data sets from databases such as the UCSC Genome Browser. (You can load your own data too.)

In Galaxy, both the online datasets and the processing live on Galaxy's servers and not on your own machine. This lets you use workflows and histories to queue the steps of your analysis without waiting for each tool to finish processing. Histories are never deleted, so you can retrieve them again and again or make workflows from them.

Histories and Workflow

Edit Workflow

Retrieve Data from Genome Browser

The other advantage is that it is based on a user-friendly GUI: it mostly involves clicking on things and does not involve complex coding. So it's awesome for people like me who get a bit cross-eyed when we have to use R or Perl.

The site is free to register, and you can usually get fast and easy results, unless it's a Friday afternoon, when apparently the Americans wake up and sprint down the last stretch into the weekend. It is frequently updated with new tools, so you are always exposed to new possibilities.

Galaxy 101

The site also has a nice set of exercises to help you get started, with screencasts (narrated videos of complete walkthroughs). I have tried the exercises, including finding segmental duplications in genes, where I carried out my analysis on chromosome 6 (my favourite chromosome at the moment!). We found a higher than expected number of duplications on the chromosome. Chromosome 6 houses the MHC genes, which are very important for antigen presentation in the interaction between white blood cells and other body cells.
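For anyone curious what the segmental duplication exercise boils down to, here is a minimal Python sketch of the underlying idea: counting how many duplication intervals overlap each gene interval. This is not what Galaxy runs internally, and the gene names, duplication names and coordinates are all made up for illustration.

```python
# Hypothetical intervals on a single chromosome: (name, start, end),
# using half-open coordinates [start, end). None of these are real features.
genes = [
    ("geneA", 100, 500),
    ("geneB", 800, 1200),
    ("geneC", 2000, 2600),
    ("geneD", 4000, 4500),
]
duplications = [
    ("dup1", 450, 900),
    ("dup2", 2500, 3000),
    ("dup3", 1100, 1300),
]


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def count_overlaps(genes, duplications):
    """Return {gene name: number of duplications overlapping that gene}."""
    counts = {}
    for name, g_start, g_end in genes:
        counts[name] = sum(
            overlaps(g_start, g_end, d_start, d_end)
            for _, d_start, d_end in duplications
        )
    return counts


counts = count_overlaps(genes, duplications)
# Keep only the genes touched by at least one duplication.
genes_with_dups = [name for name, n in counts.items() if n > 0]

print(counts)
print(genes_with_dups)
```

The nested loop is fine for a toy list like this; on a real chromosome you would want sorted intervals or an interval tree, which is exactly the kind of thing it is nice to let Galaxy handle.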

Another exercise involves finding new exonic SNPs unique to Archbishop Desmond Tutu that are not found in the March 2006 hg18 genome assembly, and finding out whether these SNPs could change the amino acid sequence of the protein, which could in turn change its 3D structure and hence its function. I found no such change in the first 500 SNPs on chromosome 6 (servers were slowing down at this point), but there are definitely structure-altering SNPs in Desmond Tutu’s DNA.
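The amino acid check in that exercise can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the coding sequence is a made-up toy, it is on the forward strand and already in frame with no introns, and `classify_snp` is a hypothetical helper name, not a Galaxy tool. It only tells you whether the encoded amino acid changes; it says nothing about 3D structure.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG ordering of codons.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Classify a single-base change in an in-frame coding sequence.

    cds: coding sequence (forward strand, starts at codon position 0)
    pos: 0-based index of the changed base within cds
    alt: the alternative base at that position
    Returns (reference amino acid, alternative amino acid, label).
    """
    codon_start = (pos // 3) * 3
    ref_codon = cds[codon_start:codon_start + 3]
    offset = pos - codon_start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return ref_aa, alt_aa, label


toy_cds = "ATGGAAGTT"  # hypothetical sequence encoding Met-Glu-Val

print(classify_snp(toy_cds, 4, "T"))  # GAA -> GTA
print(classify_snp(toy_cds, 5, "G"))  # GAA -> GAG
```

A "nonsynonymous" result is only the first filter: whether the swapped amino acid actually alters the protein's structure or function needs further analysis.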

Desmond Tutu

P.S. A few weeks ago I rummaged through the genome browser looking for SNPs in a 15 kb interval, a very manual process of selecting SNPs and copying data by hand. It took me 2 days to get my target sequence fully annotated and verified. If only I had known about Galaxy then… I could have just obtained a full list of SNPs in the region with a few clicks… So thanks to Dr Richard Badge for his nice introduction to using Galaxy for bioinformatics.
