
Figure 1. Identification of potential drug targets in unculturable organisms. The left panel shows an overview of the genomic approach. The right panel shows the results of applying essentiality predictions to the genome of the Wolbachia endosymbiont of Brugia malayi. Dots in the right panel represent Wolbachia genes, with red dots indicating sequence similarity to known drug targets. 

Sanjay Kumar, PhD, a senior scientist at New England Biolabs in Ipswich, Mass., works in a small research group that studies human lymphatic filariasis, a disease caused by the parasitic worm Brugia malayi. His lab has developed software that applies in silico methods on a genome-wide scale to identify potential drug target genes in the worm genome and in the genome of Wolbachia, a bacterium that lives in the worm as an endosymbiont. Both are generally intractable, unculturable organisms that yield no functional genomics information of their own. “We try our best to figure out which genes are essential because those would make the easiest drug targets,” says Kumar.

Performing genomic screens on this bacterium is difficult because there is no closely related model organism. Instead, Dr. Kumar and his team run BLAST (Basic Local Alignment Search Tool) searches with the Wolbachia genes they identify, comparing them against a database of experimentally identified essential bacterial genes. For each gene, the analysis produces a score based on sequence similarity to one or more closely related genes in that database. “Our output is a ranked list, from which we can filter based on similarity to known drug targets or lack of similarity to human proteins,” says Kumar.
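The ranking-and-filtering idea described above can be illustrated with a minimal Python sketch. It is not the lab's software: the `Hit` record, the gene and database identifiers, the use of BLAST bit scores as the similarity measure, and the `max_human_score` cutoff are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str       # hypothetical Wolbachia gene ID
    subject: str     # hypothetical database entry it matched
    bitscore: float  # similarity score, assumed here to be a BLAST bit score


def best_scores(hits):
    """Return the highest score seen for each query gene."""
    best = {}
    for h in hits:
        if h.bitscore > best.get(h.query, 0.0):
            best[h.query] = h.bitscore
    return best


def rank_candidates(essential_hits, human_hits, target_hits, max_human_score=50.0):
    """Rank genes by similarity to known essential genes.

    Genes that resemble a human protein too closely are dropped, and each
    remaining gene is flagged if it also resembles a known drug target.
    """
    essential = best_scores(essential_hits)
    human = best_scores(human_hits)
    targets = best_scores(target_hits)
    rows = []
    for gene, score in essential.items():
        if human.get(gene, 0.0) > max_human_score:
            continue  # too similar to a human protein
        rows.append((gene, score, gene in targets))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows


# Made-up example data, not real search results.
essential_hits = [
    Hit("wol_gene_1", "ESS_a", 310.0),
    Hit("wol_gene_1", "ESS_b", 120.0),
    Hit("wol_gene_2", "ESS_c", 95.0),
    Hit("wol_gene_3", "ESS_d", 220.0),
]
human_hits = [Hit("wol_gene_3", "HUMAN_x", 180.0)]
target_hits = [Hit("wol_gene_1", "TARGET_y", 140.0)]

ranked = rank_candidates(essential_hits, human_hits, target_hits)
for gene, score, similar_to_target in ranked:
    print(gene, score, similar_to_target)
```

In this toy data set, the third gene is removed because of its strong human match, and the remaining genes are listed from highest to lowest essentiality score.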

“For the worm we had a slightly different approach because we have a very strong model organism, Caenorhabditis elegans, available,” says Kumar. “For Brugia malayi we can just basically go back and look at RNAi phenotypes that have been measured for orthologous genes in C. elegans and see which ones of those are significant.”

“By those means, we are able to filter and pare down from 11,000 genes in B. malayi to about 600 genes so that we have a manageable number of genes that other labs in our group can evaluate manually and develop a few into reasonable drug targets,” says Kumar. One of those drug targets in the worm is cofactor-independent phosphoglycerate mutase (iPGM), a form of a conserved glycolytic enzyme that is present in the worm but not in its human host.
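As a rough illustration of the ortholog-based filter Kumar describes for the worm, the Python sketch below keeps only those B. malayi genes whose C. elegans ortholog has a severe RNAi phenotype. The gene names, the ortholog table, and the set of phenotype labels treated as "significant" are hypothetical; the article does not specify which phenotypes the lab uses.

```python
# Phenotype labels treated as significant; an assumption for illustration.
SEVERE_PHENOTYPES = {"embryonic lethal", "larval arrest", "sterile"}


def filter_by_rnai(orthologs, rnai_phenotypes, severe=SEVERE_PHENOTYPES):
    """Keep B. malayi genes whose C. elegans ortholog shows a severe RNAi phenotype.

    orthologs:       dict mapping B. malayi gene -> C. elegans ortholog
    rnai_phenotypes: dict mapping C. elegans gene -> set of observed phenotypes
    """
    kept = []
    for bm_gene, ce_gene in orthologs.items():
        observed = rnai_phenotypes.get(ce_gene, set())
        if observed & severe:  # at least one severe phenotype was recorded
            kept.append(bm_gene)
    return sorted(kept)


# Made-up example data.
orthologs = {"bm_gene_1": "ce_gene_a", "bm_gene_2": "ce_gene_b", "bm_gene_3": "ce_gene_c"}
rnai_phenotypes = {
    "ce_gene_a": {"embryonic lethal"},
    "ce_gene_b": {"wild type"},
    # ce_gene_c has no recorded RNAi phenotype
}

candidates = filter_by_rnai(orthologs, rnai_phenotypes)
print(candidates)
```

A filter of this kind is what turns a full gene list into a short list that can be reviewed by hand.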



Figure 2. Screen capture from the Genomics Reviewer Desktop (GRD), built on the Avadis platform. (Source: Strand Life Sciences)

“Our software is based on the principle of annotation transfer using sequence similarity when phenotypic data are available. So, in the case where some folks have already annotated genes that are essential, we’re just transferring that annotation to genes in our organism. When those data are missing, we use phyletic conservation, which essentially means using a clustering algorithm to figure out which genes are conserved across many related genomes and therefore likely to be essential,” says Kumar.
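The phyletic conservation idea can be sketched in a few lines of Python. The example assumes the clustering step has already been done and that each cluster lists its member genes by genome. The cluster contents, the genome labels, and the 75 percent cutoff are invented for illustration and are not taken from the lab's software.

```python
def conserved_genes(clusters, target_genome, n_genomes, min_fraction=0.75):
    """Return target-genome genes whose cluster spans enough genomes.

    clusters: dict mapping cluster ID -> list of (genome, gene) pairs,
              assumed to come from a separate clustering step.
    """
    results = []
    for members in clusters.values():
        genomes = {genome for genome, _ in members}
        fraction = len(genomes) / n_genomes  # share of genomes with a member
        if fraction < min_fraction:
            continue
        for genome, gene in members:
            if genome == target_genome:
                results.append((gene, fraction))
    # Most widely conserved genes first, ties broken by gene name.
    return sorted(results, key=lambda r: (-r[1], r[0]))


# Made-up clusters over four hypothetical genomes.
clusters = {
    "c1": [("wol", "gene_1"), ("g2", "x1"), ("g3", "y1"), ("g4", "z1")],
    "c2": [("wol", "gene_2"), ("g2", "x2")],
    "c3": [("wol", "gene_3"), ("g2", "x3"), ("g3", "y3"), ("g3", "y3b")],
}

conserved = conserved_genes(clusters, "wol", n_genomes=4)
print(conserved)
```

Counting distinct genomes rather than cluster members keeps a gene family with several copies in one genome from looking more widely conserved than it is.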

In addition to drug target discovery tools, there are genome analysis tools for the downstream U.S. Food and Drug Administration (FDA) submission process. A case in point: the FDA has a pharmacogenomics review process that lets pharmaceutical companies voluntarily add genomics data to their drug application submissions. The process relies on several software components, including ArrayTrack, the FDA’s main storage and analysis platform; JMP Genomics (Cary, N.C.) for gene expression data analysis; and other software for analyzing pathways and annotations.

“What the FDA needed was a way to integrate data from a number of different platforms to gain a consensus view on the submission…” says Thon de Boer, PhD, product management director, Strand Life Sciences, San Francisco, Calif., “…so that is what we built for them.” The software, Genomics Reviewer Desktop (GRD), is built on the Avadis platform and serves as a consensus builder: it compares and contrasts the results from the various pharmacogenomics data analysis applications and then produces a report that can become part of the submission process. The Avadis platform allows the FDA to identify outliers in genomic data that may indicate a drug is not performing as expected. By efficiently integrating data from various applications, the FDA platform aims to shorten the drug approval process.

Web-based genomic analysis
Also on the market are Web- and cloud-based solutions for managing and analyzing next-generation DNA sequence data. DNAnexus (Palo Alto, Calif.) offers one such genomic analysis tool. The cloud is a highly scalable computational and data storage infrastructure on which data analysis and visualization applications for genomic research can be built. That scalability is particularly attractive given how much the flow of data varies across labs and research applications. Because computational resources can be accessed as needed, users pay only for what they need, when they need it.

“Our goal is to enable scientists utilizing NexGen DNA sequence data to very easily deploy a technology infrastructure that would off load the challenges associated with building and managing the bioinformatics necessary to support high throughput sequencing platforms,” says Andreas Sundquist, PhD, co-founder and chief executive officer of DNAnexus. Essentially, the company wants to eliminate the need to purchase the computer hardware and software required to analyze the massive amounts of raw sequencing data generated by instruments from companies such as Life Technologies and Illumina.

“With DNAnexus, researchers can have all of their DNA sequence data securely sent to us via the Internet and they will reside completely in the cloud where we perform quality control and other typical analyses such as read mapping. Our vision is that DNAnexus will ultimately eliminate many of the data bottleneck issues that currently come with the use of next-generation DNA sequencing today,” says Sundquist. “We streamlined the whole effort so that users can literally access a massive bioinformatics infrastructure in 60 seconds.”

In this scenario, a lab would engage DNAnexus by signing up for an account, logging in, and deciding how much of its sequencing data to send. DNAnexus would then perform the initial read mapping and quality assessments and provide access, via any Web browser, to visualization tools for analyses such as ChIP-seq and RNA-seq.

Analysis tools for genomic arrays
Another package built on the Avadis platform is GeneSpring from Agilent Technologies, Santa Clara, Calif. “GeneSpring is an intuitive suite of analysis software for microarray-based DNA and RNA applications such as genomic copy number, association studies, expression, splicing, and miRNA,” says Michael Janis, product manager for GeneSpring at Agilent Technologies. “It offers an interactive desktop computing environment that promotes investigation and enables understanding of microarray data within a biological context. GeneSpring has a long history as an industry standard in microarray analysis and benefits from a large and vibrant community,” says Janis.

“Pharmaceutical companies benefit from the classification routines available in GeneSpring for model building. These include powerful classifiers such as SVM, naïve Bayes, and neural nets, all accessed through the same intuitive workflow in GeneSpring. As an application that supports multiple technologies, GeneSpring enables integrated biology research through analysis of all biological entities (genes, genomic variations, microRNAs, exons, proteins, and metabolites) as well as the combination of such heterogeneous data.”

About the Author
James Netterwald is president and CEO of BioPharmaComm LLC, a provider of writing, editing, and consulting services to the life science, pharma-biotech, and public relations industries.