An amusement park is always fun to visit on a hot summer nucleus. When you get there, you can rent a receptor and go for a swim. And there are lots of dynamic things to eat. You can start off with a hot dog on a membrane with mustard, relish, and genes. Then you can have a buttered ear of protein with a nice sticky slice of watermelon and a big bottle of cold cytoplasm. When you are full, it's time to go on the roller coaster, which should settle your hepatocyte. Other amusement park rides are the Dodge-'Em, which has little neurons that you drive and run into other labs, and the Merry-Go-Round, where you can sit on a big database and try to grab the brass microarray as you ride past.
Alternative splicing and transcription are independently regulated to generate tissue-type-specific patterns. (Source: Benjamin Blencowe, University of Toronto)
Alternative splicing was first found in viruses and a few unusual protozoans. In his 1993 Nobel Prize acceptance lecture, Phillip Sharp, co-discoverer of RNA splicing, estimated that about one of every 20 human genes is alternatively spliced. Some researchers found Sharp's estimate surprising. Was it really possible that 5% of human genes can play molecular Mad Libs?
In 2003, researchers revised Sharp's estimate. Rather than one of every 20 genes, they suggested, alternative splicing occurs in three of four. For a huge majority of human genes, it turns out that splicing variation is a fact of daily life. Researchers in genomics and proteomics are just beginning to come to grips with the implications of that discovery.
Step right up
Alternative splicing is a phenomenon ideally suited to postgenomic studies. "The genomics data have really changed our perception of alternative splicing from being a
Sidebar: Ups and Downs of a Wild EST Ride
Scientists in the field of bioinformatics have been on a wild rollercoaster ride for several years. When studies with expressed sequence tags (ESTs) first suggested that the human genome contains about 100,000 genes, the computer programmers blanched. With the completion of various drafts of the full genome sequence, and the revelation that there are only about 30,000 human genes, some of their color started to return. Now, the discovery of widespread alternative RNA splicing may send stomachs back into throats in the computer lab.
Because ESTs theoretically represent all of the messenger RNA species in a cell, the higher number of ESTs was at least partly a product of alternative splicing. Researchers may indeed need to track 100,000 or more transcripts to understand a cell at the genomic level.
"We can imagine that over the coming couple of years, there will be enormous amounts of quantitative data generated for alternative splice forms across a variety of tissue or disease states," says Christopher Lee, PhD, associate professor of chemistry at the University of California, Los Angeles.
Not everyone will be thrilled to hear that. "For many researchers, microarrays that report on 25,000 genes already provide a daunting amount of data to interpret biologically," says Jason Johnson, PhD, director of the genomics informatics department at Rosetta Inpharmatics. "At this point," says Johnson, "we don't know the molecular functions of half of these genes, let alone functional differences of all their alternative isoforms."
In bioinformatics, there may be little to do but remain seated until the ride comes to a complete stop.
Before whole-genome sequences were available, researchers who found extra mRNA bands on their gels often assumed they were contaminants. Now, genomic microarrays can provide a quick test for alternative splicing by revealing the complete exon structures of differentially spliced messages from the same gene.
Although many investigators are grabbing at low-hanging fruit in alternative splicing studies, Lee cautions that "there's still definitely a shakeout period right now where people try to figure out the boundaries of what is biologically functional." For example, about a third of the splice variants observed so far may cause protein degradation. That might be an important regulatory process controlling protein activity, or it might make those particular variants biologically irrelevant.
Much of the variation is clearly functional, though. Alternative splicing is "important for development, differentiation, [and] regulation of cell death," says Benjamin Blencowe, PhD, associate professor in the Banting and Best Department of Medical Research, University of Toronto, Canada. "We also know that alternative splicing is affected in different disease situations."
Understanding the details of those processes will take some time. The 2003 finding that three-fourths of all genes can contain at least one alternative exon certainly invigorated the field, but array designers are still catching up. "The importance of alternative splicing is certainly well appreciated, but it may take a while for this extra layer of biological complexity to become standard in microarrays and other genomics tools," says Jason Johnson, PhD, director of the genomics informatics department at Merck subsidiary Rosetta Inpharmatics, Seattle. Johnson was lead author on the 2003 genome-wide survey [Johnson et al., Science, vol. 302, pp. 2141-2144 (2003)], but he now focuses on more detailed analyses of splicing variation in individual drug target genes.
The sideshow tents
For investigators who want to study alternative splicing of particular genes, there is a major signal-to-noise problem. If a particular exon disappears under a set of experimental conditions, is that because it was excluded from the transcript, or because the entire transcript became less abundant? Johnson's initial survey looked only at simple exon-skipping events, and it identified situations where an alternatively spliced exon sits between two "constitutive" exons that are always present. The constitutive exons provided a control for RNA levels. However, many alternatively spliced exons sit next to other alternatively spliced exons, so researchers need a different approach for more detailed studies.
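The control described above can be illustrated with a few lines of Python. This is only a minimal sketch, not the method used in Johnson's survey: the function name `relative_exon_signal` and all of the probe intensities are hypothetical, and the calculation assumes that probe intensity scales linearly with transcript abundance, which real arrays only approximate.

```python
from statistics import mean


def relative_exon_signal(alt_intensity, flanking_intensities):
    """Return the alternative exon's signal relative to its constitutive neighbors.

    Dividing by the flanking constitutive exons removes changes in overall
    transcript abundance, leaving only changes in exon inclusion.
    """
    baseline = mean(flanking_intensities)
    if baseline <= 0:
        raise ValueError("constitutive exon signal must be positive")
    return alt_intensity / baseline


# Hypothetical probe intensities in arbitrary units (illustrative, not real data).
samples = {
    "control": (800.0, [1000.0, 1000.0]),
    "exon skipped": (200.0, [1000.0, 1000.0]),     # exon drops, transcript level steady
    "transcript down": (200.0, [250.0, 250.0]),    # whole message is less abundant
}

for name, (alt, flanking) in samples.items():
    print(f"{name}: {relative_exon_signal(alt, flanking):.2f}")
```

In the two treated samples the alternative exon gives the same raw signal, but only the "exon skipped" sample shows a drop in the normalized ratio. When the neighboring exons are themselves alternatively spliced, no such baseline exists, which is why a different approach is needed.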
UCLA's Lee is now addressing this problem. His assay combines specially designed microarrays with computing algorithms that deconvolute the complex array results.
Sidebar: In Gene Expression, Planning is Half the Battle
These days, gene expression studies are tough. You design your experiment, process the fragile RNA samples, get a profile of all of the transcripts in the control and experimental cells, and mine the data every which way with complex software. Then you find out that someone just discovered an entire universe of gene regulation phenomena that your experiment completely overlooked.
While some computer algorithms may permit retrospective splicing studies on arrays that were designed for other purposes, most gene profiling labs should probably start thinking about RNA splicing before designing the arrays. Unfortunately, few do.
"I look at people doing microarray analysis, which today is one of the great capacitating tools that people can use in basic research, [and] I think that 99% of the designs just ignore the fact that you can produce proteins of completely different functions from the same gene," says Juan Valcarcel, PhD, a research professor at ICREA, Barcelona, Spain. Considering the complexity of a "simple" transcript profiling experiment, the oversight is understandable. "They just don't want to complicate their lives," says Valcarcel.
Experts in the field recommend refining the experimental question before designing the array, since arrays can only accommodate a limited number of probes. To paraphrase P.T. Barnum, you can't profile all of the variants all of the time.
"One piece of advice I might give is to limit your experimental design and plan the analysis thoroughly up-front," says Jason Johnson, PhD, director of the genomics informatics department at Rosetta Inpharmatics, Kirkland, Wash. Johnson, a pioneer in alternative splicing studies, adds that "at this point in the technology, it's impossible to do everything at once, so strategic compromises will make your results easier to interpret."
Because of its sophisticated computer algorithm, the technique is not limited to custom-designed chips. "The same principle can be applied to pretty much any data set from different microarrays," says Lee. The original technique relied on putting redundant probes on an array to detect splice variation, but the researchers found that analyzing redundant samples with conventional arrays can yield similar results. Though most genomics researchers are already awash in data (see the sidebar on ESTs), retroactively analyzing old microarrays for alternative splicing would be much less expensive than repeating the experiments with new arrays.
If Lee's approach has a drawback, it is the system's qualitative outcome: it tells whether a gene is spliced in a particular manner under a set of conditions, but does not reveal what percentage of the gene's transcripts has that configuration. The distinction is important, because recent evidence suggests that alternative splicing is not an all-or-nothing phenomenon. For example, a gene may splice 10% of its transcripts one way and 90% another way in a particular type of cell.
To study this, Blencowe and colleagues developed a quantitative approach to splice variation. They used a two-part system consisting of custom-designed microarray chips, plus an advanced machine learning computer algorithm, developed in collaboration with Brendan Frey and colleagues at the University of Toronto, to analyze the data. "This approach to data analysis is a key component," says Blencowe. "What we've ended up with is a platform that allows us to measure quite accurately inclusion levels of different exons."
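To make the idea of an "inclusion level" concrete, here is a minimal Python sketch. It is not the Toronto group's machine learning algorithm, which the article does not describe in detail. The function name `percent_inclusion` and the signal values are hypothetical, and the sketch assumes that the signals for the exon-including and exon-skipping isoforms have already been corrected for background and probe affinity.

```python
def percent_inclusion(inclusion_signal, skipping_signal):
    """Percentage of a gene's transcripts that include a given exon.

    Both arguments are assumed to be background-corrected signals that are
    proportional to the abundance of the two isoforms.
    """
    total = inclusion_signal + skipping_signal
    if total <= 0:
        raise ValueError("no signal for either isoform")
    return 100.0 * inclusion_signal / total


# Hypothetical isoform signals for one exon in two cell types (not real data).
cell_types = {
    "cell type A": (10.0, 90.0),
    "cell type B": (75.0, 25.0),
}

for name, (included, skipped) in cell_types.items():
    print(f"{name}: {percent_inclusion(included, skipped):.1f}% inclusion")
```

A qualitative assay would report that both cell types contain the exon-including form. The percentages show how different the two cells actually are.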
In order to get reliable quantitative data from this system, researchers must design their arrays from scratch with alternative splicing in mind. "Most microarray experiments that have been published use microarrays that are only suitable for measuring transcription levels. They usually only contain one to several probes to a specific region of a transcript," says Blencowe. Detecting all of the possible splice products requires many more probes per gene, because the probes must cover all the exons and splice junctions.
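A rough count shows why the probe numbers climb so quickly. The sketch below is an upper bound under a simplifying assumption, not a description of any real array design: it supposes that any upstream exon could be joined to any downstream exon, whereas actual designs typically cover only known or predicted splicing events. The function name `probe_targets` and the exon labels are hypothetical.

```python
from itertools import combinations


def probe_targets(n_exons):
    """List the exon bodies and exon-exon junctions a probe set could target.

    Assumes any upstream exon may be spliced to any downstream exon, so the
    junction list is an upper bound on what a real design would need.
    """
    exons = [f"E{i}" for i in range(1, n_exons + 1)]
    # combinations() preserves order, so each pair is (upstream, downstream).
    junctions = [f"{up}-{down}" for up, down in combinations(exons, 2)]
    return exons, junctions


for n in (5, 10, 20):
    exons, junctions = probe_targets(n)
    print(f"{n} exons: {len(exons)} exon probes + {len(junctions)} junction probes")
```

The junction count grows roughly with the square of the exon count, so even a modest gene can call for far more than the one to several probes that Blencowe describes on conventional arrays.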
That type of high-resolution study, however, will be critical for understanding cell biology. The exact proportions of different splicing factors and splicing events may determine a cell's overall physiological state, says Juan Valcarcel, PhD, research professor at ICREA, Barcelona, Spain. "If you look at the number of transcription factors that there are in a human cell, it's probably on the order of 2,000 or 3,000," says Valcarcel. "The number of splicing factors is probably one order of magnitude less than that."
Instead of using tissue-specific splicing factors, cells may rely on competition between different splicing mechanisms to determine the overall composition of mRNAs. Making particular factors more or less abundant could shift the equilibrium toward a different pattern of alternative splicing in multiple genes, dramatically altering the physiology of the cell. Such changes are not strictly academic. "Susceptibility to disease in many different pathological situations can be correlated with changes in the relative expression of isoforms for multiple genes," says Valcarcel.
For example, splicing variations that truncate growth factor receptors can cause tumor cells to enter an autocrine loop, in which they constantly tell themselves to divide and become deaf to tumor-suppressing signals. Similarly, a specific splicing change in the cell adhesion molecule CD44 may allow certain tumors to become metastatic and spread to other parts of the body. In cases like these, differences in the relative abundance of alternatively spliced messages might determine disease susceptibility or severity from one patient to the next.
Up the first big hill
Now that the breadth and depth of alternative splicing are becoming clear, more researchers are starting to face the challenges of a rapidly maturing field (See sidebar on
Alternative splicing can include or exclude specific exons to generate different messages, which in turn produce distinct protein structures. (Source: Benjamin Blencowe, University of Toronto)
"There are so many issues here," says Lee. "For example, the studies that people typically do . . . you'll see differences in splicing between brain, say, and kidney or muscle. But just because that's the level at which the samples were prepared really doesn't tell us much of anything about whether that's the scale on which alternative splicing is being regulated." RNA splicing may differ between neurons in different parts of the cerebral cortex, or between slow-twitch and fast-twitch muscle cells, so investigators will have to find new ways to fractionate tissues if they want to see the details.
As in the rest of genomics, though, the big elephant in the corner of the room is protein chemistry. "There are only a few examples where someone has solved the structure of an alternatively spliced protein product," says Lee. Those examples are as instructive as they are intimidating, suggesting that differently spliced RNA isoforms of a gene can produce proteins with dramatically different structures and functions. Understanding the impact of any given alternative splicing event may require an extensive series of biochemical experiments.
Working out the mechanisms of alternative splicing will also take some time. "Although people are getting ideas about how these [splicing] factors bind and what they do, perhaps the challenge is to try to find a common thread in all these cases," says Valcarcel. "Will we have to go in detail into every single alternative splicing event . . . or are there [underlying] rules that we can't identify yet?"
Understanding the splicing rules would be like seeing the full text of the Mad Lib in advance, rather than choosing verbs and nouns at random. Then everyone could go grab a bottle of cold cytoplasm and skip the hot dog on a membrane.
About the Author
Originally trained as a microbiologist, Alan Dove has been writing about science and its interfaces with industry and government for more than a decade.
This article was published in G & P magazine: Vol. 5, No. 5, June, 2005, pp. 25-29.