The full potential of protein array technology is within reach, but there are still hurdles to be overcome before it is completely realized.

Although early proteomic tools shone a light on the problems that arise when researchers try to identify proteins in complex mixtures, they did not give clues as to the function of these proteins. This shortcoming led to the development of protein function arrays, which allow researchers to identify proteins and measure their activities, whether enzymatic, DNA-binding, or other. However, many obstacles stood in the way of developing the first protein microarray and the first protein function microarray.

A procedure for the analysis of proteins using the MudPIT technology consists of multiple chromatographic steps, followed by two mass spectrometry (MS) scans. MudPIT allows for identification of proteins from complex mixtures without the aid of an antibody. (Source: Thomas Kislinger, PhD)

The first DNA microarray was introduced in 1994 by Affymetrix Inc., Santa Clara, Calif., marking an important milestone in the brief history of genomics. Although DNA microarrays have had a significant impact, they are used to study DNA and RNA, which does little for the study of the biologically active molecules themselves, the proteins. In the late 1990s, however, the first protein microarray was developed, giving birth to proteomics. "We decided that we would try to make protein microarrays because we envisioned that they would have a lot more impact than DNA microarrays," says Michael Snyder, PhD, a professor of molecular, cellular, and developmental biology at Yale University, New Haven, Conn.

Challenges to overcome
There were two major challenges to overcome before the protein array could come to fruition, namely finding a way to produce hundreds, if not thousands, of proteins for the array, and keeping those proteins functional after depositing them on the array surface. But according to Snyder, the biggest challenge was producing enough protein content for the array. This basically involved making collections of expression clones, overproducing the proteins they encode, and then purifying all of these proteins.

A major problem that Snyder encountered while developing protein content was that the organism in which a protein was over-expressed affected the protein's function. For example, Snyder observed that many of the yeast protein kinases that he over-expressed in Escherichia coli were inactive, but when the same proteins were expressed in yeast, they were active. He later found that this was due to the proteins requiring posttranslational modifications for their function, modifications not made in E. coli. To produce proteins in a high-throughput fashion, the Snyder lab set up the production procedures in 96-well plates.

The second challenge was figuring out how to deposit the proteins on the array surface and keep them functional. While working out the conditions for this, Snyder found that "different array surfaces work better for different applications." It is therefore necessary to tailor the surface to the particular type of assay being performed. The first surface to prove successful in Snyder's protein microarray consisted of chelated nickel atoms, which bind the histidine residues in His-tagged proteins. However, although this strong, non-covalent interaction worked for some assay types, it did not work for protein kinase assays. "Usually, if you test a few different surfaces, you'll be able to find one that works well for your particular assay," says Snyder.

In 2000, the Snyder lab, in collaboration with scientists at the University of North Carolina, Chapel Hill, succeeded in producing the first protein microarray by arraying the yeast proteome. "The diversity of assays for protein microarrays is much higher than that for DNA microarrays. . . . So we think that the overall utility should be much higher," says Snyder. To demonstrate the diversity of applications for this new tool, the Snyder lab probed the first yeast proteome chip for several activities, including protein-protein interactions and lipid binding, and then tested it for several others over the next few years.

Recently, Snyder's lab published a study in which they looked for substrates of protein kinases and found that many of these kinases have multiple substrates. Along the way, Snyder set up a company to produce the yeast proteome chip commercially. The company was recently acquired by Invitrogen Corp., Carlsbad, Calif., which continues to produce the chip.

Strengths and weaknesses
"The classical way of studying protein function is to purify your favorite protein, throw some assays at it, and then you see what you see," says Snyder. That has all changed with the advent of "omics" technologies, such as the protein microarray, which allow researchers to study biological molecules in an unbiased fashion and make unexpected discoveries. "A classic example of this was in a study that we published in 2004 in which we found, unexpectedly, that a yeast metabolic enzyme binds DNA, an activity that no one else would have thought to look for."

One of the shortcomings of this blinded approach is that the researcher must use what they know to find out what they don't know. For example, Snyder used yeast protein kinase A to develop his protein microarray-based protein kinase assay because it is well-characterized.

Proteomics has come a long way, and there is now a human proteome chip composed of 5,000 proteins, which does not cover every protein encoded by the human genome. "At one level, there is at least one gene product for every gene locus," says Snyder, who adds that with alternative splicing and posttranslational modifications, there could be millions of human proteins if all the different isoforms are counted.

An example of the power of MudPIT technology, this hierarchical clustering of the proteomic profiles shows all of the proteins detected in each organelle of each of these organs. (Source: Thomas Kislinger, PhD)

Snyder says that another issue with human arrays is that some multi-spanning membrane proteins (e.g., G-protein-coupled receptors) are not functional. These complex proteins have multiple hydrophobic transmembrane domains that affect their ability to fold properly outside the context of a cellular membrane, so it is understandable that their function is difficult to maintain on an array surface.

Despite these problems with the human proteome chip, researchers are still using it to investigate human disease. "We use protein arrays to monitor auto-antibodies that patients make to their disease state and to do high-throughput identification of protein-protein interactions," says Arul Chinnaiyan, MD, PhD, professor of pathology at the University of Michigan, Ann Arbor. To identify new cancer biomarkers, Chinnaiyan measures the differential immunoreactivity of human sera (control serum versus cancer patient) against human antigens displayed on a 2,300-element phage peptide microarray. Each spot on the array contains a different phage-generated human peptide. A candidate biomarker is identified when there is a difference in the fluorescence-generating immune reaction in a cancer patient's serum versus control serum.
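
The comparison Chinnaiyan describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spot names, the fluorescence readings, and the two-fold cutoff are hypothetical and do not come from his study, and the sketch only looks for signals that are higher in cancer sera. A real analysis would add normalization and statistical testing.

```python
from statistics import mean

def candidate_biomarkers(cancer, control, min_fold=2.0):
    """Return spots whose mean signal in cancer sera is at least
    min_fold times the mean signal in control sera.

    cancer, control: dicts mapping a spot name to a list of
    fluorescence readings, one reading per serum sample.
    min_fold: hypothetical cutoff, not taken from the article.
    """
    hits = []
    for spot in sorted(cancer):
        if spot not in control:
            continue  # no control reading to compare against
        cancer_mean = mean(cancer[spot])
        control_mean = mean(control[spot])
        if control_mean > 0 and cancer_mean / control_mean >= min_fold:
            hits.append(spot)
    return hits

# Made-up readings for three spots on a hypothetical array
cancer_sera = {"peptide_A": [900, 1100], "peptide_B": [210, 190], "peptide_C": [50, 70]}
control_sera = {"peptide_A": [300, 200], "peptide_B": [200, 200], "peptide_C": [60, 60]}
print(candidate_biomarkers(cancer_sera, control_sera))
```

With these made-up readings, only the first spot clears the cutoff, because its mean signal in the cancer sera is four times its mean signal in the control sera.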

"Protein arrays are also the best tools for looking at high-throughput interactions, and it certainly allows you the opportunity to look for interactions between more low-abundant proteins such as transcription factors," says Chinnaiyan. However, one of the problems with using these arrays is that there are nonspecific protein interactions because of the washing conditions, he says. "You cannot wash using harsh conditions because the interactions are generally not strong and you may lose interactions with your protein of interest."

Synthesis in situ 
The problem of keeping all proteins functional on the array surface may be solved by using a novel type of protein array. "The approach that we have been using is to print the genes of the proteins on the array and then synthesize the proteins in situ on the surface of the array," says Joshua LaBaer, MD, PhD, director of the Harvard University Institute of Proteomics, Cambridge, Mass.

In this method, each gene of interest is first cloned into a plasmid vector such that the encoded proteins will have functional tags attached to them upon expression. The plasmids are then added, along with a rabbit reticulocyte transcription/translation extract, to the array surface. This extract drives expression of the tagged protein in situ. As soon as they are expressed, the proteins are captured by antibodies (also arrayed on the surface) that recognize and attach to the tag, thus binding the recombinant protein to the array surface. "We have done this with a variety of tags and they all worked fine, so there is nothing magic about the tag," says LaBaer. He adds that the key to developing this type of array technology is getting the chemistry right so that the in situ synthesis will occur efficiently.

Sidebar: Shotgun Proteomics
Determining the identity of a protein is just as important as determining its biological function. Often determined through bioinformatics, protein identity is only one piece of the proteomics puzzle. "I do pretty extensive shotgun proteomics using the MudPIT technique," says Thomas Kislinger, PhD, a scientist at the Ontario Cancer Institute in Canada. MudPIT, which stands for multidimensional protein identification technology, is basically a two-dimensional chromatography technique that he has used to identify cancer biomarkers in serum from a mouse model of human ovarian cancer.

The technique involves digesting the protein sample into its constituent peptides, which are then separated through a series of column chromatography steps. As the peptides elute from the column, they are sprayed or "shot" directly into a linear ion trap mass spectrometer (MS), hence the phrase "shotgun proteomics." The first MS scan assigns each peptide a mass-to-charge ratio. The most intense peptide signals are then fragmented in a second MS/MS scan, which assigns each peptide a unique "fingerprint." The fingerprints are then fed into bioinformatics databases that yield the protein's identity if the protein is in the database.

Kislinger says MudPIT lets researchers identify every protein in a given sample without any prior knowledge of the sample's protein content. This method has a distinct advantage over antibody-based protein function microarrays because the proteins that a researcher can study are not limited by the availability of capture and detection antibodies. "Obviously, compared to a microarray, MudPIT is a lot more time-consuming," says Kislinger. It took him about one year of dedicated MS work to identify the protein contents of six healthy mouse tissues, a task that might have taken one month with a microarray.
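
In the shotgun workflow described in the sidebar, the most intense peptide signals from the first MS scan are chosen for fragmentation in the second scan. A minimal Python sketch of that selection step follows. The peak values and the `top_n` setting are hypothetical, since the article gives no numbers, and real instruments apply further rules that are not modeled here.

```python
def select_precursors(scan, top_n=3):
    """Pick the top_n most intense peaks from a first MS scan.

    scan: list of (mz, intensity) tuples.
    top_n: hypothetical setting; the article gives no number.
    Returns the chosen peaks, most intense first.
    """
    ranked = sorted(scan, key=lambda peak: peak[1], reverse=True)
    return ranked[:top_n]

# Made-up first-scan peaks as (mass-to-charge ratio, intensity)
first_scan = [(445.12, 1200.0), (512.30, 8800.0), (623.84, 450.0), (701.41, 5300.0)]
for mz, intensity in select_precursors(first_scan, top_n=2):
    print(f"fragment m/z {mz:.2f} (intensity {intensity:.0f})")
```

Each selected peak would then go on to the MS/MS scan that produces the peptide's fingerprint.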

LaBaer cites several advantages of this in situ synthesis approach. Chief among them is that researchers do not have the burden of expressing and purifying the thousands of proteins needed for the array. This is important because there is no guarantee that those proteins would remain functional on the array surface after the expression and purification steps. Another major advantage is that proteins arrayed on the surface remain functional after the arrays are dried and stored for five months at room temperature, says LaBaer.

A third advantage of this array design is the relatively low cost of producing it. Although other array surfaces exist, his lab uses glass slides because they are cheaper, says LaBaer. "You can sputter gold onto glass, you can use hydrogels. . . some work, some don't. But these cost $10 apiece, whereas the glass slides that we use are just five cents apiece." While this method has many advantages, it was a challenge to develop. "Just trying to get it to work in a 96-well dish was a six-month exercise," says LaBaer. The lab also tried six or seven transcription/translation chemistries before settling on the most efficient one.

LaBaer says a majority of the work involved in developing the arrays consists of many DNA mini-preparations. "We had to work something out so that we can produce DNA at such a large volume, especially because the commercial kits for mini-preps are so expensive and produce inadequate, poor-quality yields," he says. "So we have spent a lot of time in the last year developing our own chemistries for getting mini-preps of adequate yield and quality for our arrays."

Deciding where to take this new technology seems to be LaBaer's biggest challenge. Currently, he is looking for biomarkers of autoimmune diseases such as Type 1 diabetes, and for biomarkers of diseases that may have an autoimmune aspect, such as some cancers. He is also working to find vaccine targets in infectious disease agents such as Vibrio cholerae, the etiologic agent of cholera.

This article was published in G & P magazine: Vol. 6, No. 7, September, 2006, pp. G14-G16.