Characterizing Proteins of Unknown Function: Orphan Cytochrome P450 Enzymes as a Paradigm

  1. F. Peter Guengerich,
  2. Zhongmei Tang,
  3. S. Giovanna Salamanca-Pinzón and
  4. Qian Cheng
  1. Department of Biochemistry and Center in Molecular Toxicology, Vanderbilt University School of Medicine, Nashville, TN 37232-0146

Abstract

With the rapid completion of genomic sequences of organisms today, we have far more gene products than functions we can ascribe. A number of experimental strategies have been developed and applied, both in vitro and in vivo, to put functions to these orphan proteins. The “deorphanization” of human and Streptomyces cytochrome P450 enzymes is considered quite important for pharmacology, with ramifications for the use of clinical therapeutics. The myriad of possibilities is too enormous to screen one reaction at a time, thus metabolomic or proteomic screens with complex biological samples are promising current strategies.

Introduction

The classical biochemistry paradigm of the twentieth century (Figure 1) involved a rather linear sequence of events: 1) in vivo phenomena were observed and replicated in vitro; 2) an in vitro assay was developed; 3) the enzyme was purified; 4) with the advent of recombinant DNA technology (ca. 1980), information about the protein (or an antibody) could be used to clone a cDNA, of which the nucleotide sequence could be determined; 5) using that sequence, the gene could be identified and studied. Today we have maps and sequences of all the genes in humans and numerous other organisms, but we only know the functions of about one-half the proteins in the best cases. It is estimated that the functions of 54% of the proteins expressed by Escherichia coli are known (1). Thus, in a sense we need to develop strategies to work backwards, going from gene to cDNA to protein and then function (Figure 1). In the authors’ opinion, this is one of the central problems in biology today, and strategies to solve it are still in their infancy. In the first part of this review, we discuss the basic requirements of experimental systems to address this problem and the various approaches that can be employed. In the second part, we focus on examples and progress with the cytochrome P450 family, traditionally a class of enzymes of great interest for their ability to metabolize exogenous substrates such as drugs and carcinogens.

Figure 1

    Traditional and twenty-first century approaches in science, emphasizing “reverse” strategies.

    The grouping of genes into superfamilies based on predicted amino-acid sequences or higher-order organizational units is common, but this gives rise to many “orphan” proteins assigned to a family but without a cellular or biochemical function.1 One of the early, classic examples of a designated superfamily is that which comprises the steroid nuclear receptors (2, 3). The field opened with the characterization of well-studied receptors [e.g., glucocorticoid, estrogen, progesterone (4–6)]. Scrutiny of the nucleotide and amino acid sequences of these receptors revealed their relationship to each other and to a set of other genes, totaling more than fifty members in this family. Mining of this family has revealed functions for a number of these orphans, including the receptors now known as peroxisome proliferator-activated receptors (PPARs) (7), pregnane X receptor (PXR) (8), and constitutive androstane receptor (CAR) (9). Some of the natural ligands are still in question (for example, that of the estrogen-related receptor ERRγ), and it is not clear that some receptors will have ligands (CAR) (9).

    In numerous cases, insight into function of a gene product can be inferred by comparisons with known enzymes. For instance, in the cytochrome P450 (P450) field we will discuss, P450 3A5 has many (but not all) catalytic activities of the previously discovered P450 3A4 (85% sequence identity) (10). However, predicting functions of new genes (1) can be quite problematic because of an issue sometimes referred to as protein “promiscuity.” Proteins act as enzymes because they use basic chemical principles (11), such as acid-base chemistry, to achieve catalysis. However, such a fundamental property—the result of a particular amino-acid sequence—can be utilized to facilitate very different types of reactions. Relevant examples are the enolase and vicinal-oxygen-chelate (VOC) superfamilies (12). The former contains sequence-related racemases, isomerases, dehydratases, and enolases. The latter (VOC) contains epimerases, dioxygenases, glyoxylases, and glutathione transferases. Therefore, prediction of function from primary sequences remains problematic. Even comparisons of 3-dimensional structures of proteins may not be very revealing, because some proteins, including the P450s, undergo major conformational changes after binding ligands (13) and there are also elements of induced fit (14).

    Systems

    The deorphanizing of proteins is aided immensely by the presence of some associated phenotype to exploit. This phenotype could be a physiological change or a molecular one, such as a new peak identified through liquid chromatography (LC)-mass spectrometry (MS). Genetic control of protein sequence or expression can be manipulated to establish a correlation between such changes and phenotype, thus linking the two. In some cases, nature has already provided the necessary genetic variants associated with a particular phenotype. For example, a deficiency in steroid 5α-reductase Type II was characterized as the molecular basis of pseudo-hermaphroditism in men in the Dominican Republic (15). Subsequently the Type I isozyme was discovered, but it exhibited only limited amino-acid sequence identity (~50%) (16). Based on the suspected steroid reductase function of the Type II as well as Type I proteins, their substrate selectivities were studied and inhibitory drugs (e.g., finasteride and dutasteride) were discovered (16, 17).

    One powerful approach to identify the function of proteins involves the use of genetically altered organisms in which the gene of interest has been inactivated or deleted (knock-out) or otherwise attenuated (e.g., with RNA interference methods) (Figure 2). The latter approach is extremely powerful, particularly with cultured cells, simple organisms (e.g., Caenorhabditis elegans) and mammals (18–20). The concept is to use an interfering RNA to destroy the mRNA related to a gene and then observe any phenotypic change, inferring a biochemical function in vivo. One of the problems in using this approach with mammals is that the apparent ortholog in mice may not have the same function as the protein of interest in humans. Another concern is that other related gene products may show compensatory activities, masking the functional loss of the gene of interest.

    Figure 2

      Strategies for elucidating substrates and reactions of proteins, as exemplified by P450s. One strategy involves the use of transgenic animals, either devoid of an ortholog of the protein under consideration or with the protein over-expressed. Body fluids or tissues can be utilized for LC-MS analysis. Alternatively, the expressed protein can be used in vitro with tissue extracts from wild type (WT) and knock-out (KO) animals (or human) tissues. See Figure 3 for an 18O2 labeling strategy. LC-MS analysis is done using an appropriate program, e.g. principal component analysis (PCA), DoGEX (for 18O data), MZmine, etc. Differences are used to provide leads, which may require extensive characterization. Ultimately the relevance of any findings must be considered in the context of known pathways and in repeated experiments.

      A similar approach involves transgenic (knock-in) expression of a protein in a heterologous host, with in vivo analysis of function (Figure 2). It is important to note that the background effect of orthologs may be an obstacle, unless closely related genes have been eliminated. For example, ablating the mouse P450 3a genes is necessary before expressing transgenic human P450 3A4 in mice. Creating the knock-out or knock-in mouse is only half the challenge. What tissues should be sampled, and how should they be analyzed? Additionally, work with tissue samples is invasive and generally not feasible with humans; the process of tissue sampling itself might change the observable phenotype. Similarly, relevant changes may or may not be observed in plasma or urine. In all cases, some type of metabolomic approach (i.e., an analysis of the metabolites produced in a specific cell, tissue, or organism to better understand its physiological profile; discussed in more detail below) is needed to assess the effect of genetic manipulation. One concern is that observed metabolic changes may reflect transformations of dietary components (mice) or media (microorganisms) and be irrelevant to normal physiology.

      The final major choice involves heterologous expression of a protein of interest and its use to define substrates in vitro (Figure 2). Numerous microbial, insect, and some other hosts can be employed, and the protein could be utilized in a crude or purified state, with various screening strategies (Box 1).

      Box 1

      Screening Strategies

      These strategies apply to in vitro systems with expressed proteins. One strategy is to guess what the enzyme reaction might be, based upon knowledge of what reactions are catalyzed by gene products with sequence similarity. The strategy is biased but might reveal a reaction most quickly.

      Another strategy is to screen libraries of specific compounds that are representatives of different chemical classes. This approach is most useful if screening systems can be at least semi-automated. Even if one of the positive hits has only low activity, using a new library based on the initial hit structure may be productive (i.e., a search with representatives of major steroid classes might yield low activities but subsequent screening of more analogs of a specific hit may be more productive).

      The third strategy is to screen natural sources related to the gene product (Figure 2). For instance, if an orphan P450 is known to be expressed in the liver (at least as judged by mRNA expression), then screening liver extracts is an option.

      Approaches

      Designing assays to detect a positive interaction in a screen is one of the major challenges of the deorphanization process. One approach is to monitor the binding of ligands to a candidate protein. An example is the discovery of linoleic acid as an endogenous ligand of hepatocyte nuclear factor (HNF) 4α (21). HNF4α was expressed in cells, and nuclear extracts of these cells were immunoprecipitated with HNF4α–specific antibodies. Organic extracts of the immunoprecipitate were analyzed by MS, leading to the identification of linoleic acid as an endogenous ligand. The studies were extended to extracts from the livers of mice; linoleic acid was not found bound to HNF4α in fasting mice, but was found in mice on a normal diet or those re-fed after fasting. A major drawback to such affinity-isolation of ligand is that only very high affinity ligands will be captured. In the case of enzymes, with a typical “on” rate of 10⁷ M⁻¹ s⁻¹ (22) and Kd of 1 μM, the “off” rate is 10 s⁻¹ (t1/2 ≈ 70 msec). Even nanomolar affinity makes the washing of a substrate-bound enzyme problematic. Additionally, the binding of a ligand to an enzyme may not equate with its biotransformation. A second method involves screening for biological activity. One example is work to identify ligands that could stimulate transcriptional activation of the PXR, originally classified as an orphan nuclear receptor. Because the responsive DNA element was unknown, a PXR-Gal4 DNA binding domain chimera was expressed and screened for the ability to activate a Gal4 reporter utilizing a “small” library of chemicals (8) that included steroids, vitamin D analogs, thyroid hormone analogs, retinoids, fatty acids, and some additional chemicals. Several steroids, including pregnenolone 16α-carbonitrile, were identified as ligands for the PXR, linking the PXR to the induction of P450 3A genes, which were already known to be elevated in response to pregnenolone 16α-carbonitrile.
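The off-rate estimate in the affinity-isolation argument above follows from a simple one-step binding model, in which k_off = Kd × k_on and the half-life of the complex is ln 2 / k_off. The short sketch below reproduces that arithmetic for the micromolar case quoted in the text and for a nanomolar case. The function names are illustrative only, and the one-step model is an assumption; real enzymes may bind in several steps.

```python
import math

def off_rate(k_on_per_M_per_s: float, kd_M: float) -> float:
    """Dissociation rate constant (s^-1) for one-step binding: k_off = Kd * k_on."""
    return kd_M * k_on_per_M_per_s

def half_life_s(k_off_per_s: float) -> float:
    """Half-life (s) of the complex for first-order dissociation: ln 2 / k_off."""
    return math.log(2) / k_off_per_s

if __name__ == "__main__":
    k_on = 1e7  # M^-1 s^-1, the typical "on" rate quoted in the text
    for kd in (1e-6, 1e-9):  # 1 uM (from the text) and 1 nM (for comparison)
        k_off = off_rate(k_on, kd)
        print(f"Kd = {kd:.0e} M: k_off = {k_off:.3g} s^-1, "
              f"t1/2 = {half_life_s(k_off):.3g} s")
```

With these inputs the micromolar complex has a half-life of roughly 70 msec and the nanomolar complex roughly 70 sec, which is why even tight binders are difficult to retain through wash steps.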

      Another approach can be used when a class of enzymes has a common catalytic mechanism or function, which can be exploited to develop active site–directed probes designed to tag an active enzyme. This technique can be used in complex mixtures of proteins, and active enzymes can be identified with fluorescent, biotin, or “click chemistry” tags (23, 24). Tags designed to allow purification of the labeled enzyme can be followed up with MS to identify the protein and/or its active site (labeled) residues. This approach has been termed “activity-based protein profiling” by Cravatt and associates and has been used on a variety of hydrolases and other enzymes, including kinases, glycosidases, histone deacetylases, and oxidoreductases (23, 24). The activity-based protein profiling method can identify members of an enzyme class not predicted by sequence similarity; for example, sialic acid 9-O-acetylesterase was identified as a serine hydrolase. The approach has been extended to human P450s, but the probes are based on known mechanism-based inhibitors (25). The promiscuity of the P450 enzymes, some of which react with multiple probes, complicates the use of activity-based probes for this class of enzymes. Although activity-based P450 probes may be informative for the study of various inhibitors, it is not clear how this approach would be applied in the deorphanization process.

      Untargeted approaches can also be used in vitro. For instance, a recombinant protein is mixed with an extract of the tissue in which it is normally found, and the extract is then interrogated for small changes. Because the changes that occur due to one enzyme are expected to be small, sophisticated metabolomic approaches are needed for analysis of the data.
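As a rough illustration of what "interrogating the extract for small changes" can mean in practice, the sketch below compares two hypothetical LC-MS feature tables (extract without and with enzyme) and ranks features by fold change. The feature keys (m/z, retention time), the intensity values, and the thresholds are all invented for illustration; the packages discussed under Metabolomics also handle peak alignment, normalization, and statistics, none of which is attempted here.

```python
import math
from statistics import mean

def changed_features(control, treated, min_fold=2.0, floor=1.0):
    """Rank features whose mean intensity changes at least min_fold in either direction.

    control, treated: dict mapping a feature key, e.g. (m/z, retention time in min),
    to a list of replicate intensities. A feature missing from one table is treated
    as zero intensity. `floor` avoids division by zero for absent peaks.
    Returns a list of (feature, fold_change) sorted by |log fold change|, largest first.
    """
    results = []
    for feature in sorted(set(control) | set(treated)):
        c = max(mean(control.get(feature, [0.0])), floor)
        t = max(mean(treated.get(feature, [0.0])), floor)
        fold = t / c
        if fold >= min_fold or fold <= 1.0 / min_fold:
            results.append((feature, fold))
    results.sort(key=lambda item: abs(math.log(item[1])), reverse=True)
    return results

# Hypothetical data: one peak is consumed, one appears, one is unchanged.
control = {(301.2, 5.1): [1000, 1100], (317.2, 4.3): [0, 0], (255.2, 8.0): [500, 520]}
treated = {(301.2, 5.1): [200, 240], (317.2, 4.3): [300, 340], (255.2, 8.0): [510, 505]}

if __name__ == "__main__":
    for feature, fold in changed_features(control, treated):
        print(feature, f"fold change = {fold:.3g}")
```

In this toy data set the peak that decreases is a candidate substrate and the peak that appears is a candidate product; any such lead would still require structural characterization and repetition.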

      Metabolomics

      Both the in vivo and in vitro approaches presented above yield small changes in complex systems, which are not readily observed by manual inspection of complex data sets (“looking for a needle in a haystack”). Nuclear magnetic resonance (NMR) has been used extensively in metabolomic studies on physiological fluids but not with protein deorphanization projects. LC-MS and, to some extent, gas chromatography (GC)-MS have been used to identify the reactions and substrates of orphan P450s.

      Using LC-MS, the separations achieved with standard ultra performance (UP) LC are excellent and rapid. One problem with this approach is sensitivity: a single ionization mode will not be ideal for detecting all compounds present in complex natural mixtures. One general solution is to perform multiple runs with positive- and negative-ion electrospray and with atmospheric pressure chemical ionization.

      A number of software systems can be used to analyze and compare related sets of LC-MS data (e.g., tissue extract ± enzyme, tissue extract ± cofactor, samples from wild-type vs knockout organisms). Principal component analysis (PCA), a mathematical procedure for comparing data sets with reference to rank of parameters, without identification of the items, can be applied (26, 27). Other systems, including MZmine (28), MetAlign (29), XCMS (30), and Waters MarkerLynx®, can be applied. Recently, some of the features of MZmine, MetAlign, and MarkerLynx were compared using a metabolomics data set to assess and define the quality of an alignment process without any subjective interference of the analyst (31). A program specifically developed for P450s is DoGEX (“Discovery of General Endo- and Xenobiotics”) (32) (Figure 3). The approach makes use of the fact that most (but not all) P450-mediated reactions involve incorporation of an oxygen atom into the substrate, such that the product is 16 amu heavier. Using a 1:1 mixture of 18O and 16O labeled oxygen in incubations generates M/M+2 doublets in the MS spectra, which the program searches for (32, 33). In principle, such a system could be used with sulfotransferases, epoxide hydrolases, and any other enzymes that catalyze reactions incorporating a cofactor that can be modified with a stable isotope.
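The M/M+2 doublet idea can be illustrated with a minimal search over a single peak list. This is a sketch of the general principle only and is not the DoGEX algorithm, whose internals are not described here. It assumes singly charged ions, a centroided spectrum, and a target heavy/light intensity ratio near 1; the tolerances and the example peak list are hypothetical.

```python
O18_O16_SHIFT = 2.004245  # Da, mass difference between the 18O and 16O isotopes

def find_doublets(peaks, mz_tol=0.01, target_ratio=1.0, ratio_tol=0.3):
    """Find pairs of peaks separated by one 18O/16O substitution with similar intensity.

    peaks: list of (m/z, intensity) tuples from one spectrum (singly charged ions assumed).
    Returns a list of (light m/z, heavy m/z, heavy/light intensity ratio).
    """
    peaks = sorted(peaks)
    doublets = []
    for i, (mz_light, int_light) in enumerate(peaks):
        if int_light <= 0:
            continue
        for mz_heavy, int_heavy in peaks[i + 1:]:
            delta = mz_heavy - mz_light
            if delta > O18_O16_SHIFT + mz_tol:
                break  # peaks are sorted, so no later peak can match
            if abs(delta - O18_O16_SHIFT) <= mz_tol:
                ratio = int_heavy / int_light
                if abs(ratio - target_ratio) <= ratio_tol:
                    doublets.append((mz_light, mz_heavy, ratio))
    return doublets

# Hypothetical spectrum: one labeled doublet, one pair with the right spacing
# but the wrong intensity ratio, and one unrelated peak.
spectrum = [(300.1, 800.0), (450.3, 500.0), (452.3, 40.0), (601.4, 1000.0), (603.404, 950.0)]

if __name__ == "__main__":
    for light, heavy, ratio in find_doublets(spectrum):
        print(f"doublet {light}/{heavy}, heavy/light ratio = {ratio:.2f}")
```

The intensity-ratio filter is what distinguishes a labeled product from an unlabeled compound that merely has a neighbor about 2 Da heavier; in real data, natural isotope envelopes and co-eluting species make this harder than the toy example suggests.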

      Figure 3

        Use of the program DoGEX in the analysis of substrates of P450 7A1 with human liver extracts. A reconstituted P450 7A1 system was incubated with NADPH and human liver extract under a 1:1 (v/v) atmosphere of 18O2 and 16O2. LC-MS analysis of an organic extract (derivatized with succinic anhydride) yielded the total ion current trace (part A). Analysis with DoGEX software yielded the profile shown in part B, where a green band signifies a ratio of M and M+2 peaks near 0.95, the target. The positive region of the spectrum is expanded in part C and shown in three dimensions in part D, with an m/z 603/601 doublet at tR 7 min identified. Confirmation as 7α-hydroxycholesterol (succinic ester) was performed with standard material. See (33).

        Deorphanizing P450s

        Plants and Microorganisms

        The importance of human P450s to pharmacology is clear, but the relevance of studying the P450s of plants and microorganisms is less immediately obvious. Thus, it is worth noting that the Actinomycetes (including Streptomyces) produce two-thirds of the antibiotic drugs used today, and the ergot/lysergic acid alkaloids produced by Claviceps species are structurally similar to neurotransmitters, are the cause of ergotism, and can be used therapeutically for central nervous system disorders. With plants and microorganisms, knockout approaches can be done readily, and dramatic phenotypes can be observed in many cases. One major advantage in studying microorganisms is that in some cases, function can be inferred from the clustering of functionally related genes in an operon. This was the case for the P450 enzyme in the cluster of ergot alkaloid biosynthetic genes of the ascomycetes Claviceps purpurea (34). From the accumulation of clavines that resulted when the P450 gene was knocked out, the function of the P450 was identified as clavine oxidase, important for the conversion of elymoclavine to D-lysergic acid.

        Sequencing of the genome of the soil bacterium Streptomyces coelicolor revealed eighteen P450 genes, and the functions of some have been discovered using knock-out and heterologous expression techniques (35). One difficult issue in deorphanization, at least when using approaches with purified P450s from many more-complex bacteria, is that the electron transfer pathways can be complex. For instance, S. coelicolor has four ferredoxin reductases and six ferredoxins, thus more than twenty-four pathways are possible for delivery of electrons (as opposed to a single one for most mammalian P450s) (36). In the case of S. coelicolor P450 105D5, a single electron transfer pathway was preferred (NADH to ferredoxin reductase 1 to ferredoxin 4 to P450 105D5) (36). However, most bacterial P450s are reconstituted with spinach ferredoxin and NADPH-ferredoxin reductase, and the relevance of results obtained with this system to the in vivo situation is not known.

        One example of a deorphanization approach involves a current study with S. coelicolor P450 154A1 in this laboratory.

        A phenotype of the CYP154A1-knockout strain is a defect in sporulation (Q. Cheng, in preparation). Comparisons of LC-MS profiles of extracts of wild-type and CYP154A1-knockout cultures showed several differences. Incubation of a CYP154A1-knockout cell extract (expected to accumulate P450 154A1 substrates) with purified P450 154A1 led to disappearance of one of the LC-MS peaks (that had accumulated in the knockout organism). The substrate peak was purified from an extract using semi-preparative HPLC, and its structure was characterized using high-resolution MS, UV, and a combination of NMR spectroscopy methods (Q. Cheng, in preparation). Subsequently, the product of the in vitro P450 154A1 reaction was isolated; its structure is being determined. It is noteworthy that the transformation by P450 154A1 requires neither NAD(P)H nor O2, and the product has the same mass as the substrate, thus indicating that an isomerization or rearrangement has occurred.

        Plant P450 genes are very abundant: Arabidopsis thaliana has 246 and Oryza sativa (rice) has 328 (37) [cf. 57 in humans (38)]. Transgenic knockouts (especially in A. thaliana) have been studied extensively and used to discern function. For example, P450 703 was found to be involved in generation of in-chain hydroxylated fatty acids that are used in the synthesis of cross-linked structures in pollen (39). An oxygen consumption approach to finding substrates of purified plant P450s has been used in medium-throughput assays (40); although attractive in principle, this method has the same limitations as NADPH oxidation screens with mammalian P450s, namely, lack of detection of “slow” P450s and poor coupling of substrate oxidation with NADPH consumption. Progress has been made with the use of microarray expression methods to monitor timing of expression of particular P450s at various stages of plant development, yielding clues to function (41, 42). Currently the functions of nearly forty plant P450s have been discerned (42). Several of these are in the pathways involved in the synthesis of morphine, an obviously important pharmacological entity (43–45).

        Mammalian P450s

        Two major approaches have been applied to discover the functions of mammalian P450 enzymes. One is the utilization of transgenic mice, with either a deletion (of the ortholog) or the over-expression of a (human) P450. This approach has been applied in the laboratories of both Gonzalez (46) and Russell (47), in various modes of the general scheme presented in Figure 2.

        Yu et al. (46) utilized CYP2D6-expressing transgenic mice to demonstrate 5-methoxyindoleethylamine O-demethylation (by P450 2D6) in liver microsomes. Cheng et al. (47) prepared a cDNA library from CYP27A1-deficient mice and used an expression cloning approach to identify P450 2R1, focusing on vitamin D3 metabolism. Thus, deorphanization of P450 2R1 was accomplished using the “traditional” approach of Figure 1. The other general approach involves in vitro analyses with recombinant enzymes and has been applied in several laboratories, including our own.

        In vitro screening

        P450s 1A2, 2C8, and 2C9 are all expressed in human liver and are not generally considered orphans because they have many xenobiotic substrates, although some endogenous fatty-acid substrates have been identified as well (48). Incubation of these P450s with liver extracts and 16O2/18O2 mixtures (Figure 3) and data processing with DoGEX revealed several fatty acids as substrates, undergoing ω-1 and ω-2 hydroxylation and epoxidation of double bonds (49). The steady-state kinetics of the reactions were determined with the purified enzymes. In contrast, P450 3A4, a major drug-metabolizing P450, did not yield positive results in the same liver screen.

        Several lines of investigation support the view that mammals, including humans, can synthesize morphine (50–53). P450 2D6 has been shown to be capable of catalyzing three steps in morphine biosynthesis: the oxidation of (R)-reticuline to salutaridine, thebaine to oripavine, and codeine to morphine (53, 54) (Figure 4). P450 3A4 can also catalyze the first of these three steps (53). A key question is which P450s catalyze the important 6-O-demethylation reactions (i.e., oripavine to morphinone, thebaine to neopinone), particularly in tissues where synthesis of analgesics might be important in pain control.

        Figure 4

          Latter steps of postulated pathway for synthesis of morphine, based on studies with plants. See (51–53).

          Human P450 Orphans

          Roughly one-fourth of the fifty-seven human P450 genes can still be considered orphans, in that their functions are not very well established (38). Current information about the human orphan P450s is presented in Table 1.

          Table 1

          Human P450 Orphans

          P450 4F11 is expressed in human liver, and 18O2/16O2-DoGEX analysis using liver extract yielded reactions involving fatty acids (33) (Table 1). Previous studies had shown that 4F11 had higher catalytic activity with β-hydroxy fatty acids than with (unmodified) fatty acids (55), and a tenfold higher catalytic efficiency for one of these was confirmed (33). The in vivo relevance of the oxidation of β-hydroxy fatty acids by P450 4F11 is unknown. P450 2U1, an enzyme found in the thymus and brain, has also been shown to oxidize fatty acids, as established in assays with several substrates, including arachidonic acid (56, 57). P450 4V2 was expressed in a baculovirus-based system and found to catalyze the ω-hydroxylation of fatty acids (58). P450 4X1 was expressed in E. coli, purified, and (in limited screening) found to catalyze the 14,15-epoxygenation of anandamide but not arachidonic acid (the fatty acid component of anandamide) (59). P450 4Z1 (which to date has been difficult to express in bacteria) was expressed in Schizosaccharomyces pombe and reported to catalyze the ω-2, -3, -4, and -5 hydroxylation of lauric and myristic acids (60).

          As indicated above, the orphan human P450 4F family has been a focus of research, and fatty acids have been the only predominant substrate identified. Although some fatty acid oxygenation products do have interesting biological activities (61)—and fatty acid hydroxylation was the assay used in the initial fractionation of the mammalian P450 system (62)—it is not clear that any of these fatty acid oxidations with the orphan P450s are important (or a necessary part of their degradation). In several studies of orphan human P450s, fatty acids were the only potential substrates tested. In the untargeted screens, the sensitivity of the mass spectrometer for fatty acids (negative-ion mode) may be driving the results. One recent development in this laboratory is the use of dansylation methods for the derivatization of unactivated alcohols, thereby greatly increasing the sensitivity of the MS for potential P450 products (Z. Tang, in preparation).

          Non-Physiological Substrates

          There are two general views regarding the function of mammalian P450s (which probably apply to other enzymes involved in xenobiotic metabolism and even drug transporters). One concept is that all of these enzymes have endogenous substrates. The other is that many of these enzymes have a protective function and exist to detoxicate the load of natural products (e.g., terpenes and alkaloids) that mammals consume each day (63). The propensity to metabolize drugs or carcinogens is just an ancillary function of the protective systems that are present. These systems have broad substrate-specificity and act to lower the amounts of these natural compounds (i.e., terpenes, etc.) in cells. Thus, the possibility exists that a P450 may not have an endogenous substrate. The viability of mice devoid of some of the major xenobiotic-metabolizing P450s (or hepatic NADPH–P450 reductase) lends credence to this view (64–66).

          The orphan P450 4F11 has low activity for some drug oxidations (33, 67). Although no broad screens for finding drug substrates have been reported for P450 4F11, some screening has been done with other human P450s. Das et al. (68) placed P450 3A4 in a “nanodisc” membrane system and conducted a medium-throughput screen for alteration of the heme spectrum. A concern about the approach was that some known P450 3A4 substrates yielded “Type II” spectra instead of the “Type I” spectra (69) usually associated with P450 (3A4) substrates. In another experiment, Veith et al. (70) screened > 17,000 chemicals (drugs, chemical libraries) for ability to inhibit reactions of human P450s 1A2, 2C8, 2C9, 2D6, and 3A4 (utilizing model luminescent substrates). In principle, a similar approach could be used with a library of natural compounds. The results are of interest but do not, in themselves, distinguish between substrates and competitive inhibitors (71).

          Carcinogens are also of interest in the context of P450 reactions. The orphan P450s 2S1 and 4F11 did not activate any of the carcinogens in the battery tested (33, 72). However, P450 2W1 exhibited broad substrate selectivity in the activation of the same set of compounds (72). P450 2W1 is apparently only expressed in tumors and not in normal tissues (72, 73). Except for trace activity towards arachidonic acid (72, 73), no other activities have been reported except for carcinogens. The ability of P450 2W1 to activate a wide variety of carcinogens may be of relevance, in light of its tumor-specific expression (72).

          P450 2S1 is an interesting case. The enzyme is reported to be expressed in skin and liver (and also trachea, lung, stomach, small intestine, and spleen) and its expression is induced via the aryl hydrocarbon receptor (74, 75). Although the enzyme has been reported to oxidize retinoic acid (76) and naphthalene (77), these products had not been quantified and the results were not replicated (72, 78). None of the compounds examined in this laboratory proved to be substrates of P450 2S1 when analyzed in the usual way (e.g., with NADPH-P450 reductase and NADPH) (72), nor have untargeted searches with tissue extracts yielded positive results to date. Recently Bui et al. (78) reported several activities with P450 2S1 in reactions supported by alkyl hydroperoxides, including oxidations of aflatoxin B1 (unknown products), benzo[a]pyrene, naphthalene, and styrene. Although hydroperoxide-dependent P450 reactions have been known for many years (79, 80), the physiological relevance of these reactions has never been demonstrated and transgenic mice devoid of NADPH-P450 reductase have very low hepatic P450 functions (65). Nevertheless, heterologous expression of P450 2S1 in a cell culture system did result in metabolism of benzo[a]pyrene (81).

          Conclusions and Future Opportunities

          In many respects we are only in the beginning of serious efforts to define the functional landscapes of genomes, including the human one. With deficiencies and limitations also come opportunities. With reference beyond the scope of P450s, improvements in heterologous expression are still needed. As pointed out earlier, even most of the current LC-MS assays need more sensitivity, and we are developing some derivatization methods that should be rather general without requiring multiple systems for analysis or introducing bias into the searches. Several useful software systems have become available in the past few years, but the opportunity for more innovation remains. Comparing multiple LC-MS (or GC-MS) profiles is not trivial, particularly if one is focusing on very minor differences.

          Perhaps the most important contribution one could make is development of more clever strategies for discovering and validating protein functions. Of concern is the tendency to be satisfied with any positive activity result and thus to miss a more important function. The fatty-acid oxidations mentioned earlier may be a “red herring” in this regard. For instance, consider the role of glutathione transferase P1-1 (GSTP1-1) in Jun N-terminal kinase 1 (JNK1) signaling, as opposed to its better known role in glutathione conjugation (82); through protein-protein interactions GSTP1-1 may suppress JNK1 activity, affecting many responses in the cell. Also, the role of NADPH-quinone reductases (NQO) in adaptive and stress responses may involve protein-protein interactions, not only the reductions of quinone species (83).

          Are there more imaginative ways to approach these questions? The answer is probably yes, and we would submit that the opportunities may be greater in microorganisms than mammals (or plants) due to the ease of genetic manipulation. Again, C. elegans and yeasts are powerful models owing to the genetic and high-throughput approaches available, although the issue of mammalian paralogs can be confounding. Saccharomyces cerevisiae has only three P450 genes. Others have taken the view that comparative genomics is the single most effective strategy (1), although we think that such approaches generate hypotheses that must still be tested through experimentation.

          Finally, most of our attention here has been given to small molecule substrates, which can be subjected to chemical analyses of the type discussed here. However, delineating the functions of enzymes with protein substrates (e.g., kinases and phosphatases) is generally more difficult. Dealing with components of multi-protein complexes (e.g., DNA replication complexes, DNA nucleotide excision repair) is more difficult still.

          Acknowledgments

          Thanks are extended to J. W. Nelson for the invitation to contribute this review and to E. M. J. Gillam for her comments. P450 research in the authors’ laboratory has been funded by the National Institutes of Health National Cancer Institute (grant R37 CA090426) and National Institute of Environmental Health Sciences (grant P30 ES000267).

          Footnotes

          • 1 Hanson et al. (1) use a different terminology and would call these “unknown” proteins (compared to their “orphan” enzymes).

          References


          Qian Cheng, PhD, received a BS in Biological Sciences from the University of Science and Technology of China, Hefei and PhD in Pharmaceutical Sciences from the University of Arizona. He began postdoctoral training at Vanderbilt in 2007 and has been characterizing the functions of Streptomyces P450s, as well as studying the inhibition of human P450s by drugs.


          S. Giovanna Salamanca-Pinzón, PhD, obtained a BS in Industrial Microbiology from the Pontifical Javeriana University, Colombia, and a MS and PhD in Biological Sciences from the National Autonomous University of Mexico. In 2008, she began postdoctoral training at Vanderbilt, working on the deorphanization of human P450 enzymes and their potential roles in carcinogen metabolism.


          Zhongmei Tang, PhD, received her BS in Engineering in Biopharmaceutics from China Pharmaceutical University in 2002 and PhD in Analytical Chemistry from Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences. Since 2007, her postdoctoral research at Vanderbilt has focused on the development of in vitro LC-MS/MS approaches to elucidation of functions of human P450 enzymes using LC-MS metabolomics and software (e.g., DoGEX, MZmine, and XCMS).


          F. Peter Guengerich, PhD, is the Harry Pearson Broquist Professor of Biochemistry, Interim Chair of the Department of Biochemistry, and Director of the Center in Molecular Toxicology at Vanderbilt University School of Medicine. His research interests include characterization of human and bacterial P450s, mechanisms of P450 catalysis, and mechanisms of mutagenesis and toxicity. He has been an ASPET member since 1979 and received the ASPET John J. Abel (1984) and Bernard B. Brodie (1992) Awards for his research, as well as several from other scientific organizations. E-mail f.guengerich{at}vanderbilt.edu; fax (615) 322-3141.
