Functional analysis of newly discovered growth control genes: experimental approaches

A large number of DNA sequences corresponding to human and animal transcripts have been filed in data banks, as cDNAs or ESTs (expressed sequence tags). However, the actual function of their corresponding gene products is still largely unknown. Several of these genes may play a role in regulation of important biological processes such as cell division, differentiation, malignant transformation and oncogenesis. Elucidation of gene function is based on two main approaches, namely, overexpression and expression interference, which respectively mimic or suppress a given phenotype. The currently available tools and experimental approaches to gene functional analysis and the most recent advances in mass cDNA screening by functional analysis are discussed.


Unfolding of novel growth-related genes
Throughout the years, a number of techniques have been developed to isolate DNA sequences coding for growth-related genes (1). Most methods yield partial cDNA sequences, thus requiring a search for complete coding sequences. One major problem in this field is confirmation of the actual role played by these sequences in growth control and/or neoplasia. Cell culture and animals have been widely used to this end. The main approaches involve one of two methods: a) overexpression of the gene (cDNA) of interest using adequate recombinant constructs in animals (transgenics) and/or in cultured cells, and b) blocking the expression of specific gene sequences using animals (knock-outs) or cultured cells.

Overexpression using recombinant DNA constructs
Plasmids and viral vectors (among which are retroviruses and adenoviruses) have been used to transduce gene sequences into mammalian cells (2)(3)(4)(5). Overexpression is a powerful approach to study genes associated with activation of cell proliferation, cell cycle and oncogenesis. On the other hand, overexpression of tumor suppressor genes, which code for growth repressors, or of antisense constructs against essential growth-related genes, is hampered by the fact that this confers a selective disadvantage. Therefore, it is very important to rely on systems in which the transgene expression is inducible, since uncontrolled expression of certain genes can be deleterious or prevent isolation of stably transfected/infected cell lines.
Inducible gene expression systems appeared as a way out of this problem and as a means to achieve unequivocal interpretation of results obtained in functional analyses. Vectors in which gene expression is repressed until an inducer or stimulus comes into play are very useful; however, the choice of vectors and their respective inducers is critical, since some inducers may display effects of their own on cell growth. Therefore, the choice of the expression system is the most challenging step in gene function analysis.
Expression systems based on inducibility of gene expression by glucocorticoids conferred by the 5' long terminal repeat (LTR) regulatory region of mouse mammary tumor virus (6,7) are not suitable when gene function analysis is being performed in cells that are naturally responsive to glucocorticoids, since it would be difficult to distinguish the effects caused by the gene under scrutiny from those due to the hormone itself. Plasmid vectors in which expression is controlled by a minimal metallothionein promoter, which is activated by heavy metals (cadmium, zinc, copper) (8,9), can lead to toxic side effects due to high levels of these metals. Another disadvantage of this type of vector is the difficulty of reversing the effects of these metals, even after prolonged culture (Muotri A and Menck CF, personal communication). Expression systems based on the interferon gene promoter, which is inducible by poly IC or by viral infection, have also been described (10); however, these inducers can also influence cell metabolism and growth.
A system based on the ecdysone promoter, which is activated by the ecdysone analog muristerone A, has recently been described (11). This system offers finely controlled expression; however, the choice of the cell line to be used is critical. The system relies on the 9-cis retinoic acid receptor (RXR), which is homologous to the insect ultraspiracle protein (USP), and on the hybrid (ecdysone/glucocorticoid) regulatory element. Even though this hybrid regulatory element, present in the vector, is synthetic, the ability of RXR/USP to promiscuously bind not only to muristerone but also to several other ligands and receptors could interfere with the transgene expression or alter the expression of cellular genes that are controlled by steroids and retinol.
More recently, inducible expression systems were developed in which regulatory elements from both prokaryotes and eukaryotes are used. This combination allows expression to be controlled by an inducer that is not associated with a direct response of the promoter in question or with any pleiotropic physiological response of the recipient cell line. These systems are based on the lac operon (12) or on the tet operon present in the E. coli Tn10 transposon (13).
Inducible vectors based on the lac operon regulate the availability of the LacI repressor protein to free or to block the operator sequence (LacO) localized next to a strong eukaryotic promoter from which the cloned gene of interest (transgene) will be transcribed. Variations in the levels of LacI and/or in the composition of the promoter region lead to transcription suppression (14,15) or activation (16) upon the addition of the inducer isopropyl β-D-thiogalactopyranoside (IPTG).
In the commercially available system (Stratagene), a repressor protein derived from LacI is produced from the pCMVLacI plasmid using the cytomegalovirus promoter for constitutive expression. A nuclear localization signal (NLS) from simian virus 40 (SV40) is incorporated into the repressor protein construct so as to direct the repressor produced to the nucleus (17). The transgene is cloned in another plasmid, also under a strong eukaryotic promoter, namely, the Rous sarcoma virus (RSV) LTR, and is positioned so that the operator site separates it from the promoter. When the repressor protein accumulates in the nucleus, tetramers are formed. These tetramers bind to the LacO operator sequence, blocking formation of the transcription complex and its movement along the transgene (16). Addition of allolactose or its synthetic analog IPTG to the culture medium leads to conformational changes of the repressor tetramer, reducing its affinity for the operator sequence (18) which, in turn, allows transcription from the eukaryotic promoter (Figure 1).
This system constitutes an important alternative to traditional inducible expression systems, since the inducer (IPTG) is readily active and non-toxic to mammalian cells. Induction of gene expression can be achieved in 4-8 h, partially due to rapid transport of IPTG into mammalian cells. Moreover, when this system is used with reporter genes (CAT, luciferase), low basal expression (10-20 molecules/cell) is observed in the repressed state and the level of repression is partially dependent on the half-life of the transgene product (19). This particular system even allows control of transgene expression in whole animals (19).
Expression systems based on the tet operon are very similar to those based on the lac operon, since the repressor protein (TetR) binds to an operator site on the DNA and regulates the level of transcription of genes cloned downstream from it. Here again, variations in the level of the repressor protein allow expression to be regulated (positively or negatively) by the addition of a tetracycline derivative (doxycycline) to the culture medium.
Transcription regulation by tetracycline is based on a hybrid regulatory protein derived from the TetR repressor, which is constitutively synthesized. In the Tet-Off system (13) this regulatory protein activates transcription since it is composed of the Tet-binding domain and of the DNA-binding domain of TetR plus the C-terminal region of human herpesvirus 1 VP16 protein that displays transactivating activity (20). In the absence of tetracycline, this regulatory protein (tTa, tetracycline-responsive transcription activator) forms dimers that bind to the tetracycline-responsive element (TRE) composed of several tandem repeats of the TetO sequence, activating transcription of the transgene cloned downstream of a minimal promoter (usually the immediate early promoter from CMV). Addition of tetracycline or doxycycline to the culture medium reduces the affinity of tTa for TRE, decreasing the level of transcription in a dose-dependent manner (Figure 2).
Figure 1 - Schematic presentation of the LacSwitch (Stratagene) expression system. The repressor protein derived from LacI is fused with the SV40 nuclear localization sequence (NLS) and is constitutively produced from the cytomegalovirus promoter (P CMV) present in the regulator plasmid. In the absence of isopropyl β-D-thiogalactopyranoside (IPTG), the repressor localized in the nucleus binds to the operator sequences present between the Rous sarcoma virus (RSV) promoter and the cloned cDNA in the response plasmid, blocking transcription. When IPTG is added to the cells it binds to the repressor, which, in turn, releases the operator sites and allows free transcription of the cDNA. LTR, 5' long terminal repeat.
Alteration of 4 amino acids in the TetR protein (21) originated a reverse repressor protein (rTetR) which binds to TRE only in the presence of tetracycline. In this Tet-On system, fusion of rTetR to the transactivating domain of VP16 (22) generates a reverse tTa (rtTa) that activates transcription when tetracycline is added to the culture medium (Figure 3).
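The switching behavior of the three inducible systems described above can be summarized as a simple truth table. The following Python sketch is only an idealization: it treats each system as a binary on/off switch and ignores basal (leaky) expression, dose dependence and induction kinetics. The function name `transgene_on` and the system labels are hypothetical names chosen for illustration.

```python
def transgene_on(system: str, inducer_present: bool) -> bool:
    """Return True if the transgene is transcribed (idealized binary switch)."""
    if system == "lac":
        # LacI-derived repressor blocks the operator until IPTG releases it.
        return inducer_present
    if system == "tet-off":
        # tTa binds the TRE and activates transcription only without tetracycline.
        return not inducer_present
    if system == "tet-on":
        # rtTa binds the TRE and activates transcription only with the inducer.
        return inducer_present
    raise ValueError(f"unknown system: {system}")


if __name__ == "__main__":
    for system in ("lac", "tet-off", "tet-on"):
        for inducer in (False, True):
            state = transgene_on(system, inducer)
            print(f"{system:8s} inducer={inducer!s:5s} transcription={state}")
```

The only practical difference between the Tet-Off and Tet-On configurations in this view is the sign of the response to the inducer, which determines whether cells must be kept in tetracycline during selection of stable lines (Tet-Off) or only exposed to it during the experiment (Tet-On).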
These inducible expression systems have been used to evaluate the effects of overexpressing genes already associated with growth control. Controlled and reversible IPTG induction was successfully used to study c-fos transformation of rat fibroblasts (23). Apoptosis block was achieved through tetracycline-regulated bcl-2 expression (24). The inducible tetracycline (Tet-On) system was also used to study the role played by cyclins in the G0-G1 and G1-S transitions (25). The difficulty in obtaining stable transfectant cell lines using G1-phase cyclin cDNAs is pointed out by these authors. The function of the recently uncovered FIN13 gene, a serine/threonine phosphatase that causes cell cycle block in the G1 and S phases, has also been studied using the Tet-On system (26).
Gene function is also widely studied by overexpressing cloned genes in animals (most commonly mice), generating invaluable strains of transgenic animals (27,28) which are used as models to study the effect of one or more gene products in development, physiology and diseases.

Interference with gene expression
Another approach to assess the role of specific genes in cell proliferation and/or differentiation is blocking gene expression at the DNA, RNA and protein levels, which can be achieved by the antigene/triple-helix approach (reviewed in Ref. 29), antisense or ribozymes (reviewed in Ref. 30) and antibodies (31), respectively.
Inhibition of gene expression by antisense oligonucleotides or constructs is based on the use of sequences that are complementary to that of the gene under study. The relevance of this artificial tool is illustrated by recently described evidence showing that natural antisense occurs in several species (32). The basis for inhibition of gene expression by antisense molecules has not been completely elucidated; however, the evidence points to translation blocking by steric hindrance or induction of transcript degradation by a specific nuclease (RNaseH) (33).
The efficiency with which a given sequence works as antisense is dependent on its ability to stably bind to the target sequence, to anneal to functionally important mRNA sites (translation start sites, splice junctions or ribosome-binding sites) and to affect the transcript. Therefore, it is critical to know the stability of the antisense oligonucleotide in the intracellular milieu, in addition to the structure of the transcript to be blocked (34) and the efficiency with which the antisense oligonucleotide causes steric block and/or activates RNaseH (35).
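As an illustration of the complementarity principle described above, the following Python sketch derives the DNA oligonucleotide that would anneal to a chosen window of an mRNA, for example a window spanning the translation start site. It is a minimal sketch under stated assumptions: the function names, the default oligonucleotide length and the number of upstream bases are hypothetical, the first AUG found is simply assumed to be the start codon, and no account is taken of mRNA secondary structure, protein binding or oligonucleotide stability, all of which determine actual efficacy.

```python
# Map each RNA base of the target to the complementary DNA base.
COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return the DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    target = mrna.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("target window runs past the end of the transcript")
    # Complement every base, then reverse so the oligo reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


def oligo_around_start_codon(mrna: str, length: int = 18, upstream: int = 6) -> str:
    """Antisense oligo covering the first AUG (assumed to be the start codon)."""
    aug = mrna.upper().find("AUG")
    if aug < 0:
        raise ValueError("no AUG found in the transcript")
    start = max(0, aug - upstream)
    return antisense_oligo(mrna, start, length)


if __name__ == "__main__":
    # Hypothetical transcript fragment, used only to exercise the functions.
    fragment = "GGCCACCAUGGCUUCGAAGGUCCUGAAG"
    print(oligo_around_start_codon(fragment))
```

In practice, several candidate windows along the transcript would be generated in this way and then tested empirically, since sequence complementarity alone does not predict antisense activity.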
Figure 2 - Schematic presentation of the Tet-Off expression system. The tetracycline-controlled transactivator (tTa), which is a fusion of the TetR repressor with the activation domain of VP16, is constitutively produced from the cytomegalovirus promoter (P CMV) by the regulator plasmid. In the absence of tetracycline, tTa binds the tetracycline response elements (TRE) upstream from the minimal CMV promoter (P minCMV) in the response plasmid, activating transcription of the cDNA. When tetracycline or doxycycline is added to the medium, it binds to tTa, which, in turn, releases the TRE, stopping transcription.
Binding of the antisense sequence to its target is essentially dependent on the mRNA secondary structure and on its interaction with proteins that render the majority of the RNA sites inaccessible. Both molecular modeling and empirical approaches have been used in attempts to foresee the mRNA accessibility to antisense sequences. Using arrays of immobilized oligonucleotides, sites capable of forming hybrids and of undergoing RNaseH digestion have been mapped (36). Incubation of angiotensin type-1 receptor mRNA with oligonucleotide libraries and RNaseH led to the identification of cleavage regions that are accessible to the enzyme (37). By selecting oligonucleotides according to their computer-determined hybridization kinetics and testing them in cellular extracts, it has recently been found that oligonucleotides with the fastest association rates are more effective as antisense in vivo (38). The conclusion drawn from these studies is that the ability to form hybrids and to interfere with gene expression is not directly related to the sequence itself or to the mRNA secondary structure, since in both of these studies, among more than 1,200 oligonucleotides tested, no more than 10 sequences displayed antisense activity.
In order to improve oligonucleotides as antisense tools, increasing their resistance to DNases and their ability to stably bind the target mRNA and to induce RNaseH activity, a number of modifications were introduced. To confer stability against cellular DNases, the normal phosphodiester bonds in the DNA can be substituted by phosphorothioate bonds along the entire chain length or only at the termini that are subject to the action of exonucleases (39). Reduction in the number of phosphorothioate bonds eliminates nonspecific effects associated with the presence of thioates between nucleotides (40,41). To the same end, riboses with 2'-O-allyl radicals have been shown to block the action of DNases (42,43). To increase the stability of the oligonucleotide annealing to the mRNA, base modifications (5-methyl cytosine, 5-methyl uridine or 5-propynyl cytosine and 5-propynyl uridine) have been employed. The 5-propynyl modification of pyrimidines favors RNaseH activity on the hybrid mRNA-oligonucleotide target (42). Using this modification, oligonucleotides as small as 6-8 nucleotides have been shown to be effective (44).
Another critical factor in achieving inhibition of gene expression is the efficiency with which antisense DNA is delivered into the cells. Proteins and lipids that complex DNA and are internalized through the plasma membrane have been used (45). Microinjection into the nucleus and cationic liposomes are most commonly used (43,46). Several genes have been associated with important biological processes like cell proliferation, differentiation, tumorigenesis and invasiveness using the antisense approach. Oligonucleotides have been used to inhibit proto-oncogenes like c-myc (47,48) and c-myb (49,50), genes coding for cyclin-dependent kinase (CDK) inhibitors (45) and cyclins (51), among others.
Long-term antisense experiments can be carried out using antisense cDNA constructs in transient or stable transfection. Cloning of these constructs in inducible expression vectors yields stable cell lines in which the endogenous expression of specific genes is controlled by the presence or absence of the antisense inducer. Here also, the size, complexity and secondary structure of the antisense construct may affect its efficiency of hybridization with the target mRNA and the nonspecific hybridization with similar sequences.
Inducible antisense constructs have been used to study the role played by the S100 protein in glioblastoma (52) and in lung carcinoma (53), in addition to the role of IGF-1 (54) in rat prostate cancer and of the IGF-1 receptor in glioblastoma tumors (55,56).
In spite of the tempting success with the antisense approach, the results obtained with complex processes like cell proliferation should be considered with caution, as extensively discussed in a recent review (57).
Gene function can also be uncovered in vivo by interfering with gene expression through the generation of knock-out animals (27,28).

Direct functional analysis using retroviral libraries
The recently developed technology of expression cloning (58) based on the use of retroviral vectors in large-scale functional studies is very promising for the detection of new genes and new functions for already known genes. In this technology, gene isolation and characterization are no longer carried out in separate steps.
With the exception of differential hybridization and subtractive hybridization, most methods used for the isolation of differentially expressed sequences (differential display of mRNAs, serial analysis of gene expression, representational difference analysis and subtraction by suppressive PCR) only generate cDNA fragments, which are sequenced and/or used as probes in Northern blots to confirm the differential expression and to pick (from libraries) or synthesize (by RT-PCR) the corresponding full-length cDNA to be used in functional studies.
Expression analysis by DNA arrays only yields hybridization signals, with the actual production of physical clones depending on the availability of previously cloned cDNAs or of probes like ESTs (available from the IMAGE Consortium, when not limited by patents held by private companies) used to screen cDNA libraries.
Functional characterization through overexpression of already identified genes requires cloning of each coding sequence in adequate expression vectors, which may be a very laborious and not so rewarding process.
In expression cloning or functional gene analysis by retroviral transduction, cDNA libraries constructed in retrovirus-based vectors are used to infect cells in culture. Since the efficiency of infection/transduction is high and the viral genome is stably integrated into the cellular genome, functional assays can be carried out using whole cell populations, in which each cell will be expressing a different cDNA and exhibiting different phenotypes that can be selected for using adequate biological assays. Thus, for screening of oncogenes, the focus formation assay or colony growth in agarose suspension can be used. When searching for genes related to tumor invasiveness, the Matrigel invasion chamber can be used. In the search for tumor suppressor genes, selection with fluorodeoxyuridine may be effective. Cells or colonies selected according to specific phenotypes can then be used to isolate the DNA sequence involved by PCR using primers corresponding to the retroviral vector, thus favoring the isolation of a large number of sequences related to a specific phenotype.
The greatest advantage of this approach is elimination of the most laborious steps involving sequence identification, confirmation of differential expression, search for complete coding sequences and discrimination of sequences that deserve to be analyzed versus those that should be discarded, allowing the analysis of a much larger number of sequences (Hunter T and Joazeiro CP, personal communication).
One disadvantage of this functional approach is the restriction imposed by retroviral vectors, in which the size of the cDNA to be cloned is limited by the ability of the virus to encapsidate it and form viral particles. Another disadvantage is the requirement for high virus titers to allow efficient infection. Since the production of viral particles requires transfection of a retroviral library into packaging cells (which produce the structural proteins necessary to form the virion), infection efficiency is directly dependent on efficient packaging of the library.
Therefore, isolation of a representative population of cDNAs is dependent on the efficiency of several steps that ought to be individually optimized, namely, cDNA synthesis, ligation of cDNAs to the retroviral vector, transfection of packaging cells, recovery of viral supernatant, infection of target cells and functional biological assays.
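The dependence of library representation on the efficiency of each step can be made concrete with a short calculation. The sketch below is not taken from the methods discussed here; it applies the standard sampling estimate N = ln(1 - P) / ln(1 - f) for the number of independent clones N needed to include, with probability P, a transcript present at fractional abundance f, and multiplies per-step efficiencies to estimate how many independent clones survive the whole procedure. The function names and all numerical values are hypothetical and serve only as an illustration.

```python
import math


def clones_needed(abundance: float, probability: float = 0.99) -> int:
    """Independent clones needed to include a transcript of the given
    fractional abundance with the given probability."""
    if not (0 < abundance < 1 and 0 < probability < 1):
        raise ValueError("abundance and probability must lie between 0 and 1")
    return math.ceil(math.log(1 - probability) / math.log(1 - abundance))


def surviving_clones(initial_clones: float, step_efficiencies) -> float:
    """Independent clones remaining after a series of steps, each of which
    retains only a fraction (its efficiency) of the input complexity."""
    return initial_clones * math.prod(step_efficiencies)


if __name__ == "__main__":
    # Hypothetical rare transcript: 1 in 100,000 mRNA molecules.
    needed = clones_needed(1e-5, 0.99)
    # Hypothetical efficiencies for ligation, packaging and infection.
    remaining = surviving_clones(2_000_000, [0.5, 0.6, 0.4])
    print(f"clones needed: {needed}")
    print(f"clones surviving: {remaining:.0f}")
    print("representative" if remaining >= needed else "under-represented")
```

Because the efficiencies multiply, a modest loss at each step can reduce the complexity of the transduced population below what is required to represent rare transcripts, which is why each step has to be optimized individually.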
Functional selection systems and vectors have been used (58,59) or are being developed (60)(61)(62), allowing us to foresee a rapid and massive push towards the discovery of new genes and new functional associations with already described genes and ESTs.
Understanding the role played by specific genes in cell proliferation control and tumorigenesis is crucial to the design of new targets for tumor gene therapy (63).

Functional analysis of DNA sequences associated with the transformed to normal phenotypic reversion
The C6 cell line (64) is derived from a chemically induced rat tumor. C6/ST1 and C6/P7 variant cell lines were isolated (2,65) as being respectively hyperresponsive and resistant to glucocorticoid hormones.
Complete phenotypic reversion (tumoral to normal) is displayed by the C6/ST1 variant both in vitro and in vivo upon treatment with physiological concentrations of hydrocortisone (65,66). This cell system is being used in our laboratory to assess the molecular mechanisms underlying the anti-tumor action of glucocorticoids.
Using cDNA library construction and differential hybridization, the cDNAs corresponding to metallothionein 1 and 2, alpha-1 acid glycoprotein and to an endogenous retroviral protein were isolated (67). A different approach (differential display, DDRT-PCR) also revealed the metallothionein 1 gene (68). More recently, using hybridization subtraction with suppressive PCR, a number of other genes were isolated (Vedoy CG and Sogayar MC, unpublished data).
The role of these genes in cell proliferation control is under study, using inducible expression of sense and antisense constructs and antisense oligonucleotides (Flatschart RB and Sogayar MC, unpublished data). Identification of specific sequences involved in the transformed to normal phenotype reversion can lead to valuable targets for gene therapy.

Figure 3 - Schematic presentation of the Tet-On expression system. The reverse tetracycline-controlled transactivator (rtTa) is a fusion of a mutant TetR with the VP16 activator. In the absence of tetracycline, rtTa is free and transcription does not occur. When doxycycline is added to the cultures, it binds to rtTa which in turn binds tetracycline response elements (TRE), activating transcription of the cDNA. P minCMV = Minimal cytomegalovirus promoter.