IDENTIFICATION AND CHARACTERIZATION OF A RESISTANCE GENE ANALOG (RGA) FROM THE Caricaceae DUMORT FAMILY

The majority of cloned resistance (R) genes characterized so far contain a nucleotide-binding site (NBS) and a leucine-rich repeat (LRR) domain, where highly conserved motifs are found. Resistance gene analogs (RGAs) are genetic markers obtained by a PCR-based strategy using degenerate oligonucleotide primers designed from these highly conserved motifs. This strategy takes advantage of the high degree of structural and amino acid sequence conservation observed in R genes. The objective of the present study was to search for RGAs in Carica papaya L. and Vasconcellea cauliflora Jacq. A. DC. Out of three primer combinations tested, only one resulted in amplification. The amplified product was cloned in pCR2.1-TOPO and then sequenced using M13 forward and reverse primers. Forty-eight clones were sequenced from each species. The 96 sequences generated for each species were cleaned of vector sequences and clustered using the CAP3 assembler. By comparison with GenBank, one RGA was identified in C. papaya, showing a BlastX e-value of 2x10 to the gb|AAP45165.1| putative disease resistant protein RGA3 (Solanum bulbocastanum). To the best of our knowledge, this is the first report of an RGA in the Caricaceae Dumort family. Preliminary structural studies were performed to further characterize this putative NBS-LRR type protein. Efforts to search for other RGAs in papaya should continue, mainly to provide a basis for the development of transgenic papaya with resistance to diseases.


INTRODUCTION
Carica papaya L. is the best-known and most widely cultivated species in the Caricaceae Dumort family. Cylicomorpha, Jacaratia, Horovitzia, Vasconcellea and Jarilla are the other five genera of this family. The Carica genus contains only one species, C. papaya, while the genus Vasconcellea contains several species that were originally classified as belonging to the Carica genus, including Vasconcellea cauliflora Jacq. A. DC. This species, known as "tapaculo", "papayo de montaña" or "zonzapote", is found from the south of Mexico to the north of South America, as well as in Trinidad. It is a well-known source of natural resistance to Papaya Ringspot Virus (PRSV), the main virus attacking papaya worldwide (Badillo, 1993, 2000, 2001; Manshardt & Wenslaff, 1989).
The majority of the resistance genes (R genes) cloned and sequenced until now are part of the nucleotide binding site-leucine-rich repeat (NBS-LRR) gene family (Rommens & Kishore, 2000). The NBS-LRR gene products are generally composed of three main domains: a) a variable N-terminal domain of approximately 200 amino acids; b) an NBS domain of 300 amino acids; and c) a more variable tandem array of approximately 10 to 40 short LRR (leucine-rich repeat) motifs (Cannon et al., 2002). The NBS domain is believed to participate in signal transduction, while the LRR domain is thought to be involved in ligand binding and pathogen recognition (Young, 2000). P-loop, RNBS-A, kinase 2, RNBS-B, RNBS-C, GLPL, and RNBS-D are also highly conserved motifs generally present in the NBS domain of the R genes (Lee et al., 2003).
A new PCR-based strategy, using degenerate primers designed from these conserved motifs, has resulted in the isolation of numerous resistance gene analogs (RGAs) from a variety of plant species such as potato (Leister et al., 1996), bean (Ferrier-Cana et al., 2003), rice (Leister & Katagiri, 2000) and several others (for review see Chelkowski & Koczyk, 2003). Once found, this type of marker can be put to a series of uses: a) as a probe to screen BAC or cDNA libraries in the process of searching for R genes; b) as a marker to be applied in marker-assisted selection; and c) to obtain resistance through its overexpression in the plant. The objective of the present study was to search for RGAs in C. papaya and V. cauliflora using primers designed from the P-loop and RNBS-D motifs. To the best of our knowledge, no RGA has so far been described for Caricaceae.
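To make the notion of a degenerate primer concrete, the sketch below expands a primer written with standard IUPAC ambiguity codes into the pool of concrete oligonucleotides it represents. The primer string `GGNAARAC` is a hypothetical example chosen for illustration; it is not one of the primers used in this study, and the function names are ours.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n

def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]

# Hypothetical primer (assumption, not taken from the study).
EXAMPLE_PRIMER = "GGNAARAC"
print(degeneracy(EXAMPLE_PRIMER))
print(expand(EXAMPLE_PRIMER))
```

In this toy case the pool size is the product of the choices at each ambiguous position (four for N, two for R), which is why highly degenerate primers quickly become large mixtures.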

MATERIAL AND METHODS
Plant genomic DNA from the transgenic papaya Embrapa PTP18 (Souza Jr. et al., 2005) and from V. cauliflora plants was extracted as described in Souza Jr. et al. (2005). The DNA was quantified and stored at -20°C until use.
PCR products were cloned into the pCR2.1-TOPO vector (TOPO TA Cloning Kit, Invitrogen Life Technologies). Forty-eight white TOP10 E. coli colonies were randomly selected per species and sequenced using M13 forward and reverse primers at the Embrapa Genetic Resources and Biotechnology DNA Sequencing Platform (http://www.cenargen.embrapa.br/laboratorios/laboratorios.html#dna).
The software PHRED (Ewing et al., 1998) was used for base calling and to estimate the error probability in the 192 chromatograms. After trimming (Telles & da Silva, 2001) to remove artifacts, low-quality sequences, and vector and primer regions, the 123 remaining sequences were clustered using the CAP3 assembler (Huang & Madan, 1999). BLAST (Altschul et al., 1997) was used to identify similarities between the resulting 76 clusters (43 singlets and 33 contigs) and sequences in the NCBI nr database (Benson et al., 2002). Sequence alignment was performed using CLUSTALW (Thompson et al., 2000). To further characterize the translated protein, the amino acid sequence was submitted to a secondary structure prediction server, using the PSIPRED method (Jones, 1999). Preliminary homology modeling was performed with the Meta-Server at the CBS-CNRS site (http://bioserv.cbs.cnrs.fr/HTML_BIO/tito.html).
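As an illustration of the quality step, a PHRED quality value Q relates to the estimated base-calling error probability P by Q = -10 log10(P). The sketch below converts Q to P and keeps the longest stretch of a read whose bases all meet a minimum quality. This is a deliberately simplified stand-in and not the trimming procedure of Telles & da Silva (2001); the threshold of 20 and the example read are assumptions.

```python
def phred_to_error_prob(q):
    """Error probability implied by a PHRED quality value: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def trim_low_quality(seq, quals, min_q=20):
    """Return the longest contiguous stretch in which every base has
    quality >= min_q (simplified illustration; min_q=20 is an assumption)."""
    best_start, best_len = 0, 0
    start = None  # start index of the current high-quality run
    for i, q in enumerate(quals):
        if q >= min_q:
            if start is None:
                start = i
            run_len = i - start + 1
            if run_len > best_len:
                best_start, best_len = start, run_len
        else:
            start = None
    return seq[best_start:best_start + best_len]

# Hypothetical read and quality values, for illustration only.
read = "ACGTACGT"
quals = [5, 30, 30, 10, 30, 30, 30, 5]
print(phred_to_error_prob(20))
print(trim_low_quality(read, quals))
```

The read counts in the text follow the same bookkeeping: 48 clones per species, each read with forward and reverse primers, gives 96 chromatograms per species and 192 in total.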

RESULTS AND DISCUSSION
The primer combination P1b and RNBS-D was the only one that successfully amplified DNA. The band profiles observed on an agarose gel differed between the two species, with the number of strong and well-defined DNA bands being higher in V. cauliflora. DNA bands of the expected size (~700 bp) were observed in both species (Figure 1).
Multiple alignment of the C. papaya sequence with retrieved RGAs demonstrates that all of them share the same conserved sequences described in other plant resistance genes (Pan et al., 2000). The alignment is shown in Figure 2. This finding indicates that C. papaya might have members of the family of NBS-LRR disease resistance genes. Of particular interest for comparison are the sequences conserved in the plant NBS region. This region is a tripartite conserved motif considered to be involved in nucleotide binding (Holt III et al., 2003). Between the NBS and the LRR regions there is a hydrophobic domain (RNBS), whose three motifs (RNBS-A, B and C) are conserved in the majority of the R genes of the NBS-LRR class (Chelkowski & Koczyk, 2003). Two conserved phenylalanine residues, separated by four amino acids, comprise the kinase 2 motif, which is a characteristic feature of NBS-LRR proteins (Mago et al., 2002). In the C. papaya sequence the majority of these conserved features were identified. At the N-terminus of the NBS domain, the conserved region denominated the RNBS-A motif may play a role in signal transduction, and the consensus G-X-X-G-X-G-K-T-T appears as the P-loop or kinase 1 motif, with an essential function in the orientation of the phosphate group (Moffett et al., 2002). Other important features are the hydrophobic residues usually present in RNBS-B and C, which are also found in the C. papaya sequence. In addition, the consensus amino acid domain GLPL is present; it is functionally associated with the structural stability of domains adjacent to the NBS complex in protein-protein interactions (Shirasu & Schulze-Lefert, 2003).
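The P-loop consensus quoted above, G-X-X-G-X-G-K-T-T, can be searched for in a protein sequence with a simple pattern match. The following is a minimal sketch, not the procedure used in this study; the peptide is a hypothetical example and is not the C. papaya sequence.

```python
import re

# P-loop (kinase 1) consensus as quoted in the text: G-X-X-G-X-G-K-T-T,
# where X is any amino acid. The lookahead allows overlapping hits.
P_LOOP = re.compile(r"(?=(G..G.GKTT))")

def find_p_loop(protein):
    """Return (1-based start, matched segment) for each consensus hit."""
    return [(m.start() + 1, m.group(1))
            for m in P_LOOP.finditer(protein.upper())]

# Hypothetical peptide, for illustration only.
EXAMPLE = "MAGVGGMGKTTLAQ"
print(find_p_loop(EXAMPLE))
```

A strict pattern such as this one misses variants that deviate from the consensus at a single position, which is one reason a multiple alignment is used in practice.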
Intracellular R proteins can be divided into subfamilies whose members have either a coiled-coil (CC) structure or a TIR motif. The TIR sequences contain domains in their amino terminus very similar to the Drosophila Toll or human interleukin receptor-like (TIR) region (Meyers et al., 2002). The TIR domain seems to be found only in dicotyledonous plants, whereas CC domains are found in both monocots and dicots (Pan et al., 2000). For the TIR classification the characteristic amino acid is aspartic acid (D) at the end of the kinase-2 domain (position 91, underlined in Figure 2). When it is replaced by tryptophan (W), the sequence can be classified in the non-TIR group. Consequently, as the C. papaya candidate shows the W residue at position 91, it should be classified as a non-TIR resistance gene analog.
A phylogenetic tree (Figure 3) of RGAs from C. papaya and nine other plant species, based on alignment of amino acid sequences using CLUSTALW, shows an evolutionary proximity of the papaya sequence to RGAs from Solanum bulbocastanum (AAP45165), Vitis vinifera (AAM21291), and Glycine max (AAL50031).
The secondary structure prediction for the putative resistance gene from C. papaya, which uses the PSI-BLAST algorithm to detect distant homologues, reveals four potential β-strands and 11 α-helices, which is similar to a predicted plant disease resistance gene product (Rigden et al., 2000). The secondary structure prediction is shown in Figure 4A. The proposed model for the common core domain is shown in Figure 4B. The common core GLPL structure is organized as a β-strand followed by a coil. The hydrogen bond network is organized and this portion of the structure seems to be stable. Preliminary structural evaluation indicates that the side chains of the conserved leucine residues might have a three-dimensional structure similar to that of leucine-zipper domains, and the binding helices might have the same motif. To determine the precise three-dimensional structure, it would be necessary to investigate other templates in the NBS region in order to establish further structural features.
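The classification rule described above (D at the end of the kinase-2 domain indicates TIR, W indicates non-TIR) can be written as a small lookup on one alignment column. This is a sketch under assumptions: the function names are ours, the toy sequences are hypothetical, and the column number must come from the actual alignment (position 91 in Figure 2).

```python
def classify_by_kinase2_residue(residue):
    """Apply the rule from the text to the final residue of kinase-2."""
    residue = residue.upper()
    if residue == "D":
        return "TIR"
    if residue == "W":
        return "non-TIR"
    return "undetermined"

def classify_aligned(aligned_seq, position):
    """Classify an aligned sequence using a 1-based alignment column."""
    return classify_by_kinase2_residue(aligned_seq[position - 1])

# Hypothetical aligned fragments; column 8 stands in for position 91.
print(classify_aligned("LLVLDDVW", 8))
print(classify_aligned("LLVLDDVD", 8))
```

Any residue other than D or W, including an alignment gap, is reported as undetermined rather than forced into one of the two groups.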
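Proximity in a tree built from an alignment ultimately reflects how much the aligned sequences differ pairwise. As a minimal sketch of that underlying quantity, the code below computes percent identity between two aligned sequences, skipping columns where either sequence has a gap. The choice of denominator is an assumption (several conventions exist), the sequences are hypothetical, and this is not a reimplementation of CLUSTALW.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither
    sequence has a gap (one of several possible conventions)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100 * identical / compared

# Hypothetical aligned fragments, for illustration only.
print(percent_identity("ACDE-G", "ACDFHG"))
```

One minus the fractional identity gives a simple uncorrected distance, which is the kind of pairwise value that distance-based tree methods take as input.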

CONCLUSIONS
To the best of our knowledge, no partial or complete sequence of an NBS-LRR type protein has so far been described for any member of the Carica genus. The sequence described here is restricted to the NBS domain and is the first one described for C. papaya. In our laboratory, new degenerate primers were designed from conserved motifs present in the NBS, as well as in the N-terminal and LRR domains, of resistance genes described in the literature and in the NCBI database. Seven new primer combinations were tested and all of them successfully amplified DNA from C. papaya and V. cauliflora (data not shown). Cloning and sequencing of these PCR products are in progress, and new RGAs are expected to be identified from this work. Efforts to search for other RGAs in papaya should continue, mainly to provide a basis for the development of transgenic papaya with resistance to diseases.

FIGURE 1 - Gel image of the PCR-amplified RGA fragments (arrow), with a size of approximately 700 bp for both species. (M) marker, (P) Carica papaya and (C) V. cauliflora.
P. DE P. R. AMARAL et al.
Pairwise comparisons of the NBS candidate with the Protein Data Bank showed the highest similarity between the GLPL sequence domain and (1Ko9) 8-oxoguanine DNA glycosylase (42% identity).