Isolation and characterization of CaMF3, an anther-specific gene in Capsicum annuum L.

Previous work on gene expression analysis based on RNA sequencing identified a variety of differentially expressed cDNA fragments in the genic male sterile-fertile line 114AB of Capsicum annuum L. In this work, we examined the accumulation of one of the transcript-derived fragments (TDFs), CaMF3 (male fertile 3), in the flower buds of a fertile line. The full genomic DNA sequence of CaMF3 was 1,951 bp long and contained 6 exons and 5 introns, with the complete sequence encoding a putative 25.89 kDa protein of 234 amino acids. The predicted protein of CaMF3 shared sequence similarity with members of the isoamyl acetate-hydrolyzing esterase (IAH1) protein family. CaMF3 expression was detected only in flower buds at stages 7 and 8 and in open flowers of a male fertile line; no expression was observed in any organs of a male sterile line. Fine expression analysis revealed that CaMF3 was expressed specifically in anthers of the fertile line. These results suggest that CaMF3 is an anther-specific gene that may be essential for anther or pollen development in C. annuum.


Introduction
Chili pepper is one of the most important vegetable crops in many countries, including China, primarily because of the high nutritional value of its fruits (high content of dry matter, vitamin C and B-complex vitamins, minerals, essential oils and carotenoids); these peppers are widely used in the culinary and food industries (Pino et al., 2007; Irikova et al., 2011). Chili pepper plants are frequently cross-pollinated, and heterosis is widely used to increase production and economic return in the hot pepper industry. However, manual emasculation and pollination of flower buds is still a major method for producing hybrid Capsicum seeds in many countries, despite its high cost and the difficulty of ensuring seed purity. Fortunately, the use of male sterile lines can solve this problem.
In peppers, male sterility is classified as either cytoplasmic male sterility (CMS) or genic male sterility (GMS) (Shifriss, 1997; Lee et al., 2010). Over a dozen GMS peppers have been found in nature or have been artificially produced by mutagenesis with X-rays, gamma rays or treatment with ethylmethanesulfonate (Shifriss, 1997). In the GMS line 114AB of Capsicum annuum, the only difference between fertile and sterile plants is that the anthers and pollen of fertile plants develop normally, whereas the anthers of sterile plants are abnormal and contain no pollen at maturity; the sterile plants have therefore become very important for studying pollen development (Hao et al., 2008). Consequently, genes expressed in anthers of fertile plants but not in sterile plants probably have roles in the development of pollen or anthers.
Analysis of the chili pepper transcriptome identified tissue-specific genes among 11,225 consensus sequences, with only two genes being identified as anther-specific (Kim et al., 2008). This finding indicated that the molecular mechanism of anther development is largely unknown and that there are probably numerous genes related to anther development awaiting identification. Two anther-specific genes, Camf1 (HQ386720) and CaMF2 (JF411954), have been identified in C. annuum by cDNA-amplified fragment length polymorphism (Chen et al., 2011, 2012). These two genes are expressed only in middle-stage flower buds of fertile plants and may have a role in pollen development and germination in peppers. In an analysis of differential gene expression based on RNA sequencing of flower buds at different developmental stages from sterile and fertile plants of the genic male sterile-fertile line 114AB of C. annuum, we identified hundreds of ESTs (expressed sequence tags) that are differentially expressed in fertile and sterile plants (unpublished data). In the present study, an anther-specific gene designated CaMF3 (accession number JN975046) was isolated from flower buds of fertile plants by using an in silico approach and RT-PCR.
The deduced amino acid sequence of CaMF3 showed similarity to a putative isoamyl acetate-hydrolyzing esterase (IAH1). Expression analysis indicated that CaMF3 was an anther-specific gene expressed in flower buds at late developmental stages and in open flowers. To our knowledge, there has been no description of the expression or functional analysis of this gene family in plants or animals. The study of CaMF3 could improve our understanding of the molecular mechanisms of anther or pollen development and the roles of the isoamyl acetate-hydrolyzing esterase gene family.

Plant material
The genic male sterile line 114AB of chili pepper (C. annuum L.) was cultivated on the experimental farm of the South China Agricultural University, with 50% of the plants being sterile and 50% being fertile. The dynamics of anther development differ between sterile and fertile plants, especially with regard to anther maturity. Anther length and diameter in sterile plants are smaller than in fertile plants at the stage when sepals spread out. The anther filaments of sterile plants are very short, dark purple and small, with no pollen in mature anthers. However, the sepals, petals and pistils of sterile plants are normal, so sterile plants can be maintained by fertilizing them with normal pollen.
Plant material was collected and immediately frozen in liquid nitrogen prior to storage at -75°C until RNA and DNA extraction. The samples included roots, tender stems, fresh leaves, open flowers, sepals, petals, anthers, pistils and flower buds from eight developmental stages (Chen et al., 2011) from sterile and fertile plants.

DNA/RNA extraction and cDNA synthesis
Genomic DNA was extracted from young leaves with cetyltrimethylammonium bromide (CTAB), essentially as described by Murray and Thompson (1980). Total RNA was extracted from different tissues using TRIzol reagent (Invitrogen, USA), according to the manufacturer's instructions. Each RNA sample was subjected to DNase digestion (Takara, Dalian, China) to remove any remaining DNA. The RNA content was quantified spectrophotometrically (BioPhotometer plus, Eppendorf, Germany) and checked by electrophoresis on 1.2% denaturing agarose gels. First-strand cDNA was synthesized using a SMART PCR cDNA synthesis kit (Clontech, USA) according to the manufacturer's instructions.

Amplification of the CaMF3 gene
The EST sequence (EL812860) identified by RNA sequencing was used as a query in a BLAST search against the EST database at GenBank. EST sequences that shared high identity (> 98%) with EST EL812860 were selected and assembled. The primers for the full-length cDNA and genomic DNA of CaMF3 were designed based on the deduced full cDNA sequence. PCR was done using cDNA and genomic DNA from a mixture of flower buds at different developmental stages of fertile plants as templates. The full-length cDNA and genomic DNA were amplified with the same primers: P1: 5'-GCACGAGGAAAAAATCCAAGAATTTGG-3', P2: 5'-GACATTCTTTTGTTGATGAAACTGGTA-3'. The 50 µL reaction mixture contained 0.1 µg of template DNA, 2 µL of each primer (10 µM), 5.0 µL of 10× PCR buffer (Mg2+ free), 4.0 µL of Mg2+ (25 mM), 1.0 µL of dNTPs (10 mM) and 3 U of Taq polymerase (Takara). The thermal cycling parameters were as follows: denaturation at 95°C for 3 min followed by 35 cycles of 95°C for 50 s, 52°C for 40 s and 72°C for 2 min, and an additional extension for 5 min at 72°C. The amplified PCR products were purified using a gel extraction kit (Takara), cloned into the pMD19-T vector (Takara) and then transformed into Escherichia coli (strain DH5α). Successful clones were sequenced.
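The final concentration of each reaction component follows from the dilution relation C1·V1 = C2·V2. The short sketch below applies it to the stock concentrations and volumes listed above; the function and variable names are illustrative and not part of the original protocol. Because only the ratio of component volume to total volume matters, the calculation does not depend on the volume unit, and each result carries the unit of its stock.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Dilution relation C1*V1 = C2*V2, solved for C2."""
    return stock_conc * stock_volume / total_volume


TOTAL_VOLUME = 50.0  # total reaction volume, same unit as the component volumes

# (stock concentration, volume added); values taken from the reaction recipe
components = {
    "PCR buffer (x-fold)": (10.0, 5.0),
    "Mg2+ (mM)": (25.0, 4.0),
    "dNTPs (mM)": (10.0, 1.0),
    "each primer (stock units)": (10.0, 2.0),
}

final = {
    name: final_concentration(conc, vol, TOTAL_VOLUME)
    for name, (conc, vol) in components.items()
}

for name, value in final.items():
    print(f"{name}: {value:.2f}")
```

Under these inputs the buffer ends up at its 1× working strength, which is a quick sanity check that the recipe is internally consistent.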

Sequence analysis
Database searches were done through the NCBI World Wide Web server using the BLAST (Basic Local Alignment Search Tool) network service. DNASTAR and DNAMAN software were used to analyze the nucleotide sequence and for multiple amino acid sequence alignments. SignalP version 3.0 was used to identify signal peptides. Hydrophobic character prediction was done using ProtParam. The phylogenetic tree was constructed with MEGA 4.0 software (Tamura et al., 2007) and homology alignment was assessed with ClustalW.
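For readers unfamiliar with the hydropathy score reported by ProtParam, the grand average of hydropathicity (GRAVY) is the sum of the Kyte-Doolittle hydropathy values of all residues divided by the sequence length; negative values indicate a hydrophilic protein. The following is a minimal sketch of that calculation, not the ProtParam implementation. The function name and the six-residue test peptide are hypothetical, and the peptide is not part of the CaMF3 sequence.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy over all residues of a protein."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


# Hypothetical peptide used only to exercise the function
example_peptide = "MSKDEL"
print(f"GRAVY of {example_peptide}: {gravy(example_peptide):.3f}")
```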

Expression analysis by RT-PCR and real-time quantitative RT-PCR
For gene expression analysis, first-strand cDNAs from flower buds obtained at eight developmental stages, open flowers, roots, tender stems, fresh leaves, sepals, petals, anthers and pistils of fertile and sterile plants were used as templates. RT-PCR was done with gene-specific primers (5'-CGAAGAGACTGGAAAGTATGCGAA-3', 5'-AGGGTTAGACGGTAGGAGAGATTG-3') and primers for β-actin (forward: 5'-CCTCTTCACTCTCTGCTCTCTCCTCA-3', reverse: 5'-GTCATTTTCTCTCTATTTGCCTTGGG-3'). The RT-PCR products from each tissue were electrophoresed on 1.2% agarose gels and the amounts of template cDNA added to the reaction were adjusted until the levels of actin product were equal in different samples. The PCR was repeated until the level of product was consistent. Differential expression was then analyzed by PCR using the amounts of template defined in this way and CaMF3-specific primers. Three rounds of RT-PCR were done with three independently isolated total RNA samples.
Quantitative real-time RT-PCR was done using a SYBR Premix Ex Taq kit (Takara), according to the manufacturer's instructions. Specific primers were designed for CaMF3 (5'-GAAGAGACTGGAAAGTATGCGAAA-3', 5'-ACTGCCTTCTGGTGTGAAATGTAC-3') and β-actin (5'-AATCAATCCCTCCACCTCTTCACTC-3', 5'-CATCACCAGCAAATCCAGCCTT-3'). The expected PCR products of the target gene (CaMF3) and housekeeping (control) gene (β-actin) were 150 bp and 173 bp, respectively. The size of the products was confirmed by agarose gel electrophoresis and the melting curves were analyzed. Triplicate quantitative PCR experiments were run for each sample and the levels of expression were normalized against β-actin. Relative gene expression was assessed using the 2^(-ΔΔCt) method (Livak and Schmittgen, 2001).
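The relative quantification step can be illustrated with a short sketch of the Livak and Schmittgen calculation: the target Ct is first normalized to the reference gene (ΔCt), the ΔCt of a calibrator sample is then subtracted (ΔΔCt), and the fold change is 2 raised to the negative of that value. The method assumes amplification efficiencies close to 100% for both genes. All Ct values below are hypothetical and serve only to show the arithmetic; the choice of stage 7 as calibrator is likewise an assumption, not a detail given in the text.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change by the 2^(-ddCt) method from lists of replicate Ct values."""
    delta_ct_sample = mean(ct_target) - mean(ct_reference)
    delta_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values (not measured data)
stage8_camf3 = [24.1, 23.9, 24.0]
stage8_actin = [18.0, 18.1, 17.9]
stage7_camf3 = [26.0, 26.1, 25.9]   # calibrator sample (assumed)
stage7_actin = [18.0, 17.9, 18.1]

fold_change = relative_expression(
    stage8_camf3, stage8_actin, stage7_camf3, stage7_actin
)
print(f"Relative expression (stage 8 vs stage 7): {fold_change:.2f}")
```

With these illustrative inputs the target Ct drops by about two cycles relative to the calibrator while the reference is unchanged, which corresponds to roughly a four-fold increase.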

Cloning of CaMF3
In our previous study, differential gene expression was assessed by RNA sequencing of the genic male sterile-fertile line 114AB of C. annuum. A variety of differentially expressed cDNA fragments was detected in fertile or sterile lines (authors' unpublished data). Of these, a transcript-derived fragment, CaMF3, was investigated for its specific accumulation in the flower buds of a fertile line.
The full-length cDNA of CaMF3 was obtained by in silico cloning: the CaMF3 EST (EL812860) sequence was used to search the NCBI EST database, and two ESTs (GD089386 and GD099223) were identified from a cDNA library of C. annuum. The full-length cDNA of CaMF3 was deduced by aligning these ESTs and eliminating redundant sequences. The sequence consisted of 974 bp containing a 705-bp open reading frame. Subsequently, bands of approximately 1,000 bp and 2,000 bp were amplified from cDNA and genomic DNA templates, respectively, using the P1/P2 primer pair. The amplified PCR products were purified, cloned and sequenced.

Sequence analysis
The full-length cDNA of CaMF3 was 974 bp long and contained a 705-bp open reading frame (Figure 1). The genomic DNA of CaMF3 was isolated using the same primer pair that was used to clone the cDNA sequence. Sequencing results showed that the DNA sequence was 1,951 bp long and contained five introns (96, 541, 133, 94 and 113 bp long) (Figure 2A). The complete coding sequence encoded a putative 25.89 kDa protein of 234 amino acids with a theoretical pI of 6.62. The SignalP 3.0 program was used to detect signal peptide sequences; however, no significant signal peptide was found in the amino acid sequence of CaMF3. Hydrophobic character prediction based on the ProtParam tool indicated that CaMF3 was hydrophilic, with a grand average of hydropathicity (GRAVY) of -0.145. The instability index (II) was calculated to be 37.53, which classified the protein as stable. The deduced protein contained an isoamyl acetate hydrolase-like domain (amino acids 3-212) (cd01838, NCBI Conserved Domain Database) (Wei et al., 1995; Marchler-Bauer et al., 2011), indicating that CaMF3 belongs to this protein family (Wei et al., 1995). The protein also contained three conserved features/sites: the active site (Ser12, Gly44, Asn78, Asp189 and His192), the typical catalytic triad (Ser12, Asp189 and His192) and three oxyanion hole residues (Ser12, Gly44 and Asn78) (Figure 2B).
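The reported lengths can be cross-checked with simple arithmetic: subtracting the five introns from the genomic sequence should give the cDNA length, and the open reading frame should account for the number of residues. The sketch below performs these checks using only figures stated in the text. It assumes that the 705-bp open reading frame includes the stop codon, which is an inference from the numbers rather than a statement in the source; the variable names are illustrative.

```python
GENOMIC_BP = 1951
INTRONS_BP = [96, 541, 133, 94, 113]
CDNA_BP = 974
ORF_BP = 705
PROTEIN_AA = 234
PROTEIN_KDA = 25.89

# Exonic sequence is what remains after removing all introns
exon_total_bp = GENOMIC_BP - sum(INTRONS_BP)

# n introns separate n + 1 exons
exon_count = len(INTRONS_BP) + 1

# Assumption: the ORF length counts the stop codon, which encodes no residue
codons = ORF_BP // 3
residues = codons - 1

# Average mass per residue, a rough plausibility check on the predicted mass
mean_residue_da = PROTEIN_KDA * 1000 / PROTEIN_AA

print(f"Exonic length: {exon_total_bp} bp (cDNA: {CDNA_BP} bp)")
print(f"Exons: {exon_count}")
print(f"Residues from ORF: {residues} (reported: {PROTEIN_AA})")
print(f"Mean residue mass: {mean_residue_da:.1f} Da")
```

The exonic length matches the 974-bp cDNA, and the mean residue mass of about 110 Da is typical for proteins, so the reported values are mutually consistent under the stated assumption.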
BLAST analysis showed that the deduced amino acid sequence of CaMF3 was homologous with many putative isoamyl acetate-hydrolyzing esterases in plants and animals. The putative CaMF3 protein had > 38% identity with most of the proteins from plants and animals, but showed lower identity with those from bacteria and yeasts (Table S1, Supplementary material).
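Percent identity figures such as those above are derived from pairwise alignments. As a minimal sketch, the function below counts identical positions over the columns of two already-aligned sequences, with "-" marking gaps; dividing by the full alignment length (gap columns included) is one common convention, and other tools may use a different denominator. The function name and the two short aligned sequences are hypothetical and unrelated to CaMF3.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap (possible in multiple alignments)
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


# Hypothetical aligned pair used only to exercise the function
seq_a = "MKT-LV"
seq_b = "MRTALV"
print(f"Identity: {percent_identity(seq_a, seq_b):.1f}%")
```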
The deduced amino acid sequences of CaMF3 and 25 other known isoamyl acetate hydrolase-related proteins (Table S1) were aligned using the MEGA 4 program (Figure 2C). A phylogenetic tree based on these alignments revealed four main clades (I-IV). CaMF3 and nine putative isoamyl acetate hydrolases from land plants belonged to clade I. Clade II contained enzymes of metazoan origin, clade III contained only putative isoamyl acetate hydrolases from yeasts, and five putative isoamyl acetate hydrolases from bacteria belonged to clade IV (Figure 2C). The sequence alignment also showed that, like CaMF3, all of the other isoamyl acetate hydrolases had an isoamyl acetate hydrolase-like domain (cd01838, NCBI Conserved Domain Database), as well as the same Ser/Asp/His sequence in the catalytic triad and three oxyanion hole residues (Ser, Gly and Asn) (Figure 3).
[Figure 2 caption, partial: ... Table S1. Groups I-IV correspond to land plants, metazoans, yeasts and bacteria, respectively.]

[Figure 3 caption, partial: ... Table S1. The conserved Ser/Asp/His sequence is indicated by asterisks and three residues (Ser12, Gly44 and Asn78) are indicated by triangles. CaMF3 is the predicted amino acid sequence of CaMF3 from Capsicum and the other 25 IAH1-related genes are listed in Table S1. The numbers on the right margin indicate the positions of amino acid residues. Identical amino acids are indicated in pink, 75% conserved amino acids in blue and 50% conserved amino acids in yellow.]

Expression analysis of CaMF3
Expression analysis was done by RT-PCR using equal amounts of template cDNA prepared from total RNA of different tissues, with actin as a reference gene. The results showed that CaMF3 was expressed only in flower buds at stages 7 and 8 and in open flowers of the male fertile pepper line '114B', but not in flower buds at any stage of the male sterile line '114A' (Figure 4A). These results indicated that CaMF3 was expressed only in mature anthers or pollen of fertile plants. In addition, CaMF3 was expressed only in anthers of flower buds of the male fertile pepper line '114B', with no expression detected in sepals, petals, pistils, roots, tender stems or fresh leaves. There was also no CaMF3 expression in any tissues of the male sterile line '114A' (Figure 5A).
Quantitative real-time RT-PCR was used to examine the fine spatio-temporal expression pattern of CaMF3 in flower bud tissues at different developmental stages from fertile and sterile plants. The expression of CaMF3 was first detected at stage 7 and increased until anthesis; no CaMF3 transcript was detected in flower buds at stages 1-6 of the fertile line or in flower buds at any stage of the sterile line (Figure 4B), thus confirming the results obtained by RT-PCR. Together, these findings indicated that CaMF3 was an anther-specific gene and that it was strongly expressed in mature anthers of the fertile line but not in the sterile line (Figure 5B).

Discussion
The anther plays a prominent role in crop production because it is responsible for male reproductive processes necessary for generating seeds that will produce the next generation of plants. Anther-specific gene expression programs correlate with the differentiation and degeneration of specific anther tissues and cell types (Goldberg, 1993). In recent years, a large number of anther/pollen-specific genes involved in anther/pollen development have been identified based on anther/pollen-specific expression patterns in various plant species (Tzeng et al., 2009; Agyare-Tabbi et al., 2010; Dobritsa et al., 2010; Zhang et al., 2010, 2011; Chen et al., 2011).
The only difference between fertile and sterile plants of GMS line 114AB of C. annuum is that the anthers and pollen of fertile plants develop normally whereas the anthers of sterile plants are abnormal and there is no pollen in mature anthers. We therefore speculated that a few early-expressed genes controlled fertility and regulated other genes expressed in the middle and later periods of pollen development in pepper, and that the expression of these additional genes may affect anther or pollen development. In previous work (Chen et al., 2011, 2012), we used cDNA-amplified fragment length polymorphism (cDNA-AFLP) and RNA-Seq technology to examine differential gene expression in the genic male sterile-fertile line 114AB of C. annuum L. and identified two anther-specific genes, Camf1 (HQ386720) and CaMF2 (JF411954). Both of these genes were expressed only in flower buds in the middle stages of development in fertile plants. In the present work, we identified another anther-specific gene (CaMF3) in pepper that was expressed only in fertile plants. This finding indicated that CaMF3 may be related to the fertility of pepper and could be involved in anther development.
The genes expressed in anthers for male gametogenesis can be divided into "early" and "late" groups based on when they are expressed. Most of the tested genes expressed in pollen belonged to the "late" category during pollen development (Mascarenhas, 1990). Generally, these late genes start to be expressed after microspore mitosis and their expression increases until anthesis; most of these "late" gene products may be involved in pollen maturation or germination (Mascarenhas, 1993). As shown here, CaMF3 was detected only in flower buds at stages 7 and 8 and in open flowers of male fertile pepper '114B' (Figure 4), indicating that CaMF3 belongs to the "late" group of genes; the gene products of CaMF3 may be involved in pollen maturation or germination (Mascarenhas, 1993). However, since the cellular location of CaMF3 mRNA during pollen and anther development has not been determined, the gene products of CaMF3 could also be involved in anther dehiscence.
[Figure 5 caption, partial: Lanes F1-F7: roots, tender stems, fresh leaves, sepals, petals, anthers and pistils, respectively, from fertile plants. Lanes S1-S7: roots, tender stems, fresh leaves, sepals, petals, anthers and pistils, respectively, from sterile plants.]
Sequence analysis using bioinformatics revealed that the deduced amino acid sequence of CaMF3 was homologous to a number of putative IAH1 proteins in many species, including plants and animals. As with other isoamyl acetate-hydrolyzing esterases, CaMF3 has a Ser12/Asp189/His192 sequence in its predicted catalytic triad and three residues (Ser12, Gly44 and Asn78) that act as oxyanion holes in IAH1 to stabilize the reaction intermediate (Wei et al., 1995; Ma et al., 2011).
Isoamyl acetate is hydrolyzed by isoamyl acetate-hydrolyzing esterase, an enzyme that catalyzes the hydrolytic cleavage of acetyl esters (Fukuda et al., 1996, 1998, 2000). Yeast IAH1 has a distinct GDSL sequence motif near the N-terminus and four conserved sequence blocks that are characteristic of SGNH-hydrolases, a subfamily of hydrolytic/lipolytic enzymes (Akoh et al., 2004; Marchler-Bauer et al., 2011). GDSL esterases and lipases are hydrolytic enzymes with multifunctional properties such as broad substrate specificity and regiospecificity. They have potential for use in the hydrolysis and synthesis of important ester compounds of pharmaceutical, food, biochemical and biological interest (Akoh et al., 2004). Initial studies of IAH1 focused on yeasts, but with the rapid sequencing of genomes in higher plants and animals, homologous IAH1 sequences were identified in a number of plants and animals, such as Ricinus communis, Populus trichocarpa, Arabidopsis thaliana, Vitis vinifera, Oryza sativa japonica and Anolis carolinensis. However, little is known about the expression patterns or functions of IAH1-like genes in higher plants and animals.
In conclusion, this is the first report on the cloning and characterization of a new IAH1-like gene (CaMF3) in C. annuum. CaMF3 was identified as an anther-specific gene that was only expressed in flower buds at late stages of development and in open flowers from fertile pepper. We speculate that CaMF3 is involved in late pollen development and may possibly play a role in pollen maturation and germination or in other biological processes involved in anther development in chili pepper. Further study of CaMF3 could improve our understanding of the molecular mechanisms of anther development in pepper and of the roles of the IAH1 gene family in anther or pollen development.