Sequence-related amplified polymorphisms (SRAPs) reveal genetic diversity and variation regions in upland cotton (Gossypium..

The lack of genetic diversity is an obstacle for genetic improvement of upland cotton in China; thus, new technologies must be developed to produce more polymorphic molecular markers associated with agricultural traits in the existing breeding resources. A highly efficient and economical sequence-related amplified polymorphism (SRAP) marker technology based on the automated fragment analyzer ABI3500xl was developed to detect genetic diversity in upland cotton. Using this strategy, we screened 7,872 pairs of SRAP primers for polymorphisms and detected 504 polymorphic markers. Of these, 165 were used for genetic diversity analysis in 128 upland cotton varieties collected nationwide in China. Our method combines a traditional molecular marker development technology with an economical and easy-to-operate strategy for breeders.


INTRODUCTION
Upland cotton (Gossypium hirsutum L., 2n=AADD=52) is the most important textile fiber crop in the world. It is one of the four wild Gossypium species that were independently domesticated more than 5000 years ago and transformed into fiber plants (Smith and Stephens 1971). Owing to its high yield, long fiber and wide environmental adaptability, upland cotton has become the most common Gossypium species on the market, accounting for more than 90% of world production (Dhivya et al. 2016). In China, the planting area of upland cotton reached four million hectares in 2014, with an average yield of 1463.25 kg ha⁻¹ (National Bureau of Statistics of People's Republic of China, 2014, http://www.stats.gov.cn/). Genetic improvement (e.g., hybridization and transgenic introgression for insect resistance) has played a key role in the success of cotton production.
In spite of the significant achievements of upland cotton breeding, the genetic diversity is extremely low in China, according to recent studies (Wendel 1992, Ahmad et al. 2012). The low genetic diversity can be explained by three reasons: 1) upland cotton originated from South America (Basu 1996), and wild varieties and naturally occurring germplasm are absent in China, which creates genetic barriers against the introduction of wild resources into the available germplasm; 2) most of the cotton varieties planted in China were imported from the United States, resulting in simple genetic origins; and 3) the creation of insect-resistant plants by genetic engineering restricted the core germplasm collection to a few elite genotypes.
A low level of genetic diversity is a major obstacle for genetic research and improvement of upland cotton in China. Firstly, under low genetic diversity only a few useful markers can be identified to construct a genetic linkage map. An integrated map covering 7,424 molecular markers was constructed from 28 mapping populations of tetraploid cotton (Yu et al. 2010). Among those 28 mapping populations, 21 were derived from G. hirsutum × G. barbadense, one from G. hirsutum × G. tomentosum, and six from two G. hirsutum parents. The latter six populations, with both parents from G. hirsutum, contributed only 574 markers to the integrated map (Yu et al. 2010). The restricted number of markers is a limitation for dissecting the genetic basis of complex agricultural traits and for map-based gene cloning. Secondly, low genetic diversity hampers the screening of molecular markers associated with agricultural traits. Linkage disequilibrium (LD) in upland cotton was found to extend over about 3-4 cM, and 1 cM corresponds to a physical distance of 450 kb in upland cotton. In molecular marker-assisted selection, an LD block may therefore harbor several antagonistic traits, and more markers will be needed to break LD. Thirdly, low allele variation limits the gene source for genetic improvement, e.g., for choosing parents and predicting the heterosis level.
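The physical size implied by the LD figures above can be worked out directly. The short sketch below only converts the two numbers quoted in the text (3-4 cM and 450 kb per cM); the function name is illustrative and not part of the original study.

```python
# Convert the reported LD extent from genetic distance (cM) to physical distance (Mb).
KB_PER_CM = 450        # from the text: 1 cM corresponds to 450 kb in upland cotton
LD_CM_RANGE = (3, 4)   # from the text: LD extends over about 3-4 cM


def cm_to_mb(cm, kb_per_cm=KB_PER_CM):
    """Return the physical length in megabases for a genetic length in cM."""
    return cm * kb_per_cm / 1000


ld_mb_range = tuple(cm_to_mb(cm) for cm in LD_CM_RANGE)
print(ld_mb_range)  # (1.35, 1.8)
```

Under these figures an LD block spans roughly 1.35-1.8 Mb, which is large enough to contain many genes and illustrates why denser markers are needed.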
To accurately evaluate genetic diversity in plant species, especially in crop cultivars with low genetic diversity, researchers have been exploring additional types of molecular markers. There are many types of molecular marker, such as AFLP, TRAP, SSR, SNP (Vos et al. 1995, Powell et al. 1996, Zhang and Stewart 2000, Chen and Du 2006, Fang et al. 2010, Wang et al. 2012a, Klabunde et al. 2016) and Sequence-Related Amplified Polymorphism (SRAP) markers, a PCR system based on primer pair amplification that preferentially amplifies open reading frames and yields markers evenly distributed in the genome (Li and Quiros 2001). Compared with other molecular markers, SRAP has the advantages of being more economical and efficient than SSR and SNP, simpler than AFLP, and more reproducible than RAPD. A set of upstream and downstream primers can be combined randomly for PCR amplification, producing plenty of primer pairs for SRAP. These characteristics of SRAP facilitate polymorphism detection among genotypes with a low level of genetic diversity, as for example in upland cotton.
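The combinatorial gain from pairing upstream and downstream primers can be illustrated with a minimal sketch. The primer names and counts below are hypothetical and chosen only for illustration; the text does not state how many upstream and downstream primers made up the 7,872 pairs.

```python
from itertools import product

# Hypothetical primer names and counts, for illustration only.
forward = [f"me{i}" for i in range(1, 5)]   # 4 upstream primers
reverse = [f"em{j}" for j in range(1, 7)]   # 6 downstream primers

# Every upstream primer can be paired with every downstream primer.
pairs = list(product(forward, reverse))
print(len(pairs))  # 24
```

With m upstream and n downstream primers, m × n primer pairs are available, so a modest number of synthesized primers yields a large screening set.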
In this study, SRAP technology was used to screen for polymorphisms, i.e., genetic markers, in a panel of 128 upland cotton varieties from the two main cotton-producing regions of China. We aimed to detect more molecular markers among upland cotton varieties with a highly efficient and lower-cost technology. With these markers, the genetic diversity of the upland cotton varieties was evaluated. Moreover, our results will provide more molecular markers to assist and accelerate upland cotton breeding.

Plant material
A panel of 128 upland cotton varieties was collected from two main ecological cotton-producing areas in China: the Yellow and Yangtze river regions. These varieties include typical cultivars bred by major cotton breeding institutes in China, which inherited the accumulated genetic diversity of cotton germplasm since the 1950s. The varieties are described in Table 1. All experiments were carried out in Zhengzhou (lat 34° 44' N, long 113° 42' E, alt 110 m asl), China.

DNA extraction
The seed coats of upland cotton seeds were removed after the seeds were immersed in H2O overnight at room temperature. Then the cotyledons were isolated and ground with CTAB preheated to 65 °C in a mortar. From the ground cotyledons, genomic DNA was extracted according to Zhang and Stewart (2000), with some modifications. The DNA quality was tested in agarose gel and manually adjusted to a final concentration of 50 ng µL⁻¹.

Polymorphism detection among varieties
One hundred and twenty-eight DNA samples were pipetted into 384-well PCR plates with three replications, and eight samples randomly selected in rows were used as templates. Nine upstream SRAP primers were labeled with different dyes to detect the PCR products with the ABI3500xl. PCR products with distinct bands and unbiased distribution among the eight samples were considered polymorphic.
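Since 128 samples in three replications give exactly 384 reactions, one plate can hold the full panel. The sketch below shows one possible mapping of samples to wells; the row-major layout with replicates in consecutive wells is an assumption made for illustration, as the text does not describe the actual plate layout.

```python
import string

ROWS = string.ascii_uppercase[:16]   # rows A-P of a standard 384-well plate
COLS = range(1, 25)                  # columns 1-24
N_SAMPLES, N_REPS = 128, 3           # from the text

wells = [f"{r}{c}" for r in ROWS for c in COLS]   # row-major order: A1 ... P24
assert len(wells) == N_SAMPLES * N_REPS           # 384 wells

# Hypothetical layout: the replicates of a sample occupy consecutive wells.
# Each well maps to (sample number, replicate number), both starting at 1.
layout = {well: (i // N_REPS + 1, i % N_REPS + 1) for i, well in enumerate(wells)}
```

For example, under this assumed layout `layout["A4"]` refers to the first replicate of sample 2.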

Genotyping varieties with polymorphism revealed by SRAP primers
Primer pairs with polymorphic amplification were used to amplify DNA samples of all 128 varieties. At least two PCR products derived from primers labeled with different fluorescent dyes were mixed when genotyping with the ABI3500xl, to make the procedure more efficient and economical. To avoid interference between the dyes used to label different primers, SRAP products of different lengths, or at least with polymorphic bands of different sizes, were mixed for electrophoresis on the ABI3500xl. For dominant markers, 1 and 0 were assigned to presence and absence of amplified fragments, respectively; for co-dominant markers, 1 and 0 represented polymorphic bands of different lengths.
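The 1/0 scoring of dominant markers described above can be sketched as follows. This is a minimal illustration, not the procedure actually used with the ABI3500xl output: the function name, the 1 bp size tolerance and all fragment sizes are assumptions.

```python
def score_dominant(peaks_by_variety, marker_sizes, tolerance=1.0):
    """Score each marker as 1 (fragment present) or 0 (absent) per variety.

    peaks_by_variety: {variety: [fragment sizes in bp]}
    marker_sizes: expected sizes (bp) of the polymorphic fragments
    tolerance: assumed size window (bp) for calling a fragment present
    """
    matrix = {}
    for variety, peaks in peaks_by_variety.items():
        matrix[variety] = [
            int(any(abs(p - size) <= tolerance for p in peaks))
            for size in marker_sizes
        ]
    return matrix


# Invented example data: three varieties, three markers.
peaks = {"V1": [152.1, 310.4], "V2": [310.0], "V3": []}
markers = [152, 310, 420]
scores = score_dominant(peaks, markers)
print(scores)  # {'V1': [1, 1, 0], 'V2': [0, 1, 0], 'V3': [0, 0, 0]}
```

The resulting binary matrix (varieties by markers) is the usual input for similarity and cluster analysis.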

Analyzing genetic diversity
Genetic similarities were calculated using the Numerical Taxonomy and Multivariate Analysis System software package NTSYS-pc version 2.10e. Cluster analysis was performed using UPGMA (Unweighted Pair-Group Method with Arithmetic Mean) based on genetic distances. The UPGMA tree was constructed using the SAHN module of NTSYS-pc 2.10e.
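To make the similarity calculation concrete, the sketch below computes pairwise similarities from binary marker profiles. The text does not name the coefficient used in NTSYS-pc, so the simple matching coefficient here is an assumption, and the three profiles are invented.

```python
def simple_matching(a, b):
    """Simple matching coefficient: share of markers with identical scores."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of markers")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity_matrix(profiles):
    """Return {(name1, name2): similarity} for all ordered pairs of varieties."""
    return {
        (p, q): simple_matching(profiles[p], profiles[q])
        for p in profiles
        for q in profiles
    }


# Invented binary profiles (1 = band present, 0 = band absent).
profiles = {"V1": [1, 1, 0, 1], "V2": [1, 0, 0, 1], "V3": [0, 0, 1, 0]}
sim = similarity_matrix(profiles)
print(sim[("V1", "V2")], sim[("V2", "V3")])  # 0.75 0.25
```

Other coefficients (for example Dice or Jaccard) ignore shared absences and may be preferred for dominant markers; the choice affects the absolute similarity values.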

Polymorphisms among varieties indicate lack of genetic diversity
Although SRAP markers are efficient at detecting polymorphisms, and 504 polymorphic markers were detected using 7,872 pairs of SRAP primers, the ratio of polymorphism was only 6.4%. This result is consistent with previous research and indicates low genetic diversity among upland cotton varieties in China. We chose 72 primer pairs with higher amplification efficiency and reproducibility (Figure 1A), which produced 165 polymorphic fragments. More than 60% of the primer pairs generated 1-2 polymorphisms, about 20% produced three, and 10% amplified more than four (Table 2). Using these primers, 128 varieties were genotyped. Most SRAPs in upland cotton are dominant (Figure 1B) and a few are co-dominant among the different varieties (Figure 1C).
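The ratios quoted above follow directly from the reported counts, as the short check below shows. Only numbers stated in the text are used; the variable names are illustrative.

```python
primer_pairs_screened = 7872   # SRAP primer pairs screened
polymorphic_markers = 504      # polymorphic markers detected
selected_pairs = 72            # primer pairs chosen for genotyping
selected_fragments = 165       # polymorphic fragments from the chosen pairs

ratio = polymorphic_markers / primer_pairs_screened
per_pair = selected_fragments / selected_pairs

print(f"{ratio:.1%}")     # 6.4%
print(f"{per_pair:.2f}")  # 2.29
```

The selected primer pairs thus yielded on average about 2.3 polymorphic fragments each.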

Genetic diversity of upland cotton collection
The genetic diversity of the 128 cultivars was evaluated with 165 polymorphic markers. The genetic similarity coefficients among different cultivars were between 0.48 and 0.80 (Figure 2). The similarity between Lumianyan37 and Xiangzamian8 was the highest (0.80), followed by that between Zhongmianyan30 and Zhongmiansuo41 (0.79) (Figure 2). All these cultivars were bred for insect resistance with a transgenic Bt gene, and the high genetic similarity may be due to lines with a similar genetic background. In contrast, the similarity coefficient between Xinluzao19 and Huazamian4, from the Xinjiang Uyghur Autonomous Region in northwestern China and the Yangtze valley in Jiangxi Province, respectively, was the lowest (0.48) (Figure 2).
To understand the genetic relationships of the 128 varieties, we performed cluster analysis. The varieties were divided into four clusters at a similarity threshold of 0.52 (Figure 2). Clusters I, II, and IV included 36, 57 and 34 varieties, respectively, while cluster III contained only one (Xinluzao19). Each cluster contained varieties from different cultivation regions (Yellow River Valley, Yangtze River Valley and northwestern China) and different cotton types (hybrids and conventional).
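How a similarity threshold partitions varieties under UPGMA can be shown with a small pure-Python sketch. It is a simplified stand-in for the NTSYS-pc analysis, not a reproduction of it: the function name and the four-variety similarity values are invented, and only the threshold of 0.52 is taken from the text.

```python
def upgma_clusters(names, sim, threshold):
    """Average-linkage (UPGMA) clustering, stopped at a similarity threshold.

    sim maps frozenset({a, b}) to the similarity between varieties a and b.
    Clusters are merged while the best mean pairwise similarity between two
    clusters is at least `threshold`.
    """
    clusters = [[n] for n in names]

    def avg_sim(c1, c2):
        total = sum(sim[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, i, j = max(
            (avg_sim(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


# Invented similarities among four hypothetical varieties.
pairs = {("A", "B"): 0.80, ("C", "D"): 0.75, ("A", "C"): 0.50,
         ("A", "D"): 0.45, ("B", "C"): 0.55, ("B", "D"): 0.50}
sim = {frozenset(k): v for k, v in pairs.items()}
result = upgma_clusters(["A", "B", "C", "D"], sim, threshold=0.52)
print(sorted(sorted(c) for c in result))  # [['A', 'B'], ['C', 'D']]
```

In this toy example the two pairs join at 0.80 and 0.75, but the mean similarity between the resulting groups is 0.50, below the threshold, so two clusters remain.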

Divergent SRAP fragments among three clusters
To demonstrate the significant divergence of SRAP-related genomic regions among the three larger clusters (I, II and IV), 10 varieties from each were randomly selected to construct a genotype graph using 165 polymorphic fragments (Figure 3). Three sets of polymorphic markers (Figure 3) were selected to analyze the characteristics of the genomic regions that accumulated variation during breeding selection. Set A displays a similar genotype profile in clusters I and II that differs from cluster IV, while set B shows a similar genotype profile in clusters I and IV that differs from cluster II. Rich recombination among markers in set C was observed in the clusters. Markers from each cluster tend to be linked with others in the same set, except for cluster I in set A and cluster IV in set C. Primer sequences of those markers were aligned to a genome sequence of G. hirsutum TM-1 in the database (https://www.cottongen.org), and most of them were located near microsatellite sequences. These results indicate that microsatellite regions tend to accumulate most of the genetic diversity during long periods of selection with similar goals.
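One simple way to flag markers that differ between two clusters is to compare band frequencies. The sketch below is a hypothetical numerical counterpart to the visual inspection of the genotype graph, not the method used in this study; the function names, the 0.8 cutoff and the data are all invented.

```python
def band_frequency(profiles):
    """Fraction of varieties scored 1 at each marker (profiles: list of 0/1 lists)."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def divergent_markers(cluster_a, cluster_b, min_diff=0.8):
    """Indices of markers whose band frequency differs by at least min_diff."""
    freq_a, freq_b = band_frequency(cluster_a), band_frequency(cluster_b)
    return [i for i, (x, y) in enumerate(zip(freq_a, freq_b)) if abs(x - y) >= min_diff]


# Invented 0/1 scores: three varieties per cluster, three markers.
cluster_one = [[1, 1, 0], [1, 0, 0], [1, 1, 0]]
cluster_two = [[0, 1, 0], [0, 1, 1], [0, 0, 0]]
print(divergent_markers(cluster_one, cluster_two))  # [0]
```

Here only the first marker is fixed for the band in one cluster and absent in the other, so it is the only one reported as divergent.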

DISCUSSION
A number of research projects have characterized genome sequences of cotton species, for example, the genome of allotetraploid upland cotton "TM-1" (Zhang et al. 2015). Owing to genomic complexity and the limited germplasm available in modern breeding, detecting diversity for genetic improvement is still a challenging task for breeders (Borba et al. 2005, Guo et al. 2010, Kawuki et al. 2011). In our research, we propose SRAP technology combined with four types of fluorescent dye, evaluated on the ABI 3500xl, as a more efficient and economical way to develop polymorphic markers in upland cotton. One pair of SRAP primers amplifies and detects 1-8 polymorphic loci, i.e., with higher efficiency than SSR. Since SRAP primers from the upstream and downstream groups can be combined freely with each other for amplification, the number of primer pairs is much higher than for other types of primers. In addition, since the nucleotide sequences of gene coding regions are highly conserved among different plant species, the same set of primers can be used in different crops. The amplicons from primers labeled with four types of fluorescent dye can be mixed and detected at the same time using the ABI 3500xl, which can greatly improve efficiency and reduce costs. SRAP preferentially amplifies open reading frames, and the markers are distributed evenly in the genome. These characteristics can improve the usage of molecular markers to explore the genetic basis of agronomic traits for genetic improvement in crops.
Considering the low genetic diversity of upland cotton cultivars and germplasm, we propose two strategies of genetic improvement in China: developing new technologies for producing more polymorphic molecular markers to explore the genetic basis of key agronomic traits in the available germplasm; and dissecting and transferring important genes controlling valuable agronomic traits from wild cotton species into elite lines. Molecular markers such as SRAPs and others, e.g., SSR and SNP, need to be combined with other technologies to develop a high-throughput and low-cost marker detection method (Ahmad 2004). Decades of intensive artificial selection for similar goals in upland cotton in China have severely depleted the genetic diversity available for breeding. With regard to dissecting and transferring elite genes, more polymorphisms should be detected for gene cloning and for selection to break the linkage drag of harmful traits.
The genomes of both diploid and tetraploid Gossypium species have been sequenced, and the draft genome sequences of upland cotton and its two ancestors have been published. However, there are a series of gaps in those genomes (Yu et al. 2010, Wang et al. 2012b, Zhang et al. 2015). The SRAP fragments are evenly distributed in the whole genome, but those fragment sequences cannot be mapped onto the known physical map because of gaps in the genomes. Thus, SRAP product fragments could provide sequence information to help fill these gaps.

Figure 1. Polymorphism screening and cultivar genotyping. A. Polymorphism screening with DNA from eight upland cotton cultivars. Different primer pairs have distinct band profiles. Circles indicate the polymorphisms with higher amplification efficiency, selected to genotype the 128 varieties. B and C. Amplification profiles of dominant and co-dominant SRAP polymorphisms, respectively.

Figure 2. Genetic diversity in 128 upland cotton varieties based on SRAPs.

Figure 3. Genotype graph constructed from SRAPs of three clusters. Bars represent the varieties of each cluster. Red and blue background colors show different alleles at the same polymorphic site. Framed regions indicate significantly divergent main regions among clusters.

Table 1. Summary of information on the varieties used in this study

Table 2. Information on polymorphisms produced by different primer pairs