SSR markers to detect gene flow from upland to mocó cotton

Mocó cotton was the most important crop in Northeastern Brazil. Local varieties were developed before the governmental breeding programs began, and they should be identified and preserved as a genetic resource. The aim of this work was to select SSR markers that can be used to monitor gene flow from herbaceous (upland) cotton to mocó cotton plants, using non-transgenic upland cotton as the pollen source. Unique feral populations of mocó cotton were identified in the state of Rio Grande do Norte, and mother plants and their seeds were collected. Allele frequencies were studied among fifteen plants from these populations using 64 SSR primer pairs. Ten SSR markers with alleles exclusive to mocó or upland genotypes were selected to monitor gene flow by paternity analysis (BNL 3627, 2960, 2572, 3261, 3398, 3948, 3502 and 3646; CIR 094 and 097). None of the collected seeds carried upland cotton alleles. The absence of upland-exclusive alleles in the offspring showed that gene flow to mocó cotton was absent or lower than the analysis could detect. Among the mother plants, the number of homozygous individuals at four polymorphic loci was greater than expected, indicating that reproduction occurs preferentially by self-fertilization or by crossing among related individuals. Paternity analysis is an accurate measure of gene flow, and the methodology can be used to monitor these and other populations. The preservation of the plants is threatened mainly by drought.


Introduction
Cotton was the main agricultural activity in the Brazilian semi-arid region, in the Northeast of the country, during the twentieth century (IBGE, 1979). The cotton type cultivated there is known as mocó (Gossypium hirsutum r. marie galante Hutch), whose cultivation started around 1860 using seeds developed by farmers (MOREIRA et al., 1982).
The municipality of Acari, in the semi-arid region, is probably the center of origin of at least part of the local varieties of mocó cotton planted in the past in Northeastern Brazil. After the drought of 1877 to 1879, on a small property next to a pond at Acari, called Olho d'Água da Seriema, a farmer who had noticed the drought resistance of a cotton plant started to preserve it and collect its seeds, distributing them around the neighborhood (MOREIRA et al., 1989). Furthermore, the first report about mocó, written in 1860, described it growing like wild-type plants at Acari (MOREIRA et al., 1994).
Mocó cotton genotypes are adapted to cultivation in this semi-arid region. Cultivation under drought conditions is viable, since the plants remain vigorous and productive for up to eight years.
Local varieties are important genetic resources (CAMPBELL et al., 2003), and collections were intensified because on-farm maintenance has been declining rapidly. Various local varieties have been planted in Northeast Brazil, where they coexisted with varieties developed by the governmental breeding programs of the Superintendência do Desenvolvimento do Nordeste (Sudene), the Centro Nacional de Pesquisa da Algodão (CNPA, currently Embrapa Algodão) and the Universidade Federal do Ceará (MOREIRA et al., 1972).
Transgenic cotton was released in Brazil with the recognition that gene flow to the traditional mocó cotton varieties could occur (BARROSO et al., 2005), but current levels of gene flow between commercial and local varieties have not been determined. The main cotton type sympatric to mocó cotton is upland cotton, G. hirsutum r. latifolium. G. mustelinum is also sympatric, although its population in Rio Grande do Norte is very small (BARROSO et al., 2010). G. barbadense is frequently present in gardens as an ornamental or medicinal plant (ALMEIDA et al., 2009).
Given that gene flow can vary widely among populations, even within the same species (ELLSTRAND, 2003), monitoring gene flow is proposed as a tool to prevent undesirable environmental impacts. Gene flow from crops to local varieties may merely increase the variability of the target population slightly, or it may harm population conservation, especially if hybridization of genes or genomes increases or decreases adaptability or reproduction rates. Hybrids fitter than both parents may drive the progenitor to extinction by replacement (HEDGE et al., 2006), while if hybrids are less fertile than their parents, populations may shrink (HAYGOOD et al., 2003). Paternity analysis may provide accurate information about gene flow (NEFF et al., 2000). When transgenic plants are not available or not allowed, conventional non-transgenic cultivars of the same species as the potential transgenic cultivars may be a reliable model for measuring gene flow, since plant transformation per se would not be expected to change the amount of gene flow. SSR markers, also known as microsatellite markers, are widely used in paternity analysis and are well established for cotton genetic studies (OLIVEIRA et al., 2010). This report aimed to assess the existence of gene flow towards feral mocó cotton populations through paternity analysis of their offspring.
Plants were identified individually, and seeds and shoots from the same plant received the same designation. Shoots were collected and kept in humidified plastic bags during transportation. They were then treated with indole acetic acid (1 g L⁻¹) for ten minutes just before being planted in pots in greenhouses. Seeds were planted in autoclaved sand and, after germination, transplanted to pots.
The varieties that could have acted as a pollen source are upland cotton (G. hirsutum var. latifolium), which has been planted in the semi-arid region of Northeastern Brazil. Therefore, two varieties known to be planted in this region were used for comparison of allele size and frequency: CNPA 8H and Precoce 3. Both were developed by the Brazilian Agricultural Research Corporation (Embrapa). A third upland cotton, Guazuncho, developed in Argentina, was chosen as a control because the allele sizes of this cultivar are known, since it has been used for molecular mapping (NGUYEN, 2004).

Allele frequencies and marker selection
Allele frequencies in these populations were calculated from DNA extracted from leaves of 15 adult plants. The leaves, collected directly in the fields where the plants had been found, were placed in 50 mL tubes containing TE (Tris EDTA pH 8.0; 0.5 M) and kept in boxes on ice for transportation to the laboratory, where the tubes containing the leaves were frozen at -20 ºC. For DNA extraction, leaf discs (50 to 100 mg from leaves of each plant) were placed in microtubes, frozen in liquid nitrogen and ground. Extraction buffer (600 µL) containing 2% CTAB, 1.4 M NaCl, 0.2 M EDTA, 0.1 M Tris-HCl pH 8.0, 2% PVP and 0.2% 2-mercaptoethanol was added. The suspension was incubated at 65 ºC for 30 minutes; then 600 µL of chloroform:isoamyl alcohol 24:1 (CIA) was added, and the tubes were mixed by inversion and centrifuged at 12,000 g for 10 minutes. The DNA was recovered from the supernatant and transferred to a new tube. After a second CIA extraction, the DNA was precipitated with isopropanol, washed sequentially with 70% and 100% ethanol, and suspended in 100 µL of Tris EDTA buffer. The concentration was estimated on agarose gels and adjusted to 10 ng µL⁻¹.
The PCR reactions used to obtain Simple Sequence Repeat (SSR) markers contained 20 ng of genomic DNA, 0.2 µM of each primer, 4 mM MgCl2, PCR reaction buffer (10 mM Tris-HCl pH 8.3 and 50 mM KCl), 0.2 mM dNTP and 1 U of Taq DNA polymerase. Amplification used a touchdown program: an initial denaturation step at 94 ºC for 12 minutes, followed by 10 cycles of denaturation at 94 ºC for 15 s, annealing for 30 s at a temperature that decreased by 1 ºC each cycle from 65 ºC to 55 ºC, and extension at 72 ºC for 1 minute. These were followed by 35 cycles of 94 ºC for 15 s, 55 ºC for 30 s and 72 ºC for 1 minute, and a final elongation step at 72 ºC for 5 minutes.
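The touchdown program can be written out as an explicit list of steps. The following is a minimal sketch, assuming that the first touchdown cycle anneals at 65 ºC and that the temperature drops by 1 ºC per cycle, so the tenth cycle anneals at 56 ºC before the 35 cycles at 55 ºC. The function name and the tuple layout are illustrative and are not part of the original protocol.

```python
def touchdown_program(start_anneal=65, final_anneal=55,
                      touchdown_cycles=10, plateau_cycles=35):
    """Return the PCR program as a list of (step, temperature_C, seconds)."""
    program = [("initial denaturation", 94, 12 * 60)]
    # Touchdown phase: annealing temperature falls 1 C per cycle
    # (assumed to start at start_anneal in the first cycle).
    for cycle in range(touchdown_cycles):
        program.append(("denaturation", 94, 15))
        program.append(("annealing", start_anneal - cycle, 30))
        program.append(("extension", 72, 60))
    # Plateau phase at the final annealing temperature.
    for _ in range(plateau_cycles):
        program.append(("denaturation", 94, 15))
        program.append(("annealing", final_anneal, 30))
        program.append(("extension", 72, 60))
    program.append(("final extension", 72, 5 * 60))
    return program


if __name__ == "__main__":
    steps = touchdown_program()
    annealing = [temp for name, temp, _ in steps if name == "annealing"]
    print(annealing[:10])  # annealing temperatures of the touchdown cycles
    print(len(annealing))  # total number of amplification cycles
```

Listing the cycles this way makes the ambiguity in "from 65 ºC to 55 ºC" explicit: with ten touchdown cycles, 55 ºC is first reached in the plateau phase.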
Reaction products were mixed with a loading buffer containing 95% formamide, 0.5 M EDTA pH 8.0, and bromophenol blue and xylene cyanol at 1 mg mL⁻¹, and heated to 95 ºC for 5 minutes before being loaded onto 6% acrylamide gels under denaturing conditions. The gels were run at 80 watts for 2 hours and 30 minutes and then stained with silver nitrate (CRESTE et al., 2001). The band patterns were used to measure allele frequencies and the number of exclusive alleles in herbaceous or mocó cotton genotypes. The expected (He) and observed (Ho) heterozygosities were calculated using the software packages GDA (Genetic Data Analysis) version 1.1 and Popgen 1.31 (EXCOFFIER; HECKEL, 2006).
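For readers without access to these packages, the quantities involved can be illustrated with a short calculation. This is a minimal sketch, assuming each locus is scored as diploid and using the uncorrected expected heterozygosity He = 1 - Σp², with the fixation index taken as F = 1 - Ho/He. GDA and Popgen may apply small-sample corrections, so their values can differ slightly. The function name and the genotype data are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Ho, He, F) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per plant.
    He is the uncorrected expected heterozygosity, 1 - sum(p_i ** 2).
    """
    n_plants = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    n_alleles = 2 * n_plants
    ho = sum(1 for a, b in genotypes if a != b) / n_plants
    he = 1.0 - sum((count / n_alleles) ** 2 for count in allele_counts.values())
    # The fixation index is undefined for a monomorphic locus (He = 0).
    f = None if he == 0 else 1.0 - ho / he
    return ho, he, f


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) for 15 plants at one locus.
    plants = [(180, 180)] * 7 + [(190, 190)] * 6 + [(180, 190)] * 2
    ho, he, f = heterozygosity(plants)
    print(f"Ho={ho:.3f} He={he:.3f} F={f:.3f}")
```

A positive F, as in this made-up example, corresponds to fewer heterozygotes than expected under random mating.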

Gene flow
Gene flow was estimated by paternity analysis, comparing the DNA of each mother plant with the DNA of seedlings germinated from seeds collected from that same plant. DNA was extracted from 12 families, each consisting of the mother plant and 15 offspring plants.
Seeds received the same designation as the mother plant when they were collected. They were germinated in the laboratory and transplanted to the field. DNA was extracted as described for the mother plants.
Only a selected set of primers was used for this comparison. The selected primer pairs were those that revealed alleles present exclusively in mocó cotton plants and absent from the upland control plants of the varieties CNPA 8H, Precoce 3 and Guazuncho. The PCR amplification procedure and gel analysis were the same as described for primer pair selection. The analyzed loci are presented in Table 2.
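The logic of the mother-offspring comparison can be sketched in a few lines. This is an illustration only, assuming that the bands scored at each locus are stored as sets of allele sizes; the function names, the data layout and the allele sizes are hypothetical and do not come from the study.

```python
def classify_offspring(mother, offspring, upland_alleles):
    """Classify one seedling at one locus.

    mother, offspring: sets of allele sizes (bp) scored on the gel.
    upland_alleles: set of allele sizes exclusive to the upland controls.
    """
    non_maternal = offspring - mother
    if not non_maternal:
        return "maternal alleles only"
    if non_maternal & upland_alleles:
        return "upland pollen suspected"
    return "non-maternal allele of unknown origin"


def count_gene_flow(families, upland_alleles):
    """Count seedlings carrying an upland-exclusive allele at any locus.

    families: {mother_id: {"mother": {locus: set}, "offspring": [{locus: set}]}}
    upland_alleles: {locus: set}
    Returns (flagged_seedlings, total_seedlings).
    """
    flagged = total = 0
    for family in families.values():
        for seedling in family["offspring"]:
            total += 1
            if any(
                classify_offspring(family["mother"][locus], alleles,
                                   upland_alleles[locus])
                == "upland pollen suspected"
                for locus, alleles in seedling.items()
            ):
                flagged += 1
    return flagged, total


if __name__ == "__main__":
    # Hypothetical data: one family, two seedlings, two loci.
    upland = {"locus_1": {200}, "locus_2": {150}}
    families = {
        "M1": {
            "mother": {"locus_1": {210}, "locus_2": {160}},
            "offspring": [
                {"locus_1": {210}, "locus_2": {160}},
                {"locus_1": {200, 210}, "locus_2": {160}},
            ],
        }
    }
    print(count_gene_flow(families, upland))
```

The third category keeps non-maternal alleles that do not match the upland controls separate, since they could come from other mocó plants rather than from upland pollen.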

Results and discussion
Plants remaining from abandoned mocó cotton fields were located in the municipalities of Acari and Santana do Seridó, state of Rio Grande do Norte. These two fields had not been replanted for at least twenty years. Old and young cotton plants could be seen, and native vegetation grew among them.
The age of the plants was estimated at approximately 20 to 50 years, based on morphological characteristics and on interviews with nearby inhabitants who could remember approximate planting dates. Young plants were rarely observed. Visits to this region, made approximately every five years over the last twenty years, indicate that the number of plants is diminishing.
The analysis of allele presence revealed that, among the sixty-four primer pairs, forty-four (68.75%) showed the same alleles in all genotypes. Twenty loci were polymorphic among the 15 mother plants or the three upland cotton varieties used as controls.
The loci CIR 251, CIR 170, CIR 043 and BNL 1434 were polymorphic within the mocó population (Table 3), with a mean of 3.4 alleles per locus. The polymorphic mocó alleles, their frequencies, the expected and observed heterozygosities and the intrapopulation fixation indexes are presented in Table 3. An excess of homozygotes was observed at most of these loci. The polymorphic loci CIR 364, CIR 121, CIR 249, BNL 3902, BNL 3816, CIR 251, CIR 170, CIR 049, CIR 043 and BNL 1434 had at least one allele in common between mocó and herbaceous cotton and were therefore not selected for the paternity analysis. Private alleles of the mocó population were identified for ten primer pairs, which allowed mocó and upland cotton genotypes to be distinguished. These alleles are presented in Table 4. Eight of these primer pairs (BNL 3627, 2960, 2572, 3261, 3398, 3948, 3502 and 3646) amplified only one locus each. Two primer pairs, CIR 094 and CIR 097, amplified two loci each, both with alleles that distinguish between the two cotton types.
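The selection rule described here, keeping only loci with no allele shared between the two cotton types, amounts to a set-intersection test. A minimal sketch follows, with hypothetical locus names and allele sizes; it is not the procedure actually used to read the gels.

```python
def select_diagnostic_loci(moco_alleles, upland_alleles):
    """Return loci whose mocó and upland allele sets do not overlap.

    moco_alleles, upland_alleles: {locus: set of allele sizes in bp}.
    Loci not scored in the upland controls are skipped.
    """
    selected = []
    for locus in sorted(moco_alleles):
        upland = upland_alleles.get(locus, set())
        if upland and not (moco_alleles[locus] & upland):
            selected.append(locus)
    return selected


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) per locus.
    moco = {"locus_A": {210}, "locus_B": {150, 160}, "locus_C": {300}}
    upland = {"locus_A": {200}, "locus_B": {160}, "locus_C": {300}}
    print(select_diagnostic_loci(moco, upland))
```

In this made-up example only locus_A is kept: locus_B is polymorphic in mocó but shares an allele with upland cotton, and locus_C is identical in both types.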
The seeds collected from the two feral populations had low germination rates. Plants with fewer than 15 germinated seeds were excluded from this study, which left 12 mother plants with at least 15 offspring each. Of these, 11 were from Acari (61, 45, 46, 27, 28, 30, 32, 48, 49, 50 and 59) and only one from Santana do Seridó (36).
The twelve loci amplified by the ten selected primer pairs were used to compare mother plants with their offspring. In all twelve families and at all twelve loci, the fifteen offspring plants showed only the alleles of their mother plant. No heterozygous plants were found. These results suggest the absence of gene flow from upland to mocó cotton at these loci, in this generation and in these populations.
Native vegetation commonly regrows in old fields in this region, and the regenerated areas, called capoeiras, are sometimes used as pasture for goats or cattle. No goats were found in these two fields, probably because of the steep slope at the collection sites, so the plants were relatively well preserved. Another cotton species, G. mustelinum, which is endemic to Northeast Brazil, is also frequently grazed by cattle and grows much better when protected from them (BARROSO et al., 2010).
Cotton has a mixed mating system. The low number of heterozygotes and the positive inbreeding coefficients suggest that reproduction occurs preferentially through selfing or crossing between related individuals. Cross-pollination may have been restricted by the distance between plants or by the absence of efficient pollinators. The same has been found in other mocó populations in Northeast Brazil (MENEZES et al., 2010).
Gene flow by pollen can occur through insect pollination. The most frequent floral visitors of cotton belong to the genus Apis, which accounts for about 50% of cotton floral visits in Brazil (SILVEIRA, 2003). Honey bees can transport pollen over distances of up to 10 km (BARROSO; FREIRE, 2003).
Even though upland and mocó cotton are sympatric, so gene flow is possible, upland cotton alleles were not found among the mocó cotton plants, suggesting that these populations are effectively reproductively isolated. Paternity analysis can detect gene flow accurately (ELLSTRAND, 2003), and the data presented lead to the conclusion that no pollen-mediated gene flow could be detected in the studied generation.
If any undetected gene flow occurs, it is low, and it is therefore not expected to be sufficient to harm the genetic identity of mocó cotton.
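How low an undetected rate could be can be bounded with a simple binomial argument. The sketch below assumes that the 180 seedlings (12 families of 15) behave as independent samples and that any hybrid seedling would show an upland allele at the selected loci. Neither assumption was tested in this study, and seedlings from the same mother are probably not fully independent, so the figure is only indicative. The function name is hypothetical.

```python
def max_undetected_rate(n_offspring, confidence=0.95):
    """Largest per-seed hybridization rate compatible with zero hybrids.

    Solves (1 - m) ** n_offspring = 1 - confidence for m.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_offspring)


if __name__ == "__main__":
    n_seedlings = 12 * 15
    rate = max_undetected_rate(n_seedlings)
    print(f"Upper bound on gene flow rate: {rate:.4f}")
```

Under these assumptions, finding no hybrids among 180 seedlings is compatible with a true hybridization rate of up to roughly 1.6% to 1.7% per seed at the 95% confidence level.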
The twelve identified loci can be used for paternity inference, differentiating between mocó and upland cotton. Because they distinguish the two cotton types, they can be used to monitor gene flow into these populations, given the interest in preserving them in situ as a genetic resource for cotton. The other four primer pairs, shown to be polymorphic among mocó cotton plants, can be used in further diversity studies. Other loci shown to be polymorphic between upland and mocó cotton could also be evaluated (ALVES et al., 2009).
Conventional non-transgenic plants were used to estimate gene flow. If transgenic cotton plants were growing at these sites, gene flow from them would be expected to be equivalent to that measured here, since plants carrying the transgenes used so far in cotton, which confer insect resistance or herbicide tolerance, bear pollen grains with the same characteristics as those of conventional genotypes, and these transgenes do not interfere with pollen transport. Estimating potential transgene flow with conventional genotypes as a model can be useful when transgenic plants are not available, or in ex-ante biosafety experiments.
The genetic preservation of traditional mocó cultivars could be harmed not only by transgenes but also by gene flow from conventional non-transgenic cotton cultivars. Genes acquired in this way could confer favorable or unfavorable traits, either of which could compromise the maintenance of the populations' genetic identity. In this sense, paternity analysis is a broader tool than transgene tracking for monitoring gene flow into populations.
In situ maintenance of these genotypes can be achieved, for example through exclusion zones for transgenic cotton planting (BARROSO et al., 2005). Moreover, collections have been made for ex situ preservation. These plants survive and reproduce under harsh climate conditions and, since they belong to the same species as upland cotton, they may constitute a relevant genetic resource.

Conclusion
Paternity analysis, an efficient methodology for monitoring gene flow, was applied to two populations of mocó cotton that historical evidence indicates have unique genetic traits. It showed that gene flow from upland cotton has been absent or low, and therefore poses little or no danger to the preservation of these populations. In situ observations showed that drought damages the populations.

Table 1 - Municipalities and locations where shoot buds were collected. * Plants used in paternity analysis

Table 2 - Microsatellite loci used to analyse 15 mocó plants from Rio Grande do Norte feral populations

Table 3 - Polymorphic alleles, frequency, expected (He) and observed (Ho) heterozygosity and intrapopulation fixation indexes

Table 4 - Private alleles and sizes (bp), according to cotton type

to the relatively high number of plants and age. Although impediments to regeneration could be demonstrated, young plants were observed, showing that the population was able to propagate itself; hence it can be considered feral. Nevertheless, it is known that regeneration is rare and that the populations are declining, as noticed on periodic visits to these places.