Genetic diversity and structure of native maize races from Northwestern Mexico

The objective of this work was to evaluate the genetic diversity of nine maize races (Zea mays ssp. mays) from Northwestern Mexico and one population of teosinte of the Balsas race (Zea mays ssp. parviglumis). A total of 649 alleles were identified with 31 microsatellite loci, an average of 20.9 alleles per locus; 84.3% of the loci were polymorphic, and expected heterozygosity was 0.49. Graphic representation of principal coordinate analysis (PCoA) showed broad variation and population distribution. The highest probabilistic value obtained with the ∆K criterion confirmed the existence of five population groups clustered by the Bayesian model. This grouping coincided with the population distribution observed in the PCoA graph. The maize races examined retain broad genetic diversity among and within the evaluated populations.


Introduction
Mexico is a territory in which a broad diversity of maize (Zea mays ssp. mays L.) still exists, thus forming an enormous mosaic of wild relatives, native landraces and, after the Green Revolution, improved varieties (Kato Yamakake et al., 2009). Maize was domesticated in Southwestern Mexico in the Balsas river basin more than 9,000 years ago. The domesticated form derived from annual teosinte (Zea mays ssp. parviglumis Iltis & Doebley) (Van Heerwaarden et al., 2011). At least 59 maize races have been described in Mexico and related by origin and distribution to specific geographic regions (Vielle-Calzada & Padilla, 2009).
In Mexico, maize is grown on 7.6 million hectares, which is equivalent to more than a third of the cultivated area of the country (SIAP, 2016). Of this area, it is estimated that 76.5% is planted to native landraces (Herrera et al., 2002). Distribution of most of the races has remained stable over the last 60 years; however, a sharp reduction in the populations of several races has been reported, including some races from the Northwest, such as Chapalote, Dulcillo del Noroeste, Jala and Onaveño, which have been classified under the status of high risk of extinction (Perales & Golicher, 2014). Conservation of genetic diversity by traditional farmers is of great relevance since the processes of evolution and adaptation go along with changes in their surroundings. Moreover, this diversity is an important source for assuring food security for many farmers. It is also a source of alleles that might be useful in plant breeding programs (Salhuana & Pollak, 2006).
Pesq. agropec. bras., Brasília, v.52, n.11, p.1023-1032, nov. 2017. DOI: 10.1590/S0100-204X2017001100008
The maize races from Northwestern Mexico have been described by their morphological traits (Wellhausen et al., 1952; Sanchez G. & Goodman, 1992), biochemical markers (Doebley et al., 1985; Sanchez G. et al., 2000) and molecular markers (Reif et al., 2006; Pineda-Hidalgo et al., 2013); however, these studies used small samples (in terms of the number of individuals per population and populations per race), fewer markers and a small geographic coverage, which are significant aspects that affect the reliability of the estimated population diversity parameters (Bashalkhanov et al., 2009; Van Inghelandt et al., 2010). Thus, the evaluation of a larger number of populations per race and the collection of populations within a broader geographic coverage are necessary.
Microsatellites have been efficient tools in determining the genetic diversity and structure of populations of native maize landraces and improved lines (Vigouroux et al., 2008; Warburton et al., 2008); when similar numbers of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers are used, microsatellites are 7 to 11 times more precise than SNPs (Van Inghelandt et al., 2010). They are also used in breeding to identify desirable alleles, to select germplasm sources and to identify heterotic groups (Lago et al., 2014; Surender et al., 2014). For the Jala maize landrace, these markers have been used to estimate the effects of in situ and ex situ conservation (Rice, 2004).
The objectives of this work were to estimate the genetic diversity and structure of 107 populations belonging to nine native maize races, as well as one population of teosinte, to determine the distribution of diversity within and among populations, and to deduce relationships of similarity among the landraces and populations analyzed.

Materials and Methods
A total of 107 native maize populations, represented by 25 plants each, were analyzed. These populations belonged to nine races from Northwestern Mexico (Figure 1). Maize seeds were germinated on wet paper towels in a growth chamber and grown at 25±2°C with 80% relative humidity for 7 days. Genomic DNA was extracted from 100 mg of mesocotyl, coleoptile and leaf tissue from 25 5-day-old seedlings per population using a commercial kit (ChargeSwitch gDNA Plant, Invitrogen, Carlsbad, California, USA), following the procedure indicated by the manufacturer. Extracts were purified with a KingFisher Flex robot (Thermo Fisher Scientific, Waltham, Massachusetts, USA). DNA quality was measured with a NanoDrop 2000 ultra-low volume spectrophotometer (Thermo Scientific, Waltham, Massachusetts, USA), with absorbance readings at 260 and 280 nm. Polymerase chain reaction (PCR) was performed individually for each plant.
Allele frequencies were obtained for each population, and the following diversity parameters were determined: number of alleles per locus, proportion of polymorphic loci and expected heterozygosity. The genetic structure of the populations was estimated with Wright's (1965) F-statistics (FIS, FST, and FIT) using Popgene v.1.32 software (Yeh et al., 1999).
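As an illustration of the three diversity parameters named above, the following minimal sketch computes them from diploid genotypes. It is not the Popgene implementation: the function names, the toy genotypes, and the 0.95 criterion for calling a locus polymorphic are assumptions made for the example (the text does not state which criterion was applied), and expected heterozygosity is computed without a small-sample correction.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Allele frequencies at one locus from a list of diploid (a, b) genotypes."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def expected_heterozygosity(freqs):
    """Uncorrected expected heterozygosity: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())

def diversity_summary(loci, max_common_freq=0.95):
    """Mean alleles per locus, proportion of polymorphic loci, mean He.

    A locus is counted as polymorphic when its most common allele has a
    frequency of at most max_common_freq (assumed criterion).
    """
    per_locus = [allele_frequencies(g) for g in loci.values()]
    n_loci = len(per_locus)
    mean_alleles = sum(len(f) for f in per_locus) / n_loci
    polymorphic = sum(1 for f in per_locus
                      if max(f.values()) <= max_common_freq) / n_loci
    mean_he = sum(expected_heterozygosity(f) for f in per_locus) / n_loci
    return mean_alleles, polymorphic, mean_he

# Hypothetical genotypes (allele sizes in bp) for four plants at two loci.
toy = {
    "locus_A": [(100, 100), (100, 104), (104, 108), (100, 108)],
    "locus_B": [(200, 200), (200, 200), (200, 200), (200, 200)],
}
mean_alleles, polymorphic, mean_he = diversity_summary(toy)
print(mean_alleles, polymorphic, mean_he)
```

In the toy data, locus_A carries three alleles and locus_B is fixed for one, so only half of the loci count as polymorphic.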
Population structure was inferred based on a Bayesian clustering analysis with Structure 2.3.4 software (Pritchard et al., 2000); all individuals of the ten races (including teosinte) were analyzed using an admixture model with correlated allele frequencies. In this model, estimation of the λ parameter was carried out assuming a single group (K=1), and each K was run five times with a fixed duration of re-samplings and 100,000 Markov chain Monte Carlo (MCMC) repetitions. The number of likely groupings (K) within the population studied (2,700 plants) was determined with the methods proposed by Evanno et al. (2005) and Rosenberg et al. (2001) and applied using Structure Harvester (Earl & vonHoldt, 2012). For each of the five identified clusters, a membership probability threshold of ≥ 0.9 was set to assign plants to the group previously identified by the program; individual plants with a value < 0.9 were assigned to a mixed group (Vigouroux et al., 2008). Finally, the clusters were visualized and edited with the Distruct software (Rosenberg, 2004).
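The assignment rule described above (a plant joins a cluster only if its membership coefficient is ≥ 0.9, otherwise it goes to a mixed group) can be sketched as follows. The function name and the membership values are hypothetical; in practice the coefficients would be read from the Structure output for K=5.

```python
def assign_cluster(q_row, threshold=0.9):
    """Return the 1-based cluster index whose coefficient meets the threshold,
    or "mixed" when no cluster reaches it."""
    best = max(range(len(q_row)), key=lambda k: q_row[k])
    return best + 1 if q_row[best] >= threshold else "mixed"

# Hypothetical membership coefficients for three plants across K=5 clusters.
q_matrix = [
    [0.95, 0.02, 0.01, 0.01, 0.01],
    [0.40, 0.35, 0.10, 0.10, 0.05],
    [0.03, 0.02, 0.90, 0.03, 0.02],
]
assignments = [assign_cluster(row) for row in q_matrix]
print(assignments)
```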

Results and Discussion
The analyzed populations contained 649 SSR alleles, resulting in an average of 20.9 alleles per locus (Table 1), ranging from 7 (in the phi115 marker) to 39 alleles (in the phi064 and phi402893 markers). Rocandio-Rodríguez et al. (2014) estimated the genetic diversity of seven corn races represented in 107 populations in the high valleys of Mexico. They used the same markers used in the present work and obtained similar results: a total of 636 alleles, with an average of 20.5 alleles per locus. In contrast, Reif et al. (2006) detected 196 SSR alleles and an average of 7.8 alleles per locus using 25 markers in the 24 native landraces of Mexico described by Wellhausen et al. (1952).
Data were compared with those of studies with a similar number of plants and markers, and, even so, there are notable differences. Among the most probable factors that contributed to detecting greater polymorphism in this study are the specific group of markers used, the rich evolutionary history of Mexican maize, and the automated method of allele detection, which is capable of detecting differences in alleles at the level of individual nucleotides (Reif et al., 2006). The mean proportion of polymorphic loci in the ten analyzed races was 84.3%. The highest value was found in the Vandeño race (97.2%), and the lowest ones in the Jala and Dulcillo del Noroeste races (Table 1).
The percentage of polymorphic loci in the teosinte race was 100%, thus demonstrating the existence of bottlenecks during the process of maize domestication that reduced genetic diversity to 19.1% on average. Expected heterozygosity for the different races showed broad genetic diversity, with an average of 0.49. This differs slightly from findings in other studies, such as that of González Castro et al. (2013), who reported a value of 0.57 in 20 tropical native races of Mexico, or that of Lia et al. (2009), who also found 0.57 in six native races of Argentina. Calculated parameter values based on the ten races of this study showed considerable genetic diversity and justify establishing programs for conservation of these races.
Population differentiation was estimated with the FST statistic, with an average value of 0.328 (Table 2) for the ten races, thus evidencing that there is high differentiation among the populations due to geographic distance and divergence time. In contrast, Pressoir & Berthaud (2004) reported an FST of just 0.011 for 31 corn populations from six locations of the Central Valleys of Oaxaca, Mexico. The lowest value in our study was for the Vandeño race, while the Dulcillo del Noroeste race showed the greatest differentiation among populations. The Jala race had the second lowest differentiation value among populations, which coincides with its highly specific adaptation, limited to the Jala Valley, state of Nayarit, where it is cultivated in an area of not more than 30 ha (Montes-Hernández et al., 2014). This race had the highest individual inbreeding index within the populations; this is probably the reason why the long ears that were characteristic of the race no longer frequently appear (Aguilar-Castillo et al., 2006).
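For readers unfamiliar with Wright's F-statistics, the sketch below shows the standard heterozygosity-based definitions for a single locus: FIS = (HS - HI)/HS, FST = (HT - HS)/HT and FIT = (HT - HI)/HT. It is only an illustration under stated assumptions (equal population sizes, no sample-size correction, made-up frequencies); the estimator implemented in Popgene may differ in detail.

```python
def f_statistics(pop_freqs, pop_h_obs):
    """Wright's F-statistics for one locus.

    pop_freqs: one {allele: frequency} dict per population.
    pop_h_obs: observed heterozygosity in each population.
    Populations are weighted equally (assumption).
    """
    n = len(pop_freqs)
    h_i = sum(pop_h_obs) / n  # mean observed heterozygosity
    h_s = sum(1.0 - sum(p * p for p in f.values()) for f in pop_freqs) / n
    alleles = set().union(*pop_freqs)
    mean_freq = {a: sum(f.get(a, 0.0) for f in pop_freqs) / n for a in alleles}
    h_t = 1.0 - sum(p * p for p in mean_freq.values())
    f_is = (h_s - h_i) / h_s
    f_st = (h_t - h_s) / h_t
    f_it = (h_t - h_i) / h_t
    return f_is, f_st, f_it

# Hypothetical example: two strongly differentiated populations.
pops = [{"a": 0.9, "b": 0.1}, {"a": 0.1, "b": 0.9}]
f_is, f_st, f_it = f_statistics(pops, pop_h_obs=[0.15, 0.15])
print(round(f_is, 4), round(f_st, 4), round(f_it, 4))
```

The three values are linked by the identity (1 - FIT) = (1 - FIS)(1 - FST), which is a convenient consistency check on any reported set of F-statistics.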
Graphic representation of PCoA showed broad variation among populations distributed in the four quadrants (Figure 2). The first axis explained 21.5% of the variation and separated three groups: the Jala race; the Vandeño, Reventador, Chapalote and Tabloncillo races, which formed a cluster in the central part of the diagram; and some populations of the Dulcillo del Noroeste race in the positive end of the axis. The second axis explained 9.2% of the variation, but there was no clear separation of races. There was a group of the Vandeño, Reventador, Chapalote and Tabloncillo races, which indicated genetic similarity among them, coinciding with their similar geographic distribution. Together, the two axes explained only 30.7% of the variation. This might partly result from incomplete linkage between major genes and the molecular markers used in the present study.
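The percentage of variation attributed to each PCoA axis comes from the eigenvalues of the double-centered squared-distance matrix. The sketch below illustrates this with the standard library only, using power iteration with deflation; it is not the software used in the study, the function names are hypothetical, and it assumes the distances can be embedded in Euclidean space (no negative eigenvalues), so that the trace can serve as the total variation.

```python
import math

def double_center(dist):
    """Gower double-centering of the squared distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
             for j in range(n)] for i in range(n)]

def pcoa(dist, n_axes=2, iters=200):
    """Return (axes, explained): coordinates per axis and % variation per axis."""
    b = double_center(dist)
    n = len(b)
    total = sum(b[i][i] for i in range(n))  # trace = sum of eigenvalues
    axes, explained = [], []
    for _ in range(n_axes):
        v = [1.0 / (i + 1) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        if lam <= 1e-12:
            break  # no further positive eigenvalue
        axes.append([math.sqrt(lam) * x for x in v])
        explained.append(100.0 * lam / total)
        # Deflate so the next pass finds the next eigenvector.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return axes, explained

# Hypothetical distances: four points at the corners of a 3 x 1 rectangle.
r = math.sqrt(10.0)
toy_dist = [
    [0.0, 3.0, r, 1.0],
    [3.0, 0.0, 1.0, r],
    [r, 1.0, 0.0, 3.0],
    [1.0, r, 3.0, 0.0],
]
axes, explained = pcoa(toy_dist)
print([round(x, 1) for x in explained])
```

When the first two axes account for a modest share of the total, as with the 21.5% and 9.2% reported here, much of the structure lies in higher dimensions that a two-dimensional plot cannot show.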
Using the likelihood method, the highest likelihood was obtained at K=5, and thereafter likelihood values remained constant. Moreover, applying the Evanno et al. (2005) criterion, the highest value of ΔK was obtained at K=2, while the second highest peak was observed at K=5. However, the highest value of ΔK (at K=2) may be considered spurious because of the high number of loci evaluated in the population, which caused the null hypothesis to be refuted (Vigouroux et al., 2008). Therefore, the value at K=5 was considered the one that best fitted the population structure, which is similar to that shown by PCoA.
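The ΔK statistic of Evanno et al. (2005) divides the absolute second-order change in mean Ln P(D) by the standard deviation of Ln P(D) across replicate runs at that K. The sketch below follows that definition using run means and the sample standard deviation, which is an assumption about the exact convention; the likelihood values are invented for illustration and are not the values obtained in this study.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Evanno-style Delta K for every K that has both neighbours.

    lnp_by_k: {K: [Ln P(D) of each replicate run]}.
    """
    means = {k: mean(runs) for k, runs in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_order = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_order / stdev(lnp_by_k[k])
    return result

# Hypothetical Ln P(D) values from two replicate runs per K.
toy_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -892.0],
    4: [-888.0, -890.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(best_k, {k: round(v, 2) for k, v in dk.items()})
```

Because ΔK cannot be computed for K=1 and rewards the first large jump in likelihood, a peak at K=2 is common, which is why secondary peaks are also worth inspecting.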
Based on the ≥ 0.9 threshold to assign each plant to a group and taking into account the highest population values of each race (Q), the arrangement into five groups (Figure 3) was defined as follows: Group 1 was represented mostly by Chapalote (Q = 0.63) and Elotero de Sinaloa (Q = 0.71) and comprised 316 plants; Group 2 included Balsas and Onaveño (Q = 0.98 and 0.55, respectively) and comprised 235 plants; Group 3 was represented by Vandeño (Q = 0.76) and comprised 214 plants; Group 4, the largest group, included Blandito (Q = 0.71) and Reventador (Q = 0.75) and comprised 399 plants; and Group 5 [...].
Wellhausen et al. (1952) suggested that some of the landraces had emerged from hybridization of other previously existing races. They proposed Chapalote as one of the progenitors of Reventador, and the latter as a progenitor of Tabloncillo. The results obtained indicated that the populations of Reventador share a low to medium percentage of kinship with populations of the Chapalote race, as well as those of Tabloncillo with populations of Reventador, thus confirming that, in these races, there had been a genetic drift. Wellhausen et al. (1952) also proposed that the Jala race was a hybrid of the cross Tabloncillo × Comiteco. In the present study, all Jala populations and more than 50% of Tabloncillo ones exhibited a high degree of genetic kinship, thus suggesting that the Jala race possesses certain fragments of the Tabloncillo genome.
The clustering with the neighbor-joining method was generated from a matrix of genetic distances based on 649 SSR alleles. In the phylogram, the origin is monophyletic, with the population of teosinte as the common ancestor (Figure 4). In the diagram, populations of the same race did not cluster consistently, thus denoting that they are undergoing divergence because of their isolation in different locations. Furthermore, gene flow among locations could be happening and leading to this inconsistency. Four main groups tended to exhibit certain association in terms of their evolutionary history and the place where they had been collected (Wellhausen et al., 1952; Sanchez G. & Goodman, 1992).
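The Figure 4 caption names Rogers' modified distance as the metric behind this matrix. A minimal sketch of that distance between two populations is given below, using the usual definition: the square root of the summed squared allele-frequency differences divided by twice the number of loci. The function name and the frequencies are hypothetical, and the neighbor-joining step itself is not shown.

```python
import math

def modified_rogers(pop_a, pop_b):
    """Modified Rogers' distance between two populations.

    pop_a, pop_b: lists with one {allele: frequency} dict per locus,
    in the same locus order.
    """
    n_loci = len(pop_a)
    total = 0.0
    for freqs_a, freqs_b in zip(pop_a, pop_b):
        for allele in set(freqs_a) | set(freqs_b):
            diff = freqs_a.get(allele, 0.0) - freqs_b.get(allele, 0.0)
            total += diff * diff
    return math.sqrt(total / (2 * n_loci))

# Hypothetical allele frequencies at two loci for three populations.
pop_x = [{"a1": 0.5, "a2": 0.5}, {"b1": 1.0}]
pop_y = [{"a1": 1.0}, {"b1": 1.0}]
pop_z = [{"a3": 1.0}, {"b2": 1.0}]

d_xy = modified_rogers(pop_x, pop_y)
d_yz = modified_rogers(pop_y, pop_z)
print(round(d_xy, 4), round(d_yz, 4))
```

The distance is 0 for identical frequency profiles and 1 when two populations share no alleles at any locus, as with pop_y and pop_z above.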
Group 1 comprised 48 populations of the Blandito, Tabloncillo and Reventador races mainly from Sonora and Sinaloa. Blandito and Tabloncillo have Reventador as a common ancestor combined with Chapalote and Harinoso de Ocho. This grouping was also reported by Sanchez G. & Goodman (1992), as well as a clade formed by 13 populations, among which exists a cluster of 9 populations of Vandeño, another with 2 populations of Tabloncillo, and dispersed populations of Blandito, Reventador, Chapalote, and Onaveño. An outstanding feature of this subgroup is that it has, on average, the highest percentage of polymorphic loci of the studied landraces. Another small subgroup was a set of units from a single node and it was formed by 2 populations related to Dulcillo del Noroeste, a cluster of 7 populations of Tabloncillo and 1 population of Chapalote. These populations are predominantly distributed in low altitude regions (Wellhausen et al., 1952).
Group 2 was formed by 32 populations of seven different landraces, which had been reported by other authors in different clades, with some tendencies to grouping by origin, since 12 are from the state of Nayarit, among which there were six populations of the Jala race. The group included sets of 2 to 3 populations of the same race, related by internal nodes to Blandito, Jala, Tabloncillo, and Reventador, as well as populations in unrelated nodes of the same races, including some of Vandeño, Elotero de Sinaloa, and Dulcillo del Noroeste.
Groups 3 and 4 were the closest ones to teosinte; that is, they were more genetically similar to the common ancestor.Group 3 comprised mostly populations of the Chapalote race, which was reported by Wellhausen et al. (1952) as one of the ancient indigenous races.The group also included Vandeño and Onaveño populations, characterized by their cylindrical ears, predominantly white grains and adaptation to low moisture conditions.Group 4 included Dulcillo del Noroeste and Tabloncillo populations, which share a node with Tabloncillo Perla and Onaveño, as well as separate Reventador, Elotero de Sinaloa, Chapalote and Blandito populations, which, in other studies, had been reported as belonging to different clusters.
Divergence time and progenitors of the races analyzed in this study have not been entirely established, but results reported here suggest that differentiation of the genomes of these races has taken place over a short period of time.PCoA and the phylogram showed that several of the populations have a similar genetic background, which resulted in groupings that were inconsistent with phenotypic differentiation.Results presented here also suggested that, at the molecular level, there is a diffuse genetic background that has not yet clearly led to phenotypic differentiation, which is more clearly revealed when selectively neutral molecular markers (such as SSR) are involved.
The examined maize races harbored broad genetic diversity among and within their populations. Selection pressure and genetic drift have led to a decrease in genetic variation (in this case, to 19.1%) relative to the common ancestor. Diversity parameter values were higher than those reported in other studies on native maize because of the analyzed number of samples, the group of markers and the automated technique of allele detection. Regarding the distribution of total genetic variation, an average 66.9% of the variation was found within the populations and an average of 33.3% among populations. Geographic origin affected population differentiation and dispersion in PCoA.

Conclusions
1. Maize (Zea mays ssp. mays) races from Northwestern Mexico harbor broad genetic diversity both among and within their populations, with the genetic variation within populations being larger than that among populations.
2. Average diversity parameter values are higher than those reported in other studies on native maize, which confirms that the sample size used per race allows for a more detailed detection of the allelic variability present in the races.
3. Genetic diversity found in corn populations can be used in designing conservation strategies and maximizing its use in breeding programs.

Figure 1. Origin sites of the evaluated populations.

Figure 4. Phylogram of 107 native maize (Zea mays ssp. mays) populations from Northwestern Mexico and one teosinte (Zea mays ssp. parviglumis) population, constructed with the neighbor-joining clustering method using Rogers' modified genetic distances of 31 microsatellite loci.
The numbers of populations per race were as follows: 12 of the Blandito race, 13 of the Chapalote race, 13 of the Dulcillo del Noroeste race, 4 of the Elotero de Sinaloa race, 6 of the Jala race, 11 of the Onaveño race, 10 of the Reventador race, 23 of the Tabloncillo race, and 15 of the Vandeño race.

Table 1. Parameters of genetic diversity of nine native maize races (Zea mays ssp. mays) and one teosinte race (Zea mays ssp. parviglumis), based on 31 simple sequence repeat (SSR) loci.