Spatial pattern and genetic diversity estimates are linked in stochastic models of population differentiation

José Alexandre Felizola Diniz-Filho and Mariana Pires de Campos Telles

Abstract
In the present study, we used both simulations and real data set analyses to show that, under stochastic processes of population differentiation, the concepts of spatial heterogeneity and spatial pattern overlap. In these processes, the proportion of variation among local populations, estimated from the FST, GST or φP statistics, is correlated with the slope of a Mantel's test relating Nei's genetic distances to geographic distances. The intercept of this matrix regression indicates the value of genetic divergence when the geographic distance is zero, and is therefore correlated with the proportion of variation within populations, 1 - GST. Beyond the conceptual interest, the inspection of the relationship between population heterogeneity and spatial pattern can be used to test departures from stochasticity in the study of population differentiation, by comparing different loci or groups of species.

INTRODUCTION
Spatial analysis of genetic divergence among local populations has always played a central role in population genetics and evolutionary biology. Genetic divergence may increase with geographic distance either because environmental variation (and its associated selective effects) becomes more heterogeneous over large distances or because low migration rates fail to constrain divergence by random drift.
The analysis of spatial population structure has traditionally been performed in an implicit way by Wright's (1965) F-statistics and by more recently developed but related techniques, such as φ and GST estimates (Nei, 1973; Weir, 1996). These statistics have been criticized because they do not furnish a detailed description of the spatial patterns of genetic divergence (Barbujani, 1987), providing only a general description of spatial heterogeneity among local populations. Explicit spatial methods, such as autocorrelation analyses (Sokal and Oden, 1978a,b; Sokal and Jacquez, 1991; Epperson, 1995a,b) and matrix comparison techniques (Mantel's tests; Smouse et al., 1986; Manly, 1991; Thorpe et al., 1996), have been used to overcome this difficulty and describe the spatial patterns in genetic data in more detail.
The differences between spatial heterogeneity and spatial pattern analyses in this sense (Sokal and Oden, 1978a,b; Sokal, 1986; Diniz-Filho, 1998) reflect, in fact, implicit and explicit spatial approaches to the analysis of population differentiation. More recently, Rousset (1997) showed that pairwise FST statistics can be calculated between populations and that a plot of FST/(1 - FST) against geographic distance produces slopes and intercepts that can be interpreted in terms of demographic parameters.
In this communication, we show that the concepts of spatial heterogeneity and spatial pattern indeed overlap in stochastic models of population differentiation, such as Wright's (1943) isolation-by-distance or Kimura's stepping-stone model (Kimura and Weiss, 1964). Because these models generate functional (exponential) relationships between genetic divergence and geographic distance, there is a close correspondence between measurements of spatial heterogeneity (e.g., F-statistics and related estimates) and parameters of explicit spatial models.

SIMULATION STUDY
We simulated 10 local populations randomly distributed in geographic space and, based on the geographic distances among them, used the Ornstein-Uhlenbeck (O-U) stochastic process in the PDAP program (Phenotypic Diversity Analysis Program; Díaz-Uriarte and Garland, 1996) to simulate the evolution of five allele frequencies, keeping an exponential relationship between genetic divergence and geographic distance (for details, see Felsenstein, 1988; Hansen and Martins, 1996; and Telles, M.P.C. and Diniz-Filho, J.A.F., unpublished results).
We generated 50 data matrices, each containing 10 local populations and 5 allele frequencies representing distinct loci, and for each one we estimated the GST statistic according to Alfenas et al. (1991). Estimates of HT, HS, DST and GST for these multiple data sets were obtained using the GSTRUN program. For each data set, we also performed a Mantel's test (Smouse et al., 1986; Manly, 1991) comparing Nei's (1972) pairwise genetic distances between populations with their geographic distances. Since we assume here that genetic divergence is a function of geographic distance, we also estimated the regression parameters of the linear model Nij = a + b Dij + ε, where Nij is Nei's (1972) genetic distance between populations i and j, Dij is the geographic distance between the same pair of populations and ε is the residual term. The intercept of this matrix regression (a) can be interpreted as the estimated genetic distance when the geographic distance is zero, which should then be related to the proportion of genetic variation within local populations (1 - GST). Its slope (b), in turn, indicates the rate at which genetic divergence increases with geographic distance. We used both model I and model II regression estimates of a and b (Sokal and Rohlf, 1995), calculated by the MATREG program. The empirical results for model II regression parameters (which assume that X is also measured with error) are much clearer, and only these are shown here. This probably occurred because both genetic and geographic distances are estimated with an error related to the definition of patches of genetic similarity caused by the stochastic variation in the simulations. Both programs (GSTRUN and MATREG) were written in BASIC by one of us (J.A.F.D.-F.) especially for the simulation analyses and are available upon request.
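The matrix regression described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the MATREG program, whose code is not given in the text. It assumes bi-allelic loci, each summarized by the frequency of one allele, and it uses the reduced major axis as the model II estimator, since the text does not say which model II method was applied. The function names `nei_1972_distance` and `matrix_regression` are hypothetical.

```python
import numpy as np


def nei_1972_distance(freqs):
    """Nei's (1972) standard genetic distance between all pairs of populations.

    freqs: array of shape (n_populations, n_loci) with the frequency of one
    allele at each bi-allelic locus (an assumption of this sketch).
    """
    p = np.asarray(freqs, dtype=float)
    q = 1.0 - p
    n_loci = p.shape[1]
    # Gene identity between populations x and y, averaged over loci.
    j_xy = (p @ p.T + q @ q.T) / n_loci
    j_x = np.diag(j_xy)  # identity of each population with itself
    identity = j_xy / np.sqrt(np.outer(j_x, j_x))
    return -np.log(identity)


def matrix_regression(gen_dist, geo_dist):
    """Fit Nij = a + b * Dij over all population pairs with i < j.

    Returns the matrix correlation r and the (a, b) estimates from
    model I (ordinary least squares) and model II (reduced major axis,
    assumed here) regression.
    """
    gen_dist = np.asarray(gen_dist, dtype=float)
    geo_dist = np.asarray(geo_dist, dtype=float)
    upper = np.triu_indices_from(gen_dist, k=1)  # each pair counted once
    y, x = gen_dist[upper], geo_dist[upper]

    r = np.corrcoef(x, y)[0, 1]
    sx, sy = x.std(ddof=1), y.std(ddof=1)

    b_ols = r * sy / sx
    a_ols = y.mean() - b_ols * x.mean()

    # Reduced major axis: ratio of standard deviations, signed by r.
    b_rma = np.sign(r) * sy / sx
    a_rma = y.mean() - b_rma * x.mean()

    return {"r": r, "model_I": (a_ols, b_ols), "model_II": (a_rma, b_rma)}
```

When the correlation is weak, the model II slope is steeper than the model I slope by a factor of 1/|r|, so the two estimators diverge most in exactly the noisy data sets where the choice matters.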
As predicted, slopes and intercepts of Mantel's regressions are significantly correlated with the proportion of variation among (GST) and within (1 - GST) local populations, respectively (Figure 1). These linear patterns are indeed coherent with simple stochastic processes of genetic divergence. High genetic divergence among populations indicates that populations distributed in geographic space are more differentiated, so the slopes of Mantel's tests are higher because space has a clearer influence on divergence (Figure 1A). Also, because stochastic divergence necessarily contains a spatial component, the absence of a spatial pattern in the data indicates that most of the genetic divergence is within local populations (Figure 1B).
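For reference, the GST values placed on the axes of such plots follow Nei's (1973) partition of gene diversity, HT = HS + DST with GST = DST/HT. The sketch below is a minimal version that assumes bi-allelic loci, equal weights for all populations and no sample-size correction. It is not the GSTRUN program, and the function name `nei_gst` is hypothetical.

```python
import numpy as np


def nei_gst(freqs):
    """Nei's (1973) gene diversity partition for bi-allelic loci.

    freqs: allele frequencies with shape (n_populations,) for one locus or
    (n_populations, n_loci) for several loci. Populations are weighted
    equally and no sample-size correction is applied (assumptions of this
    sketch).
    """
    p = np.asarray(freqs, dtype=float)
    if p.ndim == 1:
        p = p[:, None]  # treat a vector as a single locus

    # HS: mean within-population gene diversity, 2p(1 - p) per population.
    hs_locus = (2.0 * p * (1.0 - p)).mean(axis=0)
    # HT: gene diversity of the pooled population, from mean frequencies.
    p_bar = p.mean(axis=0)
    ht_locus = 2.0 * p_bar * (1.0 - p_bar)

    # Multi-locus estimates average HS and HT over loci before the ratio.
    hs, ht = hs_locus.mean(), ht_locus.mean()
    dst = ht - hs
    gst = dst / ht if ht > 0 else float("nan")
    return {"HT": ht, "HS": hs, "DST": dst, "GST": gst}
```

For two populations with frequencies 0.2 and 0.8 at one locus, HS = 0.32 and HT = 0.5, so DST = 0.18 and GST = 0.36.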

REAL DATA SET
We tested the patterns discussed above using the data matrix provided by Sokal et al. (1986), with 15 allele frequencies for 15 bi-allelic loci estimated in 50 Yanomama villages. Both GST and Nei's (1972) genetic distances were estimated for each locus, and the a and b parameters of Mantel's regression were correlated with 1 - GST and GST, respectively. Results are similar to those obtained with the simulations (Figure 2), which supports the previous conclusion that these significant relationships appear under stochastic genetic divergence. Sokal et al. (1986) discuss the hierarchical genetic structure among the Yanomama, which seems to be more related to stochastic processes of divergence in space, caused by historical fission among villages through time.
No outliers are found in the two relationships of Figure 2, indicating, for example, that the spatial pattern of each locus (as measured by the Mantel's regression slope) is within the range expected for its magnitude of genetic heterogeneity. Indeed, the pairwise differences between residuals of the regressions in Figure 2 are not correlated (r = 0.03 and 0.02, as tested by Mantel's tests with 5000 permutations) with the pairwise differences between the spatial patterns of the loci, which were calculated as Manhattan distances between Moran's I correlograms with 10 distance classes. Despite this, it is interesting to note that the two loci combining high genetic divergence with a low magnitude of spatial pattern in Figure 2A (negative residuals: Gm ag and Acp b) are exactly those that depart most clearly from the fission history. These loci seem to be more influenced by admixture of the Ninam cluster of the Yanomama villages with an adjoining tribe, the Makiritare (Sokal et al., 1986, pp. 273-274).
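A Mantel's test with random permutations, of the kind used for the correlations above, can be sketched as follows. This is a generic two-sided version and an assumption of this note; the text does not state whether the original tests were one- or two-sided. The function name `mantel_test` and its `seed` parameter are hypothetical.

```python
import numpy as np


def mantel_test(dist_a, dist_b, n_perm=5000, seed=0):
    """Mantel's test between two symmetric distance matrices.

    Rows and columns of dist_b are permuted together, which keeps the
    dependence among the distances that involve the same population.
    Returns the observed matrix correlation and a two-sided p-value.
    """
    dist_a = np.asarray(dist_a, dtype=float)
    dist_b = np.asarray(dist_b, dtype=float)
    n = dist_a.shape[0]
    upper = np.triu_indices(n, k=1)
    x = dist_a[upper]

    r_obs = np.corrcoef(x, dist_b[upper])[0, 1]

    rng = np.random.default_rng(seed)
    n_extreme = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        permuted = dist_b[np.ix_(perm, perm)]
        r_perm = np.corrcoef(x, permuted[upper])[0, 1]
        if abs(r_perm) >= abs(r_obs):
            n_extreme += 1

    # The observed arrangement counts as one of the permutations.
    p_value = (n_extreme + 1) / (n_perm + 1)
    return r_obs, p_value
```

Permuting whole rows and columns, rather than shuffling the individual distances, is what distinguishes the test from an ordinary correlation test, because the n(n - 1)/2 pairwise distances among n populations are not independent observations.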

CONCLUDING REMARKS
Both simulations and real data set analyses show that the concepts of spatial heterogeneity and spatial pattern are not independent in common stochastic processes of genetic divergence that generate hierarchical structure at the population level. In principle, since this correspondence occurs in stochastic processes, a lack of relationship, or outliers in the relationship, would indicate departures from stochasticity in genetic divergence through the action of other microevolutionary processes.
For instance, a plot of the measurements of spatial heterogeneity for each locus, such as GST, against parameters extracted from spatial pattern analyses (such as the slope of the Mantel's test) would be useful to detect which systems depart most from the linear relationship. Selective pressures could explain such a departure in a specific system by producing, at the same time, both complex relationships between divergence and space (which lower the slope of the Mantel's test) and strong heterogeneity among local populations.
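One simple way to screen such a plot is to regress the Mantel slopes on GST across loci and flag loci with large standardized residuals. The sketch below assumes an ordinary least-squares fit and a cutoff of two residual standard deviations. Both are illustrative choices that are not taken from the text, and the function name `flag_departures` is hypothetical.

```python
import numpy as np


def flag_departures(gst, mantel_slope, threshold=2.0):
    """Flag loci that depart from the linear GST versus Mantel slope relationship.

    gst, mantel_slope: one value per locus.
    threshold: cutoff on the absolute standardized residual (an arbitrary
    choice made for this sketch).
    Returns the raw residuals and the indices of the flagged loci.
    """
    gst = np.asarray(gst, dtype=float)
    mantel_slope = np.asarray(mantel_slope, dtype=float)

    # np.polyfit returns the coefficients from highest degree to lowest.
    b, a = np.polyfit(gst, mantel_slope, 1)
    residuals = mantel_slope - (a + b * gst)

    # Residual standard deviation with n - 2 degrees of freedom.
    scale = residuals.std(ddof=2)
    flagged = np.flatnonzero(np.abs(residuals / scale) > threshold)
    return residuals, flagged
```

A negative residual marks a locus whose spatial pattern is weaker than expected for its level of heterogeneity, which is the situation described for the two admixture-influenced loci in Figure 2A.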

Figure 1 - Relationship between slope (A) and intercept (B) of the matrix regression of Nei's (1972) genetic distances against geographic distances with the estimated proportion of variation among (GST) and within (1 - GST) local populations, for 50 simulated data matrices.

Figure 2 - Relationship between slope (A) and intercept (B) of the matrix regression of Nei's (1972) genetic distances against geographic distances with the estimated proportion of variation among (GST) and within (1 - GST) local populations, for different loci estimated for 50 Yanomama villages (data from Sokal et al., 1986). The arrows indicate the two loci that are more influenced by admixture with a neighboring tribe.