Print version ISSN 0103-9016
Sci. agric. (Piracicaba, Braz.) vol.60 no.2 Piracicaba 2003
GENETICS AND PLANT BREEDING
Intrapopulation fixation index dynamics in finite populations with variable outcrossing rates
Dinâmica do índice de fixação intrapopulacional em populações finitas com taxas variáveis de fecundação cruzada
Alexandre Siqueira Guedes Coelho (I); Roland Vencovsky (II)
(I) Depto. de Genética - USP/ESALQ, C.P. 83 - 13400-970 - Piracicaba, SP - Brasil
ABSTRACT
The intrapopulation fixation index (f) is inversely related to the outcrossing rate (t). Results obtained from molecular marker data of natural populations have shown that these values are highly variable, even when measured in the same group of individuals. This suggests that factors besides those described in Wright's genetic equilibrium must be operating. Using simulated data sets, this study shows that the finite size of a population is sufficient to spread the estimated f values over a range at equilibrium, as opposed to keeping them at the theoretical equilibrium point. Variation in outcrossing rates can amplify this range considerably. The correlation between estimated f values obtained from different loci under these conditions was negatively related to the outcrossing rate and positively related to the variance of this rate over generations. The finite size of populations, combined with small fluctuations in the mean value of t over time, may explain the high variation commonly reported among estimated f values of different loci.
Key words: population genetics, inbreeding, computer simulation
It is widely known that the values of the intrapopulation fixation index (f) and of the outcrossing rate (t) are closely related. Analyses of molecular markers in natural populations have shown high variation in these values among different loci evaluated in the same population, suggesting that factors other than those described in Wright's genetic equilibrium model must be operating. Using simulations, this work shows that the finite condition of a population is sufficient for the estimated f values to stabilize over a range of variation rather than at a single point. It is also shown that variation in outcrossing rates over generations substantially amplifies the magnitude of this range. The correlation between estimated f values obtained from different loci under these conditions depended on the outcrossing rates and on the magnitude of the variance of these rates among generations, and may be zero under panmixia with constant rates. The finite condition of populations, combined with small fluctuations in the mean outcrossing rate across generations, conditions typically found in nature, may explain the discrepancies among estimated f values from different loci commonly reported in the literature. The magnitude of the variance among estimated f values from different loci may possibly provide an estimate of the effective number of reproductively active individuals in a given population.
Key words: population genetics, inbreeding, computer simulation
INTRODUCTION
In very large populations reproducing by panmixis, in the absence of selection, migration and mutation, allele and genotype frequencies remain in equilibrium and do not change over generations, following the relations (Crow & Kimura, 1970):

$$P_{uu} = p_u^2, \qquad P_{uv} = 2 p_u p_v$$

in which $p_u$ is the frequency of allele u; $P_{uu}$ is the frequency of the genotype homozygous for allele u; and $P_{uv}$ is the frequency of the genotype heterozygous for alleles u and v. These relations are known as the Hardy-Weinberg law and were postulated independently by G. H. Hardy and W. Weinberg in 1908.
Wright (1921; 1922) demonstrated that an equilibrium is also attained if, with the other Hardy-Weinberg conditions maintained, the population reproduces under a fixed outcrossing rate (t) instead of by panmixis. In this case the relationships among allele and genotype frequencies become:

$$P_{uu} = p_u^2 + f\,p_u(1 - p_u), \qquad P_{uv} = 2 p_u p_v (1 - f)$$

in which f is the intrapopulation fixation index (Wright, 1951; 1965). These relations are known as Wright's equilibrium relations and are of great importance in the genetic characterization of natural populations, since they provide not only an indicator of the degree of structure of the genetic variability at the individual level, but also a method to estimate outcrossing rates under natural conditions from a single-generation data set (Fyfe & Bailey, 1951). This is possible because, under Wright's equilibrium conditions, f and t are related by (Bennet & Binet, 1956):

$$f = \frac{1 - t}{1 + t}$$
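As a quick numerical illustration of these relations, the sketch below evaluates the equilibrium value f = (1 - t)/(1 + t) and the corresponding genotype frequencies for a biallelic locus. The function names are hypothetical and the chosen t values are only examples; this is not the authors' program.

```python
def equilibrium_f(t):
    """Wright's equilibrium fixation index for a constant outcrossing rate t."""
    return (1.0 - t) / (1.0 + t)


def wright_genotype_frequencies(p_u, f):
    """Genotype frequencies at a locus with two alleles, u and v (assumed biallelic)."""
    p_v = 1.0 - p_u
    return {
        "uu": p_u ** 2 + f * p_u * p_v,
        "uv": 2.0 * p_u * p_v * (1.0 - f),
        "vv": p_v ** 2 + f * p_u * p_v,
    }


# Example outcrossing rates: mostly selfing, mixed mating, mostly outcrossing.
for t in (0.05, 0.50, 0.95):
    f = equilibrium_f(t)
    freqs = wright_genotype_frequencies(0.5, f)
    print(f"t={t:.2f}  f={f:.4f}  P_uv={freqs['uv']:.4f}")
```

With f = 0 (t = 1) the frequencies reduce to the Hardy-Weinberg proportions, and with f = 1 (t = 0) heterozygotes disappear.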
The extensive use of this knowledge, however, was delayed until techniques for genotype evaluation at the molecular level were developed. The development of molecular marker techniques over the last 30 years, and notably during the last 20, revolutionized the capacity for genotype evaluation under natural conditions. These techniques allowed a much broader variety of species to be used in genetic studies, in contrast to the few model species used in the past.
Papers reporting the genetic characterization of natural populations are now abundant, and many of them provide highly variable estimates of the intrapopulation fixation index obtained from different loci. In this respect, one must recognize that variation among different populations or species reflects different population structures and reproductive strategies that evolved under different ecological pressures (Loveless & Hamrick, 1984). On the other hand, variation within the same population over generations or among loci must reflect different aspects of its dynamics, being a consequence of demographic stochasticity and genetic drift (Gillespie, 1998).
The present paper evaluates, through computer simulation, the effects of the finite condition of populations and of variation in outcrossing rates, over generations and among individuals of a given generation, on the estimated values of the intrapopulation fixation index. The information produced might be valuable for the development of methods that allow a better characterization of the dynamics of the genetic properties of populations, and of their implications for the choice of better conservation and management strategies.
MATERIAL AND METHODS
To allow the simulation and evaluation of a diverse set of conditions, a computer program (available from the authors) was developed in the Delphi/Pascal language. Computer simulations were carried out using the following algorithm:
1. assign a given genotype to each of the L loci of the N individuals;
2. choose an individual at random (i);
3. assign a given value to the outcrossing rate (t);
4. generate a random number (x) from a uniform distribution on [0, 1];
5. if x < t, choose a second individual at random (j); otherwise, take j = i;
6. for each of the L loci, take one allele at random from i and combine it with an allele taken at random from j;
7. repeat steps 2 to 6 N times, producing the N individuals that will constitute the next generation;
8. repeat step 7 for the desired number of generations.
Initial genotypes were obtained from expected genotype frequencies under Hardy-Weinberg equilibrium, considering two equally frequent alleles per locus. Data were obtained after 20 generations, when Wright's equilibrium conditions prevail. The effects of the following factors were evaluated: a) population size; b) variation in the generation mean outcrossing rate; c) variation among the outcrossing rates of different individuals from a given generation.
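The algorithm described above can be sketched in Python as follows. This is a minimal illustration, not the authors' Delphi/Pascal program: the function names are hypothetical, the outcrossing rate is held constant, and the initial genotypes are drawn at random with two equally frequent alleles (which gives Hardy-Weinberg proportions only in expectation, an assumption of this sketch).

```python
import random


def initial_population(n_individuals, n_loci, rng):
    """Step 1: random genotypes, two equally frequent alleles (0 and 1) per locus."""
    return [
        [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n_loci)]
        for _ in range(n_individuals)
    ]


def next_generation(population, t, rng):
    """Steps 2 to 7: build N offspring under outcrossing rate t."""
    n = len(population)
    offspring = []
    for _ in range(n):
        i = rng.randrange(n)  # step 2: random parent i
        # Steps 4 and 5: outcross with probability t, otherwise self (j = i).
        # The second parent is drawn from all N individuals, so j may equal i by chance.
        j = rng.randrange(n) if rng.random() < t else i
        # Step 6: one random allele from each parent at every locus.
        child = [
            (rng.choice(population[i][locus]), rng.choice(population[j][locus]))
            for locus in range(len(population[i]))
        ]
        offspring.append(child)
    return offspring


def simulate(n_individuals, n_loci, t, n_generations, seed=0):
    """Step 8: iterate the mating scheme for the desired number of generations."""
    rng = random.Random(seed)
    population = initial_population(n_individuals, n_loci, rng)
    for _ in range(n_generations):
        population = next_generation(population, t, rng)
    return population


final_population = simulate(n_individuals=100, n_loci=2, t=0.5, n_generations=20)
print(len(final_population), "individuals,", len(final_population[0]), "loci")
```

To let t vary over generations or among individuals, the constant `t` could be replaced by a value drawn per generation or per mating event.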
The effects of variation in outcrossing rates, both among individuals of the same generation and among generations, were evaluated assuming that the outcrossing rate (t) is a Beta-distributed random variable, restricted to the [0, 1] interval, with density function given by:

$$g(t) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, t^{a-1} (1-t)^{b-1}$$

in which $\Gamma(\cdot)$ is the gamma function and a and b are distribution parameters, with a > 0 and b > 0.
The mean and variance of t are then given, respectively, by:

$$\bar{t} = \frac{a}{a+b}, \qquad \sigma_t^2 = \frac{ab}{(a+b)^2 (a+b+1)}$$
Parameters a and b used in the simulations were chosen to characterize species with different mating systems. Simulations were carried out under conditions typical of autogamous species ($\bar{t}$ = 0.05), mixed-mating species ($\bar{t}$ = 0.50) and allogamous species ($\bar{t}$ = 0.95). Beta distributions with variances equal to 0.000025 and 0.0025 were established for each case (Figure 1). Simulations with variance equal to 0.025 were carried out only for $\bar{t}$ = 0.50 because, for mean outcrossing rates of 0.05 and 0.95, the corresponding density functions would represent a behavior hardly found in nature, in which the most likely values of t would be the two extremes simultaneously. For each simulated case, the a and b values were obtained from the mean ($\bar{t}$) and variance ($\sigma_t^2$) of t using the relations:

$$a = \bar{t}\left[\frac{\bar{t}(1-\bar{t})}{\sigma_t^2} - 1\right], \qquad b = (1-\bar{t})\left[\frac{\bar{t}(1-\bar{t})}{\sigma_t^2} - 1\right]$$
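The conversion from a target mean and variance to the Beta parameters follows from the standard moment relations of the Beta distribution and can be sketched as below. The function name is hypothetical, and the sampling step uses Python's built-in `random.betavariate` rather than the procedure adapted from Dagpunar (1988).

```python
import random


def beta_parameters(mean_t, var_t):
    """Return (a, b) of a Beta distribution with the given mean and variance."""
    if not 0.0 < mean_t < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < var_t < mean_t * (1.0 - mean_t):
        raise ValueError("variance must be positive and below mean * (1 - mean)")
    common = mean_t * (1.0 - mean_t) / var_t - 1.0  # equals a + b
    return mean_t * common, (1.0 - mean_t) * common


# Means and variances quoted in the text.
for mean_t in (0.05, 0.50, 0.95):
    for var_t in (0.000025, 0.0025):
        a, b = beta_parameters(mean_t, var_t)
        print(f"mean={mean_t:.2f} var={var_t:g} -> a={a:.3f}, b={b:.3f}")

# Drawing outcrossing rates for the high-variance mixed-mating case.
rng = random.Random(1)
a, b = beta_parameters(0.50, 0.025)
sampled_rates = [rng.betavariate(a, b) for _ in range(5)]
print(sampled_rates)
```

Applying the same function to a mean of 0.05 with variance 0.025 gives a < 1 and b < 1, the U-shaped density with both extremes most likely that the text describes as unrealistic.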
Simulated values for the Beta distribution were generated following a procedure adapted from Dagpunar (1988).
Considering the magnitude of the evaluated populations, the intrapopulation fixation index values were estimated using the traditional expression:

$$\hat{f} = 1 - \frac{\hat{H}_o}{\hat{H}_e}$$

in which $\hat{H}_e$ is the expected heterozygosity under Hardy-Weinberg equilibrium, obtained directly from the observed sample allele frequencies, and $\hat{H}_o$ is the observed sample heterozygosity.
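A minimal sketch of this estimator for a single locus is given below: one minus the ratio of observed heterozygosity to the heterozygosity expected under Hardy-Weinberg equilibrium from the observed allele frequencies. The function name and the representation of genotypes as pairs of allele labels are assumptions of the example, and no small-sample bias correction is applied.

```python
def estimate_f(genotypes):
    """Estimate the fixation index at one locus from a list of (allele, allele) pairs."""
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    allele_counts = {}
    n_heterozygotes = 0
    for allele_1, allele_2 in genotypes:
        allele_counts[allele_1] = allele_counts.get(allele_1, 0) + 1
        allele_counts[allele_2] = allele_counts.get(allele_2, 0) + 1
        if allele_1 != allele_2:
            n_heterozygotes += 1
    h_observed = n_heterozygotes / n
    # Expected heterozygosity: 1 minus the sum of squared allele frequencies.
    h_expected = 1.0 - sum((c / (2 * n)) ** 2 for c in allele_counts.values())
    if h_expected == 0.0:
        return float("nan")  # monomorphic locus: the estimate is undefined
    return 1.0 - h_observed / h_expected


# A sample in exact Hardy-Weinberg proportions for two equally frequent alleles.
sample = [(0, 0)] * 25 + [(0, 1)] * 50 + [(1, 1)] * 25
print(estimate_f(sample))
```

An excess of heterozygotes relative to Hardy-Weinberg expectations yields a negative estimate, which is why estimated values are not confined to the [0, 1] interval.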
RESULTS AND DISCUSSION
Considering the dynamics of the fixation index under the different simulated conditions, the estimated values of f ($\hat{f}$) reach the equilibrium condition rather quickly, this condition being represented in certain cases by a range of variation rather than by a single value (Figures 2, 3 and 4). The $\hat{f}$ values showed an increasing magnitude of variation with decreasing population size, except in the simulation carried out with t = 0.00. Variation in mean outcrossing rates over generations also noticeably influenced the dynamics of the $\hat{f}$ values obtained in the simulations. The magnitude of variation of $\hat{f}$ was positively associated with variation in t values. On the other hand, when the outcrossing rates varied only among individuals of the same generation, with mean values kept constant over generations and a population size of 1,000,000 individuals, no detectable variation of $\hat{f}$ values over generations resulted.
Variation of outcrossing rates (t) over generations had a stronger effect on the behavior of $\hat{f}$ values than variation of t among individuals (Tables 1, 2 and 3). The weaker influence of variation among individuals of the same generation on the dynamics of $\hat{f}$ values should, however, be considered carefully. Once the effect of variation in mean outcrossing rates over generations on the behavior of $\hat{f}$ values is recognized, one should consider that, for small population sizes, variation of t at the individual level might contribute considerably to the oscillation of the generation mean of t, and would thus have an indirect effect on the variance of $\hat{f}$ values.
The detected variation in $\hat{f}$ values does not include that derived from statistical sampling, which results from the fact that only a portion of the actual population is evaluated. Thus, even when every individual of a given population can be evaluated, an oscillation in the $\hat{f}$ values of different loci over generations must be expected, resulting from the instability of the outcrossing rates and from the genetic sampling of gametes.
Figure 5 shows the observed probability distributions of the intrapopulation fixation index values obtained over generations in populations of 10,000 individuals with different outcrossing rates. These distributions, obtained with outcrossing rates constant among individuals and generations, are strongly similar to Beta probability density functions defined on the interval (0, N), with means equal to the simulated t values and variances given by the same expression as the sampling variance of $\hat{t}$ reported by Vencovsky (1994):
The probability density function of interest, expressed in terms of f is given by:
The usefulness of the (0, N) interval derives from the fact that $\hat{f}$ values are allowed to be negative, implying $\hat{t}$ values that may be larger than 1.0 (apparent outcrossing rate).
The similarity between these functions suggests that, when outcrossing rates are kept relatively constant over generations, the behavior of $\hat{f}$ values may be described by a Beta probability density function defined by parameters that are functions of the population size, of the f value corresponding to the actual outcrossing rate, and of the allele frequencies.
Figure 6 shows the observed probability distributions of the $\hat{f}$ values obtained from 10,000 loci evaluated in a single generation, in populations of 1,000 individuals with different outcrossing rates. The figure also shows the probability density functions defined in the same manner as those used to describe the behavior of $\hat{f}$ values over generations. The fact that $\hat{f}$ values showed a similar pattern in both cases, a given locus over generations and different loci in a given generation, suggests that data from several loci in a single generation might provide information on single-locus dynamics over generations. Although data from different generations are rare in the literature, information from different loci is not, and could thus provide an estimate of the effective number of reproductively active individuals in a given population (N).
Figure 7 shows the predicted and observed values of the variance of $\hat{f}$ among loci in different situations. In practical terms, the variance of $\hat{f}$ values considered here does not include the effects of statistical sampling, and is better interpreted as a variance component of $\hat{f}$ values associated with different loci than as a mean square of the sample estimates of $\hat{f}$.
Estimates of the correlation coefficient between $\hat{f}$ values obtained from two different loci over generations are shown in Figure 8. The magnitude of this correlation decreases as the outcrossing rate increases, reaching values close to zero under panmixis. Variation in the mean outcrossing rate over generations has a detectable positive effect on this correlation (Figure 9). In a simulation of a population of 10,000 individuals with a constant outcrossing rate of 0.95, the correlation between $\hat{f}$ values of different loci over generations was close to zero. Including variance in the mean outcrossing rate over generations increased the variance of $\hat{f}$ values ninefold and, at the same time, increased the correlation of $\hat{f}$ between different loci to approximately 0.88. Although the correlation between $\hat{f}$ values estimated for different loci increases with the variance of outcrossing rates over generations, the heterogeneity among $\hat{f}$ values from different loci in a given generation does not change, since it results from the genetic sampling imposed by the finite condition of populations.
From a practical point of view, the finite condition of populations might result in pronounced variation in the intrapopulation fixation index values estimated from different loci in a given generation. Small populations, with dozens of individuals, and variation in outcrossing rates are ubiquitous in nature, and the combined effect of these factors might be contributing to the high level of variation of $\hat{f}$ values among loci evaluated in the same population, commonly reported in the literature.
As a final comment, the difficulty of characterizing intrapopulation fixation index values well by means of point estimates should be stressed. The dynamic behavior of this parameter at equilibrium would certainly be better characterized by a probabilistic approach in which the probabilities of occurrence of different values could be evaluated. Without any loss of information, such an approach would allow evaluating not only the mean value but also its instability, which seems to be ubiquitous under natural conditions.
CONCLUSIONS
The estimated values of the intrapopulation fixation index exhibit a dynamic behavior under equilibrium that is strongly influenced by population size and by variation in mean outcrossing rates over generations. The instability of $\hat{f}$ values from different loci evaluated in the same population, commonly reported in the literature, might be explained by the finite condition of natural populations.
REFERENCES
BENNET, J.H.; BINET, F.E. Association between Mendelian factors with mixed selfing and random mating. Heredity, v.10, p.51-55, 1956.
CROW, J.F.; KIMURA, M. An introduction to population genetics theory. New York: Harper & Row, 1970. 591p.
DAGPUNAR, J. Principles of random variate generation. Oxford: Clarendon Press, 1988. 248p.
FYFE, J.L.; BAILEY, N.T.J. Plant breeding studies in leguminous forage crops: I. Natural cross-breeding in winter beans. Journal of Agricultural Science, v.41, p.371-378, 1951.
GILLESPIE, J.H. Population genetics: a concise guide. Baltimore: Johns Hopkins University Press, 1998. 184p.
LOVELESS, M.D.; HAMRICK, J.L. Ecological determinants of genetic structure in plant populations. Annual Review of Ecology and Systematics, v.15, p.65-95, 1984.
VENCOVSKY, R. Variance of an estimate of outcrossing rate. Brazilian Journal of Genetics, v.17, p.349-351, 1994.
WRIGHT, S. Systems of mating. Genetics, v.6, p.111-178, 1921.
WRIGHT, S. Coefficients of inbreeding and relationship. American Naturalist, v.56, p.330-338, 1922.
WRIGHT, S. The genetical structure of populations. Annals of Eugenics, v.15, p.323-354, 1951.
WRIGHT, S. The interpretation of population structure by F-statistics with special regard to systems of mating. Evolution, v.19, p.395-420, 1965.
Received July 22, 2002