Estimation of population profiles of two strains of the fly Megaselia scalaris (Diptera: Phoridae) by bootstrap simulation

Estimação dos perfis populacionais de duas linhagens do díptero forídeo Megaselia scalaris por simulação bootstrap

Abstracts

Based on experimental population profiles of strains of the fly Megaselia scalaris (Phoridae), we determined the minimum number of sample profiles that should be generated by bootstrap simulation in order to obtain a reliable estimate of the mean population profile, and we present estimates of the standard error as a measure of the precision of the simulations. The original data are from experimental populations founded with the SR and R4 strains, with three replicates each, which were kept for 33 weeks by the serial transfer technique in a constant temperature room (25 ± 1.0°C). The variable used was population size, and the model adopted for each profile was a stationary stochastic process. Through these simulations, the three experimental population profiles were enlarged in order to determine the minimum sample size. Once the sample size was fixed, bootstrap simulations were run to calculate confidence intervals and to compare the mean population profiles of the two strains. The results show that the means begin to stabilize at a sample size of 50.

bootstrap; Megaselia; populations; simulation; population size


A partir de perfis populacionais experimentais de linhagens do díptero forídeo Megaselia scalaris, foi determinado o número mínimo de perfis amostrais que devem ser repetidos, via processo de simulação "bootstrap", para se ter uma estimativa confiável do perfil médio populacional e apresentar estimativas do erro-padrão como medida da precisão das simulações realizadas. Os dados originais são provenientes de populações experimentais fundadas com as linhagens SR e R4, com três réplicas cada, e que foram mantidas por 33 semanas pela técnica da transferência seriada em câmara de temperatura constante (25 ± 1,0°C). A variável usada foi tamanho populacional e o modelo adotado para cada perfil foi o de um processo estocástico estacionário. Por meio das simulações, os perfis de três populações experimentais foram amplificados, determinando-se, dessa forma, o tamanho mínimo de amostra. Fixado o tamanho de amostra, simulações "bootstrap" foram realizadas para construção de intervalos de confiança e comparação dos perfis médios populacionais das duas linhagens. Os resultados mostram que com o tamanho de amostra igual a 50 inicia-se o processo de estabilização dos valores médios.

bootstrap; Megaselia; populações; simulação; tamanho populacional


MANZATO, A. J.,1 TADEI, W. J.2 and CORDEIRO, J. A.1

1Departamento de Ciências de Computação e Estatística, IBILCE, UNESP, Rua Cristóvão Colombo, 2265, CEP 15054-000, C.P. 136, São José do Rio Preto, SP, Brazil

2Departamento de Biologia, IBILCE-UNESP, Rua Cristóvão Colombo, 2265, CEP 15054-000, C.P. 136, São José do Rio Preto, SP, Brazil.

Correspondence to: Wlademir João Tadei, Departamento de Biologia, IBILCE, UNESP, Rua Cristóvão Colombo, 2265, CEP 15054-000, C.P. 136, São José do Rio Preto, SP, Brazil

Received April 26, 1999 – Accepted June 15, 1999 – Distributed August 31, 2000

(With 2 figures)


INTRODUCTION

To be effective, an experimental population model should allow the population to be counted by a precise census, allow the population to be re-established after each census with no major damage to the individuals, and allow environmental factors to be controlled in various ways. Parameters such as productivity, mortality, and age structure are directly related to the determination of population size, and they are strongly influenced by the interactions between the genetic and environmental variables to which the populations are subjected. The size of a population at a given moment is the number of living individuals at that time; it can be measured by a precise census in experimental populations, or by statistical methods of capture, marking, and recapture in natural populations.

The need to understand the mechanisms that regulate population size demanded suitable laboratory techniques and led to the development of two basic ones, the population cages of L'Héritier & Teissier (1933) and the serial transfer technique in bottles of Buzzati-Traverso (1955), which are still used, with some modifications, in insect population studies. A basic problem in population studies is determining the minimum sample size required for adequate statistical procedures. The purpose of this paper is to determine the minimum number of simulated sample profiles, generated from experimental population profiles, needed to obtain a reliable estimate of the mean population profile, and to present estimates of the standard error as a measure of the precision of the simulations. Using the computationally intensive bootstrap simulation process, the profiles of three experimental populations of the fly Megaselia scalaris were enlarged, and in this way the minimum sample size was determined for estimating the mean population profile and its confidence intervals.

Bootstrap is a simulation technique developed for some kinds of statistical inference. It was suggested by Efron (1979, 1981, 1985) to simplify traditionally difficult calculations in statistical theory. A bootstrap sample x* = (x1*, x2*, ..., xn*) is obtained by sampling n times at random, with replacement, from the original data (x1, x2, ..., xn). For example, with n = 7 we may obtain x* = (x5, x7, x5, x4, x7, x3, x1). With the minimum sample size determined, bootstrap confidence intervals were calculated to compare the behaviour of population size in two geographical strains of Megaselia scalaris.
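The definition of a bootstrap sample given above can be illustrated with a minimal Python sketch. The data values, the seed, and the function name `bootstrap_sample` are hypothetical and chosen only for illustration; the original work used MINITAB, not Python.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) values from data at random, with replacement."""
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

# Hypothetical original sample of n = 7 observations (illustrative values only).
x = [12, 15, 9, 20, 17, 11, 14]
rng = random.Random(0)  # fixed seed so the draw is reproducible
x_star = bootstrap_sample(x, rng)
print(x_star)
```

Because sampling is with replacement, some original values may appear several times in `x_star` and others not at all, as in the example with n = 7.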

MATERIAL AND METHODS

Six experimental populations of two geographical strains of Megaselia scalaris Loew (Diptera: Phoridae), SR, from Seropédica, State of Rio de Janeiro, and R4, from São José do Rio Preto, State of São Paulo, were founded and kept for 33 weeks by the serial transfer technique in a constant temperature room (25 ± 1.0°C). This technique was proposed by Buzzati-Traverso (1955) and modified by Dobzhansky & Pavlovsky (1961); as described by Tadei & Mourão (1981), it runs as follows. The founding flies are introduced into a ¼ litre bottle, in which they lay eggs for one week. On the seventh day, and every seven days thereafter, surviving flies are collected, etherized, counted, and transferred with the surviving adults to a new bottle. On the 35th day, which corresponds to the fifth census, the population occupies five bottles; at each subsequent census a new bottle is added and the oldest is discarded. Thus the adult part of the population is always in a single bottle with fresh medium. The other four bottles contain eggs, larvae, pupae, and newly emerged flies, which constitute the immature part of the population. The bootstrap simulation process was run in the MINITAB statistical software (1996), and the model adopted for each population profile was a stationary stochastic process. Fig. 1 shows an initial period of adaptation to experimental conditions; therefore, to analyse the effect of the simulation, only the final 21 weeks were considered. The criterion for discarding the initial weeks of population maintenance was the fit of a linear regression of population size on time: the stationary process was taken to begin when the regression coefficient reached zero.
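One way to read the regression criterion for discarding the initial weeks is sketched below in Python. The text states only that the regression coefficient should reach zero; the scanning procedure, the tolerance `tol`, the minimum tail length `min_len`, and the census values are assumptions made for illustration.

```python
def ols_slope(y):
    """Least-squares slope of y regressed on time t = 1, 2, ..., len(y)."""
    n = len(y)
    t_mean = (n + 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y, start=1))
    sxx = sum((t - t_mean) ** 2 for t in range(1, n + 1))
    return sxy / sxx

def first_stationary_week(sizes, tol, min_len=5):
    """First week (1-based) from which the slope of size on time is within tol of zero.

    Assumption: the tail of the series starting at each candidate week is
    regressed on time, and the first tail with |slope| <= tol is accepted.
    """
    for start in range(len(sizes) - min_len + 1):
        if abs(ols_slope(sizes[start:])) <= tol:
            return start + 1
    return None

# Hypothetical weekly census: growth during adaptation, then a plateau.
sizes = [10, 30, 60, 90, 100, 100, 100, 100, 100, 100, 100, 100]
print(first_stationary_week(sizes, tol=0.5))  # 5
```

With real counts the slope never equals zero exactly, so a tolerance or a significance test on the coefficient would be needed; which of these the authors applied is not stated.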
To determine the minimum sample size, profiles were simulated from the original R4 strain data. Following Efron & Tibshirani (1986, 1993), the reliability of the estimate was measured by the sample standard error obtained through bootstrap simulation, compared with the standard deviation of the mean $\bar{x}$, $s/\sqrt{n}$. With the minimum sample size determined, simulations were run for each week to build 95% confidence intervals, and the two strains were compared using the confidence regions formed by these intervals.
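The comparison between the bootstrap standard error and the classical standard error of the mean can be sketched as follows. The weekly counts are hypothetical, the function names are illustrative, and the use of the sample standard deviation with divisor n - 1 is an assumption rather than something the text specifies.

```python
import random
import statistics

def bootstrap_se_of_mean(data, n_boot, rng):
    """Standard deviation of the means of n_boot bootstrap samples of data."""
    n = len(data)
    means = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    return statistics.stdev(means)

def classical_se_of_mean(data):
    """Classical estimate s / sqrt(n), with s the sample standard deviation."""
    return statistics.stdev(data) / len(data) ** 0.5

# Hypothetical population sizes for 21 stationary weeks (made-up numbers).
sizes = [102, 97, 110, 95, 104, 99, 108, 101, 93, 106, 100,
         98, 111, 96, 103, 105, 94, 107, 99, 102, 100]
rng = random.Random(0)
print(bootstrap_se_of_mean(sizes, 50, rng), classical_se_of_mean(sizes))
```

The two estimates should be of similar magnitude, and the agreement generally improves as the number of bootstrap samples grows.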


RESULTS AND DISCUSSION

The population size profiles of the three replicate experimental populations of the SR and R4 strains of Megaselia scalaris over the 33 weeks of maintenance are shown in Fig. 1. Population size was obtained by counting, on the day of each weekly census, the flies in the bottle containing the adult part of the population. The male, female, and total (male + female) means and standard errors of the population size and productivity variables for the replicates of the two strains refer to the final 21 weeks of maintenance, a period in which population size is in equilibrium, as shown in Table 1. Although the male birth rate is 1.4 times higher than the female birth rate in the SR populations and 1.2 times higher in the R4 populations, males are absent from the adult part, so total population size is determined by the number of females. This is because male longevity in Megaselia scalaris is 5 or 6 days at 25°C. Other experiments, with a higher number of weekly transfers, are under way to study age structure in these populations.

Table 2 gives, for the R4 strain data, the results of the linear regression fitted to locate the beginning of the stationary process. For the simulations, although stability had already been reached by the 9th week, leaving 25 stationary weeks, only data from the final 21 weeks were considered. For bootstrap simulation to be applied, the hypothesis of independence between observations must be satisfied. To this end, intervals of three consecutive weeks were taken as the unit, which gave 21 intervals treated as independent (7 per profile) from the three original profiles, starting at the 13th observation week. Table 3 gives the means of the simulated profiles for varying numbers of profiles, in an attempt to determine the minimum number of profiles at which the means begin to stabilize.
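The resampling of three-week intervals can be sketched in Python as below. The text does not say how the intervals were pooled or reassembled into simulated profiles, so pooling the blocks of all three replicates and drawing seven blocks per simulated profile is one plausible reading, not the authors' documented procedure. All numbers are made up.

```python
import random

BLOCK = 3  # weeks per resampling unit, as stated in the text

def to_blocks(profile, block=BLOCK):
    """Split a weekly profile into consecutive non-overlapping blocks."""
    return [profile[i:i + block] for i in range(0, len(profile), block)]

def simulate_profile(blocks, n_blocks, rng):
    """Build one simulated profile from n_blocks blocks drawn with replacement."""
    profile = []
    for _ in range(n_blocks):
        profile.extend(rng.choice(blocks))
    return profile

rng = random.Random(1)
# Three hypothetical replicate profiles of 21 stationary weeks each.
replicates = [[rng.randint(80, 120) for _ in range(21)] for _ in range(3)]
pool = [b for rep in replicates for b in to_blocks(rep)]  # 3 x 7 = 21 blocks

# Simulate 50 profiles and average them week by week.
simulated = [simulate_profile(pool, 7, rng) for _ in range(50)]
mean_profile = [sum(p[w] for p in simulated) / len(simulated) for w in range(21)]
print([round(m, 1) for m in mean_profile])
```

Repeating the last two steps with different numbers of simulated profiles shows how the weekly means settle as that number grows, which is the kind of comparison Table 3 reports.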

The means begin to stabilize at a sample size of 50. Table 4 shows the standard error of the mean estimated via bootstrap simulation at a sample size of 50, compared with the estimate given by the classical estimator of the standard error of the mean, $s/\sqrt{n}$.

Table 2 shows that the stationary process starts at the 9th week of maintenance, so the data from the following 25 weeks could be analysed. To simplify the programming, we considered data from the final 21 weeks, as indicated by the arrow in Fig. 1. Table 3 shows that the means start to stabilize at a sample size of 50, which determines the minimum number of profiles that should be simulated via bootstrap from data obtained with experimental laboratory populations. In bootstrap simulation, the standard error measures the precision of the estimates provided by the estimator of interest. Here the estimator is the mean, which corresponds to the mean profile over the 21 weeks and whose standard error is known from classical theory to be $s/\sqrt{n}$. Table 4 shows that the standard error estimates obtained via bootstrap are rather good, even with few repetitions of the simulation process once the sample size is fixed.

For both the SR and R4 strains, confidence intervals for each week were calculated from the three experimental replicates by the bootstrap simulation process. First, a sample of size 50 was drawn from the three original experimental values, and its mean was calculated. A sample size of 50 is known to be adequate for obtaining good estimates of the standard error and for convergence of the distribution of the mean to normality. This process was repeated n times, with n varying from 50 to 1300, which made it possible to study the distribution of the mean and to calculate confidence intervals.
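The weekly procedure described above can be sketched in Python as follows. The three replicate counts are hypothetical, and building the 95% interval as mean ± 1.96 standard deviations of the simulated means is an assumption suggested by the appeal to normality; a percentile interval would be an alternative.

```python
import random
import statistics

def simulated_means(values, sample_size, n_sim, rng):
    """Means of n_sim samples, each of sample_size draws with replacement."""
    return [statistics.fmean(rng.choices(values, k=sample_size))
            for _ in range(n_sim)]

def normal_ci(means, z=1.96):
    """Normal-approximation interval from the simulated means (assumed form)."""
    m = statistics.fmean(means)
    s = statistics.stdev(means)
    return m - z * s, m + z * s

week_values = [96, 104, 110]  # hypothetical counts of three replicates in one week
rng = random.Random(2)
for n_sim in (50, 500, 800, 1300):
    lower, upper = normal_ci(simulated_means(week_values, 50, n_sim, rng))
    print(n_sim, round(lower, 1), round(upper, 1))
```

Running the loop for each week and each strain gives one interval per week, which together form a confidence region around the mean profile.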

Tables 5 and 6 give the means and standard deviations obtained by simulation, together with the means of the three experimental replicates, for the SR and R4 strains, respectively.

Tables 7 and 8 give the lower and upper limits of the 95% confidence intervals calculated for the SR and R4 strains, respectively, for 50, 500, 800, and 1300 simulations, based on Tables 5 and 6. Tables 7 and 8 show that the confidence intervals obtained with 500 simulations are very close to those obtained with 800 and 1300, which indicates that 500 simulations are enough to give good results. The SR and R4 profiles are compared by checking whether their intervals intersect in each observation week. To illustrate these comparisons, Fig. 2 shows the estimated mean profiles for the SR and R4 strains and the respective confidence regions calculated via bootstrap. From these results we conclude that, for the population size variable and under the conditions in which the experiments were carried out, the SR strain remained at a higher level than the R4 strain.
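The week-by-week comparison by interval intersection can be sketched in a few lines of Python. The interval limits below are invented for illustration and are not taken from Tables 7 and 8; the function name is likewise hypothetical.

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (lower, upper) and b = (lower, upper) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical weekly 95% limits for the two strains (made-up numbers).
sr_ci = [(118.0, 126.0), (115.0, 124.0), (120.0, 129.0)]
r4_ci = [(101.0, 110.0), (112.0, 119.0), (104.0, 113.0)]

overlap_by_week = [intervals_overlap(a, b) for a, b in zip(sr_ci, r4_ci)]
print(overlap_by_week)  # [False, True, False]
```

Weeks without intersection suggest a difference between the strains in that week; this is an informal comparison and not a formal hypothesis test.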


The results of this experiment draw attention to practical laboratory work. Planning experiments with many replicates sometimes becomes operationally impossible.

Increasing the number of replicates via bootstrap simulation may be an alternative, although restrictions imposed by the biology of the insect itself are naturally important when adopting a model such as a stationary stochastic process. The main restriction in this case is the variability of population size: the lower the variation in this variable over the maintenance weeks, the fewer profiles need to be simulated to determine the minimum sample size for estimating the mean population profile.

Another important aspect concerns statistical tests for comparing populations by their mean profiles. Hypothesis tests should be developed to test the equality of profiles, especially when the confidence regions overlap substantially.

REFERENCES

  • BUZZATI-TRAVERSO, A. A., 1955, Evolutionary changes in components of fitness and other polygenic traits in Drosophila melanogaster populations. Heredity, 9: 153-186.
  • DOBZHANSKY, T. H. & PAVLOVSKY, O., 1961, A further study of fitness of chromosomally polymorphic and monomorphic population of Drosophila pseudoobscura. Heredity, 16: 169-179.
  • EFRON, B., 1979, Bootstrap methods: another look at the jackknife. The Annals of Statistics, 7: 1-26.
  • EFRON, B., 1981, Nonparametric estimates of standard error: the jackknife, the bootstrap and other methods. Biometrika, 68: 589-599.
  • EFRON, B., 1985, Bootstrap confidence intervals for a class of parametric problems. Biometrika, 72: 45-58.
  • EFRON, B. & TIBSHIRANI, R. J., 1986, Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Statistical Science, 1: 54-77.
  • EFRON, B. & TIBSHIRANI, R. J., 1993, An introduction to the bootstrap. Chapman & Hall, New York, 436p.
  • L'HÉRITIER, P. & TEISSIER, G., 1933, Étude d'une population de Drosophiles en équilibre. C. R. Acad. Sci. Paris, 197: 1765-1767.
  • MINITAB for Windows, release 11, 1996, Minitab Inc., State College, PA, USA.
  • TADEI, W. J. & MOURÃO, C. A., 1981, Cyclic oscillation in population size of Drosophila sturtevanti. Rev. Bras. Genet., 4: 149-164.

Publication Dates

  • Publication in this collection
    22 Feb 2001
  • Date of issue
    Aug 2000
