
Sample size in the estimation of correlation coefficients for corn hybrids in crops and accuracy levels

Abstracts

This study determined the sample size needed to estimate Pearson linear correlation coefficients for single, triple and double corn hybrids across crops and accuracy levels. Twelve traits were measured in 361, 373 and 416 plants of the single, triple and double hybrids, respectively, in the 2008/2009 crop, and in 1,777, 1,693 and 1,720 plants of the single, triple and double hybrids, respectively, in the 2009/2010 crop: plant height, ear insertion height, ear weight, number of grain rows per ear, ear length and diameter, cob weight and diameter, weight of hundred grains, number of grains per ear, grain length and grain yield. For each hybrid and crop, the correlation coefficients were estimated for the 66 pairs of traits, and the sample size needed to estimate them at four accuracy levels [amplitudes of the 95% confidence interval (ACI95%) of 0.15, 0.25, 0.35 and 0.45] was determined by resampling with replacement. The sample size varies among hybrids, crops and pairs of traits. A larger sample size is required to estimate the correlation coefficient between weakly correlated traits, and a smaller sample size is needed to estimate the correlation coefficient between highly correlated traits. Regardless of hybrid, crop and pair of traits, 375, 195 and 120 plants are sufficient, respectively, to estimate the correlation coefficients with maximum ACI95% of 0.25, 0.35 and 0.45.

Keywords: Zea mays L.; resampling; experimental design; linear relationships




1 INTRODUCTION

Maize is the cereal with the largest production volume worldwide, estimated at 906.82 million tons for the 2014/2015 crop on an area of 160.2 million hectares, and Brazil is the world's third largest producer (FAO, 2014). Maize is used as food, feed and industrial raw material, mainly because of the reserves accumulated in its grains (Fancelli & Dourado Neto, 2004). The increase in corn productivity in recent decades has been attributed equally to improved management and to breeding (Duvick, 2005).

In breeding, plant selection can be performed directly or indirectly by studying the linear relationships between traits. The Pearson linear correlation coefficient (r), which measures the direction and intensity of the linear relationship between two random variables, may be used for such analysis (Ferreira, 2009). The correlation may be positive or negative, within the range –1 ≤ r ≤ 1, and the closer |r| is to 1, the stronger the linear relationship. Additional analyses such as path analysis and canonical correlations have also been recommended for indirect selection of plants (Cruz & Regazzi, 1997).
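As a minimal sketch of how r is obtained from paired measurements, the function below applies the usual product-moment formula. The function name and the sample values are hypothetical and serve only as illustration; they are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements for five plants (illustration only).
ear_weight = [150.0, 180.0, 210.0, 240.0, 170.0]
grain_yield = [120.0, 150.0, 170.0, 200.0, 140.0]
r = pearson_r(ear_weight, grain_yield)
print(f"r = {r:.3f}")
```

The sign of the result gives the direction of the relationship and its absolute value the intensity. The sketch does not handle a trait with zero variance, for which r is undefined.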

For reliable results in studies of linear relationships, an adequate sample size (number of plants) must be used to estimate the correlation coefficients. These coefficients may be interpreted separately or used in further analyses, such as path analysis and canonical correlations, and therefore must be estimated accurately. Accordingly, Cargnelutti Filho et al. (2010) performed the sample sizing for the estimation of linear correlation coefficients between traits of single, triple and double corn hybrids, based on data from one crop and for one accuracy level. The authors found that up to 300 plants were needed to estimate r for 91 pairs of traits with a maximum amplitude of the 95% confidence interval (ACI95%) of 0.30, depending on the hybrid and the pair of traits. Sample sizing studies for the estimation of Pearson correlation coefficients were also conducted in crambe (Cargnelutti Filho et al., 2011) and castor bean (Cargnelutti Filho et al., 2012). In addition, Shieh (2010) evaluated the properties of the Pearson linear correlation coefficient and the effect of sampling design on it, and Bonett & Wright (2000) conducted sample sizing studies for the estimation of Pearson, Kendall and Spearman correlation coefficients.

As noted above, sample sizing studies for the estimation of correlation coefficients have been developed, including in corn (Cargnelutti Filho et al., 2010). However, sample sizing that considers hybrids, crops and accuracy levels is important to enable the reliable estimation of the correlation coefficients, which are widely used in studies of linear relationships. This study aimed to determine the sample size needed for the estimation of Pearson linear correlation coefficients for single, triple and double corn hybrids across crops and accuracy levels.

2 MATERIAL AND METHOD

Two experiments were conducted with corn (Zea mays L.) in the growing seasons of 2008/2009 (first experiment) and 2009/2010 (second experiment), in an area located in Santa Maria, Rio Grande do Sul State, Brazil (29°42’S, 53°49’W, 95 m altitude). In the first experiment, sown on December 26, 2008, four plots were sown with the single hybrid P32R21, four with the triple hybrid DKB566 and four with the double hybrid DKB747. In the second experiment, sown on October 26, 2009, sixteen plots were sown with the single hybrid 30F53, sixteen with the triple hybrid DKB566 and sixteen with the double hybrid DKB747.

Each plot consisted of four 6 m long rows spaced 0.80 m apart, with density adjusted to five plants per linear meter, representing a plant population of 62,500 plants ha–1. Thus, each plot consisted of 120 plants, totaling 1,440 plants in the first experiment (3 hybrids × 4 plots/hybrid × 120 plants/plot) and 5,760 plants in the second experiment (3 hybrids × 16 plots/hybrid × 120 plants/plot). In each crop, plots of the single, triple and double hybrids were randomized in the experimental area. In both experiments, the basic fertilization was 750 kg ha–1 of the formula 3-24-18 (NPK) and the topdressing was 300 kg ha–1 of urea with 45% N. The other cultural practices were performed according to the recommendations for corn (Fancelli & Dourado Neto, 2004).
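The plot arithmetic above can be checked with a short sketch. The constant and function names are hypothetical; the numeric inputs are those stated in the text.

```python
# Plot layout as described in the text.
ROW_SPACING_M = 0.80         # distance between rows (m)
PLANTS_PER_LINEAR_M = 5      # plants per meter of row
ROWS_PER_PLOT = 4
ROW_LENGTH_M = 6

# Each plant occupies 0.80 m x 0.20 m, so density is 5 / 0.80 plants per square meter.
plants_per_m2 = PLANTS_PER_LINEAR_M / ROW_SPACING_M
plants_per_ha = plants_per_m2 * 10_000  # 1 ha = 10,000 square meters

plants_per_plot = ROWS_PER_PLOT * ROW_LENGTH_M * PLANTS_PER_LINEAR_M


def total_plants(hybrids, plots_per_hybrid):
    """Total number of plants sown in one experiment."""
    return hybrids * plots_per_hybrid * plants_per_plot


print(round(plants_per_ha))   # plant population per hectare
print(plants_per_plot)        # plants per plot
print(total_plants(3, 4))     # first experiment
print(total_plants(3, 16))    # second experiment
```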

In the first experiment, we assessed 361, 373 and 416 plants, respectively, of the single, triple and double hybrids, and in the second experiment we evaluated 1,777, 1,693 and 1,720 plants, respectively, of the single, triple and double hybrids. Only plants in which all twelve traits described below could be measured were evaluated; as a result, the final number of plants evaluated in each crop differed between the single, triple and double hybrids. In each of the 6,340 plants, the following traits were measured: plant height at harvest (PH), ear insertion height (EIH), unhusked ear weight (EW), number of grain rows per ear (NR), ear length (EL), ear diameter (ED), cob weight (CW), cob diameter (CD), weight of hundred grains (WHG), number of grains per ear (NGE), grain length (GL), calculated as the difference between the ear and cob diameters divided by two, and grain yield (YIELD), in grams per plant. Next, for each hybrid in each experiment, the Pearson linear correlation coefficient (r) was calculated for each of the 66 pairs of traits, and the significance of r was checked by Student’s t-test at 5% significance.
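The counts and derived quantities in this paragraph can be sketched as follows. The function names are hypothetical, and the significance check uses a normal approximation to Student's t distribution as a simplifying assumption (reasonable here because every hybrid has several hundred plants); it is not necessarily the exact procedure used by the authors.

```python
import math
from itertools import combinations
from statistics import NormalDist

TRAITS = ["PH", "EIH", "EW", "NR", "EL", "ED", "CW", "CD", "WHG", "NGE", "GL", "YIELD"]
PAIRS = list(combinations(TRAITS, 2))  # 12 * 11 / 2 unordered pairs

# Plants evaluated per hybrid (single, triple, double) in each experiment.
PLANTS_2008_2009 = (361, 373, 416)
PLANTS_2009_2010 = (1777, 1693, 1720)
TOTAL_PLANTS = sum(PLANTS_2008_2009) + sum(PLANTS_2009_2010)


def grain_length(ear_diameter, cob_diameter):
    """Grain length: half the difference between ear and cob diameters."""
    return (ear_diameter - cob_diameter) / 2


def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom (|r| < 1)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def is_significant(r, n, alpha=0.05):
    """Two-sided test of H0: rho = 0.

    Assumption: the critical value comes from the standard normal distribution,
    which approximates Student's t closely when n is in the hundreds.
    """
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(t_statistic(r, n)) > critical


print(len(PAIRS), TOTAL_PLANTS)
print(is_significant(0.20, 361), is_significant(0.02, 361))
```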

For each hybrid, in each experiment, 199 sample sizes were planned, with an initial sample size of ten plants and the others obtained in increments of five plants. Thus, the planned sample sizes were n = 10, 15, 20, ..., 1,000 plants. For each planned sample size, 1,000 resamples with replacement were obtained (Ferreira, 2009), the same number of resamples used in previous studies on sample sizing to estimate the Pearson correlation coefficient (Cargnelutti Filho et al., 2010, 2012), and in each resample the Pearson linear correlation coefficients (r) of the 66 pairs of traits were estimated. Thus, for each planned sample size, 1,000 r estimates were obtained for each of the 66 pairs of traits. Based on these 1,000 estimates, we determined the 2.5% percentile, the mean, the 97.5% percentile and the amplitude of the 95% confidence interval (ACI95%), given by the difference between the 97.5% percentile and the 2.5% percentile.
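The resampling step for a single pair of traits can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the authors' R code: the function names, the linear-interpolation percentile rule and the synthetic data are all hypothetical.

```python
import math
import random

PLANNED_SIZES = range(10, 1001, 5)  # n = 10, 15, 20, ..., 1,000


def pearson_r(x, y):
    """Pearson correlation; does not guard against a zero-variance resample."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def percentile(sorted_values, p):
    """Percentile p (0-100) of a sorted list, by linear interpolation (assumed rule)."""
    k = (len(sorted_values) - 1) * p / 100
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def bootstrap_aci95(x, y, n, resamples=1000, rng=None):
    """Return (2.5% percentile, mean, 97.5% percentile, ACI95%) of r for sample size n."""
    rng = rng or random.Random()
    size = len(x)
    estimates = []
    for _ in range(resamples):
        # Draw n plants with replacement, keeping the two traits paired.
        idx = [rng.randrange(size) for _ in range(n)]
        estimates.append(pearson_r([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    lower = percentile(estimates, 2.5)
    upper = percentile(estimates, 97.5)
    mean = sum(estimates) / len(estimates)
    return lower, mean, upper, upper - lower


# Synthetic, moderately correlated data standing in for two traits (not study data).
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(400)]
y = [0.5 * a + data_rng.gauss(0, 1) for a in x]

for n in (10, 50, 200):
    lower, mean, upper, aci = bootstrap_aci95(x, y, n, rng=random.Random(n))
    print(f"n={n:4d}  2.5%={lower:.2f}  mean={mean:.2f}  97.5%={upper:.2f}  ACI95%={aci:.2f}")
```

In the study this computation would be repeated for every planned sample size, pair of traits, hybrid and experiment; the sketch only shows that the interval narrows as n grows.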

To determine the sample size (number of plants) required for the estimation of r for each of the 66 pairs of traits in each hybrid and experiment, the maximum ACI95% of r was first set at 0.15 (higher accuracy), 0.25, 0.35 and 0.45 (lower accuracy). Then, starting from the initial sample size (n = 10 plants), the adequate sample size (n) was taken as the number of plants from which the ACI95% of r was less than or equal to the maximum limit for each accuracy level (0.15, 0.25, 0.35 or 0.45). The correlation coefficients obtained with data from the first experiment and the sample size for ACI95% of 0.30 (an intermediate level of accuracy between 0.15 and 0.45) were presented by Cargnelutti Filho et al. (2010). Statistical analyses were run with the software R (R Development Core Team, 2014) and Microsoft Office Excel®.
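The selection rule can be sketched as a scan over the planned sample sizes in increasing order. The sketch assumes one reading of the rule, namely that the first planned n whose ACI95% is at or below the limit is taken; the function name and the widths in the dictionary are hypothetical and not results from the study.

```python
LIMITS = (0.15, 0.25, 0.35, 0.45)


def required_sample_size(aci_by_n, limit):
    """Smallest planned n whose ACI95% is <= limit, or None if no planned size qualifies."""
    for n in sorted(aci_by_n):
        if aci_by_n[n] <= limit:
            return n
    return None  # corresponds to "more than the largest planned size"


# Hypothetical ACI95% values for one pair of traits at a few planned sizes.
aci_by_n = {10: 1.10, 15: 0.92, 20: 0.80, 50: 0.52,
            100: 0.37, 120: 0.34, 200: 0.26, 250: 0.24}

result = {limit: required_sample_size(aci_by_n, limit) for limit in LIMITS}
print(result)
```

A stricter reading, in which every larger planned size must also satisfy the limit, would need an additional check because resampled widths are not guaranteed to decrease monotonically.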

3 RESULTS AND DISCUSSION

The linear correlation coefficients (r) showed high variability among the 66 pairs of traits (Table 1). For hybrids evaluated in 2008/2009, r ranged from 0.02 (CD × GL) to 1.00 (EW × YIELD) for the single hybrid P32R21, from –0.20 (NR × WHG) to 1.00 (EW × YIELD) for the triple hybrid DKB566 and from –0.16 (NR × WHG) to 0.99 (EW × YIELD) for the double hybrid DKB747. In 2009/2010, the correlation coefficient ranged from –0.20 (EIH × EL) to 1.00 (EW × YIELD) for the single hybrid 30F53, from –0.12 (EIH × EL) to 1.00 (EW × YIELD) for the triple hybrid DKB566, and from –0.06 (NR × WHG) to 0.99 (EW × YIELD) for the double hybrid DKB747. Variability in the magnitude and direction of the correlation coefficients was also observed in 91 pairs of traits in single, triple and double corn hybrids evaluated in one growing season (Cargnelutti Filho et al., 2010), in 210 pairs of traits in crambe (Cargnelutti Filho et al., 2011) and in 210 pairs of traits in castor bean hybrids (Cargnelutti Filho et al., 2012).

Table 1
Estimates of Pearson linear correlation coefficients between the 66 pairs of traits of the single hybrid P32R21, the triple hybrid DKB566 and the double hybrid DKB747 in the 2008/2009 growing season and the single hybrid 30F53, the triple hybrid DKB566 and the double hybrid DKB747 in the 2009/2010 growing season

In general, weaker correlations between pairs of traits showed larger amplitudes between hybrids and crops, as can be observed for correlations between EIH and EW (–0.12 ≤ r ≤ 0.37), EIH and EL (–0.20 ≤ r ≤ 0.30), EIH and WHG (–0.14 ≤ r ≤ 0.29), EIH and NGE (–0.09 ≤ r ≤ 0.34), EIH and YIELD (–0.13 ≤ r ≤ 0.38), NR and WHG (–0.20 ≤ r ≤ 0.11) and between CD and GL (–0.06 ≤ r ≤ 0.31) (Table 1). Conversely, higher correlations between pairs of traits showed smaller fluctuations between hybrids and crops, as can be seen, for example, for the correlations between ED and NGE (0.75 ≤ r ≤ 0.85), EW and ED (0.81 ≤ r ≤ 0.86), ED and YIELD (0.81 ≤ r ≤ 0.86), EW and NGE (0.90 ≤ r ≤ 0.95), NGE and YIELD (0.91 ≤ r ≤ 0.95) and between EW and YIELD (0.99 ≤ r ≤ 1.00).

The sample size required for estimation of r with ACI95% lower than or equal to 0.15 showed high variability among the 66 pairs of traits measured in the single hybrid (10 plants ≤ n ≤ 890 plants), triple hybrid (10 plants ≤ n ≤ 990 plants) and double hybrid (10 plants ≤ n ≤ 800 plants) of the 2008/2009 crop, and also in the single hybrid (from 10 to more than 1,000 plants), triple hybrid (10 plants ≤ n ≤ 825 plants) and double hybrid (10 plants ≤ n ≤ 880 plants) of the 2009/2010 crop (Table 2). The largest sample sizes were needed for the estimation of the correlation coefficient between NR and EL (445 plants ≤ n ≤ 935 plants), NR and WHG (575 plants ≤ n ≤ 775 plants), PH and CD (565 plants ≤ n ≤ 800 plants), WHG and NGE (625 plants ≤ n ≤ 845 plants), EIH and CD (565 plants ≤ n ≤ 990 plants), CD and GL (590 plants ≤ n ≤ 880 plants) and between WHG and GL (from 585 to more than 1,000 plants). These pairs of traits presented low correlation coefficients in all hybrids (single, triple and double) evaluated in 2008/2009 and 2009/2010 (–0.20 ≤ r ≤ 0.43) (Table 1).

Table 2
Sample size (number of plants) for estimation of the Pearson correlation coefficient of 66 pairs of traits measured in the single hybrid P32R21, the triple hybrid DKB566 and the double hybrid DKB747 in the 2008/2009 growing season and in the single hybrid 30F53, the triple hybrid DKB566 and the double hybrid DKB747 in the 2009/2010 growing season for the range of the 95% confidence interval of 0.15

Smaller sample sizes were required for estimation of r between traits with higher correlation, for example, between EW and ED (40 plants ≤ n ≤ 100 plants), ED and YIELD (45 plants ≤ n ≤ 100 plants), EW and EL (25 plants ≤ n ≤ 120 plants) and between EL and YIELD (30 plants ≤ n ≤ 145 plants) (Table 2), which showed correlation coefficients in the range of 0.77 ≤ r ≤ 0.92 (Table 1). Even smaller sample sizes were needed for the estimation of r between the traits EW and NGE (15 plants ≤ n ≤ 40 plants), NGE and YIELD (10 plants ≤ n ≤ 30 plants), and between EW and YIELD, which required only 10 plants to estimate r with ACI95% less than or equal to 0.15, regardless of the hybrid and the crop (Table 2). These pairs of traits showed the highest correlations (Table 1): EW and NGE (0.90 ≤ r ≤ 0.95), NGE and YIELD (0.91 ≤ r ≤ 0.95) and EW and YIELD (0.99 ≤ r ≤ 1.00). Thus, at a given level of accuracy, the higher the correlation between two traits, the smaller the sample size required to estimate it, and vice versa, as already observed in previous studies (Bonett & Wright, 2000; Cargnelutti Filho et al., 2010, 2011, 2012; Shieh, 2010).

To estimate r for each of the 66 pairs of traits with maximum ACI95% of 0.25, it would be necessary to measure 320, 365 and 295 plants, respectively, in the single hybrid P32R21, the triple hybrid DKB566 and the double hybrid DKB747 in 2008/2009, and 375, 315 and 310 plants, respectively, in the single hybrid 30F53, the triple hybrid DKB566 and the double hybrid DKB747 in 2009/2010 (Table 3). Thus, regardless of the hybrid, the crop and the pair of traits, measuring 375 plants is recommended for estimating r with maximum ACI95% of 0.25.

Table 3
Sample size (number of plants) for estimation of the Pearson correlation coefficient of 66 pairs of traits measured in the single hybrid P32R21, the triple hybrid DKB566 and the double hybrid DKB747 in the 2008/2009 growing season and in the single hybrid 30F53, the triple hybrid DKB566 and the double hybrid DKB747 in the 2009/2010 growing season for the range of the 95% confidence interval of 0.25

To estimate r for each of the 66 pairs of traits with maximum ACI95% of 0.35, it would be necessary to measure 165, 195 and 155 plants, respectively, in the single, triple and double hybrids in 2008/2009, and 190, 165 and 160 plants, respectively, in the single, triple and double hybrids in 2009/2010 (Table 4). In turn, measuring 100, 110 and 100 plants, respectively, in the single, triple and double hybrids in 2008/2009, and 120, 100 and 100 plants, respectively, in the single, triple and double hybrids in 2009/2010 would be sufficient to estimate the correlation coefficient for each of the 66 pairs of traits with maximum ACI95% of 0.45 (Table 5).

Table 4
Sample size (number of plants) for estimation of the Pearson correlation coefficient of 66 pairs of traits measured in the single hybrid P32R21, the triple hybrid DKB566 and the double hybrid DKB747 in the 2008/2009 growing season and in the single hybrid 30F53, the triple hybrid DKB566 and the double hybrid DKB747 in the 2009/2010 growing season for the range of the 95% confidence interval of 0.35

Table 5
Sample size (number of plants) for estimation of the Pearson correlation coefficient of 66 pairs of traits measured in the single hybrid P32R21, the triple hybrid DKB566 and the double hybrid DKB747 in the 2008/2009 growing season and in the single hybrid 30F53, the triple hybrid DKB566 and the double hybrid DKB747 in the 2009/2010 growing season for the range of the 95% confidence interval of 0.45

For the estimation of r in 91 pairs of traits in corn with maximum ACI95% of 0.30, Cargnelutti Filho et al. (2010) recommended the measurement of up to 300 plants, depending on the hybrid and the pair of traits. In crambe, Cargnelutti Filho et al. (2011) found that for the estimation of r with maximum ACI95% of 0.15, the sample size ranged between 8 and 665 plants according to the pair of traits considered. In addition, in 210 pairs of traits of two castor bean hybrids, Cargnelutti Filho et al. (2012) reported that 96 plants were sufficient for the estimation of r with maximum ACI95% of 0.52. The authors also verified that for the estimation of r with ACI95% of 0.20, the sample size varied between 10 and 661 plants, depending on the pair of traits.

According to Bonett & Wright (2000), estimating r through the Fisher confidence interval with ACI95% of 0.10 requires a sample size of n = 1,507 and n = 63 observations, respectively, for low (r = 0.10) and high (r = 0.90) correlation coefficients. Sample sizes of n = 168 and n = 13 observations would be sufficient, according to the authors, for estimation of these coefficients with ACI95% of 0.30. According to Shieh (2010), the use of larger sample sizes reduced the bias and the root mean square error associated with the estimates of r. The author found that higher root mean square errors are associated with correlation coefficients of small magnitude and lower root mean square errors with correlation coefficients of high magnitude, whether positive or negative.
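The dependence of the required sample size on the magnitude of r can be illustrated with the Fisher z confidence interval. The sketch below searches directly for the smallest n whose interval width meets a target; this direct search is an assumption made for illustration and is not necessarily the procedure used by Bonett & Wright (2000), so the results should be close to, but need not coincide exactly with, the published figures. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_ci_width(r, n, confidence=0.95):
    """Width of the Fisher z confidence interval for a correlation r with n observations."""
    z = math.atanh(r)                                   # Fisher z transformation
    c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # standard normal critical value
    half = c / math.sqrt(n - 3)                         # half-width on the z scale
    return math.tanh(z + half) - math.tanh(z - half)    # back-transformed limits


def fisher_sample_size(r, max_width, confidence=0.95, n_max=1_000_000):
    """Smallest n (>= 4) whose Fisher interval width is <= max_width, or None."""
    for n in range(4, n_max + 1):
        if fisher_ci_width(r, n, confidence) <= max_width:
            return n
    return None


for width in (0.10, 0.30):
    for r in (0.10, 0.90):
        print(f"r={r:.2f}  width<={width:.2f}  n={fisher_sample_size(r, width)}")
```

Because the interval narrows monotonically as n increases, the first n that satisfies the target is the minimum. The weakly correlated case needs a far larger n than the strongly correlated one at the same width, which matches the pattern observed in this study.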

In this study, regardless of the hybrid, the crop and the pair of traits, 375, 195 and 120 plants were sufficient, respectively, for the estimation of r with maximum ACI95% of 0.25, 0.35 and 0.45 (Tables 3, 4 and 5). Thus, in an experiment with five treatments and four replications (20 plots in total) in which ten plants per plot are evaluated (200 plants in total), r can be estimated with maximum ACI95% of 0.35, provided that the effects of treatments and local control are removed. If, however, six plants per plot are evaluated (120 plants in total), the r of each pair of traits can be estimated with maximum ACI95% of 0.45, again provided that the effects of treatments and local control are removed. If a researcher seeks to estimate the r of 66 pairs of traits with maximum ACI95% of 0.45 within each treatment, using four replications, 30 plants per replication must be evaluated (120 plants per treatment), provided that the effect of local control is removed before the estimation of r.
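The planning arithmetic in this paragraph can be sketched as follows. The recommended totals are those reported in this study; the function names are hypothetical, and the sketch assumes, as the text does, that treatment and local control effects are removed before r is estimated.

```python
import math

# Plants recommended in this study for each maximum ACI95%.
RECOMMENDED_PLANTS = {0.25: 375, 0.35: 195, 0.45: 120}


def plants_per_plot(max_aci, n_plots):
    """Plants to evaluate in each plot so that the total reaches the recommendation."""
    return math.ceil(RECOMMENDED_PLANTS[max_aci] / n_plots)


def achievable_aci(total_plants):
    """Tightest tabulated maximum ACI95% covered by total_plants, or None if none is."""
    feasible = [aci for aci, n in RECOMMENDED_PLANTS.items() if n <= total_plants]
    return min(feasible) if feasible else None


n_plots = 5 * 4  # five treatments x four replications
print(achievable_aci(n_plots * 10))   # ten plants per plot
print(achievable_aci(n_plots * 6))    # six plants per plot
print(plants_per_plot(0.45, 4))       # within one treatment, four replications
```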

4 CONCLUSION

The sample size varies among different hybrids, crops and pairs of traits. Larger sample size is needed to estimate the correlation coefficient between weakly correlated traits and smaller sample size is needed to estimate the correlation coefficient between highly correlated traits.

Independently of hybrid, crop and pairs of traits, 375, 195 and 120 plants are sufficient, respectively, to estimate the correlation coefficients with maximum ACI95% of 0.25, 0.35 and 0.45.

ACKNOWLEDGEMENTS

We thank the National Council for Scientific and Technological Development (CNPq) and the Coordination for the Improvement of Higher Education Personnel (CAPES) for the scholarships, and the fellows and volunteers for their help in conducting the experiments and collecting the data.

REFERENCES

  • Bonett, D. G., & Wright, T. A. (2000). Sample size requirements for estimating Pearson, Kendall and Spearman correlations. Psychometrika, 65, 23-28. http://dx.doi.org/10.1007/BF02294183
  • Cargnelutti Filho, A., Toebe, M., Burin, C., Silveira, T. R., & Casarotto, G. (2010). Tamanho de amostra para estimação do coeficiente de correlação linear de Pearson entre caracteres de milho. Pesquisa Agropecuaria Brasileira, 45, 1363-1371. Retrieved from http://www.scielo.br/pdf/pab/v45n12/v45n12a05.pdf
  • Cargnelutti Filho, A., Lopes, S. J., Toebe, M., Silveira, T. R., & Schwantes, I. A. (2011). Tamanho de amostra para estimação do coeficiente de correlação de Pearson entre caracteres de Crambe abyssinica. Revista Ciência Agronômica, 42, 149-158. http://dx.doi.org/10.1590/S1806-66902011000100019
  • Cargnelutti Filho, A., Lopes, S. J., Brum, B., Toebe, M., Silveira, T. R., & Casarotto, G. (2012). Tamanho de amostra para a estimação do coeficiente de correlação linear de Pearson entre caracteres de mamoneira. Semina. Ciências Agrárias, 33, 953-962. http://dx.doi.org/10.5433/1679-0359.2012v33n3p953
  • Cruz, C. D., & Regazzi, A. J. (1997). Modelos biométricos aplicados ao melhoramento genético (2. ed.). Viçosa: UFV. 390 p.
  • Duvick, D. N. (2005). The contribution of breeding to yield advances in maize (Zea mays L.). Advances in Agronomy, 86, 83-145. http://dx.doi.org/10.1016/S0065-2113(05)86002-X
  • Fancelli, A. L., & Dourado Neto, D. (2004). Produção de milho (2. ed.). Guaíba: Agropecuária. 360 p.
  • Ferreira, D. F. (2009). Estatística básica (2. ed.). Lavras: UFLA. 664 p.
  • Food and Agriculture Organization of the United Nations – FAO. (2014). Retrieved from http://statistics.amis-outlook.org/data/index.html#DOWNLOAD
  • R Development Core Team. (2014). R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Retrieved from http://www.R-project.org
  • Shieh, G. (2010). Estimation of the simple correlation coefficient. Behavior Research Methods, 42, 906-917. http://dx.doi.org/10.3758/BRM.42.4.906. PMid:21139158

Publication Dates

  • Publication in this collection: Mar 2015

History

  • Received: 19 Sept 2014
  • Accepted: 08 Oct 2014
Instituto Agronômico de Campinas Avenida Barão de Itapura, 1481, 13020-902, Tel.: +55 19 2137-0653, Fax: +55 19 2137-0666 - Campinas - SP - Brazil
E-mail: bragantia@iac.sp.gov.br