Minimum number and best combinations of harvests to evaluate accessions of tomato plants from germplasm banks

This study presents the minimum number and the best combination of tomato harvests needed to compare tomato accessions from germplasm banks. Number and weight of fruit in tomato plants are important as auxiliary traits in the evaluation of germplasm banks and should be studied simultaneously with other desirable characteristics such as pest and disease resistance, improved flavor and early production. Brazilian tomato breeding programs should consider not only the number of fruit but also fruit size because Brazilian consumers value fruit that are homogeneous, large and heavy. Our experiment was a randomized block design with three replicates of 32 tomato accessions from the Vegetable Germplasm Bank (Banco de Germoplasma de Hortaliças) at the Federal University of Viçosa, Minas Gerais, Brazil, plus two control cultivars (Debora Plus and Santa Clara). Nine harvests were evaluated for four production-related traits. The results indicate that six successive harvests are sufficient to compare tomato genotypes and germplasm bank accessions. Evaluation of genotypes according to the number of fruit requires analysis from the second to the seventh harvest. Evaluation of fruit weight by genotype requires analysis from the fourth to the ninth harvest. Evaluation of both number and weight of fruit requires analysis from the second to the ninth harvest.


Introduction
The choice of parents is one of the most important stages in the development of a breeding program. To be successful, a breeding program must have clear goals, and decisions must be based on the traits to be improved, the kinds of genetic control to which these traits are subject and the sources of germplasm available (Fehr, 1987).
In tomato plants, fruit number and weight are important auxiliary traits that need to be considered for the adequate evaluation of tomato germplasm banks. Most high-productivity tomato genes are already widely used, and the search for production characteristics associated with disease and insect resistance, improved flavor and early maturation, among others, is desirable. Breeding programs should always include a fruit production component because the development of desirable traits must accompany the development of higher-productivity genotypes.
Crops such as tomato may require eight to fourteen harvests during the crop cycle and fruit production may vary, with higher fruit production normally occurring when harvests are alternated. Because the effect of harvesting may vary among genotypes within the same population, it is not possible to estimate the heritability coefficient from different harvests. This limitation arises because genotypes cannot be re-randomized between one harvest and the next, so the environmental effect is incorporated into the genotype effect. In these situations, the repeatability coefficient may be used as a measure of the accuracy of the relative superiority of one or more accessions. An estimate of the repeatability coefficient also allows determination of the minimum number of harvests necessary to assess the genotypic value of individuals based on phenotypic observations. In breeding programs with scarce economic and human resources, the estimate of the repeatability coefficient is of great interest both during the selection process and when choosing potential parents.
Analysis of variance (ANOVA) and the principal-components method are among the methods used for estimating the repeatability coefficient. Both methods have been used by several researchers and for different crops (Shimoya et al., 2002; Degenhardt et al., 2002; Ferreira et al., 1999; Dias and Kageyama, 1998). However, reports on the use of these methods for tomato are not available in the literature.
ANOVA with two factors of variation is the most adequate for this study because it excludes temporary environmental effects. In the one-factor model, environmental effects may be misinterpreted as variation within genotypes and therefore lead to underestimates of the repeatability coefficient (Cruz et al., 2004).
The principal-components method can estimate the repeatability coefficient more efficiently when the genotype displays cyclic behavior with regard to the trait being studied. Because this effect may vary in different ways and intensities among genotypes, the ANOVA that estimates the usual repeatability coefficient may not eliminate the additional component of experimental error. Consequently, the repeatability coefficient may be underestimated (Cruz et al., 2004).
Because tomato involves multiple harvests, this study estimates the minimum number of harvests necessary to evaluate performance of tomato accessions kept at the Vegetable Germplasm Bank (Banco de Germoplasma de Hortaliças) at the Federal University of Viçosa, Minas Gerais, Brazil.

Material and Methods
The site for the experiment was the Federal University of Viçosa (Universidade Federal de Viçosa, UFV), Minas Gerais, Brazil. The experiment had a randomized block design with three replications and included 32 tomato accessions from the Vegetable Germplasm Bank (Banco de Germoplasma de Hortaliças) at UFV and the cultivars Debora Plus and Santa Clara as controls. Data from nine harvests during the crop cycle included the following traits: total number of healthy fruit per plant (TNHF, i.e., without any biotic or abiotic defect or damage); total weight (g) of healthy fruit per plant (TWHF); total number of fruit (TNF); and total fruit weight (g) (TWF).
Statistical analysis was based on the model $Y_{ijk} = m + g_i + a_j + b_k + e_{ij}$, where $Y_{ijk}$ = mean of the three replicates of each of the traits evaluated for genotype i in harvest j; m = general average; $g_i$ = effect of genotype i; $a_j$ = effect of harvest j; $b_k$ = effect of block k; and $e_{ij}$ = error associated with observation $Y_{ijk}$. The model was considered random.
Estimates of the repeatability coefficient were obtained by ANOVA intra-class correlation (Cruz et al., 2004) and by the principal-components method based on the correlation and covariance matrices (Abeywardena, 1972; Rutledge, 1974).
The computer program used for all analyses was GENES (Cruz, 2001) and the individual methodologies used are described below.

ANOVA
The repeatability coefficient was estimated by means of intra-class correlation, considering the reduced statistical model based on harvest and genotype averages: $Y_{ik} = m + g_i + c_k + e_{ik}$, where $Y_{ik}$ = average value observed for the i-th genotype at the k-th harvest; m = overall average; $g_i$ = random effect of the i-th genotype under the influence of the permanent environment (i = 1, ..., p; p = 34); $c_k$ = fixed effect of the temporary environment at the k-th harvest (k = 1, ..., h; h = 9); and $e_{ik}$ = experimental error established by the temporary environmental effects on the i-th genotype at the k-th harvest.
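As a minimal sketch of the intra-class estimator for the reduced two-factor model described above, the Python snippet below assumes the standard expected mean squares of a two-factor ANOVA with one value per cell, so that the genotypic variance is (MS_G - MS_E)/h and the repeatability is the ratio of genotypic variance to genotypic plus error variance. The source does not print this estimator; the function name `repeatability_anova` and the simulated data are hypothetical, and the GENES program may differ in detail.

```python
import numpy as np

def repeatability_anova(y):
    """Intra-class repeatability from a p x h table (genotypes x harvests).

    Assumes the model Y_ik = m + g_i + c_k + e_ik with one value per cell.
    """
    y = np.asarray(y, dtype=float)
    p, h = y.shape
    grand = y.mean()
    genotype_means = y.mean(axis=1)
    harvest_means = y.mean(axis=0)

    # Sums of squares for genotypes and for the residual (interaction) term
    ss_g = h * np.sum((genotype_means - grand) ** 2)
    resid = y - genotype_means[:, None] - harvest_means[None, :] + grand
    ss_e = np.sum(resid ** 2)

    ms_g = ss_g / (p - 1)
    ms_e = ss_e / ((p - 1) * (h - 1))

    var_g = (ms_g - ms_e) / h   # genotypic variance component (assumed estimator)
    var_e = ms_e                # temporary-environment (error) variance
    return var_g / (var_g + var_e)

# Simulated illustration only: 34 genotypes, 9 harvests (not the paper's data)
rng = np.random.default_rng(0)
g = rng.normal(0, 1, size=(34, 1))    # genotype effects
c = rng.normal(0, 2, size=(1, 9))     # harvest effects
e = rng.normal(0, 2, size=(34, 9))    # residual error
r_hat = repeatability_anova(10 + g + c + e)
```

Because harvest means are removed before the error term is computed, adding any constant to a whole harvest leaves the estimate unchanged, which is the sense in which the two-factor model excludes temporary environmental effects.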

Principal-components method
The estimate of the coefficient of repeatability can be calculated either by means of a correlation matrix or by a matrix of phenotypic variances and covariances. This method is appropriate when genotypes display cyclic behavior in relation to the traits being studied.

Principal-components method -correlation matrix
A matrix of correlations between each pair of harvests, computed across genotypes, must be obtained. From this correlation matrix (R), the eigenvalues ($\lambda$) and the normalized eigenvectors ($\alpha$) are determined.
Eigenvectors whose elements have the same sign and similar magnitudes show a tendency for genotypes to maintain their relative positions across the harvests. The estimator of the repeatability coefficient is proportional to the eigenvalue associated with this eigenvector and is expressed by: $\hat{r} = \frac{\hat{\lambda}_k}{\sum_{j=1}^{h} \hat{\lambda}_j}$, where j = 1, 2, ..., h; h = number of harvests evaluated; and $\hat{\lambda}_k$ is the eigenvalue associated with the eigenvector whose elements have the same sign and similar magnitude.
Rutledge (1974) reported that $\hat{\lambda}_k$ is affected by the number of measurements for each individual. Thus, a more adequate estimator of the repeatability coefficient is obtained by: $\hat{r} = \frac{\hat{\lambda}_k - 1}{h - 1}$, which follows from $\lambda_k = 1 + (h - 1)\rho$, where $\hat{\lambda}_k$ is the eigenvalue of R associated with the eigenvector whose elements have the same sign and similar magnitude; h = number of harvests; and p = number of genotypes.
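The two correlation-matrix estimators can be sketched in Python as follows. This is an illustration under stated assumptions rather than the GENES implementation: the function names are hypothetical, the first estimator is taken to be the share of the selected eigenvalue in the sum of all eigenvalues, the second is the correction $(\lambda_k - 1)/(h - 1)$, and only the same-sign condition on the eigenvector is checked (the "similar magnitude" condition is left to visual inspection).

```python
import numpy as np

def same_sign_eigenvalue(matrix):
    """Largest eigenvalue whose eigenvector has all elements of one sign."""
    values, vectors = np.linalg.eigh(matrix)   # symmetric input, ascending order
    for k in np.argsort(values)[::-1]:         # scan from the largest eigenvalue
        v = vectors[:, k]
        if np.all(v > 0) or np.all(v < 0):
            return values[k]
    raise ValueError("no eigenvector with same-sign elements")

def repeatability_from_correlation(R):
    """Return (eigenvalue-share estimate, Rutledge-type estimate) from an h x h matrix."""
    R = np.asarray(R, dtype=float)
    h = R.shape[0]
    lam = same_sign_eigenvalue(R)
    r_share = lam / np.trace(R)            # trace(R) = h = sum of all eigenvalues
    r_rutledge = (lam - 1.0) / (h - 1.0)   # corrected for the number of harvests
    return r_share, r_rutledge

# Hypothetical example: 4 harvests, every pair of harvests correlated at 0.5
R = np.full((4, 4), 0.5)
np.fill_diagonal(R, 1.0)
r_share, r_rutledge = repeatability_from_correlation(R)

# With real data (genotypes in rows, harvests in columns), R could be built as:
# R = np.corrcoef(y, rowvar=False)
```

In this equicorrelated example the selected eigenvalue is 1 + 3(0.5) = 2.5, so the share estimate is 0.625 while the corrected estimate recovers the common correlation of 0.5, which illustrates why the correction for the number of measurements matters.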

Principal-components method -covariance matrix
The coefficient of repeatability can also be estimated by the principal-components method using the matrix of phenotypic variances and covariances ($\Gamma$).
The repeatability coefficient estimator $\hat{r}$ is obtained by: $\hat{r} = \frac{\hat{\lambda}_1 - \hat{\sigma}_Y^2}{\hat{\sigma}_Y^2 (h - 1)}$, where $\hat{\lambda}_1$ is the eigenvalue of $\hat{\Gamma}$ associated with the eigenvector whose elements have the same sign and similar magnitude; $\hat{\sigma}_Y^2 = \hat{\sigma}_g^2 + \hat{\sigma}_e^2$; and h = number of harvests.
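A minimal Python sketch of the covariance-matrix estimator follows. It assumes the form $(\lambda_1 - \sigma_Y^2) / (\sigma_Y^2 (h - 1))$ and, as a further assumption not stated in the source, estimates the phenotypic variance $\sigma_Y^2$ as the mean of the diagonal of the covariance matrix. The function name and the example matrix are hypothetical.

```python
import numpy as np

def repeatability_from_covariance(G):
    """Repeatability from an h x h phenotypic variance-covariance matrix."""
    G = np.asarray(G, dtype=float)
    h = G.shape[0]
    values, vectors = np.linalg.eigh(G)   # symmetric input, ascending eigenvalues

    # Keep eigenvalues whose eigenvectors have all elements of one sign
    same_sign = [
        values[k] for k in range(h)
        if np.all(vectors[:, k] > 0) or np.all(vectors[:, k] < 0)
    ]
    if not same_sign:
        raise ValueError("no eigenvector with same-sign elements")
    lam = max(same_sign)

    var_y = np.trace(G) / h   # assumption: mean of the harvest variances
    return (lam - var_y) / (var_y * (h - 1))

# Hypothetical example: 5 harvests, variance 4, common correlation 0.3
h, var, rho = 5, 4.0, 0.3
G = var * ((1 - rho) * np.eye(h) + rho * np.ones((h, h)))
r_hat = repeatability_from_covariance(G)

# With real data (genotypes in rows, harvests in columns), G could be built as:
# G = np.cov(y, rowvar=False)
```

For this idealized matrix the selected eigenvalue is 4(1 + 4 x 0.3) = 8.8 and the estimator returns the common correlation of 0.3, so the sketch is at least consistent with the equal-variance, equal-covariance case that motivates the formula.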

Coefficient of determination
Based on the average of harvests (h = 9) and on the estimate of the repeatability coefficient (r) obtained by one of the methods used, the coefficient of determination ($R^2$) was calculated for each characteristic. This coefficient represents the certainty in predicting the real value of the selected individuals and is given by the following expression: $R^2 = \frac{h r}{1 + r(h - 1)}$
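The relationship between repeatability, number of harvests and coefficient of determination can be checked numerically. The short Python sketch below uses a hypothetical function name and plugs in two repeatability values quoted in the Results (0.63 for TNHF with the correlation matrix and 0.21 for TWF with ANOVA), both with h = 9.

```python
def coefficient_of_determination(r, h):
    """R^2 = h*r / (1 + r*(h - 1)) for repeatability r and h harvests."""
    return h * r / (1 + r * (h - 1))

# Repeatability values quoted in the Results, nine harvests
r2_tnhf = coefficient_of_determination(0.63, 9)   # TNHF, correlation matrix
r2_twf = coefficient_of_determination(0.21, 9)    # TWF, ANOVA
print(f"TNHF: {r2_tnhf:.3f}  TWF: {r2_twf:.3f}")
```

The computed values, about 0.94 and 0.71, are in line with the 93% and 70% reported in Table 1, given that the repeatability coefficients are quoted to two decimals.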

Number of measurements
The number of measurements needed to predict the real value of genotypes based on pre-established coefficients of determination ($R^2$ = 0.80, 0.85, 0.90, 0.95 and 0.99) was obtained using the following expression: $h_m = \frac{R^2 (1 - r)}{(1 - R^2)\, r}$, where $h_m$ = number of measurements necessary to predict the real value, $R^2$ = coefficient of determination and r = repeatability coefficient obtained by applying one of the methods.
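The number of measurements is the inverse of the coefficient-of-determination relationship, solved for the number of harvests. A minimal Python sketch, assuming the expression $h_m = R^2 (1 - r) / ((1 - R^2) r)$ and using a hypothetical function name, tabulates it for the five target values of $R^2$ listed above. The result is left unrounded because the source does not state a rounding rule.

```python
def harvests_needed(r, r2_target):
    """Number of measurements h_m needed to reach a target R^2, given repeatability r."""
    return r2_target * (1 - r) / ((1 - r2_target) * r)

targets = [0.80, 0.85, 0.90, 0.95, 0.99]
r = 0.63   # repeatability quoted in the Results for TNHF (correlation matrix)

for target in targets:
    print(f"R2 = {target:.2f}: h_m = {harvests_needed(r, target):.2f}")
```

Substituting the returned value back into $R^2 = h r / (1 + r(h - 1))$ reproduces the target, which is a quick way to confirm that the two expressions are consistent with each other.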

Determination of the most appropriate harvests
Following the determination of the ideal number of harvests to predict the value of genotypes with the desired reliability, some harvests within selected harvest groups were eliminated and others were considered most appropriate for obtaining the real value of germplasm bank accessions. All combinations of six harvests were submitted to repeatability analysis by the principal-components method using the matrix of phenotypic variances and covariances to compare harvest groups based on the traits being evaluated.
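The search over harvest combinations can be sketched in Python as follows. This is an illustration with simulated data, not the authors' procedure in GENES: the function names are hypothetical, the phenotypic variance is estimated as the mean of the diagonal of the covariance matrix, and the largest eigenvalue is assumed to be the one with a same-sign eigenvector, which holds when all covariances between harvests are positive.

```python
from itertools import combinations

import numpy as np

def repeatability_cov(y):
    """Covariance-matrix repeatability for a genotypes x harvests array.

    Assumes the dominant eigenvector is the same-sign one (true when all
    covariances between harvests are positive).
    """
    G = np.cov(y, rowvar=False)
    h = G.shape[0]
    lam = np.linalg.eigvalsh(G)[-1]   # largest eigenvalue
    var_y = np.trace(G) / h           # assumption: mean of the harvest variances
    return (lam - var_y) / (var_y * (h - 1))

def rank_harvest_subsets(y, size=6):
    """Score every subset of `size` harvests and sort by R^2, best first."""
    y = np.asarray(y, dtype=float)
    ranked = []
    for subset in combinations(range(y.shape[1]), size):
        r = repeatability_cov(y[:, list(subset)])
        r2 = size * r / (1 + r * (size - 1))
        harvests = tuple(k + 1 for k in subset)   # 1-based harvest labels
        ranked.append((harvests, r, r2))
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked

# Simulated illustration only: 34 genotypes, 9 harvests (not the paper's data)
rng = np.random.default_rng(1)
y = 10 + rng.normal(0, 2, size=(34, 1)) + rng.normal(0, 1, size=(34, 9))
ranked = rank_harvest_subsets(y)
best_harvests, best_r, best_r2 = ranked[0]
```

With nine harvests there are 84 subsets of six, so an exhaustive comparison is cheap; restricting the loop to runs of consecutive harvests would be a small change if only successive harvests are of practical interest.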

Results
The effects of genotypes and harvests were significant (p < 0.01) for all variables. Trait variability was observed between genotypes evaluated in different harvests (Table 1).
Table 1 footnotes: (4) coefficient of determination estimated by ANOVA; (5) repeatability estimated by principal-components analysis using the correlation matrix; (6) coefficient of determination estimated by principal-components analysis using the correlation matrix; (7) repeatability estimated by principal-components analysis using phenotypic variance and covariance; (8) coefficient of determination estimated by principal-components analysis using phenotypic variance and covariance.
ANOVA showed greater magnitudes for the repeatability coefficients (r) for total weight of healthy fruit (TWHF, r = 0.19) and total weight of fruit (TWF, r = 0.21) and maximum determination coefficient of 70% for the TWF variable (Table 1).
Table 1 also shows that the repeatability estimate obtained by principal components with the correlation matrix was nearly twice the ANOVA estimate for TWHF. For total number of fruit (TNF) and total weight of fruit (TWF), the principal-components estimates were about four times the ANOVA estimates, and for total number of healthy fruit (TNHF) about six times.
Repeatability analysis by principal components using the matrix of phenotypic variances and covariances resulted in coefficient of determination estimates above 92%, except for the TNHF trait, which performed best with the correlation matrix, with repeatability and determination coefficients of 0.63 and 93%, respectively (Table 1). Evaluation of diversity between the tomato accessions over nine harvests gave 89.26% precision in the estimated genotypic value of accessions for TNHF and more than 92% for all other variables (Table 1). Thus, six harvests ensure that genotypes are selected with a precision of at least 85% for all the variables studied.
The minimum numbers of measurements needed to predict genotypic value at 85% reliability by the principal-components method with the phenotypic variance and covariance matrix were two harvests for TWF, three harvests for TWHF, four harvests for TNF and six harvests for TNHF (Table 2). It follows that six harvests can be considered ideal for analyzing all variables in this experiment. This number of harvests raises the precision for the other traits to about 90%.
To optimize tomato evaluation, it is necessary to know which six harvests are most important for the evaluation. Our results suggest that genotypes should be evaluated for TNHF and TNF from the second through the seventh harvest, omitting the first and the last two harvests. The best times for TWHF and TWF were the last six harvests, when the coefficients of determination were highest (Table 3).

Discussion
The low magnitudes of the repeatability coefficient and low estimates of the coefficient of determination observed in the ANOVA suggest that this method cannot be adopted to identify superior genotypes successfully. The low coefficients obtained indicate low reliability of genotypic discrimination. Estimates of the repeatability and determination coefficients are considered reasonable and likely to be adopted when above 0.5 and 80%, respectively (Shimoya et al., 2002).
Repeatability analysis of principal components based on phenotypic variance and covariance matrices allows high estimates of the coefficient of determination.
Repeatability estimates by principal components that are superior to those obtained by ANOVA indicate cyclical variation between harvests (Cruz et al., 2004), which may, indeed, have occurred considering that the harvest effect was significant.
Tomato breeding programs in Brazil should take into account the amount and size of the fruit produced because Brazilian consumers demand standardized tomatoes and value those that are heavy and large. Thus, total weight of healthy fruit (TWHF) and total weight of fruit (TWF) need to be evaluated in a tomato breeding program because they identify the most productive genotypes for commercial standards. Principal-components analysis using phenotypic variance and covariance matrices for estimating the repeatability coefficient is more appropriate because it ensures greater reliability in selecting the most productive genotypes for commercial fruit production. For all variables studied, six harvests allow for the selection of genotypes with at least 85% precision, and it seems reasonable to infer that six harvests are sufficient to evaluate tomato genotypes. It should be noted that if the objective of the breeding program is to evaluate genotypes for the number of fruit, analysis should start at the second harvest and finish at the seventh. However, if the objective of the program is to evaluate fruit weight, analysis should start at the fourth harvest and end at the ninth. For assessment of both fruit number and weight, analysis should run from the second to the ninth harvest, with a minimum precision of 88% for the coefficient of determination.
Accessions from a specific germplasm bank selected on the basis of traits that are of major interest for a breeding program (e.g., pest and disease resistance and improved flavor) can be classified by priority of use according to the auxiliary production characteristic desired (TNHF, TWHF, TNF, TWF).

Table 3 - Repeatability coefficients (r) and coefficients of determination ($R^2$) for four groups of six harvests, for total number of healthy fruit per plant (TNHF), total weight of healthy fruit per plant (TWHF), total number of fruit (TNF) and total weight of fruit (TWF) obtained from genotypic data for 34 tomato genotypes.