Specific residue: application of orthogonal contrasts when heteroscedasticity is present

When experimental data are submitted to analysis of variance, the assumption of homoscedasticity (homogeneity of variances among treatments) associated with the adopted mathematical model must be satisfied. This verification is necessary to ensure that the tests applied in the analysis are correct. When homoscedasticity is not observed, the resulting errors may invalidate the analysis. An alternative to overcome this difficulty is the specific residue analysis, which consists of decomposing the residual sum of squares into its components in order to adequately test the corresponding orthogonal contrasts of interest between treatment means. Although the decomposition of the residual sum of squares is seldom used, it is useful for a better understanding of the nature of the residual mean square and for validating the tests to be applied. The objective of this review is to illustrate the application of specific residues as a valid and adequate alternative for analyzing data from completely randomized and randomized complete block design experiments in the presence of heteroscedasticity.


Introduction
The analysis of variance of experimental data requires that the assumption of homoscedasticity (similar variances among treatments), associated with the adopted mathematical model, is satisfied. This verification is necessary for the significance tests to be applied correctly. When this condition is not met, heteroscedasticity (variance heterogeneity) prevails.
Heteroscedasticity can be classified as regular or irregular, according to Steel and Torrie (1981), based on Cochran (1947). The regular type generally originates from non-normality of the data and from some type of relationship between treatment means and variances. In this case, the data may be transformed to stabilize the variances among treatments and, as a consequence, the errors will approximately follow a normal distribution. The irregular type is characterized by certain treatments showing significantly higher variability than others, without necessarily presenting a relationship between means and variances. In this case, Cochran and Cox (1957, 1971) recommended that the treatments with high variability be omitted, or that the treatments be subdivided into homoscedastic groups, that is, groups with similar variances; or yet, that the residual sum of squares (SSResidual) be subdivided into components applicable to the several comparisons of interest, thus obtaining specific residues.
When an analysis of variance is performed, the treatment sum of squares (SSTreatment) can be decomposed into components corresponding to orthogonal contrasts; in the same way, the residual sum of squares (SSResidual) can be decomposed into components associated with those orthogonal contrasts, giving rise to the specific residues that are appropriate to test each contrast between treatment means. The decomposition of SSResidual is not as usual a procedure as the decomposition of SSTreatment but, according to Cochran and Cox (1957, 1971), it can be applied when there are reasons to suspect the presence of irregular heteroscedasticity. In this case, the SSResidual decomposition is useful to better understand the nature of the residual mean square (MSResidual) and to validate the tests to be applied.
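As an illustration of the decomposition of SSTreatment into orthogonal contrasts, the following minimal Python sketch checks numerically that the single-degree-of-freedom contrast sums of squares add up to the treatment sum of squares in a balanced design. The treatment means, the number of replications and the Helmert-type contrasts are hypothetical values chosen only for the example; they are not data from any experiment discussed here.

```python
def contrast_ss(means, coeffs, n_reps):
    """Sum of squares (1 df) of one contrast among treatment means."""
    estimate = sum(c * m for c, m in zip(coeffs, means))
    return n_reps * estimate ** 2 / sum(c * c for c in coeffs)


def treatment_ss(means, n_reps):
    """Treatment sum of squares for a balanced design."""
    grand = sum(means) / len(means)
    return n_reps * sum((m - grand) ** 2 for m in means)


# Hypothetical treatment means (4 treatments, 3 replications each).
means = [10.0, 12.0, 15.0, 19.0]
n_reps = 3

# Helmert-type contrasts: mutually orthogonal, each summing to zero.
contrasts = [
    [1, -1, 0, 0],
    [1, 1, -2, 0],
    [1, 1, 1, -3],
]

parts = [contrast_ss(means, c, n_reps) for c in contrasts]
total = treatment_ss(means, n_reps)
print(parts, sum(parts), total)
```

Any other complete set of (I-1) mutually orthogonal contrasts would give the same total; only the way the total is split among the components changes.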
Steel and Torrie (1981) presented a decomposition of the residual sum of squares (SSResidual) for experimental data from a randomized complete block design: they first established a set of orthogonal contrasts among treatments and then obtained the value of each contrast within each block. The authors concluded that, if the randomized complete block design is valid, any comparison within a block is not influenced by the general level of that block. As a consequence, the variance of any comparison within blocks is appropriate to test the corresponding contrast between treatment means. The procedure was illustrated numerically.
When a group of experiments is considered, the presence of heteroscedasticity among experiments influences the interaction effects involving experiments (assumed to be random effects). An appropriate alternative to analyze such experimental data is the specific residue method. To illustrate this case, Oliveira and Nogueira (2007) applied the specific residue method to sugarcane yield (t ha$^{-1}$) data obtained from a group of eleven experiments characterized by the presence of heteroscedasticity among experiments. Each experiment had a randomized incomplete block design, arranged in a $3^3$ NPK factorial (27 treatments = three blocks × nine experimental units). The confounding of two degrees of freedom of the NPK interaction with the block effects was considered. The blocks were not replicated.
The objective of this review is to illustrate the application of specific residues as an alternative procedure to analyze data showing heteroscedasticity among treatments.

Material and Methods
The methods, definitions and concepts on orthogonal contrasts applied to obtain specific residues can be found in Nogueira (2004). To bypass the irregular heteroscedasticity present in the experimental data of a randomized complete block design, Ferreira (1978) presented a mathematical procedure to obtain the specific residue sum of squares corresponding to the appropriate components for the comparisons (orthogonal contrasts) of interest, using the orthogonal transformation method. Thus, the specific residue sum of squares of the $Y_h$ component is

$$SSR(Y_h) = \frac{\sum_{j=1}^{J} \left(\hat{Y}_{hj} - \hat{Y}_h\right)^2}{\sum_{i=1}^{I} c_{hi}^2},$$

with $(J-1)$ degrees of freedom, where $\hat{Y}_{hj} = \sum_{i=1}^{I} c_{hi}\, y_{ij}$ is the estimate of the $Y_h$ contrast applied within block $j$, for $j = 1, ..., J$; $I$ is the total number of treatments, for $i = 1, ..., I$; $c_{hi}$ is the coefficient associated with the $i$-th treatment mean in the $h$-th contrast; $\hat{Y}_h = \sum_{i=1}^{I} c_{hi}\, \bar{y}_i$ is the $h$-th contrast estimate, for $h = 1, ..., (I-1)$; and $y_{ij}$ is the observed value of treatment $i$ in block $j$. The sum

$$\sum_{h=1}^{I-1} SSR(Y_h) = SSResidual$$

has $(I-1)(J-1)$ degrees of freedom, and the residual mean square for $Y_h$ is

$$MSR(Y_h) = \frac{SSR(Y_h)}{J-1}.$$

Thus, the hypothesis $H_0: Y_h = 0$ vs. $H_a: Y_h \neq 0$, for $h = 1, ..., (I-1)$, is tested by the application of the F test,

$$F_h = \frac{MS(Y_h)}{MSR(Y_h)},$$

where $MS(Y_h)$ is the mean square of the $Y_h$ component, with one degree of freedom, obtained as follows:

$$MS(Y_h) = \frac{J\, \hat{Y}_h^2}{\sum_{i=1}^{I} c_{hi}^2}.$$

In the case of a completely randomized design experiment in the presence of irregular heteroscedasticity, SSResidual is decomposed into specific residues as shown by Nogueira (1984) and Nogueira and Campos (1985). These authors developed the decomposition of SSResidual, presented the appropriate specific residues to test each contrast, and identified the specific residue sum of squares corresponding to the $Y_h$ component, $SSR(Y_h)$. Its expression was obtained by applying the mathematical expectation (E) to the $SSR(Y_h)$ of the randomized complete block design experiment, as follows:

$$SSR(Y_h) = \frac{\sum_{i=1}^{I} c_{hi}^2\, SST_i}{\sum_{i=1}^{I} c_{hi}^2}, \quad \text{for } h = 1, ..., (I-1),$$

with $(J-1)$ degrees of freedom, and

$$SST_i = \sum_{j=1}^{J} \left(y_{ij} - \bar{y}_i\right)^2 = (J-1)\, S_i^2,$$

where $SST_i$ is the $i$-th treatment sum of squares. Thus, the residual mean square for $Y_h$ is given by

$$MSR(Y_h) = \frac{SSR(Y_h)}{J-1} = \frac{\sum_{i=1}^{I} c_{hi}^2\, S_i^2}{\sum_{i=1}^{I} c_{hi}^2},$$

with $n_h$ degrees of freedom, obtained by the application of the Satterthwaite (1941, 1946) formula:

$$n_h = \frac{\left(\sum_{i=1}^{I} c_{hi}^2\, S_i^2\right)^2}{\sum_{i=1}^{I} \dfrac{\left(c_{hi}^2\, S_i^2\right)^2}{J-1}}.$$

Therefore, the hypotheses $H_0: Y_h = 0$ vs. $H_a: Y_h \neq 0$, for $h = 1, ..., (I-1)$, were tested by the application of the F test, and the calculated F value was obtained through the expression

$$F_h = \frac{MS(Y_h)}{MSR(Y_h)},$$

where $MS(Y_h)$ is the mean square of the $Y_h$ component, with one degree of freedom, obtained as follows:

$$MS(Y_h) = \frac{J\, \hat{Y}_h^2}{\sum_{i=1}^{I} c_{hi}^2}.$$

$F_h$ follows approximately the F distribution with one degree of freedom, referring to $MS(Y_h)$, and $n_h$ degrees of freedom, obtained by the Satterthwaite (1941, 1946) formula and referring to $MSR(Y_h)$, as verified by Nogueira (1984). The verification was accomplished through the simulation method developed by Godoi (1978), based on Box and Miller (1958), for univariate normal variables.
The Chi-square test was applied to verify the adherence of $F_h$ to the $F(1, n_h)$ distribution.
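The completely randomized design computations described above can be sketched in a few lines of Python. This is a minimal sketch, assuming a balanced design and assuming that MSR(Y_h) is the average of the within-treatment variances weighted by the squared contrast coefficients, with the effective degrees of freedom n_h given by the Satterthwaite approximation. The function names `specific_residue_crd` and `f_value` and the numeric inputs are hypothetical and serve only as an example.

```python
def specific_residue_crd(coeffs, variances, n_reps):
    """Return (MSR(Y_h), n_h) for one contrast in a balanced CRD.

    coeffs    -- contrast coefficients c_hi, one per treatment
    variances -- within-treatment variance estimates S_i^2
    n_reps    -- replications per treatment (J)
    """
    weighted = [c * c * s2 for c, s2 in zip(coeffs, variances)]
    msr = sum(weighted) / sum(c * c for c in coeffs)
    # Satterthwaite approximation: each S_i^2 carries (J - 1) df.
    n_h = sum(weighted) ** 2 / sum(w * w / (n_reps - 1) for w in weighted)
    return msr, n_h


def f_value(coeffs, means, variances, n_reps):
    """Return (F_h, n_h), where F_h = MS(Y_h) / MSR(Y_h)."""
    estimate = sum(c * m for c, m in zip(coeffs, means))
    ms_contrast = n_reps * estimate ** 2 / sum(c * c for c in coeffs)
    msr, n_h = specific_residue_crd(coeffs, variances, n_reps)
    return ms_contrast / msr, n_h


# Hypothetical example: 4 treatments, 4 replications, unequal variances.
means = [10.0, 12.0, 15.0, 19.0]
variances = [1.0, 1.0, 9.0, 9.0]
print(f_value([1, -1, 0, 0], means, variances, 4))
print(f_value([1, 1, -1, -1], means, variances, 4))
```

In this sketch a contrast involving only the low-variance treatments is tested against a smaller residual than one involving the high-variance treatments. When all variances are equal and every coefficient is non-zero with the same absolute value, n_h reaches the full I(J-1) residual degrees of freedom.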

Completely randomized design
The experimental data shown in Table 1, cited by Nogueira (1984), refer to sorghum total dry matter yield, first cropping (g per pot), obtained from a completely randomized design experiment with eight treatments and four replications, so that, for each treatment, the total is $T_i = \sum_{j=1}^{4} y_{ij}$, the sum of squares of deviations from the mean is $SST_i = \sum_{j=1}^{4} (y_{ij} - \bar{y}_i)^2$, and the variance estimate is $S_i^2 = SST_i/(4-1)$, with $(4-1)$ degrees of freedom, where $y_{ij}$ is the observed value (g per pot) of the $i$-th treatment in the $j$-th replication.

Table 1 - Sorghum plant total dry matter yields (g per pot), mean deviation sum of squares and variance estimate for each treatment (eight treatments, average of four replications).
The results of the preliminary analysis of variance are presented in Table 2. The seven degrees of freedom and the sum of squares for treatments were decomposed according to the following set of orthogonal contrasts of interest: $Y_1$: control treatments versus located and incorporated P-rates; $Y_2$: among controls; $Y_3$: located versus incorporated P-rates; $Y_4$: linear effect of located P-rates; $Y_5$: quadratic effect of located P-rates; $Y_6$: linear effect of incorporated P-rates; $Y_7$: quadratic effect of incorporated P-rates.
Contrasts $Y_4$ and $Y_5$ provided the located-P treatment effect, and contrasts $Y_6$ and $Y_7$ the incorporated-P treatment effect. The coefficients of the applied contrasts and some results are shown in Table 3. As the P-rates are not equidistant, the coefficients attributed to the $Y_4$, $Y_5$, $Y_6$ and $Y_7$ contrasts were obtained using the orthogonal polynomial coefficient procedure for non-equidistant levels developed by Nogueira (1978) and cited by Nogueira (2007). The new analysis of variance, with F test results obtained without specific residue application, is presented in Table 4.
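To give an idea of how linear and quadratic coefficients can be obtained for non-equidistant levels, the sketch below applies a generic Gram-Schmidt orthogonalization to the powers of the levels. This is a standard technique used here for illustration; it is not necessarily the procedure of Nogueira (1978), and the levels in the example are hypothetical, not the P-rates of the sorghum experiment. Coefficients obtained this way are defined up to a scale factor.

```python
def orthogonal_polynomial_contrasts(levels, degree):
    """Gram-Schmidt on 1, x, x^2, ... evaluated at the given levels.

    Returns one coefficient list per polynomial degree 1..degree.
    """
    basis = [[1.0] * len(levels)]  # degree 0 (constant), not a contrast
    for d in range(1, degree + 1):
        vec = [x ** d for x in levels]
        # Remove the projection onto every lower-degree vector.
        for b in basis:
            proj = sum(v * w for v, w in zip(vec, b)) / sum(w * w for w in b)
            vec = [v - proj * w for v, w in zip(vec, b)]
        basis.append(vec)
    return basis[1:]


# Hypothetical non-equidistant rates (not the rates of the experiment).
levels = [0.0, 50.0, 150.0]
linear, quadratic = orthogonal_polynomial_contrasts(levels, 2)
print(linear)     # proportional to (-4, -1, 5)
print(quadratic)  # proportional to (2, -3, 1)
```

For equally spaced levels the same function reproduces, up to scale, the tabulated orthogonal polynomial coefficients.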
If the homoscedasticity assumption of the model is satisfied, that is, if it is possible to consider that statistically $S_1^2 = S_2^2 = S_3^2 = \cdots = S_8^2 = MSResidual$, the analysis presented in Table 4 is perfectly valid.
In order to verify the homoscedasticity of the experimental data, the Bartlett test was applied (among other tests), which is appropriate to test the following hypotheses: $H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_8^2$ vs. $H_a$: at least two variances differ. $H_0$ was rejected (p-value < 0.005), evidencing significant differences among the variances of replications within treatments and characterizing the presence of heteroscedasticity. Once heteroscedasticity was evidenced, a procedure should be applied to overcome this situation. One alternative was the use of the specific residue as the F test denominator to test each contrast defined in Table 3. This procedure consisted of the decomposition of all residual degrees of freedom (24) and, consequently, of the residual sum of squares, obtaining the specific residue for each contrast:

$$SSR(Y_h) = \frac{\sum_{i=1}^{8} c_{hi}^2\, SST_i}{\sum_{i=1}^{8} c_{hi}^2}, \quad \text{for } h = 1, ..., (8-1),$$

with $(4-1)$ degrees of freedom, and

$$MSR(Y_h) = \frac{\sum_{i=1}^{8} c_{hi}^2\, S_i^2}{\sum_{i=1}^{8} c_{hi}^2},$$

with $n_h$ degrees of freedom obtained through the application of the Satterthwaite (1941, 1946) formula,

$$n_h = \frac{\left(\sum_{i=1}^{8} c_{hi}^2\, S_i^2\right)^2}{\sum_{i=1}^{8} \dfrac{\left(c_{hi}^2\, S_i^2\right)^2}{4-1}};$$

and

$$SSR(\text{among replications}) = \frac{(4-1) \sum_{i=1}^{8} S_i^2}{8},$$

with $(4-1)$ degrees of freedom, and

$$MSR(\text{among replications}) = \frac{\sum_{i=1}^{8} S_i^2}{8}.$$

Note: $\bar{y}_i$ is the $i$-th treatment mean; $T_i$, total of the $i$-th treatment.

Thus, the hypotheses $H_0: Y_h = 0$ vs. $H_a: Y_h \neq 0$, for $h = 1, ..., (8-1)$, were tested by the application of the F test, $F_h = MS(Y_h)/MSR(Y_h)$, where $F_h$ approximately follows the $F(1, n_h)$ distribution, as observed by Nogueira (1984). Results are shown in Table 5, where the values in [ ], found in the DF (degrees of freedom) column, refer to the effective degrees of freedom $n_h$, obtained by the Satterthwaite formula and applied in the F test.

The F test values presented in Table 4 were obtained with MSResidual as the denominator, with 24 degrees of freedom. The results presented in Tables 4 and 5 differ, as do some of the conclusions. This difference is a consequence of heteroscedasticity: in Table 4, MSResidual corresponds to the arithmetic mean of the $MSR(Y_h)$ values, whereas in Table 5 the values obtained for $MSR(Y_h)$ differed from one another. In the presence of homoscedasticity, the values obtained for $MSR(Y_h)$ are very close to MSResidual. The procedure proved to be an interesting alternative when irregular heteroscedasticity is present, providing trustworthy results.
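The statement that MSResidual corresponds to the arithmetic mean of the MSR(Y_h) values can be checked numerically. The sketch below is only an illustration under stated assumptions: a balanced completely randomized design, MSR(Y_h) computed as the average of the within-treatment variances weighted by the squared contrast coefficients, a complete set of Helmert-type orthogonal contrasts, and hypothetical variances that are not those of the sorghum experiment.

```python
def helmert_contrasts(k):
    """k - 1 mutually orthogonal contrasts among k treatments."""
    rows = []
    for h in range(1, k):
        rows.append([1.0] * h + [-float(h)] + [0.0] * (k - h - 1))
    return rows


def msr(coeffs, variances):
    """Specific residual mean square: sum(c^2 * S^2) / sum(c^2)."""
    return (sum(c * c * s2 for c, s2 in zip(coeffs, variances))
            / sum(c * c for c in coeffs))


# Hypothetical within-treatment variances, deliberately heterogeneous.
variances = [0.5, 0.8, 1.1, 6.0, 9.5]
ms_residual = sum(variances) / len(variances)  # pooled, balanced design

specific = [msr(c, variances) for c in helmert_contrasts(len(variances))]
mean_specific = sum(specific) / len(specific)
print(specific)
print(mean_specific, ms_residual)
```

The individual specific residues spread widely around MSResidual, which is why a single pooled denominator can be too large for some contrasts and too small for others, while their average still equals the pooled value.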

Randomized complete block design
In order to illustrate the application of the specific residue procedure to the analysis of data from a randomized complete block design experiment, the following experimental data were considered: yields of eight potato varieties (t ha$^{-1}$) distributed in five blocks (Table 6).
The Bartlett test was applied to verify the variance homogeneity hypothesis, which was rejected, thus evidencing the presence of variance heterogeneity among treatments. Because of this, and considering that the experimental errors followed a normal distribution, the specific residue procedure was applied as an alternative for the analysis of these data. The initial analysis of variance is shown in Table 7.
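For reference, the Bartlett statistic for k groups can be computed as in the minimal sketch below, which uses the usual textbook form of the test (pooled variance, log-variance comparison and correction factor). The function name and the data in the example are hypothetical. The returned statistic is to be compared with a chi-square distribution with k-1 degrees of freedom.

```python
import math


def bartlett_statistic(samples):
    """Bartlett's chi-square statistic and its degrees of freedom (k - 1)."""
    k = len(samples)
    dfs = [len(s) - 1 for s in samples]
    variances = []
    for s in samples:
        mean = sum(s) / len(s)
        variances.append(sum((x - mean) ** 2 for x in s) / (len(s) - 1))
    df_total = sum(dfs)
    pooled = sum(d * v for d, v in zip(dfs, variances)) / df_total
    numerator = df_total * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dfs, variances))
    correction = 1 + (sum(1 / d for d in dfs) - 1 / df_total) / (3 * (k - 1))
    return numerator / correction, k - 1


# Hypothetical groups: similar spread in the first two, larger in the third.
groups = [[20.1, 21.4, 19.8, 20.9],
          [25.0, 24.2, 26.1, 25.3],
          [30.5, 41.2, 22.7, 36.0]]
statistic, df = bartlett_statistic(groups)
print(statistic, df)
```

The test is known to be sensitive to departures from normality, so a significant result is usually interpreted together with an inspection of the within-treatment variances.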
The orthogonal contrasts $Y_2$, $Y_3$, $Y_4$ and $Y_5$ provided the high productivity variety effect, with four degrees of freedom, and the contrasts $Y_6$ and $Y_7$ provided the low productivity variety effect, with two degrees of freedom. The coefficients of the applied contrasts, the contrast estimates and the sums of squares obtained are shown in Table 8.
The twenty-eight residual degrees of freedom and the residual sum of squares were decomposed according to the $Y_h$ components, resulting in the $Y_h$ specific residues given by:

$$SSR(Y_h) = \frac{\sum_{j=1}^{5} \left(\hat{Y}_{hj} - \hat{Y}_h\right)^2}{\sum_{i=1}^{8} c_{hi}^2},$$

with $(5-1) = 4$ degrees of freedom, where $\hat{Y}_{hj} = \sum_{i=1}^{8} c_{hi}\, y_{ij}$ is the estimate of the $Y_h$ contrast applied in block $j$, for $j = 1, ..., J = 5$; $y_{ij}$ is the observed value related to variety $i$ in block $j$; $\hat{Y}_h$ is the $h$-th contrast estimate, for $h = 1, ..., (8-1) = 7$; and

$$\sum_{h=1}^{7} SSR(Y_h) = SSResidual = 348.324,$$

with $(8-1)(5-1) = 28$ degrees of freedom.
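The decomposition for the randomized complete block design can be sketched as follows. This is a minimal illustration, assuming that SSR(Y_h) is the sum of squared deviations of the within-block contrast estimates around their mean, divided by the sum of squared coefficients. The yields and the contrasts are hypothetical (3 treatments in 4 blocks), not the potato data. The sketch also checks that the specific residues add up to the usual residual sum of squares.

```python
def rcbd_specific_residues(y, contrasts):
    """SSR(Y_h) for each contrast; y[i][j] is treatment i in block j."""
    n_blocks = len(y[0])
    result = []
    for coeffs in contrasts:
        # Contrast estimate within each block: Y_hj = sum_i c_hi * y_ij
        per_block = [sum(c * row[j] for c, row in zip(coeffs, y))
                     for j in range(n_blocks)]
        mean = sum(per_block) / n_blocks  # equals the contrast of the means
        ss = sum((v - mean) ** 2 for v in per_block)
        result.append(ss / sum(c * c for c in coeffs))
    return result


def rcbd_residual_ss(y):
    """Usual residual (treatment x block) sum of squares."""
    n_trt, n_blocks = len(y), len(y[0])
    grand = sum(map(sum, y)) / (n_trt * n_blocks)
    trt_means = [sum(row) / n_blocks for row in y]
    blk_means = [sum(row[j] for row in y) / n_trt for j in range(n_blocks)]
    return sum((y[i][j] - trt_means[i] - blk_means[j] + grand) ** 2
               for i in range(n_trt) for j in range(n_blocks))


# Hypothetical yields: 3 treatments (rows) by 4 blocks (columns).
y = [[20.1, 22.3, 19.8, 21.0],
     [25.4, 24.9, 27.2, 26.1],
     [30.2, 35.8, 28.4, 33.9]]
contrasts = [[1, -1, 0], [1, 1, -2]]

ssr = rcbd_specific_residues(y, contrasts)
print(ssr, sum(ssr), rcbd_residual_ss(y))
```

Each SSR(Y_h) carries J-1 degrees of freedom (3 in this example), so dividing by J-1 gives the specific residual mean square used as the denominator of the F test for the corresponding contrast.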
Thus, the hypotheses $H_0: Y_h = 0$ vs. $H_a: Y_h \neq 0$, for $h = 1, ..., (8-1)$, were then tested by the application of the F test, $F_h = MS(Y_h)/MSR(Y_h)$, with $MSR(Y_h) = SSR(Y_h)/(5-1)$. The analysis of variance obtained with the application of the specific residue procedure is presented in Table 11. Significant F values were observed for the $Y_1$ and $Y_4$ contrasts, evidencing that they differ from zero. The analysis of variance without the specific residue procedure was also obtained (Table 12) in order to be compared with the previous analysis (Table 11). When calculated with MSResidual as the denominator, with 28 degrees of freedom, a significant F value was obtained only for the $Y_1$ contrast, evidencing that it differed significantly from zero. When the specific residue procedure was applied (Table 11), significant F values were obtained for both the $Y_1$ and $Y_4$ contrasts.

Sci. Agric. (Piracicaba, Braz.), v.67, n.1, p.117-125, January/February 2010

Conclusion
The use of the specific residue procedure is a valid and efficient alternative when heteroscedasticity is present, because it validates the applied tests and also allows a better understanding of the nature of the residual mean square. MSResidual corresponds to the arithmetic mean of the $MSR(Y_h)$ values, although the values obtained for $MSR(Y_h)$ can differ from one another. In the presence of homoscedasticity, the values obtained for $MSR(Y_h)$ are very close to those obtained for MSResidual.

In the completely randomized design, $\sum_{h=1}^{I-1} SSR(Y_h) + SSR(\text{among replications}) = SSResidual$, with $I(J-1)$ degrees of freedom, where $SSR(\text{among replications}) = (J-1) \sum_{i=1}^{I} S_i^2 / I$ is the residual sum of squares among replications, with $(J-1)$ degrees of freedom, and the residual mean square among replications is $MSR(\text{among replications}) = \sum_{i=1}^{I} S_i^2 / I$.

The values of $y_{ij}$ and the $Y_h$ coefficients for the calculation of $\hat{Y}_{hj}$ are presented in Table 9. The results for the $\hat{Y}_{hj}$ and $\hat{Y}_h$ estimates and the $SSR(Y_h)$ values are presented in Table 10.

Table 2 - Preliminary analysis of variance for the sorghum experiment. Note: DF is degrees of freedom; SS is sum of squares; MS is mean square.

Table 3 - Application of orthogonal contrasts to the sorghum experiment.

Table 4 - Analysis of variance with decomposition of the seven treatment degrees of freedom into orthogonal contrasts, without specific residue application.

Table 5 - Analysis of variance with specific residue application. Note: the values in [ ] in the DF column refer to the effective degrees of freedom $n_h$, obtained by the Satterthwaite formula and applied in the F test.

Table 7 - Analysis of variance of potato yield.

Table 8 - Coefficients of contrasts, estimates and contrast sum of squares for the potato yield experiment.

Table 9 - Observed values ($y_{ij}$) and $Y_h$ coefficients for $\hat{Y}_{hj}$ estimation.

Table 10 - Estimation of $\hat{Y}_{hj}$ and $\hat{Y}_h$ and $SSR(Y_h)$ values.

Table 11 - Analysis of variance with specific residue procedure application. Note: *significance by (0.01 < p-value ≤ 0.05); **significance by (p-value ≤ 0.01).

Table 12 - Analysis of variance without specific residue procedure application. Note: *significance by (0.01 < p-value ≤ 0.05); **significance by (p-value ≤ 0.01).