BONFERRONI'S AND SIDAK'S MODIFIED TESTS

Results of practical importance have been discarded when testing formulated hypotheses through statistical analysis of experimental data because of the low power of the test used. This study compares the power of two modified Bonferroni's tests and one modified Sidak's test with that of known tests, analyzing 1200 simulated experiments. All differences of means were obtained in relation to the mean of the adopted control, to guarantee the parametric magnitude of the mean differences. Student's test (type I comparisonwise error) and Waller-Duncan's test (Bayesian error) showed the highest percentage of significant differences, followed by Duncan's, BM2, SiM, BM1, DunnettU's, SiN, BN, Dunnettu's, SNK's, REGWF's, REGWQ's, Tukey's, Sidak's and Bonferroni's tests. For differences equal to zero, Student's and Waller-Duncan's tests exhibited a 5% frequency of rejection of the null hypothesis, in accordance with the nominal type I error adopted (α = 0.05). All other tests had values below 0.05, generally ranging from 0.01 to 0.02 or less. Depending on the number of zero differences and considering the type I experimentwise error, Student's, Waller-Duncan's and Duncan's tests showed increasing error values (> 0.05), proportional to the number of null differences included in the experiment; all other tests showed type I experimentwise error < 0.05, most nearing 0.01-0.02 or less. The efficiency of the three "Modified Tests" was close to that of DunnettU's test, but higher than that of the other tests of the type I experimentwise error nature (MEER).


INTRODUCTION
To expand knowledge in different areas of the experimental sciences, experiments are set up to test the hypotheses that best explain the phenomena under investigation. Statistical methods are used to design experiments, to choose groups of treatments, to perform analyses of variance, to carry out statistical tests and to estimate parameters. All those methods provide ways to support or reject the formulated hypotheses.
The tests most frequently used for comparison of treatments (means) are Student's, Duncan's, Student-Newman-Keuls' (SNK), Tukey's and Dunnett's. Bonferroni's, Sidak's and Waller-Duncan's tests are used less often. Studies have been conducted to evaluate both the power of these tests and the different types of errors involved. The power of each test is evaluated by calculating the percentage of significant differences when the adopted probability of type I error (the rejection of the null hypothesis) is α = 0.05 or 0.01.
More recently, type I errors were classified into type I comparisonwise error and type I experimentwise error, the latter comprising the type I experimentwise error under the general null hypothesis, the type I experimentwise error under a partial null hypothesis, and the maximum experimentwise error rate (MEER), which considers both situations (SAS, 2004). Both unilateral and bilateral Student's tests and Waller-Duncan's test protect against comparisonwise errors; SNK's protects against the general experimentwise error; and Tukey's, Bonferroni's, Sidak's, REGWF's, REGWQ's and Dunnett's tests control errors of the MEER type.

Bonferroni's Test
Bonferroni's test may be used to calculate confidence intervals and to compare means. In this paper, only differences were studied. The means $\bar{x}_i$ and $\bar{x}_j$ of two treatments with $r_i$ and $r_j$ replications differ significantly if: $|\bar{x}_i - \bar{x}_j| \ge t(\alpha_B, df)\, s \sqrt{1/r_i + 1/r_j}$, where df is the number of degrees of freedom of QMRes, $s = \sqrt{QM_{Res}}$, and $t(\alpha_B, df)$ is the value of the t-distribution with probability $\alpha_B$ and df degrees of freedom. If $r_i = r_j = r$, then $\sqrt{1/r_i + 1/r_j} = \sqrt{2/r}$. If a given experiment has t treatments with r replications, and α = 0.05 is the global probability for k comparisons, in which $H_0: T_1 = T_2 = \dots = T_t = 0$ is the general null hypothesis, then, for all comparisons between pairs of means, $k = C_t^2 = t(t-1)/2$, and in this case $\alpha_B = \alpha/k$.
If $H_0$ is true, the probability of not rejecting any difference is $(1-\alpha_B)^k$. If the global probability adopted is α = 0.05 or 0.01 (for t treatments), the probability of rejecting the general $H_0$ will be: $1 - (1-\alpha_B)^k \le k\alpha_B = \alpha$. The significant difference between two means is calculated by applying Student's test with $\alpha_B$ and df degrees of freedom. The test assures α = 0.05 for all tests, and the error is of the MEER experimentwise type.
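As a small numerical illustration of the adjustment described above, the following Python sketch computes the per-comparison level α/k for all pairwise comparisons among t treatments and the corresponding probability of at least one rejection under the general null hypothesis. The function names are illustrative and are not part of the original study, and the product form assumes that the k comparisons are independent, which is only an approximation for comparisons that share means.

```python
from math import comb

def bonferroni_level(alpha, t):
    """Return (alpha_B, k) for all pairwise comparisons among t treatments."""
    k = comb(t, 2)          # k = t(t-1)/2 comparisons between pairs of means
    return alpha / k, k     # alpha_B = alpha / k

def global_rejection_probability(alpha_b, k):
    """1 - (1 - alpha_B)^k, ASSUMING the k comparisons are independent."""
    return 1.0 - (1.0 - alpha_b) ** k

# Hypothetical example: 5 treatments, global alpha = 0.05.
alpha_b, k = bonferroni_level(0.05, 5)
print(k, alpha_b, global_rejection_probability(alpha_b, k))
```

Under these assumptions the global rejection probability stays slightly below the nominal α, which is the sense in which the test is conservative.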

Modified Bonferroni's Test (BM)
The modified Bonferroni's test was proposed by Conagin (1999). The Student's t for the test is obtained from Student's tables, with the degrees of freedom of the residual and probability level α', in which α' = α(1 + P), and ordinarily α = 0.05 or 0.01. The calculation of P is presented below. In the analysis of variance, the test of $H_0$ is obtained by the statistic $F_0$ = QMTreat/QMResidual, in which there are t treatments and r replications (e.g. in a randomized blocks design). The critical value of F is $F_c$, with (t - 1) and (r - 1)(t - 1) degrees of freedom, for α = 0.05 or 0.01.
The parametric model is $X_{ij} = M + T_i + B_j + E_{ij}$, in which i = 1, 2, ..., t, and j = 1, 2, ..., r. The expected values are: If the general $H_0$ is true, $T_1 = T_2 = \dots = T_t = 0$. For the sampling model (experimental), Owing to the size of the experimental error, some real treatment effects are not significant.
If $F_0 > F_c$, $H_0$ is rejected. Then there should be one or more treatments $t_e \neq 0$. The noncentrality parameter of the F distribution, if $H_a$ (the alternative hypothesis) is true, is λ (Winer et al., 1991); this value is: The evaluation of $P(F_0)$ and $P(F_c)$, if $H_a$ is true, may be obtained with the PROBF function of SAS (2004). For fixed t and r, $P = P(F_0) - P(F_c)$ is then defined. This is the probability represented by the area between $F_0$ and $F_c$ under the noncentral F distribution. If $F_c$ is fixed, the area increases as $F_0$ increases. The area is due to treatments that are far from $\bar{X}$ (the general mean) and includes the treatments significantly different from $\bar{X}$.
With α'/k, t values smaller than the t value used when $H_0$ is true are obtained. If t' < t, the efficiency of the test is increased.
The corresponding α' for this situation is defined as α'= α(1+P), and then α BM = α'/k.In this case, the t value of the , being l < t).
The modified Bonferroni's test uses Student's t test with probability α', and then $\alpha_{BM} = \alpha'/k$.
Values shown in Tables 1, 2, 5, and 6 use the PROBF function to calculate P. However, since Group G2 includes 25 treatments, it ordinarily produces $\hat{\lambda} > 100$, and the PROBF function then does not calculate $P(F_0)$ and $P(F_c)$. In this case, tables calculated by Conagin (2001) for α = 0.05, which give in each cell the P value corresponding to t, r and $F_0/F_c$, are used. If $\hat{\lambda} > 100$ and $F_0/F_c > 7$, an approximate value for P, obtained with $F_0/F_c = 7$ and $\hat{\lambda} = 100$, is used. The justification is that if $F_0/F_c > 7$, the area (and hence P) should be greater than for $F_0/F_c = 7$, and so the adopted P is a conservative value.
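The adjustment α' = α(1 + P), followed by division by the number of comparisons k, can be sketched in Python as below. This is only an illustration: P is taken as an input (in the paper it comes from the PROBF function or from the published tables), the function name is hypothetical, and the value P = 0.30 used in the example is invented for demonstration.

```python
def modified_bonferroni_level(alpha, p, k):
    """Per-comparison level alpha_BM = alpha * (1 + P) / k.

    p is the area between F_0 and F_c under the noncentral F distribution;
    it is supplied by the caller and is not computed here.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("P is a probability and must lie in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    alpha_prime = alpha * (1.0 + p)   # alpha' = alpha (1 + P)
    return alpha_prime / k            # alpha_BM = alpha' / k

# Hypothetical example: alpha = 0.05, P = 0.30, k = 10 comparisons.
print(modified_bonferroni_level(0.05, 0.30, 10))
```

Because P is non-negative, the resulting level is never smaller than the ordinary Bonferroni level α/k, which is why the modified test cannot be less powerful than the original one.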

Second Modified Bonferroni's Test (BM2)
If in the analysis of variance $F_0 > F_c$ (general $H_0$ rejected), there should be at least one treatment parametrically different from zero. It is possible to estimate a, the true number of parametric differences, by â and use it to calculate BM2, as follows.
If $F_0 > F_c$, the ANOVA is performed and the significant differences between two means are calculated by Student's test (comparisonwise type I error). The number â of significant differences is used as an estimate of the true number of parametric differences between treatments. This is reasonable because, in general, experiments are performed to evaluate responses to new "treatments" that are supposedly superior to the treatments ordinarily used (especially in agronomy, animal science and veterinary medicine, medical, biological, and industrial research, and some other areas in which one or more new treatments are supposed to be better, in some way, than those currently used). The number â of significant differences should satisfy â < a. Owing to the size of the experimental error, some treatments that actually differ from zero, but by small values, result in non-significant differences, and so â is probably a conservative estimate of the true value a (â < a).
The probability of the modified Bonferroni's test (BM2) is then: The modified Bonferroni's test behaves similarly to tests of the MEER type, as can be seen in the columns 0% (exp) of Tables 1, 2, 3, 4, 5, and 6. Its frequencies are very similar to the corresponding Dunnett's frequencies, and it is well known that Dunnett's test is of the MEER type.

Sidak's Test (Si)
Sidak's test uses a probability $\alpha_S$ and, similarly to Bonferroni's, uses the Student t test with level $\alpha_S$ and df degrees of freedom of the residual. The $\alpha_S$ value is calculated from: $\alpha_S = 1 - (1-\alpha)^{1/k}$, in which α is the global probability and k is the number of comparisons between means. Ordinarily, if the interest lies in all comparisons between two means, k = t(t - 1)/2. To compare each treatment with a control, k = t - 1; this type appears as SiN in the tables. The value $\alpha_S$ is greater than $\alpha_B$, and so the corresponding $t_S$ is smaller than $t_B$. Sidak's test is therefore a little more efficient than Bonferroni's. For instance, for α = 0.05 and k = 10, $\alpha_S$ = 0.00512 and $\alpha_B$ = 0.005.
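The relationship between the Sidak and Bonferroni per-comparison levels can be checked numerically. The Python sketch below uses the standard Sidak expression 1 - (1 - α)^(1/k) and the Bonferroni expression α/k; the function names are illustrative only.

```python
def sidak_alpha(alpha, k):
    """Sidak per-comparison level: 1 - (1 - alpha)^(1/k)."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)

def bonferroni_alpha(alpha, k):
    """Bonferroni per-comparison level: alpha / k."""
    return alpha / k

# Compare the two levels for a global alpha of 0.05 and several values of k.
for k in (3, 10, 45):
    a_s = sidak_alpha(0.05, k)
    a_b = bonferroni_alpha(0.05, k)
    print(f"k={k:3d}  Sidak={a_s:.6f}  Bonferroni={a_b:.6f}")
```

For every k greater than 1 the Sidak level is slightly larger than the Bonferroni level, so the critical t value is slightly smaller; the gain is small, which matches the description of Sidak's test as only a little more efficient.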

Modified Sidak's Test (SiM)
Similarly to the second modified Bonferroni's test (BM2), after performing an ANOVA of a given experiment, if $F_0 > F_c$ (general hypothesis $H_0$ rejected), significant differences are calculated by the t test; the number of significant differences â is an estimate of the number of all the actual differences a. Then: This test tends to be a little more effective than the corresponding Bonferroni's (BM2), because $\alpha_{SiM} > \alpha_{BM_2}$. The significant differences in relation to a control are calculated using the t test with k = (t - 1) - â. If interest lies in all differences between two means, k = t(t - 1)/2 - â. The behavior of the Modified Sidak's test can be evaluated in Tables 1, 2, 3, 4, 5, and 6.
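The reduced number of comparisons k described above can be sketched in Python as follows. The values of k follow the text; applying the usual Sidak expression 1 - (1 - α)^(1/k) to this reduced k is an assumption made here for illustration, since the exact expression is not reproduced in this section, and the function names are hypothetical.

```python
def reduced_comparisons(t, a_hat, versus_control=True):
    """Number of comparisons k after removing the a_hat significant differences.

    versus_control=True  -> k = (t - 1) - a_hat
    versus_control=False -> k = t(t - 1)/2 - a_hat
    """
    total = (t - 1) if versus_control else t * (t - 1) // 2
    k = total - a_hat
    if k < 1:
        raise ValueError("a_hat leaves no comparisons to adjust for")
    return k

def sidak_type_level(alpha, k):
    """ASSUMPTION: the usual Sidak form applied to the reduced k."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)

# Hypothetical example: 6 treatments, 2 significant differences by the t test.
k = reduced_comparisons(6, 2)            # comparisons with a control
print(k, sidak_type_level(0.05, k))
```

Under this assumption, a larger â gives a smaller k and therefore a larger per-comparison level, which is the mechanism by which the modified test gains power.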
In group G3 (Tables 5 and 6), parametric differences were 0.3, 0.2, 0.15, 0.1, 0.05, and 0.0 (both cp and exp), for r = 4 (400 experiments) and r = 8 (200 experiments). Once again the "default criterion" was used    1 and 2), and G2 alone (Tables 3 and 4), tests showed high power, with little difference among them. Differences in the power of the various tests increased for the differences 0.2, 0.15, and 0.1, values more ordinarily obtained in the analysis of research data. In groups G1 and G2, the tests ranked by power were, in order, Waller-Duncan (W), Student bilateral (T2), Duncan (D), SNK, SiM, BM, BM2, SiN, BN, Dunnett, Tukey and Bonferroni.
Regarding the type I comparisonwise error [column 0% (cp)], Student's and Waller-Duncan's tests showed values α ≈ 0.05, the nominal error adopted. The other tests showed α < 0.05, most of them around 0.01 or smaller. The type I error per experiment (column 0% exp; Tables 3 and 4) of Student's (T2), Waller-Duncan's and Duncan's tests was much higher than 0.05; the other tests showed errors smaller than 0.05, most nearing 0.01 or smaller.
Regarding the type I comparisonwise error [column 0% (cp)], W, T1, T2, and D showed α = 0.05, the nominal error adopted. The other tests showed α < 0.05, most nearing 0.01. For the type I experimentwise error [column 0% (exp)], T1, T2, W, and D showed α values well above 0.05; the other tests showed α values smaller than 0.05, most nearing 0.01 or less. The power of all the tests increased when the number of replications increased. It is therefore easier to compare values in Tables 1, 3, and 5 with the corresponding values in Tables 2, 4, and 6, respectively.
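A rough way to see why the experimentwise error of unadjusted tests rises well above 0.05 is to compute the chance of at least one false rejection among m null comparisons, each made at level α. The Python sketch below assumes independent comparisons, which is a simplification (comparisons with a common control are correlated), so the numbers are indicative only and are not results from the simulated experiments; the function name is illustrative.

```python
def experimentwise_error(alpha, m):
    """P(at least one false rejection) among m null comparisons at level alpha.

    ASSUMPTION: the m comparisons are independent.
    """
    return 1.0 - (1.0 - alpha) ** m

# Unadjusted comparisonwise level alpha = 0.05 with a growing number of
# null differences in the experiment.
for m in (1, 2, 5, 10, 24):
    print(f"m={m:2d}  experimentwise error = {experimentwise_error(0.05, m):.4f}")
```

With a single null comparison the experimentwise and comparisonwise errors coincide, and the experimentwise error then grows with the number of null differences, in line with the pattern described for the unadjusted tests.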

Table 5 - Power of different statistical tests for Group 3,