SISVAR: A GUIDE FOR ITS BOOTSTRAP PROCEDURES IN MULTIPLE COMPARISONS

Sisvar is a statistical analysis system widely used by the scientific community to produce statistical analyses and to support scientific results and conclusions. This wide use is due to its accuracy, precision, simplicity and robustness. Among its many analysis options, Sisvar offers one that is not widely used: multiple comparison procedures based on bootstrap approaches. This paper aims to review this subject and to show some advantages of using Sisvar to perform such analyses when comparing treatment means. Tests such as Dunnett, Tukey, Student-Newman-Keuls and Scott-Knott can alternatively be performed by bootstrap methods, which show greater power and better control of the experimentwise type I error rates under non-normal, asymmetric, platykurtic or leptokurtic distributions.


INTRODUCTION
Among the computationally intensive statistical methods, the Monte Carlo, bootstrap and permutation (randomization) methods can be highlighted (Manly, 1997; Chernick, 1999). The work of Efron (1979) was a milestone in the systematization of computationally intensive methods in statistics. Frequentist inference is based on the assumption that a probabilistic model exists from which a random sample was drawn. If this model is not known, or if it does not fit the sample data, the inference is compromised. The importance of computationally intensive methods in statistics is therefore evident. Moreover, the computational time and effort required on modern computers can be considered negligible.
A problem that has been the focus of many studies is that of multiple comparison procedures for treatment means under non-normality or under heterogeneity of variances in normal or non-normal probabilistic models. Several methods can be used to overcome the difficulties of performing multiple comparison procedures in cases of non-normality or heteroscedasticity. Hochberg and Rom (1995) reviewed the field of multiple comparisons with special focus on the modified Bonferroni method. Two such methods are competitive with Hommel (1988) and Rom (1990). The use of adjusted p-values in multiple comparison procedures (PCM) was introduced by Wright (1992) and by Westfall and Young (1989, 1993). The latter authors attempted to connect the use of adjusted p-values with bootstrap resampling methods. Efron (1979) introduced the bootstrap as a new statistical method. Several comprehensive books on the bootstrap are currently available: Hall (1992), Efron and Tibshirani (1993), Davison and Hinkley (1996), Manly (1997) and Chernick (1999). Thorpe and Holland (2000) proposed several methods for performing multiple comparisons of variances under non-normal populations. Bootstrap procedures are used in association with modified Bonferroni corrections for p-value adjustment, with the purpose of refining the technique. Comparisons with a control treatment and the overall test of homogeneity of variances were discussed by the authors. Nonparametric procedures, despite being independent of several assumptions about the nature of the distribution and being parameter-free, are considered deficient by Thorpe and Holland (2000) because of their loss of power when compared with their competitors.

Ciênc. Agrotec., Lavras, v.38, n. 2, p.109-112, mar./abr., 2014
The aim of this paper is to review the computationally intensive procedures for multiple comparisons available in the statistical software Sisvar, illustrating its advantages, limitations and analysis capabilities. A second objective is to show some evaluations of the performance of these methods through Monte Carlo simulations, using the experimentwise and comparisonwise type I error rates and power. The new features under implementation will be emphasized.

BOOTSTRAP MULTIPLE COMPARISONS WITH SISVAR
The multiple comparison procedures in Sisvar were developed to compare k population means by performing the hypothesis tests $H_0: \mu_i = \mu_h$, $i \neq h = 1, 2, \ldots, k$.
The procedures were applied to two particular testing situations, namely:

a) The family of m = k - 1 pairwise comparisons, such as the comparison of each treatment with a control treatment, as follows:

$$H_0: \mu_i = \mu_{\text{control}}, \quad \text{for each of the } k - 1 \text{ non-control treatments } i. \qquad (1)$$

Finally, the family of multiple comparison tests of Sisvar described in item (b) was subjected to a performance evaluation through Monte Carlo simulations. Initially, samples were simulated from k populations under the probabilistic model called g-h (Hoaglin, 1985). The parameter g controls the amount and direction of asymmetry and the parameter h controls the kurtosis. With g = h = 0 the model corresponds to the standard normal distribution. The tails of the distribution become heavier as h increases, and the distribution becomes more asymmetric as g increases. Thus, adverse situations showing deviations from symmetry and from normal kurtosis were considered for the evaluation of the multiple comparison procedures. Type I error rates (size of the test) and power were assessed to evaluate the performance of the tests.
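As an illustration of the g-h model described above, the following minimal Python sketch draws samples by applying the g-and-h transformation to standard normal variates. The function names `sample_g_h` and `simulate_null_groups`, the use of NumPy and the balanced layout of k groups with r replicates are assumptions made for this example; this is not the code used in Sisvar or in the simulations reported in this paper.

```python
import numpy as np

def sample_g_h(n, g=0.0, h=0.0, rng=None):
    """Draw n values from a g-h distribution (illustrative sketch).

    A standard normal variate Z is transformed as
        Y = ((exp(g * Z) - 1) / g) * exp(h * Z**2 / 2)   if g != 0
        Y = Z * exp(h * Z**2 / 2)                        if g == 0
    so that g = h = 0 returns the standard normal itself.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(n)
    # expm1 computes exp(x) - 1 accurately for small x
    core = z if g == 0.0 else np.expm1(g * z) / g
    return core * np.exp(h * z**2 / 2.0)

def simulate_null_groups(k, r, g=0.0, h=0.0, rng=None):
    """Simulate k groups of r replicates under the complete null hypothesis
    (all groups share the same g-h distribution). Returns a (k, r) array."""
    return sample_g_h(k * r, g=g, h=h, rng=rng).reshape(k, r)
```

For example, `simulate_null_groups(5, 4, g=0.0, h=0.5)` would produce one symmetric, heavy-tailed data set of the kind considered in the non-normal scenario (g = 0 and h = 0.5), under the assumption of 5 treatments and 4 replicates.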
The bootstrap multiple comparison procedures for both families of pairwise comparisons can be performed with Sisvar. The option to be chosen is the last item in the analysis menu. The file, the factor variable and the test can be selected from this option. The Dunnett version of the test is one of the choices; it can be applied for comparisons with a control treatment. Researchers can use Sisvar for free by downloading and installing it directly from the address: http://www.dex.ufla.br/~danielff/softwares.htm.

COMPARISONS WITH THE ORIGINAL TESTS
Some simulation results are shown to emphasize the superiority of the bootstrap multiple comparison procedures provided by Sisvar in cases of non-normality (g = 0 and h = 0.5). Table 1 shows the comparisonwise (CW) and experimentwise (EW) error rates for several tests in their original and bootstrap versions. The original Tukey and SNK tests showed experimentwise type I error rates greater than the nominal significance level of 5%, exceeding the value of 50 percentage points. The performance of these tests was worse when the number of treatment means was large. The comparisonwise type I error rates of all procedures were under control, below the nominal significance level of 5%.
The BT test was the best in controlling the experimentwise type I error rates, followed by the BSK test. Under a complete null hypothesis, the BSK test showed a high performance, but under a partial null hypothesis this test had experimentwise type I error rates greater than the nominal significance levels under normal or non-normal probability models (data not shown). In this case of a partial null hypothesis, the BT test controls the experimentwise type I error rates properly (results not shown).
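The distinction between the comparisonwise and experimentwise type I error rates discussed above can be made concrete with a small Monte Carlo sketch. The decision rule used here (unadjusted pairwise z-tests on standard normal data with known unit variance) is a deliberately simple assumption chosen to keep the example self-contained; it is not one of the tests evaluated in Table 1, and the function names are hypothetical.

```python
import numpy as np

def unadjusted_z_rejections(y, z_crit=1.96):
    """Toy decision rule (assumption): pairwise two-sided z-tests at about 5%,
    treating the error variance as known and equal to 1.
    y is a (k, r) array; returns one boolean per pair i < h."""
    k, r = y.shape
    means = y.mean(axis=1)
    # A difference of two means of r unit-variance values has variance 2 / r
    z = np.abs(means[:, None] - means[None, :]) / np.sqrt(2.0 / r)
    return z[np.triu_indices(k, 1)] > z_crit

def estimate_error_rates(reject_fn, k, r, n_sim=2000, rng=None):
    """Estimate comparisonwise (CW) and experimentwise (EW) type I error rates
    under the complete null hypothesis with standard normal data."""
    rng = np.random.default_rng() if rng is None else rng
    m = k * (k - 1) // 2          # number of pairwise comparisons
    total_rejections = 0          # false rejections over all comparisons
    sims_with_rejection = 0       # simulations with at least one false rejection
    for _ in range(n_sim):
        rejected = reject_fn(rng.standard_normal((k, r)))
        total_rejections += int(rejected.sum())
        sims_with_rejection += int(rejected.any())
    cw = total_rejections / (n_sim * m)
    ew = sims_with_rejection / n_sim
    return cw, ew

cw, ew = estimate_error_rates(unadjusted_z_rejections, k=5, r=10,
                              rng=np.random.default_rng(2014))
print(f"CW = {100 * cw:.1f}%, EW = {100 * ew:.1f}%")
```

In this toy setting the comparisonwise rate stays close to the nominal level, while the experimentwise rate is considerably larger because any one of the m comparisons can produce a false rejection. This is the reason a procedure can keep the CW rate under control and still fail to control the EW rate.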
b) The family of all possible pairwise comparisons, of the form:

$$H_0: \mu_i = \mu_h, \quad i \neq h = 1, 2, \ldots, k. \qquad (2)$$

These approaches involve the determination of p-values for each of the m hypotheses in (1) and (2) by several methods analogous to those proposed by Holland and Thorpe (2000) for variances. Several ways of adjusting the p-values were considered. The adjusted p-values should be considered, since they showed the best performance. In addition to obtaining p-values, multiple comparisons were implemented following the original steps of the tests of Dunnett, Tukey (T), Student-Newman-Keuls (SNK) and Scott-Knott (SK) (Tukey, 1953; Scott and Knott, 1974; Hochberg and Tamhane, 1987; Steel, Torrie and Dickey, 1996).
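A bootstrap analogue of the Tukey test, giving an adjusted p-value for each pairwise hypothesis, could be sketched as follows. This is only one plausible construction and an assumption of this example, not a description of the algorithm implemented in Sisvar: residuals from the group means are pooled and resampled with replacement to imitate the complete null hypothesis, and each observed studentized difference is compared with the bootstrap distribution of the maximum studentized difference. A balanced one-way layout with k groups and r replicates is assumed, and the function names are hypothetical.

```python
import numpy as np

def studentized_differences(y):
    """y: (k, r) array. Returns the (k, k) matrix of
    |mean_i - mean_h| / sqrt(MSE / r), with MSE the pooled within-group variance."""
    k, r = y.shape
    means = y.mean(axis=1)
    mse = ((y - means[:, None]) ** 2).sum() / (k * (r - 1))
    diff = np.abs(means[:, None] - means[None, :])
    return diff / np.sqrt(mse / r)

def bootstrap_tukey(y, n_boot=2000, rng=None):
    """Illustrative bootstrap Tukey-type test (not Sisvar's implementation).

    Returns the observed studentized differences and, for each pair, an
    adjusted p-value: the proportion of bootstrap samples whose largest
    studentized difference is at least as large as the observed one."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    k, r = y.shape
    q_obs = studentized_differences(y)
    # Pooled residuals: removing the group means imposes equality of means
    resid = (y - y.mean(axis=1, keepdims=True)).ravel()
    q_max = np.empty(n_boot)
    for b in range(n_boot):
        y_star = rng.choice(resid, size=(k, r), replace=True)
        q_max[b] = studentized_differences(y_star).max()
    # Compare every observed statistic with the bootstrap maxima
    p_adj = (q_max[None, None, :] >= q_obs[:, :, None]).mean(axis=2)
    return q_obs, p_adj
```

Under these assumptions, treatments i and h would be declared different at the 5% level when `p_adj[i, h] < 0.05`. Because every pair is referred to the distribution of the maximum statistic, the adjustment is of the single-step type, which is what aims at controlling the experimentwise error rate.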

Table 1 - Comparisonwise and experimentwise type I error rates (in percentage) under the complete null hypothesis of equality of treatment means, with a non-normal distribution, for the multiple comparison procedures original (OT) and bootstrap (BT) Tukey, original (OSNK) and bootstrap (BSNK) SNK, and bootstrap Scott-Knott (BSK), considering the nominal significance level of 5%, 2,000 Monte Carlo simulations and 2,000 bootstrap resamples.