
Assessing transformation methods for group comparisons under violated assumptions: type I error rate and test power

[Avaliação de métodos de transformação para comparações de grupos sobre pressupostos violados: taxa de erro tipo I e poder do teste]

ABSTRACT

In this study, several transformation methods that are applied when the assumptions of analysis of variance are not met are evaluated in terms of type I error rate and test power, under different distributions, numbers of groups, numbers of observations, variance ratios, and standard deviation differences. The data set used in the study consisted of random numbers generated from the N(0,1) and χ2(3) distributions using the random number functions of the NumPy library in the Python programming language. The logarithmic, square root, and higher-order root transformations were evaluated with ANOVA across the simulation combinations. Taking the square root after adding 0.5 or 0.375 to the data was relatively more reliable than the other transformations in terms of type I error rate. However, in every case, the type I error rate rose above the level set at the beginning of the experiment, both before and after the transformation was applied. Notably, the third and fourth root transformations gave better test power under the right-skewed distribution. In addition, the transformation techniques were compared on a real dataset with respect to the normality of the data and the homogeneity of variances.

Keywords:
data transformation; square root transformation; logarithmic transformation; analysis of variance; type I error rate; test power

RESUMO

Neste estudo, alguns métodos de transformação, aplicados quando as premissas da análise de variância não são cumpridas, são avaliados em termos de taxa de erro tipo I e poder de teste, em circunstâncias com diferentes distribuições, número de grupos, número de observações, razões de variância e diferenças de desvio-padrão. O conjunto de dados utilizados no estudo consistiu em números aleatórios gerados a partir das distribuições N(0,1) e χ2(3), utilizando a função aleatória da biblioteca Numpy, na linguagem de programação Python. As técnicas de transformação logarítmica, raiz quadrada e raiz foram avaliadas na ANOVA, com base em combinações de simulação. Observou-se que as técnicas de transformação de tomar a raiz quadrada após adicionar 0,5 e 0,375 aos dados foram relativamente mais confiáveis em comparação com outras transformações em termos de taxa de erro tipo I. No entanto, em todos os casos, a taxa de erro tipo I determinada no início do experimento aumentou tanto antes quanto depois da aplicação da transformação. Em particular, curiosamente, as transformações de raiz de terceiro e de quarto grau deram melhores resultados de poder de teste na distribuição assimétrica à direita. Além disso, foram comparadas as técnicas de transformação em questão para determinar a normalidade dos dados e a homogeneidade das variâncias por meio de dados reais.

Palavras-chave:
transformação de dados; transformação de raiz quadrada; transformação logarítmica; análise de variância; taxa de erro tipo I; poder do teste

INTRODUCTION

Most studies examining the effects of a treatment on group means consider three or more groups. The analysis of variance (ANOVA-F) test is still widely used today as a parametric method for comparing the means of more than two groups.

Some assumptions must be met before conducting parametric tests. The assumptions for ANOVA are independence of observations, additivity of factor effects, homogeneity of variances between or among groups, and normality of the data. The normality of the observations and the homogeneity of the group variances are properties of the assumed populations; hence the researcher cannot always control them. Therefore, if these assumptions are not met, the results of the ANOVA are invalid (Larson, 2008; Mendes, 2012).

Applying ANOVA without meeting the assumptions causes a deviation from the pre-determined type I error rate (5.0%), thus affecting the test power. Consequently, the true differences between the group means may not be revealed. After checking the assumptions with conventional approaches, there are some alternative options if the assumptions of ANOVA are not met. In this sense, Tukey (1957) suggested that if the assumptions of ANOVA are not met, transformation techniques can be used on the questionable data.

Some studies have examined the type I error rate and the test power in comparing the means of more than two groups using parametric and non-parametric tests (Mendeş, 2002; Patric, 2007; Koşkan and Gürbüz, 2009; Ferreira et al., 2012; Lantz, 2013).

Quantitative data are generally expected to follow a normal distribution, but in practice they do not always do so, and thus may not satisfy the assumptions that observations are normally distributed and that variances are homogeneous.

Data transformation, one of the options available in this case, gives the questionable data a new form by means of various mathematical operations. Some researchers claim that “transforming data is an inappropriate way or data cheating”. What this critique misses is that the transformations are applied to all of the data, not just a part of it, so there is no cheating or deliberate manipulation. Furthermore, data transformation is intended to ensure the validity of the statistical test. In the literature, there are many simulation studies investigating the effects of transformation techniques on ANOVA in terms of type I error rate and test power. Some of these studies reported that various transformation techniques had negative effects on type I error rate and test power (Arıcı et al., 2011), while others reported positive effects (Mahapoonyanont et al., 2010; Özkan et al., 2010; Arıcı, 2012; Yiğit, 2012). Maidapwad and Sananse (2014) emphasized that many researchers start conducting variance analysis without checking the normality assumption, which leads to information loss in the obtained results. To support this claim, they demonstrated the effects of various transformation techniques on group comparisons. Hammouri et al. (2020) reported positive effects of conducting group comparisons after logarithmic transformation of data with skewed distributions. Their study is one of the significant recent works in this field.

As highlighted by Blanca et al. (2017), if the distribution shapes of the assumed populations exhibit moderate deviations from normality, the assumption of same population distribution shapes holds, each group has equal sample size, and the sample size is large, then analysis of variance (ANOVA) is a powerful method. However, researchers may sometimes have doubts about which sample size is sufficient or how much deviation from normality can be tolerated.

There are various methods that can serve as alternatives to ANOVA when its assumptions are not met. Besides transforming the data, researchers generally use non-parametric methods such as the Kruskal-Wallis test when the data do not meet the normality assumption. However, the Kruskal-Wallis test is also heavily influenced by heterogeneity of variances (Liu, 2015).

The purpose of this study was to analyze the effects of different transformation techniques, including logarithmic (log10), square root ($\sqrt{x}$, $\sqrt{x+0.5}$ and $\sqrt{x+3/8}$) and root transformations ($\sqrt[3]{x}$ and $\sqrt[4]{x}$), on one-way analysis of variance. The focus is on assessing both type I error rates and test power in situations where the assumptions of normality and homogeneity of variances are not met.
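As a concrete illustration, the six transformations named above can be written as NumPy functions. This is a minimal sketch, not the authors' code: the helper name `transformations` and the dictionary labels are hypothetical, and the input is assumed to be non-negative (strictly positive for log10), since the text does not say how negative values were handled.

```python
import numpy as np


def transformations():
    """Map a label to each transformation evaluated in the study.

    Hypothetical helper for illustration. Inputs are assumed to be
    non-negative (strictly positive for log10).
    """
    return {
        "log10": np.log10,
        "sqrt(x)": np.sqrt,
        "sqrt(x+0.5)": lambda x: np.sqrt(x + 0.5),
        "sqrt(x+3/8)": lambda x: np.sqrt(x + 3.0 / 8.0),
        "cube root": np.cbrt,
        "fourth root": lambda x: np.power(x, 0.25),
    }


# Apply every transformation to the same small positive sample.
x = np.array([1.0, 16.0, 81.0])
for name, func in transformations().items():
    print(f"{name:12s}", func(x))
```

Keeping the transformations in one mapping makes it easy to loop over them when the same test is repeated for each transformed version of the data.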

MATERIALS AND METHODS

The data set of this study consisted of randomly generated numbers from the N(0,1) and χ2(3) distributions, determined according to the simulation design given in Table 1. The NumPy library in the Python programming language was used to generate the random numbers (Harris et al., 2020). Density plots of the theoretical distributions are shown in Figure 1. Transformed and non-transformed data were also compared for normality and homogeneity of variances on a real dataset, using the Shapiro-Wilk and Bartlett tests, respectively. Detailed information about the real dataset is given in later sections.

Table 1
Simulation design for random numbers generated from N (0,1) and χ2(3) distributions

Figure 1
Probability density plots for theoretical distributions

METHODS

Simulation designs were set up for each distribution, N(0,1) and χ2(3), as follows. The number of groups was set to 3, and the number of observations in each group to 3, 5, 10, 15, and 30. The variance of the final group was set to 1, 3, 5, and 10 times the variance of the other groups. The differences among the means, in standard deviation units, were set to 0, 0.5, 1, 1.5, and 2. Each simulation combination was iterated 100000 times.

Because the populations have different means and variances, each observation was standardized, so that the means and variances of all populations were equalized. Samples were then drawn from the standardized populations according to the specified sample sizes. When the type I error rate was of interest and the variances were homogeneous, the observations were used as they were. When the variances were heterogeneous, the observations in the final group were multiplied by the square roots of the constants corresponding to the specified variance ratios. When the power of the test was of interest, standard deviation differences were created by adding constants to the final group.

A one-way ANOVA was used to determine whether the differences among the group means were due to chance. The type I error rate was calculated by dividing the number of rejected H0 hypotheses in 100000 simulations, before and after the transformations were applied to the observations, by the total number of simulations. The power of the test was calculated in the same way after the standard deviation differences had been created: the number of rejected H0 hypotheses before and after transformation was divided by the total number of simulations. The nominal significance level (α) was set at 5.0% in this simulation study.

A flowchart representing the simulation program used to compute the type I error rate and test power is shown in Figure 2.

Figure 2
Flowchart of simulation program
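The simulation loop described in this section can be sketched as follows. This is a reduced illustration under stated assumptions, not the authors' program: the function names `draw_standardized` and `rejection_rate` are hypothetical, the χ2(3) draws are standardized with their theoretical mean (3) and variance (6), and the default number of iterations is far below the 100000 used in the study so that the example runs quickly. The optional `transform` argument is left unset in the usage lines because the source does not state how negative standardized values were handled before taking logarithms or roots.

```python
import numpy as np
from scipy import stats


def draw_standardized(rng, dist, size):
    """Draw from N(0,1) or chi-square(3), rescaled to mean 0 and variance 1."""
    if dist == "normal":
        return rng.standard_normal(size)
    if dist == "chisq3":
        # A chi-square variable with 3 df has mean 3 and variance 6.
        return (rng.chisquare(3, size) - 3.0) / np.sqrt(6.0)
    raise ValueError(f"unknown distribution: {dist}")


def rejection_rate(dist, n, var_ratio=1.0, effect=0.0, k=3,
                   n_sim=2000, alpha=0.05, transform=None, seed=0):
    """Fraction of simulated one-way ANOVAs that reject H0.

    With effect == 0 this estimates the type I error rate;
    with effect != 0 it estimates the power of the test.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        groups = [draw_standardized(rng, dist, n) for _ in range(k)]
        # Only the final group is rescaled (variance ratio) and shifted
        # (standard deviation difference), as described in the text.
        groups[-1] = groups[-1] * np.sqrt(var_ratio) + effect
        if transform is not None:
            groups = [transform(g) for g in groups]
        _, p_value = stats.f_oneway(*groups)
        if p_value < alpha:
            rejections += 1
    return rejections / n_sim


# Type I error with homogeneous variances, type I error with ratio 1:1:10,
# and power for a difference of 2 standard deviations.
print(rejection_rate("normal", n=10))
print(rejection_rate("normal", n=10, var_ratio=10.0))
print(rejection_rate("chisq3", n=10, effect=2.0))
```

The same function serves both quantities of interest because type I error rate and power are both rejection proportions; only the presence of a true mean difference changes.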

Analysis of variance is a frequently used statistical method for determining whether the difference between the means of two or more independent groups is due to chance. ANOVA, in other words the F test, is used to test the H0 (null) and HA (alternative) hypotheses described in detail below. The data generated by simulation can be described by Equation (1).

$$Y_{ij} = \mu + \alpha_i + e_{ij} \qquad (1)$$

where

$Y_{ij}$ is the jth observation in the ith group,

$\mu$ is the overall mean of the population,

$\alpha_i$ is the effect of the ith treatment,

$e_{ij}$ is the error term.

The null and alternative hypotheses can be tested as:

$$H_0: \mu_1 = \mu_2 = \dots = \mu_k$$

$H_A$: at least one of the group means ($\mu_i$) is different

where k is the number of experimental groups or treatments.

The F ratio is calculated by dividing the mean square of treatments (MST) by the mean square of error (MSE), that is, $F = \mathrm{MST}/\mathrm{MSE}$. The critical F table value is determined with k - 1 and N - k degrees of freedom, where N is the total number of observations.

If the calculated F ratio is greater than the critical F table value, H0 is rejected. H0 is not rejected when the calculated F ratio is lower than the critical F table value.
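The decision rule above can be made concrete with a small worked computation. The sketch below computes MST, MSE, and the F ratio by hand and compares the ratio with the critical value from SciPy's F distribution. The function name `one_way_f` and the three small groups are illustrative assumptions, not data from the study.

```python
import numpy as np
from scipy import stats


def one_way_f(groups, alpha=0.05):
    """Return (F ratio, critical F value, reject H0?) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = np.concatenate(groups).mean()
    # Between-group (treatment) and within-group (error) sums of squares.
    ss_treat = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum(((np.asarray(g) - np.mean(g)) ** 2).sum() for g in groups)
    mst = ss_treat / (k - 1)
    mse = ss_error / (n_total - k)
    f_ratio = mst / mse
    # Critical value with k - 1 and N - k degrees of freedom.
    f_crit = stats.f.ppf(1 - alpha, k - 1, n_total - k)
    return f_ratio, f_crit, f_ratio > f_crit


# Made-up example data: three groups of three observations each.
groups = [
    np.array([4.1, 5.0, 5.9]),
    np.array([6.2, 7.1, 6.8]),
    np.array([9.0, 8.4, 9.9]),
]
f_ratio, f_crit, reject = one_way_f(groups)
print(f_ratio, f_crit, reject)
```

The hand-computed ratio should agree with the statistic returned by `scipy.stats.f_oneway`, which is a convenient cross-check.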

RESULTS

Simulation results for the type I error rates of ANOVA after transformations under the normal and χ2(3) distributions are shown in Table 2. For the standard normal distribution, the type I error rates calculated without transformation were kept at 5% when variances were homogeneous, regardless of the sample size. The calculated type I error rates tended to increase when the variances were slightly heterogeneous, and this trend became more apparent as the sample size increased. For instance, when the variance ratios were 1:1:5, the type I error rate after square root transformation was 6.9% for n=3 and 7.7% for n=30. As the variance heterogeneity increased, the type I error rates calculated without transformation outperformed those calculated with transformation but still did not maintain the pre-determined type I error rate (5.0%). In addition, the $\sqrt{x+0.5}$ and $\sqrt{x+3/8}$ transformations were more reliable than the other transformation techniques when variances were heterogeneous. It can be concluded that, when the homogeneity of variance is severely disrupted at a ratio of 1:1:10, the type I error rates after transformation increase with increasing sample size.

Table 2
Type I error rates of ANOVA after transformations when distribution is standard normal N (0,1) and χ2(3)

For the χ2(3) distribution, the type I error rate reached 5.0% as the sample size increased when the variances were homogeneous. All transformation techniques produced results close to the pre-determined type I error rate (5.0%). Although applying any transformation technique increased the type I error rate, logarithmic transformation produced a lower type I error rate, especially when n=50 and variances were heterogeneous. This trend was observed consistently across all heterogeneous variance ratios. Furthermore, when the variances were homogeneous, the type I error rates approached 5.0% in non-transformed data, and all transformation techniques raised the type I error rates to 5%. As the heterogeneity of the variances increased, the type I error rates in ANOVA could not be kept at 5.0% after any of the transformation techniques, regardless of the sample size.

The power values of ANOVA for transformed and non-transformed data from both distributions are presented in Tables 3 and 4. The power values that reached the desired level of 80% are shown in bold for standard deviation differences ranging from 0.5 to 2. For the standard normal distribution, power values above 80% were achieved with a small sample size when variances were homogeneous; when variances were heterogeneous, this could only be attained with a larger sample size. There was also no notable difference between transformed and non-transformed values. Under the χ2(3) distribution, the power values obtained with the $\sqrt[3]{x}$ and $\sqrt[4]{x}$ transformations were higher than those of the other transformation methods, especially when the variances were homogeneous and the standard deviation differences ranged from 0.5 to 2. When the variance ratios were 1:1:3, the $\sqrt[3]{x}$ and $\sqrt[4]{x}$ transformations were also more successful, particularly for low standard deviation differences and small sample sizes (such as 30). Similar results were obtained as the variances became increasingly heterogeneous, for example, when the variance ratios were 1:1:5 and 1:1:10.

Moreover, all applied transformation techniques reached or exceeded the desired power level of 80%. Among the transformation techniques, the $\sqrt[3]{x}$ and $\sqrt[4]{x}$ transformations were more powerful than the rest under the χ2(3) distribution. The power values that reached or exceeded the desired power level of 80% are indicated in bold for the χ2(3) distribution.

DISCUSSION

When the variances were homogeneous, the type I error rates with non-transformed and transformed data preserved the pre-determined value of 5% in ANOVA. This result agrees with Başpınar and Gürbüz (2000), Arıcı et al. (2011), Arıcı (2012), Yiğit (2012), and Blanca et al. (2017), who found that the type I error rates were preserved at 5% when the variances were homogeneous.

When the variances were heterogeneous, the type I error rates for ANOVA with transformed and non-transformed data could not preserve the pre-determined value of 5%. Furthermore, the type I error rates tended to increase with transformed data when the sample size was 30 or larger. In a simulation study with normally distributed data, Arıcı (2012) reported that square root and logarithmic transformations increased the pre-determined (5%) type I error rate. Hence, the increase in the type I error rate when the variances deviate from homogeneity is consistent with the findings of Trumbo et al. (2004), Özkan et al. (2010), Arıcı et al. (2011), and Arıcı (2012).

In addition, under the χ2(3) distribution, the type I error rates for transformed data did not preserve the pre-determined value of 5%. Tekindal (1999) reported that under the χ2(3) distribution the type I error rates in variance analysis were maintained at 5% after logarithmic and square root transformations were applied to the data. Therefore, our results are not consistent with the findings of Tekindal (1999). When the variances were heterogeneous, the type I error rates could not be maintained at the 5.0% level and took higher values after the transformation techniques were applied. Yiğit (2012) reported that logarithmic transformation did not provide reliable results when variances were heterogeneous. These findings are consistent with the results obtained in this study.

After transformations were applied to skewed data, variance analysis yielded more powerful results than with the non-transformed data. For the right-skewed χ2(3) distribution, the test power values increased after transformations, particularly square root transformations, as the heterogeneity of the variances increased. In this respect, the findings of Rasmussen and Dunlap (1991) and Çavuş and Yazıcı (2020) are similar to those of the present study.

In this study, the logarithmic, square root, and root transformation techniques produced similar increases in the power values relative to ANOVA on the non-transformed data. With logarithmic transformation, the test power increased as the sample size increased. In this respect, these results are similar to those of Trumbo et al. (2004) and Mahapoonyanont et al. (2010).

When the standard deviation difference was 0.5 and variances were heterogeneous, the test power decreased both with and without transformations. Arıcı (2012) reported different results for this situation, finding that the test power values were adversely affected when the standard deviation differences were 1 and 1.5. Arıcı (2012) also found that, after transformation methods were applied, the test power values increased with the level of heterogeneity and the number of observations. This finding is consistent with the current study.

Table 3
Test power values when the distributions are standard normal distribution and χ2(3), and the variance ratios are 1:1:1 and 1:1:3

Table 4
Test power values when the distributions are standard normal distribution and χ2(3), and the variance ratios are 1:1:5 and 1:1:10

The Shapiro-Wilk and Bartlett tests were applied to the real data to assess normality and homogeneity of variances, respectively. The open-access dataset used in this study was published in the Science Data Bank by Bousbia et al. (2021). The data comprise body measurements of 130 adult cattle (30 males and 100 females) from 30 farms in Algeria, belonging to 4 region-specific ecotypes with distinct characteristics. Only one variable, Muzzle Circumference (MC), was used to assess normality and homogeneity of variances on both transformed and non-transformed data. The results of the Shapiro-Wilk and Bartlett tests are given in Table 5. The hypotheses for the Shapiro-Wilk and Bartlett tests can be stated as follows:

For the Shapiro-Wilk test:

H0: The data are normally distributed.

HA: The data are not normally distributed.

For the Bartlett test:

H0: The variances of the populations from which the groups are drawn are equal.

HA: The variances of at least two of the populations from which the groups are drawn are not equal.

If the p-value is greater than the nominal significance level of 0.05, the null hypothesis is not rejected.
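The two checks can be run with SciPy as sketched below. This is an illustration on synthetic right-skewed data, not the cattle dataset used in the study; the helper name `check_assumptions` is hypothetical, and applying the Shapiro-Wilk test to within-group residuals is an assumption of this sketch, since the source does not state whether pooled values, groups, or residuals were tested.

```python
import numpy as np
from scipy import stats


def check_assumptions(groups):
    """Return (Shapiro-Wilk p-value, Bartlett p-value) for a list of groups."""
    # Assumption of this sketch: normality is checked on within-group
    # residuals (each value minus its own group mean).
    residuals = np.concatenate([g - g.mean() for g in groups])
    _, p_normal = stats.shapiro(residuals)
    _, p_equal_var = stats.bartlett(*groups)
    return p_normal, p_equal_var


# Synthetic log-normal groups whose spread grows with their level.
rng = np.random.default_rng(42)
raw = [rng.lognormal(mean=m, sigma=0.4, size=50) for m in (3.0, 3.5, 4.0, 4.5)]
logged = [np.log10(g) for g in raw]

raw_p = check_assumptions(raw)
log_p = check_assumptions(logged)
print("raw:   Shapiro-Wilk p = %.4f, Bartlett p = %.4f" % raw_p)
print("log10: Shapiro-Wilk p = %.4f, Bartlett p = %.4f" % log_p)
```

A p-value above 0.05 in either test means the corresponding null hypothesis is not rejected; comparing the raw and transformed rows shows whether a transformation moved the data toward the ANOVA assumptions.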

Based on the Shapiro-Wilk test results in Table 5, the MC variable fitted the normal distribution after transformation, since the p-value was greater than the nominal significance level (α = 0.05). The $\sqrt[4]{x}$ and log10 transformations gave better results than the others in terms of p-value.

In the Bartlett test, the p-value increased considerably after every transformation, so the null hypothesis of equal variances was not rejected. Thus, the homogeneity of variances, which is one of the most important assumptions of ANOVA, was met. As with the normality assumption, the $\sqrt[4]{x}$ and log10 transformations gave the best results.

Table 5
Tests for normality and homogeneity of the variances for MC variable

CONCLUSION

In conclusion, in this simulation study, the $\sqrt{x+0.5}$ and $\sqrt{x+3/8}$ transformations gave more reliable type I error rates under the N(0,1) and χ2(3) distributions when the variances were heterogeneous. For right-skewed distributions such as χ2(3) in particular, the $\sqrt[3]{x}$ and $\sqrt[4]{x}$ transformations provided markedly higher test power values. While current transformation techniques are relatively effective under specific conditions, they can be ineffective in many cases, which highlights the need to modify and improve existing techniques and to develop new ones. In future work, the effect of the transformation techniques evaluated in this study can be examined with different sample sizes or with samples obtained from other continuous distributions.

ACKNOWLEDGEMENTS

We would like to express our sincere thanks to Professor Doctor Hayati Koknaroglu for their valuable contributions to this study.

REFERENCES

  • ARICI, K.Y. The effect of transformations on type I error and test power in balanced factorial experiments. 2012. Thesis (PhD) - University of Ankara, Ankara, TÜR.
  • ARICI, K.Y.; ÖZKAN, M.M.; KOCABAS, Z. Heterojen varyanslı gruplarda kruskal-wallis testi ile transformasyon sonrası varyans analizinin karşılaştırılması (Comparison of Kruskal-Wallis test and transformed variance analysis in heterogeneous variance groups). In: NATIONAL ZOOTECHNICAL STUDENT CONGRESS, 7., 2011, Adana. Proceedings… Adana, Türkiye: [s.n.], 2011.
  • BAŞPINAR, E.; GÜRBÜZ, F. The power of the test in the samples of various sample sizes were taken from the binary combinations of the normal, beta, gamma and weibull distributions. J. Agr. Sci., v.6, p.116-127, 2000.
  • BLANCA, M.J.; ALARCÓN, R.; ARNAU, J.; BONO, R.; BENDAYAN, R. Non-normal data: is ANOVA still a valid option?. Psicothema, v.29, p.552-557, 2017.
  • BOUSBIA, A.; BOUDALIA, S.; GUEROUI, Y.; HADDED, K.; BOUZAOUI, A.; KIBOUB, D.; SYMEON, G. Use of multivariate analysis as a tool in the morphological characterization of the main indigenous bovine ecotypes in northeastern Algeria. Plos one, v.16(7): e0255153, 2021.
  • ÇAVUŞ, M.; YAZICI, B. Comparison of Hsieh test and ANOVA for logtransformed on income data. In: INTERNATIONAL SYMPOSIUM ON ECONOMETRICS, OPERATIONAL RESEARCH AND STATISTICS, 20., 2020, Ankara. Proceedings… Ankara: [Minduce], 2020.
  • FERREIRA, E.B.; ROCHA, M.C.; MEQUELINO, D.B. Monte Carlo evaluation of the ANOVA's F and Kruskal-Wallis tests under binomial distribution. Sigmae, v.1, p.126-139, 2012.
  • HAMMOURI, H.M.; SABO, R.T.; ALSAADAWI, R.T. et al. Handling skewed data: A comparison of two popular methods. Appl. Sci., v.10, p.6247, 2020.
  • HARRIS, C.R.; MILLMAN, K.J.; VAN DER WALT, S.J. et al. Array programming with NumPy. Nature, v.585, p.357-362, 2020.
  • KOŞKAN, O.; GÜRBÜZ, F. Comparison of f test and resampling approach for type I error rate and test power by simulation method. J. Agr. Sci., v.15, p.105-111, 2009.
  • LANTZ, B. The impact of sample non-normality on ANOVA and alternative methods. Br. J. Math. Stat. Psychol., v.66, p.224-244, 2013.
  • LARSON, M.G. Analysis of variance. Circulation, v.117, p.115-121, 2008.
  • LIU, H. Comparing Welch ANOVA, a Kruskal - Wallis test, and traditional ANOVA in case of heterogeneity of variance. 2015. 48f. Thesis (Master of Science Degrees in Biostatistics) - Virginia Commonwealth University, Virginia, USA.
  • MAHAPOONYANONT, N.; MAHAPOONYANONT, T.; PENGKAEW, N.; KAMHANGKIT, R. Power of the test of one-way anova after transforming with large sample size data. Procedia Soc. Behav. Sci., v.9, p.993-937, 2010.
  • MAIDAPWAD, S.L.; SANANSE, S.L. On analysis of two-way ANOVA using data transformation techniques. Int. J. Sci. Res., v.3, p.480-483, 2014.
  • MENDES, M. The comparison of some alternative parametric tests to one - way analysis of variance about type i error rates and power of test under non - normality and heterogeneity of variance. 2002. Thesis (PhD) - University of Ankara, Ankara, TÜR.
  • ÖZKAN, M.M.; KOCABAŞ, Z.; KAŞKO, Y.; ALBAYRAK, R. The effect of square root and logarithmic transformations on type I error of ANOVA in balanced experiments. In: INTERNATIONAL TURKISH-GERMAN AGRICULTURE SYMPOSIUM, 9., 2010, Antakya. Proceedings… Antakya, Hatay, Türkiye: Mustafa Kemal University, 2010.
  • PATRIC, J.D. Simulations to analyze type I error and power in the ANOVA F test and nonparametric alternatives. 2007. 80f. Thesis (Master of Science) - University of West Florida, USA.
  • RASMUSSEN, J.L.; DUNLAP, W.P. Dealing with nonnormal data: Parametric analysis of transformed data vs nonparametric analysis. Educ. Psychol. Meas., v.51, p.809-820, 1991.
  • TEKINDAL, B. A simulation study on the transformations applied when the normality assumption is violated in ANOVA. J. Ind. Arts Educ. Fac. Gazi Univ., v.7, p.25-37, 1999.
  • TRUMBO, B.E.; SUESS, E.A.; BRAFMAN, R.E. Classroom simulation: are variance-stabilizing transformations really useful?. In Proc. Am. Stat. Assoc. Sect. Stat. Educ., p.2809-2813, 2004.
  • TUKEY, J.W. On the comparative anatomy of transformations. Ann. Math. Stat., p.602-632, 1957.
  • YIĞIT, S. Type I error rate and test power for different approaches to factorial designs when normality and homogeneity of variances assumptions are not satisfied. 2012. Thesis (Master) - University of Çanakkale Onsekiz Mart, Çanakkale, TÜR.

Publication Dates

  • Publication in this collection
    18 Sept 2023
  • Date of issue
    Sep-Oct 2023

History

  • Received
    27 Mar 2023
  • Accepted
    10 May 2023
Universidade Federal de Minas Gerais, Escola de Veterinária Caixa Postal 567, 30123-970 Belo Horizonte MG - Brazil, Tel.: (55 31) 3409-2041, Tel.: (55 31) 3409-2042 - Belo Horizonte - MG - Brazil
E-mail: abmvz.artigo@gmail.com