Sample size for estimating mean and coefficient of variation in species of crotalarias

The objective of this study was to determine the sample size necessary to estimate the mean and coefficient of variation in four species of crotalarias (C. juncea, C. spectabilis, C. breviflora and C. ochroleuca). An experiment was carried out for each species during the 2014/15 season. At harvest, 1,000 pods of each species were randomly collected. The following characters were measured in each pod: mass of pod with and without seeds; length, width and height of the pod; number and mass of seeds per pod; and mass of one hundred seeds. Measures of central tendency, variability and distribution were calculated, and normality was checked. The sample size necessary to estimate the mean and coefficient of variation with amplitudes of the 95% confidence interval (ACI95%) of 2%, 4%, ..., 20% was determined by resampling with replacement. The sample size varies among species and characters, and a larger sample size is needed to estimate the mean than to estimate the coefficient of variation.


INTRODUCTION
The species of the genus Crotalaria are used as cover plants in crop rotation systems, with high production of green matter and biological nitrogen fixation, contributing to the production of organic matter (Chaudhary 2016). According to Vargas et al. (2017), the residual nitrogen of C. juncea is sufficient for growing broccoli followed by zucchini. Bhandari et al. (2016) highlight that C. juncea is an important source of fiber, but that few studies have been conducted and genetic breeding of this species is still incipient. C. spectabilis can reduce the incidence of Ralstonia solanacearum in tomato plants (Deberdt et al. 2015) and also has low suitability for the reproduction of Helicoverpa armigera (Reigada et al. 2016). According to Machado et al. (2007) and Braz et al. (2016), C. spectabilis, C. breviflora and C. ochroleuca can be used to reduce the population density of Pratylenchus brachyurus and to manage infested areas.
MARCOS TOEBE et al.
When experiments are carried out, one of the main steps is defining a suitable sample size for evaluating treatment effects; this size is influenced by the desired precision, the characters to be evaluated, the variability of the data and the error rate allowed. Sample size is important for the representativeness of the research and the validity of the conclusions (Koumanov 2017). According to Ferreira (2009), resampling with replacement can be used for sample sizing regardless of the probability distribution of the data. This technique has been used in sample sizing for the estimation of genetic and phenotypic parameters of sugar cane (Leite et al. 2009); of the mean and coefficient of variation (Toebe et al. 2014), linear correlation (Cargnelutti Filho et al. 2010, Toebe et al. 2015) and path analysis (Toebe et al. 2017b) in maize; and of the mean and standard deviation of pepper fruits (Silva et al. 2015). Resampling has also been used for sample sizing in cover plants, for example to estimate the mean of characters of jack bean and velvet bean seeds (Cargnelutti Filho et al. 2012) and the plastochron in pigeon pea (Cargnelutti Filho et al. 2013).
Specifically in crotalarias, Facco et al. (2017) evaluated the influence of basic experimental units on the estimate of the optimal plot size for the evaluation of fresh mass of C. juncea. Teodoro et al. (2015) and Toebe et al. (2017a) determined the sample size to estimate the mean of characters in C. juncea and C. spectabilis using Student's t-distribution. In the study of Teodoro et al. (2015), the sample size varied among levels of precision, among variables and between the two species of crotalarias. It is likely that the recommended sample size changes when a larger number of variables and species of this genus are evaluated. Moreover, although some theoretical studies (Kelley 2007, Panichkitkosolkul 2009, Banik et al. 2012) and practical studies (Toebe et al. 2014) of sample sizing for the coefficient of variation have been developed, no reports with this information for species of crotalaria were found in the literature. Thus, the objective of this study was to determine the sample size necessary to estimate the mean and coefficient of variation in four species of crotalarias (C. juncea, C. spectabilis, C. breviflora and C. ochroleuca) at different levels of precision.

MATERIALS AND METHODS
Four uniformity trials (blank experiments, i.e., without treatments) were carried out in the 2014/2015 season in the experimental area of the Federal University of Pampa, Campus Itaqui, in the municipality of Itaqui, State of Rio Grande do Sul, Brazil, at 29°09'S, 56°33'W and an altitude of 74 m. According to the Köppen classification, the climate of the region is Cfa, humid subtropical with hot summers and no defined dry season (Wrege et al. 2011), and the soil is classified as Haplic Plinthosol (Santos et al. 2013).
Each uniformity trial was conducted in an area of 65.61 m² (8.1 m long × 8.1 m wide) and was assigned to one of the four species of crotalarias (C. juncea, C. spectabilis, C. breviflora and C. ochroleuca). In the four trials, base fertilization was 25 kg ha⁻¹ of N, 100 kg ha⁻¹ of P₂O₅ and 100 kg ha⁻¹ of K₂O, and all cultural practices were carried out uniformly across the sampled area. This standardization for conducting uniformity trials is indicated by Storck et al. (2016).
The seeds of C. juncea, cultivar IAC-KR1, were sown on 10/18/2014, with 0.45 m between rows and 27 seeds per meter of row, totaling 60 seeds per m². The seeds of C. spectabilis, common cultivar, were sown on 10/16/2014, with 0.45 m between rows and 33 seeds per meter of row, totaling 73 seeds per m². The seeds of C. breviflora, common cultivar, were sown on 10/17/2014, with 0.45 m between rows and 33 seeds per meter of row, totaling 73 seeds per m². The seeds of C. ochroleuca, common cultivar, were sown on 10/20/2014, with 0.45 m between rows and 44 seeds per meter of row, totaling 98 seeds per m².
From March to June 2015, successive random harvests of pods were made in each uniformity trial, in accordance with the productive cycle of each species. For each species (C. juncea, C. spectabilis, C. breviflora and C. ochroleuca), 1,000 pods were collected and the following characters were evaluated in each pod: mass of pod with seeds (MPWS, in g); mass of pod without seeds (MPWOS, in g); length of the pod (LP, in mm); width of the pod (WP, in mm); height of the pod (HP, in mm); number of seeds per pod (NSP, in units); mass of seeds per pod (MSP, in g); and mass of one hundred seeds (MHS, in g). From these data, the minimum, mean, median, maximum, variance, standard deviation, coefficient of variation, asymmetry and kurtosis of each character in each species were calculated, and the normality of the data was checked with the Kolmogorov-Smirnov test.
For each character of each species of crotalaria, 199 sample sizes were planned, the smallest being 10 pods and the others obtained by successive additions of five pods. Thus, the planned sample sizes were n = 10, 15, 20, ..., 1,000 pods. For each planned sample size, in each species, 10,000 resamples with replacement were drawn, and in each resample the mean (x̄) and the coefficient of variation (CV, in %) of each of the eight characters were estimated. Thus, for each planned sample size, 10,000 estimates of the mean and of the coefficient of variation of each character were obtained. Based on these 10,000 estimates of each statistic (x̄ and CV), the 2.5% percentile, the mean and the 97.5% percentile were determined. Afterwards, the amplitude of the 95% confidence interval (ACI95%) was calculated as the difference between the 97.5% percentile and the 2.5% percentile.
To determine the sample size (number of pods) required to estimate the mean of each of the eight characters in each species of crotalaria, maximum limits of ACI95% were fixed at 2% (greatest precision), 4%, 6%, ..., 20% (lowest precision) of the mean. To determine the sample size required to estimate the coefficient of variation (CV, in %) of each of the eight characters in each species, maximum limits of ACI95% were fixed at 2% (greatest precision), 4%, 6%, ..., 20% (lowest precision). Then, starting from the initial sample size (n = 10 pods), the appropriate sample size (n) was taken as the number of pods from which ACI95% was less than or equal to the maximum limit established for each level of precision, as described in previous studies (Cargnelutti Filho et al. 2010, Toebe et al. 2014, 2015, 2017b). Statistical analyses were performed with the R program (R Development Core Team 2017) and Microsoft Office Excel®.

RESULTS AND DISCUSSION
In general, wide variability (difference between maximum and minimum values) was found in the eight characters evaluated in the four species of crotalarias (C. juncea, C. spectabilis, C. breviflora and C. ochroleuca), which may be explained by the number of pods evaluated (Table I). This wide variability indicates that extreme values are included in the sample of 1,000 pods, which is important for correct sample sizing. For a given character and species, the mean and median were similar, indicating a good fit of the data to the normal distribution. In this sense, mesokurtic kurtosis was found in seven, four, seven and seven characters of the species C. juncea, C. spectabilis, C. breviflora and C. ochroleuca, respectively.
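The descriptive statistics computed for each character (mean, median, coefficient of variation, asymmetry and kurtosis) can be illustrated with a short sketch. It is only an illustration: the function name `describe`, the moment-based formulas for asymmetry and kurtosis, and the simulated pod lengths are assumptions, not the authors' R code or data.

```python
import random
import statistics


def describe(values):
    """Return mean, median, CV (%), asymmetry and excess kurtosis of a sample."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    # Central moments with divisor n (an assumed, common definition).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "cv_percent": 100 * sd / mean,
        "asymmetry": m3 / m2 ** 1.5,          # 0 for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # 0 for a mesokurtic distribution
    }


# Hypothetical pod lengths (mm): simulated values, NOT the measured data.
rng = random.Random(1)
pod_length = [rng.gauss(40.0, 3.0) for _ in range(1000)]
summary = describe(pod_length)
print(summary)
```

For data close to normal, the mean and median nearly coincide and both asymmetry and excess kurtosis stay near zero, which is the pattern the text describes for most characters.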
Positive and significant asymmetry (p ≤ 0.05) was found in five, zero, zero and one character, and negative and significant asymmetry (p ≤ 0.05) in one, three, four and two characters, in the species C. juncea, C. spectabilis, C. breviflora and C. ochroleuca, respectively (Table I). The significant deviations of asymmetry observed in 50% of cases may be associated with the large sample size used (n = 1,000 pods), since the amplitude of the confidence interval of the asymmetry decreases as the sample size increases. Thus, even small deviations of asymmetry become statistically significant (Doane and Seward 2011, Wright and Herrington 2011), as already observed in data from maize hybrids (Toebe et al. 2014).
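The effect of sample size on the significance of asymmetry can be made concrete with a rough calculation. The sketch below assumes the large-sample approximation sqrt(6/n) for the standard error of the asymmetry coefficient under normality and a two-sided critical value of 1.96. The text states only that a t test at 5% probability was used, so the exact formula applied by the authors may differ, and the function names are illustrative.

```python
import math


def asymmetry_se(n):
    """Approximate standard error of sample asymmetry under normality: sqrt(6/n)."""
    return math.sqrt(6.0 / n)


def min_significant_asymmetry(n, critical=1.96):
    """Smallest |asymmetry| declared significant at about 5% (two-sided)."""
    return critical * asymmetry_se(n)


# Thresholds for a few sample sizes, including the n = 1,000 pods used here.
thresholds = {n: round(min_significant_asymmetry(n), 3) for n in (50, 200, 1000)}
print(thresholds)
```

Under these assumptions, an asymmetry coefficient of about 0.15 is already significant with n = 1,000, whereas with n = 50 it would need to reach about 0.68.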
NSP was the only character that did not fit the normal distribution in any of the four species evaluated (Table I). Lack of fit to the normal distribution was also found for the characters MSP and MPWOS in C. breviflora and C. ochroleuca, respectively. For the remaining cases, normality of the data (p > 0.05) was observed. Considering the large number of evaluations, the wide variability of the data, the similarity between mean and median, the predominance of mesokurtic and symmetrical data, the fit to the normal distribution and the use of resampling for sample sizing (Ferreira 2009), it may be inferred that the database is reliable for the proposed study and that the sample sizes determined can serve as a reference for the crop.
In the four species of crotalarias, the coefficient of variation was lower for characters obtained by measurement (LP, WP and HP) than for those obtained by counting and weighing (MPWS, MPWOS, NSP, MSP and MHS; Table I). The coefficient of variation ranged between 5.40% and 10.53% for the characters LP, WP and HP in the species C. juncea, C. spectabilis, C. breviflora and C. ochroleuca. For the characters MPWS, MPWOS, NSP, MSP and MHS, the coefficient of variation ranged between 8.83% and 36.31%. In addition, the CV values increased in the order C. spectabilis, C. breviflora, C. ochroleuca and C. juncea. These results suggest a smaller sample size for LP, WP and HP than for the other characters, and a smaller sample size for C. spectabilis, C. breviflora and C. ochroleuca than for C. juncea. In ten characters evaluated in plants of C. spectabilis, Toebe et al. (2017a) found coefficients of variation ranging from 7.819% for mass of 100 grains to 55.452% for number of seeds per plant. In C. juncea and C. spectabilis, Teodoro et al. (2015) observed higher coefficients of variation for fresh matter mass, followed by dry matter mass and yield, with greater variability for characters evaluated in C. juncea than in C. spectabilis, similarly to this study.
For the highest level of precision [amplitude of the 95% confidence interval (ACI95%) of 2% of the mean], the sample size ranged from 235 to more than 1,000 pods for C. juncea, from 120 to more than 1,000 pods for C. spectabilis, from 115 to more than 1,000 pods for C. breviflora and from 305 to more than 1,000 pods for C. ochroleuca, depending on the character (Table II). These results indicate that, to estimate the mean of several characters, more than one thousand pods should be evaluated if the researcher chooses this level of precision. In these cases, a sample sizing study simulating sample sizes greater than those stipulated in the present study would be recommended. Using a semi-amplitude of the 95% confidence interval of 1% of the mean, based on Student's t-distribution, Teodoro et al. (2015) verified the need to evaluate from 1,108 to 8,510 plants, depending on the character and the species of crotalaria (C. juncea or C. spectabilis). In this range of precision, Toebe et al. (2017a) recommended the evaluation of 241 to 12,107 plants of C. spectabilis, depending on the character to be measured.
For intermediate precision [ACI95% of 10% of the mean], the sample size ranged from 10 to 195 pods for C. juncea, from 10 to 65 pods for C. spectabilis, from 10 to 95 pods for C. breviflora and from 15 to 80 pods for C. ochroleuca, depending on the character (Table II). At this level of precision, the assessment of 20 pods is sufficient to estimate the mean of LP, WP and HP, regardless of the species considered. For MPWOS and MHS, 85 pods are necessary, and 195 pods are necessary for the characters MPWS, NSP and MHS, regardless of the species considered. Among the species, a larger sample size is necessary to estimate the mean in C. juncea. As noted earlier, the characters MPWS, MPWOS, NSP, MSP and MHS showed greater variation, and variation was also higher in C. juncea (Table I), which increased the sample size (Table II). For ACI95% of 10% of the mean, Cargnelutti Filho et al. (2012) recommended the evaluation of 117 and 66 seeds, respectively, in jack bean and velvet bean. Leite et al. (2009) verified that the sample size varied with the genetic or phenotypic parameter and with the variable. For C. spectabilis, Toebe et al. (2017a) recommended from 10 to 484 plants and Teodoro et al. (2015) recommended from 44 to 340 plants, depending on the character and the species of crotalaria, with a semi-amplitude of the 95% confidence interval of 5% of the mean.
When the lowest precision of this study is adopted [ACI95% of 20% of the mean], the sample size ranged from 10 to 55 pods for C. juncea, from 10 to 20 pods for C. spectabilis, from 10 to 25 pods for C. breviflora and from 10 to 20 pods for C. ochroleuca, depending on the character (Table II). C. juncea showed, in general, higher variability and, consequently, a larger sample size than the other species (Tables I and II). In this and in the other species of crotalaria, the sample size varied with the character: for a given simulated sample size, ACI95% was lower for the length of the pod (Figure 1a, lower variability) than for the mass of seeds per pod (Figure 1b, greater variability). Variability in sample size among characters or pairs of characters has frequently been reported in the literature, as in sugar cane (Leite et al. 2009), maize (Cargnelutti Filho et al. 2010, Toebe et al. 2014, 2015, 2017b), jack bean and velvet bean (Cargnelutti Filho et al. 2012), C. juncea (Teodoro et al. 2015) and C. spectabilis (Teodoro et al. 2015, Toebe et al. 2017a).
For the estimation of the coefficient of variation (CV, in %) at the highest level of precision [ACI95% of 2%], the sample size ranged from 55 to more than 1,000 pods, depending on the character and species (Table III). For intermediate precision [ACI95% of 10%], the sample size for estimating the coefficient of variation ranged from 10 to 105 pods for C. juncea, from 10 to 55 pods for C. spectabilis, from 10 to 65 pods for C. breviflora and from 10 to 50 pods for C. ochroleuca, depending on the character. For the estimate of the coefficient of variation with ACI95% of 5% in maize, Toebe et al. (2014) verified the need for sample sizes from 20 to 725 ears, depending on the character, the year and the hybrid evaluated.
Species with lower variability, such as C. spectabilis, need a smaller sample size for estimating the coefficient of variation than C. juncea (Table III). Likewise, characters that showed less variation, such as LP, WP and HP (Table I), needed a smaller sample size to estimate both the mean (Table II) and the coefficient of variation (Table III) at a given level of precision. In this sense, Kelley (2007) and Toebe et al. (2014) also reported the need for a larger sample size to estimate the coefficient of variation in variables with higher coefficients of variation, and vice versa.
The difference among characters of the same species in the sample size needed to estimate the coefficient of variation can be seen in Figure 1c, d, where ACI95% is lower for the length of pods than for the mass of seeds per pod in C. juncea. These figures also show that the mean value of the coefficient of variation remains constant as the sample size increases, while ACI95% decreases. That is, increasing the sample size does not reduce the coefficient of variation itself, but reduces the variability of its estimate, which means greater precision in the estimation of the coefficient of variation, as already observed by Toebe et al. (2014) in maize hybrids.

CONCLUSIONS
The sample size varies among species and characters, and a larger sample size is needed to estimate the mean than to estimate the coefficient of variation. Based on this study, researchers will be able to size their experimental samples considering the species of crotalaria, the variables and the desired precision.

(1) MPWS: Mass of pod with seeds, MPWOS: Mass of pod without seeds, LP: Length of the pod, WP: Width of the pod, HP: Height of the pod, NSP: Number of seeds per pod, MSP: Mass of seeds per pod, MHS: Mass of 100 seeds.

Figure 1 - Percentile 2.5%, mean and percentile 97.5% of 10,000 estimates of: (a) the mean of the length of pods, in mm; (b) the mean of the mass of seeds per pod, in g; (c) the coefficient of variation (CV) of the length of pods, in %; and (d) the coefficient of variation of the mass of seeds per pod, in %, for the sample sizes n = 10, 20, ..., 1,000 pods of Crotalaria juncea.
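The curves summarized in Figure 1 come from the resampling procedure described in Materials and Methods. A minimal sketch of that procedure follows, written in Python rather than the R used by the authors. The function names, the linear-interpolation percentile, the use of the full-sample estimate as the reference for the relative amplitude, the reduced number of resamples (500 instead of 10,000) and the simulated data are all assumptions made for illustration.

```python
import math
import random


def mean(sample):
    return sum(sample) / len(sample)


def cv_percent(sample):
    """Coefficient of variation, in %, using the sample standard deviation."""
    m = mean(sample)
    variance = sum((x - m) ** 2 for x in sample) / (len(sample) - 1)
    return 100 * math.sqrt(variance) / m


def percentile(sorted_values, p):
    """Percentile (p in 0..100) by linear interpolation between order statistics."""
    k = (len(sorted_values) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def aci95(data, n, stat, n_resamples, rng):
    """Amplitude between the 2.5% and 97.5% percentiles of `stat` over resamples of size n."""
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    return percentile(estimates, 97.5) - percentile(estimates, 2.5)


def required_n(data, stat, limit, sizes, n_resamples, rng, relative):
    """First planned n whose ACI95% is <= limit, or None if no planned size suffices."""
    reference = stat(data)  # assumed reference for the relative amplitude
    for n in sizes:
        width = aci95(data, n, stat, n_resamples, rng)
        if relative:  # express the amplitude as a percentage of the estimate
            width = 100 * width / reference
        if width <= limit:
            return n
    return None


# Simulated pod lengths (mm): hypothetical values, NOT the measured data.
rng = random.Random(42)
pods = [rng.gauss(40.0, 3.0) for _ in range(1000)]
sizes = range(10, 1001, 5)  # planned sizes n = 10, 15, ..., 1,000

# 500 resamples keep the sketch fast; the study used 10,000 per sample size.
n_mean = required_n(pods, mean, 4, sizes, 500, rng, relative=True)
n_cv = required_n(pods, cv_percent, 2, sizes, 500, rng, relative=False)
n_loose = required_n(pods, mean, 20, sizes, 500, rng, relative=True)
print(n_mean, n_cv, n_loose)
```

For the mean, the limit is applied to the amplitude expressed as a percentage of the mean, while for the CV it is applied directly, since the CV is already in %. A return value of `None` corresponds to the "more than 1,000 pods" cases reported in the text.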

TABLE I - Minimum, mean, median, maximum, variance, standard deviation (SD), coefficient of variation (CV, in %), asymmetry, kurtosis and p-value of the Kolmogorov-Smirnov normality test for eight characters measured in 1,000 pods of four species of crotalarias (C. juncea, C. spectabilis, C. breviflora and C. ochroleuca) in the 2014/15 season in Itaqui, RS, Brazil.
(1) MPWS: Mass of pod with seeds, in g; MPWOS: Mass of pod without seeds, in g; LP: Length of the pod, in mm; WP: Width of the pod, in mm; HP: Height of the pod, in mm; NSP: Number of seeds per pod, in units; MSP: Mass of seeds per pod, in g; MHS: Mass of 100 seeds, in g. (2) * Asymmetry differs from zero by the t test at 5% probability; ns: not significant. (3) * Kurtosis differs from zero by the t test at 5% probability; ns: not significant.