Stability of the hypocotyl length of soybean cultivars using neural networks and traditional methods

The length of the hypocotyl has been highlighted as a potential descriptor of the soybean crop. However, no information is available in the published literature about its behavior across several planting seasons. The present study aimed to identify soybean cultivars with stable and predictable hypocotyl length through neural networks and traditional adaptability and stability methodologies. We analyzed 16 soybean cultivars in 6 planting seasons under greenhouse conditions. In each season, a randomized block design with 4 replications was adopted. The experimental unit was composed of 3 plants, and the plot mean was used in the analysis. Hypocotyl length data were analyzed by analysis of variance and Tukey's test. Stability analyses were then carried out using the Traditional Method, Plaisted and Peterson, Wricke, Eberhart and Russell, and Artificial Neural Networks. Significant effects (p<0.01 by the F test) were identified for the Cultivar × Planting Season interaction and for Planting Seasons and Cultivars. Cultivars BRS810C, BRSMG760SRR, TMG1175RR, and BMX Tornado RR showed lower averages, high stability, and general adaptability regarding soybean hypocotyl length, whereas the cultivar BG4272 presented a higher mean, high stability, and general adaptability. Identification of soybean cultivars with predictable and stable hypocotyl length contributes to soybean improvement, as it furthers our knowledge of this potential descriptor and of the possibility of increasing the number of descriptors.


INTRODUCTION
From 1997 onwards, soybean cultivars (Glycine max (L.) Merr.) were protected under the Law of Protection of Cultivars (LPC), Law no. 9,456 of April 25, 1997 (BRASIL, 1997a), regulated by Decree no. 2,366, published on November 5, 1997 (BRASIL, 1997b). Due to the large number of cultivars registered each year, the descriptors used in the Distinguishability, Homogeneity and Stability (DHS) tests may not be sufficient for the differentiation of a new cultivar (NOGUEIRA et al., 2008; SILVA et al., 2017). The length of the hypocotyl was identified by NOGUEIRA et al. (2008), among several other characteristics, as a potential descriptor for the soybean crop. MATSUO et al. (2012) analyzed the magnitudes of genetic parameters such as the genotypic determination coefficient and the CVg(%)/CVe(%) ratio. These researchers characterized the influence of the genetic component on the phenotypic expression of hypocotyl length in soybean genotypes as a great genetic influence with little environmental effect. However, we were unable to find any manuscript in the published literature with detailed information about the behavior of cultivars regarding soybean hypocotyl length in several different environments.
Knowledge on the subject is important because, when a genotype is analyzed in different environments, its phenotypic value may be influenced not only by the environment to which it is subjected and by its genotypic effect but also by an additional component called genotype-environment interaction (CRUZ et al., 2012). Although studies on genotype-environment interaction are of great importance for the genetic improvement of cultivars and the development of crops, they do not provide detailed information about the behavior of each genotype under environmental variation (CRUZ et al., 2012). To that end, these authors recommend adaptability and stability analyses, through which cultivars of predictable behavior that are responsive to environmental variations under specific or broad conditions can be identified. Several methodologies are available for the analysis of the stability and adaptability of a group of genotypes tested in a series of environments, such as the traditional method (CRUZ et al., 2012), PLAISTED & PETERSON (1959), WRICKE (1965), EBERHART & RUSSELL (1966), and the method based on artificial neural networks (NASCIMENTO et al., 2013), among others. With the exception of the methodology proposed by NASCIMENTO et al. (2013), these methodologies have been used over the years in soybean crops and cultivars.
Unlike the traditional methodologies, the one based on neural networks relies on genotypes simulated to belong to the classes defined by EBERHART & RUSSELL (1966). These simulated genotypes are then used to train and validate the neural network. Therefore, no assumptions about the model are made. The classification of a genotype in terms of adaptability and stability is based not only on the genotypes under study but also on a large collection of genotypes simulated according to the characteristics of the experiment.
The present study aimed to analyze and identify the cultivars that present hypocotyl length of low or high magnitude and that are stable throughout the evaluation period. As a result, it would be possible to suggest potential example cultivars for DHS trials, specifically for the list of example cultivars, which allows greater standardization of the cultivars used for comparison of DHS characteristics in different breeding programs, thus increasing the reliability and quality of the studies submitted to the National Service of Protection of Cultivars (NSPC) (SNPC, 2017).
In view of the above-mentioned, our goal in this study was to identify soybean cultivars with stability and predictability of behavior regarding hypocotyl length through neural networks and traditional methodologies of adaptability and stability.

MATERIALS AND METHODS
The experiments were conducted in a greenhouse at the Federal University of Viçosa (UFV), Rio Paranaíba Campus, in the city of Rio Paranaíba, MG, southeast Brazil. Plants of 16 soybean cultivars (P98Y30, BRSMG820RR, BRSMG850GRR, BRS810C, BRS Valiosa RR, BRSMG811CRR, Conquista, BG4277, BRSMG760SRR, TMG1175RR, BRSMG752S, BG4272, PRE6336, BMX Tornado RR, NA5909RG, and PRE5808) were analyzed in 6 planting seasons (November 2014, January 2015, March 2015, October 2015, February 2016, and April 2016). Pots containing 3dm³ of substrate were arranged on benches; the substrate consisted of soil previously fertilized according to the technical recommendations for the crop and contained 1/3 organic matter. Seeds of random size were planted at 2cm depth. Once the plants reached the V2 development stage (FEHR & CAVINESS, 1977), a digital caliper was used to measure the length of the hypocotyl, i.e., the distance between the soil and the cotyledon node.
In each planting season, a randomized block design with 4 replicates was used. The experimental unit was composed of 3 plants. The plot mean was used in the analysis of the data on hypocotyl length. For the analysis of variance, the normality of the errors was assessed through the Lilliefors Test. The homogeneity of variance was evaluated according to the Bartlett Test in the Genes Software (CRUZ, 2013). Then a joint analysis of the experiments and stability analyses were carried out using the traditional method (CRUZ et al., 2012), PLAISTED & PETERSON (1959), WRICKE (1965), EBERHART & RUSSELL (1966), and the method based on artificial neural networks (NASCIMENTO et al., 2013).
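Estimator of the stability parameter of the Traditional Method (CRUZ et al., 2012):

$$QM(A/G_i) = \frac{r}{a-1}\left[\sum_{j=1}^{a} Y_{ij}^{2} - \frac{Y_{i.}^{2}}{a}\right]$$

In which: $Y_{ij}$ is the mean of genotype i (i = 1, 2, ..., g) in environment j (j = 1, 2, ..., a); $Y_{i.} = \sum_{j} Y_{ij}$ is the total of genotype i over the environments; and r is the number of replicates associated with each genotype.

The estimator of the stability parameter of PLAISTED & PETERSON (1959):

$$\hat{\theta}_i = \frac{1}{g-1}\sum_{i' \neq i} \hat{\sigma}^{2}_{ga_{ii'}}$$

In which: $\hat{\sigma}^{2}_{ga_{ii'}}$ is the component of variance of the interaction between pairs of genotypes and environments.

The estimator of the stability parameter of WRICKE (1965):

$$\omega_i = r\sum_{j=1}^{a}\left(Y_{ij} - \bar{Y}_{i.} - \bar{Y}_{.j} + \bar{Y}_{..}\right)^{2}$$

In which: $Y_{ij}$ is the mean of genotype i in environment j; $\bar{Y}_{i.}$ is the mean of genotype i; $\bar{Y}_{.j}$ is the mean of environment j; and $\bar{Y}_{..}$ is the overall mean.

According to EBERHART & RUSSELL (1966), the regression model used, the estimators of the stability and adaptability parameters, and the coefficient of determination were:

$$Y_{ij} = \beta_{0i} + \beta_{1i} I_j + \delta_{ij} + \bar{\varepsilon}_{ij}$$

$$\hat{\beta}_{1i} = \frac{\sum_{j} Y_{ij} I_j}{\sum_{j} I_j^{2}}$$

In which: $Y_{ij}$ is the mean of genotype i in environment j; $\beta_{0i}$ is the mean of genotype i; $\beta_{1i}$ is the linear regression coefficient, which measures the response of the i-th genotype to the variation of the environment; $I_j$ is the coded environmental index, whose mean is zero; $\delta_{ij}$ is the regression deviation; and $\bar{\varepsilon}_{ij}$ is the average experimental error. The linear regression coefficient $\beta_{1i}$ was evaluated at 1% and 5% significance by the t test.

$$\hat{\sigma}^{2}_{di} = \frac{QMD_i - QMR}{r}$$

In which: $QMD_i$ is the mean square of the regression deviation of each genotype; and QMR is the mean square of the residue. The variance component assigned to the regression deviations for cultivar i was evaluated at 1% and 5% significance by the F test.

$$R_i^{2}(\%) = \frac{SQ(\text{Linear Regression})_i}{SQ(A/G_i)} \times 100$$

In which: $R_i^{2}(\%)$ is the coefficient of determination for genotype i; $SQ(\text{Linear Regression})_i$ is the sum of squares of the linear regression for genotype i; and $SQ(A/G_i)$ is the sum of squares of environments within genotype i.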
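As an illustration of how these statistics relate to one another, the sketch below computes the Traditional Method mean square, the Wricke statistic, and the Eberhart & Russell parameters from a table of genotype means. It is a minimal sketch under stated assumptions, not the Genes implementation: the function name `stability_statistics`, the data, the number of replicates, and the residual mean square are all hypothetical, the formulas follow the usual textbook forms, and the Plaisted & Peterson estimator is omitted.

```python
def stability_statistics(Y, r, qmr):
    """Per-genotype stability statistics from a genotype x environment table.

    Y[i][j] is the mean of genotype i in environment j, r is the number of
    replicates, and qmr is the residual mean square of the joint analysis.
    Requires at least 3 environments (the deviations have a - 2 d.f.).
    """
    g, a = len(Y), len(Y[0])
    gen_mean = [sum(row) / a for row in Y]
    env_mean = [sum(Y[i][j] for i in range(g)) / g for j in range(a)]
    grand_mean = sum(gen_mean) / g
    # Coded environmental index: environment mean minus overall mean (sums to zero).
    index = [m - grand_mean for m in env_mean]
    sum_index_sq = sum(x * x for x in index)

    results = []
    for i in range(g):
        # Traditional Method: sum of squares and mean square of environments within genotype i.
        sq_env = r * sum((y - gen_mean[i]) ** 2 for y in Y[i])
        qm_env = sq_env / (a - 1)
        # Wricke: contribution of genotype i to the G x E interaction sum of squares.
        wricke = r * sum(
            (Y[i][j] - gen_mean[i] - env_mean[j] + grand_mean) ** 2
            for j in range(a)
        )
        # Eberhart & Russell: slope, deviation variance, and coefficient of determination.
        b1 = sum(Y[i][j] * index[j] for j in range(a)) / sum_index_sq
        sq_reg = r * b1 ** 2 * sum_index_sq
        qmd = (sq_env - sq_reg) / (a - 2)
        s2d = (qmd - qmr) / r
        r2 = 100.0 * sq_reg / sq_env if sq_env > 0 else float("nan")
        results.append(
            {"mean": gen_mean[i], "qm_env": qm_env, "wricke": wricke,
             "b1": b1, "s2d": s2d, "r2": r2}
        )
    return results


# Hypothetical hypocotyl-length means (mm): 3 genotypes x 3 environments.
Y = [
    [10.0, 12.0, 14.0],  # follows the environmental index exactly
    [12.0, 12.0, 12.0],  # does not respond to the environment
    [11.0, 15.0, 19.0],  # responds twice as strongly as the average
]
stats = stability_statistics(Y, r=4, qmr=1.0)
for s in stats:
    print(s)
```

In this toy table the first genotype has a slope of 1 and contributes nothing to the interaction, while the second and third share the interaction sum of squares equally. Negative values of the deviation variance can occur and are usually read as zero.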
In addition, the Tukey test (α=0.05) was used to compare the cultivar averages (means) for hypocotyl length.
Information on the artificial neural network (ANN), the simulation of the data, and the classification of the cultivars regarding adaptability and stability through ANN is available in NASCIMENTO et al. (2013).
In the present study, a backpropagation network with a single hidden layer was used, and 3000 genotypes were simulated; 2400 genotypes were used for training and 600 for validation. The artificial neural network consisted of 1 input layer, 1 hidden layer, and 1 output layer. The input layer had 6 entries, which refer to the hypocotyl length values of the cultivars evaluated in the 6 planting seasons. The number of neurons in the hidden layer ranged from 1 to 10. The output layer was composed of 1 neuron. The output is the classification of the genotype into 1 of the 6 classes defined by EBERHART & RUSSELL (1966).
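The six classes can be pictured as the combination of three adaptability categories with two stability categories. The sketch below assumes the usual reading of the Eberhart & Russell parameters (slope equal to, above, or below 1; regression deviations null or not); the function name `classify` and its arguments are hypothetical, and the two flags stand for the outcomes of the t and F tests.

```python
def classify(b1, slope_differs_from_one, deviations_significant):
    """Assign one of the six adaptability/stability classes.

    b1: estimated regression coefficient of the genotype.
    slope_differs_from_one: True if the t test rejects b1 = 1.
    deviations_significant: True if the F test rejects null regression deviations.
    """
    if not slope_differs_from_one:
        adaptability = "general"
    elif b1 > 1:
        adaptability = "specific to favorable environments"
    else:
        adaptability = "specific to unfavorable environments"
    stability = "low" if deviations_significant else "high"
    return adaptability, stability


# Hypothetical genotype: slope not different from 1, non-significant deviations.
print(classify(1.05, False, False))
```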
The components required for the network included the number of neurons in the hidden layer, the initial values for the weights, the decay rate, and the maximum number of iterations. These components were selected considering the network that provided an error of at most 2% for the test set, as performed by NASCIMENTO et al. (2013) and BARROSO et al. (2013). The best architecture, in which the hidden layer had 6 neurons, was the one that presented a classification error lower than 2%.
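To make the architecture concrete, the sketch below builds an untrained network with 6 inputs, one hidden layer, and 1 output, counts its parameters, and splits 3000 simulated genotypes into 2400 for training and 600 for validation. It is only a sketch under assumptions: the logistic activations, the initial weight range, the seed, and the input values are illustrative choices, no training is performed, and all function names are hypothetical rather than taken from the nnet package.

```python
import math
import random


def count_weights(n_inputs, n_hidden, n_outputs=1):
    """Number of weights, including one bias per hidden and output neuron."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


def init_network(n_inputs, n_hidden, seed=0):
    """Random initial weights; position 0 of each weight vector is the bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(network, x):
    """Forward pass: inputs -> logistic hidden layer -> single logistic output."""
    hidden, output = network
    h = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) for w in hidden]
    return sigmoid(output[0] + sum(wi * hi for wi, hi in zip(output[1:], h)))


def split_indices(n_total=3000, n_train=2400, seed=0):
    """Shuffle genotype indices and split them into training and validation sets."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:]


network = init_network(n_inputs=6, n_hidden=6)
x = [30.0, 28.5, 31.2, 29.8, 30.4, 27.9]  # hypothetical lengths (mm), one per season
y = forward(network, x)
train, valid = split_indices()
print(count_weights(6, 6), round(y, 3), len(train), len(valid))
```

With 6 hidden neurons such a network has 49 weights. In practice each candidate size from 1 to 10 would be trained and compared on the validation genotypes.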
The joint analysis of the experiments and the stability analyses by the Traditional method (CRUZ et al., 2012), PLAISTED & PETERSON (1959), WRICKE (1965), and EBERHART & RUSSELL (1966) were performed using the Genes Software (CRUZ, 2013). To evaluate the adaptability and stability of the 16 soybean cultivars under study through the artificial neural network, the nnet function of the nnet package (VENABLES & RIPLEY, 2002), implemented in R (R Development Core Team, 2018), was used.

RESULTS AND DISCUSSION
In the present study, the analysis of variance identified significant effects (p<0.01 by the F test) of the Cultivar × Planting Season interaction and of Planting Seasons and Cultivars, with a coefficient of variation of 14.16%. These results corroborate those of NOGUEIRA et al. (2008) and MATSUO et al. (2012), who also identified a significant difference between cultivars and reported a similar coefficient of variation. As the effect of the interaction was significant, the best genotype in one environment may not be as good in another environment (CRUZ et al., 2012). This demonstrates that a cultivar can present different hypocotyl lengths according to the planting environment evaluated. However, in the present study, our aim was to identify genotypes of medium-high or medium-low hypocotyl length that are stable throughout the environments, i.e., genotypes that are invariant or predictable in their behavior.
Cultivar BG4272 had the highest average hypocotyl length, whereas cultivars TMG1175RR, BMX Tornado RR, BRSMG760SRR, and BRS810C did not differ statistically from one another and presented the lowest averages (Table 1). The total amplitude was 13.14mm, and the differences between BG4272 and the cultivars of smaller lengths were 13.14mm, 12.56mm, 12.36mm, and 10.05mm, respectively.
By the analysis of the stability parameter of the Traditional Method (Table 1), we noted that cultivars BMX Tornado RR, BRSMG752S, NA5909RG, BRS810C, PRE5808, and BRSMG760SRR presented the six smallest values in the partition of the sum of squares, i.e., the smallest variances across the evaluated environments. This means that they were more stable during the 6 planting seasons. According to CRUZ et al. (2012), the concept of stability is frequently associated with low-yielding cultivars; in this particular case, with the smaller hypocotyl lengths.
Thus, BMX Tornado RR, BRS810C, and BRSMG760SRR presented lower and more stable means, given their smaller partitioned mean squares, i.e., minimum variance across environments compared with the other cultivars analyzed.
The magnitudes of the stability parameters obtained by PLAISTED & PETERSON (1959) and WRICKE (1965) were convergent (Table 1). Cultivars PRE5808, BRSMG820RR, BRS810C, TMG1175RR, NA5909RG, and BRSMG811CRR were identified as the most stable. The methodology proposed by PLAISTED & PETERSON (1959) quantifies the relative contribution of each genotype to the genotype-environment interaction and identifies the genotypes of greater stability. According to CRUZ et al. (2014), the methodology proposed by WRICKE (1965) has basically the same advantages and disadvantages as the one proposed by PLAISTED & PETERSON (1959). These methodologies are interrelated to some extent, as WRICKE's (1965) stability estimates are obtained by partitioning the sum of squares of the Genotype × Environment interaction (SQGxA), whereas those of PLAISTED & PETERSON (1959) are obtained by decomposing the Genotype × Environment variance. Using these methodologies, cultivars BRS810C and TMG1175RR were identified as the cultivars of medium-low length that were more stable than the other cultivars in the different environments analyzed. This comes from the genetic characteristics of the genotype expressed in the different environments evaluated and detected using this statistical method.

Cultivars BRSMG820RR, BRSMG811CRR, BG4277, BRSMG760SRR, TMG1175RR, BRSMG752S, BG4272, BMX Tornado RR, NA5909RG, and PRE5808 presented general adaptability ($\beta_{1i} = 1$) and high stability or predictability ($\sigma^{2}_{di} = 0$), in which $\sigma^{2}_{di}$ is the variance attributed to the deviations of the regression, according to the methodology proposed by EBERHART & RUSSELL (1966) (Table 2). When these results were analyzed in conjunction with the cultivar means, cultivar BG4272 had the highest mean, wide adaptability, and high predictability and stability. In contrast, cultivars BRS810C, BRSMG760SRR, TMG1175RR, and BMX Tornado RR showed lower averages, wide adaptability, and high predictability and stability.
Using this methodology, we were able to identify medium-high or medium-low and stable cultivars regarding the hypocotyl length of soybean plants throughout the 6 planting seasons.

Cultivars P98Y30, BRSM88RR, BRS Valiosa RR, BRSMG811CRR, MG/BR46 (Conquista), BG4277, BRSMG760SRR, TMG1175RR, BRSMG752S, BG4272, PRE6336, BMX Tornado RR, NA5909RG, and PRE5808 showed general adaptability and high predictability by the artificial neural network methodology proposed by NASCIMENTO et al. (2013) (Table 2). When these results were analyzed together with the hypocotyl lengths, cultivar BG4272 was identified as having the highest average, general adaptability, and high predictability, whereas cultivars BRS810C, BRSMG760SRR, TMG1175RR, and BMX Tornado RR were the cultivars with the lowest averages, general adaptability, and high predictability.
For the cultivars with lower or higher mean, high adaptability, and high stability, the classification obtained by artificial neural networks (ANN) agreed 100% with that of EBERHART & RUSSELL (1966). This result corroborates that of TEODORO et al. (2015), who observed 100% agreement between the EBERHART & RUSSELL (1966) and ANN methods in discriminating the phenotypic adaptability of the analyzed genotypes. When all the cultivars analyzed were considered, the agreement between the methods was 87.5% for adaptability and 81.25% for stability. The results of the two methods were not concordant for cultivars BRS Valiosa RR and Conquista in the analysis of adaptability and for P98Y30, BRSMG850GRR, and PRE6336 in the analysis of stability. According to BARROSO et al. (2013), the percentages of agreement between the EBERHART & RUSSELL (1966) methodology and an artificial neural network in the analysis of the adaptability and phenotypic stability of alfalfa genotypes (Medicago sativa) were 81.52% for adaptability and 83.69% for stability. In addition, the high agreement between the network and EBERHART & RUSSELL (1966) showed that the network was trained according to the concepts of adaptability and stability proposed by those authors (BARROSO et al., 2013). According to TEODORO et al. (2015) and CARVALHO et al. (2018), ANNs can be regarded as an effective alternative to measure the adaptability and phenotypic stability of genotypes in genetic improvement programs.
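The agreement percentages for the full set of cultivars follow directly from the number of discordant cultivars out of the 16 evaluated, as the short sketch below reproduces. The function name `percent_agreement` is hypothetical.

```python
N_CULTIVARS = 16

# Cultivars for which the two methods gave different classifications.
discordant = {
    "adaptability": ["BRS Valiosa RR", "Conquista"],
    "stability": ["P98Y30", "BRSMG850GRR", "PRE6336"],
}


def percent_agreement(n_total, n_discordant):
    """Percentage of cultivars classified identically by both methods."""
    return 100.0 * (n_total - n_discordant) / n_total


agreement = {
    criterion: percent_agreement(N_CULTIVARS, len(cultivars))
    for criterion, cultivars in discordant.items()
}
print(agreement)  # {'adaptability': 87.5, 'stability': 81.25}
```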
Artificial neural networks (ANN) have also been used for other purposes, such as: assessing whether an adequate neural network is available for predicting the electrical energy of a photovoltaic system (PINHEIRO et al., 2017); predicting mass gain in animals using multiple linear regression and artificial neural networks (LOPES et al., 2017); simulating oat (Avena sativa) grain yield and optimizing sowing density in the main succession systems of south Brazil through artificial neural networks and genetic algorithms, respectively (DORNELLES et al., 2018); and evaluating the efficiency of ANN and nondestructive sampling to estimate nutrient use efficiency in the trunk (LAFETÁ et al., 2018).

The possibility of identifying invariant cultivars with medium-high or medium-low hypocotyl length, high adaptability, and high stability (predictability) strengthens the hypothesis that, in the set of cultivars analyzed for hypocotyl length, 31.25% of the cultivars were stable throughout the 6 planting seasons. This reinforces the idea that hypocotyl length is a potential descriptor of soybeans and shows that it is possible to identify potential example cultivars for use in DHS assays. Example cultivars, differentiated by macroregion, should be used, as appropriate, as controls to clarify the expression levels of each trait used in DHS tests (SNPC, 2017).

CONCLUSION
Cultivars BRS810C, BRSMG760SRR, TMG1175RR, and BMX Tornado RR were identified as being cultivars of lower mean, high stability, and general adaptability.
In addition, cultivar BG4272 was identified as a cultivar of higher average, high stability, and general adaptability regarding the hypocotyl length of the soybean.

DECLARATION OF CONFLICT OF INTERESTS
The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.