ARTIFICIAL NEURAL NETWORKS TO PREDICT EFFICIENCIES IN SEMI-MECHANIZED BEAN (Phaseolus vulgaris L.) HARVEST

Bean is among the most consumed and produced crops in Brazil. Given the high demand for food, the search for technologies and controllers to increase the efficiency of agricultural systems has grown. This study aimed to model artificial neural network (ANN) architectures to predict mechanical efficiencies in the semi-mechanized bean harvest. We used a multilayer perceptron network with three inputs (harvest moisture, threshing rotor rotation, and feed rate), two hidden layers of neurons, and one output (efficiency). We evaluated the efficiency of the header, of separation on the threshing rotor, and of sieve cleaning, as well as the total efficiency of the machine. The ANNs were processed by a scripted algorithm that modeled the network, varied the number of neurons in the hidden layers, and selected, tested, and validated the ANN with the lowest error. Each ANN was validated by comparing its results with the experimental data. The architectures selected to predict efficiencies were 3-8-15-1 for the header, 3-9-7-1 for threshing and separation, 3-5-11-1 for cleaning, and 3-15-10-1 for the total operation. The ANNs predicted satisfactory results, with errors below 1% and a high hit rate, and are thus valid for predicting efficiencies in the semi-mechanized bean harvest.


INTRODUCTION
Beans (Phaseolus vulgaris L.) are highly relevant to the Brazilian economy and are one of the major food crops for direct human consumption, being rich in carbohydrates, protein, and iron (Conab, 2015). Brazil, India, and Burma are among the world's largest bean producers, and the strong demand for this legume keeps Brazil among them (Almeida et al., 2017). Besides their considerable economic importance, both for domestic consumption and for exports, beans are essential to the population's diet and one of the main protein sources for less economically favored social groups (Nkundabombi et al., 2015). Legumes are uniquely rich in both protein and dietary fiber (Çakir et al., 2019).
In this sense, the increasing demand for this food source makes the efficiency of mechanized harvesting an important consideration. Accordingly, mechanical systems have been designed to enhance harvest efficiency and reduce crop losses. In many cases, losses arising from the harvesting process are high and can result from several factors, directly affecting the product both quantitatively and qualitatively (Souza et al., 2010).
The growing worldwide demand for food has driven the use of technologies to increase production process efficiency, using several tools to make it possible. In this context, more sophisticated techniques that assist in controlling processes and reducing losses have been increasingly sought after. Among these, artificial neural networks (ANNs) are intelligent systems that have been increasingly demanded in recent years (Rivera-Mejía et al., 2012).
Engenharia Agrícola, Jaboticabal, v.42, special issue, e20210097, 2022
ANNs can be used in many fields of expertise (Silva et al., 2016). In agriculture, for instance, they have been used to estimate sugarcane Pol values (Coelho et al., 2019) and recommend fertilization for guava orchards (Silva et al., 2004), as well as to predict corn harvest (Pishgar-Komleh et al., 2012) and harvester header (Peyman et al., 2013; Nadai et al., 2020) losses, among others (Borges et al., 2017; Sadeck et al., 2017; Pereira et al., 2018; Marey et al., 2020). Therefore, given the potential of ANNs and the need for effective mechanized harvesting systems, this study aimed to model ANN architectures to predict the efficiency of the pick-up header, of grain threshing and separation, of product cleaning, and of the entire harvesting operation during the semi-mechanized harvesting of beans, validating these networks through comparison with experimentally obtained data.

Modeling of ANNs
Data were processed using a multilayer perceptron (MLP) network with a multilayer feedforward architecture. This type of network is versatile and has a wide range of applications (Silva et al., 2016).
The network used 70% of the data set for learning, with supervised training by a backpropagation algorithm. Another 15% of the data was used for testing and 15% for numerical validation to evaluate network performance.
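The 70/15/15 partition described above can be illustrated with a minimal sketch. It assumes a random shuffle with a fixed seed; the source does not state how samples were assigned to each subset, and the function name `split_dataset` is hypothetical.

```python
import random

def split_dataset(samples, train_frac=0.70, test_frac=0.15, seed=0):
    """Shuffle samples and split them into training, testing, and validation subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)      # random assignment is an assumption
    n_train = round(train_frac * len(shuffled))
    n_test = round(test_frac * len(shuffled))
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]   # remainder, about 15% of the data
    return train, test, validation

# Placeholder data: 100 sample indices instead of real field measurements.
train, test, validation = split_dataset(range(100))
```

Fixing the seed keeps the partition reproducible, which helps when several candidate architectures are compared on the same subsets.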
An MLP network with a 3-n1-n2-1 configuration was used to search for an architecture that could adequately represent efficiency during harvesting. It had three variables in the input vector (harvest moisture, axial-flow thresher cylinder rotation, and harvester feed rate), n1 and n2 neurons in the first and second hidden layers, respectively, whose numbers were varied to find the best-performing architecture, and one variable in the output layer (efficiency).
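The search over n1 and n2 can be sketched as an exhaustive loop over candidate hidden-layer sizes. This is only an illustration: the search range (1 to 15 neurons per layer) is an assumption, `select_architecture` is a hypothetical name, and `toy_evaluate` is a stand-in for "train the network and return its validation error", which the source performs with its own scripted algorithm.

```python
from itertools import product

def select_architecture(evaluate, n1_values, n2_values):
    """Return (error, (n1, n2)) for the lowest-error 3-n1-n2-1 candidate."""
    best = None
    for n1, n2 in product(n1_values, n2_values):
        error = evaluate(n1, n2)               # e.g. validation MSE after training
        if best is None or error < best[0]:
            best = (error, (n1, n2))
    return best

def toy_evaluate(n1, n2):
    """Toy error surface with its minimum at (8, 15); not a real training run."""
    return (n1 - 8) ** 2 + (n2 - 15) ** 2

best_error, best_shape = select_architecture(toy_evaluate, range(1, 16), range(1, 16))
```

In practice, `evaluate` would train each candidate on the training subset and score it on held-out data, so the selection is not biased toward the training set.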
The activation functions were the hyperbolic tangent for the first hidden layer, the logistic function for the second hidden layer, and a linear function for the output layer. These functions were chosen because they are the most recommended for MLP networks and gave better results for the studied problem. Figure 1 shows the network representation with the configuration adopted in this study.
FIGURE 1. Representation of a multilayer perceptron (MLP) network with configuration 3-n1-n2-1.
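A forward pass through a 3-n1-n2-1 network with the activation scheme described above (hyperbolic tangent, logistic, linear) can be sketched as follows. The weights are random and untrained, the inputs are assumed to be scaled to [0, 1], and all function names are hypothetical; the sketch shows only the structure of the computation, not the fitted model.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and biases for a layer with n_out neurons."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases

def forward(x, layers):
    h1 = dense(x, *layers[0], math.tanh)        # first hidden layer: hyperbolic tangent
    h2 = dense(h1, *layers[1], logistic)        # second hidden layer: logistic
    return dense(h2, *layers[2], lambda z: z)   # output layer: linear

rng = random.Random(0)
n1, n2 = 8, 15                                  # e.g. the 3-8-15-1 architecture
layers = [init_layer(3, n1, rng), init_layer(n1, n2, rng), init_layer(n2, 1, rng)]
# Inputs: moisture, rotor rotation, feed rate, each scaled to [0, 1] (assumed scaling).
output = forward([0.5, 0.5, 0.5], layers)
```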

Field experiment
The network was trained, tested, and validated with harvest efficiency data from a field experiment. These data encompassed changes in grain water content (9.9 to 18.4%), header feed rates (0.4 to 4.97 kg s⁻¹), and threshing cylinder speeds (420 to 540 rpm). The cylinder-concave clearance during the test was set at 20 mm. Beans of the cultivar Carioca were used in the tests. The evaluated harvester was a Double Master pick-up threshing machine with an axial-flow (rotor) thresher system. The maximum harvesting capacity of the harvester was 6.5 t h⁻¹. The machine was pulled and driven by a Massey-Ferguson tractor, Model MF 620, with a power of 82 kW.
We used the material other than grain (MOG) mass feed rate, estimated as in Equation 1, because it better represents the effects of four elements involved in the harvesting process, namely the width equivalent to the number of windrowed rows, bean yield, MOG, and harvester travel speed.
Where: FR is the MOG mass feed rate (kg s⁻¹); W is the equivalent width of the header (m); S is the speed of the pick-up thresher (km h⁻¹); Yd is the bean yield (kg ha⁻¹); and Mog is the ratio of MOG mass to grain mass (kg kg⁻¹).
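The feed rate calculation can be sketched from the variable definitions and their units alone. The form below is a dimensionally consistent reconstruction (width × speed × yield × MOG ratio, with unit conversions), not a transcription of the authors' Equation 1, and the input values are illustrative rather than experimental.

```python
def mog_feed_rate(width_m, speed_km_h, yield_kg_ha, mog_ratio):
    """MOG mass feed rate (kg/s) from header width, travel speed, yield, and MOG:grain ratio."""
    speed_m_s = speed_km_h / 3.6                   # km/h -> m/s
    area_rate_ha_s = width_m * speed_m_s / 1.0e4   # m^2/s -> ha/s
    grain_rate_kg_s = area_rate_ha_s * yield_kg_ha
    return grain_rate_kg_s * mog_ratio

# Illustrative values only, not taken from the experiment.
fr = mog_feed_rate(width_m=3.0, speed_km_h=3.6, yield_kg_ha=2000.0, mog_ratio=1.5)
```

With these assumed inputs the function gives about 0.9 kg s⁻¹, which falls inside the 0.4 to 4.97 kg s⁻¹ range reported for the experiment.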
The total loss of the machine (Equation 2) was determined by the sum of the losses on the header (Equation 3), thresher cylinder (Equation 4), and cleaning sieve (Equation 5). The efficiency is the quotient obtained from the division of the loss by bean yield, and its values are expressed as a percentage.
Where: ηr is the loss on the header (kg ha⁻¹); mr is the mass of grains lost under the header (g); ηs is the loss in the separation system (kg ha⁻¹); ms is the mass of grains lost in the threshing and separation system (g); ηl is the loss in the cleaning system (kg ha⁻¹); ml is the mass of grains lost after the passage of the harvester machine (g); ηt is the total loss of the harvester (kg ha⁻¹); and A is the area of a frame placed in the transverse direction to the machine's displacement (m²).
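The loss calculations can be sketched from the units given above. Two points are assumptions rather than statements from the source: each loss is taken as the collected mass divided by the frame area, converted from g m⁻² to kg ha⁻¹, and efficiency is interpreted as the share of the yield that is not lost. The total loss as the sum of the three partial losses follows the text. All names and sample values are hypothetical.

```python
def loss_kg_per_ha(mass_g, frame_area_m2):
    """Convert grain mass collected in a frame (g over m^2) to a loss in kg/ha."""
    return 10.0 * mass_g / frame_area_m2     # 1 g/m^2 equals 10 kg/ha

def total_loss(header, separation, cleaning):
    """Total machine loss as the sum of the three partial losses (kg/ha)."""
    return header + separation + cleaning

def efficiency_percent(loss, yield_kg_ha):
    """Assumed definition: percentage of the yield that is not lost."""
    return 100.0 * (1.0 - loss / yield_kg_ha)

# Illustrative masses (g) collected in a 2 m^2 frame; not experimental data.
eta_r = loss_kg_per_ha(6.0, 2.0)   # header
eta_s = loss_kg_per_ha(4.0, 2.0)   # threshing and separation
eta_l = loss_kg_per_ha(1.0, 2.0)   # cleaning
eta_t = total_loss(eta_r, eta_s, eta_l)
total_efficiency = efficiency_percent(eta_t, 2000.0)
```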

Statistical validation of the prediction model
The data were subjected to the statistical test proposed by Leite & Oliveira (2002), a procedure derived from the method described by Graybill (1976) for performing the F-test. In the test, Yj is considered the alternative method, while Y1 is taken as the standard method. Their relationship is expressed in matrix form as Yj = Y1β + ε.
The F-test at a 5% probability level was applied to the hypothesis H0: β0 = 0 and β1 = 1. The value of F(H0), the t-test for the mean error (tē), and the criterion rYjY1 ≥ (1 − |ē|) were used together to analyze model validation. This analysis made it possible to confirm ANN modeling as an alternative for predicting the efficiency of the pick-up header, of grain threshing and separation, and of product cleaning, as well as the total machine efficiency during the bean harvest.
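The F statistic for the joint hypothesis H0: β0 = 0 and β1 = 1 can be sketched as below. This uses the standard form of the simultaneous test for a simple linear regression of the alternative method on the standard method, F = (β̂ − θ)′X′X(β̂ − θ) / (2·MSE) with θ = (0, 1)′; it is an illustration under that assumption, not the authors' script. The function name and data are hypothetical, and the critical value F(5%; 2, n − 2) would still have to be taken from a table or a statistics library.

```python
def graybill_f(standard, alternative):
    """F(H0) for H0: intercept = 0 and slope = 1 in alternative = b0 + b1 * standard."""
    n = len(standard)
    mean_x = sum(standard) / n
    mean_y = sum(alternative) / n
    sxx = sum((x - mean_x) ** 2 for x in standard)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(standard, alternative))
    b1 = sxy / sxx                        # least-squares slope
    b0 = mean_y - b1 * mean_x             # least-squares intercept
    sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(standard, alternative))
    mse = sse / (n - 2)                   # residual mean square
    # (b - theta)' X'X (b - theta) equals the squared distance between the
    # fitted line and the identity line, summed over the observations.
    quad = sum((b0 + b1 * x - x) ** 2 for x in standard)
    return quad / (2.0 * mse)

# Made-up efficiencies (%), not experimental data.
observed = [96.1, 96.8, 97.4, 98.0, 98.9, 99.3]
close = [96.2, 96.7, 97.5, 97.9, 99.0, 99.2]      # small alternating deviations
shifted = [p + 2.0 for p in close]                # same deviations plus a 2-point bias
f_close = graybill_f(observed, close)
f_shifted = graybill_f(observed, shifted)
```

A systematic bias inflates F(H0), whereas predictions scattered evenly around the identity line keep it small.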

Selected artificial neural networks
The efficiency prediction for the pick-up header of a grain threshing machine had the best neural network architecture with a 3-8-15-1 configuration. This structure was composed of three inputs (water content, thresher cylinder rotation, and feed rate), eight neurons in the first hidden layer, 15 neurons in the second hidden layer, and one output (efficiency).
The best mean squared error (Best) between the observed data and those predicted by the selected neural network was 1.04. The mathematical model was determined in 23 iterations (Epochs), the point at which the training, testing, and validation errors were closest to the best mean squared error. Figure 2 shows the mean squared error of the network. Figure 3 compares the observed data with those estimated by the selected ANN during training, testing, and validation. Training and validation had linear correlations above 0.86, while testing had a value of 0.62. Thus, the selected ANN was effective in processing, with validation corresponding to 86% of the observed data.
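How a "Best" value and its epoch could be read from an error curve is sketched below. The sketch assumes that the best epoch is simply the one with the lowest validation mean squared error; the source does not state the exact stopping rule, and the error history shown is made up.

```python
def best_epoch(validation_mse):
    """Return (epoch, mse) for the lowest validation error; epochs are counted from 1."""
    index = min(range(len(validation_mse)), key=validation_mse.__getitem__)
    return index + 1, validation_mse[index]

# Made-up validation curve: the error falls, reaches a minimum, then rises again.
history = [3.20, 2.10, 1.45, 1.04, 1.10, 1.31]
epoch, mse = best_epoch(history)
```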
The efficiency prediction for grain threshing and separation by pick-up threshing machine had a network with a 3-9-7-1 structure as the best architecture. It consisted of three inputs (water content, thresher cylinder rotation, and feed rate), nine neurons in the first hidden layer, seven neurons in the second hidden layer, and one output (efficiency). The best mean squared error (Best) between the observed data and ANN during the prediction of separation efficiency was 0.42. During network processing, 74 iterations (Epochs) were required to elaborate the mathematical model (Figure 4).
The network performed well when the observed and ANN-calculated data were compared: validation and testing yielded linear correlation values above 86%, while training yielded values above 93%. Figure 5 shows that this ANN performed better in its processing than the other networks, since validation corresponded to 89% of the observed data.
The prediction of grain cleaning efficiency during bean harvest, performed by a pick-up threshing machine, had as its best architecture a neural network with a 3-5-11-1 configuration, consisting of three inputs (water content, thresher cylinder rotation, and feed rate), five neurons in the first hidden layer, 11 neurons in the second hidden layer, and one output (efficiency).
The best mean squared error (Best) between the observed data and the results from the neural network was 0.02. Thirty-five iterations (Epochs) were required to determine the mathematical model during the network learning process. Figure 6 displays the ANN processing performance and respective calculated mean squared errors.
As observed for the pick-up header and grain separation, the network selected to predict cleaning efficiency was of high quality. The observed and ANN-estimated data showed correlations above 83% for training and testing, while validation had a linear correlation coefficient of 0.88, i.e., 88% of the validation corresponded to the observed data (Figure 7).
The prediction of the total efficiency for harvesting beans with a pick-up threshing machine had as its best architecture a network with a 3-15-10-1 configuration, consisting of three inputs (water content, thresher cylinder rotation, and feed rate), 15 neurons in the first hidden layer, ten neurons in the second hidden layer, and one output (efficiency).
The best mean squared error (Best) between observed and predicted values was 1.84. Thirty-seven iterations (Epochs) were needed to determine the mathematical model to predict the total efficiency. Figure 8 demonstrates the network performance during processing and respective calculated values.
The parameters for the total efficiency (Figure 9) show a slight decrease in linear correlation: training had a correlation higher than 88% and testing higher than 76%, whereas validation was lower, at slightly above 60%. Even with this decrease, the ANN quality is considered satisfactory, since its predictions had a low mean relative error and passed statistical validation.
Table 3 shows the identity analysis of the model between the data predicted by the neural networks and those obtained experimentally. The mean relative errors were lower than 1%, the coefficients of determination were higher than 70%, the mean errors were considered null by the t-test, and, for all neural networks, the correlation between the validation data was higher than the network accuracy (1 − e).
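The mean relative error and the comparison between correlation and accuracy can be sketched as follows. The definition of the mean error as the average of (predicted − observed) / observed is an assumption, since the source does not write it out, and the data are made up; the sketch shows how the quantities relate, not the values in Table 3.

```python
import math

def mean_relative_error(observed, predicted):
    """Mean of (predicted - observed) / observed, returned as a fraction."""
    return sum((p - o) / o for o, p in zip(observed, predicted)) / len(observed)

def pearson_r(x, y):
    """Linear correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up efficiencies (%), not experimental data.
observed = [96.1, 96.8, 97.4, 98.0, 98.9, 99.3]
predicted = [96.3, 96.7, 97.6, 97.9, 99.0, 99.2]
e_bar = mean_relative_error(observed, predicted)
r = pearson_r(observed, predicted)
r_squared = r ** 2                            # coefficient of determination
passes_criterion = r >= 1.0 - abs(e_bar)      # correlation compared with accuracy
```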

Statistical validation of selected neural networks
The predictions of the neural network selected for the total efficiency of semi-mechanized bean harvesting were considered statistically equal to the observed data, since the Graybill test was not significant and the other analyzed parameters met their criteria. The other efficiency predictions, however, should be checked during harvesting to verify the continued quality of the ANN predictions. Leite & Oliveira (2002) reported that a significant F(H0) might be acceptable in some cases, especially when the mean squared residual is very low. In this case, the value of F(H0) tends to be very high, resulting in a high significance level. This occurs when the results of both methods are similar. Thus, the networks obtained to predict platform, threshing, separation, and cleaning efficiencies were also considered valid.
When the mean relative error was evaluated, the results obtained with the neural network were lower than those observed by Souza et al. (2003), who mathematically modeled threshing and separation efficiencies during bean harvesting, with a mean relative error of 1.78%. The implemented simulation model can be considered valid with this error value because it presents a low mean relative error in the simulation of the efficiency of threshing and mechanical separation of the bean.
TABLE 3. Identity analysis of the ANN model evaluated by the mean relative error (Er), coefficient of determination (R²), mean error t-test (te), and the relationship between correlation and network accuracy (R>(1−e)).

The mean coefficient of determination of the selected neural networks was 0.76, a value equivalent to the mean coefficient of Souza et al. (2001), who modeled losses in the semi-mechanized bean harvest using linear models.
The coefficient of determination obtained for harvesting efficiency in the header is lower than that found by Peyman et al. (2013), who modeled losses in the cutting header and found a coefficient of determination of 0.837. Pishgar-Komleh et al. (2012) used ANNs to predict corn harvest losses related to threshing cylinder speed and harvester travel speed. These authors validated the network using the highest coefficient of determination, which was 0.93 for a network with a 2-7-10-1 architecture, but with a relative error of 15.48%. On the other hand, the relative errors found in the present study for predicting the header, cleaning, separation, and total efficiencies were lower than 1%.
Figure 10 shows the efficiencies of the pick-up header, separation on the threshing rotor, and sieve cleaning, as well as the total efficiency, obtained experimentally and predicted by ANN. The selected network tended to slightly overestimate (0.64%) the pick-up efficiency (Figure 10a) up to 96.7% of the experimental efficiency, after which it tended to underestimate (0.23%) the data. The prediction of threshing efficiency (Figure 10b) showed an overestimation trend (0.52%) up to 97.4% of the observed efficiency, followed by an underestimation trend (0.20%). Moreover, an overestimation trend (1.20%) was observed for total efficiency (Figure 10d) up to 93.9%; from this value onwards, the neural network slightly underestimated (0.32%) the data. However, little variation between ANN predictions and experimental data was observed for grain cleaning (Figure 10c), with a null trend up to 99.62%, followed by a slight overestimation trend (0.07%).
As we could train and validate the ANNs using a small representative data sample, the method developed to obtain the ANNs can be used to model machine performance in other situations (Lunardi & Lima-Junior, 2021) and crop conditions (Borges et al., 2017; Marey et al., 2020).
Because the machine is tractor-pulled, tractor speed is expected to control feed rate and hence crop yield. According to Queiroz et al. (2020), yield can be determined using predefined thematic maps or grain flow sensors. Figure 11 shows a schematic model of the internal machine mechanisms, in which the header can be driven as a function of the tractor travel speed.