DEEP LEARNING-BASED MODEL FOR CLASSIFICATION OF BEAN NITROGEN STATUS USING DIGITAL CANOPY IMAGING

ABSTRACT Laboratory chemical analysis of leaf samples can be costly and time-consuming, making it impractical for assessing crop variability. To address this challenge, researchers have focused on developing non-invasive tools that aid nitrogen (N) management, maximizing profits, minimizing environmental impact, and meeting market demands. This study aimed to develop a computer vision-based classifier system for assessing the N status in bean crops. An experiment was conducted in a greenhouse, involving five treatments (0%, 50%, 100%, 150%, and 200% N of the recommended dose) with six replications, totaling 30 pots, each containing six seedlings of Phaseolus vulgaris L. beans, in four different phenological phases (V4, R5, R6, and R7). Digital RGB images of the bean canopies were captured using a camera at weekly intervals over four weeks (30, 37, 44, and 51 days after emergence, DAE). The images were manually labeled to create an image database based on N status. Four different computational N status classifiers were developed by training a Convolutional Neural Network (CNN), one for each DAE. The classifiers were evaluated using confusion matrix metrics (accuracy, precision, and recall), resulting in an overall accuracy of about 80% when evaluating nitrogen status at five levels. Improved results were achieved by grouping the saturation classes of the 150% and 200% treatments with the 100% class (>=100% class), yielding an accuracy of 97% for 30 and 44 DAE. Beyond these promising results, the method opens new possibilities for improvement and application to other treatments, electromagnetic spectrum bands, and crops.


INTRODUCTION
Laboratory chemical analyses of leaf samples are expensive and time-consuming, and are therefore infrequently conducted. They may also be inefficient when crop variability is significant. Consequently, research has focused on developing non-invasive tools to support nitrogen (N) management, aiming to maximize profit, minimize environmental impact, and meet market demands. In this context, leaf images and various biometric measures related to shape, texture, and color, among others, have been employed to provide quantitative information associated with plant nutrition (Confalonieri et al., 2015). Golzarian & Frick (2011) extracted distinct color and shape characteristics from images of wheat, ryegrass, and brome grass in the early stages of growth, employing principal component analysis to distinguish between these species. They achieved classification accuracy ranging from 82.4% to 88.2%. Romualdo et al. (2014) proposed an artificial vision system to analyze and interpret RGB (red, green, and blue color model) images for identifying N deficiency at various stages of corn crop development. Their method, based on Naive Bayes classification, successfully identified levels of N deficiency in the initial stages of corn growth, with overall correct classification rates of 82.5% at V4 and 87.5% at V7. Confalonieri et al. (2015) introduced a variation of the dark green color index (DGCI) to estimate N levels in leaves using RGB images. They evaluated the new method using data collected from rice cultivation, yielding a mean square error of 14.9% for N estimation.
Engenharia Agrícola, Jaboticabal, v.43, n.2, e20230068, 2023
Despite the potential of imaging techniques for determining N deficiency, researchers have acknowledged that camera quality and lighting conditions affect method reliability. Additionally, the inherent complexity of biological systems and their component subsystems, as well as the intricate interactions between them, involves numerous variables, resulting in highly complex mathematical models. Recent investigations into computational methods based on artificial intelligence (AI) techniques have demonstrated positive outcomes in generating predictive models and classifiers for plant production systems. For example, Abdalla et al. (2020) combined two techniques to assess the nutritional status of rapeseed at different growth stages, while Gokulnath & UshaDevi (2017) employed machine learning techniques for the automatic detection of plant diseases, enabling early diagnosis. Escalante et al. (2019) used RGB images with CNN to estimate nitrogen fertilization and barley yield.
The Convolutional Neural Network (CNN), a technique within the domain of Deep Learning, has been successfully applied across various fields, including agriculture (Kamilaris et al., 2017). CNNs extend classical machine learning methods by increasing model complexity and allowing hierarchical data representation through multiple levels of abstraction. CNN-based modeling has the potential to address more complex problems accurately and rapidly, provided that sufficiently large datasets describing the problem are available (Kamilaris & Prenafeta-Boldú, 2018). Dyrmann et al. (2016) highlighted that, in comparison with classical classification methods, CNN-based modeling applied to plants is less affected by natural variations such as lighting changes, shadows, bias, and occluded plants.
CNN modeling possesses several features that justify its use, primarily its exceptional performance in image recognition. These models leverage convolutional filters to extract relevant information from images (Bouguettaya et al., 2022). This capability enables various applications, including those within the agricultural domain, such as predicting extreme climate damage to agriculture to minimize economic losses (Benos et al., 2021; Zhang et al., 2021), detecting plant diseases through leaf images captured at different resolutions, facilitating the timely application of preventive techniques (Tiwari et al., 2016; Sambasivam & Opiyo, 2020), and utilizing deep learning with ultra-spectral soil data to achieve precision agriculture (Zhong et al., 2021).
Building upon this foundation, this study proposes and demonstrates a method for designing a computer vision system based on CNN to classify bean plots according to their N content using leaf image cutouts. Furthermore, the approach was evaluated on different days after emergence (DAE) to assess the reliability and feasibility of the computational classifier for distinct phases of bean cultivation.

MATERIAL AND METHODS
An experiment was conducted to create an image database using bean cultivation as the subject. The experiment took place during February and March, which correspond to the summer season, at the facilities of the Faculty of Animal Science and Food Engineering (FZEA) of the University of São Paulo (USP) in Pirassununga, SP, Brazil. The geographic coordinates of the location are approximately 21°57′02″S, 47°27′50″W, with an average elevation of 630 meters above sea level. The regional climate is classified as Cwa under the Köppen system, characterized by two distinct seasons: a rainy summer and a dry winter with infrequent occurrences of frost. The average annual air temperature in the city is 21.5°C. During summer, the mean air temperature, relative humidity, and dew point were recorded as 29.21 ± 6.20°C, 62.59% ± 20.51%, and 20.33 ± 1.79°C, respectively.
The image database was collected and organized to serve as the predictive attribute (input) for training computational classifiers based on deep learning methods, with the N level as the target attribute (output) of the model. The data collection process and modeling approach are described below.

Data acquisition and preprocessing
Treatments consisted of 0%, 50%, 100%, 150%, and 200% of the nitrogen (N) dose recommended on the basis of soil analysis. A total of 30 pots were used, each filled with 15 dm³ of soil obtained from the C horizon and classified as dystrophic Red Latosol. Cultivation was conducted in pots to enable better control over the applied N doses and minimize losses. The pots were arranged in a completely randomized design, in a split-plot scheme in time, with the doses as plots and the sampling times (30, 37, 44, and 51 days after emergence, DAE) as subplots. Irrigation was conducted daily to ensure that water deficit did not influence the plants. In all pots, six seeds of Phaseolus vulgaris L. beans, specifically the cultivar BRSMG Madrepérola, were planted. As N is a highly mobile soil nutrient, doses were applied in two stages: one-third of the total dose at planting and the remaining two-thirds at 20 DAE. The other nutrients were uniformly mixed with the soil material in all pots based on the soil analysis results, with N intentionally left as the limiting factor for bean production.
Digital canopy images were captured using a Fujifilm Finepix S4500 digital color camera, equipped with a 3.0" LCD, 14MP resolution, and 30x optical zoom. The camera was securely mounted at a fixed height using a Vivitar VIV-TR75 universal tripod, which can reach a maximum height of 1.20 m (Figure 1). Photographs of bean plants were captured from approximately 80 cm away from the pots, always at the same time of day (between 10 a.m. and 2 p.m.) to minimize the impact of lighting variation on the images. Environmental control was not applied during the photography sessions, in order to replicate conditions similar to those found in the field (Hennessy et al., 2022). A photograph was taken of each pot to create a comprehensive database. To determine the optimal time for distinguishing N levels in leaves, images were acquired starting from 30 DAE, which is when nitrogen deficiency begins to manifest in the leaves. This process was repeated once a week for a total of four weeks, specifically at 30 DAE, 37 DAE, 44 DAE, and 51 DAE.
The digital images were processed at the Laboratory of Machines and Precision Agriculture (LAMAP) with computational support from the Laboratory of Robotics and Automation of Biosystems Engineering (RAEB) at FZEA-USP. The image dataset was processed using a custom script developed in MATLAB® R2015a software (The MathWorks Inc.). The script performed automatic and random cropping of the images, generating samples or clippings for training the N prediction model.
A preliminary investigation was conducted to explore varied sizes of clippings, considering horizontal and vertical dimensions ranging from 20 to 240 pixels. These dimensions are crucial in deep learning methods because they define the input size of the network. The study revealed that clippings with dimensions of 40x40, 60x20, and 80x80 pixels, all with a resolution of 96 dpi, could be utilized without compromising modeling quality while reducing computational demands. Clippings of this size enable the development of a computer vision system that samples 'n' cutouts from a larger image. The automatic classifier can then generate multiple predictions, which, when combined, result in a more robust and accurate estimate.
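The random cropping step can be illustrated with a minimal Python sketch. It is not the MATLAB script used in the study: the function name `random_cutouts`, the toy image, and the reading of "60x20" as 60 pixels wide by 20 pixels high are assumptions made for illustration only.

```python
import random

def random_cutouts(image, width, height, n, seed=0):
    """Sample n random width x height cutouts from an image given as a list of pixel rows."""
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])
    if height > rows or width > cols:
        raise ValueError("cutout is larger than the image")
    cutouts = []
    for _ in range(n):
        top = rng.randint(0, rows - height)   # randint bounds are inclusive
        left = rng.randint(0, cols - width)
        cutouts.append([row[left:left + width] for row in image[top:top + height]])
    return cutouts

# Toy 100 x 120 "image" whose pixel values encode their own (row, column) position.
image = [[(r, c) for c in range(120)] for r in range(100)]
cutouts = random_cutouts(image, width=60, height=20, n=5)
```

Each cutout is a contiguous block of the original image, so several cutouts from one photograph can be classified independently.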

Convolutional Neural Network-based modeling
Cutouts from images were randomly assigned for the modeling steps, including training, validation, and testing.
For training, approximately 60% of the images were used, resulting in around 1800 clippings for each nitrogen level. Validation was conducted using 20% of the images, which amounted to approximately 600 clippings for each nitrogen level. The validation set was employed during the training process to assess the error and determine the stopping condition. The remaining 20% of the images were reserved for assessing the performance of the N prediction model, yielding approximately 600 clippings for each nitrogen level. This process was repeated for images collected at 30, 37, 44, and 51 DAE.
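A minimal sketch of such a 60/20/20 random assignment is given below. The function `split_cutouts` and the integer identifiers standing in for cutouts are hypothetical; the study does not describe how the assignment was implemented.

```python
import random

def split_cutouts(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the items and split them into training, validation, and test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]   # whatever remains (about 20%)
    return train, val, test

# Hypothetical identifiers for the roughly 3000 cutouts of one nitrogen level.
cutout_ids = range(3000)
train, val, test = split_cutouts(cutout_ids)
print(len(train), len(val), len(test))  # 1800 600 600
```

Applying the split separately to each nitrogen level keeps the class proportions equal across the three subsets.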
To augment the training and validation images and expand the database, translation and mirroring techniques were applied using a MATLAB script. The translation technique involved applying a crop filter, smaller than the original image, at various positions to generate new images from the cropped sections of the original image. Nazki et al. (2020) presented a pipeline utilizing GANs in an unsupervised image translation environment, which improved learning by addressing data distribution in a plant disease dataset. This approach reduced the bias introduced by acute class imbalance and resulted in better classification accuracy (+5.2%) by adjusting the classification decision threshold. The mirroring technique involved creating a second image by performing a 180° rotation on the original image, producing an inverse version. From the database of 12,000 images, approximately 3000 clippings were obtained for each nitrogen level.
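The two augmentation operations can be sketched in Python as follows. This is an illustration rather than the MATLAB script used in the study; the function names, the `step` parameter, and the 3x3 toy image are assumptions. Note that a 180° rotation is equivalent to flipping the image both vertically and horizontally.

```python
def translated_crops(image, crop_h, crop_w, step):
    """Slide a crop window over the image and collect every cropped section."""
    rows, cols = len(image), len(image[0])
    crops = []
    for top in range(0, rows - crop_h + 1, step):
        for left in range(0, cols - crop_w + 1, step):
            crops.append([row[left:left + crop_w] for row in image[top:top + crop_h]])
    return crops

def rotate_180(image):
    """Rotate by 180 degrees: reverse the row order and reverse each row."""
    return [row[::-1] for row in image[::-1]]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
crops = translated_crops(image, crop_h=2, crop_w=2, step=1)   # four 2x2 crops
mirrored = rotate_180(image)   # [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
```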
Table 1 provides examples of 60x20 dimension cutouts from leaves and from leaves with associated soil, representing the five nitrogen levels for images collected at 37 DAE (phenological phase R5).

TABLE 1. Examples of 60x20 dimension cutouts associated with nitrogen levels for images collected at 37 days after emergence (DAE; phenological phase R5).
[Image table. Columns: Nitrogen level (%) | Cutout - Only Leaf | Cutout - Leaf and Soil. Rows: 0, 50, 100, 150.]

Once the image database was organized, four N classifier models were constructed using a CNN architecture, with one model designed for each week of the experiment (30 DAE, 37 DAE, 44 DAE, and 51 DAE). The Sequential model of the Keras library (version 2.0.6) for the Python programming language (version 3.5.4rc1) was used to build these models. The training data was used to construct the models, while the validation data was employed to monitor their progression and determine the optimal configuration of hyperparameters and the most effective combinations of cutout sizes. The fine-tuning process, based on the model with the highest accuracy, enabled the determination of values for the following hyperparameters: layer type (convolution and pooling), size and number of filters, stride, activation function ('relu'), and the number of layers. This iterative process continued until the final models were obtained (as shown in Table 2).
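To make the hyperparameters listed above concrete, the following pure-Python sketch shows what a single convolution filter with a given stride, followed by the 'relu' activation and max pooling, computes on a small single-channel image. It is not the Keras model of Table 2; the filter values, image, and function names are illustrative assumptions.

```python
def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image (no padding) and sum the elementwise products.
    As in CNN libraries, this is a cross-correlation: the kernel is not flipped."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for top in range(0, len(image) - kh + 1, stride):
        row = []
        for left in range(0, len(image[0]) - kw + 1, stride):
            row.append(sum(image[top + i][left + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

def relu(fmap):
    """Replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]

# Toy 5x5 image with a bright vertical band in columns 2 and 3.
image = [[0, 0, 1, 1, 0] for _ in range(5)]
kernel = [[-1, 1],
          [-1, 1]]   # responds to dark-to-bright transitions from left to right
features = max_pool(relu(conv2d(image, kernel)))
print(features)  # [[2, 0], [2, 0]]
```

In a trained CNN the filter values are learned from the data, and many such filters are stacked in successive layers.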

Classifier evaluation and validation
The trained models were assessed using confusion matrices, which compare the N labels predicted during the testing phase with the true labels. Confusion matrices provide valuable insights into the effectiveness of classifiers in terms of accuracy, precision, and recall. Therefore, these metrics were calculated from the values derived from the confusion matrix, including true positives, true negatives, false positives, and false negatives, following the method outlined by Sokolova & Lapalme (2009).

$$\text{Accuracy} = \frac{\text{True positives} + \text{True negatives}}{\text{Total}} \qquad (1)$$

$$\text{Precision} = \frac{\text{True positives}}{\text{True positives} + \text{False positives}} \qquad (2)$$

While accuracy provides an overall measure of correct answers, it can be misleading when classes are unbalanced, because a high value may conceal that the correct answers are concentrated in a single class. Hence, it is crucial to consider additional metrics such as precision and recall. Precision reflects the proportion of samples assigned to a class that truly belong to it, that is, the classifier's ability to avoid false positives. Recall, the ratio of true positives to the sum of true positives and false negatives, indicates the classifier's ability to identify all the correct samples for each class, minimizing false negatives (Sokolova & Lapalme, 2009).
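As a worked illustration of these metrics, the sketch below computes overall accuracy and per-class precision and recall from a multi-class confusion matrix. The matrix values are invented for the example and are not results from this study; overall accuracy is taken here as the sum of the diagonal divided by the total, which is an assumption about how the multi-class value was obtained.

```python
def metrics(cm):
    """cm[i][j] = number of samples of true class i that were predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    precision, recall = [], []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted as k, true class differs
        fn = sum(cm[k]) - tp                        # true class k, predicted otherwise
        precision.append(tp / (tp + fp) if tp + fp else 0.0)
        recall.append(tp / (tp + fn) if tp + fn else 0.0)
    return accuracy, precision, recall

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
cm = [[8, 2, 0],
      [1, 7, 2],
      [0, 1, 9]]
accuracy, precision, recall = metrics(cm)   # accuracy is 24/30 = 0.8
```

For the first class, precision is 8/9 (one sample of another class was assigned to it) while recall is 8/10 (two of its samples were assigned elsewhere), which shows how the two metrics can diverge.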

RESULTS AND DISCUSSION
Individual CNN-based models were trained for each week (n DAE, n days after emergence). These models achieved an overall accuracy of 81.4% at 30 DAE (Table 3), 82.8% at 37 DAE (Table 4), 78.9% at 44 DAE (Table 5), and 80.4% at 51 DAE (Table 6). The most favorable outcomes were obtained using 60x20 cutouts.
To assess N status prediction at 30 DAE, the generated model was evaluated using 592 cutouts from a separate testing dataset. The model could identify the 0%, 50%, and 200% N status accurately. However, it did not perform as well in distinguishing between the 100% and 150% N status, as indicated by lower recall and precision values for these classes (Table 3). Notably, the precision values for the 100% and 150% classes were 50.5% and 75.6%, respectively, which are lower than those of the other classes. This suggests that the image database needs a better balance with more samples to enhance the performance of future CNN-based modeling endeavors.

The model developed to predict N status at 37 DAE was tested using 853 images from a separate test dataset, yielding the highest accuracy of 82.8% (Table 4). However, the best performance was concentrated in the 0% and 150% classes, with precision values of 99.0% and 90.5%, respectively. Consequently, while it achieved the highest accuracy among the models, there is a significant disparity in the correct predictions across different classes.

The third model, trained with 44 DAE data, exhibited the lowest accuracy among the models, measuring 78.5%. It was evaluated using 648 cutouts from a separate test dataset. Nevertheless, it demonstrated favorable performance in correctly identifying the 0%, 50%, and 100% classes, as evidenced by the precision values (100.0%, 89.0%, and 75.8%, respectively) and recall values (93.6%, 98.6%, and 70.5%, respectively) shown in Table 5.

The model designed to predict N status at 51 DAE was tested using 654 cutouts from a separate test dataset. It achieved an accuracy of 80.4%, which is comparable to the models generated for 30 DAE and 37 DAE. Notably, it generated numerous correct predictions for the 0%, 50%, and 100% classes, as demonstrated by the precision values (90.8%, 87.0%, and 85.4%, respectively) and recall values (93.8%, 82.0%, and 70.1%, respectively) outlined in Table 6.
When considering N status classification, the model generated with 37 DAE data performed the best in terms of accuracy (Table 4). However, the precision and recall of each class show that this model was not consistent across classes, resulting in non-uniform performance. Considering these metrics, the best model is the one generated with 30 DAE data. This is particularly evident when combining data from the 100%, 150%, and 200% classes into a single class called ">=100%", as summarized in Table 7. Accordingly, all records with N status values of 100%, 150%, and 200% were grouped as ">=100%" and used alongside records from the 0% and 50% classes to train a CNN-based model for predicting three N status classes (0%, 50%, and >=100%). Notably, these three N doses applied to beans (100%, 150%, and 200%) fall within the recommended range for the crop. Hence, leaf coloration is expected to be similar since plants received the proper amount of N for their development at different phenological stages.

When the three classes are combined (Table 7) into one (>=100%), the overall accuracy of the models increases. Both the 30 DAE and 44 DAE models demonstrate the highest accuracy, reaching about 97%, with consistent precision and recall values. This suggests that leaves from the 100%, 150%, and 200% treatments exhibited little variation in color intensity and distribution, making it difficult for the five-class models to tell them apart.

Sabzi et al. (2021) used hyperspectral imaging (HSI) to predict N levels in cucumber leaves, employing three regression methods (artificial neural networks-particle swarm, partial least squares regression, and convolutional neural networks). These authors grew the plants in 20 plots that received three N levels (30%, 60%, and 90%), with leaves being scanned under controlled light conditions. They noted that the best models achieved an approximate correlation coefficient of 0.9 for numerical value prediction within established ranges. In contrast, we used RGB images collected in an outdoor environment and employed image clippings of leaves for local sensing using computer vision equipment. This equipment enables multiple collections from the same plant, providing classifications for decision-making modules.

Qiu et al. (2021) employed RGB images captured by an unmanned aerial vehicle to estimate the Nitrogen Nutrition Index (NNI) during different growth periods of rice. They utilized machine learning algorithms (adaptive boosting, artificial neural network, K-nearest neighbor, partial least squares, random forest, and support vector machine), with the random forest algorithm demonstrating the best performance and coefficients of determination ranging from 0.88 to 0.96. In contrast, the method proposed in this study focuses on local sensing using computer vision equipment and image clippings of leaves. However, the general methodology could be applied to images of agricultural fields to predict nutrition parameters.

Safa et al. (2019) conducted a study using thermal images and artificial neural networks (ANNs) to estimate N content in perennial ryegrass (Lolium perenne) pastures. They aimed to estimate N content based on plant temperatures and various environmental parameters (air temperature, wind speed, soil temperature and moisture, humidity, and solar radiation). The ANN model accounted for a high proportion of the variance in the training and validation data (94% and 93%, respectively) when estimating pasture N content.
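The grouping of the 100%, 150%, and 200% treatments into a single >=100% class, described earlier in this section, amounts to relabeling the samples before training. A minimal sketch is shown below; the function `group_label` and the list of pot labels are hypothetical and serve only to show the mapping and its effect on class sizes.

```python
from collections import Counter

def group_label(n_percent):
    """Map an N dose (% of the recommended dose) to one of three classes."""
    return ">=100%" if n_percent >= 100 else f"{n_percent}%"

# Hypothetical labels: six pots for each of the five doses.
doses = [0, 50, 100, 150, 200] * 6
grouped = [group_label(d) for d in doses]
counts = Counter(grouped)   # 6 pots in "0%", 6 in "50%", 18 in ">=100%"
```

The merged class holds three times as many samples as each of the other two, so class balance deserves attention when a three-class model is trained on relabeled data.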
The results demonstrate the promising potential of automatic N status prediction using RGB images and deep learning techniques. This method could facilitate the development of computer vision systems that sample cutouts from larger images and use automatic CNN-based classifiers to generate multiple predictions, resulting in more robust and accurate estimates. However, future improvements should include a larger and better-balanced image database with an equal number of images for each class. Additionally, the method should be evaluated to recognize other classes representing treatments below 100% (e.g., 0%, 20%, 40%, 60%, 80%, and 100%). Lastly, exploring other electromagnetic spectrum bands, such as near-infrared images, is another potential avenue for further investigation.
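One simple way to combine the predictions obtained for several cutouts of the same image is a majority vote, sketched below. The study does not specify a combination rule, so the voting scheme, the function name, and the example predictions are assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent class and the share of cutouts that voted for it.
    Ties are resolved in favor of the class encountered first."""
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)

# Hypothetical per-cutout predictions for one canopy image.
cutout_predictions = ["50%", "50%", "0%", "50%", ">=100%"]
label, share = majority_vote(cutout_predictions)   # "50%" with 3 of 5 votes
```

The vote share can serve as a rough confidence indicator for a decision-making module, since isolated misclassified cutouts are outvoted.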

CONCLUSIONS
A CNN-based computer vision system was developed to accurately identify and classify nitrogen status in bean leaves. The system demonstrated the capability to detect nitrogen deficiency in the early stages of common bean development using RGB images. Although the CNN-based classifier performed well, there is still room for improvement through additional data acquisition experiments. Expanding the training database to include a wider range of nitrogen status classes and achieving better balance among the classes would enhance the classifier's performance. Furthermore, the concept presented in this study can be extended to explore other bands of the electromagnetic spectrum and can be adapted for use with different crops.

FIGURE 1. Digital imaging experiment for bean cultivation in pots.

TABLE 2. Convolutional Neural Network (CNN) based architecture for nitrogen level modeling.

TABLE 3. Confusion matrix between the actual and CNN-based models for nitrogen status classification 30 days after emergence.

TABLE 4. Confusion matrix between the actual and CNN-based models for nitrogen status classification 37 days after emergence.

TABLE 5. Confusion matrix between the actual and CNN-based models for nitrogen status classification 44 days after emergence.

TABLE 6. Confusion matrix between the actual and CNN-based models for nitrogen status classification 51 days after emergence.