New classification of weed development stages using machine learning methods: radiomics parameters

Abstract:

Background:  In recent years, artificial intelligence methods based on image processing have been introduced for weed identification and, consequently, weed control. However, image processing methods typically impose a heavy computational workload.

Objective:  The goal of this research was to create a classification model for weed development stages with better accuracy and reduced workload by incorporating the Region of Interest (ROI) technique into existing classification models.

Methods:  Weeds were grown and photographed in several development stages for the dataset. Using the ROI technique, commonly utilized in medicine, the leaf image features were digitized, and a total of 448 sample records were obtained. Of these image features, 9 were identified as the most important variables using the linear regression model. SMOTE analysis was applied to balance the distribution in our data. The data were randomly divided into 70% for training, 15% for testing, and 15% for validation groups. The models were developed by using artificial intelligence methods such as ANFIS, MLPNN, SVM, kNN, Naive Bayes, Decision Tree, Random Forest, Deep Learning, and Logistic Regression.

Results:  Accuracy, Precision, Recall, and F1-score metrics were used to evaluate the performance of the models. The NB, ANFIS, and LR models failed to produce results within acceptable limits, whereas the RF, MLPNN, DT, Keras, and SVM models were successful. The kNN model's results were close to the acceptable limits but fell short nonetheless.

Conclusions:  Based on these results, we demonstrated that our RF, MLPNN, DT, Keras, SVM, and kNN models with ROI implementation can successfully determine the development stages of weeds.

Keywords:
Weed; Radiomics; ROI; Weed Development Stage; Classification

1. Introduction

Since the beginning of human civilization, agriculture has been crucial for providing basic needs such as food, fuel, clothing, and shelter. To maximize crop yields and ensure sustainability, plants are cultivated using various agricultural practices. These practices usually focus on providing optimal conditions—such as proper water, temperature, humidity, and soil nutrients—while carefully managing sowing and planting periods.

Weeds are a significant problem in agriculture, causing qualitative and quantitative crop losses, which vary depending on factors like weed species, crop type, and environmental conditions. Crop losses can reach up to 80% in sugar beet, with complete crop failure in extreme cases (Jabran et al., 2018). Weed control methods include mechanical, biological, and chemical approaches (Korres et al., 2018). Accurate weed identification and population estimation are key to effective control, traditionally achieved through morphological identification (Torun et al., 2023).

With technological advancements, artificial intelligence (AI) and machine learning have become integral to agriculture, enabling weed identification, pest control, and efficient management practices (Cicek et al., 2022). There are numerous studies on weed classification through AI in the literature, employing various models, weed types, crop types, and conditions. Some of the most relevant studies are as follows:

Pathak et al. (2023) demonstrated that k-Nearest Neighbor (kNN), Support Vector Machine (SVM), and Random Forest (RF) models, combined with handcrafted image processing methods, can successfully classify weeds in cornfields, with the highest precision and recall values reaching 0.90 for kNN and an overall ten-fold cross-validation accuracy of 75.37%. UAV-based image classification of crops like wheat and paddy using these techniques has also achieved high precision, with Naïve Bayes (NB) and modified SVM reaching 99.09% (Sugumar, Suganya, 2023).

In other studies, deep learning techniques have been employed for weed detection. Meena et al. (2023) reported 99.62% accuracy using DenseNet in identifying various plant species, while Espejo-Garcia et al. (2020) compared multiple models, finding Xception to be the most accurate with an F1 score of 99.54. Lottes et al. (2016) achieved 90% accuracy using RF and Markov Random Fields, with images obtained from a robot equipped with a camera, to distinguish between sugar beets and weeds, and Potena et al. (2017) reached a 96.1% success rate using CNN and RGB-NIR image segmentation with real-time imagery from a robot.

Other studies have utilized UAV data for weed detection, achieving varying levels of accuracy across different machine learning models. For example, Islam et al. (2021) obtained 96% accuracy with RF and 94% with SVM for weed detection in a chilli field. Yu et al. (2019) achieved an F1 score above 0.95 for detecting five weed types using CNN. Luo et al. (2023) classified 140 different weed seeds using CNN, reaching a precision of 93.11% with GoogLeNet.

Further research highlights the efficiency of CNN models in weed detection and classification. Razfar et al. (2022) demonstrated a 97.7% accuracy rate in soybean weed detection with a five-layer custom CNN, and Ong et al. (2022) achieved 92.41% accuracy with RF and CNN in Chinese cabbage fields using UAV images. Dai et al. (2023) used multispectral imaging to classify wheat and weeds with 91% accuracy with SVM, while Li et al. (2021) achieved 89.1% accuracy with MLPNN and SVM in classifying weeds using hyperspectral data in New Zealand pastures. Alam et al. (2020) used RF for weed classification, achieving 95% accuracy with a dataset of 396 images.

As mentioned above, the methods used for weed classification and identification in agriculture generally focus on image-based classification. Our literature review found no studies that classify weeds using numerical image features, nor studies that determine the growth stages of weeds using AI methods. It is also known that the processing times in these image analysis studies are high. We believe that numerical image features, commonly used in the medical field, can be applied to agriculture to reduce processing times and increase accuracy. In a preliminary prototype study conducted with photos from the same dataset, the growth stages of plants were classified using 5 different deep learning methods based on images with and without ROI. The average processing time for the photos with ROI applied was 8:50:480 (minutes:seconds:milliseconds), while that for the original images was 10:01:240. Furthermore, determining the growth stages of weeds using AI can significantly aid farmers in managing the weed control process and determining the amount of herbicide needed. The efficacy of herbicides differs depending on the growth stage of weeds, which is especially important in precision agriculture and the use of lower herbicide rates (Gerhards et al., 2022). In our study, we aimed to offer a different perspective to the literature by using numerical image features alongside the growth characteristics of the plant as input parameters.

2. Material and Methods

Since this study is based on the plants’ growth and developmental stages, it is crucial to capture images of the plant from the time of planting until it reaches certain stages. The study primarily aimed to create a dataset. After preparing the dataset, images were digitized using ROI, and appropriate features were identified. The study then classified the stages using classification algorithms.

The dataset comprises a total of 73 features. In addition to the features extracted from the images, the plants’ growth time was also recorded, and the images obtained during the study were evaluated together with the growth-time data. The study aims to predict the growth stages of the plants, or to classify them by stage, using all parameters. The study consists of three main stages, as seen in Figure 1: the first is cultivating and photographing the weeds; the second is extracting the ROI images and radiomics features from the photographs; and the last is preparing the dataset, developing the models, and evaluating their performance.

Figure 1
Main Workflow of our study

2.1 Dataset

The initial step in growing weeds is collecting their seeds. Seeds of ten different weed species frequently found in the Sinanpaşa plains (Afyon, Turkey) were collected, dried for planting, and sown under suitable conditions. Data collection was started as soon as the weeds emerged, to create a comprehensive dataset. The development of the weeds was then tracked twice a week at different time intervals, using a Nikon D7000 camera with 'Nikon 50 mm 1.8', 'Tokina 11-16 mm', and 'Sigma 105 mm' lenses. Once the weeds had reached a certain height, they were transferred to 16 different units (2⁴ = 16) with variations in soil, irrigation, light, and fertilizer conditions. Measurements of leaf count, root diameter, moisture, temperature, and height were taken, and the plants were photographed. Additionally, the leaves of the photographed plants were analyzed using ROI, extracting 62 numerical features from the weed images. Eleven more features describing the environment and weed characteristics were added manually. Through measurements and photography at different time intervals, 448 sample records were obtained; the dataset thus comprised 448 rows (samples) and 73 columns (features).

ROI is typically used to define the boundaries of a significant area within an image. In our study, the Lifex 6.30 software package (Nioche et al., 2024) was used to create ROIs from the images (Figure 2a, 2b). In medical imaging, an ROI can be defined to measure the boundaries and size of a tumor. This approach allows focusing on a specific area of interest rather than processing the entire image, thereby extracting various numerical features from that area. Calculations such as mean and maximum values can also be performed within the defined ROI boundaries (Chityala et al., 2004). In our study, the numerical characteristics of leaf images were digitized using ROI.

Figure 2
(a) Original Image and (b) ROI focused image
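
As an illustration of ROI-based feature extraction, the sketch below computes simple first-order statistics inside a boolean mask; the toy image, mask, and `roi_features` helper are hypothetical stand-ins, not Lifex output:

```python
import numpy as np

# Illustrative ROI feature extraction: keep only pixels inside a boolean
# mask and compute basic first-order statistics over them.
def roi_features(image: np.ndarray, mask: np.ndarray) -> dict:
    """Compute basic statistics over the pixels inside the ROI mask."""
    pixels = image[mask]                  # keep only in-ROI pixel values
    return {
        "mean": float(pixels.mean()),
        "max": float(pixels.max()),
        "min": float(pixels.min()),
        "std": float(pixels.std()),
    }

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "leaf image"
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                               # 2x2 ROI in the centre

feats = roi_features(image, mask)
print(feats["mean"])   # mean of pixels 5, 6, 9, 10 -> 7.5
```

Focusing the computation on the masked region, rather than the whole image, is what keeps the per-image workload low.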

The specifications of the device used in the studies are as follows: Intel i7 12650h (2.53 GHz), Windows 11 Pro, Nvidia Geforce RTX 4060 (Mobile). Software versions used in tests were Python 3.9, TensorFlow r2.1.

2.2 Feature Extraction

The data for the features defined above were collected, resulting in a dataset with 448 rows (samples) and 73 columns (features). When the model was tested with all variables, the success rate was not sufficiently high. Therefore, to identify the features most significant with respect to our output variable, the date, a linear regression model was created using the SPSS software package (version 27.0.1). The normalized importance values of the 9 significant features, out of the total 73 affecting our output variable, are shown in Figure 3, and their statistical distribution is given in Table 1. Based on this new configuration, our dataset, which now includes 9 input parameters and 1 output parameter, consists of 10 columns and 448 rows. The output variable, the "date" column, categorizes the growth periods of the weeds into four classes based on expert opinion. Due to the imbalanced distribution of data across these growth periods, the Synthetic Minority Over-Sampling Technique (SMOTE) was applied to balance the data distribution. SMOTE creates new minority-class samples by interpolating between the nearest neighbors of minority-class samples, thereby equalizing the number of samples in all classes (Fernandez et al., 2018). After applying SMOTE, the dataset grew to 512 rows and 10 columns. These 512 data points were randomly split into training, test, and validation sets: 70% (358 data points) for training, 15% (77 data points) for testing, and 15% (77 data points) for validation.

Figure 3
Ranking of Input Parameters by Normalized Importance Levels
Table 1
Statistical distribution of variables used in the study
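
The SMOTE step can be sketched as follows; this is a minimal, hand-rolled interpolation between minority-class neighbors for illustration only, not the library implementation used in the study:

```python
import numpy as np

# Minimal SMOTE-style oversampling sketch: each synthetic sample lies on
# the segment between a randomly chosen minority sample and its nearest
# minority-class neighbour.
def smote_like(X_min: np.ndarray, n_new: int, seed=None) -> np.ndarray:
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)  # distances to sample i
        d[i] = np.inf                                 # exclude the sample itself
        j = int(np.argmin(d))                         # nearest minority neighbour
        lam = rng.random()                            # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synthetic)

X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
X_new = smote_like(X_minority, n_new=5, seed=42)
print(X_new.shape)   # (5, 2)
```

Because synthetic points are interpolations, they stay within the convex hull of the existing minority samples rather than duplicating them.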

Temperature: the soil temperature of the plant, measured with a soil thermometer.
glzlmZp: zone percentage; measures the homogeneity of the homogeneous zones.
glcmElg2: indicates the randomness of grey-level voxel pairs.
ngldmCoa: represents the spatial rate of change in intensity.
conStd: the standard deviation (in the chosen unit) within the volume of interest.
Height: the height of the plant above the soil at the time of measurement.
glcmEner: also known as Uniformity or Angular Second Moment; represents the uniformity of grey-level voxel pairs.
paramsBS: number of parameters of the ratio of the maximum and minimum difference to grey levels.
glzlmZln: Gray-Level Non-Uniformity for zone, or Zone Length Non-Uniformity; indicates the non-uniformity of the grey levels or the length of the homogeneous zones (Nioche et al., 2024).
Date (output variable): defines the period from when the weed emerges from the soil to the time the image is taken; it has been categorized into four classes.

The statistical properties of the dataset are provided in Table 1. During the model creation phase, it was observed that the means of the variables in Table 1 differed considerably. Therefore, to standardize the data in our dataset, normalization was performed using Equation 1, scaling the values between 0 and 1.

(1) xnor = (x − xmin) / (xmax − xmin)

where x is a random feature value that is to be normalized, xnor is the normalized feature value, xmin is the minimum feature value in the dataset, and xmax is the maximum feature value (Ali et al., 2014).
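
Equation 1 translates directly into code; the feature values below are hypothetical:

```python
import numpy as np

# Min-max normalization (Equation 1): x_nor = (x - x_min) / (x_max - x_min),
# mapping every feature value into [0, 1].
def min_max_normalize(x: np.ndarray) -> np.ndarray:
    return (x - x.min()) / (x.max() - x.min())

heights = np.array([2.0, 5.0, 8.0, 10.0])   # hypothetical raw feature values
print(min_max_normalize(heights))           # values: 0, 0.375, 0.75, 1
```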

2.3 Model Development

2.3.1 ANFIS

The ANFIS model was developed using the MATLAB software package, with the membership function types for the normalized input variables chosen as triangular membership functions. A similar triangular membership function was used for each of the nine input variables. The number of membership function sets for each variable was set to 2. The number of epochs during model training was set to 30. For the four-class output variable, 245 rules were used during model development, with the AND operator applied between input parameters. The hybrid method was selected as the training method, and the "wtaver" function was employed during the defuzzification phase, achieving higher performance compared to other functions.

The ANFIS process, essentially a hybrid of Artificial Neural Networks (ANN) and Fuzzy Logic, begins with the fuzzification stage. In this context, fuzzy logic is based on a structure of fuzzy sets and subsets (Zadeh, 1965). While traditional logic dictates that an element either belongs to a set or it does not, fuzzy logic allows an element to have a degree of membership in a set, ranging from 0 to 1. This permits intermediate values, enabling fuzzy logic to process operations in a manner similar to human thinking and to express these operations through mathematical functions.
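
The degree-of-membership idea can be illustrated with the triangular membership function used for the ANFIS inputs; the set boundaries below are hypothetical:

```python
# Triangular membership function with feet a, c and peak b: returns the
# degree (0 to 1) to which x belongs to the fuzzy set (illustrative sketch).
def tri_mf(x: float, a: float, b: float, c: float) -> float:
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)    # rising edge
    return (c - x) / (c - b)        # falling edge

# Degree to which a normalized value 0.25 belongs to a set peaking at 0.5
print(tri_mf(0.25, 0.0, 0.5, 1.0))   # 0.5
```

Unlike a crisp set, the value 0.25 here is "half a member" of the set, which is exactly the intermediate membership the paragraph above describes.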

2.3.2 Multilayer Perceptron Neuronal Network (MLPNN)

MLPNN is a type of artificial neural network algorithm. The algorithm is based on the multilayer perceptron model, consisting of neurons called perceptrons that produce results from multiple inputs by generating a nonlinear mapping between the input and output vectors. The MLPNN architecture fundamentally consists of three layers: input, hidden layers, and output layers, with the number of hidden layers being adjustable (Li et al., 2021).

In our study, besides the input and output layers, three hidden layers were utilized. The first hidden layer contains 4 neurons, the second hidden layer has 7 neurons, and the final hidden layer consists of 9 neurons. The tanh activation function was used for all hidden layers, while lbfgs was selected as the solver for weight optimization, with the maximum number of iterations set to 20,000.
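
Assuming scikit-learn's MLPClassifier (the paper does not name the library), the described configuration might look as follows, with random stand-in data in place of the 9 normalized features:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Random stand-in data: 40 samples, 9 normalized features, 4 stage classes.
rng = np.random.default_rng(0)
X = rng.random((40, 9))
y = rng.integers(0, 4, size=40)

# Three hidden layers with 4, 7, and 9 neurons, tanh activation,
# L-BFGS weight optimization, up to 20,000 iterations (as described above).
model = MLPClassifier(hidden_layer_sizes=(4, 7, 9), activation="tanh",
                      solver="lbfgs", max_iter=20000, random_state=0)
model.fit(X, y)
pred = model.predict(X[:5])
print(pred.shape)   # (5,)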

2.3.3 Support Vector Machine (SVM)

SVM is a method frequently used in classification problems where points on a plane are separated by a line. The goal of the line drawn is to be at the maximum distance from the points of the class elements.

(2) ŷ = 0, if wᵀx + b < 0;  ŷ = 1, if wᵀx + b ≥ 0

Here, w represents the weight, x is the input vector, and b is the bias. If the resulting value for a point is less than 0, it is classified into one class; if it is greater than or equal to 0, it is classified into the other class (Sugumar, Suganya, 2023).
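
Equation 2 as code, with hypothetical weights and bias:

```python
import numpy as np

# Linear SVM decision rule (Equation 2): class 1 if w.T x + b >= 0, else 0.
def svm_decision(w: np.ndarray, b: float, x: np.ndarray) -> int:
    return int(np.dot(w, x) + b >= 0)

w = np.array([1.0, -1.0])   # hypothetical learned weights
b = -0.5                    # hypothetical bias
print(svm_decision(w, b, np.array([2.0, 0.5])))   # 2 - 0.5 - 0.5 = 1.0 >= 0 -> 1
print(svm_decision(w, b, np.array([0.0, 1.0])))   # 0 - 1 - 0.5 < 0 -> 0
```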

In our study, for our 9 normalized input variables and the single 4-class output variable, a multi-class classification model was built using an SVM implementation in Python. Support Vector Classification (SVC) was chosen, and because the classification is multi-class, the decision_function_shape parameter was set to ‘ovo’ (one-vs-one). Among the model parameters, the kernel was set to rbf, with degree 3, gamma ‘scale’, coef0 0.0, probability False, tol 1×10⁻³, cache_size 200, and max_iter −1 (Table 2).

Table 2
SVM Parameters
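
Assuming scikit-learn's SVC, the parameters listed above can be reproduced as a sketch on random stand-in data:

```python
import numpy as np
from sklearn.svm import SVC

# Random stand-in data: 40 samples, 9 normalized features, 4 stage classes.
rng = np.random.default_rng(1)
X = rng.random((40, 9))
y = rng.integers(0, 4, size=40)

# The SVC configuration described above: rbf kernel, degree 3,
# gamma='scale', coef0=0.0, tol=1e-3, one-vs-one multi-class shape.
clf = SVC(kernel="rbf", degree=3, gamma="scale", coef0=0.0,
          probability=False, tol=1e-3, cache_size=200, max_iter=-1,
          decision_function_shape="ovo")
clf.fit(X, y)
print(clf.predict(X[:3]).shape)   # (3,)
```
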
2.3.4 k-Nearest Neighbors (kNN)

kNN algorithm predicts the class of a given value based on the class information of its nearest neighbors. The algorithm primarily relies on two values: distance and the number of neighbors. It calculates the distance of the point to be predicted from other points and determines the class based on the number of nearest neighbors specified by the k value (Pathak et al., 2023). In our study, the kNN algorithm's n_neighbors parameter was set to 3, and the metric parameter was set to Minkowski.
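
A minimal hand-rolled version of this procedure, with k=3 and the Minkowski metric (p=2, i.e. Euclidean), might look like the sketch below; the training points are hypothetical:

```python
import numpy as np
from collections import Counter

# kNN sketch: classify x by majority vote among its k nearest training
# samples under the Minkowski distance of order p.
def knn_predict(X_train, y_train, x, k=3, p=2):
    d = np.sum(np.abs(X_train - x) ** p, axis=1) ** (1 / p)   # Minkowski distance
    nearest = np.argsort(d)[:k]                               # k closest samples
    return Counter(y_train[nearest]).most_common(1)[0][0]     # majority vote

X_train = np.array([[0.0], [0.1], [0.2], [0.9], [1.0], [1.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.05])))   # 0
print(knn_predict(X_train, y_train, np.array([0.95])))   # 1
```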

2.3.5 Naive Bayes (NB)

NB classifier is a probabilistic classification algorithm that uses Bayes’ theorem. Its goal is to find the best match between new data and a set of classifications. To calculate this probabilistically, it converts joint probabilities into the product of prior and conditional probabilities (Yang, 2018). In our NB algorithm, because we are dealing with multi-class classification, MultinomialNB was used.
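
Bayes' theorem in miniature, with hypothetical priors and likelihoods for a single observed feature value:

```python
# Naive Bayes core idea: posterior = prior * likelihood / evidence.
# Priors and per-class likelihoods below are hypothetical illustration values.
priors = {"stage1": 0.5, "stage2": 0.5}
likelihood = {"stage1": 0.2, "stage2": 0.8}   # P(feature | class)

evidence = sum(priors[c] * likelihood[c] for c in priors)
posterior = {c: priors[c] * likelihood[c] / evidence for c in priors}
print(posterior["stage2"])   # ~0.8
```

With several independent features, each class's likelihood becomes a product of per-feature likelihoods, which is the "naive" conditional-independence assumption.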

2.3.6 Decision Tree (DT)

DTs are tree-like structures where each internal node represents a feature, each branch a decision rule, and each leaf node a class label. The topmost node in a tree is known as the root node, and nodes not connected to any further node are known as leaf nodes; the values at the leaf nodes are the predicted outputs. The algorithm progresses from the root towards a leaf by making a decision at each node, and the paths from the root to the leaf nodes represent classification rules (Kamiński et al., 2018). In our study, the criterion parameter of the DT algorithm was set to entropy.

2.3.7 Random Forest (RF)

The RF algorithm is a powerful machine learning technique commonly used in classification. It combines multiple DTs to compute a classification. Each DT is constructed by randomly selecting a subset of features and using a different bootstrap sample from the training data, thus preventing overfitting (Alam et al., 2020). In our RF algorithm, the n_estimators parameter was set to 10, and the criterion was set to log_loss.

2.3.8 Deep Learning (DL)

DL, a subset of machine learning, models artificial neural networks to work similarly to the human brain. DL is a general term for multi-layered neural networks. While a basic neural network contains two hidden layers, a DL model can have many more layers. It extracts the features and representations of the data with hidden layers between the input and output layers. Increasing the number of hidden layers and nodes increases both the accuracy and the cost (time) of the network (Janiesch et al., 2021).

The DL model used consists of 3 layers: 2 hidden layers and 1 output layer. The model was created using the Python Keras library with a sequential model (Figure 4). The input layer contains 9 neurons, corresponding to our normalized input variables (Temperature, glzlmZp, glcmElg2, ngldmCoa, conStd, Height, glcmEner, paramsBS, and glzlmZln). The first hidden layer has 17 neurons and the second has 5; the hyperbolic tangent activation function was used in both. The output layer represents our single output variable, "date," which has 4 classes, and the softmax function was chosen as its activation. During training, the number of epochs was set to 400 and the batch size to 7. Different combinations of parameters, such as the number of layers, the number of neurons per layer, and the number of epochs, were tested, and the highest accuracy was obtained with the values above. The pseudocode of the model is given in Table 3.

Figure 4
Deep Learning architecture general structure
Table 3
Deep Learning Parameters
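
The forward pass of the described architecture (9 inputs, hidden layers of 17 and 5 tanh neurons, softmax over the 4 growth-stage classes) can be sketched in NumPy with random weights in place of the trained Keras parameters:

```python
import numpy as np

# Softmax turns the final layer's scores into class probabilities.
def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(3)
W1, b1 = rng.normal(size=(17, 9)), np.zeros(17)   # 9 -> 17, tanh
W2, b2 = rng.normal(size=(5, 17)), np.zeros(5)    # 17 -> 5, tanh
W3, b3 = rng.normal(size=(4, 5)), np.zeros(4)     # 5 -> 4, softmax

x = rng.random(9)                     # one normalized feature vector
h1 = np.tanh(W1 @ x + b1)
h2 = np.tanh(W2 @ h1 + b2)
probs = softmax(W3 @ h2 + b3)         # probabilities over the 4 classes
print(round(float(probs.sum()), 6))   # 1.0
```

In training, Keras would additionally apply backpropagation for 400 epochs with batch size 7 to fit the weights; the sketch only shows the inference path.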
2.3.9 Logistic Regression (LR)

LR is a classification algorithm used to predict the probability of a categorical dependent variable. It is a model that can best define the relationship between the dependent and independent variables using the least number of variables. It uses the sigmoid (Logistic) function for classification. This function is an S-shaped curve, and classification is performed through the function (Bircan, 2004).

In our study, for the output classification in logistic regression, since we have multi-class classification, our multi_class parameter is set to multinomial, and our solver parameter is set to "lbfgs".
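
The sigmoid function underlying LR can be sketched as follows; for the multinomial case described above, its softmax generalization is used instead:

```python
import numpy as np

# Logistic (sigmoid) function: maps any real score to a probability in (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))   # 0.5 -- the midpoint of the S-shaped curve
```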

3. Results and Discussion

Using the radiomics features extracted from our recordings and the images of weeds taken at different time intervals, we created models using nine different AI methods, selecting the best parameters for each. In evaluating model performance, we used the Accuracy, Precision, Recall, and F1-score metrics. The performance results of the nine models on our training group, randomly selected as 70% of the data, are shown in Table 4. The table first shows the per-class measurement parameters for the four classes of our output variable, the time interval; the remaining columns show the average performance values of the models. Examining Table 4 shows that the NB model's performance is below acceptable values. Although its accuracy for class 3 is close to our acceptable limit of 0.75, the overall performance of the NB model remains poor, and its other metrics are also below acceptable values. The NB method therefore performed poorly in training. In contrast, the MLPNN, DT, and RF models achieved very high performance both within classes and on average. These results raise the question of whether the models have memorized the training data; to answer it, we must also examine their test and validation results. The performance results of our other models were within acceptable values.

Table 4
Performance Results of Training Data Used in the Study
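
The four evaluation metrics can be computed from confusion counts; the binary sketch below uses hypothetical counts (the study averaged the metrics over four classes):

```python
# Accuracy, Precision, Recall, and F1 from a binary confusion matrix.
# tp/fp/fn/tn counts below are hypothetical illustration values.
tp, fp, fn, tn = 40, 5, 10, 45

accuracy = (tp + tn) / (tp + fp + fn + tn)        # share of correct predictions
precision = tp / (tp + fp)                        # correctness of positive calls
recall = tp / (tp + fn)                           # coverage of true positives
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(accuracy, round(precision, 3), recall, round(f1, 3))
```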

Using the training data, we created nine models. We applied these saved models to the test and validation data we had previously set aside and obtained the performance results in Table 5 and Table 6. For a comprehensive evaluation of model performance, these three tables should be considered together. Firstly, our NB model, which had unacceptable training performance, also produced unacceptable values in both test and validation results, confirming that the NB model is unsuitable for this type of classification. Among the MLPNN, DT, and RF models, which had high training performance, the RF model showed acceptable performance in both test and validation results, and the MLPNN and DT models were either above or close to acceptable values. The Keras and SVM models also performed above acceptable values in all measurements. While the kNN model had high training performance, its test and validation results were only somewhat acceptable. Despite the high training performance of our ANFIS model, its test and validation results unfortunately fell outside acceptable limits, and our LR model did not achieve acceptable performance in either the training or test phases.

Table 5
Performance Results of Test Data Used in the Study
Table 6
Performance Results of Validation Data Used in the Study

The average running times of the tested machine learning models were MLPNN: 8.80 milliseconds, SVM: 4.52 milliseconds, kNN: 3.64 milliseconds, NB: 3.64 milliseconds, DT: 7.05 milliseconds, RF: 28.48 milliseconds, DL: 31,862.97 milliseconds, and LR: 5.22 milliseconds.

Guzel et al. (2024) utilized YOLOv5 to predict the growth stages of three different weed species in wheat fields, classifying them using approximately 145,792 images. Their study demonstrated that DL architectures achieved high accuracy rates of 89% to 97% for detecting the first four phenological stages, which are crucial for weed management. While the fifth phenological stage of field forking larkspur was detected with a 45% success rate, the final stages of charlock mustard and creeping thistle were accurately predicted with rates ranging from 81% to 88% (Guzel et al., 2024).

In our study, the best classification performance on the training dataset was achieved by the MLPNN, RF, and DT models, with F1-scores ranging between 0.98 and 1.00. Similarly, the kNN, SVM, and ANFIS models also yielded F1-scores above 0.80. Therefore, the model performances obtained in this study can be considered partially superior to the results of Guzel et al. (2024). In contrast to their study, which reported significantly lower training performance for the 1st and 5th classes, our results showed no notable differences in classification performance across classes (Table 4).

Zhang et al. (2023) conducted a study on classifying weeds in soybean seedlings using ResNet50, VGG16 (Visual Geometry Group 16), and VGG19 (Visual Geometry Group 19). Out of a total of 9,816 augmented samples, 70% were used for training and validation, while 30% were allocated for testing. They reported an accuracy of 0.89 for grass weeds and 0.95 for broadleaf weeds, with an average processing time of 330 milliseconds. The performance metrics of their hybrid model ranged between 0.86 and 0.95, while its processing time varied between 316 and 350 milliseconds.

The average model performance results obtained by Zhang et al. (2023) are quite similar to ours. However, while their study classified weeds into two categories (grass weed and broadleaf weed) across two stages, we achieved comparable accuracy across four different growth stages. Additionally, while their average processing time was 330 milliseconds across all models, our study achieved an average processing time of just 16 milliseconds. Nevertheless, we acknowledge that these metrics are influenced by the dataset size and the physical specifications of the hardware used (Zhang et al., 2023).

4. Conclusions

Determining the development stage of weeds in the field is crucial for controlling them properly. In this study, we aimed to determine the development process of weeds using image processing methods. Specifically, we aimed to develop AI models for agriculture by utilizing numerical image features, an approach commonly applied to MR and CT images in the medical field. This approach uses image feature parameters to reduce the workload instead of processing the images directly on the computer. By identifying the appropriate parameters from our data, we classified the development process of weeds using nine different AI models, which were then tested with both test and validation data.

Among the nine models we developed, the performance results of the NB, ANFIS, and LR models were not within the acceptable range. In contrast, the performance of the RF, MLPNN, DT, Keras, and SVM models was within the acceptable range. Although the kNN model's training performance was within acceptable limits, its test and validation performances were only close to the acceptable range. Based on these results, we demonstrated that our RF, MLPNN, DT, Keras, SVM, and kNN models can successfully determine the development process of weeds. The developed models are expected to assist in creating recommendation systems for determining the appropriate amount and timing of herbicide use in weed management.

  • Funding
    This research received no external funding

Acknowledgements

None

References

  • Alam M, Alam MS, Roman M, Tufail M, Khan MU, Khan MT. Real-time machine-learning based crop/weed detection and classification for variable-rate spraying in precision agriculture. In: Institute of Electrical and Electronics Engineers – IEEE, editor. Proceeding of 2020 7th International Conference on Electrical and Electronics Engineering. New York: Institute of Electrical and Electronics Engineers; 2020. p. 273-280.
  • Ali PJM, Faraj RH, Koya E. Data normalization and standardization: a technical report. Mach Learn Tech Rep. 2014;1(1):1-6. Available from: https://www.doi.org/10.13140/RG.2.2.28948.04489
    » https://www.doi.org/10.13140/RG.2.2.28948.04489
  • Bircan H. [Logistic regression analysis: an application on medical data]. Koc Univ J Soc Sci. 2004;(8):185-208. Turkish.
  • Chityala RN, Hoffmann KR, Bednarek DR, Rudin S. Region of interest (ROI) computed tomography. Proc SPIE Int Soc Opt Eng. 2004;5368(2):534-41. Available from: https://doi.org/10.1117/12.534568
    » https://doi.org/10.1117/12.534568
  • Cicek Y, Uludag A, Gulbandılar E. [Artificial intelligence techniques used in sugar beet production]. Esk Turk World Appl Res Center Inf Technol J. 2022;3(2):54-9. Turkish. Available from: https://www.doi.org/10.53608/estudambilisim.1102769
    » https://www.doi.org/10.53608/estudambilisim.1102769
  • Dai X, Lai W, Yin N, Tao Q, Huang Y. Research on intelligent clearing of weeds in wheat fields using spectral imaging and machine learning. J Cleaner Prod.2023;428. Available from: https://doi.org/10.1016/j.jclepro.2023.139409
    » https://doi.org/10.1016/j.jclepro.2023.139409
  • Espejo-Garcia B, Mylonas N, Athanasakos L, Fountas S. Improving weeds identification with a repository of agricultural pre-trained deep neural networks. Comp Electr Agric.2020;175. Available from: https://doi.org/10.1016/j.compag.2020.105593
    » https://doi.org/10.1016/j.compag.2020.105593
  • Fernandez A, Garcia S, Herrera F, Chawla NV. SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary. J Art Int Res. 2018;61:863-905. Available from: https://doi.org/10.1613/jair.1.11192
    » https://doi.org/10.1613/jair.1.11192
  • Gerhards R, Andujar Sanchez D, Hamouz P, Peteinatos GG, Christensen S, Fernandez-Quintanilla C. Advances in site-specific weed management in agriculture: a review. Weed Res. 2022;62(2):123-33. Available from: https://doi.org/10.1111/wre.12526
    » https://doi.org/10.1111/wre.12526
  • Guzel M, Turan B, Kadioglu I, Basturk A, Sin B, Sadeghpour A. Deep learning for image-based detection of weeds from emergence to maturity in wheat fields. Smart Agric Technol. 2024;9:1-7. Available from: https://doi.org/10.1016/j.atech.2024.100552
    » https://doi.org/10.1016/j.atech.2024.100552
  • Islam N, Rashid MM, Wibowo S, Xu C-Y, Morshed A, Wasimi SA, et al. Early weed detection using image processing and machine learning techniques in an Australian chilli farm. Agriculture. 2021;11(5):1-13. Available from: https://doi.org/10.3390/agriculture11050387
    » https://doi.org/10.3390/agriculture11050387
  • Jabran K, Uludag A, Chauhan BS. Sustainable weed control in rice. In: Korres NE, Burgos NR, Duke SO, editors. Weed control: sustainability, hazards, and risks in cropping systems worldwide. Boca Raton: CRC; 2018. p. 276-87.
  • Janiesch C, Zschech P, Heinrich K. Machine learning and deep learning. Electr Markets. 2021;31(3):685-95. Available from: https://doi.org/10.1007/s12525-021-00475-2
    » https://doi.org/10.1007/s12525-021-00475-2
  • Kamiński B, Jakubczyk M, Szufel P. A framework for sensitivity analysis of decision trees. Cent Eur J Oper Res. 2017;26(1):135-59. Available from: https://doi.org/10.1007/s10100-017-0479-6
    » https://doi.org/10.1007/s10100-017-0479-6
  • Korres NE, Burgos NR, Duke SO, editors. Weed control: sustainability, hazards, and risks in cropping systems worldwide. Boca Raton: CRC; 2018.
  • Li Y, Al-Sarayreh M, Irie K, Hackell D, Bourdot G, Reis MM, Ghamkhar K. Identification of weeds based on hyperspectral imaging and machine learning. Front Plant Sci. 2021;11:1-13. Available from: https://doi.org/10.3389/fpls.2020.611622
    » https://doi.org/10.3389/fpls.2020.611622
  • Lottes P, Hoeferlin M, Sander S, Müter M, Schulze P, Stachniss LC. An effective classification system for separating sugar beets and weeds for precision farming applications. In: Institute of Electrical and Electronics Engineers – IEEE, editor. Proceeding of 2016 IEEE International Conference on Robotics and Automation (ICRA). New York: Institute of Electrical and Electronics Engineers; 2016. p. 5157-63.
  • Luo T, Zhao J, Gu Y, Zhang S, Qiao X, Tian W, et al. Classification of weed seeds based on visual images and deep learning. Inf Proc Agric. 2023;10(1):40-51. Available from: https://doi.org/10.1016/j.inpa.2021.10.002
    » https://doi.org/10.1016/j.inpa.2021.10.002
  • Meena SD, Susank M, Guttula T, Chandana SH, Sheela J. Crop yield improvement with weeds, pest and disease detection. Proc Comp Sci. 2023;218:2369-82. Available from: https://doi.org/10.1016/j.procs.2023.01.212
    » https://doi.org/10.1016/j.procs.2023.01.212
  • Nioche C, Orlhac F, Buvat I. LIFEx texture user guide: local image features extraction. lifexsoft.org; Oct 23, 2024.
  • Ong P, Teo KS, Sia CK. UAV-based weed detection in Chinese cabbage using deep learning. Smart Agric Technol. 2023;4:1-8. Available from: https://doi.org/10.1016/j.atech.2023.100181
    » https://doi.org/10.1016/j.atech.2023.100181
  • Pathak H, Igathinathane C, Howatt K, Zhang Z. Machine learning and handcrafted image processing methods for classifying common weeds in corn field. Smart Agric Technol. 2023;5:1-12. Available from: https://doi.org/10.1016/j.atech.2023.100249
    » https://doi.org/10.1016/j.atech.2023.100249
  • Potena C, Nardi D, Pretto A. Fast and accurate crop and weed identification with summarized train sets for precision agriculture. In: Chen W, Hosoda K, Menegatti E, Shimizu M, Wang H, editors. Intelligent autonomous systems 14: proceedings of the 14th International Conference IAS-14. New York: Springer International; 2017. p. 105-21.
  • Razfar N, True J, Bassiouny R, Venkatesh V, Kashef R. Weed detection in soybean crops using custom lightweight deep learning models. J Agric Food Res. 2022;8:1-10. Available from: https://doi.org/10.1016/j.jafr.2022.100308
    » https://doi.org/10.1016/j.jafr.2022.100308
  • Sugumar R, Suganya D. A multi-spectral image-based high-level classification based on a modified SVM with enhanced PCA and hybrid metaheuristic algorithm. Rem Sens Appl Soc Envir. 2023;31. Available from: https://doi.org/10.1016/j.rsase.2023.100984
    » https://doi.org/10.1016/j.rsase.2023.100984
  • Torun H, Ozkil M, Aksoy N, Uremiş İ, Uludağ A. Cardamine occulta: a new weed and alien plant species in banana production greenhouses in Turkey. Acta Bot Hungar. 2023;65(3-4):399-411. Available from: https://doi.org/10.1556/034.65.2023.3-4.10
    » https://doi.org/10.1556/034.65.2023.3-4.10
  • Yang FJ. An implementation of naive bayes classifier. In: Institute of Electrical and Electronics Engineers – IEEE, editor. Proceeding of 2018 International Conference on Computational Science and Computational Intelligence (CSCI). New York: Institute of Electrical and Electronics Engineers; 2018. p. 301-6.
  • Yu J, Sharpe SM, Schumann AW, Boyd NS. Deep learning for image-based weed detection in turfgrass. Eur J Agron. 2019;104:78-84. Available from: https://doi.org/10.1016/j.eja.2019.01.004
    » https://doi.org/10.1016/j.eja.2019.01.004
  • Zadeh LA. Fuzzy sets. Inf Control. 1965;8(3):338-53. Available from: https://doi.org/10.1016/S0019-9958(65)90241-X
    » https://doi.org/10.1016/S0019-9958(65)90241-X
  • Zhang X, Cui J, Liu H, Han Y, Ai H, Dong C, et al. Weed identification in soybean seedling stage based on optimized faster R-CNN algorithm. Agriculture. 2023;13(1):1-16. Available from: https://doi.org/10.3390/agriculture13010175
    » https://doi.org/10.3390/agriculture13010175

Edited by

  • Editor in Chief:
    Carlos Eduardo Schaedler
  • Associate Editor:
    José Barbosa dos Santos

Publication Dates

  • Publication in this collection
    07 July 2025
  • Date of issue
    2025

History

  • Received
    25 Feb 2025
  • Accepted
    12 May 2025
Sociedade Brasileira da Ciência das Plantas Daninhas - SBCPD
Rua Santa Catarina, 50, sala 1302, 86010-470 - Londrina - PR, Brazil
+55 (51) 3308-6006
E-mail: sbcpd@sbcpd.org