Data mining for ranking sorghum seed lots

Rocha, Luciana D.; Gadotti, Gizele I.; Bernardy, Ruan; Pinheiro, Romário de M.; Monteiro, Rita de C. M.

doi:10.1590/1983-21252023v36n224rc


Agronomy • Rev. Caatinga 36 (2) • Apr-Jun 2023 • https://doi.org/10.1590/1983-21252023v36n224rc

Mineração de dados no ranqueamento de lotes de sementes de sorgo

ABSTRACT

Ranking seed lots is a fundamental process for companies in the seed industry. This work demonstrates data mining methods for ranking sorghum seed lots during seed processing through the analysis of quality control data. Germination and cold tests were performed to assess the physiological quality of the lots. Seed samples from each lot were evaluated at two moments: post-cleaning and as finished product (ready for marketing). After pre-processing, the data set totaled 188 rows with six attributes, encompassing 150 lots accepted for marketing, 6 rejected, and 32 intermediate lots. The classifiers used were J48, Random Forest, Classification Via Regression, Naive Bayes, Multilayer Perceptron, and IBk. The Resample filter was used to adjust the data. Ten-fold cross-validation was used for training. Accuracy, Precision, Recall, F-measure, and ROC Area were used to evaluate the performance of the algorithms, and the results were used to determine the best machine-learning algorithm. IBk and J48 presented the highest classification accuracy; IBk presented the best results overall. The Resample filter was essential for solving the data imbalance problem. Sorghum seed lots can be classified with high accuracy and precision through artificial intelligence and machine learning techniques.
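The ten-fold procedure mentioned in the abstract can be illustrated with a minimal sketch that only partitions row indices. It assumes a plain shuffled split; the function names, the seed, and the absence of class stratification are illustrative choices, not details taken from the study.

```python
import random

def k_fold_indices(n_rows, k=10, seed=0):
    """Shuffle row indices and split them into k folds of nearly equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one row.
    return [indices[i::k] for i in range(k)]

def train_test_splits(n_rows, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    folds = k_fold_indices(n_rows, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 188 rows, as reported in the abstract
splits = list(train_test_splits(188, k=10))
fold_sizes = [len(test) for _, test in splits]
```

With 188 rows and ten folds, eight folds hold 19 rows and two hold 18, so every model is trained on roughly 90% of the data and tested on the remaining 10%.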

Keywords:
Quality; Post-harvest technology; Artificial intelligence

Attribute	Description	Value
Lots	Lots	{1-187}
Germination	Post-cleaning	{0-100}
Germination	Finished product	{0-100}
Cold test	Post-cleaning	{0-100}
Cold test	Finished product	{0-100}
Lot classification	Decision	{Accepted, rejected, intermediate}

Algorithm	Correct classification (%), no resample	Correct classification (%), with resample
CVR	79.8 cb	86.2 c
J48	93.1 a	96.3 a
Naïve Bayes	92.5 a	94.7 b
Random Forest	80.8 b	93.1 b
Multilayer Perceptron	89.4 b	93.08 b
IBk	93.1 a	96.8 a
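The effect of resampling on class balance can be sketched as follows. The source does not state which variant or settings of the Resample filter were applied, so this is only an illustration of sampling with replacement within each class; the function name `resample` and the parameter `bias_to_uniform` are hypothetical, and the row values are placeholder ids rather than real lot data.

```python
import random
from collections import Counter, defaultdict

def resample(rows, labels, bias_to_uniform=0.0, seed=1):
    """Draw a sample of about the same size, with replacement, class by class.

    bias_to_uniform = 0 keeps the input class proportions;
    bias_to_uniform = 1 aims for equal class sizes.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n, n_classes = len(rows), len(by_class)
    new_rows, new_labels = [], []
    for label, members in by_class.items():
        # Blend the observed class share with the uniform share 1 / n_classes.
        share = (1 - bias_to_uniform) * len(members) / n + bias_to_uniform / n_classes
        target = round(share * n)
        new_rows.extend(rng.choice(members) for _ in range(target))
        new_labels.extend([label] * target)
    return new_rows, new_labels

# Class counts from the abstract; rows are placeholder ids (assumption)
labels = ["accepted"] * 150 + ["rejected"] * 6 + ["intermediate"] * 32
rows = list(range(len(labels)))
_, same = resample(rows, labels, bias_to_uniform=0.0)
_, balanced = resample(rows, labels, bias_to_uniform=1.0)
same_counts = Counter(same)
balanced_counts = Counter(balanced)
```

With `bias_to_uniform=0.0` the class counts stay at 150, 6, and 32; with `bias_to_uniform=1.0` each class is drawn 63 times, which shows how strongly the minority classes can be oversampled.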

Classifier	Recall	Precision	ROC Area	F-Measure	Class
IBk	0.987	0.974	0.924	0.980	Accept
	1.000	1.000	1.000	1.000	Reject
	0.871	0.931	0.908	0.900	Intermediary
J48	0.987	0.987	0.940	0.987	Accept
	0.857	0.667	0.916	0.750	Reject
	0.871	0.931	0.901	0.900	Intermediary
CVR	0.877	0.971	0.974	0.922	Accept
	0.769	1.000	1.000	0.870	Reject
	0.810	0.436	0.952	0.567	Intermediary
MLP	0.987	0.974	0.996	0.980	Accept
	0.000	0.000	1.000	0.000	Reject
	0.871	0.750	0.990	0.806	Intermediary
Naïve Bayes	0.948	0.986	0.992	0.967	Accept
	1.000	1.000	1.000	1.000	Reject
	0.905	0.704	0.985	0.792	Intermediary
Random Forest	1.000	0.922	1.000	0.960	Accept
	0.769	1.000	1.000	0.870	Reject
	0.524	1.000	0.997	0.688	Intermediary
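The F-Measure column is the harmonic mean of the Precision and Recall columns, which can be checked with a short sketch. The helper name `f_measure` is illustrative, and returning 0 when both inputs are 0 is an assumed convention that matches the MLP Reject row.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (recall, precision) pairs copied from the table above
examples = {
    ("IBk", "Accept"): (0.987, 0.974),
    ("J48", "Reject"): (0.857, 0.667),
    ("MLP", "Reject"): (0.000, 0.000),
}
scores = {key: round(f_measure(p, r), 3) for key, (r, p) in examples.items()}
```

The computed values, 0.980, 0.750, and 0.000, agree with the F-Measure entries reported for those rows.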

Confusion matrix for IBk (rows: actual class; columns: predicted class)
Actual class	Intermediary	Accepted	Rejected
Intermediary	27	4	0
Accepted	2	148	0
Rejected	0	0	7

Confusion matrix for J48 (rows: actual class; columns: predicted class)
Actual class	Intermediary	Accepted	Rejected
Intermediary	27	2	2
Accepted	1	148	1
Rejected	1	0	6
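Per-class recall and precision follow directly from a confusion matrix, as the sketch below shows for the matrix immediately above. Rows are taken as actual classes and columns as predictions, in the order printed; the function name `per_class_metrics` is illustrative.

```python
def per_class_metrics(matrix):
    """Return (accuracy, recall per class, precision per class).

    matrix[i][j] counts instances of actual class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    recall, precision = [], []
    for i in range(n):
        row_sum = sum(matrix[i])                        # all instances of actual class i
        col_sum = sum(matrix[r][i] for r in range(n))   # all instances predicted as class i
        recall.append(matrix[i][i] / row_sum if row_sum else 0.0)
        precision.append(matrix[i][i] / col_sum if col_sum else 0.0)
    return correct / total, recall, precision

# Values copied from the matrix above, in printed row and column order
matrix = [
    [27, 2, 2],
    [1, 148, 1],
    [1, 0, 6],
]
accuracy, recall, precision = per_class_metrics(matrix)
```

The diagonal sums to 181 of 188 instances, or 96.3%, and the per-class recall and precision pairs (0.871 and 0.931, 0.987 and 0.987, 0.857 and 0.667) coincide with the values reported for J48 in the metrics table.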

Universidade Federal Rural do Semi-Árido Avenida Francisco Mota, número 572, Bairro Presidente Costa e Silva, Cep: 5962-5900, Telefone: 55 (84) 3317-8297 - Mossoró - RN - Brazil
E-mail: caatinga@ufersa.edu.br

[1] *Corresponding author: <ruanbernardy@yahoo.com.br>