Identifying relevant variables for production batch categorization into quality levels

Industrial processes typically involve a large number of correlated process variables, which makes it difficult for engineers to identify the key ones. Partial Least Squares (PLS) regression has been successfully applied to select the process variables most relevant for predicting response variables. In many practical applications, however, the primary goal is to correctly categorize the final product into classes. This paper addresses that classification problem by integrating PLS regression with the k-nearest neighbor rule and the support vector machine to categorize production batches into two quality levels. Indices based on PLS parameters are developed to evaluate variable importance. Guided by these indices, the classification methods are then used to eliminate noisy and irrelevant variables. The best subset of variables is identified by monitoring how classification accuracy changes as variables are removed. Across three datasets, the suggested approach reduced the number of variables necessary for classification of production batches by 90.6 per cent, while yielding 29.2 per cent more accurate classifications.
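The elimination procedure summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the importance indices, so the sketch assumes the absolute first-component PLS weight of each variable as a stand-in index, uses only the k-nearest neighbor rule (not the support vector machine), and scores each subset by leave-one-out accuracy. All function names and the synthetic data are hypothetical.

```python
import math
import random


def pls1_first_weights(X, y):
    """First-component PLS weights for a single response: w ~ Xs^T (y - mean)."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    means = [sum(row[j] for row in X) / n for j in range(p)]
    # Standardize each variable; a constant column falls back to a scale of 1.
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / (n - 1)) or 1.0
        for j in range(p)
    ]
    w = [
        sum((X[i][j] - means[j]) / stds[j] * (y[i] - y_mean) for i in range(n))
        for j in range(p)
    ]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    return [v / norm for v in w]


def loo_knn_accuracy(X, y, cols, k=3):
    """Leave-one-out accuracy of k-NN (labels 0/1) using only columns `cols`."""
    hits = 0
    for i in range(len(X)):
        neighbours = sorted(
            (sum((X[i][j] - X[m][j]) ** 2 for j in cols), y[m])
            for m in range(len(X))
            if m != i
        )
        votes = sum(label for _, label in neighbours[:k])
        predicted = 1 if 2 * votes > k else 0
        hits += predicted == y[i]
    return hits / len(X)


def select_variables(X, y, k=3):
    """Drop variables from least to most important; keep the best-scoring subset."""
    w = pls1_first_weights(X, y)
    order = sorted(range(len(w)), key=lambda j: abs(w[j]))  # least important first
    kept = list(range(len(w)))
    best_cols, best_acc = list(kept), loo_knn_accuracy(X, y, kept, k)
    for j in order[:-1]:  # always retain at least one variable
        kept.remove(j)
        acc = loo_knn_accuracy(X, y, kept, k)
        if acc >= best_acc:  # ties favour the smaller subset
            best_cols, best_acc = list(kept), acc
    return best_cols, best_acc


def make_demo_batches(n=60, p=6, seed=0):
    """Synthetic batches: only variable 0 separates the two quality levels."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        row[0] += 4.0 * label
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_demo_batches()
    cols, acc = select_variables(X, y)
    print("retained variables:", cols, "leave-one-out accuracy:", acc)
```

Accepting ties in favour of the smaller subset is a design choice made here so that variables which do not change the accuracy are dropped; a stricter rule would require a minimum improvement before removing a variable.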

Variable selection; PLS; k-Nearest neighbors classification rule; Support vector machine


Universidade Federal de São Carlos, Departamento de Engenharia de Produção, Caixa Postal 676, 13.565-905 São Carlos, SP, Brazil. Tel.: +55 16 3351 8471
E-mail: gp@dep.ufscar.br