
Predicting partition coefficients of migrants in food simulant/polymer systems using adaptive neuro-fuzzy inference system


Parviz Shahbazikhah (I)*; Mohammad Asadollahi-Baboli (I); Ramin Khaksar (II); Reza Fareghi Alamdari (I); Vali Zare-Shahabadi (III)

(I) Department of Chemistry and Young Researchers Club, Islamic Azad University, Mahshahr Branch, Mahshahr, Iran

(II) Department of Food Science and Technology, National Nutrition and Food Technology Research Institute, Faculty of Nutrition Science and Food Technology, Shaheed Beheshti University of Medical Sciences, Tehran, Iran

(III) Young Researchers Club, Islamic Azad University, Mahshahr Branch, Mahshahr, Iran

* e-mail: shahbazikhah@mehr.sharif.ir

ABSTRACT

Food contamination through migration of low molecular weight additives into foodstuffs can result from direct contact between packaging materials and food. The amount of migration is related to the structural properties of the additive as well as to the nature of the packaging material. The goal of this study is to develop a quantitative structure-property relationship (QSPR) model with the adaptive neuro-fuzzy inference system (ANFIS) for predicting the partition coefficient K in food/packaging systems. The partition coefficients of 44 systems, built from 4 food simulants, 6 migrants and 2 packaging materials, were investigated. A set of 6 molecular descriptors representing various structural characteristics of the food simulants (2 descriptors), migrants (3 descriptors) and polymers (1 descriptor) was used as the data set. This data set was divided into three subsets: training, test and prediction. ANFIS was applied as a modeling technique for the first time in this field. The final model has a root mean square error (RMSE) of 0.0006 and a correlation coefficient (R²) for the prediction set of 0.9920.

Keywords: quantitative structure property relationship (QSPR), adaptive neuro-fuzzy inference system (ANFIS), partition coefficients, additive migration, food safety

RESUMO

Contamination of food through migration of low molecular weight additives into industrially processed foods can result from direct contact between the packaging and the food. The concentration of additive that migrates from the packaging material into the food is related to the structural properties of the additive as well as to the nature of the packaging material. The objective of this study is to develop a QSPR model with the adaptive neuro-fuzzy inference system (ANFIS) in order to predict the partition coefficient K in the packaging/food system under study. To this end, 44 partition coefficients were investigated in systems made up of 4 food simulants, 6 migrants and 2 packaging materials. A set of 6 molecular descriptors, representing various characteristics of the food simulants (2 descriptors), the migrants (3 descriptors) and the polymers (1 descriptor), was used as the data set. This data set was divided into three subsets: training, test and prediction. The ANFIS modeling technique was applied for the first time in this field of food/packaging studies. The modeling gave an RMSE of 0.0006, and the correlation coefficient (R²) for the prediction set was 0.9920.

Introduction

Continuous efforts in food preservation, distribution and marketing are being made worldwide to supply consumers with high-quality products. To prevent contamination of packaged food, the source of contamination must first be identified. Various interactions between food and packaging materials can contaminate food, among them the migration of low molecular weight additives from packaging materials into foodstuffs.1 The types and levels of solvents and monomers that migrate from polymers into foods are important factors in food contamination, which has been studied by many research groups.2-4 The migration of low molecular weight compounds from food into the polymer has also received considerable attention.5

The thermodynamic equilibrium (partition) of the migration process can be defined as an exchange of mass/energy between the packaging material and the food.6 For quality control of food packaging, the partition coefficients between the polymer packaging and the food matrix should be known. Quantitative structure-property relationships (QSPR) based on computational methods make it possible to calculate these partition coefficients. QSPRs are predictive models, derived with statistical tools, that correlate a chemical property such as the partition coefficient with descriptors of molecular structure and/or properties. The success of any QSPR model depends on the accuracy of the input data and on the selection of appropriate descriptors and statistical tools.7 Finally, the developed model is subjected to a validation step, which checks its reliability for possible application to a new data set so that the confidence of prediction can be judged. In the current work, we validated the model using three techniques: leave-one-out cross-validation, leave-multiple-out cross-validation and the Y-randomization test.

The objectives of the present paper are twofold: i) to explore the structure-property relationships of the partition coefficients of diverse systems and ii) to compare the developed ANFIS model with the previously reported quadratic model.8

Theory

Adaptive neuro-fuzzy inference system

The neuro-fuzzy model in ANFIS is a multilayer neural network-based fuzzy system.9-10 Its topology is presented in Figure 1. As shown, the system has a total of five layers. In this connectionist structure, the input (layer 0) and output (layer 5) nodes represent the descriptors and the response, respectively, and the hidden layers contain nodes that function as membership functions (MFs) and rules. This architecture removes a disadvantage of the ordinary feed-forward multilayer network, which is difficult for an observer to understand or modify. ANFIS simulates the Takagi-Sugeno-Kang fuzzy rule11 of type 3, in which the consequent part of the rule is a linear combination of the input variables plus a constant. For a Sugeno fuzzy model, a common rule set with fuzzy if-then rules is as follows:

$$\text{If } x \text{ is } A_i \text{ and } y \text{ is } B_i, \text{ then } f_i = p_i x + q_i y + r_i \qquad (1)$$

For simplicity, we assume here that the fuzzy inference system has two inputs, x and y, and one output. The five layers of the ANFIS shown in Figure 1 are described below.


Layer 1. The fuzzy part of ANFIS is mathematically incorporated in the form of membership functions (MFs). A membership function µAi(x) can be any continuous and piecewise differentiable function that transforms the input value x into a membership degree, that is, a value between 0 and 1. The most widely applied membership functions are the generalized bell function (gbell MF) and the Gaussian function (equations (2) and (3), respectively), which are described by the parameters a, b and c (the Gaussian function uses only a and c). Layer 1 is therefore the fuzzification layer, in which each node represents a membership function:

$$\mu_{A_i}(x) = \frac{1}{1 + \left| \dfrac{x - c_i}{a_i} \right|^{2 b_i}} \qquad (2)$$

$$\mu_{A_i}(x) = \exp\left[ -\left( \frac{x - c_i}{a_i} \right)^{2} \right] \qquad (3)$$

As the values of the parameters {ai, bi and ci} change, the bell-shaped functions vary accordingly, exhibiting various forms of membership functions on linguistic label Ai. Parameters in this layer are referred to as premise parameters.
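The effect of the premise parameters on the shape of a membership function can be explored with a short Python sketch. It assumes the standard generalized bell form, 1 / (1 + |(x - c)/a|^(2b)), and a Gaussian written as exp(-((x - c)/a)^2). The function names `gbell_mf` and `gauss_mf` and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def gbell_mf(x, a, b, c):
    """Generalized bell MF: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def gauss_mf(x, a, c):
    """Gaussian MF written as exp(-((x - c) / a)^2); a sets the width, c the center."""
    return math.exp(-(((x - c) / a) ** 2))


if __name__ == "__main__":
    # Hypothetical premise parameters, not values from the study.
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, gbell_mf(x, a=1.0, b=2.0, c=0.0), gauss_mf(x, a=1.0, c=0.0))
```

Both functions return 1 at the center c and decay toward 0 away from it; a controls the width, and for the bell function b controls how steep the shoulders are.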

Layer 2. Every node in this layer is a fixed node labeled Π, whose output is the product of all the incoming signals:

$$w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2 \qquad (4)$$

The membership values µAi(x) and µBi(y) are multiplied to obtain the firing strength of a rule in which the variables x and y have the linguistic values Ai and Bi, respectively.

Layer 3. This is the normalization layer, which normalizes the strength of all rules according to equation (5):

$$\bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \qquad (5)$$

where wi is the firing strength of the ith rule computed in layer 2. Node i computes the ratio of the ith rule's firing strength to the sum of all rules' firing strengths. For convenience, the outputs of this layer are called normalized firing strengths.

Layer 4. Every node in this layer is an adaptive node with the node function:

$$\bar{w}_i f_i = \bar{w}_i \left( p_i x + q_i y + r_i \right) \qquad (6)$$

where w̄i is the normalized firing strength from layer 3 and {pi, qi, ri} is the parameter set of this node. Parameters in this layer are referred to as consequent parameters.

Layer 5. The single node in this layer is a fixed node labeled Σ, which computes the overall output as the summation of all incoming signals:

$$\text{overall output} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \qquad (7)$$

Thus we have constructed an ANFIS that is functionally equivalent to a Sugeno fuzzy model; it was used in the present QSPR study because of its transparency and efficiency.
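The five layers can be traced in a few lines of Python. The following is a minimal sketch of the forward pass only, for the simplified case of two inputs and two rules in which rule i pairs Ai with Bi. It assumes generalized bell membership functions; the names `anfis_forward`, `premise_x`, `premise_y` and `consequents` are hypothetical, and the parameter training performed by ANFIS is not shown.

```python
def gbell_mf(x, a, b, c):
    """Generalized bell MF (assumed form): 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def anfis_forward(x, y, premise_x, premise_y, consequents):
    """Forward pass of a two-input, first-order Sugeno system.

    premise_x[i]   = (a, b, c) of MF A_i on input x
    premise_y[i]   = (a, b, c) of MF B_i on input y
    consequents[i] = (p, q, r) of rule i, so f_i = p*x + q*y + r
    """
    # Layer 1: fuzzification, one membership degree per MF.
    mu_a = [gbell_mf(x, *params) for params in premise_x]
    mu_b = [gbell_mf(y, *params) for params in premise_y]
    # Layer 2: firing strength of each rule as a product of memberships.
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: each rule's linear consequent, weighted by its normalized strength.
    weighted = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: overall output as the sum of all incoming signals.
    return sum(weighted)


if __name__ == "__main__":
    # Hypothetical parameters chosen so the result is easy to check by hand.
    premise = [(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)]
    rules = [(1.0, 1.0, 0.0), (0.0, 0.0, 4.0)]
    print(anfis_forward(1.0, 1.0, premise, premise, rules))
```

With these parameters both rules fire equally at x = y = 1, so the output is the average of the two rule consequents, f1 = 2 and f2 = 4.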

Cross-validation techniques

The consistency and reliability of a method can be explored using cross-validation.12 Two strategies can be employed: leave-one-out (LOO) and leave-multiple-out (LMO). In the LOO strategy, one object at a time is deleted from the training set and a model is built on the remaining objects, so the number of models produced equals the number of available samples (n), e.g. n = 44. The prediction error sum of squares (PRESS) is a standard index of the accuracy of a modeling method based on cross-validation. From PRESS and SSY (the sum of squares of deviations of the experimental values from their mean), Q² is calculated by equation (8):

$$Q^2 = 1 - \frac{\mathrm{PRESS}}{\mathrm{SSY}} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (8)$$

where yi is the experimental value, ŷi is the value predicted for sample i when it is left out, and ȳ is the mean of the experimental values.

In this sense, a high value of Q² is considered proof of the high predictive ability of the model.13 However, several authors suggest that a high value of Q²LOO is necessary but not sufficient.14 For this reason, we also used the LMO cross-validation technique. In LMO, M denotes a group of randomly selected data points that is left out at the beginning and then predicted by the model developed from the remaining data points; the M molecules thus serve as a prediction set. R²LMO can be calculated by equation (9):

$$R^2_{\mathrm{LMO}} = 1 - \frac{\sum_{i=1}^{M} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{M} (y_i - \bar{y})^2} \qquad (9)$$

where the sums run over the M left-out samples.

This algorithm is shown in Figure 2. It is common to leave out 10-30% of the total number of molecules. In the present work, R²LMO was calculated from 1000 random selections of groups of 8 and 12 samples. A higher value of Q²LOO or R²LMO indicates a higher predictive power of the model.
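The two cross-validation statistics can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: a one-descriptor least-squares line stands in for the ANFIS model, Q² is taken as 1 - PRESS/SSY, the mean in the LMO statistic is taken over the left-out group, and the function names and data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (stand-in for the real model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_loo(xs, ys):
    """Q2 = 1 - PRESS / SSY, refitting the model with one sample left out each time."""
    n = len(xs)
    press = 0.0
    for i in range(n):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / n
    ssy = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ssy


def r2_lmo(xs, ys, m, repeats=1000, seed=0):
    """Mean R2 over `repeats` random groups of m left-out samples."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(repeats):
        left_out = set(rng.sample(range(n), m))
        train = [i for i in range(n) if i not in left_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Assumption: the reference mean is that of the left-out group.
        mean_out = sum(ys[i] for i in left_out) / m
        ss_res = sum((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in left_out)
        ss_tot = sum((ys[i] - mean_out) ** 2 for i in left_out)
        scores.append(1.0 - ss_res / ss_tot)
    return sum(scores) / repeats


if __name__ == "__main__":
    # Synthetic data with 44 samples, for illustration only.
    xs = [i / 10 for i in range(44)]
    ys = [2.0 * x + 1.0 + 0.01 * (-1) ** i for i, x in enumerate(xs)]
    print(q2_loo(xs, ys), r2_lmo(xs, ys, m=8), r2_lmo(xs, ys, m=12))
```

Replacing `fit_line` with a routine that trains the actual model gives the same statistics for any modeling method, since only the refit-and-predict step depends on the model.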


Methodology

Data set and descriptors

The equilibrium distribution of migrants is governed by the partitioning of compounds between the polymer packaging and the food matrix, so the nature of the food simulant, polymer and migrant must all be considered to avoid food contamination. The data collected by Tehrany et al.8 were used to develop a QSPR model with the ANFIS method. The data set consists of 44 simulant/polymer/migrant systems together with their partition coefficients (K), which served as the dependent variable in our QSPR study. Each system therefore comprises three components: (i) food simulant, (ii) polymer and (iii) migrant. To simplify notation, a code (I) was defined for each system by the following equation:8

I = 100 × LFood + 10 × LPolymer + LMigrant

where LFood, LPolymer, and LMigrant are levels for food, polymer, and migrant components, respectively. These levels are given in Table 1 for each compound. Therefore, as an example, a system with I = 224 consists of 10% ethanol/PA/IP.
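The coding scheme can be inverted with integer division, as in the short Python sketch below. It assumes that every level is a single digit, which holds for 4 simulants, 2 polymers and 6 migrants; the function names are hypothetical, and the mapping from levels to compound names is in Table 1 and is not reproduced here.

```python
def encode_system(l_food, l_polymer, l_migrant):
    """Build the system code I = 100 * L_food + 10 * L_polymer + L_migrant."""
    return 100 * l_food + 10 * l_polymer + l_migrant


def decode_system(code):
    """Split a system code back into (L_food, L_polymer, L_migrant)."""
    l_food, rest = divmod(code, 100)
    l_polymer, l_migrant = divmod(rest, 10)
    return l_food, l_polymer, l_migrant


if __name__ == "__main__":
    print(decode_system(224))  # levels of the example system I = 224
```

For the example above, I = 224 decodes to food level 2, polymer level 2 and migrant level 4.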

A set of six molecular descriptors was used; their values are listed in Table 2. The descriptors are polymer polarity, food simulant polarity, simulant molecular weight, migrant molecular weight, migrant LUMO (lowest unoccupied molecular orbital) and migrant HLB (hydrophilic-lipophilic balance).

Model development by ANFIS

To develop the ANFIS model, the data set was divided into three subsets: training, test and prediction. All systems were randomly assigned to these sets. The training set of 22 systems was used to generate the model, the test set of 11 systems was used to guard against overtraining, and the prediction set of 11 systems was used to evaluate the model.
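A random 22/11/11 partition of the 44 systems can be sketched in Python as follows. The seed and the function name `split_systems` are hypothetical, and the resulting assignment is not the one used in the study, which is specified in Table 1.

```python
import random


def split_systems(n_systems=44, n_train=22, n_test=11, seed=0):
    """Randomly assign system indices to training, test and prediction sets."""
    indices = list(range(n_systems))
    random.Random(seed).shuffle(indices)
    train = sorted(indices[:n_train])
    test = sorted(indices[n_train:n_train + n_test])
    prediction = sorted(indices[n_train + n_test:])
    return train, test, prediction


if __name__ == "__main__":
    train, test, prediction = split_systems()
    print(len(train), len(test), len(prediction))
```

Fixing the seed makes the partition reproducible, so that models tuned with different settings are compared on the same three subsets.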

The compounds included in each set are specified in Table 1. The six simulant/polymer/migrant descriptors were used as inputs for the development of the ANFIS model. Model building involves two stages: structure identification and parameter identification. The former consists of finding a suitable number of rules and a proper partition of the feature space. The latter is concerned with the adjustment of system parameters, such as the membership function (MF) parameters and the linear coefficients. Increasing the number of MFs per input increases the number of rules accordingly. In the first stage of ANFIS modeling, grid partitioning was used to partition the features. The number and type of membership functions were optimized using the RMSE of the test set as the criterion. All ANFIS models were produced using the MATLAB 7.0 Fuzzy Logic Toolbox (MATLAB, Mathworks Inc. software, Natick, USA, 2008).
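How quickly the rule base grows under grid partitioning can be checked with a small Python calculation. It assumes that grid partitioning creates one rule for every combination of MFs across the inputs, that each MF has three premise parameters (as for the generalized bell function), and that each first-order Sugeno rule has one coefficient per input plus a constant. The MF counts below are hypothetical, since the text does not state how many MFs per input were finally selected.

```python
from math import prod


def grid_partition_rules(mfs_per_input):
    """Number of rules when every MF combination across inputs forms one rule."""
    return prod(mfs_per_input)


def parameter_counts(mfs_per_input, premise_params_per_mf=3):
    """Return (premise, consequent) parameter counts for a first-order Sugeno system."""
    n_inputs = len(mfs_per_input)
    premise = premise_params_per_mf * sum(mfs_per_input)
    consequent = grid_partition_rules(mfs_per_input) * (n_inputs + 1)
    return premise, consequent


if __name__ == "__main__":
    for n_mfs in (2, 3):
        layout = [n_mfs] * 6  # six descriptors, same number of MFs on each
        print(n_mfs, grid_partition_rules(layout), parameter_counts(layout))
```

With six inputs, two MFs per input already give 64 rules and three give 729, which is why the number of MFs has to be chosen with care on a data set of this size.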

Results and Discussion

Statistical parameters of ANFIS model

Prediction results of the ANFIS model for all data sets are shown in Table S1 (available as Supplementary Information). The statistical parameters of the resulting model are given in Table 3, where the model is also compared with the quadratic model previously reported for the same data set by Tehrany et al.,8 which is as follows:

where x1 is the polarity of food simulant, x2 is the polarity of polymer, x3 is the molecular weight of migrant, x4 is LUMO, x5 is the molecular weight of food simulant and x6 is the HLB of migrant.

It can be seen that the RMSEprediction value improved from 0.0248 for the quadratic model to 0.0006 for the ANFIS model, i.e. the ANFIS model is about 41 times (0.0248/0.0006 = 41.3) more precise than the quadratic model. In other words, this nonlinear model is able to account for the variance of the partition coefficients.
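The RMSE and the improvement factor quoted above can be reproduced with a few lines of Python. The two RMSE values are those reported in the text; the `rmse` helper and the toy numbers passed to it are hypothetical and serve only to show how the statistic is computed.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


if __name__ == "__main__":
    # Toy values for illustration, not data from the study.
    print(rmse([0.10, 0.20, 0.30], [0.11, 0.19, 0.30]))

    # Ratio of the reported prediction-set RMSE values (quadratic vs. ANFIS).
    rmse_quadratic, rmse_anfis = 0.0248, 0.0006
    print(round(rmse_quadratic / rmse_anfis, 1))
```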

The correlation between the experimental and calculated values of the partition coefficients is shown in Figure 3. The residuals of the calculated values of the partition coefficients are plotted against the experimental values in Figure 4. The distribution of the residuals on both sides of the zero line indicates that no systematic error exists in the proposed QSPR model.



Evaluation of ANFIS models

The model was also tested for validity. Cross-validation techniques, namely leave-one-out (LOO-CV) and leave-multiple-out (LMO-CV), were used to check the consistency of the model. In particular, the leave-one-out (LOO), leave-eight-out (L8O) and leave-twelve-out (L12O) procedures were applied to both the ANFIS and quadratic models. The results are shown in Table 4. Note that R²L8O and R²L12O were calculated from 1000 random selections of groups of eight and twelve samples from the original training set. The high values of R² for LOO, L8O and L12O indicate that the proposed model is reliable.

Moreover, to assess the robustness of the ANFIS method, the Y-randomization test was applied. The dependent variable vector K was randomly shuffled and a new QSPR model was developed using the original descriptor matrix. The new QSPR model is expected to show low values of R²prediction and Q²LOO. Several random shuffles of the K vector were performed; the results are shown in Table 5 and indicate that the ANFIS model does not result from chance correlation or structural dependency in the training set.
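The logic of the Y-randomization test can be illustrated with the Python sketch below. It is a simplified stand-in: a one-descriptor least-squares line replaces the ANFIS model and the six-descriptor matrix, the data are synthetic, and the function names are hypothetical. The point is only that the fit quality should collapse once the response vector is shuffled.

```python
import random


def r2_linear(xs, ys):
    """R2 of a one-descriptor least-squares line (squared Pearson correlation)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


def y_randomization(xs, ys, n_shuffles=20, seed=0):
    """Refit the stand-in model on shuffled responses and collect the R2 values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_shuffles):
        shuffled = list(ys)
        rng.shuffle(shuffled)  # breaks the link between descriptors and response
        scores.append(r2_linear(xs, shuffled))
    return scores


if __name__ == "__main__":
    # Synthetic data with 44 samples, for illustration only.
    xs = [float(i) for i in range(44)]
    ys = [2.0 * x + 1.0 for x in xs]
    print("original R2:", r2_linear(xs, ys))
    print("largest shuffled R2:", max(y_randomization(xs, ys)))
```

A model that still scores well on shuffled responses would point to chance correlation; a sharp drop, as expected here, supports the original fit.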

Conclusions

Quantitative structure-property relationships (QSPR) were developed for the calculation of K values from molecular descriptors. Our model was based on six molecular descriptors: polarity of the food simulant, polarity of the polymer, molecular weight of the migrant, LUMO (lowest unoccupied molecular orbital), molecular weight of the food simulant and HLB (hydrophilic-lipophilic balance) of the migrant. The partition coefficients of forty-four different food/migrant/packaging systems were predicted using these descriptors. ANFIS, a powerful nonlinear tool, was used to develop a model relating the descriptors to the K values. We validated our model using leave-one-out and leave-multiple-out cross-validation and the Y-randomization test. The calculated partition coefficients show a good correlation between this physicochemical property and molecular structure. In conclusion, ANFIS produced a substantially better model than the quadratic model reported previously.8

Supplementary Information

The experimental and calculated partition coefficients for both the least squares and ANFIS models in this QSPR study are available free of charge at http://jbcs.sbq.org.br as a PDF file.

Submitted: July 31, 2010

Published online: April 7, 2011

References

1. Figge, K.; Plastic Packages for Foodstuffs: a Topical Survey of Legal Regulations and Migration Testing, Wissenschaftliche Verlagsgesellschaft: Stuttgart, 1996.
2. Harte, B. R.; Gray, J. I.; Miltz, J.; Food Product-Package Compatibility, 1st ed., Technomic Publ. Co.: East Lansing, 1987.
3. Risch, S.; Food Technol. 1988, 7, 95.
4. Gilbert, G. S.; Miltz, J.; Giacin, J. R.; J. Food Process Preserv. 1980, 4, 27.
5. Giacin, J. R. In Foods and Packaging Materials-Chemical Interactions; Ackermann, P.; Jägerstad, M.; Ohlsson, T., eds.; RSC: Cambridge, Great Britain, 1995, p. 12.
6. An, D.; Halek, G.; J. Food Sci. 1996, 61, 185.
7. He, L.; Jurs, P.; J. Mol. Graphics Modell. 2005, 23, 503.
8. Tehrany, E.; Fournier, F.; Desobry, S.; J. Food Eng. 2006, 77, 135.
9. Shi, W.; Shen, Q.; Kong, W.; Ye, B.; Eur. J. Med. Chem. 2007, 42, 81.
10. Jalali-Heravi, M.; Shahbazikhah, P.; Ghadiri-Bidhendi, A.; Quant. Struct. Act. Relat. 2008, 27, 729.
11. Sugeno, M.; Kang, G.; Fuzzy Set Syst. 1988, 28, 15.
12. Osten, D.; J. Chemom. 1988, 2, 39.
13. Wold, S.; Quant. Struct. Act. Relat. 1991, 10, 191.
14. Golbraikh, A.; Tropsha, A.; J. Mol. Graphics Modell. 2002, 20, 269.
Publication Dates

  • Publication in this collection: 04 Aug 2011
  • Date of issue: Aug 2011

History

  • Received: 31 July 2010
  • Accepted: 07 Apr 2011
    Sociedade Brasileira de Química Instituto de Química - UNICAMP, Caixa Postal 6154, 13083-970 Campinas SP - Brazil, Tel./FAX.: +55 19 3521-3151 - São Paulo - SP - Brazil
    E-mail: office@jbcs.sbq.org.br