Multivariate Regression Models for the Simultaneous Quantitative Analysis of Calcium and Magnesium Carbonates and Magnesium Oxide through DRIFTS Data

This work presents multivariate regression models, based on diffuse reflectance infrared Fourier transform spectroscopy (DRIFTS) spectra, for the quantitative analysis of the ternary system formed by calcium and magnesium carbonates and magnesium oxide. A ternary diagram was used to define the mixtures for which mid-infrared reflectance spectra were acquired. Using partial least squares (PLS) regression, models were built from the calibration set of spectra, with the spectral data mean-centered and/or variance-scaled. The RMSEP (root mean square error of prediction) values were compared in order to select the best models. The results show that good multivariate calibration models for the determination of calcium and magnesium carbonates and of magnesium oxide can be obtained from infrared reflectance spectra. These determinations are particularly useful in the study of the thermal decomposition of dolomitic rocks.


Introduction
The reserves of carbonate rocks in the state of Rio Grande do Sul, Brazil, consist mainly of dolomite-like rocks (MgO content ≅ 18%), while calcite-like rocks (MgO content < 4%) occur in a smaller proportion. It is assumed that, by volume, only 18% of the calcareous rocks are calcite-like (used mainly in cement production), while 82% are dolomite-like and are used mainly in the correction of soil acidity and in the production of lime.
The search for a way to meet the suppressed demand for raw materials by converting dolomite into calcium-rich carbonate and magnesium oxide, both widely used in industrial processes, prompted the development of a physical process for dissociating dolomite-like rocks in order to obtain these two different products. The basic idea of the process is a fractionated calcination of a dolomite-like rock, composed mainly of Ca and Mg carbonates (approximately 52% and 40%, respectively), [1] at a given temperature, with a complete calcination process that results in a mixture of calcium carbonate combined with magnesium oxide in the same material.
The lack of specific methodologies [2][3][4][5][6] for the joint analysis of the mixture of products obtained from the fractionated calcination of dolomite-like rocks led to the development of a technique for the quantitative analysis of calcium carbonate and magnesium oxide by means of Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS).
Several studies using infrared spectrometry have been carried out in the last few years to determine, qualitatively and quantitatively, organic and inorganic compounds in soils and rocks. [7-11] This is a non-invasive method of analysis that allows the main components of the mixture of generated products to be quantified in a single measurement, with a high degree of reliability. [17] The determination of the contents of the mixture is very important for the quality control of the calcination and separation processes. Furthermore, the evaluation method also allows control of the final quality of the individual products generated after the extraction process.
The partial least squares (PLS) regression method [18-25] is a well-established method of multivariate regression usually applied to ternary systems, especially when preceded by a robust factorial design. [26,27][30][31] The present paper proposes the use of the DRIFTS-PLS method to determine the concentration by weight of calcium carbonate, magnesium carbonate and magnesium oxide. The major advantages are that these analyses do not require extensive sample preparation, [32] are non-invasive, and do not produce residues that are harmful to the environment.

Experimental Design
A mixture design was used throughout this work to model the ternary system, which is composed of the following analytes: calcium carbonate, magnesium carbonate and magnesium oxide. Nineteen mixtures were distributed along the ternary diagram presented in Figure 1. These mixtures were used to develop the multivariate regression models. Six other mixtures, represented by small light letters, were selected to compose the validation set.
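As a rough illustration of how candidate points on a ternary diagram can be enumerated, the sketch below lists every composition on a regular grid that sums to 100%. The 10% step and the per-component bounds are assumptions chosen for illustration; the function name `ternary_grid` is hypothetical, and the output does not reproduce the nineteen mixtures of Figure 1.

```python
from itertools import product

def ternary_grid(step=10, bounds=((30, 70), (0, 50), (0, 40))):
    """Enumerate (CaCO3, MgCO3, MgO) weight percentages on a regular grid.

    Only combinations that sum to 100% and respect the per-component
    bounds are kept. `step` and `bounds` are illustrative assumptions.
    """
    (ca_lo, ca_hi), (mc_lo, mc_hi), (mo_lo, mo_hi) = bounds
    points = []
    for ca, mc in product(range(ca_lo, ca_hi + 1, step),
                          range(mc_lo, mc_hi + 1, step)):
        mo = 100 - ca - mc  # the third component is fixed by the other two
        if mo_lo <= mo <= mo_hi:
            points.append((ca, mc, mo))
    return points

candidates = ternary_grid()
for ca, mc, mo in candidates:
    print(f"CaCO3 {ca:3d}%  MgCO3 {mc:3d}%  MgO {mo:3d}%")
```

A calibration set and a separate validation set could then be drawn from such a list, keeping the validation points inside the region spanned by the calibration points.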

Samples
Powdered samples of calcium carbonate, magnesium carbonate and magnesium oxide were used to prepare a set of standard mixtures with different concentrations according to the mixture design. Calcium carbonate was supplied by FERTISUL, located in the state of Paraná, in southern Brazil; magnesium carbonate was donated by UNISINOS, located in the state of Rio Grande do Sul, also in southern Brazil; and commercial analytical-grade magnesium oxide was supplied by VETEC.
After the mixtures were prepared, the percentage of each component was recalculated, and the final values are presented in Table 1. The calcium carbonate content varied between 29.99 and 70.00%, the magnesium carbonate content between 0.00 and 49.99%, and the magnesium oxide content between 0.00 and 40.03%. The same procedures were used with the external validation set samples.
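The recalculation of component percentages can be sketched as follows, assuming it is based on the masses actually weighed for each mixture (the text does not state this explicitly). The function name and the masses are hypothetical.

```python
def weight_percent(masses):
    """Convert weighed component masses (any consistent unit) to weight %."""
    total = sum(masses.values())
    if total <= 0:
        raise ValueError("total mass must be positive")
    return {name: 100.0 * m / total for name, m in masses.items()}

# Hypothetical weighed masses in grams, for illustration only.
weighed = {"CaCO3": 0.5003, "MgCO3": 0.2998, "MgO": 0.2001}
percent = weight_percent(weighed)
for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
```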

Spectra
Each previously prepared standard was diluted in spectroscopic-grade KBr at a ratio of 1:49. The mid-infrared reflectance spectra were recorded from 4000 to 450 cm⁻¹ with a NICOLET MAGNA 550 FT-IR spectrometer, according to the parameters in Table 2, using an EasiDiff™ diffuse reflectance accessory (PIKE Technologies, Inc.). KBr (spectroscopic grade) was used as background, and the spectra were obtained in triplicate for each sample.
The same procedures were used with the external validation set samples.
OMNIC E.S.P. v.3.1 software was used to acquire the spectra and to calculate an average spectrum. TurboQuant v.1.1 was employed in the modeling.
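The averaging of triplicate spectra can be illustrated with a point-by-point mean. This is only a sketch of the idea: how OMNIC computes its average internally is not described in the text, and the function name and values below are hypothetical.

```python
def average_spectrum(replicates):
    """Point-by-point mean of replicate spectra on the same wavenumber grid."""
    n_points = len(replicates[0])
    if any(len(r) != n_points for r in replicates):
        raise ValueError("replicates must share the same number of points")
    return [sum(values) / len(replicates) for values in zip(*replicates)]

# Three hypothetical replicate readings at four wavenumbers.
triplicate = [
    [0.10, 0.20, 0.40, 0.30],
    [0.12, 0.18, 0.42, 0.30],
    [0.11, 0.22, 0.38, 0.30],
]
mean_spectrum = average_spectrum(triplicate)
print([round(v, 3) for v in mean_spectrum])
```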

Modeling
The partial least squares (PLS) regression method was employed for the modeling, using the standard set. Briefly, spectra were pretreated either by mean-centering and variance scaling or by mean-centering only.
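The two pretreatments can be sketched as column-wise operations on a matrix with one spectrum per row. The sketch assumes that variance scaling means dividing each centered variable by its sample standard deviation (autoscaling); the exact convention used by TurboQuant is not stated in the text, and the function name is hypothetical.

```python
from statistics import mean, stdev

def preprocess(spectra, scale=False):
    """Column-wise mean-centering, optionally followed by variance scaling.

    `spectra` is a list of rows (one spectrum per row). With scale=True each
    centered column is divided by its sample standard deviation.
    """
    columns = list(zip(*spectra))
    centers = [mean(col) for col in columns]
    if scale:
        # Constant columns have zero standard deviation; leave them unscaled.
        spreads = [stdev(col) or 1.0 for col in columns]
    else:
        spreads = [1.0] * len(columns)
    treated = [
        [(x - c) / s for x, c, s in zip(row, centers, spreads)]
        for row in spectra
    ]
    return treated, centers, spreads

# Hypothetical two-variable data set with very different variable ranges.
raw = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
centered, _, _ = preprocess(raw)               # mean-centered only
autoscaled, _, _ = preprocess(raw, scale=True) # mean-centered and scaled
```

The returned centers and spreads come from the calibration spectra, so the same values would be applied to the validation spectra before prediction.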

Variable selection
PLS regression on the full spectral range was performed on the nineteen standard spectra. The first three latent variables (LV) obtained are presented in Figure 2. Throughout the development of the calibration models, the sub-regions with a large LV 2 or LV 3 signal contribution were used; they are presented in Table 3 and Figure 3.
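To make the notion of latent variables concrete, the following is a minimal single-response PLS sketch using the classical NIPALS steps (weights, scores, loadings, deflation). It is an assumption-laden illustration, not the TurboQuant implementation used in the paper; the function names and demonstration numbers are hypothetical. The weight vector of each component plays the role of the quantity inspected to find spectral regions that contribute strongly.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (NIPALS PLS1) on mean-centered data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residuals
    f = [yi - y_mean for yi in y]                              # y residuals
    components = []
    for _ in range(n_components):
        # Weight vector proportional to X'y, normalized to unit length.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [wj / norm for wj in w]
        t = [dot(row, w) for row in E]  # scores
        tt = dot(t, t)
        load = [dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = dot(f, t) / tt              # y loading
        # Deflate X and y before extracting the next latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps used in fitting."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat

# Small noise-free demonstration set (hypothetical numbers, not spectra).
X_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 4.0]]
y_demo = [2 * a - b + 3 for a, b in X_demo]
model = pls1_fit(X_demo, y_demo, n_components=2)
prediction = pls1_predict(model, [6.0, 2.0])
```

With as many latent variables as there are independent variables, the model coincides with ordinary least squares, which is why the noise-free example is reproduced exactly; with real spectra far fewer latent variables than wavenumbers are retained.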
The stepwise elimination algorithm [33] was used to optimize the regions used in the models; an increase in the coefficient of determination and a decrease in PRESS (Predictive Residual Error Sum of Squares) were used as selection criteria.
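PRESS is commonly obtained by cross-validation. The sketch below assumes leave-one-out cross-validation, which the text does not specify, and uses a straight-line fit as a stand-in for the calibration model so that the example stays self-contained. All names and numbers are hypothetical.

```python
def press_loo(X, y, fit, predict):
    """PRESS from leave-one-out cross-validation.

    `fit(X, y)` returns a model and `predict(model, x)` returns a prediction;
    both are supplied by the caller.
    """
    total = 0.0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        total += (y[i] - predict(model, X[i])) ** 2
    return total

# Stand-in model for the demonstration: a straight line through one variable.
def fit_line(X, y):
    xs = [row[0] for row in X]
    x_bar, y_bar = sum(xs) / len(xs), sum(y) / len(y)
    sxy = sum((x - x_bar) * (v - y_bar) for x, v in zip(xs, y))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def predict_line(model, x):
    slope, intercept = model
    return slope * x[0] + intercept

X_demo = [[0.0], [1.0], [2.0], [3.0]]
y_exact = [1.0, 3.0, 5.0, 7.0]  # exactly linear, so PRESS should vanish
y_noisy = [1.0, 3.5, 4.5, 7.0]  # hypothetical noisy responses

press_exact = press_loo(X_demo, y_exact, fit_line, predict_line)
press_noisy = press_loo(X_demo, y_noisy, fit_line, predict_line)
```

In a stepwise elimination, a candidate region would be dropped only if the model refitted without it gives a lower PRESS (and a higher coefficient of determination) than the model that includes it.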

Evaluating models
The prediction errors for the six validation samples were calculated in order to compare the models beyond the coefficient of determination (R²) calculated with the calibration standards. Another suitable parameter, used to evaluate the six samples simultaneously, is the RMSEP (Root Mean Square Error of Prediction), calculated according to expression 1:

$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{n}} \qquad (1)$$

where $y_i$ is the reference value for the i-th sample, $\hat{y}_i$ is the predicted value for the same sample, and $n$ is the number of validation samples.
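The RMSEP is the square root of the mean squared difference between reference and predicted values over the validation samples. A short sketch of that calculation follows; the six reference and predicted contents are hypothetical and are not taken from Table 4.

```python
from math import sqrt

def rmsep(reference, predicted):
    """Root mean square error of prediction over n validation samples."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reference)
    squared = sum((y - y_hat) ** 2 for y, y_hat in zip(reference, predicted))
    return sqrt(squared / n)

# Hypothetical reference and predicted contents (wt%) for six samples.
reference = [30.0, 40.0, 50.0, 60.0, 70.0, 45.0]
predicted = [31.0, 39.0, 52.0, 58.0, 70.0, 45.0]
error = rmsep(reference, predicted)
print(f"RMSEP = {error:.3f} wt%")
```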

Results and Discussion
After many attempts to obtain a calibration model, it was found that two different pre-processing methods produced better results. Both models used the regions shown in Table 3. It is worth noting that some regions selected by the LV 2 and LV 3 criteria were not used in the optimization. A different set of regions was selected for each analyte.
The model that used mean-centered pretreatment is called model 1. In model 2, the mean-centered data were also variance-scaled.
The minimum PRESS was used in the calibration process together with the search for the minimum RMSEP. For model 1, these criteria give three latent variables (factors) for CaCO₃ and MgCO₃ and four latent variables for MgO. For model 2, three latent variables were obtained for all analytes.
The 19 standard samples used for calibration yielded coefficient of determination (R²) values of 0.954, 0.954 and 0.989 for CaCO₃, MgCO₃ and MgO, respectively, in model 1, and 0.974, 0.967 and 0.959 in model 2, showing a good linear correlation.
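For reference, a coefficient of determination can be computed from reference and fitted values as one minus the ratio of the residual to the total sum of squares. This is one common definition and is assumed here; the text does not say which definition the modeling software uses. The function name and numbers are hypothetical.

```python
def r_squared(reference, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_bar = sum(reference) / len(reference)
    ss_res = sum((y - f) ** 2 for y, f in zip(reference, fitted))
    ss_tot = sum((y - y_bar) ** 2 for y in reference)
    if ss_tot == 0:
        raise ValueError("reference values must not all be equal")
    return 1.0 - ss_res / ss_tot

# Hypothetical calibration values, for illustration only.
r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(f"R2 = {r2:.3f}")
```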
The prediction results for the six validation samples are presented in Table 4 for both models and compared with the real values. The table also presents the RMSEP values for each analyte. These results can be considered quite reasonable for both preprocessing techniques.
The RMSEP values suggest that each analyte is best predicted by using model 1 for MgCO₃ and MgO and model 2 for CaCO₃.

Conclusions
This work demonstrated that it is possible to develop a model for use in the study of the thermal decomposition of dolomite rocks. The three basic analytes (calcium carbonate, magnesium carbonate and magnesium oxide) could be determined simultaneously by applying partial least squares regression to DRIFTS data.
The model that used mean-centering gives better results for magnesium carbonate and magnesium oxide, whereas the model that also included variance scaling gives better results for calcium carbonate.
This method is therefore well suited to monitoring the calcination of dolomitic rocks and of other carbonate-rich rocks. The major advantages are that these analyses do not require extensive sample preparation, are non-invasive, do not produce residues that are harmful to the environment, and are less time-consuming, especially when used on a routine basis.

Figure 1. Mixture design of ternary system of calcium carbonate, magnesium carbonate and magnesium oxide.

Figure 2. Loadings for latent variables of the reference model.

Figure 3. Sub-regions selected through the 2nd and 3rd latent variable loadings.

Table 3. Sub-regions selected for each analyte

Table 2. Experimental parameters for spectra acquisition

Table 1. Percentage weight of the samples in the standard set

Table 4. Prediction values for all analytes and RMSEP values for both models