Introduction
A considerable effort has been undertaken in the last decades aiming at the characterization of crude oils and their derivatives to uncover the amazing chemical complexity of such matrices.1 Mass spectrometry (MS) has played a central role in crude oil analysis due to its high speed, selectivity, resolution and mass accuracy, allowing unsurpassed analytical power. Such MS data have been used to obtain class distributions in terms of molecular formulas and unsaturation levels, and to correlate this composition information with the geochemical and physicochemical properties of the oil in a field known as petroleomics.2
Petroleomics studies have been conducted using three types of MS analyzers: Orbitrap,3,4 multi-reflecting time-of-flight (TOF)5 and Fourier transform ion cyclotron resonance (FT-ICR).6,7 FT-ICR MS is the most commonly used because of its superior resolution and mass accuracy, which allow a much larger number of molecular formulas to be assigned.8
For the ionization techniques, the most common has been, by far, electrospray ionization (ESI).9-12 ESI FT-ICR MS has, therefore, been widely used to determine sulfur-containing species in heavy crude oil13 and non-polar polyaromatic hydrocarbons and polyaromatic heterocycles in asphaltenes,10,14 as well as to classify biodegraded and non-biodegraded crude oils focused on the O2-class.12 Atmospheric pressure photoionization (APPI) has also been applied to crude oil analysis,15-17 but much less intensively, most particularly when the goal is to efficiently ionize nonpolar sulfur species and polycyclic aromatic hydrocarbons, which are interesting classes because they affect the petroleum refinery and its regulation by governmental agencies.18 The direct infusion of crude oil solutions on APPI can positively charge species to produce both radical cations (M+•) and protonated molecules ([M + H]+)18 by APPI(+) acquisition mode or negatively to produce deprotonated molecules ([M – H]–) by APPI(–) acquisition mode.
In FT-ICR MS analysis of crude oils, the selection of experimental conditions is crucial because a multitude of parameters and settings must be adjusted, and these can have a dramatic impact on data quality, particularly because conclusions are drawn on a comparative basis. In these analyses, multi-parameter systems are commonly optimized by the so-called ‘one factor at a time’ (OFAT) approach, in which one factor is varied while the others are held constant until the desired response is achieved.19 Design of experiments (DOE), in contrast, is a multivariate statistical tool for planning, conducting, analyzing and interpreting experiments; it uses statistical theory to choose the values of each studied factor, thus maximizing the information obtained about these factors with respect to one or more responses. Researchers have therefore benefitted from DOE to minimize waste and cost, and to extract the most relevant information from the smallest number of experiments.20 In MS, DOE has been applied mostly to optimize ionization efficiencies.19,21,22 DOE does not seem to have been applied in petroleomics studies, where only OFAT approaches have been used. However, OFAT implicitly assumes the absence of statistical interaction between variables and relies on the intuition and on the practical and theoretical knowledge of the experimenter. The simplicity of OFAT data analysis also obscures the fact that, for many problems, this central assumption is invalid.23
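The limitation of OFAT can be illustrated with a minimal sketch. The four response values below are synthetic numbers chosen only to show a strong interaction between two hypothetical factors A and B; they are not measured data. Starting from the low/low setting, OFAT never visits the high/high corner, whereas the two-level factorial design both finds it and quantifies the A×B interaction.

```python
# Synthetic responses for a 2x2 factorial in coded units (-1 = low, +1 = high).
# These values are illustrative only, not experimental results.
y = {(-1, -1): 50.0, (1, -1): 40.0, (-1, 1): 45.0, (1, 1): 80.0}

def effect(contrast):
    """Mean response where the contrast is +1 minus mean response where it is -1."""
    hi = [val for run, val in y.items() if contrast(run) > 0]
    lo = [val for run, val in y.items() if contrast(run) < 0]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

effect_a = effect(lambda run: run[0])            # main effect of A
effect_b = effect(lambda run: run[1])            # main effect of B
effect_ab = effect(lambda run: run[0] * run[1])  # A x B interaction

# OFAT from (low, low): vary A with B held low, keep the better A, then vary B.
best_a = max((-1, 1), key=lambda a: y[(a, -1)])
best_b = max((-1, 1), key=lambda b: y[(best_a, b)])
ofat_optimum = (best_a, best_b)

# The factorial design evaluates all four corners and picks the best one.
factorial_optimum = max(y, key=y.get)

print("effects A, B, AB:", effect_a, effect_b, effect_ab)
print("OFAT optimum:", ofat_optimum, "factorial optimum:", factorial_optimum)
```

In this toy case each single-factor move from the starting point lowers the response, so OFAT stops at the starting point even though the best setting requires changing both factors together.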
To the best of our knowledge, DOE has never been applied to optimize the multitude of crucial parameters of petroleomics analysis. We have therefore used this optimization approach to maximize the efficiency of APPI(±) FT-ICR MS for crude oil analysis. A set of the main parameters for both APPI(+)-MS and APPI(–)-MS was evaluated from full and fractional two-level factorial designs and a test case using the optimized parameters was also performed.
Experimental
Samples and materials
Two crude oil samples were provided by Petróleo Brasileiro S. A. (Petrobras) and identified as M01 and C01. The M01 (Campos Basin, Brazil) and C01 (Sergipe-Alagoas Basin, Brazil) oils present American Petroleum Institute (API) gravity of 21 and are from offshore and onshore exploration, respectively. High performance liquid chromatography (HPLC)-grade toluene was purchased from Tedia (Fairfield, USA).
Design of experiments
The nine main parameters (variables) of petroleomics analysis studied were: sheath gas, auxiliary gas, sweep gas, vaporizer temperature, capillary voltage, capillary temperature, tube lens voltage, flow rate and accumulated scans (Figure 1). Briefly, the flow rate is the rate at which the sample solution is infused. The vaporizer temperature is applied in the APPI probe to vaporize the compounds in the solution. The sheath gas and auxiliary gas are nitrogen streams applied coaxially to aid the nebulization of the compounds, whereas the sweep gas, also nitrogen, flows in the opposite direction to eliminate unwanted interferents and aggregates. The capillary voltage and capillary temperature are associated with the energy regime of the ions entering the MS instrument. The tube lens voltage is applied to the lenses that focus and transfer the ions into the MS analyzer. The accumulated scans parameter sets the number of transients (time-domain spectra) that are accumulated or summed to produce the final mass spectrum.24

Figure 1 Scheme of the APPI FT-ICR MS system with the nine parameters evaluated from the DOE: sheath gas, auxiliary gas, sweep gas, vaporizer temperature, capillary voltage, capillary temperature, tube lens voltage, flow rate and accumulated scans.
Initially, a 2⁹⁻⁴ fractional factorial design was applied to APPI(±)-MS using the M01 sample to screen the variables and select the significant effects. Then, 2⁵ and 2³ full factorial designs with five center points were performed for APPI(+) and APPI(–), respectively, using only the significant factors previously identified, for both crude oils C01 and M01 (Table 1). Four main responses were evaluated: the maximum MS intensity, the number of detected ions, the number of molecular formulas and the number of classes. The low and high values of the parameters were chosen based on reasonable operating conditions for the APPI(±) FT-ICR MS system. The Design-Expert version 6.0.4 software25 was used for DOE processing, and the data were evaluated at the 95% confidence level.
Table 1 Factors, parameters and levels of the 2⁹⁻⁴ fractional factorial design and of the 2⁵ and 2³ full factorial designs applied to APPI(±) FT-ICR MS for crude oil analysis
Mode | Factor | Parameter (variable) | Low | Center | High |
---|---|---|---|---|---|
APPI(+)-MS | A | sheath gas / arb | 0ᵃ | 20ᵃ | 40ᵃ |
APPI(+)-MS | B | auxiliary gas / arb | 0ᵃ | 20ᵃ | 40ᵃ |
APPI(+)-MS | C | sweep gas / arb | 0ᵃ | 20ᵃ | 40ᵃ |
APPI(+)-MS | D | vaporizer temperature / °C | 200 | 300 | 400 |
APPI(+)-MS | E | capillary voltage / V | 0 | 70 | 140 |
APPI(+)-MS | F | capillary temperature / °C | 200ᵃ | 300ᵃ | 400ᵃ |
APPI(+)-MS | G | tube lens voltage / V | 0 | 115 | 230 |
APPI(+)-MS | H | flow rate / (µL min-1) | 10ᵃ | 30ᵃ | 50ᵃ |
APPI(+)-MS | J | accumulated scans | 100 | 200 | 300 |
APPI(–)-MS | A | sheath gas / arb | 0 | 30 | 60 |
APPI(–)-MS | B | auxiliary gas / arb | 0ᵃ | 30ᵃ | 60ᵃ |
APPI(–)-MS | C | sweep gas / arb | 0ᵃ | 20ᵃ | 40ᵃ |
APPI(–)-MS | D | vaporizer temperature / °C | 200 | 300 | 400 |
APPI(–)-MS | E | capillary voltage / V | 0ᵃ | -40ᵃ | -80ᵃ |
APPI(–)-MS | F | capillary temperature / °C | 200 | 300 | 400 |
APPI(–)-MS | G | tube lens voltage / V | -150 | -100 | -50 |
APPI(–)-MS | H | flow rate / (µL min-1) | 10 | 30 | 50 |
APPI(–)-MS | J | accumulated scans | 100 | 200 | 300 |
ᵃ Factors and levels used in the 2⁵ and 2³ full factorial designs. APPI(+)-MS and APPI(–)-MS: positive and negative acquisition mode atmospheric pressure photoionization mass spectrometry, respectively.
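The run counts of these designs can be reproduced with a short sketch in coded units (-1 = low, 0 = center, +1 = high). The generators used below for the fractional design (F = BCDE, G = ACDE, H = ABDE, J = ABCE) are a common textbook choice for a resolution IV design with nine factors in 32 runs and are an assumption here; the generators actually used by the DOE software are not stated. The function names are hypothetical.

```python
from itertools import product

# Assumed generators for a 32-run, nine-factor, resolution IV fractional
# factorial design; the source does not state which generators were used.
GENERATORS = {"F": "BCDE", "G": "ACDE", "H": "ABDE", "J": "ABCE"}

def fractional_nine_factor():
    """Return the 32 runs in coded units (-1/+1) as dicts keyed by factor letter."""
    runs = []
    for levels in product((-1, 1), repeat=5):
        run = dict(zip("ABCDE", levels))      # base full factorial in A-E
        for new, parents in GENERATORS.items():
            value = 1
            for p in parents:                 # generated column = product of parents
                value *= run[p]
            run[new] = value
        runs.append(run)
    return runs

def full_factorial(letters, n_center=5):
    """Two-level full factorial plus n_center replicated center points (coded 0)."""
    runs = [dict(zip(letters, lv)) for lv in product((-1, 1), repeat=len(letters))]
    runs += [dict.fromkeys(letters, 0) for _ in range(n_center)]
    return runs

def decode(coded, low, high):
    """Map a coded level (-1, 0, +1) to the real setting between low and high."""
    return (low + high) / 2 + coded * (high - low) / 2

screening = fractional_nine_factor()
appi_pos = full_factorial("ABCFH")  # five factors marked for APPI(+) in Table 1
appi_neg = full_factorial("BCE")    # three factors marked for APPI(-) in Table 1
print(len(screening), len(appi_pos), len(appi_neg))  # 32 37 13
print(decode(0, 200, 400))  # center capillary temperature, 300.0
```

The 37 and 13 runs correspond to 32 and 8 factorial points, respectively, plus the five center points in each case.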
FT-ICR MS analysis
These analyses were performed using a 7.2 T LTQ FT Ultra mass spectrometer (Thermo Scientific, Bremen, Germany) equipped with a direct infusion APPI source operating in both the positive and negative ion modes. The crude oils were prepared in toluene at a final concentration of 1 mg mL-1. Data were acquired over the range of m/z 100-1000 with the Xcalibur 2.0 software (Thermo Scientific) using a mass resolving power of 400,000 at m/z 400. Molecular formulas were attributed to the ions by comparing their m/z values with a library of compounds present in the database of the PetroMS software26 (Petrobras, Rio de Janeiro, Brazil and University of Campinas, Campinas, Brazil), which is based on literature searches and standards. The data were processed through the following steps: (i) assignment of m/z for each spectrum signal; (ii) automatic allocation of the optimal threshold for the noise intensity of each individual spectrum; (iii) internal calibration of the spectrum by homologous series using the most intense class; and (iv) assignment of a molecular formula to each signal by comparing the experimental m/z with a database of theoretical m/z values for possible crude oil constituents.4 The automatic gain control (AGC) was kept at a fixed value for all measurements. From the mass spectrum acquired for each design run, two responses (maximum intensity and number of detected ions) were collected using the Xcalibur 2.0 software, whereas the other two responses (number of molecular formulas and number of classes) were collected after processing the spectrum with the PetroMS software.26
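Step (iv) above, the comparison of experimental and theoretical m/z values, can be sketched as follows. This is only an illustration of the principle and not the PetroMS algorithm: the candidate list, the function names and the 1 ppm default tolerance are assumptions made for the example. Theoretical m/z values are computed from monoisotopic masses for the three ion types formed by APPI.

```python
# Monoisotopic masses (u) of the most abundant isotopes and the electron mass.
MASS = {"C": 12.0, "H": 1.00782503223, "N": 14.00307400443,
        "O": 15.99491461957, "S": 31.9720711744}
ELECTRON = 0.000548579909

def neutral_mass(formula):
    """Monoisotopic mass of a formula given as a dict, e.g. {"C": 16, "H": 10}."""
    return sum(MASS[el] * n for el, n in formula.items())

def ion_mz(formula, ion_type):
    """Theoretical m/z of a singly charged ion formed from the neutral molecule."""
    m = neutral_mass(formula)
    if ion_type == "M+.":     # radical cation: loss of one electron
        return m - ELECTRON
    if ion_type == "[M+H]+":  # protonated molecule: gain of H, loss of one electron
        return m + MASS["H"] - ELECTRON
    if ion_type == "[M-H]-":  # deprotonated molecule: loss of H, gain of one electron
        return m - MASS["H"] + ELECTRON
    raise ValueError(f"unknown ion type: {ion_type}")

def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

def assign(measured_mz, candidates, ion_type, tol_ppm=1.0):
    """Return (name, error) of the closest candidate within tol_ppm, or None."""
    best = None
    for name, formula in candidates.items():
        err = ppm_error(measured_mz, ion_mz(formula, ion_type))
        if abs(err) <= tol_ppm and (best is None or abs(err) < abs(best[1])):
            best = (name, err)
    return best

# Hypothetical two-entry candidate list, for illustration only.
candidates = {"C16H10": {"C": 16, "H": 10},
              "C12H8S": {"C": 12, "H": 8, "S": 1}}
print(assign(202.0777, candidates, "M+."))   # matches C16H10
print(assign(202.0900, candidates, "M+."))   # no candidate within 1 ppm
```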
Application of APPI(±) FT-ICR MS parameters optimized by DOE
The APPI(±) FT-ICR MS parameters optimized by the DOE approach were applied as a test case to the analysis of the C01 crude oil and its saturates, aromatics, resins and asphaltenes (SARA) fractions. The SARA fractions were obtained in a previous study.27 Briefly, an aliquot of the sample was submitted to precipitation of the asphaltenes in n-heptane, and the soluble fraction (maltenes) was transferred to an open glass chromatography column and eluted with n-heptane (saturates fraction), toluene (aromatics fraction) and 90:10 toluene:methanol (resins fraction). Each fraction was concentrated under reduced pressure in a rotary evaporator.27 The crude oil and its SARA fractions were prepared in toluene at a final concentration of 1 mg mL-1 and were analyzed by APPI(±) FT-ICR MS with the parameters optimized by the DOE approach.
Results and Discussion
Initially, the two-level fractional factorial design was applied to screen the effects of the main parameters on the APPI(±) FT-ICR MS analysis of crude oil. After that, for the analysis of C01 and M01 oils, the significant factors selected from the fractional factorial design were used in the full factorial design for the optimization of the APPI(±) FT-ICR MS system.
Fractional factorial design
The 2⁹⁻⁴ fractional factorial design was performed on the M01 crude oil using the APPI(±)-MS conditions described in Table 1. A total of 32 runs were carried out for each ionization mode (Tables S1 and S2 in the Supplementary Information (SI) section, for APPI(+)-MS and APPI(–)-MS, respectively). The initial results show that the MS profile and the four responses change markedly when the APPI parameters are changed, for both APPI(+) and APPI(–), as illustrated by the representative examples in Figures S1 and S2 (SI section), which correspond to runs 8 and 10 for APPI(+) (Table S1) and runs 27 and 11 for APPI(–) (Table S2) of the M01 oil.
The magnitude and statistical significance of the parameter effects on each response were evaluated by constructing normal plots for both the APPI(+) and APPI(–) fractional factorial designs, wherein only the main effects were considered. Figure 2 shows the normal plots for the four responses obtained by APPI(+) FT-ICR MS analysis of the M01 oil. The significant parameter effects were A, B, C, D, F, H and J, but these could be reduced to five because of the low contribution of the D and J parameters, which appeared in just one of the responses. The final significant parameter effects were found to be A, B, C, F and H, which correspond to sheath gas, auxiliary gas, sweep gas, capillary temperature and flow rate, respectively. These five significant parameters were further investigated using a 2⁵ full factorial design.

Figure 2 Normal plots of the 2⁹⁻⁴ fractional factorial design for the maximum intensity, detected ions, molecular formulas and classes responses for APPI(+) FT-ICR MS analysis of the M01 sample. The red circles mark the significant parameters.
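The construction of such a normal plot can be sketched in a few lines. Each main effect is the difference between the mean response at the high and low levels of a factor; the effects are then sorted and paired with expected standard normal quantiles, so that inert factors fall on a straight line through the origin and significant ones stand apart. The three-factor design and the response values below are synthetic (a made-up signal on factor A plus made-up noise), and the plotting position (i - 0.5)/n is one common convention, assumed here.

```python
from itertools import product
from statistics import NormalDist

def estimate_effects(design, response):
    """Main effect of each factor: mean(y at +1) minus mean(y at -1)."""
    effects = {}
    for name in design[0]:
        hi = [y for run, y in zip(design, response) if run[name] > 0]
        lo = [y for run, y in zip(design, response) if run[name] < 0]
        effects[name] = sum(hi) / len(hi) - sum(lo) / len(lo)
    return effects

def normal_plot_points(effects):
    """Pair each effect (sorted ascending) with its expected normal quantile."""
    ordered = sorted(effects.items(), key=lambda kv: kv[1])
    n = len(ordered)
    return [(name, value, NormalDist().inv_cdf((i - 0.5) / n))
            for i, (name, value) in enumerate(ordered, start=1)]

# Synthetic demonstration: only factor A has a real effect (illustrative data).
design = [dict(zip("ABC", lv)) for lv in product((-1, 1), repeat=3)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]  # made-up noise values
response = [100 + 10 * run["A"] + e for run, e in zip(design, noise)]

effects = estimate_effects(design, response)
for name, value, z in normal_plot_points(effects):
    print(f"{name}: effect = {value:7.3f}, normal score = {z:6.3f}")
```

Plotting the effect values against the normal scores gives the normal plot; in this toy case only A lies far from the line defined by the other effects.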
The normal plots were also obtained for the APPI(–) FT-ICR MS analysis of the M01 oil (Figure 3), and the significant parameters were found to be B, C and E, which correspond to auxiliary gas, sweep gas and capillary voltage, respectively. These parameters were further investigated using a 2³ full factorial design. Note that the fractional factorial design showed that the auxiliary gas and sweep gas used for crude oil analysis by FT-ICR MS make an important contribution in both APPI(±) ionization modes, whereas the other parameters were important for a single ionization mode.
Full factorial design
The full factorial design was applied using only the significant factors previously identified (superscript a in Table 1). A total of 37 and 13 runs were performed for APPI(+)-MS and APPI(–)-MS, respectively, and Tables S3-S6 (SI section) list the results. Again, distinct MS profiles were found for the same sample and ionization mode when just a few APPI parameters were changed, as the representative examples in Figures S3-S4 (SI section) show. In these figures we intentionally compared the same runs, that is, the same parameter conditions for both crude oils in APPI(+). The two oils present distinct MS profiles, as expected for crude oils from different basins and with different exploitation characteristics.
Initially, an outlier in the detected ions response was identified in run 7 for the APPI(+) analysis of the M01 oil, so this run was excluded from the processing. Figures 4 and 5 show the normal plots for the four responses obtained from the APPI(±) FT-ICR MS analysis of the M01 oil in the full factorial design. The significant parameters for APPI(+)-MS were: A (sheath gas), D (capillary temperature) and AD, which is the interaction between both parameters. For APPI(–)-MS, the significant parameter effects were: A (sheath gas), B (auxiliary gas) and AB, which is the interaction between them. Note that for APPI(–)-MS, the C parameter was also found to be significant (Figure 5), but only for the maximum intensity response; it was therefore not considered in the interpretation.

Figure 4 Normal plots of the 2⁵ full factorial design for the maximum intensity, detected ions, molecular formulas and classes responses for APPI(+) FT-ICR MS analysis of the M01 sample. The red circles mark the significant parameters.

Figure 5 Normal plots of the 2³ full factorial design for the maximum intensity, detected ions, molecular formulas and classes responses for APPI(–) FT-ICR MS analysis of the M01 sample. The red circles mark the significant parameters.
Figures 6 and 7 show the normal plots for the four responses obtained from the APPI(±) FT-ICR MS analysis of the C01 oil in the full factorial design. The significant parameters for APPI(+)-MS were A (sheath gas), B (auxiliary gas), D (capillary temperature) and some interactions between them, whereas for APPI(–)-MS they were A (sheath gas), C (capillary voltage) and the interaction between both; no significant parameters were found for the classes response.

Figure 6 Normal plots of the 2⁵ full factorial design for the maximum intensity, detected ions, molecular formulas and classes responses for APPI(+) FT-ICR MS analysis of the C01 sample. The red circles mark the significant parameters.

Figure 7 Normal plots of the 2³ full factorial design for the maximum intensity, detected ions, molecular formulas and classes responses for APPI(–) FT-ICR MS analysis of the C01 sample. The red circles mark the significant parameters.
All models were evaluated using different diagnostic plots (Figures S5-S19, SI section) for the APPI(±)-MS of M01 and C01 oils. The diagnostic plots show that the models are significant and reliable for all data, as a result of: (i) the normal distribution of residuals seen in Figures S5A-S19A; (ii) the normal distribution of the residual vs. run in Figures S5B-S19B; (iii) residual vs. predicted in a confidence level of 95% in Figures S5C-S19C; and (iv) the model accuracy, with a good correlation between the predictions and actual results in Figures S5D-S19D.
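The quantities behind such diagnostic plots (predicted values and residuals) can be obtained from a fitted factorial model, as in the minimal sketch below. It relies on the orthogonality of the coded columns of a two-level design, for which each least-squares coefficient reduces to sum(x·y)/N; it therefore applies to the factorial points only, not to the center points. The design, the response values and the function name are hypothetical and serve only as an illustration.

```python
from itertools import product

def fit_orthogonal(design, response, terms):
    """Least-squares fit for a two-level orthogonal design in coded units.

    Each term is a string of factor letters ("A", "AD", ...). Because the
    coded columns are mutually orthogonal, each coefficient is sum(x*y)/N.
    Returns (coefficients, predicted values, residuals).
    """
    def column(run, term):
        value = 1
        for letter in term:
            value *= run[letter]
        return value

    n = len(design)
    coef = {"1": sum(response) / n}  # intercept = grand mean
    for term in terms:
        coef[term] = sum(column(r, term) * y for r, y in zip(design, response)) / n
    predicted = [coef["1"] + sum(coef[t] * column(r, t) for t in terms)
                 for r in design]
    residuals = [y - p for y, p in zip(response, predicted)]
    return coef, predicted, residuals

# Synthetic example: the true response contains a small B term that the fitted
# model (A, D and AD only) leaves in the residuals. Values are illustrative.
design = [dict(zip("ABD", lv)) for lv in product((-1, 1), repeat=3)]
response = [10 + 2 * r["A"] + 3 * r["D"] + 1.5 * r["A"] * r["D"] + 0.5 * r["B"]
            for r in design]

coef, predicted, residuals = fit_orthogonal(design, response, ["A", "D", "AD"])
print(coef)
print(list(zip(predicted, residuals)))  # points of a residual vs. predicted plot
```

Plotting the residuals against run order, against the predicted values and on a normal probability scale gives the usual diagnostic views; structure in any of them suggests that a term is missing from the model.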
After the evaluation of the models, a “numerical optimization” based on the desirability concept was used to obtain the best combination of the factors. This combination was intended to maximize the four responses for each ionization mode and crude oil, using the same weight for each response. Tables S7-S10 (SI section) show the desirability level for the three best conditions obtained. For example, the APPI(+)-MS data of the M01 oil (Table S7) show that the highest desirability (71%) was achieved using the following parameter levels: sheath gas (40), auxiliary gas (0), sweep gas (7), capillary temperature (400 °C) and flow rate (ca. 50 µL min-1). The same data interpretation was performed (Tables S8-S10) for both APPI ionization modes and both crude oils. Table 2 summarizes the final APPI(±) FT-ICR MS parameters optimized by the fractional and full factorial designs.
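The desirability concept can be illustrated with a short sketch. It assumes the common larger-is-better form, in which each response is mapped linearly onto a 0-1 scale between a lower limit and a target, and the individual desirabilities are combined through their geometric mean when all responses carry the same weight. The response ranges, the two candidate settings and their predicted responses are hypothetical numbers, not values from this study.

```python
from math import prod

def desirability_max(y, low, target, weight=1.0):
    """Larger-is-better desirability: 0 at or below low, 1 at or above target."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight

def overall_desirability(values):
    """Geometric mean of the individual desirabilities (equal importance)."""
    return prod(values) ** (1.0 / len(values))

# Hypothetical lower limit and target for each of the four responses.
RANGES = {"max_intensity": (0.0, 100.0), "detected_ions": (1000.0, 5000.0),
          "formulas": (500.0, 2500.0), "classes": (5.0, 15.0)}

# Hypothetical model predictions at two candidate parameter settings.
predictions = {
    "setting_1": {"max_intensity": 80.0, "detected_ions": 4000.0,
                  "formulas": 2000.0, "classes": 10.0},
    "setting_2": {"max_intensity": 100.0, "detected_ions": 2000.0,
                  "formulas": 1000.0, "classes": 15.0},
}

scores = {
    name: overall_desirability(
        [desirability_max(pred[r], *RANGES[r]) for r in RANGES])
    for name, pred in predictions.items()
}
best = max(scores, key=scores.get)
print(scores, best)
```

Because the geometric mean is used, a setting that performs poorly on any single response is penalized strongly, which favors balanced compromises over settings that maximize only one response.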
Table 2 APPI(±) FT-ICR MS parameters optimized by DOE for crude oil analysis in the petroleomics study
Parameter | M01, APPI(+)-MS | M01, APPI(–)-MS | C01, APPI(+)-MS | C01, APPI(–)-MS |
---|---|---|---|---|
Sheath gas / arb | 40ᵃ | 20 | 40ᵃ | 20 |
Auxiliary gas / arb | 0ᵃ | 60ᵃ | 0ᵃ | 0ᵃ |
Sweep gas / arb | 0ᵃ | 0ᵃ | 0ᵃ | 0ᵃ |
Vaporizer temperature / °C | 300 | 300 | 300 | 300 |
Capillary voltage / V | 70 | 0ᵃ | 70 | 0ᵃ |
Capillary temperature / °C | 400ᵃ | 300 | 400ᵃ | 300 |
Tube lens voltage / V | 115 | -100 | 115 | -100 |
Flow rate / (µL min-1) | 50ᵃ | 30 | 50ᵃ | 30 |
Accumulated scans | 100 | 100 | 100 | 100 |
ᵃ Values determined from the two-level full factorial design. The other values were determined from the two-level fractional factorial design. M01: sample extracted from Campos Basin, Brazil; C01: sample extracted from Sergipe-Alagoas Basin, Brazil; APPI(+)-MS and APPI(–)-MS: positive and negative acquisition mode atmospheric pressure photoionization mass spectrometry, respectively.
The parameter values without the superscript a (Table 2) were obtained from the two-level fractional factorial design and were fixed at the center point value (Table 1) because they have no significant effect on the responses; any value within their range can therefore be used. When the two distinct oils are compared for each ionization mode (Table 2), only the auxiliary gas value for APPI(–)-MS differs, with values of 60 and 0 for the M01 and C01 oils, respectively. The final optimized values are therefore essentially the same regardless of the crude oil used, which suggests that the best APPI(±)-MS conditions should remain quite similar from one oil to another.
Comparing our results with the conditions recently described for crude oil analysis using APPI(±) FT-ICR MS, vaporizer temperatures of 250-350 °C (ref. 28) and 300 °C (ref. 29) have been reported for APPI(±)-MS and APPI(+)-MS, respectively. These values are consistent with the value reported herein. We found no studies describing capillary voltage values. Pereira et al.30 used a capillary temperature of 250 °C in the analysis of asphaltenes, but we have not found such a temperature when analyzing whole crude oils. Table 2 shows that optimized values of 400 and 300 °C for APPI(+)-MS and APPI(–)-MS, respectively, were obtained in our study. The flow rate has been recognized as one of the most important parameters for crude oil analysis using APPI, and it should be higher than for ESI. For example, Bae et al.31 used a flow rate of crude oil solution three times higher for APPI than for ESI. Optimal flow rates equal to those found in our study have been reported, for example, 50-100 (ref. 28), 50 (ref. 29) and 50 µL min-1 (ref. 32). In agreement with our results, the same optimal number of accumulated scans has also been reported.15,32,33 Some studies have, however, used twice as many accumulated scans (200), but we found that accumulating more than 100 scans results in mass spectra of similar quality.34,35 Some studies have described the use of sheath, auxiliary and sweep gases in crude oil analysis by APPI(±)-MS, but the values are not given in the manuscripts.15,31,32
Application of the APPI(±)-MS optimized conditions
The APPI(±) FT-ICR MS parameters optimized by DOE (Table 2) were then applied to the analysis of the C01 crude oil and its SARA fractions in order to consolidate the optimization. The SARA fractions were chosen because they selectively represent different classes of constituents in terms of polarity and aromaticity, which may be suppressed in the whole crude oil spectra. Figures S20-S21 (SI section) show that the optimized APPI(±)-MS parameters led to high quality spectra for both the whole crude oil and all fractions. Indeed, each fraction displayed a quite specific MS profile, representative of its different chemical constituents, as also shown by Liu et al.36 and Cho et al.37 The processing of the APPI(±)-MS data acquired using the optimized parameters also provided thousands of molecular formulas assigned with errors below 1 ppm, as well as the class distributions of the whole crude oil and its SARA fractions (Figure 8). Figure 8a shows the class distribution from the APPI(+)-MS data, in which the CH-class is the most abundant in the whole crude oil and in all SARA fractions except the resins fraction. In the resins fraction the N-class predominates because this fraction contains the most polar components of crude oils. Cho et al.37 also described these findings in another petroleomics study. The APPI(–)-MS data (Figure 8b) reveal a large distribution of Ox-classes (x = 1-4), which are chemical constituents associated with the acidic species in crude oils.38
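A class distribution of this kind can be derived from the list of assigned molecular formulas, as in the minimal sketch below. The class label is built from the heteroatoms N, O and S present in each formula (formulas containing only C and H fall in the CH-class), and the relative abundance of each class is taken here as the sum of the intensities of its members, which is an assumption about how the distribution is computed. The three formulas and intensities are hypothetical.

```python
import re
from collections import Counter

def heteroatom_class(formula):
    """Return the class label ("CH", "N", "O2", "NO", ...) of a formula string."""
    counts = {el: int(n) if n else 1
              for el, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula)}
    label = "".join(f"{el}{counts[el] if counts[el] > 1 else ''}"
                    for el in ("N", "O", "S") if counts.get(el, 0) > 0)
    return label or "CH"

def class_distribution(assignments):
    """Relative abundance (%) per class from (formula, intensity) pairs."""
    totals = Counter()
    for formula, intensity in assignments:
        totals[heteroatom_class(formula)] += intensity
    total = sum(totals.values())
    return {cls: 100.0 * value / total for cls, value in totals.items()}

# Hypothetical assignments (formula, intensity), for illustration only.
assignments = [("C16H10", 50.0), ("C19H13N", 30.0), ("C20H30O2", 20.0)]
distribution = class_distribution(assignments)
print(distribution)  # {'CH': 50.0, 'N': 30.0, 'O2': 20.0}
```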
Conclusions
A comprehensive optimization via DOE of the APPI(±) FT-ICR MS parameters for crude oil analysis in the petroleomics field was performed, and the most important parameters were found to be: sheath gas, auxiliary gas, sweep gas, capillary temperature and flow rate for APPI(+)-MS; and auxiliary gas, sweep gas and capillary voltage for APPI(–)-MS. Continuous use of the optimized parameters in our laboratory for many samples, both crude oils and fractions with quite contrasting compositions, has so far always led to high quality APPI(±)-MS spectra. Ion source parameters are known to behave differently depending on the instrument, but these results are a first step and can serve as a guide for optimal petroleomics analysis via APPI(±)-MS. This becomes even more important considering that most petroleomics studies have not reported detailed information on these parameters.