Effects of Chemical Composition and Pyrolysis Process Variables on Biochar Yields: Correlation and Principal Component Analysis

Based on a systematic review, 19 case studies were selected, focusing on the production of biochar through pyrolysis of five lignocellulosic biomasses (olive husk, beech wood, corncob, spruce wood, and hazelnut shell) under constant pressure (0.1 MPa) and temperatures from 650.2 to 973.0 K. Interactions between process variables (temperature, vapour residence time and heating rate), biomass chemical composition variables (lignin, holocellulose, ash, carbon, nitrogen, oxygen and hydrogen content) and biochar yield (CY) were evaluated using Pearson's correlation matrix and Principal Component Analysis (PCA). Strong correlations (|r| ≥ 0.75, p < 0.05) were found between lignin and CY (0.78), carbon and CY (0.76), and nitrogen and CY (0.77). Three biomass chemical composition variables were the most important for the first principal component (PC1); process variables (heating rate and vapour residence time) were the most important for the second principal component (PC2). Experiments with hazelnut shell as feedstock were associated with higher CY.


INTRODUCTION AND OBJECTIVES
Thermochemical char is a stable carbon-rich by-product (65% to 95% carbon) (Debiagi et al., 2018) resulting from thermochemical degradation of plant or animal biomass (Ahmad et al., 2014) under O2-free conditions or with limited quantities of O2 (Pandey et al., 2020).
Traditionally, char (also known as "charcoal") has been produced by direct burning of woody biomass in a reaction atmosphere in contact with oxygen, a practice used for thousands of years (Weber & Quicker, 2018) in systems such as the "earth-mound kiln" (Adam, 2009). Currently, most charcoal production still occurs in traditional (rudimentary) kilns, resulting in inefficient carbonization, release of CO2 and non-CO2 greenhouse gases (VOCs), and economic losses (Pereira et al., 2017). To overcome these issues, modern technologies and life cycle assessment are now helping to improve efficiency and reduce VOC generation (Azzi et al., 2019).
The wide range of lignocellulosic biomass feedstocks suitable for pyrolysis includes woody, herbaceous and agricultural biomass. The choice depends on local availability, which minimizes transportation costs and the overall carbon footprint of char production (Mukome et al., 2013).
Lignocellulosic biomasses, consisting mainly of cellulose, hemicellulose, and lignin, are considered suitable for energy purposes due to their high volatile matter content (VM in %) associated with low ash content (Tsai et al., 2012) and low sulfur content (Mishra & Mohanty, 2018) compared to mineral coal (Vassilev et al., 2010). Chemically, the biomass composition can be simplified as a complex mixture of carbon, oxygen, sulphur, nitrogen, ash, and small quantities of a few other elements, including alkali, alkaline earth and toxic metals (Tripathi et al., 2016). Lignocellulosic biomasses differ in many properties, such as thermal stability (Pandey et al., 2020), and their components react at different rates (hemicellulose > cellulose > lignin). These rates are affected by parameters such as initiation temperature, heating rate, vapour residence time and the presence of catalysts (e.g., biomass K content). Depending on how the operating parameters are adjusted, the thermochemical conversion of dry biomass under an inert atmosphere can produce a carbonaceous solid residue, bio-oil and syngas through different process modes, such as torrefaction, slow pyrolysis and fast pyrolysis (Jung et al., 2015; Mimmo et al., 2014; Parshetti et al., 2013).
Some investigations on fast pyrolysis have focused on the conditions that maximize the production of the bio-oil fraction, the liquid biofuel that can be used directly without modification in stationary heat and power applications or upgraded to a drop-in biofuel (Bridgwater, 2012; Zhang et al., 2020; Waqas et al., 2018). On the other hand, detailed information about process conditions that maximize char production through pyrolysis of biomass is less common (Tripathi et al., 2016), and further investigation for reactor design and process optimization is needed (Debiagi et al., 2018). This is because different ranges of operational parameters, such as temperature (T), heating rate (HR) and vapour residence time (SRT), in combination with a wide range of biomass feedstocks with varied physico-chemical compositions, result in large variability in char yield and char physico-chemical properties (Rehrah et al., 2018). Mass and energy balances for modelling the thermochemical processes are also recommended for increasing char yield (Jesus et al., 2018).
Regarding biochar properties, physical and chemical characteristics are of fundamental importance for selecting the most appropriate reactor design (Santos et al., 2020) for the intended application of the char, for instance carbon sequestration, since oxidation resistance is a function of feedstock properties and pyrolysis conditions (Han et al., 2018).
Char yield can be defined as the mass of char produced per unit mass of dry feedstock (%-m, dry mass basis). In principle, it is possible to target specific char properties and increase the char yield from a given biomass feedstock (Luo et al., 2015).
Usually, higher HR, moderate T and shorter SRT favour bio-oil production, while lower HR, higher T and longer SRT favour syngas production (Uddin et al., 2013). For char production, the most favourable conditions are lower temperature (T), slower heating rate (HR) and longer vapour residence time (SRT) (Tripathi et al., 2016; Uddin et al., 2013).
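The definition of char yield can be illustrated with a minimal sketch. The function name and the masses used here are hypothetical and are not taken from the case studies.

```python
def char_yield_percent(char_mass_g, dry_feedstock_mass_g):
    """Char yield (%-m, dry basis) = char mass / dry feedstock mass * 100."""
    if dry_feedstock_mass_g <= 0:
        raise ValueError("dry feedstock mass must be positive")
    return 100.0 * char_mass_g / dry_feedstock_mass_g


# Hypothetical run: 50 g of dry biomass leaving 15 g of char.
example_yield = char_yield_percent(15.0, 50.0)
print(f"CY = {example_yield:.1f} %-m")
```

Both masses must be expressed on the same dry basis for the ratio to be comparable across experiments.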
Regarding char characterization, microscopic inspection shows a quaternary structure organized on a decreasing scale as follows: heterogeneous phases, graphene-like aggregates, aromatic clusters, and atomic arrangement (Xiao & Chen, 2017). The organic components include water-soluble organics, aliphatic compounds with high molecular weight and a relatively high proportion of insoluble aromatic structures (Singh et al., 2012). The proportion of different organic components and the degree of condensation of aromatic carbon depend on the feedstock and the process variables (T, SRT, HR) (Sun et al., 2011), since the release of hydrogen (H %-m) and oxygen (O %-m) is favoured over that of carbon (C %-m). As pyrolysis progresses, the reactions therefore result in a porous carbonaceous material with a progressively higher fixed carbon content (FC %-m) (Crombie et al., 2013). When char is produced for use as a soil amendment to increase fertility or sequester atmospheric CO2, it is referred to as biochar (Glaser et al., 2001), in accordance with the International Biochar Initiative and the European Biochar Certificate (Klasson, 2017).
Most publications address the effect of each parameter or variable while keeping the others constant (Guedes et al., 2018). Some authors, for instance, have investigated the products formed through pyrolysis with emphasis on the effect of temperature (Palamanit et al., 2019; Zhang et al., 2020). However, few studies have so far analysed the combined effect of three or more variables simultaneously on biochar yield (Morales et al., 2015; Weber & Quicker, 2018; Yadav et al., 2019; Li et al., 2019), and their methodological approach did not apply Principal Component Analysis (PCA). One of the few cases worth mentioning is the application of multivariate statistical methods to select the best biomasses for bioenergy purposes (Couto et al., 2013; Garcia et al., 2019; Júnior et al., 2016).
The aim of the present study was to identify the relative contribution of different chemical composition and pyrolysis process variables (individually and in groups) to the char yield (referred to as biochar, considering agronomic and carbon sequestration purposes), using exploratory multivariate analysis. The specific objectives were: (1) to identify relevant correlations between biomass chemical composition variables, pyrolysis process variables and biochar yield, using as feedstock agricultural and forest by-products that have shown promising results in previous studies focused on bioenergy production; (2) to determine the number of principal components needed to explain more than 50% of the total variance of the compiled data; and (3) to briefly discuss the influence of biomass composition and process variables on the char yield.

Systematic review eligibility criteria
Scientific publications containing primary data on char production from five selected feedstocks (beech wood, hazelnut shell, olive husk, spruce wood and corncob) were identified through a systematic literature survey. These feedstocks have similar energy content values, as follows: beech wood (19.6 MJ/kg), hazelnut shell (19.5 MJ/kg), olive husk (21.8 MJ/kg), spruce wood (20.5 MJ/kg) and corncob (17.3 MJ/kg) (Saidur et al., 2011), an attribute required for the purpose of the present study.
The influence of the pyrolysis kinetic model, reactor model, catalysts, inert carrier gas flow rate, biomass moisture and particle size was not included in the present study due to lack of data about one or more of these variables in the case studies. In short, the following eligibility criteria were considered to select the publications:
• Criterion 1: Publications related solely to slow or fast pyrolysis (combinations with any other thermochemical technique were excluded) for each of the five selected biomasses (corncob, olive husk, spruce wood, hazelnut shell, beech wood).
• Criterion 2: Available information about the biomass chemical composition, including at least the following variables in percentage of mass (%-m): content of holocellulose (HC, which is cellulose + hemicellulose), lignin (LG), carbon (C), hydrogen (H), oxygen (O), nitrogen (N) and ash (ASH, with ash content ≤ 4%).
• Criterion 3: Information available on the pyrolysis process, including the following process variables: final temperature (T) in K, vapour residence time (SRT) in s, heating rate (HR) in K/s and pressure (P) in MPa. The pressure had to be constant (around 0.1 MPa) and the temperature ≥ 650 K, because biochar yield shows a steady decrease as the pyrolysis temperature rises above 673.15 K (400 °C) (Zhang et al., 2020).
• Criterion 4: Information available on the distribution of pyrolysis products, with at least the dependent variable char yield, CY (%-m).
• Criterion 5: Publications describing experiments and original data (review papers were not included).
• Criterion 6: Publications in indexed journals registered in the Journal Citation Report (JCR).
The survey was carried out with this set of criteria on the Scopus platform (TITLE-ABS-KEY), last accessed on 18 December 2019. From all the documents retrieved, five publications met these criteria: Demirbas et al. (1996); Antal et al. (2000); Demirbaş (2001); Pütün et al. (2001) and Demiral et al. (2012), resulting in 19 experimental case studies.
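The numerical part of Criterion 3 can be sketched as a simple filter. The records, the field names and the pressure tolerance of 0.01 MPa are hypothetical choices made for illustration; the source only states "around 0.1 MPa" and T ≥ 650 K.

```python
def celsius_to_kelvin(t_celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    return t_celsius + 273.15


def meets_criterion_3(record, min_temperature_k=650.0,
                      target_pressure_mpa=0.1, pressure_tolerance_mpa=0.01):
    """True if all process variables are reported and within the stated limits."""
    required = ("T_K", "SRT_s", "HR_K_per_s", "P_MPa")
    if any(record.get(key) is None for key in required):
        return False  # a missing process variable excludes the experiment
    pressure_ok = abs(record["P_MPa"] - target_pressure_mpa) <= pressure_tolerance_mpa
    return pressure_ok and record["T_K"] >= min_temperature_k


# Hypothetical candidate experiments (not data from the reviewed papers).
candidates = [
    {"id": "A", "T_K": celsius_to_kelvin(500.0), "SRT_s": 600.0, "HR_K_per_s": 0.5, "P_MPa": 0.1},
    {"id": "B", "T_K": celsius_to_kelvin(300.0), "SRT_s": 600.0, "HR_K_per_s": 0.5, "P_MPa": 0.1},
    {"id": "C", "T_K": 800.0, "SRT_s": None, "HR_K_per_s": 5.0, "P_MPa": 0.1},
    {"id": "D", "T_K": 800.0, "SRT_s": 300.0, "HR_K_per_s": 5.0, "P_MPa": 1.0},
]
eligible_ids = [r["id"] for r in candidates if meets_criterion_3(r)]
print(eligible_ids)
```

In this sketch, experiment B fails the temperature limit, C lacks a residence time and D was run under pressure, so only A is retained.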

Statistical Analyses
Variables related to biomass chemical composition and the pyrolysis process were analysed. Data obtained from the scientific publications selected after applying the filters were treated statistically using (i) Correlation Analysis (CA) and (ii) Principal Component Analysis (PCA). All statistical analyses were carried out using computational routines in R software, version 4.0.2 (R Core Team, 2018). The production of graphs and figures was supported by the following R libraries: dplyr, readxl, stringr, factoextra, colorspace, FactoMineR, corrplot and dendextend.

Principal Component Analysis (PCA)
Principal component analysis (PCA) is an orthogonal transformation that converts a data set into a principal component space, in other words, into a new coordinate system that helps to find relationships among multiple interrelated variables (Choi, Choi and Park 2012). The first principal component (PC1) explains the largest portion of the observed variance. The second principal component (PC2) explains the second largest portion of the observed variance, and so on (Jolliffe, 2002). In the present study, the original data were normalized as a pre-treatment; the correlation matrix (Pearson's correlation) was built from the normalized data; then, a new system of coordinates (principal components) was generated. The eigenvalues and eigenvectors of the correlation matrix among the 11 variables were computed (Davò et al., 2016). Eigenvector loadings, correlations, and the percentage contribution of each original variable to the principal components were obtained. The components which together explained more than 50% of the variance were analysed (Choi, Choi, Park et al. 2012).
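The sequence described above (normalization, correlation matrix, eigen-decomposition, explained variance) can be sketched in a few lines. This is only an illustration under stated assumptions: the study itself used R, the three-variable data set below is invented, the sample standard deviation (n − 1) is assumed for normalization, and power iteration with deflation is swapped in for a full eigen-solver to keep the example dependency-free.

```python
import math


def standardize(column):
    """Centre on the mean and scale by the sample standard deviation (n - 1)."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]


def correlation_matrix(z_columns):
    """Pearson correlation matrix of already standardized columns."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
            for ci in z_columns]


def mat_vec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def leading_eigenpairs(matrix, k, iterations=500):
    """Power iteration with deflation; adequate for small symmetric matrices."""
    work = [row[:] for row in matrix]
    size = len(work)
    pairs = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(size)]  # arbitrary starting vector
        for _ in range(iterations):
            w = mat_vec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(work, v)))
        pairs.append((eigenvalue, v))
        # Deflate: remove the component just found before searching for the next.
        work = [[work[i][j] - eigenvalue * v[i] * v[j] for j in range(size)]
                for i in range(size)]
    return pairs


# Invented illustration data (NOT values from Table 1).
data = {
    "LG": [20.0, 25.0, 30.0, 35.0, 40.0, 45.0],
    "HC": [78.0, 74.0, 69.0, 66.0, 58.0, 55.0],
    "HR": [0.5, 10.0, 2.0, 8.0, 1.0, 5.0],
}
z = [standardize(col) for col in data.values()]
corr = correlation_matrix(z)
pairs = leading_eigenpairs(corr, k=2)
# For a correlation matrix the eigenvalues sum to the number of variables.
explained = [eigenvalue / len(data) for eigenvalue, _ in pairs]
for i, share in enumerate(explained, start=1):
    print(f"PC{i}: {100 * share:.1f}% of the variance")
```

Because the eigenvalues of a correlation matrix sum to the number of variables, dividing each eigenvalue by that number gives the share of variance explained by the corresponding component.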

Selected case studies
The chemical composition variables of the five selected agricultural/forest biomasses (spruce wood, beech wood, corncob, hazelnut shell and olive husk) and the pyrolysis process variables, namely temperature (T), heating rate (HR) and vapour residence time (SRT), from the 19 case studies selected through the systematic review are compiled in Table 1. The descriptive statistics for these variables (Table 2) show large variability, as evidenced by the high coefficients of variation (CV), some of them with CV >> 20%, such as: LG (48.1%), ASH (82.7%), N (65.4%), SRT (152.3%) and HR (102.7%). Figure 1 shows the Pearson's correlation coefficient matrix for all 11 variables. It was built with data from Table 1 normalized by their mean and standard deviation according to Table 3.
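The two descriptive steps mentioned here, the coefficient of variation and normalization by mean and standard deviation, can be sketched as follows. The heating-rate values are hypothetical, and the use of the sample standard deviation (n − 1) is an assumption, since the source does not state which estimator was used.

```python
import statistics


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / mean * 100."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def z_scores(values):
    """Normalize values by their mean and sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical heating rates in K/s (not the values from Table 1).
heating_rates = [0.04, 0.5, 1.0, 5.0, 10.0]
cv_hr = coefficient_of_variation(heating_rates)
hr_normalized = z_scores(heating_rates)
print(f"CV = {cv_hr:.1f}%")
print([round(v, 3) for v in hr_normalized])
```

After normalization every variable has zero mean and unit standard deviation, so variables measured in different units (K, s, %-m) contribute on the same scale to the correlation matrix.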

Correlations Analysis
According to Figure 1, strong correlations are observed between the following pairs of chemical composition variables: HC vs LG (-1.0); HC vs ASH (-0.72); HC vs C (-0.79); HC vs N (-0.97); LG vs C (0.81); LG vs N (0.96); ASH vs N (0.73). A moderate correlation was observed between ASH and LG (0.69). Strong correlations were also found between the elemental variables C vs N (0.75) and H vs O (-0.93). Other strong correlations (|r| ≥ 0.75; p < 0.05) occurred between CY and structural or elemental biomass composition variables: CY vs LG (0.78), CY vs HC (-0.77), CY vs C (0.76) and CY vs N (0.77).
Figure 1. Pearson's correlation matrix with 11 variables. HC, LG, ASH: total holocellulose, lignin and ash (%) respectively; C, H, O, N: carbon, hydrogen, oxygen, and nitrogen (%) respectively; T, SRT, HR: temperature (K), vapour residence time (s) and heating rate (K/s) respectively; CY: char yield on a dry mass basis (%-m). More circular shapes suggest weaker correlations and more elliptical shapes suggest stronger correlations. An elliptical shape bending towards the left (from white to dark red) means a negative correlation, and one bending towards the right (from white to dark blue) means a positive correlation. Dark blue or dark red indicates strong correlations. The levels of significance (p-values) are found in Table 4.
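The threshold and significance check applied to the correlations with CY can be sketched as below. It assumes that all 19 case studies enter each correlation (17 degrees of freedom) and uses the standard t-test for a Pearson coefficient with the tabulated two-sided 5% critical value for 17 degrees of freedom (2.110); the exact test used for Table 4 is not stated in the text, so this is an illustration rather than a reproduction.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


T_CRIT_DF17 = 2.110     # two-sided 5% critical value, Student's t, 17 df
STRONG_THRESHOLD = 0.75  # |r| threshold used in the text
N_CASES = 19             # assumption: every pair uses all 19 case studies

reported = {"CY vs LG": 0.78, "CY vs HC": -0.77, "CY vs C": 0.76, "CY vs N": 0.77}
summary = {}
for pair, r in reported.items():
    t = t_statistic(r, N_CASES)
    summary[pair] = {
        "strong": abs(r) >= STRONG_THRESHOLD,
        "significant": abs(t) > T_CRIT_DF17,
    }
    print(f"{pair}: r = {r:+.2f}, t = {t:+.2f}, {summary[pair]}")
```

The helper `pearson_r` is included so the same check can be run on raw columns rather than on reported coefficients.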

Principal Component Analysis (PCA)
Table 5 gives details of the first three principal components. Figure 2 shows that 66.4% of the variance is explained by the 1st and 2nd principal components (PCs). Since PC1 (45.8%) and PC2 (20.6%) together explain about 66.4% of the observed variance, PC3 was excluded from the discussion. Figure 3 and Figure 4 show the contribution of each variable to the first (Dim-1) and second (Dim-2) principal components, respectively. A graphical visualization of the data in Table 5 for the first two principal components (66.4% of the explained variance) is shown in Figure 5.

Correlation between PC1 vs CY and PC2 vs CY
When the biochar yield (CY) is correlated with the first and second principal components (PC1 vs CY and PC2 vs CY, respectively), the results reveal that PC1 has a strong positive correlation with CY (r = 0.8282), whereas PC2 does not (r = -0.3121).

Discussion
The results are discussed in accordance with the following sections: Correlation Analysis (CA) and Principal Component Analysis (PCA).

Correlation between Char Yield and biomass composition variables
The strong correlation between CY and LG is understandable, since LG has a three-dimensional structure and is more resistant to thermal degradation than holocellulose (cellulose plus hemicellulose), owing also to its high level of aromaticity, size and structural arrangement (Haykiri-Acma et al., 2010). Lignin therefore affects the proportion of solid product generated more than the other structural constituents (Akhtar et al., 2012). Since the elemental carbon content (C %-m) forms the three-dimensional structure of biochar, a strong positive correlation between CY and C was also expected, and it was confirmed (0.76).
The strong negative correlation observed between CY and HC is explainable, since a higher holocellulose content implies a lower lignin content in the biomass composition and, therefore, a lower biochar yield (Duku et al., 2011; Tripathi et al., 2016; Kan et al., 2016). This trend is also in accordance with results described in the literature (Lv et al., 2010), since during pyrolysis the tar and syngas yields increase when the cellulose content (one of the holocellulose constituents) is high.
One caveat of correlation analysis and PCA is that one may infer causality between correlated variables where none exists. In the present investigation, for instance, nitrogen was found to be correlated with char yield according to Pearson's correlation (r = 0.77 in Figure 1) and to contribute substantially to PC1 (Table 5), although nitrogen-based compounds of small molecular weight are expected to react fast and probably do not contribute much to the char yield.
The process variables HR and SRT showed a negative correlation (-0.62), which is the only correlation between pyrolysis process variables (Figure 1). The heating rate (HR) in the 19 case studies varied from 0.04 to 10 K/s and the vapour residence time (SRT) varied from 180 to 18,000 seconds. In principle, when HR is high and SRT is low, fast pyrolysis occurs, while low HR and high SRT are typical of slow pyrolysis. Low HR and high SRT favour char production (CY) relative to the other product fractions (bio-oil and syngas). Weber & Quicker (2018) reviewed and summarized the results from several experiments on biochar production. The authors concluded that CY decreases above 400 °C, meaning that the degradation rate of the intermediate solid phase is slower than the degradation rate of the initial biomass (Bach et al., 2016) due to deposition of volatiles on the intermediate solid phase.

Principal Component Analysis (PCA)
For the present study, 66.4% of the variance is explained by the 1st and 2nd principal components (PCs) (Figure 2). The Kaiser criterion was met as well, since these two PCs showed eigenvalues > 1.
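The cumulative variance and the Kaiser criterion can be cross-checked from the reported percentages alone. The sketch assumes, as described in the methods, that the PCA was run on the correlation matrix of 11 standardized variables, in which case the eigenvalues sum to 11 and each eigenvalue equals its explained fraction times 11. The eigenvalues below are therefore derived values, not figures quoted from Table 5.

```python
N_VARIABLES = 11  # variables in the correlation matrix (from the methods)
explained_percent = {"PC1": 45.8, "PC2": 20.6}  # reported in the text

# Eigenvalue implied by each explained share (assumption: correlation-matrix PCA).
implied_eigenvalues = {
    pc: pct / 100.0 * N_VARIABLES for pc, pct in explained_percent.items()
}
cumulative_percent = sum(explained_percent.values())
kaiser_met = all(value > 1.0 for value in implied_eigenvalues.values())

print(f"Cumulative variance: {cumulative_percent:.1f}%")
for pc, value in implied_eigenvalues.items():
    print(f"{pc}: implied eigenvalue = {value:.3f}")
print(f"Kaiser criterion met for both PCs: {kaiser_met}")
```

Under this assumption PC1 alone stays below the 50% target set in the objectives, so two components are the minimum needed to exceed it.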
PC1 explains 45.8% of the total variance. LG, HC, N, C and ASH are the variables with the highest Pearson's correlations with PC1 in Table 5 (0.993; -0.990; 0.948; 0.837 and 0.828, respectively), reflecting the high loading value of each one of them. Each of these variables has a contribution higher than 10% (Table 5; Figure 3). It is worth mentioning that holocellulose (HC) is placed in the opposite quadrant to the other biomass composition variables (LG, N, C and ASH) in Figure 5, reflecting that, unlike those variables, HC affects CY unfavourably.
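The statement that each of these variables contributes more than 10% to PC1 can be checked approximately from the reported correlations. The sketch uses a common convention for PCA on standardized variables, in which the contribution of a variable to a component is its squared correlation with that component divided by the component's eigenvalue. Whether Table 5 was computed exactly this way is an assumption, and the PC1 eigenvalue is itself derived from the reported 45.8% share of 11 variables.

```python
N_VARIABLES = 11
pc1_eigenvalue = 0.458 * N_VARIABLES  # implied by the reported 45.8% (assumption)

# Correlations with PC1 as reported in the text (Table 5).
pc1_correlations = {"LG": 0.993, "HC": -0.990, "N": 0.948, "C": 0.837, "ASH": 0.828}

# Contribution (%) = squared correlation / eigenvalue * 100.
pc1_contributions = {
    name: 100.0 * r * r / pc1_eigenvalue for name, r in pc1_correlations.items()
}
for name, share in pc1_contributions.items():
    print(f"{name}: {share:.1f}% of PC1")
print(f"Five variables together: {sum(pc1_contributions.values()):.1f}%")
```

The sign of the correlation is lost when squaring, which is why HC counts as a major contributor to PC1 even though it points in the opposite direction to LG, N, C and ASH.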
In short, the correlations among biomass composition variables (LG, HC, N, C) and CY are more relevant for PC1 linear combination (Dim 1, horizontal representation) and the correlations among pyrolysis process variables HR and SRT are more relevant for PC2 linear combination (Dim 2, vertical representation).
It is important to note that there is no consensus regarding the minimum percentage of explained variability required when applying PCA. In one investigation, for instance, focused on eucalyptus biomass from different clones as feedstock, the first two PCs accounted for 72% of the total variance of the original data, and this value was considered appropriate (Couto et al., 2013). In another study, including twelve native wood species grouped by physical, anatomical, and chemical characteristics (Lobão et al., 2011), the first two PCs, accounting for only 58% of the explained variance, were accepted as sufficient. In the present study, the first two PCs explained 66.4% of the variance.
The biplot diagram (Figure 5) includes the first and the second principal components (PC1 and PC2, respectively), which together explain 66.4% of the total variance observed. The biplot is used in multivariate methods to show the existing relations among variables, among observations, and between variables and observations (Lipkovich and Smith, 2002). Case studies (the experiments in Table 1) are shown as dots, while variables are presented as vectors. Experiments are grouped in the biplot essentially according to their biomass type: olive husk (1, 2, 3); beech wood (4, 5, 6); spruce wood (7) together with corncob (8, 9, 10, 11, 12, 13); one experiment with corncob (14) standing alone; and hazelnut shell (15, 16, 17, 18, 19). The first dimension (Dim1, Figure 5) displays the vector representing biochar yield (CY) close to and positively associated with the biomass chemical composition variables C and LG, and negatively associated with HC. The experiments carried out with beech wood and one with corncob are associated with the second dimension (Dim2), formed by variables describing the pyrolysis process (HR, SRT and T in decreasing order of importance, as shown by the length of the vectors).

CONCLUSIONS
This investigation was based on 19 case studies taken from the literature through a systematic review. The investigation focused on biochar yield (CY) as the main product of interest after pyrolysis of five types of biomass with similar energy content selected as feedstocks (olive husk, beech wood, corncob, spruce wood, and hazelnut shell). The study included data on seven selected variables describing the feedstock chemical composition (holocellulose, lignin, ash, carbon, hydrogen, oxygen, nitrogen) and three selected variables describing the pyrolysis process (temperature, vapour residence time and heating rate). Based on Pearson's correlation and Principal Component Analysis (PCA), it was concluded that variables representing the biomass chemical composition showed strong correlations with CY and were mainly responsible for the 66.4% of the variance explained by the first two principal components (PC1 and PC2). Although the pyrolysis process variables showed neither strong nor moderate correlation with CY, heating rate and vapour residence time are the main variables contributing to the second principal component in the PCA. If the purpose is to increase biochar yield, the biomass selected as feedstock for pyrolysis should have, in principle, high lignin and carbon contents and a low holocellulose content.
The first principal component (PC1) was well correlated with char yield (CY) (r = 0.8282), which confirms that, for the studied dataset, biomass chemical composition was more relevant in terms of contribution to CY than process parameters; furthermore, experiments with hazelnut shell as feedstock were associated with higher CY.