Models for prediction of individual leaf area of forage legumes

Leaf area is an essential variable for quantifying other important leaf traits in plant physiological studies, such as photosynthetic rate and phosphorus content, which are normalized by area. This is one of the reasons why fast and accurate methods for estimating leaf area are needed. The objective of this work was to fit linear or non-linear regression models to predict the leaf area of six species of forage legumes, based on digital images analyzed with the package LeafArea of the R software. In a field experiment, 100 leaves were randomly collected from each of the following species: Crotalaria juncea (L.), Canavalia ensiformis (L.), Cajanus cajan (L.), Dolichos lablab (L.), Mucuna cinereum (L.), and Mucuna aterrima (Piper & Tracy) Merr., and the length and width of the central leaflet were measured. Subsequently, digital images of each leaf were processed in R to estimate leaf area. These estimates were used to fit leaf area prediction models: seventy leaves were used to fit the models, and the remainder were used for model validation. For the six species, the complete second-degree polynomial model, or derived submodels, can be used to predict leaf area as a function of the length and width of the central leaflet, with R² above 0.98 and mean absolute percentage error below 9%. In these models, the effect of leaf width is generally greater than that of leaf length. The R package LeafArea proved to be a very efficient tool for estimating leaf area by running the ImageJ software, with high precision and easy calibration.


INTRODUCTION
Forage legumes (Fabaceae family), such as lablab (Dolichos lablab L.) and crotalaria (Crotalaria juncea L.), are widely cultivated as green manures because of their capacity for biological nitrogen fixation in the soil (Philippot et al., 2013), which increases the availability of this nutrient for conventional crops (e.g. maize). These plants provide a very efficient plant cover (Perin et al., 2004), help to control weeds (Monquero et al., 2009), provide animal feed (Fiallos et al., 2012) and soil protection against mechanical damage, and reduce nutrient losses by leaching and/or percolation (Souza et al., 2012). Nevertheless, the biomass production and the efficient use of these species are directly related to foliage production.
Leaf area is a commonly analyzed variable in field studies (Wang & Zhang, 2012) with woody species, agricultural crops, and weeds. This is because several leaf characteristics are typically normalized by leaf area, such as maximum rate of net photosynthesis, dark respiration rate, nitrogen content, and phosphorus content (Osnas et al., 2013). In phytopathometry, the calculation of leaf area through image analysis is essential to evaluate disease severity and the degree of pest damage, replacing diagrammatic disease scales. However, accurate measurements of leaf area usually require expensive equipment, making this type of procedure unviable, especially on a large scale.
There are also indirect, non-destructive methods of leaf area measurement, which circumvent logistical difficulties in obtaining data. These methods apply dimensional or allometric analysis, using mathematical equations that relate linear measures of the leaf blade to its area (Marchi et al., 2011). In general, these methods are simple, efficient, and inexpensive, and are based on linear models (Souza & Amaral, 2015). They avoid leaf excision and thus eliminate the need for leaf area meters or geometric reconstructions.
Nowadays, digital cameras are promising devices for the measurement of leaf area in the field because they are easy to handle, cheaper than leaf area meters, and perhaps more accurate than methods based on leaf dimensions, especially when leaves are damaged (Godoy et al., 2007). The estimation of leaf area through digital images has already been performed with several species, such as legumes (Cargnelutti Filho et al., 2015; Cargnelutti Filho et al., 2012; Toebe et al., 2012), soybean (Richter et al., 2014), common bean (Martin et al., 2013), grasses (Zanchi et al., 2009), and perennial crops (Godoy et al., 2007).
The package LeafArea (Katabuchi, 2015) of the R software (R Core Team, 2016) provides a convenient way to run the software ImageJ (Rasband, 2016; Schneider et al., 2012) for the analysis of digital images. The package provides an easy-to-use automated tool to measure the leaf area of several images simultaneously, but it requires the excision of leaves, so the same leaves cannot be measured again later (Rouphael et al., 2010).
The objective of this work was to fit regression models, linear or non-linear, to predict the individual leaf area of six cultivated species of forage legumes, based on digital images analyzed with the R package LeafArea.
In these 100 leaves, we measured, with a millimeter ruler, the length and the width of the central leaflet, as illustrated in Figure 1. The central leaflets were scanned using an HP Ink Advantage 1516® multifunctional digital printer, generating A4 size images (210 × 297 mm). Then, the images were processed and analyzed with the package LeafArea version 0.1.1 for leaf area estimation through the function run.ij(), which automatically runs ImageJ. Images are initially segmented to separate the background from the leaf blade. The proportion of pixels in each part of the segmented image is then computed and used to calculate leaf area. To calibrate the function, a 5 × 5 cm leaf cutout was scanned and processed, setting the arguments "distance.pixel = 395.02" and "known.distance = 5.0". To prevent dust from affecting the image analysis, the lower size limit for an object to be included in the leaf area calculation was set to 4.5 (cm², in this case) through the argument "low.size".
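The calibration described above amounts to a pixel-to-area conversion, which can be sketched as follows. This is only an illustration in Python of the arithmetic implied by the quoted values (395.02 pixels per 5.0 cm, lower size limit of 4.5 cm²); the function names `pixels_to_area` and `leaf_area` are hypothetical helpers, not part of LeafArea or ImageJ, and the inclusive size threshold is an assumption.

```python
# Illustrative pixel-to-area conversion using the calibration values quoted in the text.
# The helper names below are hypothetical; they are not LeafArea or ImageJ functions.

DISTANCE_PIXEL = 395.02  # number of pixels spanning the reference length
KNOWN_DISTANCE = 5.0     # reference length, in cm
LOW_SIZE = 4.5           # smallest object area (cm^2) kept in the calculation

CM_PER_PIXEL = KNOWN_DISTANCE / DISTANCE_PIXEL  # side of one pixel, in cm
AREA_PER_PIXEL = CM_PER_PIXEL ** 2              # area of one pixel, in cm^2


def pixels_to_area(n_pixels):
    """Convert a count of leaf pixels into an area in cm^2."""
    return n_pixels * AREA_PER_PIXEL


def leaf_area(object_pixel_counts, low_size=LOW_SIZE):
    """Sum the areas of segmented objects, discarding those below low_size.

    Assumption: objects whose area equals low_size are kept.
    """
    areas = [pixels_to_area(n) for n in object_pixel_counts]
    return sum(a for a in areas if a >= low_size)


if __name__ == "__main__":
    # A 5 x 5 cm cutout covers about 395.02^2 pixels; a 100-pixel speck is treated as dust.
    print(round(leaf_area([395.02 ** 2, 100]), 2))
```

With these values one pixel corresponds to roughly 0.00016 cm², so a dust speck of a few hundred pixels falls far below the 4.5 cm² limit and is excluded.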
An exploratory analysis of the length, width, and leaf area data from the 100 leaflets of each species was carried out. The complete second-degree polynomial model was also subjected to stepwise selection of regressors, aiming to obtain more parsimonious submodels.
We highlight that all models were fitted without an intercept, so that for null values of length and width the leaf area would also be zero. The choice of model for each species was based on the goodness-of-fit criteria presented in Table 2. After choosing the model, the remaining 30% of the data (30 leaves of each species) were used to validate it by computing Pearson's correlation coefficient between observed and predicted values. Shepard diagrams were built to visualize this relationship.
All statistical analyses were performed with the software R (R Core Team, 2016).
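The fit-and-validate workflow described above (no-intercept fit on 70 leaves, validation on the remaining 30) can be sketched as below. This is a minimal Python illustration, not the authors' R code: the data are synthetic, the single-regressor model LA = b(L × W) is a simplified stand-in for the models in Table 1, and all function names are hypothetical.

```python
import math
import random


def fit_through_origin(x, y):
    """Least-squares slope b for the no-intercept model y = b * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(o - p) / o for o, p in zip(observed, predicted)) / len(observed)


def pearson(a, b):
    """Pearson's correlation coefficient between two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic leaflets (assumption: these numbers are invented for illustration only).
rng = random.Random(42)
leaves = []
for _ in range(100):
    length = rng.uniform(5.0, 15.0)                        # cm
    width = 0.6 * length * rng.uniform(0.9, 1.1)           # cm
    area = 0.7 * length * width * rng.uniform(0.95, 1.05)  # cm^2, with 5% noise
    leaves.append((length, width, area))

# Random 70/30 split: 70 leaves to fit the model, 30 to validate it.
rng.shuffle(leaves)
train, valid = leaves[:70], leaves[70:]

# Fit LA = b * (L * W) without an intercept, so LA = 0 when L = W = 0.
b = fit_through_origin([l * w for l, w, _ in train], [a for _, _, a in train])

# Validate on the held-out leaves.
observed = [a for _, _, a in valid]
predicted = [b * l * w for l, w, _ in valid]
print("slope:", round(b, 4))
print("validation MAPE (%):", round(mape(observed, predicted), 2))
print("validation Pearson r:", round(pearson(observed, predicted), 4))
```

Plotting `observed` against `predicted` would give the kind of diagram used for validation in Figure 2.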

RESULTS AND DISCUSSION
The species show great diversity in leaf shape and dimensions. C. ensiformis had the largest leaves, followed by M. cinereum, M. aterrima, D. lablab, C. juncea, and C. cajan (Table 3). In each of the six species, the variability in leaf area ranged from 20 to 35%.
The fitted equations that relate leaf area to length and width, as well as the goodness-of-fit criteria, are presented in Table 4. In bold, we highlight the chosen model, whose selection also took model complexity into account. For example, for C. juncea the reduced linear model showed practically no reduction in R² or in the adjusted coefficient of determination (R²aj), and no increase in mean absolute percentage error (MAPE) or AIC.
Different models were required to predict leaf area depending on the species. Indeed, it is known (Maldaner et al., 2009; Monteiro et al., 2005; Queiroga et al., 2003) that the shape, age, and size of the leaves determine the type of model used for leaf area prediction.

Table 2: Goodness-of-fit criteria: coefficient of determination (R²), adjusted coefficient of determination (R²aj), mean absolute percentage error (MAPE), and Akaike's information criterion (AIC)

It was not possible to obtain a reduced linear model for C. cajan and the two Mucuna species. A power model was chosen for C. cajan. Due to marked differences in R² and R²aj, we chose the complete linear model for M. cinereum and M. aterrima.
Except for D. lablab, the product (L × W) was necessary in all prediction equations (Table 4). The fitted model for D. lablab was a reduced polynomial, with only the quadratic effect of leaf length, LA = 0.780226L². This is plausible, given the more rounded shape of the leaflet (Figure 1). Note that if we take the simple ratio between the mean length and width of each species in Table 3, the ratio for D. lablab is the closest to 1 (11.05/12.12 = 0.91), indicating greater circularity. Thus, it would be unnecessary to have both variables (length and width) in the same model. In addition, the area of a circle with diameter L gives a coefficient close to that of the fitted equation, that is,

$LA = \pi \left( \frac{L}{2} \right)^2 = \frac{\pi}{4} L^2 \approx 0.785\,L^2$

Lucena et al. (2011) found that linear models are optimal for estimating leaf area of acerola. Sachet et al. (2015) observed that the best method to estimate leaf area of peach is the linear model, replacing the destructive analysis. For M. cinereum (Cargnelutti Filho et al., 2012) and C. ensiformis (Toebe et al., 2012), linear models using leaf width and length were well fitted, presenting R² of 0.992 and 0.978, respectively, corroborating the results found here.
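The circle-area argument made above for D. lablab can be checked numerically. The short Python sketch below only reproduces arithmetic on values quoted in the text (the fitted coefficient 0.780226 and the means 11.05 and 12.12); the variable names are illustrative.

```python
import math

# Values quoted in the text for D. lablab.
fitted_coefficient = 0.780226  # from LA = 0.780226 L^2
mean_length = 11.05            # numerator of the ratio quoted in the text
mean_width = 12.12             # denominator of the ratio quoted in the text

# A circle of diameter L has area (pi / 4) * L^2.
circle_coefficient = math.pi / 4

# Relative difference between the fitted and the circle coefficient, in percent.
relative_difference = 100 * abs(circle_coefficient - fitted_coefficient) / circle_coefficient

# Length-to-width ratio; values near 1 suggest a rounder leaflet.
ratio = mean_length / mean_width

print(round(circle_coefficient, 4), round(relative_difference, 2), round(ratio, 2))
```

The fitted coefficient differs from π/4 by less than 1%, which is consistent with treating the leaflet as approximately circular.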
The linear effect of width is generally higher than that of leaf length (Table 4); the exception is M. cinereum, for which length had a slightly higher effect. Toebe et al. (2012) observed that the width of the central leaflet of C. ensiformis affects leaf area more than the leaf length. Cargnelutti Filho et al. (2012) reported that the leaf area of M. cinereum is more affected by width.
Figure 2 shows Shepard diagrams of the observed values of leaf area and the values predicted by the chosen model, using the data of 30 leaves randomly chosen from each species to validate the fitted model. The correlation coefficients were all higher than 0.93, indicating high reliability of the models.

CONCLUSIONS
For the six species, the complete second-degree polynomial model, or derived submodels, can be used to predict leaf area as a function of the length and width of the central leaflet, presenting high goodness-of-fit, with R² above 0.98 and mean absolute percentage error below 9%. In these models, the effect of leaf width is generally greater than that of leaf length.
The four goodness-of-fit criteria (coefficient of determination, adjusted coefficient of determination, mean absolute percentage error, and Akaike's information criterion) were generally concordant with each other. Nevertheless, Akaike's information criterion was more sensitive to the number of parameters than the adjusted coefficient of determination.
The R package LeafArea proved to be a very efficient application for the estimation of leaf area by running the software ImageJ, with high precision and easy calibration.

SSR - residual sum of squares; SST - total sum of squares; p - number of model parameters; n - number of observations (70, in this case); LA_i - i-th value of leaf area;
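
With the symbols defined above, the four goodness-of-fit criteria are commonly written as shown below. These are textbook forms given only as a sketch: the exact expressions used by the authors are not reproduced here and may differ (for example, for no-intercept models SST is often taken as the uncentered sum of squares and the adjustment factor as n/(n - p), and software implementations of AIC include additive constants). The symbol $\widehat{LA}_i$, the i-th predicted leaf area, is notation assumed for this illustration.

```latex
\begin{align}
R^2 &= 1 - \frac{SSR}{SST} \\
R^2_{aj} &= 1 - \frac{n - 1}{n - p}\,\bigl(1 - R^2\bigr) \\
MAPE &= \frac{100}{n} \sum_{i=1}^{n} \frac{\bigl| LA_i - \widehat{LA}_i \bigr|}{LA_i} \\
AIC &= n \ln\!\left(\frac{SSR}{n}\right) + 2p
\end{align}
```

Higher R² and R²aj and lower MAPE and AIC indicate a better fit; both R²aj and AIC penalize additional parameters through p.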

Table 1: Models to predict leaf area (LA) as a function of the length (L) and width (W) of the central leaflet

Minimum, mean, and maximum values were calculated. Then, 70 leaves of each species were used to fit the regression models described in Table 1, to predict the leaf area as a function of leaflet length and width.

Table 3: Minimum, maximum, and mean (n = 100) values of leaflet length, width, and area

Table 4: Fitted models to predict leaf area (LA) as a function of leaflet length (L) and width (W)
MAPE - mean absolute percentage error; AIC - Akaike's information criterion; R² - coefficient of determination; R²aj - adjusted coefficient of determination.