Allometric models to estimate peanut leaflet area by a non-destructive method

ABSTRACT: The determination of leaf area is fundamental for studies related to plant growth and physiology. Non-destructive methods allow an accurate estimate of the leaf area through linear dimensions of the leaves. The research objective was to construct allometric equations to estimate the leaflet area of peanut cultivars. A total of 2,605 leaflets were collected from six peanut cultivars (IAC Caiapó, IAC 8112, Runner IAC 886, BRS Havana, BRS 151 L7, and IAC Tatuí), with more than 400 leaflets sampled for each cultivar. We measured the length, width, product between length and width, and leaflet area. Linear and non-linear models (linear, linear without intercept, power, and exponential) were built, and the best equation was chosen using the following statistical criteria: highest coefficient of determination (R²), Pearson's linear correlation coefficient (r), and Willmott's agreement index (d), and lowest Akaike information criterion (AIC) and root mean square error (RMSE). The models that used the product between length and width were the most suitable for estimating the leaflet area of peanut cultivars. Given the little intraspecific morphological variability, it was possible to group the cultivars, and the model $\hat{y} = 0.875 \times LW^{0.929}$ was indicated to estimate the peanut leaflet area accurately, regardless of the cultivar.


INTRODUCTION
The nutritional importance and productive potential of peanut (Arachis hypogaea L.) classify this crop as a multipurpose oilseed, offering several medicinal and economic benefits (Akram et al. 2018). The chemical and nutritional composition of peanuts includes vitamins, folic acid, thiamine, and tocopherols (Akram et al. 2018). In addition, its seeds are sources of fatty acids, with great importance for the composition of its oil (Toomer 2017). There is a great diversity of peanut genotypes, which differ in oil quality and morphological structure (Krishna et al. 2015).
Because of the relevance of the peanut crop, studies related to growth and development are greatly needed to estimate the productive potential of its different genotypes (Akram et al. 2018). Among the aspects required to indicate the growth status and physiological behavior of plants, leaf area is an essential parameter (Keramatlou et al. 2015).
The leaf area directly influences the vital processes of plants, which include gas exchange and the interception of irradiance (Córcoles et al. 2015). The shape and amount of intercepted light are determining factors for photosynthesis and directly affect the acquisition of essential resources for the formation of carbohydrates (Mattos et al. 2020). Therefore, the light energy captured by the leaves of a crop is critical to shaping its growth (Liu et al. 2021).
https://doi.org/10.1590/1678-4499.20220121

BASIC AREAS Article
For physiological and ecophysiological studies, the measurement of leaf area provides an estimate of the transpiration rate and net absorption and ensures an understanding of the interaction between the growth of the species and its developmental environment (Sabouri and Sajadi 2022). Carrying out studies with leaf area modeling involves processes of plant community composition, evolution, competition, and adaptation and provides clarification of fruit quality attributes (Keramatlou et al. 2015).
Methods for estimating leaf area can be direct or indirect and differ in whether plant sampling is destructive or non-destructive (Zhang 2020). Direct methods are economically unfeasible, limited by logistical factors, and prevent successive measurements of the same leaves (Hernandéz-Fernandez et al. 2021). On the other hand, a non-destructive indirect method determines leaf area through allometric relationships, which consider the proportionality of the linear dimensions of the leaves (Santos et al. 2021).
To monitor leaf development and fit a regression model that estimates leaf area, it is necessary to use digital image-processing methods, which are viable, accurate, and economical tools for crops (Sauceda-Acosta et al. 2017). Using this approach, Sabouri and Sajadi (2022) evaluated the use of image processing and regression methods to estimate the leaf area of chia (Salvia hispanica L.), quinoa (Chenopodium quinoa Willd.), and bitter melon (Momordica charantia L.).
In addition, the approach to leaf area estimation by these methods covers studies with pepper (Yeshitila and Taye 2016), cactus forage (Lucena et al. 2021), cocoa (Salazar et al. 2018), fava beans (Peksen 2007), and mango (Ghoreishi et al. 2012). However, for a critical evaluation of models to estimate the leaf area, it is necessary to use many leaves or leaflets to validate the model (Hernandéz-Fernandez et al. 2021).
For peanuts, there is a lack of studies that cover a larger number of cultivars, demonstrate morphological differences, and provide a regression model that facilitates subsequent studies with this species. We hypothesized that linear and non-linear models are reliable for estimating the leaflet area (LA) of peanut cultivars. Thus, the objective of this work was to propose an equation that accurately estimates the leaf area of peanut cultivars through linear dimensions of the leaflets using several linear and non-linear models.

MATERIAL AND METHODS
Plant material and experimental conditions
The experiment was conducted in an experimental area of the didactic garden belonging to the Agricultural Sciences Center of the Universidade Federal Rural do Semi-Árido, RN, Brazil (5°12'25.26"S, 37°19'6.42"W). The region's climate is classified as BSh, being dry and very hot, with a dry and rainy season (Alvares et al. 2013). It has an average temperature of 28 °C and annual rainfall of around 695 mm. The soil in the region is classified as Eutrophic Red-Yellow Argisol (Embrapa 2018).
For the construction of the models, peanut cultivars with an erect growth habit were planted, with sowing in September 2021 and harvesting in December 2021. Each experimental plot consisted of a cultivar sown in a 15-m planting row, with the density of 15 seeds per linear meter and spacing of 0.90 m between rows.

Plant sampling and image processing
At 85 days after planting, a total of 2,605 mature, fully expanded leaflets (n), free from damage or pests, were collected from six peanut cultivars [1: IAC Caiapó (n = 427); 2: IAC 8112 (n = 433); 3: Runner IAC 886 (n = 422); 4: BRS Havana (n = 441); 5: BRS 151 L7 (n = 465); and 6: IAC Tatuí (n = 417)] (Fig. 1), with more than 400 leaflets randomly sampled for each cultivar. We selected leaflets of different shapes and sizes to test the model's generality and seek greater variability in the data. The leaves were placed in plastic envelopes and kept in the shade immediately after collection to maintain turgidity. The leaflets were separated from the leaves and then digitized in a scanner (Epson model L395, Tokyo, Japan) at a resolution of 1,200 × 1,200 dpi, and the images were processed and analyzed individually using the ImageJ software (National Institutes of Health, United States of America). The digitized images included three rulers as reference indicators for measurements. For each leaflet, we measured the maximum length (L) (distance from the petiole insertion point to the opposite end of the midrib), the maximum width (W) (largest measure perpendicular to the midrib), and the leaflet area (LA, cm²). Then, we calculated the product between length and width (LW) (cm²).

Data analysis
To compare the foliar parameters (L, W, LW, and LA) between the cultivars, a one-way analysis of variance was performed, followed by Tukey's honestly significant difference (HSD) test at the 5% probability level (Zar 1996).
To verify the accuracy of the estimates of the regression coefficients when both L and W are used, we evaluated the degree of collinearity between these parameters. Therefore, we calculated the variance inflation factor (VIF) (Eq. 1) (Marquardt 1970) and the tolerance value (T) (Eq. 2) (Gill 1986). If the VIF is greater than 10 and T is less than 0.1, the collinearity between L and W is strong enough to affect the estimate of the leaflet area and, consequently, one of these two leaf parameters (L or W) should be disregarded when fitting regression models to predict leaflet area (Gill 1986).

$$\mathrm{VIF} = \frac{1}{1 - r^{2}} \qquad (1)$$

$$T = \frac{1}{\mathrm{VIF}} \qquad (2)$$

in which: r: Pearson's linear correlation coefficient between L and W.
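As an illustration of the collinearity check described above, the following minimal Python sketch computes r, VIF, and T for paired length and width measurements. It assumes the usual two-variable forms VIF = 1/(1 − r²) and T = 1/VIF, which agree with the ranges reported in the Results (a VIF of 8.24 corresponds to a T of about 0.12). The function name and the toy data are hypothetical, and the authors' own analysis was run in R.

```python
import math


def collinearity_diagnostics(lengths, widths):
    """Return (r, VIF, T) for paired length and width measurements.

    Assumed forms: VIF = 1 / (1 - r**2) and T = 1 / VIF, where r is
    Pearson's linear correlation coefficient between the two samples.
    """
    n = len(lengths)
    if n != len(widths) or n < 2:
        raise ValueError("need two paired samples with at least 2 values")
    mean_l = sum(lengths) / n
    mean_w = sum(widths) / n
    sxy = sum((l - mean_l) * (w - mean_w) for l, w in zip(lengths, widths))
    sxx = sum((l - mean_l) ** 2 for l in lengths)
    syy = sum((w - mean_w) ** 2 for w in widths)
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    r = sxy / math.sqrt(sxx * syy)
    if abs(r) >= 1.0:
        # Perfect collinearity: VIF is unbounded and tolerance is zero.
        return r, math.inf, 0.0
    vif = 1.0 / (1.0 - r ** 2)
    tolerance = 1.0 / vif
    return r, vif, tolerance


# Hypothetical toy data (cm), not taken from the study.
r, vif, tol = collinearity_diagnostics([2.0, 4.0, 6.0, 8.0], [1.0, 2.5, 2.5, 4.0])
# Rule used in the text: drop L or W only if VIF > 10 and T < 0.1.
drop_one_dimension = vif > 10 and tol < 0.1
```

For the toy data, r² is 0.9, so VIF is 10 and T is 0.1 (up to floating-point rounding), which sits exactly on the thresholds quoted in the text.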
To estimate leaflet area based on linear dimensions, tests were performed with 90 linear and non-linear regression models between LA (dependent variable) and length, width, and product (LW) (independent variables) of each cultivar, and later they were tested with the grouped data (all the grouped cultivars). Among the models tested, 25 presented satisfactory criteria for estimating the leaflet area. Logarithmic and polynomial models from second to fifth order were excluded. Thus, for greater data precision and speed in the analysis, 10 equations from four models were used for each cultivar and the grouped data.
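To make the power model concrete, the sketch below fits LA = β0 × LW^β1 by ordinary least squares on the log-transformed variables. This is only one possible estimation route and is an assumption: the text does not state which fitting routine was used, and a nonlinear least-squares fit on the original scale (for example, in R) can give slightly different coefficients on real data. The function name and the synthetic data are hypothetical.

```python
import math


def fit_power_model(x, y):
    """Fit y = b0 * x**b1 via least squares on ln(y) = ln(b0) + b1 * ln(x).

    Assumption: log-log linearization, which is not necessarily the
    estimation method used in the study. All values must be positive.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired samples with at least 2 values")
    if min(x) <= 0 or min(y) <= 0:
        raise ValueError("log-log fitting requires positive values")
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    b1 = sxy / sxx                        # slope on the log scale = exponent
    b0 = math.exp(mean_y - b1 * mean_x)   # back-transformed intercept
    return b0, b1


# Synthetic, noise-free data generated from the pooled equation in the text,
# used only to show that the routine recovers known coefficients.
lw = [2.0, 5.0, 10.0, 20.0, 30.0]
la = [0.875 * v ** 0.929 for v in lw]
b0, b1 = fit_power_model(lw, la)
```

On noise-free data the routine returns the generating coefficients; with measured leaflets the two estimation routes weight large and small leaflets differently, so the choice should be reported alongside the equation.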
in which: $\hat{y}_i$: estimated values of LA; $y_i$: observed values of LA; $\bar{y}$: average of observed values; $\hat{y}'_i = \hat{y}_i - \bar{y}$; $y'_i = y_i - \bar{y}$; $L(x \mid \hat{\theta})$: maximum likelihood function; p: number of model parameters; n: number of observations; $x_i$ and $y_i$: ith observations of the variables x and y; $\bar{x}$ and $\bar{y}$: means of variables x and y.
Descriptive analyses were performed to calculate maximum and minimum values, median, and coefficient of variation (CV), and data normality was verified by the Shapiro-Wilk test. The observed and estimated LA were compared using Student's t-test for paired samples (p < 0.01). We performed statistical data analysis with the software R® v.4.1.2 (R Core Team 2022).
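The selection criteria named in the text can be sketched as follows, using commonly cited definitions that are assumed here and may differ in detail from the equations used in the study. In particular, R² is computed as 1 − SSE/SST, Willmott's d follows the symbol definitions given above, and AIC uses the Gaussian least-squares form n ln(SSE/n) + 2p, which drops an additive constant relative to −2 ln L + 2p. The function name and the toy values are hypothetical.

```python
import math


def fit_criteria(observed, estimated, n_params):
    """Return R2, r, Willmott's d, AIC, and RMSE for observed vs. estimated LA.

    Assumed definitions (not confirmed by the source):
      R2   = 1 - SSE / SST
      d    = 1 - SSE / sum((|est - mean_obs| + |obs - mean_obs|)**2)
      AIC  = n * ln(SSE / n) + 2 * p   (Gaussian errors, constant dropped)
      RMSE = sqrt(SSE / n)
    """
    n = len(observed)
    if n != len(estimated) or n < 2:
        raise ValueError("need two paired samples with at least 2 values")
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    sse = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    see = sum((e - mean_est) ** 2 for e in estimated)
    sxy = sum((o - mean_obs) * (e - mean_est) for o, e in zip(observed, estimated))
    d_den = sum(
        (abs(e - mean_obs) + abs(o - mean_obs)) ** 2
        for o, e in zip(observed, estimated)
    )
    return {
        "R2": 1.0 - sse / sst,
        "r": sxy / math.sqrt(sst * see),
        "d": 1.0 - sse / d_den,
        # A perfect fit has SSE = 0, for which this AIC form is unbounded.
        "AIC": n * math.log(sse / n) + 2 * n_params if sse > 0 else -math.inf,
        "RMSE": math.sqrt(sse / n),
    }


# Hypothetical toy values (cm2), not taken from the study.
criteria = fit_criteria([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], n_params=2)
```

Under these definitions, higher R², r, and d and lower AIC and RMSE indicate a better model, which matches the ranking rule described in the text.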

RESULTS
The high number of leaflets of the cultivars used in this study (n = 2,605) provided a high data variation for constructing models to estimate the peanut LA (Fig. 2). The leaflets of the cultivars showed a length range of 0.82-8.38 cm (IAC Caiapó), 1.99-8.59 cm (IAC 8112), 0.98-8.50 cm (Runner IAC 886), 1.20-8.45 cm (BRS Havana), 1.44-8.64 cm (BRS 151 L7), and 1.78-8.58 cm (IAC Tatuí) (Fig. 2). The W ranges were 0.60-4.34, 0.68-3.86, 0.63-4.60, 0.67-3.81, 0.96-4.11, and 0.91-4.07 cm for the same cultivars (Fig. 2). Regarding the LA of the cultivars, averages of 12.2 ± 0.26, 11 ± 0.21, 10.6 ± 0.24, 11.8 ± 0.21, 11.7 ± 0.27, and 11.1 ± 0.22 cm² were observed, with coefficients of variation above 30% (Fig. 2). Regarding the comparison of L, W, LW, and LA between the cultivars, significant differences were observed in all leaf parameters: the cultivar IAC Caiapó had the greatest W, LW, and LA, while its L was the second largest, and the cultivar Runner IAC 886 had the smallest L and leaflet size compared to the other cultivars (Fig. 2). In all cultivars, the data for L and W did not follow a normal distribution, while for LW the data were normally distributed in the cultivars IAC Caiapó, IAC 8112, and Runner IAC 886. For LA, the data followed a normal distribution only for the cultivar IAC 8112 (Fig. 2).
Linear and non-linear association patterns between L, W, LW, and LA were observed in the data used to build the regression model to estimate the peanut LA (Fig. 3). We observed linear patterns between LW and LA and non-linear patterns between L and LA and between W and LA, evidencing the need to use different models for data adjustment and validation (Fig. 3). According to the preliminary analysis for model calibration, the VIF values ranged from 1.66 to 8.24, and the T values ranged from 0.12 to 0.60. Thus, for all cultivars, VIF values were < 10 and T values were > 0.10, showing that the collinearity between L and W can be considered negligible and that both parameters can be included in the regression models.

Figure 3. Matrix with histograms (diagonal) and scatter plots between length (cm), width (cm), product between length and width (cm²), and leaflet area (cm²) of 2,605 leaflets (pooled data) used for the generation of models to estimate the peanut leaflet area.
The coefficients of determination (R²) and linear correlation coefficients (r) of the regression models obtained to estimate the peanut LA are presented in Table 1. According to the criteria for choosing the best models, models #9 and #10 are not recommended for accurately estimating the LA of any of the cultivars, as they presented the lowest coefficients of determination, linear correlation coefficients, and Willmott's agreement index, and the highest AIC and RMSE, confirming the ineffectiveness of these models (Table 1). For each cultivar individually, Eq. 7, obtained with the power model using the product between leaflet length and width (LW), was the most suitable for estimating the LA with greater precision (> 97%). However, other models can also be recommended to estimate the LA of each cultivar individually, such as models #3 and #4 (Table 1).
The principal component analysis (PCA) concentrated 98.01% of the data variability in the two main axes, with 69.94% in the first and 28.07% in the second (Figs. 4a and b). PCA showed clustering among cultivars, evidencing similar phylogenetic characteristics between them. This analysis indicated slight morphological variation between the leaflets of the peanut cultivars, confirmed by the leaflet morphotypes of each cultivar (Fig. 4c). These results and the established criteria made it possible to create a generalized model covering all cultivars. The power model $\hat{y}_i = 0.875 \times LW^{0.929}$ was the most suitable to estimate the peanut LA with an accuracy of 99%, regardless of cultivar.

Table 1. Statistical models, regression coefficients (β0 and β1), coefficient of determination (R²), Pearson's linear correlation coefficient (r), Akaike information criterion (AIC), Willmott's agreement index (d), root mean square error (RMSE), and equations for estimating leaflet area of peanut cultivars as a function of linear leaflet dimensions (length and width).

The equation proposed to estimate the area of peanut leaflets fitted the data closely (R² = 0.9883), and the residuals showed homogeneous variance and low dispersion (normal distribution) (Fig. 5). The area of leaflets estimated from the equation recommended in the present study showed a positive correlation with the observed area of leaflets (determined by digital images), with a coefficient of determination (R²) of 0.9791, confirming a significant relationship between these parameters (Figs. 6a and 6b). Thus, regardless of the cultivar, the equation $\hat{y}_i = 0.875 \times LW^{0.929}$ can quickly and efficiently estimate the area of peanut leaflets from the product of leaflet length and width (LW).
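For field use, the pooled equation can be applied directly to ruler measurements. The short Python sketch below evaluates LA = 0.875 × LW^0.929 with L and W in cm and LA in cm²; the function name and the example leaflet are hypothetical, and only the two coefficients come from the text.

```python
def estimate_leaflet_area(length_cm, width_cm, b0=0.875, b1=0.929):
    """Estimate peanut leaflet area (cm2) from maximum length and width (cm).

    Uses the pooled power model reported in the text, LA = b0 * (L * W)**b1.
    The default coefficients are the published ones; everything else in this
    function is an illustrative assumption.
    """
    if length_cm <= 0 or width_cm <= 0:
        raise ValueError("length and width must be positive")
    lw = length_cm * width_cm  # product of the two linear dimensions (cm2)
    return b0 * lw ** b1


# Hypothetical leaflet: 5.0 cm long and 2.0 cm wide (LW = 10 cm2).
area = estimate_leaflet_area(5.0, 2.0)
```

For this example leaflet the estimate is about 7.4 cm². The sampled leaflets were roughly 0.8 to 8.6 cm long and 0.6 to 4.6 cm wide, so estimates for leaflets outside that range are extrapolations.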

DISCUSSION
This study describes a non-destructive method to estimate the LA of peanut cultivars, seeking to build a single model that allows estimating the LA of six cultivars. We observed that the cultivars showed little intraspecific difference in leaflet morphology, confirming the possibility of building a single model to estimate the peanut LA.
The wide variability observed in the data of LW and LA (CV > 33.9%) is fundamental for studies related to allometric modeling to estimate the leaf area of plant species (Dalmago et al. 2019, Toebe et al. 2019). This high variability allows for more representative models and precise equations for leaves of different shapes and sizes, which can be measured at different phenological stages during the plant cycle (Cargnelutti Filho et al. 2021). Although the experiment was conducted in only one area, the total number of leaflets sampled (2,605) from all parts of the plants (lower, middle, and upper thirds) is considered adequate for the construction of models that determine the area of peanut leaflets as a function of linear dimensions (L, W, and/or LW). Some studies have reported that building allometric models from a small number of samples (leaves or leaflets) can generate biased and unreliable equations for estimating the leaf area of plants (Pompelli et al. 2012).
The present study showed that the best equations to estimate the LA of peanut cultivars were those that used LW, compared to equations that used only one leaflet dimension (L or W), showing the best criteria and model fits (Guimarães et al. 2019, Macário et al. 2020, Goergen et al. 2021), except for the exponential model, in which the best equations were those that used L of the leaflets (Ribeiro et al. 2020).
Leaf area can also be estimated using only a single dimension of the leaf surface (L or W), simplifying the analyses. However, using equations with only one dimension can lead to a loss of model precision for the leaf area estimation (Pompelli et al. 2012), which was observed in the present study in the poorer fits of the single-dimension models, except for the exponential model, in which the equation using leaflet L was the most adequate to estimate LA.
Generally, linear models are the most used to estimate the leaf area of agricultural and forest species (Gomes et al. 2020, Goergen et al. 2021, Hernandéz-Fernandez et al. 2021, Mela et al. 2022). However, it is assumed that these models are used with losses in precision, which often occur due to the high values of the intercepts (β0) of the regression line (Santos et al. 2021). Therefore, in this study, we disregarded linear models #1 and #2 for all peanut cultivars. According to Zuur et al. (2010), linear models show high residual dispersion, indicating that these models are not recommended to estimate leaf area during leaf development throughout the crop cycle, whereas non-linear models are the most suitable for determining leaf area. The nonlinear Eq. 8, obtained through the power model, was the most appropriate