Leaf area estimation with nondestructive method in cassava

The objective of this study was to develop a single mathematical equation capable of estimating the leaf area of different cassava cultivars from one linear dimension without destroying any plant tissue. Two hundred leaves per cultivar from ten cultivars were used to calibrate the model, and more than one hundred leaves per cultivar were used as independent data to test its predictive capacity. All equations resulted from the nonlinear relationship between leaf area and the length of the central lobe. To validate it as a “general” equation, another set of five cultivars was used. A “specific” equation for each cultivar was also calibrated for comparison with the performance of the “general” equation. Cultivar Vassourinha has a remarkably different leaf morphology from the other nine cultivars, which makes the trend line of the “general” equation deviate and lowers its coefficient of determination. Therefore, one more equation was generated excluding that cultivar; as a result, it was not possible to estimate the leaf area of all cultivars using only the “general” equation. The “general without Vassourinha” equation is highly accurate when estimating the leaf area of the other nine cultivars and of the five additional cultivars included to validate the general equation, all of which have similar leaf morphology. Due to the importance of cultivar Vassourinha, its “specific” equation should be used when estimating leaf area for this cultivar or for other cultivars with similar leaf morphology.


INTRODUCTION
Cassava (Manihot esculenta Crantz) is considered by the Food and Agriculture Organization (FAO) the queen of foods and one of the main foods of the 21st century, due to the rapid growth of its production in recent years, its importance in feeding more than 800 million low-income people and its high capacity to adapt to climate change (Howeler et al. 2013). It is particularly important in tropical regions, where it is the third most important source of food, surpassed only by rice and corn (FAO 2014). The global population is estimated to reach approximately 9 billion by 2050, with the increase concentrated mainly in African countries. Consequently, the number of people who depend on cassava as their staple food is expected to increase proportionally (Fermont et al. 2009; FAO 2018).
Currently, cassava roots (in natura) are the best-known and most popular way of consumption worldwide, although the flour and chip industry, as well as their consumption, are growing rapidly, especially in Asian countries (FAO 2018). In Brazil, the amount of starch consumed as flour has been consolidated in the sectors of pasta, cookies, bread making, wholesalers and, recently, in the tapioca industry. Everything indicates that there have been positive changes in the dynamics of cassava

MATERIAL AND METHODS
Field experiments with cassava were carried out in the 2018/19 growing season in Santa Maria, RS, Brazil (latitude: 29°43'S, longitude: 53°43'W and altitude: 95 m). Ten cultivars were used (Vassourinha, Aceguá, Frita, Pioneira, Fepagro RS13, BRS 396, BRS 399, IAC 576, Preta e Branca and Gema de Ovo), with 30 plants per row, spaced 0.8 m between plants and 1 m between rows. Planting was carried out on October 15, 2018, with one horizontal stake per hole and approximately 3-5 buds per stake.
Fertilization was performed according to the results of a chemical analysis of the soil, following the technical recommendations for the crop in the Manual de Adubação e Calagem (CQFS RS/SC and NRS-SBCS 2016). The cultivars were selected for several reasons, such as their social impact, their importance in different markets (table, silage or dual-purpose cultivars), their level of breeding (cultivars biofortified with vitamins and minerals) and the great variability among them.
Leaves were harvested at random between 160 and 180 days after planting, when all cultivars had leaves of different sizes and shapes (fully expanded or not). In this period vegetative growth is most active and the canopy is capable of intercepting much of the light (Alves 2002). Sampling sought to include leaves in initial growth, leaves in intermediate growth and fully expanded leaves, and yielded 300 to 400 leaves per cultivar. Using 200 leaves per cultivar to generate the equations was considered sufficiently representative, since this sample size is larger than those reported in the literature for cassava (Burgos et al. 2010; Guimarães et al. 2019); the remaining leaves were used to validate the equations. Leaf length was measured with a ruler from the base of the central lobe to its apex, and the same leaves were then scanned to measure their individual leaf area (LA). Previous studies (Cargnelutti Filho et al. 2015; Carvalho et al. 2017) show that the coefficient of determination (R²), i.e., the proportion of the total variance of the variable explained by the regression, between leaf length alone and leaf area is sufficiently strong (over 0.85 in all power equations) without the need to measure leaf width or any other dimension.
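The division of each cultivar's sample into calibration and validation leaves can be sketched as follows. The text does not say how the 200 calibration leaves were chosen, so the seeded random split, the function name `split_leaves` and the leaf identifiers are assumptions made only for illustration.

```python
import random


def split_leaves(leaves, n_calibration=200, seed=0):
    """Shuffle leaf records and split them into calibration and validation sets.

    Assumption: a random split; the study only states that 200 leaves per
    cultivar generated the equations and the remaining leaves validated them.
    """
    if len(leaves) <= n_calibration:
        raise ValueError("need more leaves than the calibration size")
    shuffled = list(leaves)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_calibration], shuffled[n_calibration:]


# Hypothetical identifiers for 350 leaves of one cultivar (not real data).
leaves = list(range(350))
calibration, validation = split_leaves(leaves)
print(len(calibration), len(validation))  # 200 150
```

Fixing the seed keeps the split reproducible, so the same leaves stay out of the calibration set every time the analysis is rerun.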
The leaves were digitally scanned with an HP LaserJet M1132 MFP scanner at 300 dpi, and the images were processed with the QUANT v.1.0.2 software to measure the area of each leaf. Ten different "specific" power equations were developed using the measured length and LA of 200 leaves from each cultivar. The same 200 leaves per cultivar were then pooled to calibrate a single "general" power equation, $y = a \cdot x^{b}$, where y is the leaf area (cm²), a and b are coefficients obtained by nonlinear regression in Excel, and x is the length of the central lobe (cm) (Tironi et al. 2015). Figure 1 shows that cultivar Vassourinha has leaves with long, linear lobes, whereas the other two cultivars shown (which represent the other nine cultivars because of the great morphological similarity among them) have leaves with short elliptical and spear-shaped lobes. These two remarkably different leaf shapes were expected to follow different trends when fitting the equation, which is why another equation was generated that gathers all cultivars with similar leaf shape and excludes cultivar Vassourinha. This new equation was called "general without Vassourinha".
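A minimal sketch of how the coefficients a and b of a power equation can be obtained is given below. It assumes an ordinary least-squares fit on the log-transformed data, which is how spreadsheet power trendlines are commonly computed; the text does not state whether this or an iterative nonlinear solver was used. The function name `fit_power_equation` and the sample lengths are hypothetical, and the areas are synthetic values generated from published coefficients rather than measurements.

```python
import math


def fit_power_equation(lengths_cm, areas_cm2):
    """Fit LA = a * x**b by least squares on ln(LA) versus ln(x).

    Returns (a, b, r_squared), where r_squared refers to the log-log line.
    """
    log_x = [math.log(x) for x in lengths_cm]
    log_y = [math.log(y) for y in areas_cm2]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    syy = sum((ly - mean_y) ** 2 for ly in log_y)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    b = sxy / sxx                        # slope of the log-log line
    a = math.exp(mean_y - b * mean_x)    # back-transformed intercept
    r_squared = (sxy * sxy) / (sxx * syy)
    return a, b, r_squared


# Synthetic, noise-free check: areas generated with a = 0.3838, b = 2.1896.
lengths = [4.0, 8.0, 12.0, 16.0, 20.0, 24.0]
areas = [0.3838 * x ** 2.1896 for x in lengths]
a, b, r2 = fit_power_equation(lengths, areas)
print(round(a, 4), round(b, 4))
```

With real measurements, a fit on the log scale minimizes relative rather than absolute errors, so its coefficients can differ slightly from those of a direct nonlinear fit.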
A database of five other cultivars measured in 2013 and 2014 was used to validate both the "general" and the "general without Vassourinha" equations. These independent validation data were collected in commercial fields in the municipality of Vera Cruz (latitude: 29°71'S, longitude: 52°51'W, and altitude: 68 m). The cultivars used were 'Paraguaia', 'São José', 'Fepagro RS14', 'Estrangeira' and 'Fécula Branca', planted in the second half of October 2013. Fifty leaves were collected from each cultivar at 160 days after planting, when the plants have developing leaves and expanded leaves and are emitting new ones. All five cultivars have leaves with short elliptical and spear-shaped lobes. Leaves of different sizes were harvested at random, collecting expanded leaves at the base of the plant, developing leaves in the middle part and recently emitted leaves at the apex. Leaf length was used to estimate the LA of the 50 leaves per cultivar, and the results were compared with the measured LA. This validation is essential to determine whether the general equations presented in this study can be used for cultivars that were not used to calibrate them.
Independent data were also used to statistically validate the performance of the equations: the LA of 100 to 200 leaves per cultivar, none of which had been used to generate the equations, was estimated by substituting their central lobe length into the generated equations. The results were evaluated with the mean absolute error (MAE), the root mean square error (RMSE), the normalized root mean square error (NRMSE), the BIAS index, the c index, the r coefficient and the d1 index. The MAE (Eq. 1) and the RMSE (Eq. 2) express the magnitude of the error produced by the model, so the closer to zero these statistics are, the better the model (Janssen and Heuberger 1995).
The MAE is less sensitive to extreme values than the RMSE, since it does not square the difference between estimated and observed values. The NRMSE (Eq. 3), also called the normalized root mean square deviation, is the RMSE expressed as a percentage; lower values indicate less residual variation. The BIAS index (Eq. 4) expresses the average deviation of the estimated values from the observed values, thus indicating the tendency of the model to overestimate or underestimate LA; the closer to zero it is, the smaller the systematic error of the model (Leite and Andrade 2002).
The confidence or performance index (c) (Eq. 5), proposed by Camargo and Sentelhas (1997), indicates the performance of the method; the closer to 1, the better the model performs. The correlation coefficient (r) (Eq. 6) indicates the degree of dispersion and association between the simulated and the observed data; the closer to 1, the more strongly correlated they are.
The d1 or dw index (Eq. 7) measures how free of error the model is; the closer to 1, the lower the error of the estimate (Willmott et al. 1985). The values of the d1 index range from 0, for no agreement, to 1, for perfect agreement. In Eqs. 1-4, 6 and 7, Si represents the estimated LA values (cm²/leaf), d is an estimation of error (Willmott 1981), ȳ the average of the LA values estimated by the equations, Oi the observed LA values (cm²/leaf), Ō the mean of the observed LA values and N the number of observations.
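The statistics described above can be sketched in a few lines. Because the equations themselves are not reproduced in this text, the forms below are the commonly used definitions and should be read as assumptions: NRMSE as RMSE divided by the observed mean, BIAS as the difference between the summed estimated and observed values divided by the summed observed values, c as the product of r and Willmott's d, and d1 as the absolute-value version of d. The function name `validation_statistics` and the four leaf areas are hypothetical.

```python
import math


def validation_statistics(simulated, observed):
    """Return agreement statistics between simulated and observed LA (cm²)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    errors = [s - o for s, o in zip(simulated, observed)]
    abs_error_sum = sum(abs(e) for e in errors)
    sq_error_sum = sum(e * e for e in errors)

    mae = abs_error_sum / n
    rmse = math.sqrt(sq_error_sum / n)
    nrmse = 100.0 * rmse / mean_o  # assumed: percent of the observed mean
    bias = (sum(simulated) - sum(observed)) / sum(observed)  # assumed form

    # Pearson correlation coefficient r.
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(simulated, observed))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    r = cov / math.sqrt(var_s * var_o)

    # Willmott agreement indices: d uses squared terms, d1 absolute terms.
    spread = [abs(s - mean_o) + abs(o - mean_o)
              for s, o in zip(simulated, observed)]
    d = 1.0 - sq_error_sum / sum(t * t for t in spread)
    d1 = 1.0 - abs_error_sum / sum(spread)
    c = r * d  # assumed: confidence index as the product of r and d

    return {"MAE": mae, "RMSE": rmse, "NRMSE": nrmse, "BIAS": bias,
            "r": r, "d": d, "d1": d1, "c": c}


# Hypothetical leaf areas in cm², for illustration only.
observed = [100.0, 150.0, 200.0, 250.0]
simulated = [90.0, 160.0, 190.0, 240.0]
stats = validation_statistics(simulated, observed)
print(stats["MAE"], stats["RMSE"])  # 10.0 10.0
```

In this toy example the BIAS is negative, which under the assumed definition indicates a tendency to underestimate LA.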

RESULTS AND DISCUSSION
The "general" equation, $LA = 0.5269\,x^{2.0411}$, has a high coefficient of determination (R² = 0.8632). However, its plot (Fig. 2a) shows a group of data points that does not follow the trend of the larger group, which lowers the coefficient of determination. In the plot of the "general without Vassourinha" equation, $LA = 0.3838\,x^{2.1896}$ (Fig. 2b), the group of points that deviated from the trend line is no longer present and the coefficient of determination is higher than that of the "general" equation (R² = 0.938), which indicates that those points were in fact the ones from cultivar Vassourinha. This difference between the two coefficients of determination could be enough to decide which equation to choose when estimating LA in cultivars with short elliptical and spear-shaped leaves. Figure 3 shows the equations resulting from the nonlinear relationship between the two measurements; the best fit for estimating LA was obtained with the power relationship between the length of the central lobe and its measured LA. Despite being similar to each other, the cultivars have slightly different leaf shapes, resulting in a group of equations with high coefficients of determination, among which the "specific" equation of cultivar BRS 396 has the lowest (R² = 0.9425; Fig. 3f) and that of cultivar Preta e Branca the highest (R² = 0.9798; Fig. 3i). The studies by Guimarães et al. (2019) and Alves (2002) agree with the results found here. For example, the cultivar Gema de Ovo, evaluated both in this study and by Guimarães et al. (2019), showed the same trend, with a better fit using power models and similar LA values in both studies.
Table 1 shows statistically that these "specific" equations are the best way to estimate LA for each cultivar, but it would not be practical to use a large number of equations that each work precisely for just one cultivar.
When "specific" equations are used, the points lie closer to the 1:1 line than when general equations are used, and this holds for all cultivars (Fig. 4). LA calculated with the general equations tends to be slightly underestimated; the only two cases in which these equations overestimate LA are the cultivars Vassourinha and Preta e Branca. However, the overestimation is excessive (more than twice the real size) only for cultivar Vassourinha. Some intervals without LA values for cultivars Preta e Branca (Fig. 4m) and Fepagro RS14 (Fig. 5e) are explained by their morphology in the evaluation period (160-180 days after planting). In 'Preta e Branca', no leaves with lobe length between 14.4 and 16.6 cm were found, resulting in a small gap in the scatter of points (Fig. 4m). In 'Fepagro RS14', the small gaps in the scatter (Fig. 5e) are due to the low number of leaves found with lobe length between 9.6 and 15.0 cm.
It is worth noting that the absence of these values did not harm the analysis of the LA of these cultivars. Following the methodology described, the leaves were collected at random, seeking to sample leaves in initial growth, in intermediate growth and fully expanded, and the same methodology was applied to all other cultivars in this study.
Comparing the statistics of each equation's performance for each cultivar (Table 1) confirms that the best equations to estimate LA are the "specific" ones; however, the "general without Vassourinha" equation shows a good overall performance. The "specific" equation of each cultivar has MAE ranging from 0.24 to 11.75 and RMSE from 18.29 to 41.44, results similar to those obtained for snap bean (Phaseolus vulgaris L.), which ranged from 12.56 to 39.94 (Toebe et al. 2012). The "general" equation has MAE ranging from 15.73 to 55.18 and RMSE from 28.11 to 77.36. The "general without Vassourinha" equation has MAE ranging from 3.30 to 50.94 and RMSE from 20.00 to 67.97 (the results obtained for 'Vassourinha' with the general equations are not considered).
Figure 5. LA estimated with the "general" equation (a, c, e, g, i) and the "general without Vassourinha" equation (b, d, f, h, j) versus measured LA of leaf, for the cultivars Paraguaia (a, b), São José (c, d), Fepagro RS14 (e, f), Estrangeira (g, h) and Fécula Branca (i, j).
Bragantia, Campinas, v. 79, n. 4, p. 347-359, 2020
When analyzing the NRMSE results for the "general" equation, lower precision was observed for the cultivars Aceguá (Fig. 4e), Paraguaia (Fig. 5a) and Fécula Branca (Fig. 5i). The "general without Vassourinha" equation was less precise for the cultivars Preta e Branca (Fig. 4o) and Fécula Branca (Fig. 5j). However, the other statistics discussed in this study confirm the good performance of these equations. These results indicate that the "specific" equations of each cultivar have a high predictive capacity; the same is true of the "general without Vassourinha" equation, although its predictions are less accurate.
The other statistics presented (BIAS, c, r and d1) confirm the high predictive power of the "specific" and the "general without Vassourinha" equations (Table 1). The "specific" equations had the smallest error values, followed by the "general without Vassourinha" equation, and the largest error values were obtained with the "general" equation. This holds for all cultivars except 'Preta e Branca', which showed higher errors in all statistics with the "general without Vassourinha" equation (Table 1), as visually confirmed by the 1:1 plots (Fig. 4n and 4o), but the differences were not significant.
Using a database from other experiments with cassava cultivars, it was possible to compare measured and estimated LA in five additional cultivars (Fig. 5). Visual analysis of the 1:1 plots shows that these five cultivars behaved similarly to the other cultivars, excluding 'Vassourinha'. The "general without Vassourinha" equation is statistically more accurate than the "general" equation when estimating LA for these five new cultivars (Table 2). It is also important to note that, with the exception of 'Estrangeira', both equations tend to slightly underestimate LA (Fig. 5). It is thought that this happens because of the similarity in leaf shape among all fourteen of these cassava cultivars. This evidence, combined with the NRMSE values (Fig. 5), was enough to confirm that it is possible to estimate LA in cultivars other than those used to create the general equations. This capacity is important because in many cases it is not certain which cassava cultivar is being grown in a field, which makes the general equations worth considering when choosing an equation to estimate LA. Estimating LA from linear leaf dimensions is advantageous because the process does not destroy the plant to obtain data, even though it requires manpower to perform the measurements. This makes it possible to evaluate the plant throughout its productive cycle. In other studies, the best results were obtained using the product of leaf dimensions (width and length) (Toebe et al. 2012; Richter et al. 2014; Cargnelutti Filho et al. 2015). Despite this, only one linear dimension was used here, which considerably reduces the problems and systematic errors that measuring more than one dimension could introduce.
Although the estimates made with the "specific" equations have the best statistics, the objective of this study was to generate a single general equation, so the "general without Vassourinha" equation is the one chosen to estimate LA in future studies of cassava cultivars with elliptical spear-shaped lobes and lobe length up to 29.5 cm (Fig. 1a and 1c). Using the "general" and "general without Vassourinha" equations to estimate LA in cultivars that have leaves with straight or narrow lobes and expanded leaf length greater than 29.5 cm, such as 'Vassourinha' (Fig. 1b), is not prohibited but is not recommended, because the results could be extremely inaccurate. Given the social, economic and cultural importance of cultivar Vassourinha in South Brazil, its LA estimation should be as accurate as possible, which justifies the use of its "specific" equation, $LA = 0.1475\,x^{2.2075}$, in future studies.
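As a rough check of the overestimation reported for 'Vassourinha', the ratio between the "general" equation (coefficients 0.5269 and 2.0411) and the 'Vassourinha' "specific" equation (coefficients 0.1475 and 2.2075) can be worked out for a given central lobe length x. This treats the "specific" prediction as a stand-in for the measured area, which is an assumption and not the comparison performed in the study; the two lobe lengths are chosen only for illustration.

```latex
\begin{align*}
\frac{LA_{\text{general}}}{LA_{\text{specific}}}
  &= \frac{0.5269\,x^{2.0411}}{0.1475\,x^{2.2075}}
   \approx 3.572\,x^{-0.1664} \\
x = 15\ \text{cm}:   &\quad 3.572 \times 15^{-0.1664}   \approx 2.28 \\
x = 29.5\ \text{cm}: &\quad 3.572 \times 29.5^{-0.1664} \approx 2.03
\end{align*}
```

Under this assumption the ratio stays above 2 across these lobe lengths, in line with the overestimation of more than two times noted for this cultivar.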

CONCLUSION
This study reaffirms that the nondestructive method can be used as a tool with good precision to estimate leaf area in cassava cultivars, with practicality, low cost and nondestruction of plant tissue as its main advantages. Given the wide range of lobe lengths used to generate the equations in this study, these equations are recommended for use across different regions, cultivation conditions, plant development and growth stages and planting periods.
It was not possible to generate a single equation that could be used for all the cassava cultivars studied, but the "general without Vassourinha" equation, $LA = 0.3838\,x^{2.1896}$, should be used to estimate leaf area in the cultivars Aceguá, Frita, Pioneira, Fepagro RS13, BRS 396, BRS 399, IAC 576, Preta e Branca, Gema de Ovo, Paraguaia, São José, Fepagro RS14, Estrangeira, Fécula Branca and other cultivars with elliptical spear-shaped lobes with length up to 29.5 cm.
When estimating the LA of 'Vassourinha', its "specific" equation, $LA = 0.1475\,x^{2.2075}$, should be used. This equation could also be indicated for plants that have leaves with straight or narrow lobes and expanded leaf length greater than 29.5 cm. However, extending the use of this "specific" equation to other cassava cultivars with morphological characteristics similar to 'Vassourinha' would require a new study of more cultivars with this specific leaf morphology.
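The recommendations above can be turned into a small helper that picks an equation from the lobe shape. This is only a sketch: the function name `estimate_leaf_area`, the shape labels "elliptical" and "linear", and the decision to warn above 29.5 cm are assumptions for illustration; only the coefficients and the 29.5 cm limit come from the text.

```python
import warnings

# Coefficients (a, b) of LA = a * x**b, with x the central lobe length in cm.
EQUATIONS = {
    "elliptical": (0.3838, 2.1896),  # "general without Vassourinha"
    "linear": (0.1475, 2.2075),      # 'Vassourinha' "specific"
}
MAX_GENERAL_LENGTH_CM = 29.5  # upper lobe length recommended for the general fit


def estimate_leaf_area(lobe_length_cm, lobe_shape="elliptical"):
    """Estimate leaf area (cm²) from central lobe length (cm).

    The shape labels are hypothetical: "elliptical" stands for short
    elliptical, spear-shaped lobes and "linear" for long, narrow lobes.
    """
    if lobe_length_cm <= 0:
        raise ValueError("lobe length must be positive")
    a, b = EQUATIONS[lobe_shape]
    if lobe_shape == "elliptical" and lobe_length_cm > MAX_GENERAL_LENGTH_CM:
        warnings.warn("lobe length exceeds the 29.5 cm recommended for the "
                      "general equation; the estimate may be unreliable")
    return a * lobe_length_cm ** b


# Same lobe length, two leaf morphologies.
print(round(estimate_leaf_area(10.0), 1))
print(round(estimate_leaf_area(10.0, "linear"), 1))
```

Keeping the coefficients in one table makes it straightforward to add further "specific" equations if more cultivars with narrow lobes are calibrated later.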