CLASSIFICATION OF MACAW PALM FRUITS FROM COLORIMETRIC PROPERTIES FOR DETERMINING THE HARVEST MOMENT

Macaw palm (Acrocomia aculeata) is a promising crop for biofuel production because of the high oil concentration of its fruit, but the harvest date must be better understood before the crop can be cultivated on an industrial scale. The aim of this study was to use the colorimetric properties of macaw palm fruits to develop a neural network classifier that determines the ideal moment for harvesting, based on the oil content of the fruit mesocarp. Over nine maturation stages, 900 macaw palm fruits were sampled, and the colorimetric properties of the RGB, HSI and CIELab color models were used to classify them as immature or mature. The Kappa index and the overall accuracy were used to assess classifier performance. The classifiers based on the RGB parameters and on hue were considered equivalent, with Kappa indices of 0.901 and 0.942, respectively, and indicated the 59th week of maturation as the ideal time to harvest fruits with the highest oil content.


INTRODUCTION
The Macaw palm (Acrocomia aculeata) is widespread in Brazilian territory and abundant in the Cerrado region (Conceição et al., 2015). The fruit is considered the most economically relevant part of the plant, and one of its main characteristics is its high oil concentration. The oil is used by the pharmaceutical industry for the manufacture of cosmetics, by the food sector as cooking oil and by the energy sector for the production of biofuels (Barreto et al., 2016; Cazarolli et al., 2016).
Brazilian biodiesel production in 2017 was around 4,291,276 m³. Soybean oil accounted for 64% of this total, while oil from other oil crops accounted for approximately 10% (ABIOVE, 2018). Thus, exploring crops with bioenergy potential, such as the Macaw palm, is an alternative for keeping Brazilian biodiesel production in expansion.
Despite the great potential for biodiesel production, more information on crop characteristics, such as the ideal harvest time, is necessary for the rational and sustainable exploitation of the crop (Montoya et al., 2016). In extractive systems, the beginning of the Macaw harvest is usually determined by the natural detachment of ripe fruits. Contact of the fruits with the soil causes quantitative and qualitative losses of the oil produced, limiting the industrial use of the product (Queiroz et al., 2015). Studies and new technologies for the harvest and post-harvest phases are essential to extract oil of better quality and in larger quantity.
The automated classification of agricultural products from digital images has received special attention because of the growing demand for high-quality products in a short period (Zhang et al., 2014). Color is widely used in the selection and harvesting of fruits because it is a direct indicator of quality and maturation stage and is related to physical and chemical attributes (Choong et al., 2006; Pu et al., 2015). Colorimetric characteristics such as hue (Tan et al., 2010; Wisutiamonku et al., 2015) and red, green and blue intensity (Mohammadi et al., 2015; Gomes Júnior et al., 2017) have given important results in identifying the ripening stages of various fruits.
Engenharia Agrícola, Jaboticabal, v.38, n.4, p.634-641, jul./ago. 2018
In palm trees, digital images are used to evaluate the maturation stage, define the harvest point and estimate the oil content in bunches and fruits, aiming at the development of automated systems for harvesting and post-harvesting (Makky et al., 2014; Dinah et al., 2015). The optical properties obtained on the surface of palm fruits are used as an indicator of maturation, and the oil content of the fruit mesocarp is directly associated with the point of physiological maturation, so the different spectral responses can be used to estimate the amount of oil (Saeed et al., 2012; Matsimbe et al., 2015; Costa et al., 2017). Macaw fruits present subtle changes in color during development, ranging from green to brownish or brown when mature (Montoya et al., 2016). For this reason, it is difficult to associate the maturation stage with a specific color pattern by visual analysis alone. A quantitative representation of color can capture subtle changes in the colorimetric aspect of the fruits, allowing the maturation process to be monitored through optical sensors (Wu & Sun, 2013; Ávila et al., 2015). Thus, the objective of this study was to develop a classifier to discriminate ripe fruit, suitable for harvesting, based on the colorimetric properties, relating them to the oil content of the fruit mesocarp.

Location and acquisition of samples
The fruits used in the experiment were collected from September 2013 to January 2014 in an area in the municipality of Acaiaca, in the Zona da Mata Mineira, at 20º 23' 33" south latitude and 47º 07' 31" west longitude, with an average altitude of 601 m.
The Macaw trees evaluated were Acrocomia aculeata, over 10 years old and in the reproductive stage, growing in an extractive system with no commercial purpose and without previous soil treatment.
For the experiment, 20 different bunches were used, and five fruits were collected per bunch at each of nine maturation stages, totaling 900 analyzed fruits. The maturation stages were counted from the beginning of bunch flowering and ranged from 41 to 61 weeks after flowering (WAF).

Acquisition and image processing
For image acquisition, we used a computer for image storage and processing, two 100 W halogen lamps and a multispectral CCD camera (Fluxdata, model FD-1665). The camera was positioned 25 cm above the fruit. The images were collected in RGB.
In the pre-processing stage, performed in the free software ImageJ, the object of interest (fruit) was highlighted by eliminating the background and possible noise from the image. For this process, the original image was binarized by dividing the histogram into two classes. The threshold between classes was chosen automatically by the method of Otsu (Otsu, 1975). The new image, without background and noise, was obtained through a logical operator of the union of the original image with the binarized image (Figure 1).
The colorimetric parameters were computed from the RGB intensities (Pedrini & Schwartz, 2007), where R is the intensity of red, 0 ≤ R ≤ 255.
For the analysis in the CIELab color model, the RGB images of the fruits were converted according to the norms and equations proposed by the Commission Internationale de l'Éclairage (Pedrini & Schwartz, 2007). For conversion, RGB images were initially converted to the XYZ primary color system according to [eq. (5)], [eq. (6)] and [eq. (7)]. From the CIELab model we evaluated the luminance (Lu) and the two color ranges a and b, associated with chromaticity ranging from green to red and from blue to yellow, using the equations starting at [eq. (8)]. The chromaticity module is given by [eq. (12)]:

$$c = \sqrt{a^{2} + b^{2}} \qquad (12)$$

where a is the green-to-red chromaticity, -120 ≤ a ≤ 120; b is the blue-to-yellow chromaticity, -120 ≤ b ≤ 120; and c is the chromaticity module.
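The binarization and background removal described in this section can be sketched in plain Python. This is only an illustration under stated assumptions, not the ImageJ procedure used in the study: the gray level is taken as the mean of R, G and B, the fruit is assumed to be brighter than the background (`fruit_is_bright`), and the function names are hypothetical.

```python
def otsu_threshold(gray_values, levels=256):
    """Return the gray level that maximizes the between-class variance (Otsu)."""
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1
    total = len(gray_values)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0     # number of pixels at or below the candidate threshold
    sum0 = 0   # sum of gray levels at or below the candidate threshold
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_between:
            best_t, best_between = t, between
    return best_t


def remove_background(rgb_pixels, fruit_is_bright=True):
    """Set background pixels to black, keeping only the thresholded object.

    Assumption: gray level = mean of R, G and B; which side of the
    threshold holds the fruit depends on the real acquisition setup.
    """
    gray = [round((r + g + b) / 3) for r, g, b in rgb_pixels]
    t = otsu_threshold(gray)
    keep = [(g > t) == fruit_is_bright for g in gray]
    cleaned = [p if k else (0, 0, 0) for p, k in zip(rgb_pixels, keep)]
    return cleaned, t
```

Averaging the RGB or derived color values over the pixels that survive the mask would then give one descriptor vector per fruit.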

Determination of the oil content in the fruits
The oil was extracted with the solvent n-hexane in a Soxhlet extractor by method 032/IV of the analytical standards of the Adolfo Lutz Institute (IAL, 2005). The mesocarp of the fruit was cut and dried in a ventilated oven at 65°C for 72 h. After drying, the samples were weighed and placed in paper filter cartridges, which were placed in the Soxhlet extractor and submerged in 150 mL of n-hexane for 8 hours, until the extract was colorless.
The extract was transferred to an oven at 105°C for 24 hours to evaporate the n-hexane and the water contained in the mesocarp. After 24 hours, the samples were cooled to room temperature and weighed. The extraction was carried out fruit by fruit, giving an oil content (OC) value for each sample, evaluated by [eq. (13)].
where OC is the oil content, mg g⁻¹; Mb is the mass of the sample before oil extraction, g; Ma is the mass of the sample after oil extraction, g; and Mc is the mass of the paper filter cartridge, g.
Analysis of variance and a simple linear regression model were used to evaluate the oil content as a function of maturation period (weeks after flowering). A completely randomized design was used, in which the nine maturation times were the treatments and the twenty bunches evaluated at each stage were the replicates. Each replicate was the average of five measurements, corresponding to the oil content of each of the five fruits collected in each bunch.
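The replicate averaging and the simple linear regression can be illustrated with a short sketch. The function names and the numbers in the usage note are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the study, and the analysis of variance itself is not reproduced here.

```python
def bunch_mean(fruit_oil_contents):
    """One replicate: mean oil content (mg/g) of the fruits from one bunch."""
    return sum(fruit_oil_contents) / len(fruit_oil_contents)


def linear_fit(weeks, oil_contents):
    """Ordinary least squares fit of oil content against week after flowering.

    Returns (slope, intercept); the slope is the change in oil content
    per week of maturation.
    """
    n = len(weeks)
    mean_x = sum(weeks) / n
    mean_y = sum(oil_contents) / n
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, oil_contents))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For example, `linear_fit([41, 45, 53], [100.0, 180.0, 340.0])` recovers a slope of 20 mg g⁻¹ per week from these made-up, perfectly linear values.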

Classification of fruits for determining the harvest point according to the color models
The fruits used in the experiment were divided into two classes. The immature class (A) was composed of fruits collected at the 41st WAF and 45th WAF. These fruits were chosen because they belong to the two initial collection stages and were considered too green for harvest. The mature class (B) was composed of fruits collected at the 60th WAF and 61st WAF, which were considered close to physiological maturation and consequently close to the harvesting point. Table 1 shows the number of fruits in each class.
To classify the fruits as mature or immature, two acyclic neural networks were trained with the error backpropagation algorithm, using the Levenberg-Marquardt variation to shorten training time and improve performance in pattern classification.
Eleven neural networks were developed according to the parameters of the color models (Table 2). Each network had up to three inputs, depending on its number of descriptors, followed by two intermediate layers with two neurons each and one output layer (Figure 2). The intermediate layers and the output layer used the hyperbolic tangent as the activation function. The values obtained in the output layer are continuously distributed between 0 and 1 and were rounded to the nearest whole number. Thus, a network output of 0 referred to fruits classified in the Immature class (A) and a network output of 1 referred to fruits classified in the Mature class (B).
Fifty percent of the fruits were used to train the networks, 20% for validation and 30% to test the trained networks. Each network was trained ten times, since the network parameters are generated randomly at the beginning of training. Training was stopped by the error-progress criterion over six consecutive cycles. For each type of network, the neural network with the highest percentage of correct classifications of the test samples was selected.
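The 50/20/30 partition of the fruits can be sketched as follows. This is a minimal illustration, assuming a simple random shuffle; the text does not say whether the split was stratified by class, and the function name and the `seed` parameter are hypothetical.

```python
import random


def split_samples(samples, train=0.5, validation=0.2, seed=0):
    """Shuffle the samples and split them into training, validation and test sets.

    The test set receives whatever remains after the training and
    validation shares are taken (30% with the default arguments).
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```

Assuming 100 fruits per stage (20 bunches with five fruits each), the two classes together hold 400 fruits, which this split would divide into 200 for training, 80 for validation and 120 for testing.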
The accuracy of the classification was evaluated by the global accuracy coefficient and by the Kappa index calculated by means of the error matrix obtained in the classification of the test sample.
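Both measures follow directly from the error matrix. The sketch below uses the standard definitions: overall accuracy is the share of samples on the diagonal, and the Kappa index is (po - pe) / (1 - pe), where po is the observed agreement and pe the agreement expected by chance from the row and column totals. The function names and the matrix in the usage note are hypothetical, not results from the study.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square error matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa_index(matrix):
    """Cohen's Kappa computed from a square error (confusion) matrix."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)
```

For a hypothetical test set of 120 fruits with error matrix `[[55, 5], [5, 55]]`, the overall accuracy is 110/120 and the Kappa index is 5/6, about 0.83.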
Networks with a Kappa index greater than 0.75 (Landis & Koch, 1977) and whose classification was better than a random classification by the Z test were selected. To verify whether the accuracies of the classifications differed, the Kappa coefficients of the selected neural networks were compared using the Z test at a significance level of 0.01 (Z tabulated = 2.57) (Congalton & Mead, 1986).
The selected neural networks were used to classify the fruits of the nine maturation stages, and the percentage of fruits classified in the Immature and Mature classes was computed. Each bunch was allocated to the class that received the highest percentage of its fruits, called the predominant class.
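The two Z tests can be sketched as below, assuming the usual form in which a Kappa estimate is divided by its standard error and two independent Kappa estimates are compared through the difference over the pooled standard error. The Kappa variances must come from the error matrices; the values in the usage note are hypothetical, and the function names are illustrative.

```python
import math

Z_TABULATED = 2.57  # critical value at the 0.01 significance level, as quoted in the text


def z_against_random(kappa, variance):
    """Z statistic testing whether one Kappa differs from zero (random agreement)."""
    return kappa / math.sqrt(variance)


def z_between(kappa_1, variance_1, kappa_2, variance_2):
    """Z statistic comparing two independent Kappa estimates."""
    return abs(kappa_1 - kappa_2) / math.sqrt(variance_1 + variance_2)


def kappas_differ(kappa_1, variance_1, kappa_2, variance_2, z_crit=Z_TABULATED):
    """True when the two classifications differ at the chosen significance level."""
    return z_between(kappa_1, variance_1, kappa_2, variance_2) > z_crit
```

With Kappa values of 0.901 and 0.942 and a hypothetical variance of 0.0015 for each, `z_between` returns about 0.75, below 2.57, so the two classifications would be treated as statistically equal.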
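The allocation of a bunch to its predominant class amounts to a majority vote over its fruits. A minimal sketch, with class labels "A" (Immature) and "B" (Mature) as in the text and a hypothetical function name:

```python
from collections import Counter


def predominant_class(fruit_labels):
    """Return the most frequent class among a bunch's fruits and its percentage."""
    counts = Counter(fruit_labels)
    label, n = counts.most_common(1)[0]
    return label, 100.0 * n / len(fruit_labels)
```

With five fruits per bunch and two classes a tie cannot occur; a bunch with labels `["B", "B", "A", "B", "A"]` would be allocated to class B with 60% of its fruits.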

RESULTS AND DISCUSSION
The analysis of variance indicated a relation between oil content and the maturation times evaluated, at a significance level of 0.01 (p-value = 7.67e-32). The response of oil content as a function of maturation week (Figure 3) showed that oil content increased by 20.22 mg g⁻¹ with each maturation week, reaching mean values of 506.70 mg g⁻¹ and 530.25 mg g⁻¹ at the 60th WAF and 61st WAF, periods that are indicated for harvesting.
* Significant coefficient at a level of 0.01 by the t-test.
The increase of the oil content in Macaw fruit is associated with the synthesis of the polysaccharide reserves that occurs during maturation (Montoya et al., 2016; Barreto et al., 2016). Harvesting the fruits at the appropriate time, before they detach naturally from the bunches, is essential for extracting larger amounts of oil (Queiroz et al., 2015). Choong et al. (2006) demonstrated a positive relationship between color, using the RGB model, and the oil content of the mesocarp of Elaeis guineensis palm fruits at different maturation stages. Hue has also been pointed to as the appropriate colorimetric property to relate to the oil content of the fruits of the same palm tree. Thus, automated systems based on the colorimetric properties of Macaw fruits can be used not only to indicate the appropriate time for harvest but also to estimate the oil content in the mesocarp of these fruits without destroying the sample.
The results of the eleven neural networks based on the color models are shown in Table 3 through the Kappa index. The Kappa index, calculated from the error matrix generated by applying the neural networks to the test group of fruits, showed that the classifications generated by RN-RGB, RN-H and RN-HSI were significant, considered better than random at a significance level of 1% by the Z test, and had a Kappa index greater than 0.75, indicating excellent agreement (Landis & Koch, 1977). The classifications of the RN-R, RN-L and RN-Lab networks, although statistically better than random, had a Kappa index lower than 0.75 and therefore did not meet the requirements for selection. RN-H and RN-HSI presented the highest values of overall accuracy and Kappa index, demonstrating that these networks were the most accurate for fruit classification (Table 3). As RN-H and RN-HSI obtained equivalent accuracy, the saturation (S) and intensity (I) colorimetric properties did not influence the fruit classification, which shows that an analysis by hue (H) alone is sufficient to classify Macaw fruits in the HSI color model. In the RGB color model, in turn, the combination of the red (R), green (G) and blue (B) values is decisive for the success of the classifier. Analyzed individually, these properties did not yield classifiers with relevant accuracy. Red intensity was the only parameter to give a classification statistically better than random at a significance level of 1%, but with a Kappa index between 0.40 and 0.75, indicating moderate agreement (Landis & Koch, 1977).
The comparison between RN-RGB and RN-H by the Z test indicated that the classifiers were statistically equal at a significance level of 1% (Z calculated (0.76) < Z tabulated (2.57)).
Classification systems based on the RGB model have been successful in identifying the maturation degree of palm fruits in general (Zhang et al., 2014; Mohammadi et al., 2015). However, in some cases where the colorimetric variation between stages is subtle, the RGB model may have difficulty distinguishing maturation stages because it is influenced by the intensity of incident light. This difficulty is minimized in the HSI model, since it allows hue and saturation to be analyzed separately. Hue is successfully applied in some automated fruit classification systems that use artificial vision to determine the maturation stage in palm trees (Tan et al., 2010; Makky et al., 2014), demonstrating that conversion from the RGB model to the HSI model facilitates distinguishing the maturation degree of Elaeis guineensis fruits. Fadilah et al. (2012) also used hue as the input descriptor of a neural network classification system for the maturation stage of Elaeis guineensis fruits and obtained 91.63% correct classifications, which demonstrates the efficiency of this colorimetric parameter.
Although the RGB parameters can be used to classify the fruits by maturation, the colorimetric hue parameter is the most recommended alternative for developing systems that determine the harvest point of Macaw fruits with higher oil contents.
In the classification by RN-H (Table 4), the fruits were classified predominantly as immature up to the 53rd WAF. From the 55th WAF up to the 59th WAF, the predominant classification varied, which is attributed to the similarity of the colorimetric characteristics in the stages near physiological maturation and which reduced the accuracy of the classifier. From the 59th WAF, the fruits were classified in the Mature class and considered suitable for harvest. The high percentage of classification accuracy in all weeks should be highlighted, as it confirms the ability of the classifier to distinguish immature from mature fruits. Macaw climacteric respiration allows the fruits to continue the maturation process, including the increase in oil content, if they are collected before physiological maturation (Evaristo et al., 2016). Thus, a harvest from the 55th WAF yields some fruits already suitable for harvest and others still unsuitable, which can complete the maturation process during storage. From the 59th WAF, a selective harvest can be obtained with fruits predominantly in the physiological maturation stage and with the highest oil contents. Although early harvesting reduces losses of mature fruits by natural detachment and requires fewer harvesting operations in the field than selective harvesting, it demands rigid post-harvest management so that the fruits reach the highest oil quantity and quality.
In particular situations where colorimetric measurements of the fruits are not possible, an individual analysis of bunches through fruit sampling may be an alternative for determining the harvest point (Table 5). The bunch analysis for each maturation stage showed that evaluating a sample of fruits from each bunch was sufficient for the RN-H classifier to indicate the moment of Macaw harvest from the 59th WAF.

CONCLUSIONS
The neural networks developed according to the color models identified the 59th week after flowering as the appropriate time for harvesting Macaw fruits.
Because they are associated with the maturation stage of the fruit, the colorimetric properties can be used as an indicator of the range of oil content found in the Macaw fruit.