A smartphone-based apple yield estimation application using imaging

Apple yield estimation using a smartphone with image processing technology offers advantages such as low cost, quick access and simple operation. This article proposes a distributed framework for estimating potential fruit yield, consisting of the acquisition of fruit tree images and yield prediction in the smartphone client, and data processing and model calculation in the server client. An image processing method was designed whose core steps are image segmentation using the R/B value combined with the V value, and circle-fitting using curvature analysis. This method yielded four parameters, namely, total identified pixel area (TP), fitting circle amount (FC), average radius of the fitting circle (RC) and small polygon pixel area (SP). An individual tree yield estimation model based on an ANN (Artificial Neural Network) was developed with three layers, four input parameters, 14 hidden neurons, and one output parameter. The system was used in an experimental Fuji apple (Malus domestica Borkh. cv. Red Fuji) orchard. Twenty-six sample trees were selected from a total of 80 trees, taking the trees whose ID was a multiple of three, to establish the model; 21 groups of data were used for training and 5 groups for validation. The R² value for the training dataset was 0.996 and the relative root mean-squared error (RRMSE) value was 0.063. The RRMSE value for the validation dataset was 0.284. Furthermore, a yield map of the 80 apple trees was generated, and the spatial distribution of the yield was identified. The map provided appreciable decision support for site-specific management.


Introduction
Major fluctuations in apple yield from year to year and from orchard to orchard have not only seriously damaged the interests of farmers but have also affected the balance between supply and demand (Devetter et al., 2015). Thus, apple yield estimation with an emphasis on tree variability is a crucial step in precision orchard management and a prerequisite for all partners in the food chain (Stajnko et al., 2009).
With the development of spectroscopic analysis and image sensing technology, estimating site-specific fruit yield has become a new research trend (Okamotoa and Won, 2009; Zude et al., 2008). The potential of the narrow-band TBVI (two-band vegetation index) derived from airborne hyperspectral imagery to predict citrus fruit yield was examined in Japan (Ye et al., 2008). The microwave backscatter response of pecan tree canopy samples was studied to estimate pecan yield in situ using terrestrial radar (James et al., 2013). Compared with spectroscopic and remote sensing analysis, machine vision and image processing have the advantages of low cost and easy operation (Wachs et al., 2010). They have been widely applied in yield estimation of fruit such as apple (Aggelopoulou et al., 2011; Zhou et al., 2012; Qian et al., 2012b), citrus (Gong et al., 2013), blueberry (Swain et al., 2010), and mango (Payne et al., 2013).
In China, most orchards are managed by farmers with small production areas. Providing a simple and convenient means of real-time, on-site estimation is also important to orchard workers, even if the precision is not very high. With the development of mobile communication technology, portable devices (smartphones, tablets) have become effective tools for collecting information (Qian et al., 2012b; Gong et al., 2013; So-In et al., 2014). In order to improve model processing capacity and enhance the display of yield information, a new framework was designed: 1) the fruit tree canopy images and the tree ID were acquired together with a smartphone; 2) image processing and model construction following the ANN (Artificial Neural Network) method were carried out in the server client; and 3) the estimation result and yield map were shown on the smartphone. Based on this framework, an application on the Android platform was run and tested in an apple orchard.

Application framework design
The first step in yield estimation for an individual tree was the selection of a single fruit tree for analysis using a recognized identification method described in the literature (Qian et al., 2015), which requires a tree label card with a QR code to be hung on the tree. The QR code, which included the tree ID and tree position, could then be recorded and extracted on a smartphone.
As Figure 1 shows, the application framework consisted of data acquisition and display in the smartphone client and data processing in the server client, comprising eight main steps.
Step 1: Scanning the QR code on the tree label and extracting the tree ID. The QR code on the tree label card was captured with the smartphone camera. Through image preprocessing and barcode decoding, the tree ID stored in the QR code was extracted and recorded.
Step 2: Acquiring the canopy images from two sides. Because of overlapping, it was necessary to capture images from both sides of the fruit tree in question. The images needed to cover as much of the target tree's canopy as possible and to exclude other trees. The two images of the tree were stored in the smartphone and linked to the tree ID.
Step 3: Processing the images preliminarily in the smartphone client. To reduce data throughput and the computational load on the server client, preliminary image segmentation was implemented in the smartphone client.
Step 4: Uploading the tree ID and the related preprocessed images to the server client.
Step 5: Identifying the number of objects in the images. Total identified pixel area (TP), fitting circle amount (FC), average radius of the fitting circle (RC) and small polygon pixel area (SP) were used as input parameters to establish a yield estimation model. TP showed the overall yield trend for different trees. FC and RC represented the number and size of non-overlapping and slightly overlapping fruit, whose outlines can be fitted with the circle-fitting method. These two parameters had a direct relationship with fruit yield. SP represented the amount of small, unfitted polygons resulting from serious overlapping. In this step, these parameters were extracted. TP and SP can be calculated directly by the Matlab software program. FC and RC were also calculated by the Matlab software program after circle-fitting.
Step 6: Estimating the fruit tree yield by the ANN method. The four extracted image features were used as input parameters and the individual tree yield as the output parameter. A fruit tree estimation model was then established following the ANN method, and an estimation result was given.
Step 7: Sending the yield information to the phone.
Step 8: Displaying the fruit tree yield, followed by analysis. The yield value was recorded and displayed on the phone. Drawing on the yield values of many trees across the whole orchard, a yield map can be drafted.

Preliminary image segmentation in phone client
Fuji apples show deeper red and reduced green colors during the harvest period, and the R/B value (where R and B are the red and blue values in RGB color space) can be seen as an important segmentation parameter (Raphael et al., 2012). Qian et al. (2012a) used line profile analysis and combined the R/B and V values for segmentation, which separated the apples from the background well. Using these thresholds, the objects and background were segmented preliminarily.

Single apple recognition with circle-fitting method
Apple separation and overlap, which occur naturally, strongly affected the apple identification rate. As the contour of an identified apple was considered to be approximately a circle, circle fitting based on curvature analysis could be used to discriminate separated apples from overlapped apples (Xiang et al., 2012). Curvature k was expressed in terms of the following quantities: (x_i, y_i), the coordinates of the i-th sampling point; θ_i, the tangent angle; and ΔS, the curve length between two consecutive sampling points. Figure 2 shows the decision flow for single apple recognition.
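The curvature analysis can be illustrated with a minimal sketch. It assumes the usual discrete definition of curvature, the change in tangent angle between consecutive contour segments divided by the segment length; the exact expression used in the study is not reproduced in this text, so the function below (`discrete_curvature`, a hypothetical name) is an illustration rather than the authors' formula. It also assumes that consecutive sampling points are distinct.

```python
import math


def discrete_curvature(points):
    """Approximate curvature at each sample of a closed contour.

    points: list of (x, y) tuples ordered along the contour.
    Returns one value per point: the change in tangent angle divided
    by the length of the segment to the next sampling point.
    """
    n = len(points)
    # Tangent angle and length of the segment leaving each point.
    angles, lengths = [], []
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        angles.append(math.atan2(y1 - y0, x1 - x0))
        lengths.append(math.hypot(x1 - x0, y1 - y0))

    curvature = []
    for i in range(n):
        # Wrap the angle difference into [-pi, pi) so that the jump of
        # atan2 at +/-pi does not create a spurious spike.
        d_theta = angles[(i + 1) % n] - angles[i]
        d_theta = (d_theta + math.pi) % (2 * math.pi) - math.pi
        curvature.append(d_theta / lengths[i])
    return curvature


# A synthetic circle of radius 45 should give a curvature close to 1/45
# at every sampling point.
circle = [(100 + 45 * math.cos(2 * math.pi * t / 360),
           80 + 45 * math.sin(2 * math.pi * t / 360)) for t in range(360)]
k = discrete_curvature(circle)
```

On a real contour, a nearly constant curvature along a stretch suggests a single circular outline, whereas abrupt changes tend to mark the junction between overlapping fruit.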

Apple yield estimation model established using the ANN method
In this study, a back propagation algorithm was used to train the neural network. A three-layer network architecture, consisting of one input layer, one hidden layer and one output layer, was established. The input layer had four variables: TP, FC, RC and SP. The hidden layer contained 14 neurons. As the estimated fruit yield was the only target of the network, the output layer contained a single neuron. The Tan-Sigmoid transfer function (tansig) was used from the input layer to the hidden layer, and the linear transfer function (purelin) from the hidden layer to the output layer. The ANN structure is described in Figure 3.
The coefficient of determination (R²) and relative root-mean-square error (RRMSE) were used to evaluate the yield estimation model. The RRMSE was calculated from N, the number of data samples; Y(i), the i-th actual yield (kg); Ȳ, the average of the actual values; and Ŷ(i), the i-th predicted yield. The higher the R² value and the lower the RRMSE value, the better the model performs.

Mobile application in smartphone client
The mobile application was developed on the Android platform using the Java language. The application performed the following procedures (Figure 4A); during data collection, training data were distinguished from testing data.
Image acquiring: Using the phone camera, the QR code image and the canopy images from two sides were acquired. The fruit tree ID was extracted and stored with the related canopy images (Figure 4B).
Image preliminary processing: Default segmentation threshold values of R/B and V were set in the application. The user can also adjust the values according to actual conditions. The original canopy images were processed using these thresholds.
Actual yield collecting: When the model was being established, the actual yield of each individual tree had to be collected for comparison with the estimated yield to test the model's precision. The actual yield was recorded with the tree ID.
Data uploading: The preliminarily processed images were uploaded to the server client with the tree ID.
Individual tree yield showing: When the server client returned the yield estimation result, the information, including the tree ID and estimated yield, was shown in the phone interface (Figure 4C).
Orchard yield mapping: Once the yields of many trees had been estimated, a yield map could be drafted. In the map, yield differences are shown in different colors, so that the user can see the yield distribution across positions directly.

Data processing in server client
The processing program was developed on the Microsoft .NET platform using the Matlab V7.0 interface. Data processing was carried out automatically when the program received data from the mobile client, according to whether the data were training data or testing data. The following procedures were carried out in the program:
Extracting the image features: TP and SP can be calculated directly through the interface provided by the Matlab software program. FC and RC were calculated after circle-fitting. For each tree analyzed, the TP, FC, RC and SP values of both sides were added as input parameters for the estimation model. Default parameter values for the circle-fitting were set in the application. The user can also adjust the values according to actual conditions.
Establishing the potential yield model: Using the training set, the yield estimation model was constructed from the four extracted input parameters and the actual yield, following the ANN method. The predicted individual tree yield was calculated as the output parameter, and the estimation precision was given by comparing the actual and predicted yields.
Estimating the individual tree yield: Using the model, the individual tree yield could be predicted.
Calibrating the model: To make the model more practical in a different apple orchard, a calibration function was provided. With this function, the images of two sides and the actual yield of a single tree were collected and handled as a training set.
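The forward pass of the three-layer network described in the ANN section (four inputs, 14 tansig hidden neurons, one purelin output) can be sketched as follows. This is a minimal illustration, not the Matlab implementation used in the study: the weights are random placeholders, the input vector is hypothetical, and training by back propagation is omitted. The tansig function is mathematically identical to the hyperbolic tangent.

```python
import math
import random


def tansig(x):
    """Hyperbolic tangent sigmoid, the hidden-layer transfer function."""
    return math.tanh(x)


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a 4-14-1 network: tansig hidden layer, linear output."""
    hidden = [
        tansig(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # purelin: the output is the weighted sum itself, with no squashing.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


N_INPUTS, N_HIDDEN = 4, 14

# Placeholder weights: a trained model would supply these values.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)]
            for _ in range(N_HIDDEN)]
b_hidden = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = 0.0

# Hypothetical normalized feature vector (TP, FC, RC, SP).
features = [0.2, -0.5, 0.1, 0.7]
normalized_yield = forward(features, w_hidden, b_hidden, w_out, b_out)
```

Because the inputs are normalized before training, the network output is also on a normalized scale and would have to be mapped back to kilograms with the inverse of whatever normalization was applied.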

Study area
The test was conducted in Feicheng city in Shandong Province, China. The orchard is located at 36°13'47.41" N and 116°50'36.70" E, at an altitude of 141 m. A total of 80 Fuji apple trees were planted in the orchard, which had an area of 0.15 hm². The fruit tree row and column spacing was approximately 3.5 m × 2 m. The orchard was planted in 2001 and had a high management level with detailed production records. The height of the trees was 2.5-3 m and the trees remained free of serious disease during the study period.

Data acquisition
Approximately one week before the fruit harvest, on 10 and 11 Oct 2013, digital images were acquired. During this period, the fruit was red and mature, and there were significant color differences between the fruits, leaves and trunks. The weather was sunny on both days. Using a mobile phone (HUAWEI G716, CPU: MSM8930 1.2 GHz, Memory: 1 GB RAM, Resolution: 8 megapixels) with the developed apple potential yield estimation application, images of the 80 fruit trees were obtained between 10 a.m. and 3 p.m. from the southeast and northwest directions. The total number of images of the 80 trees from the two directions was thus 160.
To maintain image conformity, the same focal length was used throughout, and as much of the entire fruit tree as possible was included in each image. The image format was JPG with a resolution of 1600 × 1200.

Collection of fruit yield on individual trees
Actual yield data for the individual apple trees were collected from 13 to 25 Oct 2013. During the harvest period, orchard harvest managers recorded the yield data at each sampling time using a smartphone with the developed yield estimation application. At the end of the harvest, the yield data for individual apple trees were totaled automatically.

Image segmentation
The R/B value and the V value along the profile line were calculated. Figures 5A and B show a test image of the line profile analysis. For the original image with a white profile line (Figure 5A), the R/B values and V values were plotted over the pixel range from 0 to 639 in Figure 5B. There were five R/B peaks, in the pixel ranges 80-110, 135-220, 390-420, 425-486 and 566-575 (Figure 5B). Only the 425-486 range corresponded to mature apples. Further analysis showed that the other four peak ranges were noise from backlit leaves and red trunks. These two types of objects, with high red and low blue values, were thus easily misidentified as apples.
The V value was introduced to separate the high background R/B values caused by backlit leaves and red trunks from the actual apple-related values, because the V value is low in non-apple objects with high R/B values. Thus, the V value can be used together with the R/B value to exclude non-apple noise. In this case, the R/B value had five peaks, whereas the V value had only one peak, in the range of 425-486, which was the actual apple range. By combining the R/B value and the V value to segment the image, the identification error could be significantly reduced. Ten images were selected at random to obtain the R/B and V segmentation thresholds. A profile line was plotted through the middle of every image, and the 10 lines passed through 54 apples. The R/B value and V value at the points where the profile line intersected the apple edges were analyzed. Generally, an apple and a profile line have two intersection points, but because of apple separation and overlap, the number of intersection points was 174 rather than 108. The average values over the 174 points were 1.375 for the R/B value and 0.456 for the V value.
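The combined R/B and V rule can be sketched in a few lines. The two thresholds are the averages reported above; everything else is an assumption made for illustration: that V is the value channel of HSV, computed as max(R, G, B) scaled to [0, 1]; that a pixel counts as apple when both quantities are at or above their thresholds; and that images are plain nested lists of 8-bit RGB tuples. The function names are hypothetical.

```python
RB_THRESHOLD = 1.375  # average R/B at apple edges, from the text
V_THRESHOLD = 0.456   # average V at apple edges, from the text


def is_apple_pixel(r, g, b, rb_min=RB_THRESHOLD, v_min=V_THRESHOLD):
    """Classify one 8-bit RGB pixel as apple (True) or background (False)."""
    v = max(r, g, b) / 255.0  # assumed: V channel of HSV, scaled to [0, 1]
    rb = r / b if b > 0 else float("inf")  # B == 0 counts as a very high ratio
    return rb >= rb_min and v >= v_min


def segment(image):
    """Return a binary mask (rows of 0/1) for a nested-list RGB image."""
    return [[1 if is_apple_pixel(*px) else 0 for px in row] for row in image]


# Tiny illustrative image: bright red apple, dark reddish trunk, green leaf.
image = [[(200, 40, 50), (90, 50, 40), (60, 140, 70)]]
mask = segment(image)
```

The middle pixel shows why the V test matters: its R/B ratio of 2.25 passes the first threshold, but its low brightness rejects it, which mirrors the trunk and backlit-leaf noise described in the text.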

Overlapping and separation processing by the circle-fitting method
The parameter values were set with reference to the tomato curvature analysis in the literature (Xiang et al., 2012). Ten apple images were selected from the 160 images acquired. An average radius of 45, a maximum radius of 65 and a minimum radius of 30 were calculated. These radius values were higher than those reported for tomato in the literature, most likely because the apple radius is greater, and the curvature smaller, than for tomatoes in the same range. The circle fitting results are shown in Figures 6A and B.
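How a circle might be fitted to a visible arc and then checked against the radius bounds can be sketched as follows. The fitting procedure shown is an algebraic least-squares (Kåsa) fit, swapped in here for illustration; the text does not specify which fitting procedure the study used. The radius bounds of 30 and 65 come from the text and are assumed to be in pixels. The function names are hypothetical.

```python
import math

R_MIN, R_MAX = 30, 65  # radius bounds from the text (assumed to be pixels)


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_circle(points):
    """Least-squares (Kasa) fit of x^2 + y^2 + D*x + E*y + F = 0.

    Returns (cx, cy, radius). Raises ValueError for degenerate input.
    """
    n = len(points)
    # Shift to the centroid so the normal equations stay well conditioned.
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    pts = [(x - mx, y - my) for x, y in points]
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    syy = sum(y * y for _, y in pts)
    sxy = sum(x * y for x, y in pts)
    sxz = sum(x * (x * x + y * y) for x, y in pts)
    syz = sum(y * (x * x + y * y) for x, y in pts)
    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [-sxz, -syz, -(sxx + syy)]
    det = det3(a)
    if abs(det) < 1e-9:
        raise ValueError("points are collinear or too few")
    # Cramer's rule: replace one column at a time with the right-hand side.
    sol = []
    for col in range(3):
        m = [row[:] for row in a]
        for row_idx in range(3):
            m[row_idx][col] = rhs[row_idx]
        sol.append(det3(m) / det)
    d, e, f = sol
    radius = math.sqrt(d * d / 4 + e * e / 4 - f)
    return mx - d / 2, my - e / 2, radius


def is_plausible_apple(radius, r_min=R_MIN, r_max=R_MAX):
    """Accept a fitted circle only if its radius lies within the bounds."""
    return r_min <= radius <= r_max


# Visible arc (one third of the outline) of a partly occluded apple.
arc = [(100 + 45 * math.cos(math.radians(t)),
        80 + 45 * math.sin(math.radians(t))) for t in range(0, 120, 5)]
cx, cy, r = fit_circle(arc)
```

Fitting to an arc rather than a full outline is the relevant case here, since an overlapped apple exposes only part of its contour.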

Model input variable extraction
TP, FC, RC and SP were used as ANN inputs to establish the yield estimation model. TP was counted by calling the "bwlabel" method provided by the Matlab V7.0 interface. SP was counted using the "count" method provided by the software. FC and RC were calculated by calling the "count" and "average" methods in the server client after circle fitting. Four color features were thus obtained. Table 1 shows the extraction results from 21 fruit tree images for the training dataset and 5 fruit tree images for the validation dataset.
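The assembly of the four model inputs for one tree can be sketched as below. It follows the statement in the server-side procedure that the TP, FC, RC and SP values of both sides were added; the class and function names, and all numbers, are hypothetical and serve only to show the bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class SideFeatures:
    """Image features extracted from one side of a tree canopy."""
    tp: float  # total identified pixel area
    fc: float  # number of fitted circles
    rc: float  # average radius of the fitted circles
    sp: float  # pixel area of small, unfitted polygons


def side_features(mask_area, circle_radii, small_polygon_areas):
    """Build the four features from segmentation and circle-fitting results."""
    fc = len(circle_radii)
    rc = sum(circle_radii) / fc if fc else 0.0
    return SideFeatures(tp=mask_area, fc=fc, rc=rc,
                        sp=sum(small_polygon_areas))


def tree_input(side_a, side_b):
    """Add the features of both sides to form the input [TP, FC, RC, SP]."""
    return [side_a.tp + side_b.tp, side_a.fc + side_b.fc,
            side_a.rc + side_b.rc, side_a.sp + side_b.sp]


# Hypothetical numbers, for illustration only.
southeast = side_features(52000, [44, 46, 51], [300, 420])
northwest = side_features(47000, [40, 50], [250])
x = tree_input(southeast, northwest)
```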

Yield estimation
From the 80 apple trees, 26 trees were selected for establishing the model by taking every third tree in ascending order of tree ID. The data were divided into two parts. The first part was a training dataset comprising 21 apple trees, and the other five trees were used for validation. The validation dataset was selected at random from the 26 groups of data. In the training dataset, apple tree N° 3 had the lowest actual yield, 13.60 kg, and apple tree N° 9 had the highest, 71.92 kg. The average actual yield was 40.36 kg for the training dataset and 40.52 kg for the validation dataset. The standard deviations of the training dataset and the validation dataset were 14.76 kg and 9.30 kg, respectively.
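The sampling scheme can be checked with a short sketch: taking the tree IDs that are multiples of three between 1 and 80 gives exactly 26 trees, which matches the tree numbers cited in the results. The random seed and the variable names are arbitrary choices made here, so the particular validation trees drawn will not match those of the study.

```python
import random

N_TREES = 80

# Every third tree by ascending ID: 3, 6, ..., 78.
model_ids = list(range(3, N_TREES + 1, 3))

# Random hold-out of five trees; the seed is an arbitrary choice made
# here for reproducibility and is not taken from the article.
rng = random.Random(2013)
validation_ids = sorted(rng.sample(model_ids, 5))
training_ids = [t for t in model_ids if t not in validation_ids]
```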
A back propagation neural network was used to establish a prediction model for fruit yield. The four image feature parameters TP, FC, RC and SP comprised the input layer. The maximum number of training steps was set to 20,000 and the training error was set to 0.001. The main procedure consisted of two parts. Firstly, all of the data were normalized, and the training dataset was then used to compute the weights between the different layers of neurons. Secondly, the fruit yield was predicted using the validation dataset. The yield prediction model was established using data from 21 groups. The comparison of predicted yield and actual yield is shown in Figure 7 and Table 2. The absolute error between predicted yield and actual yield was small. The largest absolute error, 3.15 kg, was for apple tree N° 63. The mean value and standard deviation of the predicted yield were 40.35 kg and 14.55 kg, which were very close to the values for the actual yield. The R² value for the training dataset was 0.996 and the RRMSE value was 0.063, which indicated the effectiveness of the neural network algorithm in apple potential yield estimation for the training set.
Furthermore, the validation dataset comprising five trees was used to test the established model. The comparison of the actual and predicted yield is shown in Figure 8 and Table 2. For trees N° 27, 66 and 75, the predicted yield was lower than the actual yield. For the other two trees, the predicted yield was higher than the actual yield. The minimum deviation was 0.1, found for tree N° 66, and the maximum was 4.54, for tree N° 45. The mean value and the standard deviation of the predicted yield were 41.84 kg and 9.22 kg, which were close to the values for the actual yield. The RRMSE of the validation set was 0.284. Although this value was higher than that of the training dataset, the result could be considered acceptable according to the literature (Ye et al., 2008). In terms of the training procedure and validation result, the model could be used for potential yield estimation.
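The two evaluation statistics can be computed as in the sketch below. The definitions are assumptions: R² is taken as one minus the ratio of residual to total sum of squares, and RRMSE as the root-mean-square error divided by the mean of the actual values. The article's own RRMSE equation is not preserved in this text, so the reported values may rest on a different normalization. The yields in the example are hypothetical and are not taken from Table 2.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rrmse(actual, predicted):
    """Root-mean-square error divided by the mean of the actual values."""
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return rmse / (sum(actual) / n)


# Hypothetical yields in kg, for illustration only.
actual = [30.0, 40.0, 50.0]
predicted = [32.0, 39.0, 49.0]
r2_value = r_squared(actual, predicted)
rrmse_value = rrmse(actual, predicted)
```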

Yield mapping
In the orchard, images of the 80 apple trees were acquired and the yield of every fruit tree was estimated using the ANN potential yield estimation application on the smartphone. A yield map can be generated in the server client and sent as a picture over a 3G or 4G network for display on the smartphone. As the yield map in Figure 9 shows, the estimated fruit yield was highly variable, ranging from 13.60 to 71.92 kg. Moreover, the map revealed the spatial distribution of the different yields, which provided appreciable decision support for site-specific management.
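One simple way to turn per-tree estimates into map colors is to bin the yields into equal-interval classes, as sketched below. The number of classes, the equal-interval rule, the (row, column) position keys and the function names are all assumptions; the text does not describe how the map in Figure 9 was classified. The two extreme yields in the example are the ones reported above, while the positions and the other two values are hypothetical.

```python
def yield_class(value, low, high, n_classes=5):
    """Map a yield (kg) to an equal-interval class index 0..n_classes-1."""
    if high <= low:
        return 0
    idx = int((value - low) / (high - low) * n_classes)
    # Clamp so that the maximum yield falls into the top class.
    return min(max(idx, 0), n_classes - 1)


def yield_map(yields_by_position, n_classes=5):
    """Convert {(row, col): yield_kg} into {(row, col): class index}."""
    low = min(yields_by_position.values())
    high = max(yields_by_position.values())
    return {pos: yield_class(y, low, high, n_classes)
            for pos, y in yields_by_position.items()}


# Hypothetical positions; 13.60 and 71.92 kg are the extremes in the text.
sample = {(0, 0): 13.60, (0, 1): 40.36, (1, 0): 71.92, (1, 1): 55.00}
classes = yield_map(sample)
```

Each class index would then be assigned a color, for example from light to dark, when the map is drawn.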

Yield estimation precision
Establishing field models using image processing technology with a digital image capture instrument is a low-cost method. Swain et al. (2010) presented an automated yield monitoring system for wild blueberry fruit and observed a significant correlation between the percentage of blue pixels and actual fruit yield (R² = 0.90).
In this study, the R² value for the training dataset was 0.996. The improvement in yield estimation precision is due to two factors. First, to capture fruit that cannot be seen when the tree is imaged from one side only, this study took two images of each fruit tree, from the southeast and northwest directions. Second, because the overlapping of fruits on the tree is very serious, a circle fitting method using curvature analysis was used to discriminate the separated apples from the overlapped apples.

Performance efficiency
Investigation into estimating site-specific fruit yield has become a new research trend assisted by information and communication technology. Zhao et al. (2013) assimilated remote sensing information with a crop model using an Ensemble Kalman Filter for maize yield estimation. Herrero-Huerta et al. (2015) presented vineyard yield estimation by automatic 3D bunch modelling in field conditions. Compared with the above-mentioned research based on remote sensing and optical spectra, the approach in this study is convenient, economical and effective because it uses a mobile phone. The operation is simple and takes little time. Processing time was a major concern for the apple potential yield estimation application based on image processing. Compared with a computer, the smartphone had limited hardware resources and a slow processing speed. In order to save CPU time and memory, the image acquisition and result display procedures were deployed on the smartphone, and image processing and model computation were carried out on the server. A HUAWEI G716 smartphone with a 1.2 GHz CPU and 1 GB of memory was used to test the processing time. With images recorded on the smartphone and the model computed on the server, the time for an individual fruit tree was about 4 s. Model establishment and yield mapping required more time.

Conclusions
The wide application of mobile phones provides a convenient medium for real-time and in-situ management. This study demonstrates the potential for using a smartphone to predict individual tree yield in apple orchards. A distributed framework was designed consisting of information acquisition and display in the smartphone client and data processing in the server client.
A yield estimation model was established with image features using the ANN method. Four input parameters, TP, FC, RC and SP, were obtained after image segmentation and circle fitting. Individual tree yield was estimated as the output parameter using the model, and a smartphone-based application and a server-based procedure were developed based on the yield estimation model using Matlab interfaces.
The application was used in an experimental apple orchard. Twenty-one groups of data were used for training and 5 groups for validation. The R² value for the training dataset was 0.996 and the RRMSE value was 0.063. The RRMSE value for the validation dataset was 0.284. Furthermore, a yield map of the 80 apple trees was generated. The spatial distribution of the different yields was identified, which provided appreciable decision support for site-specific management.
Although this study produced an acceptable estimation result, there is still room for improvement. The image recognition performance and the ANN performance were the main factors that influenced the accuracy of the estimates. There is therefore a need for further study on improving the fruit identification rate under different illumination conditions. Meanwhile, the potential yield estimation model in this study has not been used under any such conditions, and it needs to be calibrated according to the actual situation.
Sci. Agric. v.75, n.4, p.273-280, July/August 2018

Figure 1 - Application framework of the smartphone-based apple potential yield estimation, combining smartphone client and server client with eight main steps.

Figure 2 - Single apple recognition determination with circle-fitting method processing.

Figure 3 - Structure of the Artificial Neural Network with four input variables and one output variable for implementing the apple yield estimation model.

Figure 4 - Interfaces of the information acquiring and displaying application on the smartphone platform (A for main menu; B for image acquiring; C for individual tree yield).

Figure 5 - Line profile analysis of R/B and V values to obtain the segmentation threshold between apples and background (A for an original image and B for the profile line of R/B value and V value of the original image).

Figure 6 - Diagrammatic sketch of circle fitting with determination threshold values to discriminate the overlapped apples (A for two original images and B for processing results after circle fitting).

Figure 7 - Results of linear regression between the actual yields and the predicted yields for the training dataset.

Figure 8 - Comparison of the actual yields and the predicted yields for the validation dataset.

Figure 9 - Apple yield map of the 80 apple trees in the experimental orchard using the ANN potential yield estimation application.

Table 1 - Modeling data of four color features (total identified pixel area (TP), fitting circle amount (FC), average radius of the fitting circle (RC) and small polygon pixel area (SP)) and actual yield data for 21 fruit trees used for training and 5 fruit trees used for validation.

Table 2 - Prediction results from statistical analysis comparing estimated yield and actual yield. RRMSE = relative root mean-squared error.