Sampling intensity and size to evaluate harvest losses in soybean crops

HIGHLIGHTS: Large harvesters require sampling of harvest losses across only half the header width. Small harvesters require sampling across the entire header width. A 62-foot harvester header has higher losses at its ends and center.

ABSTRACT: Proper sampling of grain losses during harvesting operations, with reliable and efficient sizing of samples, is necessary for efficient adjustment of the harvester to avoid harvest losses. Thus, the objective of this study was to estimate sampling intensity and sample size for harvest loss evaluations in soybean crops. Sampling was carried out in five locations with soybean crops, evaluating three different harvesters. Harvest losses were measured using square wooden frames (50 × 50 cm), which were arranged on the ground longitudinally across the harvester header width after its passage; this process was repeated 25 times at each location. The larger harvester header width in Location 1 made it possible to simulate different sample sizes for this location (50 × 100, 50 × 150, 50 × 200, and 50 × 250 cm). Only one sample size (50 × 100 cm) was used for the simulations in the other locations. Sampling only half the harvester header width is recommended to estimate harvest losses when using a 62-foot harvester header, whereas 12.5- and 17-foot harvester headers require sampling across the entire harvester header width, with a semi-amplitude of the confidence interval of 20% of the mean for all harvesters.


Introduction
Soybean (Glycine max L.) is a crop of great importance, and Brazil is the world's largest producer (FAO, 2022). According to IBGE (2022), the Center-West region was responsible for 49% of soybean production in Brazil in the 2020 crop season, followed by the South, Northeast, Southeast, and North regions, with 28, 10, 8, and 5% of the total soybean production, respectively.
Soybean cultivation can be fully mechanized, but harvest is a critical point, as grain losses may occur during the harvesting process (Chioderoli et al., 2012). Harvesters are increasingly technological and robust; however, they require maintenance, adjustment, and setting, as well as a trained operator who can carry out these adjustments and settings whenever necessary to keep them operating efficiently. These actions help to avoid grain losses during mechanized harvest, as high grain losses affect the producer's income (Holtz et al., 2020).
Sampling after the passage of the harvester through the field is a way to identify possible grain losses and to adjust the harvester to improve its harvesting efficiency (Pereira Filho et al., 2021). Considering the grain lost during the harvesting operation and its monetary value, in addition to the constant technological evolution of harvesters, studies on proper sampling of grain losses during harvesting, with reliable and efficient sizing of samples, are necessary for efficient adjustment of harvesters to avoid harvest losses. In this context, the objective of the present study was to estimate the sampling intensity and sample size to quantify losses during soybean grain harvest.

Material and Methods
Samplings to quantify soybean grain harvest losses were carried out in five locations close to the central region of the state of Rio Grande do Sul (RS), Brazil. Three different harvesters were evaluated: John Deere S790, Massey Ferguson 3640, and New Holland TC57. The harvests were carried out when the soybean plants were at the maturation stage (R8), with grain moisture varying between 16 and 18% for all crop locations.
Sampling in Location 1 (municipality of Cruz Alta; 28°51'53"S, 53°41'36.7"W, and 435 m of altitude) was carried out on April 26, 2021. The harvester evaluated for harvest losses was a John Deere S790, 2020 (new), equipped with a 62-foot draper header (new), equivalent to 18.9 m of useful harvesting width. Thirty-six samples were collected across the header width after the passage of the harvester (average harvesting speed of 6 km h⁻¹). The soybean cultivar used in this location was Monsoy M6410 IPRO.
Samplings in Locations 2, 3, and 4 (soybean crops at the Federal University of Santa Maria, in the municipality of Santa Maria; 29°43'05"S, 53°43'59"W, and 116 m of altitude) were carried out on April 12, 2021. The harvester evaluated was a Massey Ferguson 3640, 1980, subjected to a full overhaul every six months, equipped with a 12.5-foot header, equivalent to 3.8 m of useful harvesting width. A sample size of 50 × 50 cm was defined to collect eight samples along the entire header width after the passage of the harvester (average harvesting speed of 4 km h⁻¹). The soybean cultivar used in the three locations was NS 5959 IPRO.
Sampling in Location 5 (municipality of Restinga Seca; 29°44'23"S, 53°29'46"W, and 100 m of altitude) was carried out on April 26, 2021. The harvester evaluated was a New Holland TC57, 2017, subjected to a full overhaul every 12 months, equipped with a 17-foot header, equivalent to 5.2 m of useful harvesting width. Ten samples (sample size of 50 × 50 cm) were collected across the entire header width after the passage of the harvester (average harvesting speed of 4.5 km h⁻¹). The soybean cultivar used was Brasmax Garra IPRO.
Square wooden frames (50 × 50 cm) were used to measure the harvest losses; they were arranged on the ground longitudinally along the entire harvester header width after its passage. Twenty-five replicate rows (spaced 10 m apart) were sampled in each location. The grains inside the frames were collected and counted to quantify the grain losses throughout the harvesting process.
This methodology was based on a study by researchers of the Brazilian Agricultural Research Corporation (Embrapa Soybean) to determine losses in soybean crops using a transparent volumetric measuring cup (Silveira & Conte, 2013); adaptations were made to obtain more data.
The larger harvester header used in Location 1 made it possible to simulate different sample sizes for this location (50 × 100, 50 × 150, 50 × 200, and 50 × 250 cm), in addition to the 50 × 50 cm sample size defined for sample collection in each row replication in all locations. Only one sample size (50 × 100 cm) was simulated for the other locations.
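The simulation of larger sample sizes can be illustrated with a minimal Python sketch. It assumes that larger samples are formed by summing the grain counts of adjacent 50 × 50 cm frames within a row and that trailing frames that do not complete a group are discarded; neither detail is stated in the text, and the function name and the counts below are hypothetical.

```python
def group_adjacent_frames(counts, group_size):
    """Sum grain counts of adjacent 50 x 50 cm frames into larger simulated samples.

    Assumption: trailing frames that do not fill a complete group are dropped.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    usable = len(counts) - len(counts) % group_size
    return [sum(counts[i:i + group_size]) for i in range(0, usable, group_size)]


# Hypothetical grain counts for one row of eight frames (not data from the study).
row = [12, 7, 0, 3, 25, 9, 4, 6]
pairs = group_adjacent_frames(row, 2)    # simulated 50 x 100 cm samples
triples = group_adjacent_frames(row, 3)  # simulated 50 x 150 cm samples
```

Grouping by two halves the number of samples per row, which is why only the 50 × 100 cm size could reasonably be simulated for the narrow headers.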
The test of homogeneity of variance (Levene, 1960) was carried out for the sample row replications referring to harvest losses. The Shapiro-Wilk normality test (Shapiro & Wilk, 1965) was applied to assess the normal distribution of the data set. A randomness test was applied to the data through the Run test function of the snpar package (Qiu, 2014) of the software R 4.1.0 (R Core Team, 2021).
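As an illustration of the randomness test, the following Python sketch implements a Wald-Wolfowitz runs test using the large-sample normal approximation. It is not the snpar implementation used in the study; dichotomizing the losses about the row median and dropping values equal to the median are assumptions, and the function name is hypothetical.

```python
import math
from statistics import median


def runs_test(values):
    """Wald-Wolfowitz runs test about the median (normal approximation).

    Returns (number_of_runs, z, two_sided_p). Values equal to the median
    are dropped before counting runs (an assumption of this sketch).
    """
    m = median(values)
    signs = [v > m for v in values if v != m]
    n1 = sum(signs)            # observations above the median
    n2 = len(signs) - n1       # observations below the median
    if n1 == 0 or n2 == 0:
        raise ValueError("need observations on both sides of the median")
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if variance <= 0:
        raise ValueError("too few observations for the normal approximation")
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return runs, z, p
```

A sequence with too few runs (losses clustered along the header) or too many runs (losses alternating regularly) yields a small p-value and would be classified as non-random at p ≤ 0.05.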
The sampling intensity for each crop row was estimated according to the methodology proposed by Cochran (1977), as Eq. 1:

$$n = \frac{t_{\alpha/2}^{2}\,(CV\%)^{2}}{(D\%)^{2}} \qquad (1)$$

where: $n$ - sampling intensity (number of samples); $t_{\alpha/2}$ - value of Student's t-table with $n-1$ degrees of freedom at p ≤ 0.05; $CV\%$ - coefficient of variation of the variable, calculated by the expression $CV\% = (\sqrt{s^{2}}/\bar{X}) \times 100$; $s^{2}$ - sample variance; $\bar{X}$ - mean of each variable; and $D\%$ - semi-amplitude of the confidence interval of the mean (D% = 5, 10, 15, and 20).
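The calculation in Eq. 1 can be sketched in Python as follows, taking n = t² (CV%)² / (D%)² from the definitions above. The function names and the input values are hypothetical, and the tabulated t-value is supplied by the caller because the Python standard library has no Student's t quantile function.

```python
import math
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV% = 100 * sample standard deviation / mean."""
    return 100 * stdev(values) / mean(values)


def sampling_intensity(cv_percent, d_percent, t_value):
    """Eq. 1: number of samples for a semi-amplitude D% (infinite population)."""
    return (t_value ** 2) * (cv_percent ** 2) / (d_percent ** 2)


# Hypothetical inputs: CV of 40%, D% of 20, and t = 2.365
# (two-sided 5% value of Student's t with 7 degrees of freedom).
n = sampling_intensity(cv_percent=40, d_percent=20, t_value=2.365)
n_samples = math.ceil(n)  # round up to whole samples
```

Because D% enters squared in the denominator, halving the accepted semi-amplitude quadruples the required number of samples.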
Correction for finite population was carried out according to Cochran (1977) by Eq. 2:

$$n_c = \frac{n}{1 + \dfrac{n}{N}} \qquad (2)$$

where: $n_c$ - corrected sample size; $N$ - population size for each sample row (36 for Location 1; 8 for Locations 2, 3, and 4; and 10 for Location 5); and $n$ - sampling intensity for an infinite population.
Regarding the data from Location 1, where the largest harvester header (62 feet) was used, the sampling intensity was estimated for a D% of up to 100% by the maximum curvature method. Subsequently, the semi-amplitudes of the confidence interval of the mean (D%) were plotted as a function of increases in sampling intensity, and the maximum inflection point of the graph's curvature was found to estimate the sampling intensity for Location 1, with a sample size of 50 × 50 cm. All analyses were performed at p ≤ 0.05, using the software R 4.1.0 (R Core Team, 2021) and Microsoft Office Excel.

Results and Discussion
According to the normality test, the data did not follow a normal distribution, regardless of the sample size and location evaluated. This denotes that some phenomenon caused very high variances, which may also be connected to the occurrence of outliers or extreme values arising, for example, from data handling errors or from excessive threshing of grains from the plants due to the passage of animals, vehicles, or implements, which does not represent the actual loss of the harvester. However, although some outliers in the database could result from measurement error, they may be genuine, indicating an extreme response of the variable, which deserves to be further studied rather than removed (Pino, 2014).
The analysis of homogeneity of variance showed homogeneous rows for the sample size of 50 × 50 cm in Location 1, but the rows became heterogeneous as the sample size was increased (50 × 100, 50 × 150, 50 × 200, and 50 × 250 cm). The harvester header in Location 1 is much larger than those in the other locations; thus, greater losses may occur at certain positions of the harvester header. In addition, no grain loss was found in some positions; grouping the samples to simulate larger sample sizes therefore combined samples with high or low harvest losses, further increasing the amplitude of losses between samples (Table 1).
Regarding Locations 3 and 4, the rows were heterogeneous for the sample size of 50 × 50 cm, but they became homogeneous for a larger sample (50 × 100 cm); however, Locations 2 and 5 showed homogeneity of variance between rows for both sample sizes. The harvesters used in these locations (Massey Ferguson 3640 and New Holland TC57) have headers with a smaller usable width than that used in Location 1 (John Deere S790); therefore, the cutterbar oscillation is smaller and, consequently, results in more uniform grain losses throughout the entire header width (Table 1). According to Conagin et al. (1993), the lack of homogeneity of variance in the data is highly affected by the lack of normality, which was found for all conditions evaluated in the present study.
This heterogeneity may be a consequence of poor adjustment of harvesters; uneven cutterbar height; irregular terrain; harvesting speed outside the ideal range; and even the long-term use of harvesters designed for lower yields, which have lower technology and grain processing capacity than more recent harvesters, thus resulting in lower performance and, consequently, in more heterogeneous harvest losses (Schanoski et al., 2011).
The randomness test applied to the sequence of harvest losses within the sample rows showed non-random losses for Locations 1 and 2, regardless of the sample size. Random losses were found for Locations 4 and 5, for both sample sizes evaluated in these locations. Regarding Location 3, increases in sampling intensity generated sequences of random losses, indicating that increases in sample size are effective to quantify crop losses in this location, where they will occur regardless of the sample (Table 1). Similar results were found by Santos et al. (2012), who evaluated randomness and productive variability of beans and found that increasing the plot size is effective to increase randomness.
The number of samples varied according to sample size, location, and semi-amplitude of the confidence interval (D%). The smallest sample (50 × 50 cm) is preferable for Location 1, as it provided homogeneous rows, thus requiring a smaller number of samples to quantify harvest losses; using larger sample sizes, such as those simulated, requires a larger sample area and, consequently, more labor (Table 2).
Considering a D% of 20%, the sampling intensity required to estimate the harvest losses in Location 1 decreases by 50%. This indicates the possibility of reducing costs, time, and labor for performing the sampling, with high reliability of results, even for a lower sampling intensity (Lúcio et al., 2020; Lambrecht et al., 2022) (Table 2).
Regarding the other locations, the number of samples did not decrease with increasing semi-amplitude of the confidence interval, regardless of the sample size. This denotes the need for sampling the entire harvester header width, except for Locations 2 and 5, where the sample size of 50 × 50 cm and a D% of 20% allowed a reduction to seven and nine samples, respectively, for each header width, consequently reducing by 12.5 and 10% the number of samples needed to estimate harvest losses. A similar result was found for the sample size of 50 × 100 cm, for which the number of samples is equivalent to the header width, regardless of the D%, making it necessary to sample the entire harvester header width (Table 2).
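The finite-population correction cited from Cochran (1977) in the Material and Methods, together with the percentage reductions reported above, can be sketched in Python. The form nc = n / (1 + n/N) is assumed here as the commonly used version of that correction; the function names and the value n = 56 are hypothetical.

```python
def finite_population_correction(n_infinite, population_size):
    """Assumed form of Cochran's correction: nc = n / (1 + n / N)."""
    return n_infinite / (1 + n_infinite / population_size)


def reduction_percent(samples_needed, population_size):
    """Percentage of the frames in a row that need not be sampled."""
    return 100 * (population_size - samples_needed) / population_size


# Hypothetical infinite-population estimate for a row of N = 8 frames.
nc = finite_population_correction(n_infinite=56, population_size=8)

# Reductions reported in the text: seven of eight frames, nine of ten frames.
loc2 = reduction_percent(7, 8)    # 12.5
loc5 = reduction_percent(9, 10)   # 10.0
```

Under this form, nc stays below N and approaches it as n grows, which is consistent with the corrected number of samples being equivalent to the full header width whenever the uncorrected estimate is large relative to N.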
In this context, several researchers recommend sampling the entire header width when working with small harvester headers. Câmara et al. (2007) evaluated the effects of the sample area on the determination of total soybean grain losses during harvest and found that a 3-m² sample area, covering the entire header width, results in a better estimate of harvest losses, with grain losses in the frames closer to the actual losses.
Researchers at Embrapa Soybean have developed a monitoring methodology that enables the farmer to estimate the quantity of grains lost by the harvester during the harvesting process. They recommend random sampling of lost grains in 2 m² after the passage of the harvester; after collection, the grains are placed in a transparent volumetric measuring cup with a graduated scale printed on it, allowing visualization of the level of grains inside it and a fast quantification of the collected sample (Silveira & Conte, 2013).
The methodology used for grain collection in the present study is similar to those used in the aforementioned studies to quantify harvest losses. The available literature provides similar results for the adequate sample area when working with small harvesters. Studies evaluating the harvesters Massey Ferguson 3640 and New Holland TC57 recommend sampling the entire header width after harvesting, totaling areas of 2 and 2.5 m², respectively. Regarding the John Deere S790, with its larger header and total size, the adequate sampling area for an efficient estimate of harvest losses is at least 4.5 m², even when sampling an area corresponding to half the header width. Therefore, determining the sample area size (larger or smaller) depends on the harvester used.
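The sampled areas quoted above follow from the number of 50 × 50 cm frames per row, as this short Python sketch shows. The function name is hypothetical, and 18 frames is taken as half of the 36 frames collected across the 62-foot header.

```python
FRAME_AREA_M2 = 0.5 * 0.5  # one 50 x 50 cm frame


def sampled_area_m2(n_frames, frame_area_m2=FRAME_AREA_M2):
    """Total ground area covered by n_frames frames."""
    return n_frames * frame_area_m2


massey_ferguson = sampled_area_m2(8)    # 2.0 m2, entire 12.5-foot header
new_holland = sampled_area_m2(10)       # 2.5 m2, entire 17-foot header
john_deere_half = sampled_area_m2(18)   # 4.5 m2, half of the 62-foot header
```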
According to the results obtained for Location 1, based on the maximum curvature method, the ideal sampling intensity indicated by the inflection point is eight samples with a D% of 45% (Figure 1). Although this method resulted in a lower number of samples than those shown in Table 2, this sampling intensity cannot provide accurate and reliable estimates of harvest losses; i.e., the estimates from the eight samples may be insufficient, as the actual value may vary by approximately 45%, which results in low sampling reliability (Figure 1). This analysis was not carried out for the other locations due to the low number of samples evaluated, which was limited by the header width of the harvesters used, thus making the use of the maximum curvature method unfeasible.
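One way to locate a point of maximum curvature on a curve of required samples against D% is sketched below in Python. This is only an illustration: the text does not state which curvature formula was used, so the sketch assumes the uncorrected relation n(D) = k / D², with k = (t × CV%)², and the standard plane-curve curvature evaluated on a grid of D% values. The t and CV values are hypothetical, and the result depends on the units in which both axes are expressed.

```python
def curvature(k, d):
    """Curvature of n(D) = k / D**2 at D = d (axes in their plotted units)."""
    first = -2 * k / d ** 3    # dn/dD
    second = 6 * k / d ** 4    # d2n/dD2
    return abs(second) / (1 + first ** 2) ** 1.5


def max_curvature_d(k, d_grid):
    """Grid value of D% at which the curvature of n(D) is largest."""
    return max(d_grid, key=lambda d: curvature(k, d))


# Hypothetical inputs (not values from the study).
t_value, cv_percent = 2.03, 80.0
k = (t_value * cv_percent) ** 2
d_grid = [d / 10 for d in range(10, 1001)]  # D% from 1.0 to 100.0 in steps of 0.1
best_d = max_curvature_d(k, d_grid)
n_at_best_d = k / best_d ** 2
```

For this particular curve, the curvature has a single maximum at D = (5k²)^(1/6), so the grid search can be checked against the closed form. A high CV pushes that point toward large D% values, which illustrates how the method can select a sampling intensity with low reliability.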
Several studies on sampling intensity estimates for different crops have indicated a D% equal to or less than 20% to obtain good reliability of results (Sari et al., 2020; Tartaglia et al., 2021; Lambrecht et al., 2022) (Figure 1). The distribution of grain losses in Location 1 (harvester John Deere S790) showed losses concentrated at the ends and center of the harvester, which had the largest header (Figure 2A). A greater oscillation at the harvester header ends is usual and favors grain losses. The greater amount of grains lost on the area corresponding to the passage of the central region of the harvester header may be due to the threshing process, which is usually responsible for approximately 15 to 20% of the total losses (Mesquita & Costa, 2006).
A pattern of grain loss distribution was found for Locations 2, 3, and 4, as large amounts of grains were lost during the harvesting process in those locations, and the maximum mean harvest losses found were 70, 70, and 80 grains per sample, respectively (Figures 2B, C, and D). The smaller harvesting width of the harvester used in these locations requires better inspection, measurement, and adjustment of all harvester components to optimize grain collection and, consequently, reduce harvest losses. The harvest losses in Location 5 (harvester New Holland TC57) did not follow any distribution pattern along the harvesting width. However, the lowest numbers of grains lost during harvesting were found for this location, indicating a good previous setting of the harvester, making it more efficient in collecting the grains (Figure 2E).
In this context, monitoring, correct sampling, and adequate gauging and adjustment of all harvester components whenever necessary are highly important for keeping the harvest losses within the tolerated limit, which is 60 kg ha⁻¹ (Silveira & Conte, 2013). Such actions improve the efficiency of the harvesting process, with a reduction in grain losses and greater durability of the harvester, thus increasing the profitability of growers.
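To relate grain counts per frame to the tolerated limit of 60 kg ha⁻¹, a conversion such as the following Python sketch can be used. The thousand-grain mass of 150 g is a hypothetical value chosen only for illustration (it is not reported in the text and varies with cultivar and moisture), and the function name is likewise hypothetical.

```python
def loss_kg_per_ha(grains_per_frame, frame_area_m2=0.25, thousand_grain_mass_g=150.0):
    """Convert a grain count in one frame to a harvest loss in kg per hectare.

    thousand_grain_mass_g is an assumed value, not a figure from the study.
    """
    grains_per_m2 = grains_per_frame / frame_area_m2
    grams_per_m2 = grains_per_m2 * thousand_grain_mass_g / 1000
    return grams_per_m2 * 10_000 / 1000  # g per m2 -> kg per ha


loss_at_70_grains = loss_kg_per_ha(70)  # 70 grains in a 50 x 50 cm frame
loss_at_10_grains = loss_kg_per_ha(10)
```

Under the assumed grain mass, about 10 grains per 50 × 50 cm frame already corresponds to the 60 kg ha⁻¹ limit, so mean counts of 70 to 80 grains per frame would be several times the tolerated loss.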
Harvest losses can occur due to several factors, such as cultivar characteristics, incorrect soil preparation, irregular terrain that makes cutting the plants difficult, presence of weeds, harvest delay, natural threshing, inadequate grain moisture, or even improper adjustment and operation of the harvester (Aguila et al., 2011).
Soybean crops have several purposes, such as production of seeds for new crops; in this case, seed quality is highly important at planting for good crop development in the field. Accordingly, greater grain losses by the harvester are acceptable to avoid mechanical damage to the grains and not compromise the physiological quality of the seeds (Mathias et al., 2017). In this context, the harvester must be adjusted according to the purpose of the crop, genetic material, grain moisture, and harvester operating speed (Chioderoli et al., 2012).
According to Mesquita & Costa (2006), approximately 80% of grain losses by harvesters occur on the cutterbar, and they can be higher when using a reel with unregulated height or speed. Losses caused by the harvester's internal components during threshing, grain separation, and cleaning processes tend to be smaller: approximately 15 to 20% of the total losses.
Holtz & Reis (2013) evaluated quantitative and qualitative soybean harvest losses and found increases in harvest losses as a function of the time of day. Losses on the harvester header were lower in the morning period when compared to the afternoon, probably due to higher moisture on the plants, which prevents the opening of the pods by the header components. However, the total losses and losses in the internal mechanisms did not differ statistically between the morning and afternoon periods. The dry straw conditions and the contact of the reel with other components of the harvester header can cause threshing of some pods and dispersion of seeds across the field. However, under these conditions, the threshing of the pods is very smooth, which justifies lower losses.
Thus, determining the correct sample size and sampling intensity is highly important for reliable evaluation and estimation of harvest losses, in addition to contributing to adequate adjustments of the harvester and of all the processes involved in the harvest for greater efficiency in grain collection.

Conclusions
1. Estimating soybean harvest losses for 62-foot harvester headers requires a sampling intensity of 16 samples and a sample size of 50 × 50 cm, making it possible to sample only half of the harvester header width, with a semi-amplitude of the confidence interval of the mean (D%) of 20%.
2. Estimating soybean harvest losses for 12.5- and 17-foot harvester headers requires sampling along the entire header width, with a D% of 20%.

Figure 1. Number of samples required to quantify harvest losses for the harvester John Deere S790 in Location 1 (Cruz Alta, RS, Brazil) as a function of the semi-amplitude of the confidence interval (D%)

Figure 2. Mean distribution of harvest losses in number of grains along the harvester header width, for 25 replications with a sample size of 50 × 50 cm in each location; Location 1 (A), Location 2 (B), Location 3 (C), Location 4 (D), and Location 5 (E), RS, Brazil