DEVELOPMENT OF A TECHNOLOGICAL INDEX FOR THE ASSESSMENT OF THE BEEF PRODUCTION SYSTEMS OF THE VERMELHO RIVER BASIN IN GOIÁS, BRAZIL

This study analyzed the productive strategies and technology of beef producers in the Vermelho basin in Goiás, Brazil. The data were used to develop a technological index, applicable to the local beef production systems. The data were obtained using questionnaires. A set of 60 properties was selected to provide a representative sample of the relief and soil quality within the study area. The data were analyzed using multiple correspondence, cluster analysis, and beta regression procedures. The variables that most contributed to the definition of the technological level were identified. The variables and production units each formed three clusters, corresponding to three levels of technology: low, mid, and high. The data were used to calculate a predictive index for the analysis and mapping of the technology used in the study area. High cattle densities were found in systems with low technology, indicating low productivity and profitability, and reduced environmental sustainability.


INTRODUCTION
The production of beef cattle in Brazil has advanced considerably over the past two decades through the improvement of rearing and management systems, the specialized management of large-scale, intensive methods, and investments in more efficient productive processes. Even so,

Study area
The Vermelho River basin is located in western Goiás, central Brazil (Fig. 1), within the Cerrado savanna biome, with a total area of 10,938.1 km². The Vermelho River originates at an altitude of 830 m above sea level (asl), 17 km from the town of Goiás, and discharges into the right margin of the Araguaia River, at an altitude of 220 m asl, in the municipality of Aruanã. The Vermelho is one of the twelve most important tributaries of the Araguaia (Machado & Lima, 2011). For analysis, the Vermelho basin was divided into three sectors, according to its slopes and topographic profile, in order to guarantee representative samples of the conditions of relief and soils, which may influence the technological strategies adopted by the local producers. The different sectors of the basin were characterized and the sampling points were generated using topographic images of high spatial resolution (30 m) from the SRTM (Shuttle Radar Topography Mission), while the limits of the basin were generated using RapidEye (2012) satellite images, with a spatial resolution of 5 m. The three sectors were defined as (a) Upper basin, with undulating to hilly relief, reflected in a greater variation in soil types, (b) Mid basin, with rolling to undulating relief, and (c) Lower basin, with flat to rolling relief.
Beef production in the region is based on distinct patterns of landholding in the three different sectors. In the upper basin, more than half (52.2%) of the properties are considered to be small, that is, with an area of no more than 50 hectares. This reflects the natural local conditions, which are less favorable for mechanization, given the steeper slopes and the shallower soils, which are relatively rocky. In the flatter lower basin, which is more appropriate for mechanization, a much larger proportion (68.3%) of the properties are classified as large (> 2500 ha).
The Vermelho basin is a characteristic Cerrado region, dominated by both planted (67.6% of the total area) and natural pastures destined for the production of beef. In 2016, approximately 83% of the local stock (1,945,716 head) was made up of beef cattle (IBGE, 2006; IMB, 2017). Other components of the production chain are present in the region, including cooperatives and companies that provide technical assistance, feed and supplies, and the region is relatively close to the state capital (Goiânia), which facilitates marketing of the produce. Beef is also produced by confinement, including a JBS facility in the municipality of Aruanã, which is the firm's second largest, and has been projected to confine 70,000 animals per annum. There are also three meatpacking plants in the basin, in particular the JBS installation in Santa Fé de Goiás, which is responsible for the basin's exports, together with other plants in neighboring regions, including Goiânia, which also purchase cattle (ABIEC, 2014; SEFAZ, 2014; JBS, 2016).

Data collection
Data were collected using a structured questionnaire, which was applied by the authors, in situ, to the administrators of 60 beef producing operations, following the signing of an informed consent form, in July 2016. The questionnaire was designed to cover socioeconomic factors, as well as the description of production systems and pasture management techniques, with the data being validated by the producers and specialists.
The questionnaire was designed in accordance with the objectives of the present study, and with other studies within the scope of the integrated project, which focuses on Brazilian pastures. The information obtained using this questionnaire was used to compile indices of the technology used in the beef production systems, based on three principal groups (see Appendix I): feed (pasture and supplements), machinery and equipment, and management.
The sample points were selected randomly within each sector of the basin, considering the variation in slope, per pixel (area), using the "Zonal Statistics as Table" tool in ArcGIS. The number of points was determined by relating the number of pixels (area) associated with each level of slope variance to the total area of the Vermelho basin.

Data analysis
The data were analyzed using a multivariate approach appropriate for the type of qualitative information collected during the survey (Kubrusly, 2001). A data matrix was prepared in Excel, in which the rows represented the 60 cattle ranches and the columns represented the categorical variables recorded during the collection of the data. This matrix was used to run multiple correspondence and cluster analyses, with the predictive validity of the results for the remaining properties of the study area being evaluated using a beta regression. The analyses were run in the R software.

Multiple Correspondence Analysis (MCA)
Correspondence Analysis (CA) is an exploratory technique for the analysis of the interdependence of categorical variables in non-linear relationships. It permits the multivariate analysis of the data by reducing their intrinsic multidimensionality to an optimal (two-dimensional) space, which allows the graphical representation of the individual sample units or the variables, as well as descriptive statistics (Escofier & Pagés, 1994). The data can be arranged in a rectangular, disjunctive matrix (Hair Jr et al., 2005; Pagés, 2014), which permits the application of contingency tables of frequencies, without the need for probabilistic models or distributions to generalize the results (Guedes et al., 2008). The MCA is often applied to the analysis of data collected using interviews or questionnaires, in which the questions represent the variables, with the objective of reducing the total number of variables into dimensions/indices which express outlying values (Osborne & Costello, 2005). In this approach, the number of principal components obtained by the analysis is equal to the number of eigenvalues, which cannot exceed the smaller of the number of rows and the number of columns of the matrix.
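As an illustration of the disjunctive coding described above, the following minimal Python sketch builds a complete disjunctive (indicator) matrix and the corresponding Burt matrix from categorical records. The original analyses were run in R; the function names and the four-ranch toy data set below are hypothetical and serve only to show the structure of the two matrices.

```python
def disjunctive_matrix(records):
    """One-hot (complete disjunctive) coding of categorical records.

    records: list of dicts mapping variable name -> category label.
    Returns (columns, Z), where columns lists the (variable, category)
    pairs and Z holds one 0/1 row per record.
    """
    columns = []
    for var in records[0]:
        categories = sorted({rec[var] for rec in records})
        columns.extend((var, cat) for cat in categories)
    Z = [[1 if rec[var] == cat else 0 for var, cat in columns]
         for rec in records]
    return columns, Z


def burt_matrix(Z):
    """Burt matrix B = Z'Z: all two-way cross-tabulations of the categories."""
    n_cols = len(Z[0])
    return [[sum(row[a] * row[b] for row in Z) for b in range(n_cols)]
            for a in range(n_cols)]


# Hypothetical toy data: 4 ranches, 2 binary variables
# (uc = lime applied, uf = fertilizer used; 1 = yes, 0 = no).
ranches = [
    {"uc": 1, "uf": 1},
    {"uc": 1, "uf": 0},
    {"uc": 0, "uf": 0},
    {"uc": 0, "uf": 0},
]
columns, Z = disjunctive_matrix(ranches)
B = burt_matrix(Z)
```

Each row of Z sums to the number of variables Q, because every ranch falls into exactly one category per variable. The diagonal of B holds the category frequencies.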
The MCA can be derived from a complete matrix or a Burt matrix. When the complete matrix is used, k binary columns are used to represent k categories, although this approach may generate artificial dimensions, given that a single variable is being represented by k dimensions. In this case, the total inertia is inflated, resulting in the underestimation of the variance explained by each of the principal axes. To resolve this problem, the eigenvalues were adjusted using the formula of Benzécri (Greenacre, 2007):
$$\lambda_\alpha^{adj} = \left(\frac{Q}{Q-1}\right)^{2} \left(\lambda_\alpha - \frac{1}{Q}\right)^{2}, \qquad \lambda_\alpha > \frac{1}{Q},$$
where $\lambda_\alpha$ denotes the α-th principal inertia of the complete disjunctive matrix, $Q$ denotes the number of variables, and the threshold $1/Q$ is the mean inertia of the complete disjunctive matrix.
When the inertias are re-adjusted in the Burt matrix, it is necessary to consider $\lambda_\alpha = \sqrt{\lambda_b}$, where $\lambda_b$ is the b-th principal inertia of the Burt matrix. Camiz & Gomes (2016) provide a complete description of the Benzécri formula.
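The Benzécri adjustment can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: the function names and the example inertias are hypothetical, and the percentages are computed relative to the sum of the adjusted inertias, which is an assumption about how the explained variance was expressed.

```python
import math


def benzecri_adjust(indicator_inertias, n_variables):
    """Apply the Benzécri adjustment to the principal inertias of the
    complete disjunctive matrix. Axes at or below the 1/Q threshold
    are discarded."""
    q = n_variables
    threshold = 1.0 / q
    adjusted = [((q / (q - 1)) * (lam - threshold)) ** 2
                for lam in indicator_inertias if lam > threshold]
    total = sum(adjusted)
    # Assumption: shares are taken relative to the sum of adjusted inertias.
    shares = [100.0 * a / total for a in adjusted]
    return adjusted, shares


def indicator_from_burt(burt_inertias):
    """Principal inertias of the complete matrix are the square roots of
    the principal inertias of the Burt matrix."""
    return [math.sqrt(lam_b) for lam_b in burt_inertias]


# Hypothetical example with Q = 3 variables.
adjusted, shares = benzecri_adjust([0.6, 0.4, 0.3], n_variables=3)
```

In this toy example the third axis (0.3) lies below the threshold 1/Q = 1/3 and is dropped, so only two adjusted inertias remain.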
The interpretation of the MCA is based on the graphical representation of the points and variables, which reveals aspects of the relationships among the different variables that the statistics are unable to show. Each categorical variable is represented by a point, and the proximity between any two points represents the degree of association between the corresponding variables. Each dimension has an eigenvalue, which indicates the relative contribution of this dimension to the total variance in the categories, also known as a measure of inertia. This technique has significant advantages for the exploratory analysis of categorical data, although caution is required for the extraction of inferences from the results. The criterion for the reduction of the number of dimensions depends on the application of relatively reliable procedures and the technique is sensitive to outlying data, while the interpretability of the data representation may depend on the experience of the researcher (Hair Jr et al., 2005; Pagés, 2014).
Multiple Correspondence Analysis (MCA) was applied in the present study with the aim of establishing the relationships among the 60 production units and the 18 qualitative variables, with a total of 35 categories, which were selected as being representative of the technological level of the productive systems (Mangabeira, 2002).

Cluster Analysis
Cluster analyses apply a set of multivariate techniques to the evaluation of the combined similarities and dissimilarities of a set of variables, establishing clusters and representing the characteristics of the relationships among these different clusters (Hair Jr et al., 2005). However, cluster analysis demands a degree of care in relation to the characteristics of the sample: (a) there should be no outlying data points, and (b) the data must be representative of the population, to guarantee valid results (Punj & Stewart, 1983).
The similarities between the samples can be measured by a number of different methods, including measures of correlation and distance (both based on continuous variables), and measures of association, for categorical variables. In the present study, the vectors composed of the different components of the MCA (continuous variables) were used to represent the samples to be clustered. The hierarchy of the data clusters can be analyzed through either divisive or agglomerative methods. In the present study, a hierarchical agglomeration procedure was used to establish a tree-type structure, in which an initial cluster is established and then progressively compared and agglomerated with other clusters, according to their characteristics, generating a set of groups and subgroups, resulting in a dendrogram.
The centroid, Ward, median, complete, and single clustering approaches were tested, and the Ward method provided the most reliable interpretation of the data. The Ward (1963) algorithm was applied to form the clusters using a similarity approach, based on the sum of the squares of the errors within the groups, given by:
$$ESS_k = \sum_{i=1}^{n} x_i^{2} - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^{2},$$
where $k$ represents the cluster being analyzed, $n$ is the number of samples in cluster $k$, and $x_i$ is the i-th item of cluster $k$. The criterion for the formation of the clusters, in each iteration, depends on the lowest degree of deviation among the samples. The Ward approach minimizes the variability within groups, and maximizes the variability among groups.
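The Ward criterion can be illustrated with a small pure-Python sketch for one-dimensional coordinates, such as the scores on a single MCA dimension. This is a simplified illustration, not the R routine used in the study: at each step it merges the pair of clusters whose union increases the total error sum of squares the least. The function names and the coordinate values are hypothetical.

```python
def ess(values):
    """Error sum of squares of one cluster: sum(x^2) - (sum(x))^2 / n."""
    n = len(values)
    return sum(x * x for x in values) - sum(values) ** 2 / n


def ward_cluster(points, n_clusters):
    """Agglomerate 1-D points until n_clusters remain, merging at each step
    the two clusters whose union yields the smallest increase in ESS."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                increase = (ess(clusters[i] + clusters[j])
                            - ess(clusters[i]) - ess(clusters[j]))
                if best is None or increase < best[0]:
                    best = (increase, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical dimension-1 coordinates for seven production units.
coords = [-0.9, -0.8, -0.7, 0.1, 0.2, 0.9, 1.0]
groups = ward_cluster(coords, 3)
```

With these toy coordinates, the three resulting groups correspond to the negative, near-zero, and positive scores, mirroring the low, mid, and high ordering used in the text.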

Beta regression model
The beta regression is a statistical approach used to analyze the relationship between a single dependent or response variable, within the interval (0, 1), and a set of independent or predictor variables (Sant'Anna & Caten, 2010). In the present study, the response variable was obtained from the empirical cumulative distribution of the technological indices derived from the MCA, restricted to the interval (0, 1), which in its standard form is defined by:
$$\hat{F}(y) = \frac{1}{n}\sum_{i=1}^{n} \mathbb{1}(y_i \le y),$$
where $\mathbb{1}(\cdot)$ is the indicator function, $n$ is the number of ranches, and $y_i$ is the index recorded on ranch $i$. The independent variables are the categorical variables that characterize the production system.
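The transformation of the MCA scores into a response on the unit interval can be sketched as follows. The first function is the standard empirical cumulative distribution. The second, which divides by n + 1 so that no value equals exactly 1, is one common way of keeping the response strictly inside (0, 1). It is an assumption, as the text does not state how the end point was handled. The scores are hypothetical.

```python
def empirical_cdf(values):
    """Standard empirical CDF evaluated at each observation:
    F(y_i) = #{j : y_j <= y_i} / n."""
    n = len(values)
    return [sum(1 for v in values if v <= y) / n for y in values]


def empirical_cdf_open(values):
    """Variant dividing by n + 1, so every value lies strictly in (0, 1).
    Assumption: the source does not specify this rescaling."""
    n = len(values)
    return [sum(1 for v in values if v <= y) / (n + 1) for y in values]


# Hypothetical dimension-1 scores for four ranches.
scores = [-0.8, 0.3, -0.1, 1.2]
standard = empirical_cdf(scores)       # the largest score maps to 1.0
rescaled = empirical_cdf_open(scores)  # all values strictly below 1
```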
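The probability density function for the family of beta distributions (Ferrari & Cribari-Neto, 2004) is given by:
$$f(y; \mu, \phi) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma((1-\mu)\phi)}\; y^{\mu\phi - 1} (1-y)^{(1-\mu)\phi - 1}, \qquad 0 < y < 1, \qquad (3)$$
with the parameters $0 < \mu < 1$ and $\phi > 0$, while $\Gamma(\cdot)$ is the gamma function.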
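The mean-precision parameterization of the beta density can be checked numerically with a short sketch. The function name is hypothetical. The calculation is done on the log scale with `math.lgamma` to avoid overflow for large values of φ.

```python
import math


def beta_density(y, mu, phi):
    """Beta density in the mean (mu) / precision (phi) parameterization."""
    if not (0.0 < y < 1.0 and 0.0 < mu < 1.0 and phi > 0.0):
        raise ValueError("require 0 < y < 1, 0 < mu < 1 and phi > 0")
    p = mu * phi            # first shape parameter
    q = (1.0 - mu) * phi    # second shape parameter
    log_f = (math.lgamma(phi) - math.lgamma(p) - math.lgamma(q)
             + (p - 1.0) * math.log(y) + (q - 1.0) * math.log(1.0 - y))
    return math.exp(log_f)


# mu = 0.5 and phi = 2 give shape parameters (1, 1), the uniform density.
uniform_value = beta_density(0.3, mu=0.5, phi=2.0)
# mu = 0.5 and phi = 4 give shape parameters (2, 2): f(y) = 6 y (1 - y).
symmetric_value = beta_density(0.5, mu=0.5, phi=4.0)
```

Larger values of φ concentrate the density around μ, which is why φ is interpreted as a precision parameter.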
If the random variables $y_1, y_2, \ldots, y_n$ are independent, with density given by (3), with a mean $\mu_t$, for $t = 1, \ldots, n$, and unknown $\phi$, the beta regression model is obtained assuming that the mean $\mu_t$ can be written using a logit link function, given by:
$$\log\left(\frac{\mu_t}{1-\mu_t}\right) = \beta_0 + \sum_{i} \beta_i X_i,$$
where $\beta_0$ is the intercept, $\beta_i$ is the i-th regression coefficient and $X_i$ is the i-th independent variable.
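The $(1 - \alpha) \times 100\%$ confidence intervals for the regression coefficients are given by:
$$\hat{\beta}_i \pm \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) se(\hat{\beta}_i),$$
where $se(\hat{\beta}_i)$ is the estimated standard error of $\hat{\beta}_i$ and $\Phi^{-1}(\cdot)$ is the inverse of the cumulative distribution function of the standard normal distribution.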
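The logit link and the normal-based confidence interval can be sketched with the Python standard library. The function names, the coefficient estimate, and the standard error below are hypothetical and are not taken from Table 1.

```python
import math
from statistics import NormalDist


def logit(mu):
    """Logit link: maps a mean in (0, 1) to the real line."""
    return math.log(mu / (1.0 - mu))


def inv_logit(eta):
    """Inverse link: maps a linear predictor back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def wald_interval(estimate, std_error, alpha=0.05):
    """(1 - alpha) x 100% interval based on the standard normal quantile."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical coefficient estimate and standard error.
lower, upper = wald_interval(0.5, 0.1)
```

A coefficient is significant at level α when this interval excludes zero, which is equivalent to the usual z-test on the coefficient.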

RESULTS AND DISCUSSION
The Multiple Correspondence Analysis generated eight dimensions, derived from the eight eigenvalues, of which components 1 and 2 (Fig. 2) together account for 72.40% of the total explained variance, with the other dimensions making a relatively minor contribution. Given that the objective of the present study was to obtain a metrical index of technological development, based on the synthesis of the set of categorical variables obtained on the beef production systems, component 1 (dimension 1) was adopted here, considering its much greater (59.9%) explanation of the variance, given that component 2 (dimension 2) explained only 12.5% (Fig. 2). The contribution of each variable to the total variation (inertia) of each dimension indicates which variables are the most relevant for the formation of that dimension (Appendix II). The most relevant technological variables for the construction of dimension 1 were: the application of lime (uc:1), which explained 9.14% of the total variation of dimension 1, the use of fertilizers (uf:1), with 8.07%, the planning of the breeding season (tem:1), with 6.34%, and the analysis of the soil (as:1), with 5.77%. The technological variables least relevant to the technological index were: all the cattle density indices (tl:0 with 0.35%; tl:1 with 0.02%; tl:2 with 0.4%; and tl:3 with 0.31%), the annual pasture rotation system, whether present or absent (sp:1 with 0.08% and sp:2 with 0.07%), and provisioning with 0.5-1 kg of feed per head per day (pmmp:2 with 0.05%).
The principal coordinates of dimension 1 for the variables and production units (ranches) were used for the cluster analysis, which provided a better visualization of the distribution of the variables (Fig. 3) and the ranches (Fig. 4). The cluster analysis of the variables indicated the presence of three significant groupings, the first cluster being formed by the variables that represent low levels of technology, including the highest cattle densities (Fig. 3). This association of low technological variables can also be observed in dimension 1 of the MCA (Fig. 2), in the quadrant of values below zero, which represents the absence of technological practices in the production system. These variables include, in particular, variables related to provisioning (pasture and salt only throughout the year [psm:1], and pasture, salt, and urea during the dry season [psmu:1]), which are extremely inefficient, principally during the dry season. In this quadrant, it is interesting to note the presence of the highest cattle density (tl:3), with two or more animals per hectare, which indicates that this density exceeds the support capacity of the pasture, resulting in inefficient production and reduced profitability, in addition to the degradation of the environment.
The second cluster encompasses the mid-level technological variables, with investments in technology for the establishment and maintenance of pasture. These variables include the application of fertilizers (uf:1), analysis of the soil (as:1), application of lime (uc:1), the bromatological analysis of the grass (ap:1), the use of high-powered tractors (> 100 hp), and the planning of the breeding season (tem:1). This group can be observed to the right of the (0.0; 0.6) interval in the MCA (Fig. 2), with the high-powered tractors (pt100CV:2) grouping together with the variables related to the establishment and maintenance of the pasture.
The third cluster refers to the highest technological level observed on the region's beef production units (ranches), which includes all the variables related to provisioning (pmmp:1; 2; and 3), water supplies (sga:1), technical assistance (at:1), the use of low-powered tractors (< 100 hp), dietary supplements (ui:1), annual pasture rotation (sp:1) or no rotation (sp:2), and cattle densities of 0.6-1.0 individuals/hectare (tl:1) and 1.1-2.0 individuals/hectare (tl:2). The inclusion of the two types of rotation (i.e., application or absence) indicates that the local producers who adopt supplementation avoid the need for pasture rotation strategies. One potential explanation for the indifferent effect of rotation is the predominance of supplementation during both the rainy and dry seasons, at all the concentrations evaluated. This high technology group can also be observed in the MCA (Fig. 2), with a greater concentration of the variables in the upper right quadrant, in the positive interval (0.0; 0.2).
The cluster analysis of the variables indicates that the highest levels of technology evaluated in the present study are related to the combination of supplementation with low (0.6-1.0 animals/ha) to medium (1.1-2.0 animals/ha) cattle densities. The production units based on the combination of pasture and supplementation operate the most intensive systems, in comparison with those that invest predominantly in the establishment of pasture.
As in the analysis of the variables, three distinct clusters were observed in the cluster analysis of the production units, i.e., the ranches (Fig. 4). The first cluster (32%) was composed of the low technology units, the second (43%) of the mid-level units, and the third (25%) of the high technology units. These three clusters were well-defined, both by the proximity of the units within each cluster, and the distances between clusters.
The parameters estimated by the beta regression are shown in Table 1. Only the highest cattle density (> 2 animals per hectare) is not statistically significant, even at the 10% significance level. This indicates that the technological indices estimated for the ranches with the highest cattle densities (tl:3) do not vary significantly in comparison with those with the lowest densities (tl:0), the reference category for this analysis, with all other variables held constant. For all other variables of the model, there is an increase in the index in comparison with the reference category, when all others are held constant. The low standard error obtained for all the coefficients, except tl:3, indicates that the estimates would vary little if the model were fitted to a different sample of the same size from this population. The model explains almost all the variance (adjusted R² = 98.09%), which indicates that the variables analyzed in the regression have a high degree of explanatory power for the definition of the technological index.
Obs.: pmmp0; sga0; uc0; uf0; uh0; ui0; tem0; at0; as0; ap0; tl0 are reference categories, used for the calculation and interpretation of the model parameters.
The coefficients of the beta regression were used to calculate the Technology Index (TI), based on the variables with significant explanatory value, which define the productive units of the Vermelho basin. The TI is obtained by:
$$TI = 100\,\frac{e^{X}}{1 + e^{X}}, \qquad X = -3.33 + \sum_{i} \beta_i x_i,$$
where −3.33 is the constant term, $\beta_i$ is the i-th estimated coefficient and $x_i \in \{0, 1\}$ is the i-th binary variable that indicates the technological strategies used by each beef-producing unit (0 = absent, 1 = present). Assuming that the sample used to create the index is representative of the production units in the Vermelho basin, and given the inferential capacity of the beta regression model, it is reasonable to conclude that the TI equation provides a reliable estimate of the technological level of any production unit within the Vermelho basin.
The spatial distribution of the production units according to their technological level, or TI values (Fig. 5 and Appendix III), can be compared with the natural conditions found within each sector of the basin, providing an empirical baseline for the development of further research into the region's beef production systems. The findings of the present study indicate that the lower and middle sectors of the Vermelho River basin encompass a larger number of production units with mid to high indices of technology. These sectors are characterized by relief and soil quality that are more appropriate for the implementation of technological practices in comparison with the upper basin, which is reflected in their higher technological indices. These sectors are also characterized by the largest properties, according to the Agricultural Census of 2006, which raises a number of questions with regard to the relationship between natural conditions and the size (area) of the property, and their influence on technological practices.
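The calculation of the TI for a single production unit can be sketched as follows. Only the constant −3.33 comes from the text. The coefficient values in the dictionary are hypothetical placeholders, not the estimates of Table 1, and the function name is likewise illustrative.

```python
import math

INTERCEPT = -3.33  # constant term reported in the text

# Hypothetical coefficients for three practices (NOT the Table 1 estimates):
# uc = lime applied, uf = fertilizer used, as = soil analysis.
COEFFICIENTS = {"uc": 0.9, "uf": 0.8, "as": 0.6}


def technology_index(practices, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """TI = 100 * exp(X) / (1 + exp(X)), with X = intercept + sum(beta_i * x_i).

    practices: dict mapping practice code -> 0 (absent) or 1 (present);
    practices not listed are treated as absent.
    """
    x = intercept + sum(beta * practices.get(name, 0)
                        for name, beta in coefficients.items())
    return 100.0 * math.exp(x) / (1.0 + math.exp(x))


baseline = technology_index({})                          # no practice adopted
with_lime = technology_index({"uc": 1})                  # lime only
all_three = technology_index({"uc": 1, "uf": 1, "as": 1})
```

With no practice adopted, X reduces to the constant term and the index is roughly 3.5. Because the logistic transformation is bounded, the index always lies between 0 and 100, and each adopted practice with a positive coefficient raises it.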

CONCLUSION
The Multiple Correspondence Analysis provided a reliable approach for the evaluation of the combined set of technological variables, producing a metrical index derived from a range of qualitative variables that are important for production systems. Combined with the cluster analysis, it also revealed the relationships within and among the different groups of variables and production units. The beta regression analysis provided a predictive equation for binary variables, with a high goodness of fit (R² = 98.09%), allowing for reliable generalizations to the other production systems of the Vermelho basin.
The Multiple Correspondence Analysis identified the principal variables that contributed to the formation of the dimension (principal component 1, which explains 59.9% of the total variance) that represents the technological level of the production units. The dendrograms produced by the cluster analyses of the variables and the production units were each characterized by three well-separated, homogeneous clusters, consistent with low, mid, and high levels of technology, reflecting the management strategies defined in the interviews with local producers. In particular, the present study indicated that the highest cattle densities (> 2 animals/hectare) can be found on ranches with the lowest technology level, resulting in low productivity and profitability, in addition to impacts on the environment.
The technological index, derived from the measurement of categorical variables, provides an important database for the establishment of public policies for the sustainable development of the Vermelho River basin, the training of personnel operating in the beef cattle sector, and the organization of the productive chain within a given region. These applications should take into account the specific characteristics of the technological strategies applied in the region, which are determined by the relationship among the principal variables, that is, climate-soil-plant-animal-management.

Figure 2 - The MCA based on the adjusted Burt matrix for the technological variables of the beef production systems of the Vermelho River basin, Goiás, Brazil - 2016.

Figure 3 - Dendrogram of the technological variables, derived from dimension 1 of the MCA for the beef production systems of the Vermelho River basin, Goiás, Brazil - 2016.

Figure 4 - Dendrogram of the ranches, derived from dimension 1 of the MCA for the beef production systems of the Vermelho River basin, Goiás, Brazil - 2016.

Figure 5 - Spatial distribution of the beef production units of the Vermelho basin, in Goiás, Brazil, showing their technological indices (TIs) - 2016.

Table 1 - Estimates of the parameters of the fitted beta regression model of the characteristics of the beef production systems of the Vermelho River basin in Goiás, Brazil.
Source: results of the present study (2016).

Appendix II - Absolute (%) contribution of each variable to the total variation (inertia) of the first two axes of the Multiple Correspondence Analysis of technological levels in the beef production sector of the Vermelho River basin in Goiás, Brazil, in 2016.
Observation: the absolute contribution is expressed as the percentage of the inertia explained by each variable that makes up each factor.

Appendix III - Technological index recorded for each beef production unit in the Vermelho River basin in Goiás, Brazil (July, 2016).
Source: results of the present study - Vermelho River basin, Goiás, Brazil (2016).

REGINA DE OLIVEIRA et al. Pesquisa Operacional, Vol. 38(1), 2018.