Modeling multivariate time series with copulas: Implications for pricing revenue insurance

Much of the soybean produced in Brazil is exported and, consequently, the domestic soybean price (R$) is greatly influenced by the price traded at the Chicago Mercantile Exchange Group (CME Group) (US$). Therefore, to model the dependency structure between soybean yield and price, the exchange rate must be incorporated into the modeling. This study aims to model the dependency structure between these three variables using the Copula methodology, calculate crop revenue insurance rates, and compare them with the rates offered in the insurance market. The rates applied by the Brazilian insurance market are overpriced when compared with those obtained by the methodology presented in this study, which incorporates the dollar exchange rate into the modeling. This overpricing could aggravate the problem of adverse selection and hamper the widespread adoption of agricultural insurance in the Brazilian territory.


Introduction
In the work of Duarte et al. (2017), we analyzed the dependency structure between yield and monthly nominal average prices (in Reais) received by soybean producers in the state of Paraná. This spot price is collected from regional centers and weighted by the relationship between production in the region and state, and then the simple arithmetic average is used to calculate the monthly average price.
The purpose of a transaction in the spot (or physical) market is to make an immediate purchase and/or sale. Thus, the delivery and payment of the product occur concomitantly. This type of market is typically sporadic and has a high degree of uncertainty regarding price behavior, supply regularity, and product quality. Therefore, a company that adopts this mechanism assumes a great risk, which may lead to failure in its operations due to uncertainties related to product demand and supply.
One way to reduce those risks is to use futures markets, a practice known as hedging, which is a price-locking operation for a future date. In most contracts, financial settlement occurs prior to the delivery date and does not require physical delivery to the agreed location.
For example, soybean produced in Brazil is usually traded at CBOT/CME; however, the physical delivery of the commodity is in China or the European Union.
Cooperatives, or trading companies, that trade Brazilian soybean use the futures market to price the commodity for future delivery. Thus, since much of the soybean produced in Paraná is exported, US dollar volatility has a great influence on the prices received by farmers and therefore directly affects their income.
Therefore, in the income insurance context it is important to study not only the effect of the dependency structure between agricultural yields and price but also when the exchange rate variable is incorporated in the analysis.
In Brazil, two insurance companies offer revenue insurance to farmers, from now on called insurer A and insurer B. Insurer A uses the annual productivity series in its pricing for each municipality and the daily physical price series (CEPEA/ESALQ indicator). This indicator represents the average trading value of Brazilian export-grade bulk soybeans, referenced at the Paranaguá Port, Paraná, Brazil.
On the other hand, insurer B uses CME future price series transformed into Reais.
In this case, the correlation between future price (CME price) and the dollar rate (EXG rate) is not taken into account and may lead to a reduction in the impact of price (CME).
To understand the variation and dependency of the exchange rate (EXG rate) in revenue modeling, this variable was incorporated into the model. This paper therefore presents an alternative approach that incorporates the EXG rate into the Copula model (three-dimensional approach). A study of the relationship among these three variables was not found in the agricultural risk literature, and understanding this relationship is the main contribution of the study. Recent articles in the agricultural risk context only consider the modeling of two variables, price and yield (Duarte and Ozaki, 2019).
This study modeled the dependency structure between soybean yield, CME price, and EXG rate through Copula functions.
To assess the impact of including the third variable, the analysis was performed under both the two-dimensional and the three-dimensional approaches. In addition, in both approaches, we estimated revenue insurance premium rates following Duarte et al. (2017) and compared them with the rates applied by the Brazilian insurance market.
In the next section, we present the calculation of the revenue insurance premium rate under the three-dimensional approach. Section 3 describes the data used in the study. Section 4 presents the Copula methodology for calculating the three-dimensional distribution and the models used for modeling marginal distributions. Section 5 presents the study results. Section 6 discusses the results from both approaches. Finally, the conclusions are presented in Section 7.

Revenue Insurance Premium Rate
In this study, we consider revenue $F$ as a function of three variables, soybean yield ($X$), USD price ($Y$), and EXG rate ($Z$), in the expression $F = XYZ$. Guaranteed yield is defined by $X_g = \lambda X_e$, where $0 < \lambda < 1$ is the coverage level (CL) chosen by the producer and $X_e$ is the expected yield, usually calculated as the average of the last five harvests. Guaranteed price is defined by $Y_g = \lambda Y_e$, where $0 < \lambda < 1$ is the CL and $Y_e$ is the expected price, usually calculated as the average of the futures price traded at CME over the last 15 days up to the day of contracting the insurance.
Similarly, we define the guaranteed EXG rate ($Z_g = \lambda Z_e$).
In this type of insurance, we assume that the compensation paid per unit of area to the rural producer is given by $I = \max\{x_g y_g z_g - xyz;\ 0\}$. Loss probability is given by $p = \Pr(X < x; Y < y; Z < z \mid xyz < x_g y_g z_g)$. This conditional probability expresses the cumulative joint probability of an event (loss) as a function of the random variables $X$, $Y$ and $Z$, since the choice of $XYZ$ values is restricted to the condition. Thus, the premium rate is based on the expected loss, $E[I] = E[\max\{x_g y_g z_g - xyz;\ 0\}]$. Therefore, the premium rate is given by the expected loss divided by the guaranteed revenue $x_g y_g z_g$.
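As a minimal sketch of this calculation, the snippet below estimates the loss probability and the pure premium rate from simulated revenues. It assumes that the rate is the mean indemnity divided by the guaranteed revenue; the function names and the numbers are hypothetical and serve only as an illustration.

```python
def loss_probability(revenues, guaranteed):
    """Share of simulated revenues that fall below the guaranteed revenue."""
    return sum(1 for r in revenues if r < guaranteed) / len(revenues)


def pure_premium_rate(revenues, guaranteed):
    """Mean indemnity max(guaranteed - revenue, 0) divided by the guarantee."""
    indemnities = [max(guaranteed - r, 0.0) for r in revenues]
    return (sum(indemnities) / len(indemnities)) / guaranteed


# Hypothetical revenue draws (each one a product x * y * z) and guarantee.
revenues = [80.0, 100.0, 120.0, 60.0]
guaranteed = 90.0
print(loss_probability(revenues, guaranteed))   # 0.5
print(pure_premium_rate(revenues, guaranteed))  # 10 / 90, about 0.111
```

In practice the revenue draws would come from the fitted copula and marginal models rather than from a short hand-written list.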
Note that the expected revenue is related to risks in marginal distributions of price, yield, and USD. The marginal distribution of yield and price is inferred from historical yield and price data using empirical and parametric methods. In turn, the construction of the joint distribution uses a parametric copula function. The theoretical concepts related to determining the premium rate of the two-dimensional revenue insurance were described by Miqueleto (2011) and Brisolara (2013).

Data Description
For yield modeling, we analyzed the annual grain yield series (kg/ha) of the municipalities [...] The daily exchange rate quote for the same period was taken from the Macroeconomic and Regional Database of the Institute of Applied Economic Research (Ipea, 2017), on Feb 05, 2017.
To obtain the same periodicity of price and EXG rate series, we simulated the soybean yield reaching 455 observations, thus allowing the use of the Copula methodology for the three variables with the same periodicity (455 observations each). In addition, a second analysis was performed simulating 20,000 observations of the three series from the adjusted Copula. Finally, the commercial rates charged by insurer A are available from MAPA (2017).

Methodology
The methodological steps of this work consist of:
• Step 1) Yield modeling. Agricultural yield has shown an upward trend over the years, due to the great advance in technologies used in crops, such as planting techniques, the use of inputs, and machinery, among others. Therefore, the deterministic procedure suggested by Gallagher (1987) and Tejeda and Goodwin (2008) is used to detrend the yield series.
• Step 2) CME price and exchange rate modeling. Soybean prices are less predictable than yield, due to the large international soybean market. Price and EXG rate series have much more complex dynamics because of a number of external factors. Domestic supply and demand issues also affect both variables. Two characteristics of these series are the trend and non-constant variance over time. Therefore, it was decided to work with the differenced series, transforming each series into a series of daily price adjustments at the end of the trading day, which removes the trend. In addition, to extract what is important for price risk, the volatility modeling of the marginal univariate series is incorporated into the final Copula model.
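The two preprocessing steps above can be sketched as follows. The detrending shown here is a simplified linear-trend version in which proportional deviations from the trend are rescaled to the last fitted level; whether this matches the exact procedure of Gallagher (1987) and Tejeda and Goodwin (2008) is an assumption, and all function names are hypothetical.

```python
def linear_trend(y):
    """Ordinary least squares fit of y on t = 0, 1, ..., n-1."""
    n = len(y)
    t_bar = (n - 1) / 2.0
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in range(n))
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    return [intercept + slope * t for t in range(n)]


def detrend_to_last_year(y):
    """Rescale each proportional deviation from trend to the last fitted level."""
    fitted = linear_trend(y)
    base = fitted[-1]
    return [base * (1.0 + (v - f) / f) for v, f in zip(y, fitted)]


def first_difference(series):
    """Daily adjustment d_t = Y_t - Y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical yields (kg/ha) and prices, for illustration only.
print(detrend_to_last_year([2500.0, 2800.0, 2700.0, 3100.0, 3300.0]))
print(first_difference([10.0, 10.5, 10.2]))
```

The differenced series has one observation fewer than the original series.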
To study the dependence structure of the variables in question, Copula functions are used to construct the multivariate function.In this work, two approaches were used: two-dimensional and three-dimensional.
In the two-dimensional approach, the CME price and EXG rate daily price series were multiplied and transformed into a single price series in Reais.
In this case, a null correlation was considered between these variables. This approach is similar to that used by insurer B when pricing agricultural revenue insurance.
In turn, the three-dimensional approach takes into account the correlation between the CME and EXG rate price variables. To understand the risk of Brazilian soybean price, influenced by exports and, consequently, international prices, it was decided to incorporate the exogenous exchange rate variable into the revenue insurance pricing model.

Yield Modeling
As previously described, soybean yield has increased over the years, mainly due to the technology employed in the field.
Therefore, in order to model the yield risk, we detrend the series through the deterministic procedure suggested by Gallagher (1987) and Tejeda and Goodwin (2008). For soybean yield modeling in Cascavel and Toledo, the Normal, Skew Normal, Odd Log Logistic Normal (OLLN), and Skew T distributions were used, as presented in Duarte et al. (2017).
The well-known Normal models were used as a first approach to modeling yields.
The description of the methodology of the Normal and Skew Normal distributions follows Botts and Boles (1958), Just and Weninger (1999), Azzalini (1985), and Ozaki and Silva (2009).
Figure 2 shows the density of the OLLN distribution for some values of the parameters µ, σ and α. Plots 2a and 2b show the contribution of the parameter α to the unimodality or bimodality of the distribution when µ and σ are fixed.
When the parameter α approaches zero, the pdf is bimodal. On the other hand, when the value of α increases, the pdf is unimodal. When µ varies, the plot is translated along the x-axis, regardless of its shape (Figure 2c).
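To illustrate how yields can be simulated from such a model, the sketch below assumes the usual odd log-logistic construction, in which the OLLN cdf is G^α / (G^α + (1 − G)^α) with G the Normal(µ, σ) cdf. This form is an assumption about the parameterization used in Duarte et al. (2017), and the parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def olln_cdf(x, mu, sigma, alpha):
    """Assumed OLLN cdf: G^alpha / (G^alpha + (1 - G)^alpha), G = Normal cdf."""
    g = NormalDist(mu, sigma).cdf(x)
    return g ** alpha / (g ** alpha + (1.0 - g) ** alpha)


def olln_quantile(u, mu, sigma, alpha):
    """Inverse of olln_cdf, obtained by solving the cdf equation for G."""
    w = u ** (1.0 / alpha)
    w_bar = (1.0 - u) ** (1.0 / alpha)
    return NormalDist(mu, sigma).inv_cdf(w / (w + w_bar))


def simulate_olln(n, mu, sigma, alpha, seed=0):
    """Inverse-transform sampling from the assumed OLLN distribution."""
    rng = random.Random(seed)
    return [olln_quantile(rng.random(), mu, sigma, alpha) for _ in range(n)]


# Hypothetical parameters, chosen only for illustration.
print(simulate_olln(5, mu=3400.0, sigma=330.0, alpha=0.5))
```

With α = 1 the expression reduces to the Normal cdf, which gives a simple sanity check of the implementation.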

Price and EXG rate Modeling
One of the goals in finance is the risk assessment of a portfolio of financial assets. In this study, the objective was to assess the risks of trading soybean commodity at the CME exchange. Risk is often measured in terms of asset price changes.
Suppose $Y_t$ is the commodity price at time t, usually a trading day at CME, or the EXG rate quote ($Z_t$). The price change between t and t − 1 is given by the difference $\Delta Y_t = Y_t - Y_{t-1} = d_t$, which corresponds to the daily price adjustment.
In finance, relative price variation is widely used, called the simple net return, $R_t = \Delta P_t / P_{t-1}$, or the log return, $r_t = \log(P_t / P_{t-1})$. According to Morettin (2008), it is preferable to work with returns, as they are free of scales and have statistical properties that are more interesting (such as stationarity and ergodicity).
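The two return definitions can be computed as in the short sketch below; the function names and the price values are hypothetical.

```python
import math


def simple_returns(prices):
    """Simple net returns R_t = (P_t - P_{t-1}) / P_{t-1}."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def log_returns(prices):
    """Log returns r_t = log(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


prices = [100.0, 110.0, 99.0]  # hypothetical prices
print(simple_returns(prices))
print(log_returns(prices))
```

The two are linked by r_t = log(1 + R_t), so they nearly coincide for small daily changes.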
The ARMA, ARCH, and GARCH family models are used to model returns. Returns usually have interesting features: returns are generally not autocorrelated, squared returns are autocorrelated, volatility appears in clusters over time, and the distribution of returns has heavier tails than the normal distribution. In addition, the distribution is generally leptokurtic, although approximately symmetrical. These facts are known as stylized facts concerning returns (Morettin, 2008).
Therefore, deterministic volatility can be calculated by the ARCH family models, which were introduced by Engle (1982) with the purpose of estimating the variance of inflation. The idea is that returns are serially uncorrelated; however, the conditional variance (volatility) depends on past returns through a quadratic function (Morettin, 2008).
A drawback of this model is that positive and negative returns are treated similarly, because the squared returns enter the volatility formula. Generally, volatility reacts differently to positive and negative returns. In addition, because the returns are squared, some large isolated values may lead to overestimated forecasts. Thus, ARCH models tend to overestimate future volatility, incorporating past extreme event outcomes (Morettin, 2008).
In these models, the errors $\varepsilon_t$ are iid with zero mean. It is generally assumed that the $\varepsilon_t$ are normally distributed, or follow a t-Student distribution or a generalized error distribution (GED), described in Nelson (1991). According to Morettin (2008), it is recommended in practice to define low order models, for example, GARCH (1,1), (1,2) or (2,1).
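For reference, a minimal sketch of the variance recursion of a GARCH (1,1) model is given below, in its standard textbook form r_t = σ_t ε_t with σ²_t = ω + α r²_{t−1} + β σ²_{t−1}. The symbols ω, α and β, the function names and the parameter values are assumptions introduced for illustration; parameter estimation is not shown.

```python
import math


def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances s2_t = omega + alpha * r_{t-1}^2 + beta * s2_{t-1}."""
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    sigma2 = [omega / (1.0 - alpha - beta)]  # start at the unconditional variance
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


def standardized_residuals(returns, sigma2):
    """Residuals r_t / sigma_t, the series that would feed the copula step."""
    return [r / math.sqrt(s) for r, s in zip(returns, sigma2)]


# Hypothetical daily adjustments and hypothetical parameter values.
returns = [0.2, -0.5, 0.1, 0.8, -0.3]
sigma2 = garch11_variance(returns, omega=0.05, alpha=0.10, beta=0.85)
print(standardized_residuals(returns, sigma2))
```

A large shock at time t − 1 raises the conditional variance at time t, which is how the recursion reproduces volatility clustering.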

Copulas
According to Joe (2014), a copula is a multivariate distribution in which all one-dimensional marginal distributions are uniform U(0,1). Consequently, if C is a Copula, then it is the distribution of a vector of dependent U(0,1) random variables. The result below is an extension of Sklar's theorem, described in Sklar (1959).
For a d-dimensional distribution $F$ with univariate margins $F_1, \ldots, F_d$, the copula associated with $F$ is a distribution function $C: [0,1]^d \to [0,1]$ with marginals U(0,1) that satisfies $F(y) = C(F_1(y_1), \ldots, F_d(y_d))$, $y \in \mathbb{R}^d$. (a) If $F$ is a continuous d-dimensional distribution with univariate margins $F_1, \ldots, F_d$ and quantile functions $F_1^{-1}, \ldots, F_d^{-1}$, then $C(u) = F(F_1^{-1}(u_1), \ldots, F_d^{-1}(u_d))$, $u \in [0,1]^d$, is the unique choice. The proof of this theorem can be found on page 8 of Joe (2014).
Copula C is parameterized by a parameter vector δ, called the multivariate dependency parameter.
Copula families are generally given as a cumulative distribution function, and the Copula density (needed for likelihood inference) is obtained by differentiation. If $C(u)$ is an absolutely continuous cumulative distribution, then its density function is given by $c(u) = \partial^d C(u) / (\partial u_1 \cdots \partial u_d)$, $u \in (0,1)^d$. There are different multidimensional Copulas, including the Gaussian, t-Student and Archimedean families (Cherubini et al., 2004).
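As an illustration of how a copula couples two margins, the sketch below implements the bivariate Frank Copula cdf in its standard form and builds a joint distribution from it. The exponential margin and the function names are hypothetical; the value of δ is the one reported later for the two-dimensional fit.

```python
import math


def frank_cdf(u, v, delta):
    """Bivariate Frank copula C(u, v; delta), defined for delta != 0."""
    num = (math.exp(-delta * u) - 1.0) * (math.exp(-delta * v) - 1.0)
    return -math.log(1.0 + num / (math.exp(-delta) - 1.0)) / delta


def joint_cdf(x, y, delta, margin_x, margin_y):
    """Joint cdf H(x, y) = C(F(x), G(y)) built from the copula and the margins."""
    return frank_cdf(margin_x(x), margin_y(y), delta)


def expo_cdf(x):
    """Hypothetical margin: unit exponential cdf."""
    return 1.0 - math.exp(-x)


delta = -0.3358
print(frank_cdf(0.5, 0.5, delta))           # below 0.25: negative dependence
print(joint_cdf(1.0, 2.0, delta, expo_cdf, expo_cdf))
```

Setting one argument to 1 returns the other argument, which is the uniform-margin property stated above.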

Inference of Multivariate Copulas via Pseudo-Likelihood
In the study of Copula dependence, there are several possibilities for choosing the marginal distribution and the Copula family. For continuous variables, the copula parameter can be estimated, and different families compared, using the pseudo maximum-likelihood approximation, also known as the canonical maximum likelihood, proposed by Genest et al. (1995).
This is a semi-parametric method that uses the empirical distributions $\hat{F}_1, \ldots, \hat{F}_d$ and maximizes $L_{pseudo}(\delta) = \sum_{i=1}^{n} \log c(\hat{F}_1(y_{i1}), \ldots, \hat{F}_d(y_{id}); \delta)$, where δ is the Copula parameter and $\hat{F}_j(y) = n^{-1}\{\sum_{i=1}^{n} I(y_{ij} \le y) - 0.5\}$.
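A minimal sketch of this estimation for a bivariate Frank Copula is shown below. It assumes no ties in the data, uses the standard Frank density, and replaces a numerical optimizer with a coarse grid search; the function names and the grid are hypothetical choices.

```python
import math


def pseudo_observations(values):
    """Rescaled ranks (rank - 0.5) / n, i.e. the shifted empirical cdf."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = (rank - 0.5) / n
    return u


def frank_log_density(u, v, delta):
    """Log of the Frank copula density, defined for delta != 0."""
    a = 1.0 - math.exp(-delta)
    num = delta * a * math.exp(-delta * (u + v))
    den = a - (1.0 - math.exp(-delta * u)) * (1.0 - math.exp(-delta * v))
    return math.log(num / (den * den))


def pseudo_loglik(delta, u, v):
    """Pseudo log-likelihood: sum of log densities at the pseudo-observations."""
    return sum(frank_log_density(ui, vi, delta) for ui, vi in zip(u, v))


def fit_frank(x, y, grid=None):
    """Grid-search maximizer of the pseudo log-likelihood over delta."""
    u, v = pseudo_observations(x), pseudo_observations(y)
    if grid is None:
        grid = [k / 10.0 for k in range(-100, 101) if k != 0]
    return max(grid, key=lambda d: pseudo_loglik(d, u, v))


x = [3.1, 0.4, 2.2, 1.7, 5.0, 4.3]  # hypothetical data
y = [2.9, 0.9, 1.1, 2.0, 4.1, 4.4]
print(fit_frank(x, y))
```

A positive estimate indicates positive dependence and a negative estimate indicates negative dependence; in applied work a numerical optimizer would replace the grid.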
According to Joe (2014), the pseudo-maximum likelihood with empirical distribution for marginal functions can only be implemented if the variables are continuous and there are no covariates or censored observations.
If the Copula model is specified correctly, then Genest et al. (1995) show that $\hat{\delta}_{pseudo}$ is consistent and asymptotically normal.
In addition, Chen and Fan (2006) proved other asymptotic properties when using semi-parametric inference in the time series context and developing estimation methods for one-dimensional copula and marginal functions. Copula selection under the pseudo-likelihood approach is commonly performed by comparing the AIC (Akaike, 1974) or BIC (Schwarz et al., 1978), based on the pseudo log-likelihood.
In this work, we chose this method for the estimation of Copula parameters.
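Copula selection by information criteria can be sketched as below, using the usual definitions AIC = −2ℓ + 2k and BIC = −2ℓ + k log n, with ℓ the maximized pseudo log-likelihood, k the number of copula parameters and n the number of observations. The log-likelihood values in the example are hypothetical and are not results of this study.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian (Schwarz) information criterion."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def select_by_aic(fits):
    """Name of the copula with the lowest AIC; fits maps name -> (loglik, k)."""
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical pseudo log-likelihoods and parameter counts.
fits = {"Frank": (1.2, 1), "Gaussian": (0.9, 1), "t-Student": (1.5, 2)}
print(select_by_aic(fits))
print({name: round(bic(ll, k, 454), 3) for name, (ll, k) in fits.items()})
```

Both criteria penalize additional parameters, and BIC does so more strongly as the sample grows.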

Graphical Tool to Detect Dependency - Chi-plot
In addition to the scatter plot, the Chi-plot is another graphical tool used in the literature to detect dependency between two variables, proposed by Fisher and Switzer (2001).
The description of this section is based on the works of Genest and Favre (2007) and Tursunalieva and Silvapulle (2007).The Chi-plot graph is based on the data ranks.
The Chi-plot is a scatter plot of the pairs $(\lambda_i, \chi_i)$, which are transformations of the n pairs $(X_i, Y_i)$. Let H be a continuous bivariate distribution function and F and G the marginal distributions of X and Y, respectively, which can be estimated for each pair $(X_i, Y_i)$ as follows: $H_i = \frac{1}{n-1} \sum_{j \ne i} I(X_j \le X_i, Y_j \le Y_i)$, $F_i = \frac{1}{n-1} \sum_{j \ne i} I(X_j \le X_i)$, and $G_i = \frac{1}{n-1} \sum_{j \ne i} I(Y_j \le Y_i)$, where I(E) denotes the indicator function of the event E. Fisher and Switzer (2001) propose plotting the pairs $(\lambda_i, \chi_i)$, defined by $\chi_i = \frac{H_i - F_i G_i}{\sqrt{F_i (1 - F_i) G_i (1 - G_i)}}$ and $\lambda_i = 4 S_i \max\{(F_i - 0.5)^2, (G_i - 0.5)^2\}$, where $S_i = \mathrm{sign}\{(F_i - 0.5)(G_i - 0.5)\}$.
The Chi-plot has confidence bands (dashed lines on the graph) drawn at $\pm c_p / \sqrt{n}$ (the approximate value of $c_p$ for a confidence level of 95% is 1.78).
The pairs $(\lambda_i, \chi_i)$ of independent continuous margins tend to be located within the bands. For negatively correlated margins, the pairs $(\lambda_i, \chi_i)$ tend to spread below the lower band. On the other hand, if the margins are positively correlated, the pairs $(\lambda_i, \chi_i)$ tend to spread above the upper band. Finally, pairs close to $\chi_i = 0$ and between the bands indicate independence between variables.
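The Chi-plot coordinates can be computed as in the sketch below, which follows the Fisher and Switzer (2001) definitions as we understand them; the function names are hypothetical, and points where χ_i is undefined (extremes of either margin) are dropped.

```python
import math


def chi_plot_points(x, y):
    """Return the (lambda_i, chi_i) pairs of the Chi-plot for paired data."""
    n = len(x)
    points = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        h = sum(1 for j in others if x[j] <= x[i] and y[j] <= y[i]) / (n - 1)
        f = sum(1 for j in others if x[j] <= x[i]) / (n - 1)
        g = sum(1 for j in others if y[j] <= y[i]) / (n - 1)
        spread = f * (1.0 - f) * g * (1.0 - g)
        if spread <= 0.0:
            continue  # chi_i is undefined at the extremes of either margin
        prod = (f - 0.5) * (g - 0.5)
        sign = (prod > 0) - (prod < 0)
        lam = 4.0 * sign * max((f - 0.5) ** 2, (g - 0.5) ** 2)
        chi = (h - f * g) / math.sqrt(spread)
        points.append((lam, chi))
    return points


def band(n, c_p=1.78):
    """Half-width of the confidence band, c_p / sqrt(n)."""
    return c_p / math.sqrt(n)


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical data
y = [2.1, 1.9, 3.5, 3.0, 5.2, 4.8]
print(chi_plot_points(x, y))
print(band(len(x)))
```

For perfectly increasing data every χ_i equals 1, and for perfectly decreasing data every χ_i equals −1, which matches the interpretation of points above and below the bands.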

Multivariate Independence Test
For the construction of Copulas, the studied variables $Y_1, Y_2, \ldots, Y_d$ must be a random sample from the joint distribution function $H(Y_1, Y_2, \ldots, Y_d)$. According to Miqueleto (2011), to ensure that the sample is random, the variables must be mutually independent. In other words, if the null hypothesis that the variables are independent is rejected, this provides evidence that $Y_1, Y_2, \ldots, Y_d$ were not randomly generated. The description of this section is based on the works of Kojadinovic and Yan (2011) and Genest and Rémillard (2004).
It is known that the dependency structure between the variables is completely summarized by the Copula C, and that they are mutually independent if, and only if, $C(u_1, u_2, \ldots, u_d) = \prod_{i=1}^{d} u_i$. A test of the mutual independence of the components $Y_i$, $i = 1, \ldots, d$, can therefore be based on the statistic $I_n$, as proposed by Genest and Rémillard (2004): $I_n = \int_{[0,1]^d} n \{C_n(u) - \prod_{i=1}^{d} u_i\}^2 \, du$, where $C_n$ is the empirical copula of the n observations. Under mutual independence, the processes $\sqrt{n}\, M_A(C_n)$, where $A \subseteq \{1, \ldots, d\}$, $|A| > 1$, converge jointly to mutually independent Gaussian processes. Genest and Rémillard (2004) show that mutual independence of $Y_1, Y_2, \ldots, Y_d$ is equivalent to having $M_A(C)(u) = 0$, $\forall u \in [0,1]^d$, $A \subseteq \{1, \ldots, d\}$ with $|A| > 1$.
Rather than the single statistic $I_n$, the authors suggest considering the $2^d - d - 1$ test statistics $M_{A,n} = \int_{[0,1]^d} n \{M_A(C_n)(u)\}^2 \, du$, which are asymptotically independent under the hypothesis of independence.
To view test results based on all $2^d - d - 1$ statistics $M_{A,n}$, a graphical representation called a dependogram can be used. For each subset $A \subseteq \{1, \ldots, d\}$, $|A| > 1$, a vertical bar is drawn whose height is proportional to the value of $M_{A,n}$.
The critical values of $M_{A,n}$ are represented by black dots.
Subsets for which the bar exceeds the critical value can be considered to be composed of dependent variables.

Results
In this section, we present the modeling of soybean yield series, CME futures price, and EXG rate quotes. In addition, copula estimates are presented under the two-dimensional and the three-dimensional approaches.

Yield
Table 1 presents the descriptive statistics for the trend-adjusted yield series for Cascavel and Toledo. We observed an average yield of 3405.587 kg/ha, standard deviation 329.91, and skewness -0.329 for the municipality of Cascavel, indicating a slight left asymmetry (relative to the normal distribution). In addition, the coefficient of variation in Toledo (15.82%) indicates a higher yield risk than in Cascavel (9.687%). The annual yield series for the municipalities of Cascavel and Toledo were modeled using the OLLN distribution as described in Duarte et al. (2017).
Table 2 presents the estimated parameters of the OLLN distribution for the yield series, with standard errors in parentheses. Note that the parameters are very similar for both municipalities, indicating similar yield characteristics. These estimates allow simulating the required number of observations and their respective cumulative probabilities, which are needed to match the price and USD probabilities. Thus, as the series have the same number of observations, the copula estimates can be obtained.

Bivariate Analysis for Revenue Contract
Consider X the simulated yield for the municipality of Toledo or Cascavel (kg/ha), Y the price at CBOT/CME (US$ per 60-kg bag), and Z the USD quote.
For the calculation of revenue, let us consider the price P = Y * Z; in this case, we do not take into account the exchange rate variability. The revenue F is considered as a function of two variables, F = X * P. This procedure does not take into account the dependency structure between the price and USD variables. It is equivalent to assuming that the correlation between these variables is nonexistent. In this case, we are only assuming that there is a dependency structure between yield and price in Reais. Table 3 presents the descriptive statistics for the simulated yield series (20,000 observations) of the OLLN distribution for Cascavel and Toledo and for the price in Reais. An average yield of 3402.604 kg/ha, standard deviation (Std Dev) 425.01, and skewness 0.0015 were observed for Cascavel.
In addition, the statistics are close to each other, suggesting a very similar premium rate for both municipalities. Moreover, the statistics presented in Table 3 are close to the original ones presented in Table 1, demonstrating that the simulations were properly implemented. The average price in Reais is R$ 73.504 per 60-kg bag, with a coefficient of variation of 6.80%. The Dickey-Fuller test (p-value = 0.4537) does not reject the unit-root hypothesis, indicating a trend in the soybean futures price series between Apr 5, 2015 and March 3, 2017. Therefore, we opted for differencing the series, turning it into daily adjustments. Differencing the series does not preclude the analysis of a revenue insurance contract, because the difference between the guaranteed price $p_g$ and the price that occurred will be the adjustment.
The graphs in Figures 3a and 3b show the Histogram and Normal QQ-Plot for the adjustment price series in Reais (R$), respectively. Note that the adjusted price series fluctuates around zero, appearing to be stationary, as confirmed by the Dickey-Fuller test (p-value < 0.01, rejecting the hypothesis of unit root); however, it has time-dependent variability (volatility).
Moreover, according to the normal QQ-plot, the adjusted series has tails heavier than those of the normal distribution. The main concern in the actuarial area is with the left tail of the distribution, as it is the region where dependency between extreme values is observed and is used to calculate the loss risk (loss probability). Underestimating loss probability implies lower premium rates, reducing the financial gains of the insurer. The Ljung-Box test was applied to the adjusted price series (p-value = 0.03963) and to its square (p-value = 0.00864). There is absence of serial autocorrelation and presence of conditional heteroscedasticity at a significance level of 0.01. Therefore, the modeling of heteroscedasticity was performed using models from the GARCH(m,n) family.
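The Ljung-Box statistic used above can be sketched as follows, with Q = n(n + 2) Σ_k ρ̂_k² / (n − k) summed over the first h lags. The p-value helper uses a closed form that holds only for an even number of lags, a restriction introduced here to avoid external libraries; the function names and the example series are hypothetical.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(series, max_lag):
    """Ljung-Box statistic over lags 1..max_lag."""
    n = len(series)
    total = sum(autocorrelation(series, k) ** 2 / (n - k) for k in range(1, max_lag + 1))
    return n * (n + 2) * total


def chi2_sf_even(q, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = q / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


# Hypothetical series with strong alternation, hence strong autocorrelation.
series = [1.0 if t % 2 == 0 else -1.0 for t in range(20)]
q = ljung_box_q(series, 2)
print(q, chi2_sf_even(q, 2))
```

Applying the same function to the squared (non-constant) series gives the test for conditional heteroscedasticity.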
Table 4 presents the values of the AIC and BIC criteria and the maximum log-likelihood for the GARCH(m, n) family models with t-Student errors for the adjusted price series. The model chosen by the AIC and BIC criteria was GARCH (1,1) with t-Student errors.
The Ljung-Box test was applied to the residuals of the chosen model, with p-value 0.626727. Considering a significance level of 0.01, the test indicates no autocorrelation. On the other hand, the Ljung-Box test result for the squared residuals of the chosen model presented p-value 0.04475, which indicates absence of conditional heteroscedasticity, considering a 0.01 significance level. Therefore, with the simulated yield series of the OLLN distribution for each selected municipality and the residuals of the GARCH (1,1) model for the adjusted price, the two-dimensional copula model was fitted.
Table 5 presents the AIC and BIC values for the fitted Copulas for the selected municipalities. For both municipalities, the copula that best fits the data structure is the Frank Copula with parameter δ = −0.3358 and standard error 0.289. Therefore, the premium rate for Cascavel and Toledo is the same, as these municipalities have the same dependency structure between productivity and price. The discussion of revenue insurance premium rates using the selected Copula is presented in subsection 6.1.

Three-dimensional Analysis for Revenue Contract
In this section, the dependency structure between yield, CME price, and the EXG rate is considered, that is, the variability of the exchange rate in the modeling is taken into account.
Table 6 presents the descriptive statistics for the adjusted price series (US$) and the EXG rate. Note that the adjusted price has an average of 21.254 (US$ per 60-kg bag), standard deviation (Std Dev) 1.361, and coefficient of variation 6.40%. On the other hand, the average EXG rate is 3.47 Reais, with standard deviation 0.314. In addition, the coefficient of variation of the EXG rate is higher than that of the price (9.05% and 6.40%, respectively), indicating that the exchange rate risk is greater than the price risk. This is expected, as the exchange rate is influenced by many variables other than the commodity price.
Figures 4 and 5 show the adjusted price (differenced price) for CME and the EXG rate, their histograms, and the normal QQ-plot, respectively. Both series fluctuate around zero, appearing to be stationary; however, they have time-dependent variability (volatility). In addition, the histogram of both series has a higher central part than that of a normal distribution, and some values are quite far from the central positions of the data. These facts are characteristic of financial returns and are described by the kurtosis (5.168 and 6.689 for CME and EXG rate, respectively). In addition, the normal QQ-plot shows that the adjusted series have tails heavier than those of the normal distribution, which suggests that they are leptokurtic.
The CME price and EXG rate adjusted price series are white noise, that is, they do not present temporal autocorrelation, according to the Ljung-Box test at 0.05 significance level, whose results are presented in Table 7. Therefore, both non-autocorrelated price series are adopted.
The existence of conditional heteroscedasticity was confirmed by the Ljung-Box tests at 0.05 significance level for both squared series (Table 7).
For conditional heteroscedasticity modeling, we used the GARCH (m, n) models with t-Student errors. Table 8 shows the AIC, BIC, and maximum log-likelihood (LLM) values for the fitted models.
Figure 6 presents the Chi-plot for the pairs of variables (Toledo yield, CME price), (Toledo yield, EXG rate), and (CME price, EXG rate). The graphs suggest a negative dependence between the pair of variables (CME price, EXG rate) and independence between the pairs (Yield, CME price) and (Yield, EXG rate). Thus, the CME price variable may be negatively correlated with the EXG rate quote, which may reduce the impact of price risk on insurance pricing.
Figure 7 presents the result of the empirical copula independence test of random variables proposed by Genest and Rémillard (2004). The heights of the bars represent the values of the test statistics by subset of variables and the black dot represents the critical value of the test. In addition, the first bar represents the test result for the set of variables {1,2}, where 1 represents the variable yield, 2 the CME price, and 3 the EXG rate series.
At the 0.05 significance level, the subsets of variables (1,2) and (1,3) can be considered independent. On the other hand, the subset of variables (2,3) is considered dependent and variables 1, 2 and 3 are considered mutually independent. Therefore, the yield, price, and EXG rate variables were randomly generated and are considered independent.
Table 9 presents the values of AIC and BIC criteria and the maximum log-likelihood (MLL) for the different copulas for the yield variables, adjusted CME, and EXG rate price in Toledo and Cascavel.According to all selection criteria, the t-Student Copula best represents the data dependency structure.
Table 10 presents the selected models with the estimated parameter vector for the trio of variables. Note that for both municipalities, the selected copula was t-Student with parameters (ρ1, ρ2, df) = (0.5458; 0.0896; 1.84). The inference was based on the pseudo-maximum likelihood with a symmetric positive matrix with Toeplitz structure (dispstr = toeplitz). Therefore, the revenue insurance premium rate for both municipalities is the same, as the copula parameters are the same. This was expected, since we are working with the same price and EXG rate series, changing only the simulated yield (with very similar parameters between both municipalities). [...] viewed solely as a direct proportionality of risk. For example, the standard deviation of simulated yield in Toledo is approximately 39% higher than that of Cascavel; nevertheless, it cannot be said that the Toledo rate is 39% higher than that of Cascavel.

Discussion

Two-dimensional Revenue Agreement
Table 11 presents the pure rate and the commercial rate (with 20% loading) for the bivariate case for the Frank Copula (δ = −0.3358), with N = number of observations.
For N = 454, we used the simulated yield series of the OLLN distribution with N = 454 observations and the adjusted price series in Reais, where $Y_e$ was calculated as the arithmetic average of the last 15 days up to Aug 8, 2016 (date of insurance contracting).
On the other hand, for N = 20,000, we considered 20,000 observations simulated from the fitted Copula of the adjusted series. In addition, for the expected price $Y_e$, we used the average of the Copula simulated price vector with 20,000 observations. Note that the simulated pure premium rate (for N = 20,000) for coverage levels (CL) of 60% and 70% corresponds to 4.682% and 7.115%, respectively.
The commercial rate for coverage levels of 60% and 70% is 9.365% and 14.230%, respectively. Note that the higher the level of coverage, the higher the insurance premium rate. Table 12 presents the average commercial revenue insurance premium rate using the selected copula with N = 454 observations and insurer A rates for coverage levels (CL) 60-69% and 70-79%. Note that the rate calculated in this study is well above those applied by the insurance market. For example, insurer A rate is 63.32% and 40.28% of the rate calculated by the Copula for CL of 60-69% and 70-79%, respectively.
Therefore, there is evidence of an underestimation by insurer A, which may lead to a large loss for the insurer, as it may be considering a much lower risk.

Three-dimensional Revenue Agreement
Under the three-dimensional approach, the selected Copula for both municipalities is the t-Student with parameters (ρ1, ρ2, df) = (0.5458; 0.0896; 1.84). Table 13 presents the pure revenue insurance premium rate and the commercial rate (in %, with 20% loading) for coverage levels of 60%, 65%, 70%, 75% and 80%, for N = 454 (adjusted prices) and N = 20,000 simulated observations of the selected Copula. Note that for CL = 65% and CL = 80%, the pure premium rate is 1.360% and 3.230%, respectively.
Comparing Tables 11 and 13, for all levels of coverage, the rate calculated by the bivariate Copula is overestimated when compared to the three-dimensional Copula.
Moreover, when considering exchange as a random variable, the rate calculated by the three-dimensional approach (CL = 80%) is almost threefold lower than that calculated under the two-dimensional approach, in which EXG rate variability is not taken into account.
Therefore, when the EXG rate variable is added to the modeling, there is a decrease in the revenue insurance premium rate. This decrease may be related to the incorporation of the negative correlation between price (US$) and the EXG rate into the model, which may be leading to risk neutralization between these variables. In other words, the negative dependence that exists between prices and the EXG rate may be nullifying the risk of these variables in modeling. Table 14 presents the commercial rate for coverage levels 60-69% and 70-79% calculated by the three-dimensional copula and by insurer A. For both coverage level ranges, insurer A is overestimating the insurance rate when compared to the copula methodology.
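A simple way to see why negative dependence can reduce risk is to look at the price in Reais on the log scale. The identity below is a standard variance decomposition and is offered only as an illustration under the assumption of finite variances; the symbols $\sigma_Y$, $\sigma_Z$ and $\rho$ are introduced here and are not estimates from this study.

```latex
\ln P = \ln Y + \ln Z
\quad\Longrightarrow\quad
\operatorname{Var}(\ln P) = \sigma_Y^2 + \sigma_Z^2 + 2\rho\,\sigma_Y \sigma_Z ,
```

where $\sigma_Y^2 = \operatorname{Var}(\ln Y)$, $\sigma_Z^2 = \operatorname{Var}(\ln Z)$ and $\rho$ is the correlation between $\ln Y$ and $\ln Z$. When $\rho < 0$, the last term is negative, so the variance of the log price in Reais is smaller than under $\rho = 0$, which is consistent with the lower rates obtained when the dependence between the CME price and the EXG rate is modeled.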
Rate overestimation can hamper the massification of insurance throughout the Brazilian territory and attract producers with a higher risk of indemnity claims, thus increasing the problem of adverse selection. Therefore, this study proposes alternative methods for calculating the insurance premium rate. The copula method was analyzed under two approaches: two-dimensional and three-dimensional. The two-dimensional approach takes into account yield and price risk in Reais. The three-dimensional approach, on the other hand, takes into account yield, CME price, and exchange rate. Understanding the relationships among these three variables is the main contribution of the study. This study used the annual soybean yield series for the municipalities of Toledo and Cascavel, the soybean prices traded at CBOT/CME for the contract due in March 2017, and the USD exchange rate during the same period. Yield was simulated from the OLLN distribution, and the price and USD series were fitted with GARCH(1,1) models with t-Student errors for volatility modeling.
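For readers unfamiliar with the volatility model named above, the following is a minimal sketch of a GARCH(1,1) recursion with Student-t innovations. The parameter values (omega, alpha, beta, df) are hypothetical, since the fitted values are not given in this passage; only the series length of 454 follows the text.

```python
import math
import random


def student_t_unit_variance(df, rng):
    """Student-t draw rescaled to variance 1 (requires df > 2)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df) * math.sqrt((df - 2.0) / df)


def simulate_garch11(n, omega, alpha, beta, df, rng):
    """Simulate returns with var_t = omega + alpha * eps_{t-1}^2 + beta * var_{t-1}."""
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns, variances = [], []
    for _ in range(n):
        eps = math.sqrt(var) * student_t_unit_variance(df, rng)
        returns.append(eps)
        variances.append(var)
        var = omega + alpha * eps ** 2 + beta * var
    return returns, variances


rng = random.Random(0)
# Hypothetical parameters chosen only so that alpha + beta < 1.
returns, variances = simulate_garch11(454, 1e-5, 0.08, 0.90, 6.0, rng)
```

In a pricing workflow of the kind described, the standardized residuals of such a fitted model, rather than the raw prices, would typically be the inputs to the copula.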
From the two-dimensional approach, the best-fit dependency structure was the Frank copula with a negative dependency parameter (δ = −0.3358). The results suggest that the insurance market underestimates the insurance premium rate when compared to the copula methodology. Underestimating the rate exposes the insurer to financial losses, because the product is priced for a lower risk than it actually carries.
From the three-dimensional approach, the best-fitting copula was the t-Student with parameters (ρ1, ρ2, df) = (0.5458; 0.0896; 1.84). The rate calculated under this approach is lower than the one obtained under the two-dimensional approach, because in this case the USD exchange rate is treated as a random variable in the model. Thus, an exogenous variable that has a great influence on soybean prices in the Brazilian context is incorporated into the modeling. The decrease may be related to the incorporation of the negative dependency between price (US$) and the USD rate, which may lead to risk neutralization between these variables.
In addition, the results suggest that rates applied by insurers may be overpriced when compared to the three-dimensional copula model. This overpricing can attract high-risk producers, increasing the problem of adverse selection and hampering the massification of crop insurance among farmers.

Figure 1 illustrates the behavior of the pdf of the Skew-t (ST) distribution. Figures 1a and 1b show the contribution of the parameter ν to the shape of the distribution, with µ, σ and τ fixed. Positive values of ν indicate positive asymmetry and negative values of ν indicate negative asymmetry. Figure 1c shows the effect of the kurtosis parameter on the distribution when the other parameters are fixed. Another distribution used in this work is the Odd log-logistic-Normal (OLLN) distribution. The OLLN family allows greater flexibility in the tails of the distribution. The cumulative distribution function (cdf), with a shape parameter α > 0, is defined by:
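The odd log-logistic construction takes a baseline cdf G and forms G^α / (G^α + (1 − G)^α). With the standard normal cdf Φ as the baseline, the OLLN cdf can be written as below. The location µ and scale σ are assumed here to be the usual normal parameters, since this passage names only the shape parameter α.

```latex
F(y;\mu,\sigma,\alpha)
= \frac{\Phi^{\alpha}\!\left(\frac{y-\mu}{\sigma}\right)}
       {\Phi^{\alpha}\!\left(\frac{y-\mu}{\sigma}\right)
        + \left[1-\Phi\!\left(\frac{y-\mu}{\sigma}\right)\right]^{\alpha}},
\qquad y \in \mathbb{R},\ \sigma > 0,\ \alpha > 0.
```

Setting α = 1 recovers the normal cdf, so α controls how far the tails and shape depart from the normal case.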

Figure 1.Plots of the Skew-t density function for different parameter values.
Figure 2. Plots of the OLLN density function for different values of the parameters.

Figure 3. Original Price in Reais, Adjusted Price (R$), Histogram and Normal QQ-Plot for the adjustment price series in Reais.

Figure 4. Price, Histogram, and Normal QQ-Plot for the CME Daily Adjustment Price Series.

Figure 5. Price, Histogram, and Normal QQ-Plot for the EXG rate series.
companies offer agricultural revenue insurance in the Brazilian insurance market. Insurance company A takes into account the series of municipal annual yields and the physical price (CEPEA/ESALQ indicator) in pricing its municipality. On the other hand, insurer B differs by using the futures series negotiated at CME, already converted into Reais.
Group due in March 2017 (code ZSH17). These data are available in Barchart (2017). Futures contract prices, Y_CME, are available in cents/bushel; therefore, the transformation factor FT = ( was used to convert them into USD for 60-kg bags. The observation period was from Apr 05, 2015 to March 14, 2017, totaling 455 observations.

Table 1. Descriptive Statistics for the yield series adjusted for Cascavel and Toledo (n = 37

Table 3. Descriptive Statistics for simulated yield series for Cascavel, Toledo and price

Table 4. AIC and BIC criteria and log-likelihood maximum values for models adjusted to the price

Table 5. AIC and BIC Criteria Values for Copula Selection

Table 6. Descriptive Statistics for Adjusted Price Series (US$) and EXG rate (Reais)

Table 7. Ljung-Box test results for squared CME prices (rcme) and EXG rate (r dolar) adjusted price series.

for the CME and dollar price-adjusted models. By the BIC criterion, the lowest value for both series is for GARCH(1,1) with t-Student errors. The Ljung-Box tests for the residuals of the GARCH(1,1) model with t-Student errors for the CME price and the USD quote gave p-value = 0.5505 and p-value = 0.7138, respectively. The Ljung-Box tests for the squared residuals gave p-value = 0.4886 and p-value = 0.2777 for the CME and EXG rate series, respectively. These results indicate that the model assumptions hold, that is, there is no temporal autocorrelation and no conditional heteroscedasticity in the residuals, respectively. Therefore, GARCH(1,1) models with t-Student errors were selected for the adjusted CME prices and EXG rates.

Table 8. Values of the AIC and BIC criteria and maximum log-likelihood (LLM) for GARCH(m, n) models with t-Student errors to the CME prices and EXG rate series

Table 10. Selected models for revenue insurance pricing with dependency parameters.

Table 11. Pure and Commercial Rate in % for the two-dimensional revenue insurance premium for the municipalities of Toledo and Cascavel using the Frank copula (δ = −0.3358)

Table 12. Average commercial rate for revenue insurance using copula and insurer A rate for

Table 13. Pure and Commercial Charging Rates (20%) for revenue Insurance using t-Student copula for Toledo (Yield, CME Price, EXG rate)

Table 14. Commercial Average Rates for Revenue Insurance using t-Student Copula and Insurer A.