LAGRANGE MULTIPLIERS IN THE PROBABILITY DISTRIBUTIONS ELICITATION PROBLEM: AN APPLICATION TO THE 2013 FIFA CONFEDERATIONS CUP

Contributions from the sensitivity analysis of the parameters of the linear programming model for the elicitation of experts' beliefs are presented. The process allows for the calibration of the family of probability distributions obtained in the elicitation process. An experiment to obtain the probability distribution of a future event (Brazil vs. Spain soccer game in the 2013 FIFA Confederations Cup final game) was conducted. The proposed sensitivity analysis step may help to reduce the vagueness of the information given by the expert.


INTRODUCTION
"It is notable that the probability that emerged so suddenly is Janus-faced. On the one side it is statistical, concerning itself with stochastic laws of chance processes. On the other side it is epistemological, dedicated to assessing reasonable degrees of belief in propositions quite devoid of statistical background." Ian Hacking (1975)
The purpose of a knowledge (or belief) elicitation process is to obtain one or more probability distributions that represent the experts' beliefs about a random event, π(θ). A parameterized distribution is usually assumed to facilitate the elicitation process. A method for eliciting experts' knowledge based on linear programming is proposed in Nadler Lins & Campello de Souza (2001) and Campello de Souza (2002). Nadler Lins & Campello de Souza (2001) emphasize the difficulty indicators of the elicitation process for the random variable case. Campello de Souza (2002) generalizes the model to the case without a random variable.
The contribution of Lagrange multiplier analysis to the model proposed by Campello de Souza (2002) is an open problem and the main objective of this paper. A set of elicitations on the prediction for the Brazil vs Spain soccer match in the 2013 Confederations Cup is conducted; in terms of probabilistic predictions, the proposed method should be seen as one more option. In general, if the objective were to predict that particular game, the method should be used along with other methods, just as other variables affecting the outcome of soccer matches should be taken into account. In any case, a presentation of the references on Campello de Souza's (2002) method and of the existing methods of soccer prediction, as well as of their differences and similarities, is necessary. Among the advantages of the method proposed by Campello de Souza (2002) are its compatibility with other views of probabilistic representation and the possibility of answering other soccer questions, such as the question of the alternative ranking; changes in the elicitation questionnaire would be necessary in this case. Campello de Souza (1983) and Campello de Souza (1986) are classical references on probabilistic preferences with triangle inequalities used to obtain inaccurate and imprecise preferences, which have recently been proposed in the analysis of conflict stability (Santos & Rêgo, 2014; Rêgo & Santos, 2015).
For this purpose, this paper is organized as follows: in Section 2, the elicitation process proposed in Campello de Souza (2002) is reviewed; in Section 3, the sensitivity analysis and its contributions to the elicitation process and model calibration are carried out; in Section 4, an application with 23 students of Economics to elicit the probability of a future event (Brazil vs Spain soccer game in the final of the 2013 FIFA Confederations Cup) is presented; finally, the article ends with conclusions and a view into future work in Section 5.
In Keynes (1979), the author supports the hypothesis that, in the long run, we will all be dead and that historical data which would allow making predictions about our future would never exist. Thus, when there is little or no data, the expert's a priori knowledge should be used.
A general model that can represent all aspects of random phenomena is still far from being achieved. The method for obtaining families of probability distributions proposed by Campello de Souza (2002) is placed among the imprecise probability models (Walley, 1991).
The method for eliciting the expert's a priori distribution rests on the basic assumption that the expert has only vague knowledge about the probability distribution of the random event of interest, π(θ). It is also assumed that the expert can make "only a finite number of comparative probabilistic assertions" when answering questions about how likely a random variable is to take a value in one of two ranges. The method expresses the expert's knowledge as families of probability distributions, reflecting natural human limitations. Thus, the expert's knowledge can be represented by a set of probability distributions bounded by "a stochastically greater distribution than all other distributions compatible with the answers that have been given", as well as by a distribution stochastically lower than all others.
Initially, an elicitation questionnaire is proposed. Considering the case where the state of nature, $\theta$, is a real and continuous parameter, a plausible range for $\theta$ should be established, namely $[\theta_{\min}, \theta_{\max})$, such that the probability that $\theta$ falls outside this range is zero. The range is partitioned into $2n$ subintervals. Then $\pi_j = \Pr(\theta \in [\theta_{j-1}, \theta_j))$ is defined, where $j = 1, \ldots, 2n - 1$.
The model consists of solving two linear programming problems, first a maximization problem and then a minimization problem, subject to the same set of constraints obtained from the expert's survey responses. Both problems optimize the linear objective $\sum_j c_j \pi_j$, subject to one linear constraint per answered question, which compares the probability masses of the two intervals indexed by $k(r)$ and $l(r)$ through the parameters $a_r$ and $b_r$, together with the constraints $\pi_j \geq 0$ and $\sum_j \pi_j = 1$. Here $k(r) < l(r)$, $a_r > 0$, $r = 1, \ldots, q$, $q$ is the number of questions answered by the expert, and $f(r) \in \{0, 1\}$ takes a value that depends on the $r$-th response to the questionnaire.
Depending on the combination of the parameters $a_r$ and $b_r$, the expert's opinion can be captured in many ways. The last two constraints ensure that the solution is a probability distribution.
The coefficients $c_j$ may be chosen so that the objective corresponds to the area under the cumulative probability distribution. In this case, the expected value is minimized when solving the maximization problem and maximized when solving the minimization problem. This is achieved by setting $c_j = 2n - j + 1$. Using the fact that $\theta_j = \theta_0 + ja$, where $a = \theta_j - \theta_{j-1}$, it can be shown (Campello de Souza, 2002) that maximizing $\sum_j c_j \pi_j$ is equivalent to minimizing the expected value of $\theta$. Obviously, choosing other values for $c_j$ will produce different results. The family of probability distributions defined by solving the optimization problem is, in principle, smaller than the set of all distributions compatible with the expert's responses. If the feasible set of the optimization problem is empty, the expert was not consistent in his responses. Questions not answered by the expert do not enter the constraints of the linear programming problem.
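The role of the choice $c_j = 2n - j + 1$ can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes one probability per subinterval ($j = 1, \ldots, 2n$), approximates the expected value with the upper endpoints $\theta_j = \theta_0 + ja$, and uses two made-up distributions; the function names are hypothetical. Under these assumptions the objective equals $2n + 1 - (E - \theta_0)/a$, so a larger objective means a smaller expected value.

```python
from fractions import Fraction

def objective(pi):
    """Sum of c_j * pi_j with c_j = 2n - j + 1, where len(pi) == 2n (assumed)."""
    two_n = len(pi)
    return sum((two_n - j + 1) * p for j, p in enumerate(pi, start=1))

def expected_value(pi, theta0, width):
    """Expected value using the endpoints theta_j = theta0 + j * width (assumed)."""
    return sum((theta0 + j * width) * p for j, p in enumerate(pi, start=1))

# Two hypothetical distributions over 2n = 4 subintervals (not from the paper).
low = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
high = [Fraction(1, 8), Fraction(1, 8), Fraction(1, 4), Fraction(1, 2)]

theta0, width = Fraction(0), Fraction(5)
for pi in (low, high):
    lhs = objective(pi)
    # Closed form implied by c_j = 2n - j + 1 and theta_j = theta0 + j * width.
    rhs = len(pi) + 1 - (expected_value(pi, theta0, width) - theta0) / width
    print(lhs, rhs, lhs == rhs)
```

The distribution concentrated on the low subintervals has the larger objective and the smaller expected value, which is the inversion described in the text.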
It is now possible to present the dual model and perform a sensitivity analysis of the elicitation model.

THE DUAL MODEL: LAGRANGE MULTIPLIER AND SENSITIVITY ANALYSIS
Every constrained mathematical programming problem (known as the primal problem) has an associated mathematical programming problem known as the dual problem. In linear programming, when the primal has an optimal solution, the optimal objective values of both problems coincide. The dual problem arises from the use of auxiliary variables, known as Lagrange multipliers, which incorporate the constraints into the objective function of the problem.
Considering the model presented in the previous section, the aim is to obtain the distributions which provide the maximum and the minimum expected value by setting $c_j = 2n - j + 1$.
The choice of the objective function of the problem could be another one, but in this case, the problem can be considered a first order estimation; in other words, there is an attempt to estimate the mean of the distribution. However, other objective functions can be used, such as the variance or the entropy of the distribution. Any of these other functions would make the optimization problem nonlinear. Another way to incorporate the other quantities into the problem is for the researcher to use them in desired restrictions. Again, the problem would become nonlinear. The interpretation of the Lagrange multiplier in the dual problem depends on the choice of the objective function, the linear case being the one analyzed here.

The Simple Case
The behavior of the model is observed in a case with only two outcomes, in which a single question is answered by the expert. The model then has the objective $c_1 \pi_1 + c_2 \pi_2$, one constraint (6) derived from the expert's answer, with parameters $a$ and $b$, and the constraints $\pi_1 + \pi_2 = 1$ and $\pi_1, \pi_2 \geq 0$. For the case of maximizing and minimizing the expected value, the coefficients are $c_1 = 2$ and $c_2 = 1$. The dual problem is written in terms of the Lagrange multipliers associated with these constraints.
Studying the primal model for the case where $b = 0$, as there are only two constraints, the problem can be represented as in Figure 1. Considering the case where constraint (6) is of type ≤, since the slope of the objective function is equal to −2, the maximum is obtained at the point where constraint (6) is satisfied with equality, and the values that maximize the objective function are $\pi_1 = \frac{a}{a+1}$ and $\pi_2 = \frac{1}{a+1}$. Thus, in this case, the value of the Lagrange multiplier $\lambda_1$ is positive. In the second case, where the problem is to minimize the expected value, the optimum is attained at $\pi_1 = 0$ and $\pi_2 = 1$; consequently, $\lambda_1 = 0$. The dashed line between the points $(\pi_1, \pi_2) = \left(\frac{a}{a+1}, \frac{1}{a+1}\right)$ and $(\pi_1, \pi_2) = (0, 1)$ represents the feasible set. When the Lagrange multiplier of a question is zero, the interpretation is that the question does not contribute at all to obtaining the optimal distribution, the answer to it being deducible from the other answers.
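The two-outcome case can be worked through numerically. The sketch below assumes that constraint (6) has the form $\pi_1 - a\pi_2 \leq b$; this form is an assumption, chosen because for $b = 0$ it yields the feasible segment between $(0, 1)$ and $\left(\frac{a}{a+1}, \frac{1}{a+1}\right)$ mentioned in the text. The function name `simple_case` is hypothetical, and the multiplier is obtained from the stationarity conditions of the Lagrangean rather than from an LP solver.

```python
from fractions import Fraction

def simple_case(a, b=Fraction(0)):
    """Maximize and minimize 2*pi1 + pi2 on pi1 + pi2 = 1, pi >= 0, under the
    ASSUMED elicitation constraint pi1 - a*pi2 <= b (with a > 0 and b >= 0).
    Returns {sense: (pi1, pi2, objective value, multiplier of the constraint)}."""
    a, b = Fraction(a), Fraction(b)
    # On the simplex pi2 = 1 - pi1, so the constraint reads pi1 <= (a + b) / (1 + a).
    bound = (a + b) / (1 + a)
    binding = bound < 1
    pi1_max = bound if binding else Fraction(1)
    # Stationarity for the maximum: 2 = lam1 + lam2 and 1 = -a*lam1 + lam2,
    # hence lam1 = 1 / (1 + a) when the constraint binds, and zero otherwise.
    lam_max = 1 / (1 + a) if binding else Fraction(0)
    # The objective equals 1 + pi1, so the minimum is at pi1 = 0, where the
    # constraint is slack and its multiplier is zero.
    return {
        "max": (pi1_max, 1 - pi1_max, 1 + pi1_max, lam_max),
        "min": (Fraction(0), Fraction(1), Fraction(1), Fraction(0)),
    }

result = simple_case(a=3)
print("max:", result["max"])
print("min:", result["min"])
```

With $a = 3$ and $b = 0$, the maximum sits at the end of the segment where the constraint is tight and carries a positive multiplier, while the minimum at $(0, 1)$ leaves the constraint slack.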

An Interpretation for the Lagrange Multiplier
In Economics, linear programming models are typically used as tools for interpreting economic phenomena. This is a typical case where economic thinking produces scientific knowledge. In most cases, the analogy is made with physics or biology, but it can also be made with economic thinking. The objective function is what is desired. In the case of the firm, the objective function can be understood as the profit of a company, with the coefficients $a_r$ representing the entries of a technological matrix and the value of $b_r$ the availability of an input; the Lagrange multiplier is then interpreted as the opportunity cost, in terms of profit, of not having more of one of the inputs.
In the linear programming problems within this article, Lagrange multipliers measure how much the expected value could increase or decrease, in case it was possible to increase the difference between the probabilistic masses of the intervals presented in a question put to the expert. In the linear programming problem, the objective is the expected value, the value of b reflecting the maximum value stated by the expert to the difference between the odds ratios of the probabilistic masses of the two given intervals.
If $b = 0$, then the coefficient $a$ represents exactly the odds ratio between the two probabilistic masses and, in this case, the expert states no positive value for the difference between the odds ratios of the probabilistic masses. However, this does not mean that there is no opportunity cost related to this question. If the inequality constraint becomes an equality constraint, in other words, in the example above, $\pi_2 = \frac{1}{a}\pi_1$, there will be an associated cost, since the expected value will not be higher in the case of minimization, or will not be lower in the case of maximization, for lack of a positive value, whatever it may be, for the difference between the odds ratios of the probabilistic masses of the two given intervals. If the expert could refine this information by introducing a positive value for this difference, the model would provide a difference between the maximum and minimum expected values that is smaller or equal, increasing the accuracy of his elicited beliefs.
When $\lambda \neq 0$, the corresponding answer is informative about the phenomenon and the question is considered active. If the number of active questions is large, a large number of questions in the questionnaire contribute to obtaining the distributions.

The General Case
To simplify the presentation of the general case of the dual problem, the inequality will always be considered as ≤. It is easy to see that, regardless of the answer given by the expert, the constraint can be transformed into an inequality of this type. The linear programming problem described in (1)-(4) can be put in matrix form to make the presentation easier, using $c = (c_1, \ldots, c_j)$, $\pi^T = (\pi_1, \ldots, \pi_j)$, $b^T = (b_1, \ldots, b_q)$, $\mathbf{1} = (1, \ldots, 1)$ and the matrix of constraint coefficients. The problem proposed in Campello de Souza (2002) can therefore be rewritten together with its dual, where $\lambda_r = (\lambda_1, \ldots, \lambda_q)$ and $\lambda_s$ are the Lagrange multipliers of the inequality and equality constraints, respectively.
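For reference, the textbook primal-dual pair for a linear program of this shape is sketched below. The symbol $A$ is an assumption: it stands for the $q$-row matrix of coefficients of the elicitation constraints once all of them are written as ≤, and it is not taken from the original formulation.

```latex
\begin{aligned}
\text{(primal)}\quad & \max_{\pi}\; c^{T}\pi
  && \text{s.t. } A\pi \le b,\quad \mathbf{1}^{T}\pi = 1,\quad \pi \ge 0,\\
\text{(dual)}\quad & \min_{\lambda_r,\,\lambda_s}\; b^{T}\lambda_r + \lambda_s
  && \text{s.t. } A^{T}\lambda_r + \lambda_s\mathbf{1} \ge c,\quad \lambda_r \ge 0,\quad \lambda_s \text{ free}.
\end{aligned}
```

By complementary slackness, a component of $\lambda_r$ can be positive only when the corresponding elicitation constraint holds with equality at the optimum, which is what makes a nonzero multiplier a marker of an active question. The minimization problem can be treated in the same way after changing the sign of $c$.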

Sensitivity Analysis
The sensitivity analysis of parameters consists of assessing the changes in the objective function of the problem, given an infinitesimal change in the values of $a_r$ or $b_r$. This paper follows the presentation in Intriligator (1971) in the use of the Lagrangean function for sensitivity analysis. With the Lagrangean function evaluated at the optimum point, the following proposition can be obtained: given an optimal solution $(\pi^*, \lambda_r^*, \lambda_s^*)$ to the problem, if $\lambda_r \neq 0$ for some restriction $r$, then a marginal change in $b_r$ changes the optimal value at least as much as the same marginal change in $a_r$. The demonstration for the case of minimization is similar. The immediate consequence of the above proposition is that, if the goal is to decrease the length of the interval between the minimum and maximum expected values of the variable $\theta$, where $\max F = \overline{E}(\theta)$ and $\min F = \underline{E}(\theta)$, then a marginal change in $b_r$, i.e., in the third group of statements, is more relevant than the same marginal change in $a_r$, i.e., in the second group of statements.
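The comparison between marginal changes in $b_r$ and in $a_r$ can be illustrated on the two-outcome case. The sketch below again assumes a constraint of the form $\pi_1 - a\pi_2 \leq b$ with objective $2\pi_1 + \pi_2$ (an assumption, not the paper's stated formula) and uses finite differences on the closed-form optimum; the names `max_value`, `d_a` and `d_b` are hypothetical. Under this assumed form, the derivative with respect to $b$ equals the multiplier $\lambda$, while the derivative with respect to $a$ equals $\lambda \pi_2$, which cannot exceed $\lambda$ because $\pi_2 \leq 1$.

```python
from fractions import Fraction

def max_value(a, b):
    """Maximum of 2*pi1 + pi2 on the simplex under the ASSUMED constraint
    pi1 - a*pi2 <= b, valid for parameters where the constraint binds (0 <= b < 1).
    Returns (optimal value, optimal pi2)."""
    pi1 = (a + b) / (1 + a)   # constraint tight together with pi1 + pi2 = 1
    return 1 + pi1, 1 - pi1

a, b, h = Fraction(3), Fraction(1, 10), Fraction(1, 10**6)
value, pi2 = max_value(a, b)
lam = 1 / (1 + a)             # multiplier of the binding constraint

d_b = (max_value(a, b + h)[0] - value) / h   # finite difference in b
d_a = (max_value(a + h, b)[0] - value) / h   # finite difference in a

print("d/db:", float(d_b), "lambda:", float(lam))
print("d/da:", float(d_a), "lambda * pi2:", float(lam * pi2))
```

In this example the same small perturbation moves the optimum several times more when applied to $b$ than when applied to $a$.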

The Example of a Soccer Game
In Walley (2000), the author presents an example of statements about the possible outcomes of a soccer game to compare models of imprecise probability. In a soccer game, there are three possible outcomes: victory, $W$, draw, $D$, and defeat, $L$. Consider an expert making three qualitative judgments about the possible outcomes of the game: 'not winning' is more likely than winning; winning is more likely than losing; a draw is more likely than losing the game.
To implement the elicitation method proposed by Campello de Souza (2002), it is necessary to define the objective function. Since a random variable which allows for the calculation of the expected value is not defined, some possibilities may be suggested. There are two types of soccer competitions. The first is the points-based league championship, where the winning team is granted three points, a draw gives one point to each team, and a defeat gives zero points to the beaten team. The second type is known as a cup, with knockout confrontations in at least one of its phases; in this case, only one team remains in the competition, making it a zero-sum game. In the former situation, the objective function can be written as $X = 3I_W + I_D$, which gives the number of points scored in a match, whereas in the latter, the objective function can be written as $X = 3I_W - 3I_L$. The latter function was chosen for the calculation of the distributions in the experiment. Thus, the linear programming model, according to the method proposed by Campello de Souza (2002), for the example by Walley (2000), can be written as follows:
$\max\ (\min)\ E(X) = 3\pi(W) - 3\pi(L)$
subject to:
$\pi(W) \leq \pi(D) + \pi(L)$
$\pi(W) \geq \pi(L)$
$\pi(D) \geq \pi(L)$
$\pi(W), \pi(D), \pi(L) \geq 0$
$\pi(W) + \pi(D) + \pi(L) = 1$
As seen in Section 2, $E(X)$ should be maximized as well as minimized in order to obtain the set of probability distributions. First, it must be verified that the feasible set defined by the expert's statements is nonempty, which shows that the statements are consistent. The distributions that maximize (minimize) the expected value are shown in Table 1.
The behavior of the Lagrange multipliers with respect to the model constraints had not been examined before. There are four constraints to consider: the first three correspond to the expert's qualitative judgments and the last is the equality constraint that ensures a probability distribution; the Lagrange multiplier of this last constraint is always active, since it is an equality constraint. For the first three statements, the Lagrange multipliers of the maximization problem are (1.5; 0; 0). With this result, only the first judgment is used to determine the maximum $E(X)$. In the minimization problem, the Lagrange multipliers are (0; −3; 0); in this case, only the second judgment informs the probability distribution that minimizes $E(X)$.
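The bounds on $E(X)$ for this example can be reproduced with a small script. The sketch below is an illustration, not the procedure used in the paper: it models the three judgments as the non-strict inequalities $\pi(W) \leq \pi(D) + \pi(L)$, $\pi(W) \geq \pi(L)$ and $\pi(D) \geq \pi(L)$ (an assumption about how "more likely" is encoded), and it replaces an LP solver with brute-force enumeration of the vertices of the feasible set, which is enough for three outcomes. All function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(rows, rhs):
    """Solve a 3x3 linear system exactly by Cramer's rule (None if singular)."""
    d = det3(rows)
    if d == 0:
        return None
    point = []
    for k in range(3):
        replaced = [list(row) for row in rows]
        for r in range(3):
            replaced[r][k] = rhs[r]
        point.append(Fraction(det3(replaced), d))
    return tuple(point)

# Each constraint is (coefficients, bound), meaning coefficients . pi <= bound,
# with pi = (pi_W, pi_D, pi_L).
CONSTRAINTS = [
    ((1, -1, -1), 0),   # winning is no more likely than not winning
    ((-1, 0, 1), 0),    # losing is no more likely than winning
    ((0, -1, 1), 0),    # losing is no more likely than a draw
    ((-1, 0, 0), 0), ((0, -1, 0), 0), ((0, 0, -1), 0),  # nonnegativity
]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def vertices(constraints):
    """Feasible points where two constraints are tight on the simplex."""
    found = set()
    for (c1, b1), (c2, b2) in combinations(constraints, 2):
        p = solve3([c1, c2, (1, 1, 1)], [b1, b2, 1])
        if p is not None and all(dot(c, p) <= b for c, b in constraints):
            found.add(p)
    return found

objective = (3, 0, -3)  # E(X) for X = 3*I_W - 3*I_L
values = [dot(objective, p) for p in vertices(CONSTRAINTS)]
upper, lower = max(values), min(values)
print("max E(X):", upper, "min E(X):", lower)
```

An empty vertex set would signal inconsistent statements. Under the stated assumptions, the maximum is attained where the first judgment is tight and the minimum where $\pi(W) = \pi(L)$, in line with the multipliers (1.5; 0; 0) and (0; −3; 0) discussed above.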

The Choice of the Objective Function
Given the possible kinds of competition, an interesting problem arises regarding the choice of the objective function. There are two alternative linear programming problems for representing the points in a league match: the first problem has already been described and considers the points earned by the home team, $X = 3I_W + I_D$; the alternative problem considers the points earned by the visiting team, $Y = 3I_L + I_D$. The events $W$, $D$ and $L$ are always defined from the home team's point of view. Hence, if the home team loses ($L$), the visiting team gets 3 points. As the method consists of minimizing and maximizing the objective function, the same set of probability distributions might be expected, but this is not what happens. Although both models represent the same situation, the results are different, and this problem of choosing the objective function can be considered an example of the framing effect (Tversky & Kahneman, 1981). The results for both problems are presented in Table 2.
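The dependence on the objective function can be seen on the three judgments of Walley's example. The sketch below is an illustration under assumptions: the judgments are encoded as $\pi(W) \leq \pi(D) + \pi(L)$, $\pi(W) \geq \pi(L)$ and $\pi(D) \geq \pi(L)$, and the four vertices of the resulting feasible set were worked out by hand for this sketch rather than taken from the paper's tables. Since a linear objective attains its extremes at vertices, evaluating each objective on them is enough.

```python
from fractions import Fraction as F

# Vertices (pi_W, pi_D, pi_L) of the ASSUMED feasible set on the simplex:
# pi_W <= pi_D + pi_L, pi_W >= pi_L, pi_D >= pi_L. Derived by hand.
VERTICES = [
    (F(1, 2), F(1, 2), F(0)),
    (F(1, 2), F(1, 4), F(1, 4)),
    (F(1, 3), F(1, 3), F(1, 3)),
    (F(0), F(1), F(0)),
]

def extremes(weights):
    """Return the vertices that maximize and minimize sum(weights * pi)."""
    def score(p):
        return sum(w * x for w, x in zip(weights, p))
    return max(VERTICES, key=score), min(VERTICES, key=score)

home = extremes((3, 1, 0))     # X = 3*I_W + I_D, points of the home team
visitor = extremes((0, 1, 3))  # Y = 3*I_L + I_D, points of the visiting team
print("home:   ", home)
print("visitor:", visitor)
```

The two objectives select different pairs of extreme distributions from the same feasible set, which is the effect described in the text.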

Study Design
Regarding international competitions between national soccer teams, the greatest interest is in the FIFA World Cup, but the FIFA Confederations Cup serves as a test competition before the World Cup. This study refers to the 2013 FIFA Confederations Cup which, as a test competition, had a small number of participating teams, eight in total: six representing the champions of the soccer confederations, the world champion and the host country, namely Mexico, Italy, Tahiti, Japan, Uruguay, Nigeria, Spain and Brazil. Given the competition's format, which is divided into two phases, a confrontation between Brazil and Spain (the world champion) could be anticipated to occur only in the final.
Before the beginning of the 2013 FIFA Confederations Cup, an experiment with undergraduate students of Economics was held. Two groups of students were selected from a total of 23 students. The first group was characterized as experienced students in the course and, therefore, in probability and game theory, as they were undergraduate seniors. The second group were beginner students in the Economics course, who, despite their initial training in probability, did not have any complete training in Economics, as they were undergraduate sophomores. There were nine students in the first group, six men and three women. The second group had eight men and six women.
The experiment was divided into three parts, with the questions presented to the students following the types of questions described in Section 2. First, students responded to probability statements of the first kind; then, of the second kind; and, finally, of the third kind. Students were free to leave any or all questions of a particular group unanswered. All questions referred to the possible match between Brazil and Spain. The first set of questions concerned the events $W$, $D$ and $L$, representing Brazil's victory, draw and defeat, respectively. Which one of the following events is more likely to occur:
4. W or L?
5. D or L?
The statements about the odds ratios and the differences between probabilistic masses are related to the first five questions and formed the next two parts of the experiment. $X = 3I_W - 3I_L$ was used as the objective function in the linear programming problems. Three linear programming problems were analyzed. The first considered only the first group of statements; the second also considered the second group of statements; finally, the third consisted of all groups of statements. Five constraints regarding the statements were present in each problem, as well as the condition that the probabilities add up to one. The responses of the 23 experts to each set of statements for the five questions above are shown in Table 4 in the Appendix. For instance, Expert 1's answers to the first question in each group of statements are as follows: in the first group, he considered Brazil's defeat more likely than a draw or a victory; in the second group, the value of the odds ratio of the probability masses, $a_1$, was equal to 3 for the same question; and, finally, he set the upper bound for the difference between the odds ratios of the probability masses, $b_1$, equal to 0.1. The interpretation of the answers to the other questions and by the other experts, given in Table 4 in the Appendix, is similar.
The results of the three linear programming problems are shown in Tables 5, 6 and 7 in the Appendix. As expected, as information is added to the linear programming problem, the length of the interval between the minimum and maximum expected number of points tends to decrease. Taking the first expert as an example (one of the few who answered the three types of questions without generating any inconsistency): for the first set of information, the expected score lies in the interval [−3.00; −0.75]; considering the odds ratios, the interval shrinks to [−3.00; −2.00]; finally, considering all the questions, the interval becomes even shorter, [−2.40; −2.30]. Figures 2 and 3 show the distributions that maximize and minimize the expected points in a match for experts 1 and 14, respectively.
Another point to highlight is that the elicitation method reveals experts' inconsistencies regarding their probabilistic knowledge. Knowledge about football is inherent to most Brazilians, but this does not necessarily mean a bias in favor of Brazil's victory when eliciting Brazilian experts, since at least 9 out of 23 experts chose Brazil's defeat in the fourth question.
Finally, a question information index is proposed. When the Lagrange multiplier of a question is active, this indicates that the question is limiting the growth of the expected values and, consequently, that the question tells something about the expert's opinion. Given the experiment with the 23 experts, the proposed index reflects which of the five questions were relevant to the experts' elicited families of probability distributions. The information index of question $x$ is defined as $I(x) = \frac{L}{2N}$, where $L$ is the number of times the multiplier is nonzero, considering the maximization and minimization problems, and $N$ is the number of experts who answered question $x$. The information index for each question per group of statements is shown in Table 3.
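A computation of the index can be sketched as follows. The sketch assumes that the index is the share of nonzero multipliers over the two problems, $L / (2N)$; this normalization is an assumption, chosen because each expert contributes one multiplier for the maximization and one for the minimization problem. The function name and the multiplier values are hypothetical and are not the paper's data.

```python
def information_index(multipliers):
    """multipliers: one (lambda_max, lambda_min) pair per expert who answered
    the question. Returns L / (2 * N), where L counts the nonzero multipliers
    (ASSUMED normalization)."""
    n = len(multipliers)
    if n == 0:
        raise ValueError("no expert answered this question")
    nonzero = sum(1 for pair in multipliers for lam in pair if lam != 0)
    return nonzero / (2 * n)

# Hypothetical multipliers for 21 experts (not the paper's data):
# 5 experts active in both problems, 3 active in one, 13 never active.
example = [(1.5, -3.0)] * 5 + [(1.5, 0.0)] * 3 + [(0.0, 0.0)] * 13
print(round(information_index(example), 4))
```

With these made-up counts, 13 nonzero multipliers out of 42, the index takes a value of the same order as those reported for the most informative questions.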
From the information index values for the first and second groups of statements, the fourth question was found to be the most informative, with an information index of 0.3095. However, the information index of this question is much reduced in the second group of statements. On the other hand, the information index of the second question was more stable across the first and second groups, being above 0.3 in both. Thus, if a single question must be chosen for the first two groups of statements, question 2 should be chosen.

CONCLUSION
The main contribution of this paper is to propose a sensitivity analysis evaluation of the linear programming problem used to obtain families of probability distributions arising from experts' answers to a questionnaire, enabling a refinement of their knowledge, making it a calibration step. In this calibration step, it was possible to demonstrate that the evaluations of the difference between probabilistic masses are more informative than changes in the odds ratios between two events.
The soccer example, besides being recurrent in the literature, allows for an elicitation process that is easy to apply to different groups of people, since concepts such as odds ratios and differences between probabilistic masses are easy to explain in this setting. However, care must be taken with the statements of differences between probabilistic masses, since only two experts were able to respond to such statements and generate a nonempty feasible set.
Finally, the proposed sensitivity analysis has alternative applications in the elicitation process proposed by Campello de Souza (2002) through the elicitation questionnaire developed in Nadler Lins & Campello de Souza (2001): the questionnaire can be dynamic and only ask questions which may alter the set of distributions established so far. This is possible by evaluating whether the Lagrange multiplier would become nonzero in a follow-up question. In this case, the information index for each question would be higher than the corresponding information index of a pre-established questionnaire.
Table 4 - Responses of Experts.