Ten years of research on parametric data envelopment analysis

Armando Zeferino MilioniI,* (*Corresponding author); Luciene Bianca AlvesII

I Instituto Tecnológico de Aeronáutica, ITA, São José dos Campos, SP, Brasil. E-mail: milioni@ita.br

II Instituto Tecnológico de Aeronáutica, ITA, São José dos Campos, SP, Brasil. E-mail: bianca@ita.br

ABSTRACT

In this paper we present a brief overview of 10 years of research on Parametric Data Envelopment Analysis (DEA). We begin with a brief introduction to DEA. Then, we present the central problem of how to distribute a new and total fixed variable (input or output) across all DMUs (Decision Making Units) present in the analysis. We then tell the history of the developments in Parametric DEA, beginning with its first conception in 2002, moving onward to the current research. We conclude with a brief discussion of future perspectives.

Keywords: DEA, parametric DEA, on the distribution of new variables in DEA, total fixed.

1 INTRODUCTION

In this paper we present a brief overview of 10 years of research on Parametric Data Envelopment Analysis (DEA).

Our early work with Parametric DEA began in 2001, but only very recently did this line of research begin to receive more attention from the international technical community. We believe this is an interesting story of how perseverance can lead to scientific development, and that it deserves to be told in a way that is accessible to many. Thus, we wrote this article for readers with only fundamental knowledge of geometry and real analysis, and no need of familiarity with DEA.

The structure of this paper is the following: Section 2 provides an introduction to DEA, explaining, in general terms, the origin and the initial concept, dating back to 1978.

Section 3 presents the theme of our particular interest, which is to use DEA to solve the problem of allocation of a new variable (input or output) which has a total fixed sum across all Decision Making Units (DMU).

In Section 4 the Parametric DEA Models are presented, from their early conception up to the latest ones. We also discuss some of the principal papers published on the theme over the 10 year period starting in 2002, when the first article on Parametric DEA was published.

Finally, Section 5 is dedicated to Conclusions and Future Perspectives of the activities in Parametric DEA.

2 DEA

In 1957, Michael J. Farrell published an article in the prestigious Journal of the Royal Statistical Society which would become a seminal contribution to the field of efficiency measures and productivity (see Farrell, 1957).

In his article, Farrell postulated that any production system could be described as a way of transforming inputs (or resources) into outputs (or products), and that the productivity of the system could be measured by the ratio of the weighted sum of the outputs to the weighted sum of the inputs.

The weighting factors would be considered acceptable only if this ratio was:

(i) Positive (an assumption that simplifies things, although it is actually unnecessary), and

(ii) Less than one (satisfying the universal laws of thermodynamics, or the lack of perpetual motion).

Systems with productivity equal to one would be considered efficient, thus becoming benchmarks for the other systems.
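To make the measure concrete, here is a minimal sketch of Farrell's ratio with acceptability conditions (i) and (ii) enforced; the function and variable names are ours, for illustration only:

```python
def farrell_ratio(outputs, inputs, u, v):
    """Productivity as the weighted sum of outputs divided by the
    weighted sum of inputs (weights u for outputs, v for inputs).
    The weights are acceptable only if the ratio lies in (0, 1]."""
    ratio = sum(uw * o for uw, o in zip(u, outputs)) / \
            sum(vw * i for vw, i in zip(v, inputs))
    if not 0 < ratio <= 1:
        raise ValueError("unacceptable weights: ratio must lie in (0, 1]")
    return ratio
```

A system whose ratio equals one under some acceptable set of weights is the efficient benchmark against which the others are compared.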

In order to briefly illustrate Farrell's concept, let us consider a University. Perhaps it could be considered a gross simplification, but it would not be entirely wrong to describe a University as a system that transforms inputs such as number of students, teachers, physical area, energy, etc., into outputs such as graduates, books, articles, dissertations, patents, etc.

Thus, if one wants to compare the efficiency of two Universities, what one has to do is to arbitrate a period of time; observe and collect the inputs and outputs of each University along that period; choose the weighting factors, which we will simply call weights; and calculate that ratio. The University with the largest value will be considered more efficient than the other one.

As simple as that.

The 20 years that followed Farrell's publication would be devoted to the discussion of the problem of the "fair" choice of these weights.

One can immediately realize that the issue is not simple.

Take, for instance, the comparison between two institutions of higher education (IHE) that operate in Brazil, one of them being federal (and, therefore, public, and tuition free) and the other one private.

They both may be similar in nature, but the perspective into which each one of them regards the production system, i.e., its own process of transforming inputs into outputs, is completely different.

It so happens that, in Brazil, the salaries of all professors of federal IHE are paid directly by the National Treasury. Therefore, they do not represent a direct burden on these IHEs' budgets. Consequently, the natural tendency would be to assign a low weight (relatively speaking, of course) to the input variable "number of professors". Likewise, the tendency would be to assign a high weight to the output "number of published papers", for this is a very important variable in the several internal and external evaluation processes to which the IHEs are submitted.

On the other hand, a private IHE would tend to consider the input "number of professors" a crucial one, for it is probably the most relevant component of its cost structure. And the main purpose of this institution may be more linked to professional training, i.e., to value a variable such as "number of students graduated" more than, for instance, "number of published papers".

The question, then, is the following: to an impartial outside observer, what would be the "fair" choice of weights for each input and output variable in order to be able to compute the efficiencies and thus be able to compare both institutions?

Many believe that the best answer to this question, so far, is the one that is found in the now extremely famous article - especially for those who work in the field - published by Abraham Charnes, William W. Cooper and Edwardo L. Rhodes in 1978 (Charnes, Cooper & Rhodes, 1978).

What they proposed - as often happens in articles that become classics - was the break of a paradigm. For the first time, a model was conceived to use Farrell's formulation in such a way that different entities were allowed to assign different weights to inputs and outputs.

Thus, in the example above, public and private IHEs would not have to argue about fair weights to be assigned to each input and each output, for it would now be possible for each one of them to assign weights that best represented their own values and goals.

And to avoid unfair manipulation of the weights to make any entity efficient, the authors had the clever idea of imposing a constraint on their model: the ideal choice of weights for a given entity would be the one that maximized its own efficiency, subject to the constraint that, if the same weights were assigned to every other entity present in the analysis (i.e., all entities whose efficiencies were to be measured and compared), none of them would have a Farrell ratio greater than one. A set of weights leading to an efficiency value greater than one would be called infeasible.

This constraint is logical given that a set of weights that would allow for an entity to have an efficiency value greater than one would contradict Farrell's definition.

Thus, the authors proposed the problem of choosing the set of weights for each entity as an optimization problem. One by one, each entity of the list of entities under analysis would have their ideal set of weights determined according to the above logic. The set of weights assigned to each entity would be the one which maximized its efficiency, bringing it as close to one as possible, provided, of course, that with the same set of weights, no other entity would have an efficiency value greater than one.
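Once the ratio is normalized, the per-DMU optimization just described becomes a linear program (the multiplier form of the Charnes-Cooper-Rhodes model). The sketch below, with names of our own choosing, uses `scipy.optimize.linprog` and relaxes the strict positivity of the weights to non-negativity for simplicity:

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, j0):
    """CCR multiplier model: maximize the weighted-output sum of DMU j0,
    with its weighted-input sum normalized to 1, while no DMU's ratio
    may exceed 1 under the same weights.
    X: (n_dmus, n_inputs) array, Y: (n_dmus, n_outputs) array."""
    n, m = X.shape
    s = Y.shape[1]
    # decision variables: output weights u (s of them), input weights v (m)
    c = np.concatenate([-Y[j0], np.zeros(m)])          # linprog minimizes
    A_ub = np.hstack([Y, -X])                          # u.Yj - v.Xj <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.zeros(s), X[j0]])[None]  # v.X_j0 = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (s + m))
    return -res.fun                                    # efficiency in (0, 1]
```

With one input and one output, this reduces to each DMU's output/input ratio divided by the best such ratio in the set.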

The authors also provided a geometrical interpretation of the model's solution, showing that, if the entities had the values of their inputs and outputs represented on a graph, then, efficient entities would be located on the border (or frontier) of the plotted points. If these border points were connected by line segments or planes, there would be a frontier that would envelop all other entities present in the analysis. The region located within this frontier was called the Production Possibility Set (PPS) and it designated the space in which that kind of production was possible.

The search for optimal weights corresponded to the pursuit of that border, frontier or, even, envelope, which is the reason why such models would later be called Data Envelopment Analysis models.

Thereafter, DEA would be considered a tool of nonparametric statistics.

DEA was considered a statistical tool because the problem involved estimating the unknown PPS frontier from the available data (i.e., the input and output values of each of the entities present in the analysis); and it was nonparametric because there would be no a priori hypothesis on the shape of the frontier, unlike Inductive Parametric Statistics, wherein the model often relies on specific forms of pre-designed probability distribution functions.

Finally, the authors proposed that the standard expression to be used to describe the entities under analysis (whatever they were, whatever the problem was) would be DMU, the acronym for Decision Making Units.

Therefore, DEA is the nonparametric statistical tool that measures the relative efficiencies of DMUs that operate in the same production process, i.e., which transform the same type of inputs into the same type of outputs.

In fact, with DEA, one is capable of much more. For instance, DEA indicates the benchmarks for a relatively inefficient DMU and also provides the best path with which it can achieve efficiency.

This and many other properties would be shown and many lines of research would follow. Among them was the problem that would become the object of our special attention after it was proposed, in 1999, and that we present next.

3 THE PROBLEM OF FAIRLY DISTRIBUTING A NEW INPUT OR OUTPUT VARIABLE

3.1 The Problem

The formulation of the problem is straightforward:

Let there be a set of DMUs characterized by a known set of input and output variables. Suppose one wants to distribute a new input (or a new output) to these DMUs. The total amount to be distributed is known and fixed. The question is: how does one fairly conduct the distribution? Or, in other words, what share of the total input (or output) should be assigned to each DMU in a manner deemed fair?

There are numerous examples that fit the above framework and we provide two of them, one for the distribution of a new input and another for a new output. They should be enough for any reader to understand the usefulness of answering the above questions.

3.2 Examples

Example 1 - Distribution of a new Input

Consider a large Bank corporation with hundreds of branches. Say that each branch is a DMU that transforms known inputs (such as human resources, qualification of these resources, energy, equipment, area, etc.) into known outputs (deposits, savings, customer satisfaction, etc.). Suppose that the corporation has connected all its branches through a communication system that is based on the purchase of satellite hours. During the installation and testing of the system, the costs of this process were all covered by the corporation. Now, however, that the system is under regular operation, the corporation intends to apportion the annual cost (i.e., a new input), which has a total fixed and known amount, among all its branches, or DMUs. How should this be done?

Example 2 - Distribution of a New Product

Keeping the same example above, suppose now that the same corporation decides to launch a new product on the market such as a new credit line for financing real estate acquisition, or a new type of personal accident insurance, or a new type of credit card. The corporation establishes a total goal to be achieved in a given timeframe as, for example, a billion dollars in one year. What share of the total target should be assigned to each branch?

3.3 Using DEA to obtain the solution

These issues are not new, and the answer usually provided is one-dimensional. In other words, one takes a specific variable (input or output), supposedly the most relevant to the process, and the distribution is made in proportion (direct or inverse, as applicable) to the value of that variable for each DMU with respect to the total sum of the same variable over all DMUs.
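The usual one-dimensional rule can be sketched in a few lines (names ours):

```python
def proportional_split(total, values):
    """Share of `total` assigned to each DMU, proportional to its value
    of a single chosen variable (the usual one-dimensional rule)."""
    s = sum(values)
    return [total * v / s for v in values]
```

For example, splitting a cost of 100 across three branches whose chosen variable takes values 1, 1 and 2 gives shares of 25, 25 and 50. The weakness, of course, is that the entire answer hinges on which single variable is chosen.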

In the article by Cook & Kress (1999), the authors raised questions on this approach, using examples where the choice of the most relevant variable was not obvious, thereby showing that different choices could lead to quite different answers. Then, the authors claimed that DEA would be an appropriate tool to solve this problem, for it would allow an answer that took into account all input and output variables in the model.

In the solution they proposed, the authors used the original set of input and output variables to calculate the relative efficiencies of each DMU and postulated that the fair distribution of the new input (or output) would be the one that would maintain the same values of efficiencies for each DMU.

Cook and Kress's idea of using DEA to solve the problem would have many followers, but the main assumption of their solution was criticized, and the work that followed adopted a different premise: the fair distribution would be the one that assigned to each DMU a share of the new input or output in such a way that, at the end, every DMU would be located at the efficiency frontier, i.e., all of them would have a relative efficiency value equal to one.

In practice, this means that an efficient DMU should receive proportionately more of the new input (or should be obligated to produce less of the new output) than an inefficient DMU. In other words, the originally efficient DMUs are, so to speak, somehow privileged at the end of the distribution. Although this hypothesis has already been criticized, it is now used by most groups working on this line of research.

In an article that would become a classic in the field, Beasley (2003) presented a solution to the problem. However, there were some difficulties. To begin with, his solution had a complex formulation which included several stages, one of which required solving a nonlinear constrained optimization problem, and such problems are well known to be difficult to solve. Secondly, at the end of the distribution, DMUs could be weakly or strongly efficient, and that was not clear in the article. And there was a third problem, considered by many to be bigger than the previous ones, which would be noticed much later (see section 4.4 for details).

In Brazil, a research group from the Federal University of Rio de Janeiro developed a model that could also be used within the purpose of solving the same problem (see Estellita Lins, Gomes, Soares de Mello & Soares de Mello, 2003). The authors called their model ZSG (for Zero Sum Game), since the procedure of moving all DMUs to the efficiency frontier was based on partially removing some of the input (or output) of a certain DMU and, according to certain criteria, passing all or part of it to another DMU, keeping the total sum fixed.

Lozano and Villa, whose methodology is based on solving a set of linear programming problems (see, for instance, Lozano & Villa, 2004) have recently addressed the same problem.

4 PARAMETRIC DEA

4.1 "I had a strange idea"

I first learned about DEA in 1998, in the context of a consulting job for a private Bank. (The use of the first person, pronoun "I", refers to Prof. Armando Milioni.)

My approach to the subject led our group to develop a first paper which was published in 2002 (see Avellar, Polezzi & Milioni, 2002). In that paper we addressed a problem related to another consulting job in which the client was Anatel, the Brazilian Telecommunication Agency.

Then, in 2002 I supervised the final graduation thesis of Ernee Kozyreff Filho, who was completing his Bachelor program in Civil Aeronautics Engineering.

I was familiar with the article by Cook & Kress (1999), and I planned to use a variation of their formulation in order to solve a problem for the Brazilian Air Force, which was the problem of distributing a budget (in hours of training) to different air squadrons operating in the country. That was the original context of the work I proposed to Ernee.

In one of our meetings I told Ernee that a possible description of the problem of distributing a new input or output variable would be as follows:

Consider a group of DMUs described by a certain number of input and output variables. We say that the problem lies in the Real Space whose dimension is equal to the sum of the number of inputs and outputs. For instance, if there were two inputs and three outputs, the problem would be in the Real Space of dimension five, i.e., all data and all possible solutions would be described by vectors with five scalars, two of which are inputs and three of which are outputs. The values of inputs and outputs for each observed DMU would lie within this space. Thus, one could estimate the PPS frontier, or the efficiency envelope, accordingly. Therefore, it was also in this dimension that efficiency analysis could be conducted. Distributing a new input or a new output is equivalent to moving the problem up by one dimension. In our example, adding, say, a new output implies moving the problem up from the five-dimensional Real Space to the six-dimensional one. But we would have no observations in this new, higher-dimensional space, which obviously contains the original one. Hence - as one can see - the complexity of the issue.

A few days later, I received an e-mail from Ernee in which he called for the scheduling of another meeting to discuss an idea he had had. In the email, he said: "Professor, I had a strange idea."

4.2 The first parametric formulation

This was Ernee's idea.

We have seen that the problem of distributing a new variable (input or output) is the problem of finding, in the Real Space one dimension above that in which all DMUs were originally described, a surface (seen as a frontier) that houses a PPS such that, after the distribution of the new variable, all DMUs will lie on this surface (or frontier), thus ensuring that all DMUs will be efficient.

Now, the PPS has known properties, such as the assurance that the frontier is homogeneous of degree one. In other words, the equation which supposedly describes the frontier ought to satisfy Euler's Theorem for homogeneous functions. Perhaps the main property, however, is that it is a convex space, also known as a convex set. Thus, a property that the surface must meet is that it contains a convex space (or set).
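In symbols, writing the frontier as a function f of its coordinates, first-degree homogeneity and the corresponding Euler identity read:

```latex
f(\lambda x_1, \ldots, \lambda x_n) = \lambda \, f(x_1, \ldots, x_n)
\quad \text{for all } \lambda > 0,
\qquad \text{whence (Euler)} \quad
\sum_{i=1}^{n} x_i \, \frac{\partial f}{\partial x_i}
  = f(x_1, \ldots, x_n).
```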

So, why not arbitrate the format of this space, ensuring the known property?

For instance, why not arbitrate that the PPS frontier is a hypersphere?

A sphere is a geometric solid in three-dimensional Real Space which is known to be convex; a hypersphere is the generalization of a sphere to Real Spaces of dimension greater than or equal to three, and it is also known to be convex. For any dimension, a hypersphere is defined as the set of points enclosed by a surface formed by all points that are equidistant from a common point called the central point, or center.

So, let us imagine a family of hyperspheres centered at the origin, each one of them characterized by a different radius, which is the name given to the constant distance from the center to any point over the hypersphere's surface.

Moving (or projecting) the DMUs to this hypersphere's surface (or frontier) would assure that each and all of them would become efficient. It is simple to notice that the total value of the new variable assigned to the DMUs would then vary with the different values of each hypersphere's radius.

The answer to the problem would be the hypersphere whose radius was such that the total value was exactly the one that was intended to be distributed across the DMUs.
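The radius search can be illustrated with a deliberately simplified toy (this is our own sketch for intuition, not the published Spherical Model): it distributes a new output by projecting each DMU, treated as a point of its existing variable values, onto a hypersphere centered at the origin, and uses bisection to find the radius whose allocations sum to the intended total:

```python
import math

def spherical_allocation(points, total, tol=1e-9):
    """Toy sketch: DMU j, at point p_j, receives the new-output value
    z_j = sqrt(r**2 - ||p_j||**2) that places it on the hypersphere of
    radius r. The allocated sum grows with r, so bisection on r finds
    the radius at which the allocations add up to `total`."""
    norms = [math.hypot(*p) for p in points]
    lo = max(norms)          # smallest radius that reaches every point
    hi = lo + total          # large enough: each z_j >= total there
    def alloc(r):
        return [math.sqrt(r * r - n * n) for n in norms]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sum(alloc(mid)) < total:
            lo = mid
        else:
            hi = mid
    return alloc(hi)
```

For a single DMU at (3, 4) and a total of 12, the solving radius is 13, since 5-12-13 is a Pythagorean triple.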

Although it might be simple to realize that the general idea works, the immediate and obvious question that I asked myself and that I would then hear many times is this: who or what guarantees that, in a certain real problem, the format of the PPS is a hypersphere?

The honest answer, of course, is that nobody or nothing guarantees that.

Moreover: the central idea of DEA is to allow DMUs to freely choose the weights to be assigned to each input and output variable. When one chooses the locus of points describing the frontier, as described above, these weights are determined as a result (or consequence) of that choice. That is, the weights become byproducts rather than determinants.

So then, what happened to the freedom of choice, a fundamental DEA concept?

Given that the solution became extremely simple - as opposed to the complex formulations that were available in the technical literature at that time - and given the fact that it was an undergraduate senior's academic thesis and thus limited and personal in scope, I encouraged Ernee to continue investigating his original idea.

His final results would be published two years later (see Kozyreff & Milioni, 2004).

4.3 Further developments

I decided to continue the investigation of the properties and consequences of the idea and the research revealed several interesting results.

The first one was discovering that the initial solution was suitable only for the distribution of a new input. If the idea was to distribute a new output, then the locus of points that would adequately describe the frontier would need assumptions distinct from those imposed in our initial model.

The second result was that even when the assumption was correct, there was an error in the formulation that compromised the correctness of the final solution.

Both problems were corrected and the final results were published in 2005 in a paper in which we presented the corrected Spherical Model (see Avellar, Milioni & Rabello, 2005).

It was along that time that Beasley published his famous article (see Beasley, 2003).

We compared the results of our Spherical Model with the results by Beasley, using the same data he used in an illustrative example and found out that the differences were small.

Furthermore, we found evidence that the sophisticated solution proposed by Beasley had some weaknesses - not to mention the high computational cost - which contrasted with the great simplicity of the model which postulated a priori the PPS's frontier locus of points.

These results were published in 2007 (see Avellar, Milioni & Rabello, 2007).

The evolution of this research led to several specific themes that were published in the Proceedings of many international Conferences, such as EURO, CLAIO and INFORMS.

We realized, for instance, that arbitrarily choosing the format of the space describing the PPS is equivalent to arbitrarily choosing the locus of points of the frontier that contains it. But Geometry is generous in providing curves that describe frontiers that contain convex spaces (or sets).

Immediate examples are the conics, the curves that result from the intersection of a plane with a cone: the ellipse, of which the circumference is a particular case, the hyperbola, and the parabola.

We also noticed that there is a close similarity between the choice of such a curve and the choice of the distribution function of a random variable used to describe a certain phenomenon. Because of this similarity, we decided to call our models "Parametric DEA Models", purposely alluding, as a counterpoint, to the fact that DEA is considered a tool of non-parametric Statistics.

Two new questions, then, naturally arose: the assurance of the existence of a parametric solution and the problem of distributing a second variable, after the distribution of a first one.

The latter case generated interesting questions that were discussed in more than one International Conference and a paper was finally published in 2009 (see Guedes, Freitas, Avellar & Milioni, 2009).

The intriguing aspect of this problem is that at the end of the distribution of a new variable, all DMUs are placed on the efficiency frontier. The natural question that follows, then, is this: if a second new variable is distributed after the first one, what is the consequence, if any, of the fact that all DMUs are already efficient?

We were able to prove that there were no consequences or, better, that Parametric DEA could still be adopted regardless of the fact that one or more new variables had already been distributed across all DMUs.

In the same paper we also addressed another matter regarding the relevance of the variable that is being distributed. In one of our internal meetings, this question arose and was intensively debated. In order to understand it, let us consider the general case of a corporation that needs to distribute a new input variable to its DMUs. Is the existence of a solution that would make all DMUs efficient dependent on the relevance (or magnitude) of the variable that will be distributed? Suppose, for instance, that the total fixed variable that has to be distributed to the DMUs is an orange(!). Is it possible that, at the end of the distribution, DMUs that were not efficient would become efficient only because they were assigned the appropriate share of slices of that orange?

The answer to this question is: yes, that is right. In fact, Guedes, Freitas, Avellar & Milioni (2009) showed that, as surprising as it may initially seem, this characteristic is not different from what happens in classic Econometrics.

Consider, for instance, a Linear Regression calibrated to model the behavior of a certain dependent variable from a set of independent (or explanatory) variables. If a new variable, whatever it may be (slices of an orange, for example), is included in the model in such a way that, for each observation, it takes a value proportional to the residual of the original Regression, then the model will no longer be stochastic, but deterministic; its coefficient of determination becomes equal to one, or 100%.
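A minimal numerical illustration of this point, on synthetic data (names and data ours): fitting y on x gives some R-squared below one; adding the fit's own residual as an extra regressor drives R-squared to exactly one, even though the new "variable" carries no substantive meaning:

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit of y on the columns of X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_tot = (y - y.mean()) @ (y - y.mean())
    return 1.0 - (resid @ resid) / ss_tot

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 1))
y = 2.0 * x[:, 0] + rng.normal(scale=0.5, size=50)

# Residuals of the original fit, then a re-fit using them as a regressor.
X1 = np.column_stack([np.ones(50), x])
resid = y - X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]
r2_before = r_squared(x, y)                           # below 1
r2_after = r_squared(np.column_stack([x, resid]), y)  # exactly 1
```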

In the paper we published in 2009, we showed that both scenarios are completely equivalent, concluding that in both cases the mathematical methods are robust in solving the problem. However, they both depend entirely upon the good sense of the decision maker responsible for choosing the variables that will be included in the analysis.

New developments followed.

In 2010, we published a paper (see Avellar, Milioni, Rabello & Simão, 2010) in which we showed that the Spherical Model could also be used to redistribute a variable already in the model; although, again, conditions had to be included to guarantee that the problem had a solution.

The potential application of the problem of redistributing a variable which was already present in the analysis is also easy to illustrate.

Let us consider, for example, a franchise network with a total number of employees. The network is aware of the fact that some of its franchise units are inefficient, but it does not want to lay off employees. What, then, should be the new distribution of the same total fixed number of employees that would lead all franchise units to be efficient?

Despite these and other applications, Parametric DEA Models were still regarded as something exotic, a purely theoretical development with no potential for practical use, because of their strong a priori assumption on the locus of points of the frontier.

It was then that a new issue came up: models designed to distribute a new total fixed variable to a set of DMUs should obey a condition that would be called the "Coherence Property".

4.4 The issue of coherence

In order to illustrate the Coherence Property, let us recall our Example 1 of a large Bank corporation which intends to apportion a new input (the annual cost of a communication system that is based on the purchase of satellite hours) which has a total fixed and known amount among all its branches, or DMUs. How should this be done?

Let us suppose that the Spherical Model was employed to distribute the total cost referred to in that example. Let us also suppose, however, that once the problem was solved, we found out that the solution had been obtained in the presence of a serious mistake: certain data from one branch had been wrongly typed. For instance, the branch located in São José dos Campos had an output of R$ 1 Billion in Deposits, but the problem was solved with the information that the value of that output, for that DMU, was R$ 1 Million.

The trivial action is to correct the mistake and rerun the model, right?

Right, but an interesting question arises: how does one expect the initial solution, which we will call "wrong", to compare to the new one, which we will call "right", since it is error free?

In order to answer that question, let's first examine what is embedded in it.

In the initial wrong solution we considered that the São José dos Campos branch produced an output that was much smaller than it actually was. So, this branch was assigned a certain share of the total cost that had to be split across all DMUs. Now that we know that that branch (or DMU) is much more productive than initially supposed, what is reasonable to expect? That the new share assigned to the São José dos Campos branch in the right solution should grow, or shrink?

If you understood the structural logic that at the end of the distribution every DMU should be efficient, you will agree that in the right solution, after the elimination of the typo, the São José dos Campos branch should receive a share of the total cost that is greater than the one that had been assigned to it in the initial wrong one.

Indeed, that was expected, but not only that.

Consider, now, all the other DMUs (branches) that had no data changed (we are assuming that the wrong solution was obtained with a single typo error which was corrected prior to the right solution, everything else remaining constant). How does one expect to compare the share of the total cost assigned to each DMU in both solutions (the original wrong one and the right one that followed)?

The answer, again, should not be surprising. If the São José dos Campos branch received, at the right solution, more of the new total fixed input (share of the total cost) than it had received in the wrong one, it is reasonable to expect that all other branches would receive, in the right solution, a little less than they had received in the wrong one.

At the XXII EURO (European Meeting of Operational Research Societies), held in Prague, Czech Republic, in 2007, we showed the proof of a theorem that stated that the Parametric DEA model of the Spherical type respected this so-called Coherence Property. We also showed the necessary conditions on the locus of points describing the frontier in order for the Coherence Property to be observed.

Furthermore, and perhaps more important, we also showed, using counterexamples, that no other technical solution available at that time could provide this desirable property.

Both the ZSG and the Beasley models exhibited erratic behavior. The correction of the single typo led to unexplainable and hardly acceptable variations in the values assigned to each DMU between the wrong and right solutions. Some DMUs had their shares increased while others had theirs decreased, and in some cases, especially in the model proposed by Beasley, the variations were huge, probably because of the non-linear stage of the model.

It seems fair to attribute the sudden growth of interest in Parametric DEA models to these results. Reviews of submitted papers began to be less critical, although the time to final acceptance (not to mention publication) was still very long.

It would only be in 2011, for instance, that the Journal of the Operational Research Society would publish a paper, accepted in 2009, in which we showed that a Parametric Hyperbolical Model could be used to distribute a new output of total fixed value to a set of DMUs. As pointed out earlier, the models available until then could only distribute inputs (see Milioni, Avellar, Rabello & Freitas, 2011).

And it would also only be in 2011 that the European Journal of Operational Research would publish the results of the Ellipsoidal Model, which generalized the Spherical Model and thereby solved another problem of Parametric DEA Models: it made it possible to add some control over the weights of the input and output variables, even though that possibility was limited at the time (see Milioni, Avellar, Gomes & Mello, 2011).

In 2012, the Journal of the Operational Research Society published our paper proposing a technical adjustment for the Spherical Model, a refinement that we developed in 2007 and first submitted to the referred journal early in 2008 (see Milioni, Guedes, Avellar & Silva, 2012).

4.5 Controlling the weights

The proof of the Coherence Property received positive feedback, which increased the acceptance of Parametric DEA Models. Despite that, however, the arbitrariness involved in choosing the locus of points describing the PPS's frontier and the resulting weights assigned to the input and output variables still inspired skepticism among some researchers.

There are currently two lines of research on this subject.

The first one concerns the elliptical models.

Let us recall that in the Spherical Model the PPS was defined over vectors of positive real numbers bounded by a sphere (or hypersphere) centered at the origin. The vector's dimension equals the number of variables in the problem, and the coordinates are taken as positive for simplicity. In the Elliptical Models, the solid centered at the origin is not a sphere but an ellipsoid (or hyperellipsoid). A sphere has only two degrees of freedom, since it is fully described by fixing its center and its radius. An ellipsoid, however, requires the specification of the center and of one eccentricity per dimension. Thus, while a sphere in a vector space of any dimension is defined by only two degrees of freedom (center and radius), an ellipsoid is defined by a number of degrees of freedom equal to the dimension of the vector space plus one (one for each eccentricity, plus one for the specification of the center). We were able to show that, by making use of these additional degrees of freedom, the distribution of the new input could be made with partial control of the weights of the input and output variables present in the problem. That is, the weights were no longer entirely fixed as a consequence of the choice of the locus of points describing the PPS's frontier. The control was only partial because the choice of the eccentricities was conditioned to the feasibility of the solution, a technical issue whose discussion we omit.
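The contrast between the two frontiers can be made explicit. In standard notation (a sketch, assuming as in the text a frontier centered at the origin and restricted to positive coordinates), a sphere of radius $r$ in an $n$-dimensional space is compared with an axis-aligned ellipsoid with semi-axes $a_1, \dots, a_n$:

```latex
\underbrace{\,x_1^2 + x_2^2 + \dots + x_n^2 = r^2\,}_{\text{sphere: center and radius only}}
\qquad\qquad
\underbrace{\,\frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} + \dots + \frac{x_n^2}{a_n^2} = 1\,}_{\text{ellipsoid: center and } n \text{ semi-axes}}
```

Choosing the $n$ semi-axes independently, rather than a single radius, is what provides the additional degrees of freedom used to exert partial control over the weights.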

As often happens in scientific developments, the solution proposed also raised new problems that still need investigation. In this case, the most immediate and natural one concerns the best way to choose the eccentricities.

The second line of research is more recent. It regards the inclusion of additional constraints on Parametric DEA Models allowing the control of the weights of all variables in the problem through the incorporation of a formulation known as the Cone Ratio (see Charnes, Cooper, Wei & Huang, 1989). As it would be proven later on, this method generalizes most of the other methods designed to control the weights of the variables in DEA. It was precisely for this reason that we decided to incorporate the Cone Ratio method into Parametric DEA formulations. Our first results were published in 2012 (Silva & Milioni, 2012). An earlier version of this paper (see Silva & Milioni, 2010) was presented at the Brazilian National Symposium of Operations Research (SBPO), where it received the Roberto Diegues Galvão award, given every year to the paper considered the best one presented at the conference, which is sponsored by SOBRAPO, the Brazilian Society of Operations Research.

5 CONCLUSIONS AND FUTURE PROSPECTS

We continue to investigate the use of Parametric DEA Models with constraints on the weights of the input and output variables.

We already know, however, that the desirable possibility of including restrictions on the weights implies an undesirable consequence: the models lose the Coherence Property discussed in subsection 4.4.

We have also realized that imposing restrictions on the weights is equivalent to modifying the shape of the PPS's frontier. The exact parametric configuration defined a priori no longer holds, because it is replaced by a shape that is only approximately similar to the original one.

We have also understood that the loss of the Coherence Property is directly related to the magnitude of the intervention, that is, to the strength of the restrictions added to the problem. Weak restrictions barely modify the shape of the PPS's frontier and only mildly violate the Coherence Property, whereas strong restrictions tend to modify the PPS's frontier considerably, and the Coherence Property is then strongly violated. By mildly violating the Coherence Property we mean, for example, that a variation that was expected to be positive might turn out negative, but when and if that happens, the negative value is small, close to zero. When we say that the Coherence Property is strongly violated, the meaning is the opposite. One of the problems we are investigating is how to establish metrics that adequately support this perception.
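As a toy illustration of what such a metric might look like (this sketch is ours, not a metric from the literature; the function name and sign convention are assumptions), one could sum the magnitudes of the share variations whose direction contradicts what the Coherence Property predicts:

```python
def coherence_violation(expected_signs, deltas):
    """Toy metric: sum of |variation| over the DMUs whose share moved
    against the direction predicted by the Coherence Property.

    expected_signs: +1 if the DMU's share was expected to grow,
                    -1 if it was expected to shrink.
    deltas: observed variations (right solution minus wrong solution).
    """
    total = 0.0
    for sign, delta in zip(expected_signs, deltas):
        if sign * delta < 0:  # the share moved in the unexpected direction
            total += abs(delta)
    return total
```

A value of zero means that no DMU moved against the expected direction; small positive values correspond to the mild violations described above, and large values to strong ones.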

Therefore, we are investigating a procedure to include constraints on the weights to give the decision maker some kind of control over the magnitude of the loss with regard to the Coherence Property.

This research is currently dedicated to the Spherical Model, since it is the easiest to handle, as it has only two degrees of freedom.

In parallel, we are testing the insertion of weight constraints in the Hyperbolical and Ellipsoidal Models, both currently under development. The Elliptical case is analytically hard, although it seems conceptually simple, and it should be ready in the near future. The Hyperbolical case, used for the distribution of a new total fixed output, seems more challenging because there is no conic curve that generalizes the hyperbola in the way the ellipsoid generalizes the sphere (it is worth recalling that the Spherical and Elliptical Models are appropriate for the distribution of a new total fixed input).

In particular, in the Elliptical Model, we are also investigating a criterion for choosing the values of the eccentricities. One line of research we are pursuing assumes that the optimal set of eccentricities is the one that moves all DMUs to the efficiency frontier with the smallest possible variation in the initial values of the variable to be redistributed. The idea is defensible, but it requires the existence of an initial distribution. If the problem is redistributing an already existing variable, this is not an issue. However, if the problem is the distribution of something new, the question becomes relevant. In this case, we are investigating the effects of different initial solutions, ranging from uniform (all DMUs receive the same share of the total value of the variable to be distributed) to proportional to a single variable, the traditional approach in the technical literature prior to the use of DEA type models for solving this problem.
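The two families of initial solutions mentioned above are simple to state precisely. A minimal sketch (the function names are ours, chosen for illustration only):

```python
def uniform_allocation(total, n):
    """Uniform initial solution: every DMU receives the same share."""
    return [total / n] * n

def proportional_allocation(total, reference):
    """Proportional initial solution: each DMU's share is proportional
    to a single reference variable (e.g., one of its outputs)."""
    ref_sum = sum(reference)
    return [total * r / ref_sum for r in reference]
```

For instance, distributing a total cost of 100 across four branches yields shares of 25 each under the uniform rule, while the proportional rule splits the total according to the chosen reference variable.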

We are convinced that, in due time, a better understanding of the dynamics of these models will lead to further developments and applications. Who knows? Maybe even from an undergraduate's "strange idea".

ACKNOWLEDGMENTS

I would like to thank CNPq (project numbers 472106/2007-4, 472122/2009-6 and 470916/2011-7) and FAPESP (project number 06/01438-6), for the research grants, and I would also like to express my gratitude to all undergraduate and graduate students who worked with me in DEA in the past 10 years. They are: Alexandre Polezzi, André Luiz Chiossi Forni, Ernée Kozyreff Filho, Eric Cezzane Colen Guedes, Gustavo Machado de Freitas, José Virgílio Guedes de Avellar, Henry Rossi de Almeida, Rodrigo Cesar da Silva, Joyce Evania da Costa Toledo Teixeira and Luciene Bianca Alves.

Received October 20, 2012

Accepted March 13, 2013

• * Corresponding author.
• 1 The use of the first person (pronoun "I") refers to Prof. Armando Milioni.
• [1] AVELLAR JVG, POLEZZI A & MILIONI AZ. 2002. On the evaluation of Brazilian Landline Telephone Services Companies. Pesquisa Operacional (impresso), 22(2): 231-246.
• [2] AVELLAR JVG, MILIONI AZ & RABELLO TN. 2005. Modelos DEA com variáveis limitadas ou soma constante. Pesquisa Operacional (Impresso), 25(1): 135-150.
• [3] AVELLAR JVG, MILIONI AZ & RABELLO TN. 2007. Spherical frontier DEA model based on a constant sum of inputs. Journal of the Operational Research Society, 58: 1246-1251.
• [4] AVELLAR JVG, MILIONI AZ, RABELLO TN & SIMÃO HP. 2010. On the redistribution of existing inputs using the spherical frontier DEA model. Pesquisa Operacional (Impresso), 30: 1-16.
• [5] BANKER RD, CHARNES A & COOPER WW. 1984. Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis. Management Science, 30: 1078-1092.
• [6] BEASLEY JE. 2003. Allocating fixed costs and resources via data envelopment analysis. European Journal of Operational Research, 147: 198-216.
• [7] CHARNES A, COOPER W & RHODES E. 1978. Measuring the efficiency of decision-making units. European Journal of Operational Research, 2: 429-444.
• [8] CHARNES A, COOPER W, WEI Q & HUANG Z. 1989. Cone ratio data envelopment analysis and multi-objective programming. International Journal of Systems Science, 20: 1099-1118.
• [9] COOK WD & KRESS M. 1999. Characterizing an equitable allocation of shared costs: A DEA approach. European Journal of Operational Research, 119: 652-661.
• [10] ESTELLITA LINS MP, GOMES EG, SOARES DE MELLO JCCB & SOARES DE MELLO AJR. 2003. Olympic ranking based on a zero sum gains DEA model. European Journal of Operational Research, 148: 312-322.
• [11] FARRELL MJ. 1957. The Measurement of Productive Efficiency. Journal of the Royal Statistical Society, 120: 253-281.
• [12] GUEDES EC, FREITAS GM, AVELLAR JVG & MILIONI AZ. 2009. On the allocation of new inputs and outputs with DEA. Engevista, 11: 4-7.
• [13] KOZYREFF FILHO E & MILIONI AZ. 2004. Um Método para Estimativa de Metas DEA. Produção (São Paulo), 14(2): 90-101.
• [14] LOZANO SA & VILLA G. 2004. Centralized resource allocation using data envelopment analysis. Journal of Productivity Analysis, 22: 143-161.
• [15] MILIONI AZ, AVELLAR JVG, GOMES EG & MELLO JCCBS. 2011. An Ellipsoidal Frontier Model: allocating input via parametric DEA. European Journal of Operational Research, 209: 113-121.
• [16] MILIONI AZ, AVELLAR JVG, RABELLO TN & FREITAS GM. 2011. Hyperbolic frontier model: a parametric DEA approach for the distribution of a total fixed output. Journal of the Operational Research Society, 62: 1029-1037.
• [17] MILIONI AZ, GUEDES ECC, AVELLAR JVG & SILVA RC. 2012. Adjusted spherical frontier model: allocating input via parametric DEA. Journal of the Operational Research Society, 63: 406-417.
• [18] SILVA RC & MILIONI AZ. 2010. Parametric DEA Model with Weight Restrictions. Proceedings of SBPO 2010, Bento Gonçalves, RS, Brazil, 1: 377-387.
• [19] SILVA RC & MILIONI AZ. 2012. The Adjusted Spherical Frontier Model with Weight Restrictions. European Journal of Operational Research, 220: 729-735.

# Publication Dates

• Publication in this collection
24 May 2013
• Date of issue
Apr 2013