Systems of exchange values as tools for multi-agent organizations

Abstract

This paper introduces systems of exchange values as tools for the organization of multi-agent systems. Systems of exchange values are defined on the basis of the theory of social exchanges developed by Piaget and Homans. A model of social organization is proposed in which social relations are construed as social exchanges, and exchange values are used to support the continuity of those exchanges. The dynamics of social organizations is formulated in terms of the regulation of exchanges of values, so that social equilibrium is connected to the continuity of the interactions. The concept of a supervisor of social equilibrium is introduced as a centralized mechanism for solving the equilibrium problem of the organization. The equilibrium supervisor solves this problem by means of a qualitative Markov Decision Process that uses numerical intervals to represent exchange values.

Keywords: Social Exchanges, Exchange Values, Exchange Values-Based Social Organization, Social Equilibrium, Equilibrium Supervisor, Qualitative Markov Decision Process



Graçaliz P. Dimuro (I); A. C. Rocha Costa (I, II); Luiz A. M. Palazzo (I)

(I) Escola de Informática, Universidade Católica de Pelotas - UCPel. R. Felix da Cunha 412, 96010-000 Pelotas, RS, BRAZIL. liz@atlas.ucpel.tche.br; rocha@atlas.ucpel.tche.br; lpalazzoj@atlas.ucpel.tche.br

(II) Programa de Pós-graduação em Computação-PGIE, Universidade Federal do Rio Grande do Sul - UFRGS. Av. Bento Gonçalves 9500, Bloco IV, 91501-970, Porto Alegre, RS, BRAZIL


1 INTRODUCTION

The exchange values approach to social interactions [31, 19], in which an interaction is seen as an exchange of services among a small group of agents, along with the corresponding evaluation of (assignment of values to) such services, was conceived as a methodological stance where organizations are taken as collections of small groups of agents.

Each such group is seen as having its own internal interaction dynamics, and the collection of all groups may be recursively construed as a collection of small groups of small groups, etc., thus picturing the organization as a hierarchy of levels of structurally similar dynamical systems, all centered around the issue of exchanging values between their components.

The overall balance of exchange values involved in the operation of a system can be analyzed for its state in terms of equilibrium (beneficial for all agents) or disequilibrium (beneficial to a particular agent or group of agents, or beneficial to nobody).

The initial development of the exchange values approach to social organizations is usually attributed to George Caspar Homans [19, 20]. For this reason, the approach is sometimes criticized as too narrow in scope, on the basis of its supposedly inherent behavioristic limitations, which stem from Homans' explicit adoption of Burrhus Skinner's behavioristic psychology as an ancillary explanatory theory of individual human behavior.

Nevertheless, the exchange values approach to the analysis of social interaction has nothing inherently behavioristic in it. Well before Homans introduced his ideas in [19], Jean Piaget proposed a constructive, non-behavioristic, theory of social exchanges, where exchange values play a central role [30, 31].

Moreover, Piaget's theory of exchange values goes much further than Homans' theory, both in the scope of application of exchange values (going as far as showing their role in the origination of laws, moral rules and organizational norms) and in the degree of formalization of the ideas.

The main contribution of the exchange values approach to social exchanges, in comparison with the classical approach based on quantitative utility functions, is that it makes room for the subjective, qualitative values with which people judge their daily exchanges (good, bad, better than, worse than, etc.). Such values usually cannot be faithfully represented quantitatively, because there are no neat objective conditions for their measurement.

In this paper, we build on our previous work on the application of Piaget's approach to the analysis of interactions in multi-agent systems [34, 35, 14] and cooperative environments [8, 10], and consider how systems of exchange values can serve as tools for solving problems concerning the organization of multi-agent systems (footnote 1). We also explain our understanding of the connection between Piaget's and Homans' approaches to exchange values, showing the role that Homans' theory can play in the broader analytical framework set up by Piaget.

Footnote 1: Values have been extensively used in the MAS area, through value-based and market-oriented decision, and value-based social theory; see, e.g., [2, 36, 25]. The merits of and interest in using Piaget's notion of exchange values are discussed in [34, 35], which show its complementarity to the dependence theory approach.

The paper is organized as follows. In Sect. 2, we summarize the sociological basis of the work, as encompassed by Piaget's theory of exchange values (Sect. 2.1), by Homans' theory of social behavior (Sect. 2.2), and by the way we think both theories fit together (Sect. 2.3).

In Sect. 3, we present in an axiomatic way our notion of social organization: in Sect. 3.2 we state how social organizations are construed by social functions, social roles, exchange values, and social rules. In Sect. 3.4, we express the dynamics of social organizations, in terms of regulation of exchanges of values, so that social rules and social equilibrium can be connected to that dynamics. Section 3.5 introduces the notion of supervisors of social equilibrium.

In Sect. 4 we present the main concrete contribution of the paper, namely, a general model of a supervisor of social equilibrium. Section 4.1 introduces our way of representing qualitative exchange values by intervals of a numeric scale. Section 4.2 explains the use of such intervals in the modelling of social exchanges between two agents, and Sect. 4.3 generalizes that modelling to a matrix-like notation capable of representing social exchanges between all agents of an organization. Section 4.4 shows how the exchange values equilibration problem of an organization can be solved by a qualitative interval Markov Decision Process [37], which uses the interval representation for qualitative values introduced earlier.

Section 5 presents the analyses of the model: a theoretical analysis (Sect. 5.1) of the reachability of the equilibrium state, considering the case in which all agents follow the recommendations of the equilibrium supervisor; and a comparative analysis (Sect. 5.2) of simulations of unsupervised and supervised exchange processes, considering different degrees of obedience to the supervisor. Section 6 is the Conclusion.

2 SOCIOLOGICAL BASES

2.1 PIAGET'S THEORY OF EXCHANGE VALUES

As a cognitive psychologist, Piaget gave much less attention in his work to social and affective aspects of human behavior, than to its cognitive aspects. Nevertheless, he had the opportunity to express in various places his ideas about those subjects (e.g., in [30, 31]).

2.1.1 INTERACTIONS AS SOCIAL EXCHANGES

In what concerns human society, Piaget adopts a relational approach, such that the structure of a society is defined as a relational structure where the relationships among the individuals are established by social exchanges among them. Thus, interactions are understood as exchanges of services among individuals, involving not only the realization of services by some individuals on behalf of others, but also the valuation of such services, from various points of view, by every individual involved in them.

A service performed by an individual, however, is not a simple action or interference in the action of somebody else. To count as a service, an action performed by an individual has to be understood by all individuals involved in the action as an intentional action, directed toward some other individual, thus allowing its evaluation as beneficial or prejudicial to the latter. The evaluation of a service by an individual (either the server of the service or its client) is done on the basis of a scale of so-called exchange values, which are of a qualitative nature, since such values express subjective evaluations (footnote 2).

Footnote 2: A scale of exchange values is a set of qualitative, ordinal values, which can be compared for their magnitudes (less than, equal, greater than), but cannot be algebraically operated on in an unrestricted way, as fully quantitative values can. E.g., the values can be added or subtracted: if a < b and a < c then a < b + c. However, the differences between values cannot be compared: if a < b and c < d it is not possible to decide which of b - a < d - c, b - a = d - c, or b - a > d - c is true.
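The ordinal character of such a scale can be illustrated with a minimal Python sketch. The five level names below are hypothetical and are not taken from the theory; only magnitude comparison is modelled, and the restricted form of addition mentioned in the footnote is deliberately left out.

```python
from enum import Enum


class ExchangeValue(Enum):
    """Hypothetical five-level ordinal scale (names are assumptions)."""
    VERY_BAD = 1
    BAD = 2
    NEUTRAL = 3
    GOOD = 4
    VERY_GOOD = 5

    # Only order comparisons are defined; no arithmetic is provided,
    # so differences between values cannot be formed or compared.
    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented

    def __le__(self, other):
        if self.__class__ is other.__class__:
            return self.value <= other.value
        return NotImplemented

    def __gt__(self, other):
        if self.__class__ is other.__class__:
            return self.value > other.value
        return NotImplemented

    def __ge__(self, other):
        if self.__class__ is other.__class__:
            return self.value >= other.value
        return NotImplemented
```

With this sketch, `ExchangeValue.BAD < ExchangeValue.GOOD` holds, while an expression such as `ExchangeValue.GOOD - ExchangeValue.BAD` raises a `TypeError`, mirroring the fact that differences on the scale are not meaningful.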

2.1.2 EXCHANGE VALUES AND SOCIAL EQUILIBRIUM

Exchange values give rise to a qualitative economy of social exchanges, where individuals acquire credits for services they have performed, and incur debits to others for services the others have performed for them. The balances of exchange values allow individuals to observe the state of equilibrium of the social exchanges (even between just two individuals) and to react according to such state (e.g., trying to enforce equilibrium, to overcome high debts, to keep their status as privileged beneficiaries of the exchanges, etc.).

Qualitative exchange values encompass economic values as a particular kind of (quantitative) values, and are seen as the cornerstones of social rules. Social rules of the many different varieties (formal or informal; moral, economic or juridical, including organizational norms) can often be understood as means put into operation to guarantee that the overall balance of exchange values is kept in certain equilibrated (or disequilibrated, i.e., favorable to some individuals or groups of individuals) states, so that individuals are kept motivated (or forced) to continue their participation in those exchanges.

A crucial aspect for the right understanding of Piaget's approach is the clear differentiation between the notions of social equilibrium and social order (social stability). Social equilibrium should be understood as a kind of equality, or equity, in the distribution of exchange values among the agents participating in the exchange. Social order, or social stability, should be understood as the temporal continuity of the established set of social exchanges. The two concepts are orthogonal, in the sense that any stable or unstable exchange can be either in equilibrium or in disequilibrium, and any equilibrated or disequilibrated exchange can be either stable or unstable. On the other hand, they are not independent, in the sense that each may impact the other (footnote 3).

Footnote 3: This mutual dependence was not fully exploited by Piaget, who concentrated on stable ("static") societies.

The main technical contribution of this paper is the concept of an equilibrium supervisor, a system component (possibly an agent) that, at each moment, is able to recommend that particular exchanges among agents be performed so that a subset of the agents of the system (possibly, the whole set of agents) be kept in equilibrium (or, disequilibrium). That is, so that all agents in that set benefit from the exchanges in equal (or, differentiated) terms. In other words, equilibrium supervisors embody social rules designed to keep a set of agents in certain (equilibrated or disequilibrated) states of exchange values.

2.1.3 MATERIAL AND VIRTUAL EXCHANGE VALUES

Social exchanges are classified by Piaget into two broad categories: immediate exchanges and deferred exchanges. In immediate exchanges, individuals exchange services in an immediate way, service for service, so that the evaluations of such services can be done immediately, as the services are being performed, allowing each individual to immediately regulate the quality and quantity of the service it performs for the other (as when two people exchange material goods, negotiating the quantities in which each good will participate in the exchange).

Two kinds of values are associated to such services, corresponding to the investment (cost) necessary to perform them, and to the satisfaction they may cause to the client. Such values are called material exchange values.

Deferred exchanges involve a separation in time between the stages of an exchange of services, and give rise to so-called virtual exchange values, encompassing credits and debits: after an individual has performed a service for another, the first is entitled to a credit for the service performed, and the other is assigned a debit for receiving the service, which it is supposed to pay in the future (footnote 4).

Footnote 4: The term virtual value refers precisely to the fact that such values represent services that are yet to be performed, in the future.

2.1.4 SOCIAL RULES

Virtual values have the weakness that they tend to vanish as time passes: in a very distant future, one may not feel obliged to perform a service to somebody, in return for a service he received before, precisely because that fact, having happened in a too distant past, may have lost its importance for the current situation.

Thus, besides serving the purpose of governing individual reactions to the state of the overall balance of exchange values, social rules also play the essential role of preventing the vanishing of virtual values, by associating such values with behaviors that aim at their conservation and that are to be mandatorily performed. Virtual values find their neatest application in the regulation of exchanges between two individuals, but are easily extended to cover relations involving more individuals.

Social rules regulating the preservation of such values are either moral rules, historically created by the group, or private contracts spontaneously established between the individuals involved in the exchanges.

2.1.5 THE STRUCTURE OF SOCIAL EXCHANGES

A social exchange between two agents, α and β, involves two types of stages, illustrated by the schemes in Fig. 1. In stages of type Iαβ, the agent α realizes a service for β. The exchange values involved in this type of exchange stage are the following:

- rIαβ, the value of the investment made by α in the realization of the service for β;

- sIβα, the value of β's satisfaction due to receiving the service performed by α;

- tIβα, the value of β's debt, that is, the debt it acquired towards α for its satisfaction with the service performed by α;

- νIαβ, the value of the credit that α acquires from β for having realized the service for β.


Investment values are always negative, while the other values may be either positive or negative.

In stages of type IIαβ, α asks β for payment for the service it previously performed for β, and the values related to this exchange stage (νIIαβ, tIIβα, rIIβα and sIIαβ) have similar meanings. The values rIαβ, sIβα, rIIβα and sIIαβ are called material values, while tIβα, νIαβ, tIIβα and νIIαβ are the virtual values. The order in which the exchange stages occur is not necessarily Iαβ-IIαβ.

Piaget's modelling of social exchanges has an algebraic flavor, aiming at the formalization of algebraic laws for the operations involved in those exchanges, laws that serve as the bases for formalization of the rules that determine the equilibrium of exchanges:

Rule Iαβ: (rIαβ = sIβα) ∧ (sIβα = tIβα) ∧ (tIβα = νIαβ)

Rule IIαβ: (νIIαβ = tIIβα) ∧ (tIIβα = rIIβα) ∧ (rIIβα = sIIαβ)

Rule IαβIIαβ: (νIαβ = νIIαβ)

Rule Iαβ states the conditions for the internal equilibrium of stage Iαβ, implying that the investment made by α in the performance of the service for β equals the credit that β assigns to α, that is

Rule Iαβ ⇒ rIαβ = νIαβ.

Rule IIαβ states the conditions for the internal equilibrium of stage IIαβ, implying that the credit charged by α on β equals the satisfaction α gets from the return service performed by β, that is

Rule IIαβ ⇒ νIIαβ = sIIαβ.

Rule IαβIIαβ states the conditions for the external equilibrium between the two stages, Iαβ and IIαβ, implying that the initial investment made by α equals the final satisfaction it gets from the interaction with β, that is

Rule IαβIIαβ ⇒ rIαβ = sIIαβ.
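The three equilibrium rules can be checked mechanically. The following Python sketch is only an illustration under simplifying assumptions: each exchange value is represented by a plain integer standing for its magnitude on the scale (signs are ignored), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageI:
    """Values of a stage I_ab: agent a realizes a service for agent b."""
    r: int  # investment made by a
    s: int  # satisfaction of b
    t: int  # debt acknowledged by b
    v: int  # credit acquired by a


@dataclass(frozen=True)
class StageII:
    """Values of a stage II_ab: a charges b for the earlier service."""
    v: int  # credit charged by a
    t: int  # debt acknowledged by b
    r: int  # investment made by b in the return service
    s: int  # satisfaction of a with the return service


def rule_i(st: StageI) -> bool:
    # Internal equilibrium of stage I: r = s, s = t, t = v.
    return st.r == st.s == st.t == st.v


def rule_ii(st: StageII) -> bool:
    # Internal equilibrium of stage II: v = t, t = r, r = s.
    return st.v == st.t == st.r == st.s


def rule_i_ii(first: StageI, second: StageII) -> bool:
    # External equilibrium: credit acquired equals credit charged.
    return first.v == second.v


def in_equilibrium(first: StageI, second: StageII) -> bool:
    return rule_i(first) and rule_ii(second) and rule_i_ii(first, second)
```

When all three rules hold, the chain of equalities gives `first.r == second.s`, that is, the initial investment of α matches its final satisfaction, as stated above.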

The equilibrium rules play a central role in Piaget's explication of the dynamics of the social organization and in the identification of situations of disequilibrium (including several kinds of social crises).

The development of certain moral rules, for instance, is associated to the equilibrium rules, for such moral rules are developed precisely for the sake of guaranteeing the validity of the equilibrium rules, through the enforcement of certain behaviors that can compensate behaviors which are prone to produce (or, that may have produced) a situation of disequilibrium.

The main role of the equilibrium supervisor introduced in this paper is precisely that of indicating what behaviors should be performed at each moment, by the agents, so as to compensate extant deviations from the state of equilibrium of exchange values.

2.2 HOMANS' THEORY OF ELEMENTARY SOCIAL BEHAVIORS

Homans approached the subject of social exchanges (which he called elementary social behaviors) from a different point of view [20]: he was interested in explaining why each agent behaves the way it does in such exchanges. Being a sociologist, Homans borrowed from Skinner the theory of operant conditioning, as a means to explain why men continue to behave in certain ways (or change behaviors) in certain situations.

Homans turned to exchange values when he looked for a sufficient stimulus for continued (or discontinued) social behavior, and found it in the concept of profit, defined as profit = benefit - cost, where benefit and cost are defined almost exactly as in Piaget's theory. Profit, in this qualitative sense, is seen as the element that can play the role of stimulus: bigger profit means stronger stimulus to continue the current behavior, while smaller profit means stronger stimulus to discontinue the current behavior.

Of course, Homans knew that the search for profit maximization does not necessarily lead to the best overall results for the partners of an interaction, especially when seen from the point of view of social equilibrium, as Pareto had shown much earlier (footnote 5). So, Homans had to extend Skinner's conceptions of human behavior with notions that are essentially non-behavioristic (such as the notion of personal integrity before a group [19]) in order to produce consistent explanations.

Footnote 5: Homans and Piaget were both sympathetic readers of Pareto; see [22] and [30].

Homans' proposal gave the starting point for the formalization we present below, where the process of social control that regulates an interaction (which in Homans' theory results from the combination of the various individual behaviors involved in the interaction) is reduced to a Markov Decision Problem [37], to be solved by a central equilibrium supervisor.

2.3 THE ROLE THAT HOMANS' THEORY MAY HAVE IN PIAGET'S FRAMEWORK

The striking similarity between the basic concepts of the two theories, by Homans and Piaget, regarding the idea that a society is based on an organization where the relationships between individuals are conceived as valued exchanges, raises the question of how the two theories can be brought closer to each other.

From our point of view, the way the two theories can be combined is based on the following observations. On the one hand, in the theory of the psychologist Piaget, the psychological aspects of social interactions were kept out of the formalization of social exchanges, to allow for a qualitative algebraic form. The price Piaget paid for that is that the theory had to give up producing a description of the decision processes that the individuals adopt while interacting, that is, to give up explaining the individuals' behaviors.

On the other hand, in the theory of the sociologist Homans, the psychological aspect of social interactions was brought to the foreground, in order to produce a description of the decision processes adopted by the interacting individuals, and an explanation of their behaviors. The behavioristic basis of the adopted psychological explanation, however, forced two consequences. First, the exchange values that could be taken into account were restricted to those of a quantitative nature, so that the notion of profit could be introduced, as a measurable and comparable difference between benefits and costs. Second, the psychological theory itself had to be extended in non-behavioristic directions, in order to accommodate subjective phenomena not directly observable, but essential for the explanation of moral behaviors.

The very constructions of the two approaches, thus, already indicate the way they may be combined: Piaget's theory can be used to formalize an explanation of social exchanges where the agents are understood algebraically, on the basis of the operations they use to handle the exchange values, while Homans' theory can be adapted to deal with the Piagetian algebra of qualitative values, in order to formalize the internal decision processes that agents follow when deciding whether to keep or change their social behaviors.

In this paper, we present a first step in that direction. We don't show agents making decisions on the basis of their own idiosyncratic criteria, but we show a simplified version of the problem, where all agents are assumed to uniformly follow the same set of criteria when deciding what exchanges to propose to another agent, but where each agent is allowed to answer to that proposal in its own manner.

By halving the complexity of the problem in this way, we were able to perform the first, provisional stage in the formalization of the proposed integration of the two sociological theories that were taken as the bases of our work. The next step in the evolution of this synthesis, accounting for the agents' individual decision making with respect to exchange values, is briefly considered in the Conclusion of the paper.

3 SOCIAL ORGANIZATIONS

In this section, we build on previous work on the dynamics of multi-agent organizations [4, 5, 6, 7, 3, 13] in order to coordinate the notions of organization and exchange values, so that systems of exchange values can become useful tools for the organization of multi-agent systems.

The central concept introduced here is that of dynamics of exchange values in social organizations.

3.1 THE NOTION OF SOCIAL ORGANIZATION

The notion of organization is not univocal. There are at least two main senses in which it is used, namely, a functional sense and a structural sense. In the functional sense, organization is one of the functional invariants that characterize all forms of autonomous dynamic systems [28]. In the structural sense, the organization of an autonomous dynamic system is the relational structure that allows the system's components to interact with each other. In this paper, we use the term organization in the latter, structural sense.

In its simplest form, the organization of a society S, at a given time t, is conceived as a structure $O^t_S = (A^t_S, E^t_S)$, where $A^t_S$ is the set of agents of the society at time t, and $E^t_S$ is the set of social exchanges that are happening at that time (footnote 6). In the following, we will only consider the static (synchronic) case, where the organization of a society is not changing while the society is functioning. So, we will leave the time index t implicit. Also, since we will consider one single society, we will leave the index S implicit.

Footnote 6: We follow [11] in considering the dependence relations as one of the main reasons for the establishment of social exchanges.

3.2 SOCIAL FUNCTIONS, SOCIAL ROLES, EXCHANGE VALUES, AND SOCIAL RULES

Societies with the simple form defined above would have great difficulty persisting over time: the only permanent elements they have are the agents, since exchanges are usually finite and vanish suddenly. Agents in such societies are free to interact with any other agents, at any time, as they please, or simply not to interact at all. Organization, in such societies, is no more than a set of momentary exchanges, and there is no way it can become a structural invariant of the society.

Many other permanent elements are required in the organization of a society, if the society is to be able to keep its organization reasonably stable during a certain period. We identify four main notions allowing for such invariant organization, namely, social functions, social roles, exchange values, and social rules:

- social functions are the services that agents (or, sets of agents), perform for other agents (or, sets of agents) in the society, and that justify the very existence of the society: agents that are self-sufficient need not live in societies; only agents that have the need that others perform certain services for them care to live together with others;

- social roles are the relational elements that establish the links between the agents, the social functions they perform, and the behavior they should have in order to perform such functions in an adequate way; social roles are the gluing elements of an organization;

- exchange values are the means by which the services that an agent performs are evaluated by the members of the society (including the agent itself and the other agents involved in the exchange), so that a resulting balance of exchange values can be used by social rules (see below) to compensate behaviors that deviate the society from the desired kind of balance (equilibrated, disequilibrated);

- social rules are the means by which agents (given the social functions that are to be performed in the society, and the way the social roles were assigned to the agents) are obliged to behave in certain ways, and forbidden to behave in some other ways.

Implicit in the structural conception of organization is, thus, a dynamical notion of social equilibrium (or, disequilibrium), which - in this paper - we restrict to the notion of equilibrium (or, disequilibrium) of social exchanges, as defined by Piaget.

3.3 DEFINITION OF SOCIAL ORGANIZATION

For the purposes of this paper, the notion of social organization can be defined as:

Definition 1. The organization of a society is a structure O = (A, F, Ro, E, BV, Ru), where: A is the set of agents; F is the set of services (functions) that agents, and sets of agents, should provide for each other; Ro is the set of social roles that agents may be assigned to; E is the set of social exchanges that agents may perform between them; BV is the set of balances of exchange values that supports the various ways agents may evaluate social exchanges; Ru is the set of social rules that regulate the agents' behaviors (footnote 7).

Footnote 7: We leave the details of the sets in Def. 1 undetermined, in order to have an abstract notion of social organization, which may be instantiated in various ways, in various applications.

To the main elements of the organization of a society, given in the definition above, a few more complementary elements should be added, so that the dynamics of the organization can be explained:

- the set IBeh of all possible individual behaviors of all agents of the society, so that to each agent corresponds the set of individual behaviors that it is capable of realizing, given by IB : A → IBeh;

- the way the set E of all possible social exchanges is related to the subsets of agents that are capable of realizing them together, given by Cap : ℘(A) → E;

- the way each social function is implemented by a set of agents in the form of a social exchange, given by I : F × ℘(A) → E;

- the way each social role determines the individual behavior of the agent to which it is assigned, regarding the performance of a social function, given by the function P : F × Ro → IBeh;

- the way each agent evaluates the performance of an exchange, given by the function Eν : E × A → BV;

- the way each social rule determines the permitted, obligatory and forbidden behaviors of agents in a social exchange, according to the balance of exchange values assigned to the social exchange, regarding the performance of a social function, given by Ru : F × A × E → ℘(IBeh × {p, o, f}).
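As an illustration only, the structure of Def. 1 and the rule function Ru : F × A × E → ℘(IBeh × {p, o, f}) can be sketched in Python. Since the sets are left abstract in the definition, every concrete name below (the string-based types, `toy_rules`, the agents "alpha" and "beta", the behaviors) is a hypothetical placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple

# Placeholder types: the definition leaves these sets abstract.
Agent = str
Function = str
Role = str
Exchange = str
Behavior = str
Balance = int
Deontic = str  # "p" (permitted), "o" (obligatory) or "f" (forbidden)

RuleFn = Callable[[Function, Agent, Exchange], Set[Tuple[Behavior, Deontic]]]


@dataclass
class Organization:
    """O = (A, F, Ro, E, BV, Ru), with Ru given as a function."""
    agents: Set[Agent]
    functions: Set[Function]
    roles: Set[Role]
    exchanges: Set[Exchange]
    balances: Set[Balance]
    rules: RuleFn


def obligations(org: Organization, function: Function,
                agent: Agent, exchange: Exchange) -> Set[Behavior]:
    """Behaviors that the social rules mark as obligatory ('o')."""
    return {b for (b, d) in org.rules(function, agent, exchange) if d == "o"}


def toy_rules(function: Function, agent: Agent,
              exchange: Exchange) -> Set[Tuple[Behavior, Deontic]]:
    # Hypothetical rule: the agent that received help must reciprocate.
    if agent == "beta" and exchange == "help":
        return {("return_help", "o"), ("ignore", "f")}
    return {("idle", "p")}


org = Organization(
    agents={"alpha", "beta"},
    functions={"assist"},
    roles={"helper", "client"},
    exchanges={"help"},
    balances={-1, 0, 1},
    rules=toy_rules,
)
```

Representing Ru as a callable keeps the sketch close to the functional signature given above; a table keyed by (function, agent, exchange) would serve equally well for finite organizations.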

The detailed exploration of the connections among the main and complementary elements of a social organization, as defined above, is outside the scope of this paper. We concentrate on the connections that allow the understanding of our notion of dynamics of exchange values, leading to the concept of exchange value-based social control and its provisional centralized form, the supervisor of social equilibrium.

3.4 THE DYNAMICS OF EXCHANGE VALUES

Although Piaget did not develop his notion of the dynamics of an organization in his sociological works, that notion is well developed both in his works on the Epistemology of Biology [28] and in his psychological works on the equilibration of cognitive structures [29]. We summarize here that Piagetian cybernetics, concentrating just on the concepts that directly apply to our problem.

In any dynamical system where a notion of equilibrium can be defined, two related concepts immediately apply, namely, the concepts of deviation and compensation. Deviation is any action that may happen in a system and lead it to disequilibrium, that is, away from equilibrium. Compensation is any action that may happen in a system, when it is in disequilibrium, and lead it back to equilibrium.

Regulation is the process of determining which compensation should be performed, at a given moment, to compensate a deviation, when the system is in disequilibrium. In Homans' terms [19], Piaget's process of regulation is a social control process that aims to keep the society stable in a state of equilibrium.

Piaget applies such ideas both to the synchronic regulation of a system's functioning (where the structure of the system is kept essentially unchanged) and to the diachronic regulation of the development of the system's structural organization, when the system is in a process of development. To the first, he assigned the name minor equilibration, while the latter he called major equilibration. In our case, we focus on the synchronic functioning of a social system, so that only minor equilibrations will be considered.

We consider social systems where the interactions are seen as exchanges of services, and whose equilibrium is defined on the basis of balances of values associated with such exchanges. In such systems, deviations are actions (performances of services) whose evaluations lead the system to a state where the balances of exchange values are such that at least one of the equilibrium rules Iαβ, IIαβ and IαβIIαβ is not satisfied. A compensation is a performance of a service that may lead the system back to a state where the balances of exchange values are such that those rules are satisfied.

Social rules specify a mechanism of social control by stipulating, for each state of disequilibrium, the kind of action that should be performed in order to re-establish the equilibrium of the system. Two kinds of such actions are possible: punishment and reciprocation. Punishment is an action by which an individual suffers some loss, so that the individual is discouraged from repeating the deviating action.

Reciprocation is an action by which an individual is forced to perform a service for the other, in order to compensate him for some service the latter had previously performed for him. Reciprocation is the fundamental operation of compensation, in Piaget's model of regulation of social exchanges.

Of course, everything that has been said here in connection to social control processes that aim to stabilize the society or organization in a state of equilibrium also applies to social control processes that aim to keep the society or organization stable in a state of disequilibrium (see [12] for a discussion of the role of social control processes in artificial societies).

3.5 SUPERVISORS OF SOCIAL EQUILIBRIUM

In general, such an exchange values-based mechanism of social control may be put to operate in two main ways. On the one hand, social rules may be enforced by authorities, which have the capacity to push the agents of the society to follow such rules.

On the other hand, social rules may be internalized by the agents, so that agents follow such rules because they are incorporated into the agents' behaviors.

Typical of the social rules enforced by authorities are the juridical rules of a society (laws, statutes, organizational norms), dealing with rights and duties. Typical of the internalized rules are the moral rules, dealing with permissions and obligations.

In the following, as a preparatory step to a future study of decentralized social control mechanisms based on social rules internalized in agents, we introduce a centralized version of such a mechanism. We consider the notion of supervisor of social equilibrium, a component of the society (possibly an agent) that is able to determine, at each time, the set of compensation actions that may be performed in order to bring the social system back to equilibrium (or disequilibrium) regarding the balances of exchange values. The supervisor may recommend some of those actions to the agents of the system in order to reach that equilibrium (or disequilibrium).

Obviously, supervisors of social equilibrium do not implement moral rules, because moral rules, in the sense defined above, can only be implemented inside the agents themselves, not in a component of the society that is external to them.

So, supervisors of social equilibrium implement juridical rules (laws, norms). However, they do so in a way that they are not law enforcers, because we don't require that agents follow the recommendations: agents are allowed to autonomously decide if they are to follow, or not, any given recommendation.

4 A MODEL SUPERVISOR OF SOCIAL EQUILIBRIUM

In this section, we introduce a formal model for a supervisor of social equilibrium, which is able to implement a regulation process for the equilibration (or, disequilibration) of exchange values.

The regulation process is embedded in the equilibrium supervisor in the form of a recommendation policy, which determines for each kind of exchange values state an appropriate compensation action.

However, we don't consider here the connection between the recommendation policy that implements the regulation process, and the expression of such regulation process in the form of juridical rules. That is, everything works in the model as if the juridical rules were pre-compiled into its recommendation policy, by way of the reward function of the Markov Decision Process solved by the equilibrium supervisor.

Further work is necessary on this compilation process both to understand the connection between the language of juridical laws and its translation in terms of exchange values, and to allow for a dynamic compilation process, capable of supporting dynamic changes in juridical laws, in order to model organizations supporting evolutive juridical systems.

4.1 USING INTERVAL MATHEMATICS FOR REPRESENTING SOCIAL VALUES

Interval Mathematics is a mathematical theory introduced in the 1960s by Moore [26] that aims at the automatic and rigorous control of the errors that arise in numerical computations. Any real number x ∈ R is represented by a real interval X = [x1, x2], with x1, x2 ∈ R, such that x1 ≤ x ≤ x2. x1 and x2 denote, respectively, the left and right endpoints of a real interval X. The set of real intervals is denoted by IR.

The arithmetical operations *IR ∈ {+, −, ×, ÷} are defined on IR as X *IR Y = {x * y | x ∈ X, y ∈ Y} (whenever it can be understood from the context, we shall not use a distinct notation for interval operations), and they can be explicitly calculated by [27]:

Addition: X + Y = [x1 + y1, x2 + y2]

Subtraction: X − Y = [x1 − y2, x2 − y1]

Product: X × Y = [minρ, maxρ],

with ρ = {x1y1, x1y2, x2y1, x2y2}

Quotient: X ÷ Y = [minσ, maxσ],

with σ = {x1/y1, x1/y2, x2/y1, x2/y2} and 0 ∉ Y.

The interval arithmetic operations satisfy the monotonic inclusion property, for all X', X'', Y', Y'' ∈ IR [1, 27]: X' ⊆ X'', Y' ⊆ Y'' ⇒ X' * Y' ⊆ X'' * Y''. This property plays an important role in the usage of Interval Mathematics to compute with real numbers that are uncertain for some reason (e.g., if they are obtained by a measuring instrument with limited resolution) or that are not exactly representable in a floating point system (e.g., the numbers π and 0.1).
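The four operations and the inclusion property can be illustrated with a minimal Python sketch. The class name `Interval` and the method `subset_of` are hypothetical names chosen for this illustration, and the sketch uses ordinary floating point arithmetic without outward rounding, so it only approximates rigorous interval computation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float  # left endpoint x1
    hi: float  # right endpoint x2

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("left endpoint must not exceed right endpoint")

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        # The smallest difference pairs x1 with y2, the largest pairs x2 with y1.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(min(p), max(p))

    def __truediv__(self, other):
        if other.lo <= 0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains 0")
        q = [self.lo / other.lo, self.lo / other.hi,
             self.hi / other.lo, self.hi / other.hi]
        return Interval(min(q), max(q))

    def subset_of(self, other):
        return other.lo <= self.lo and self.hi <= other.hi


X, Y = Interval(1, 2), Interval(3, 5)
X1, Y1 = Interval(1.25, 1.5), Interval(4, 4.5)  # X1 inside X, Y1 inside Y
inclusion_holds = (X1 * Y1).subset_of(X * Y) and (X1 / Y1).subset_of(X / Y)
```

Narrowing the operands can only narrow the result, which is why an enclosure of uncertain inputs yields an enclosure of the exact output.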

A machine interval has floating point numbers as endpoints, and outward rounding is used to guarantee that the output interval of any interval computation process contains the actual result [17]. Besides that, the diameter of the output interval indicates the maximum error that may have occurred in the whole process.
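As a rough illustration of outward rounding, the sketch below widens each computed endpoint by one floating point step using `math.nextafter` (available from Python 3.9). This is a conservative substitute for the directed rounding modes that interval libraries normally use, and the helper names `enclose` and `add_outward` are assumptions of this example.

```python
import math
from fractions import Fraction


def enclose(x):
    """Machine interval around the float x, one float step wide on each side."""
    return (math.nextafter(x, -math.inf), math.nextafter(x, math.inf))


def add_outward(a, b):
    """Interval addition with both endpoints pushed outward by one float step."""
    return (math.nextafter(a[0] + b[0], -math.inf),
            math.nextafter(a[1] + b[1], math.inf))


# Neither 0.1 nor 0.2 is exactly representable, so each is enclosed first.
total = add_outward(enclose(0.1), enclose(0.2))

# Exact rational check that the real number 3/10 lies inside the result.
contains_three_tenths = Fraction(total[0]) <= Fraction(3, 10) <= Fraction(total[1])
```

The width `total[1] - total[0]` then bounds the error accumulated by the computation.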

Interval Mathematics has also been applied to represent kinds of uncertainty other than numerical uncertainty [16], with applications in Artificial Intelligence, Soft Computing etc.

In this paper, intervals are used to capture the qualitative nature of Piaget's concept of scale of exchange values [31]. The chosen representation is a compromise between a purely qualitative and a purely quantitative representation. It makes the representation mathematically operational, and the decision process computationally viable, without being unfaithful to Piaget's approach.

Let IRL = {[x1, x2] | −L ≤ x1 ≤ x2 ≤ L} be the set of real intervals bounded by L ∈ R (L > 0) and let IRL = (IRL, +, Θ, ~, ≈) be a scale of interval exchange values, where:

- +: IRL × IRL → IRL is the L-bounded addition operation defined by

X + Y = [max{x1 + y1, -L}, min{x2 + y2, L}].

- A null value is any X ∈ IRL such that mid(X) = 0, where mid(X) = (x1 + x2)/2 is the midpoint of X. The set of null values is denoted by Θ. 0 = [0, 0] is the absolute null value.

- A quasi-symmetric value of X ∈ IRL is any interval X' ∈ IRL such that X + X' ∈ Θ. The set of quasi-symmetric values of X is denoted by X̃.

- ≈ is the qualitative equivalence relation defined by X ≈ Y ⇔ ∃Y' ∈ Ỹ : X + Y' ∈ Θ.

μ ∈ X̃ is said to be the least quasi-symmetric value of X if, for every S ∈ X̃, it holds that d(μ) ≤ d(S), where d(X) = x2 − x1 is the diameter of X. For all X ∈ IRL, it follows that:

Proposition 1 (i) X̃ = {−[mid(X) − k, mid(X) + k] | k ∈ R ∧ k ≥ 0}; (ii) μ = −[mid(X), mid(X)].

Proof. It holds that mid(X + [−(mid(X) + k), −(mid(X) − k)]) = ((x1 − mid(X) − k) + (x2 − mid(X) + k))/2 = (x1 + x2 − 2 mid(X))/2 = 0. Consider S ∈ X̃ such that mid(S) ≠ −mid(X). Then S = [−(mid(X) + k1), −(mid(X) − k2)] for some k1 ≠ k2 ∈ R, and it follows that mid(X + S) = (x1 + x2 − 2 mid(X) + k2 − k1)/2 = (k2 − k1)/2 ≠ 0, which is a contradiction. It follows that any quasi-symmetric value of X is of the form [−(mid(X) + k), −(mid(X) − k)] and, when k = 0, μ = [−mid(X), −mid(X)] is obtained. □

In practical applications, due to the rounding errors that arise in any numerical computation, it is usually not possible to verify whether or not a computed interval is precisely the absolute null value 0 [1, 17, 18, 27]. We therefore introduce the concept of absolute є-null value 0є = [−є, +є], with є ∈ R (є > 0) being a given tolerance. In this case, an є-null value is any X ∈ IRL such that mid(X) ∈ 0є. The set of є-null values is denoted by Θє. The related set of є-quasi-symmetric values for an interval value X ∈ IRL is denoted by X̃є, and, from Prop. 1, it follows that the least є-quasi-symmetric value is given by expression (1).

The qualitative equivalence relation (modulo є) is then defined by X ≈є Y ⇔ ∃Y' ∈ Ỹє : X + Y' ∈ Θє.
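The scale of interval exchange values can be sketched in a few Python functions. Intervals are plain `(x1, x2)` tuples and all function names are hypothetical. The equivalence test covers only the exact relation ≈ and uses the least quasi-symmetric value as its witness, which is sufficient as long as the L-bounded addition does not clip an endpoint.

```python
def mid(x):
    """Midpoint of the interval x = (x1, x2)."""
    return (x[0] + x[1]) / 2


def diam(x):
    """Diameter d(X) = x2 - x1."""
    return x[1] - x[0]


def bounded_add(x, y, L):
    """L-bounded addition: endpoints are clipped at -L and L as in the text."""
    return (max(x[0] + y[0], -L), min(x[1] + y[1], L))


def is_null(x, eps=0.0):
    """Null value test; eps > 0 gives the tolerance-based variant."""
    return abs(mid(x)) <= eps


def quasi_symmetric(x, k=0.0):
    """Quasi-symmetric value of x with half-width k; k = 0 gives the least one."""
    m = mid(x)
    return (-(m + k), -(m - k))


def equivalent(x, y, L):
    """X and Y are equivalent when X plus a quasi-symmetric value of Y is null."""
    return is_null(bounded_add(x, quasi_symmetric(y), L))


X = (2.0, 6.0)
mu = quasi_symmetric(X)            # least quasi-symmetric value of X
wide = quasi_symmetric(X, k=1.0)   # another quasi-symmetric value of X
```

Here `mu` is the degenerate interval at −mid(X), so its diameter is zero, while `wide` cancels the midpoint of `X` equally well but carries more uncertainty.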

4.2 THE MODELLING OF SOCIAL EXCHANGES

Let T be a set of discrete instants of time. Let α and β be any two agents. A qualitative interval exchange-value system for modelling the exchanges from α to β is a structure IRαβ = (IRL; rIαβ, rIIβα, sIβα, sIIαβ, tIβα, tIIβα, νIαβ, νIIαβ), whose components other than IRL, given in (2) and (3), are partial functions on T, called exchange-value functions. They evaluate, at each time instant t ∈ T, the investment, satisfaction, debt and credit values, respectively, involved in the exchange (the values are undefined if no service is done at all at a given moment t ∈ T). The symbol ⊥ denotes an undefined exchange value. In the following, abbreviated forms of rIαβ(t), rIIβα(t), sIIαβ(t), sIβα(t), tIβα(t), tIIβα(t), νIαβ(t) and νIIαβ(t) are used.

For the exchange-value functions given in (2) and (3), at a given time instant t, the following constraints must be satisfied for every pair of agents α and β:

where:

- rIαβ(t) = ⊥ denotes that the agent α did not perform a service for the agent β at time t, and, therefore, all the other corresponding exchange values in stage I are undefined;

- νIIαβ(t) = ⊥ denotes that the agent α, at time t, did not charge the credit for a service previously done for the agent β, and, therefore, all the other corresponding exchange values in stage II are undefined.

The implication (6) means that, according to the structure of social exchanges (Fig. 1), it is not possible for an agent α to perform a service for β and, at the same time t, to charge him a credit. From (6) it follows that it is also required that νIIαβ(t) ≠ ⊥ ⇒ rIαβ(t) = ⊥.

A configuration of exchange values for any pair of agents α and β at a time instant t is specified by one of the tuples of well defined exchange values:

A social exchange process between two agents α and β of a multi-agent system, occurring during the time instants T = t1, ..., tn, is any finite sequence of such configurations of exchange values et1, ..., etn. Each element of this sequence is called a stage of the exchange process, which may be a stage of type I or type II.
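The rule that a service and a credit charge cannot share a time instant can be sketched as a small data structure. The class `Configuration`, its field names and the use of `None` for the undefined value ⊥ are assumptions of this illustration, and only the two values that determine the stage type are modelled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Interval = Tuple[float, float]


@dataclass(frozen=True)
class Configuration:
    """Exchange values of a pair (alpha, beta) at one instant; None stands for ⊥."""
    r_I: Optional[Interval] = None   # alpha's investment in a stage of type I
    v_II: Optional[Interval] = None  # credit charged by alpha in a stage of type II

    def stage_type(self) -> Optional[str]:
        if self.r_I is not None and self.v_II is not None:
            raise ValueError("a service and a credit charge cannot share a time instant")
        if self.r_I is not None:
            return "I"
        if self.v_II is not None:
            return "II"
        return None  # no exchange happened at this instant


# A three-instant process: a service, an idle instant, then a credit charge.
process = [
    Configuration(r_I=(-5.0, -3.0)),
    Configuration(),
    Configuration(v_II=(2.0, 4.0)),
]
stage_types = [c.stage_type() for c in process]
```

A full model would also carry the satisfaction, debt and remaining credit values of each stage, which become undefined together with the value that triggers the stage.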

The exchange balance of stages of type I of a social exchange process between any pair of agents α and β that has occurred during a time interval T is a tuple

where, for k = r,s, t, v,

The exchange balance of stages of type II is defined analogously. The general exchange balance is then given by

The material results of a social exchange process that happens between agents α and β, during the interval T, according to the points of view of α and β, respectively, are given by the sum of the respective material values involved in the process:

Analogously, the virtual results are given by:

The general results take into account all kinds of exchange values, and are obtained by:

A social exchange process between a pair of agents α and β is said to be in equilibrium (with tolerance є > 0) if

The material equilibrium is achieved when

4.3 MODELLING SOCIAL EXCHANGES INVOLVING MULTIPLE AGENTS

In this section, a matrix-like notation is introduced to make possible the generalization of the results concerning the social exchanges between two agents, presented in Sect. 4.2, for the case of an organization composed by m agents. An m × m interval ∗-matrix [xij] is defined as the interval m × m matrix [xij] where xij = whenever i = j, that is:

A 2 × 2 ∗-matrix is denoted schematically by an ordered pair

For two interval m × m ∗-matrices X = [xij] and Y = [yij], we define:

- The addition operation: X + Y =[xij + yij ].

- Any ★-matrix [nij] such that nij ∈ Θє is an є-null ★-matrix. The set of such ★-matrices is denoted by N.

- The qualitative equivalence relation (modulo є):

In a multi-agent system composed by m agents, the exchange values determined by the functions defined in (2) and (3) can be represented by the eight m × m × #T interval ★-matrices

called investment (K = R), satisfaction (K = S), debt (K = T) and credit (K = V) matrices for the social exchange stages I and II, respectively, that occurred between each two agents α and β in T. (To extend this representation to interactions between two groups A and B of agents, one should take into account the sum total of the interactions of every agent of A with every agent of B.)

For each t ∈ T, the corresponding slices of these matrices are m × m interval ★-matrices. For a time sequence T = t1, ..., tn, the four m × m matrices of global investment, satisfaction, debt and credit in T are given by

for K = R, S, T, V.

The exchange balance of stages of type I, given in (7), is represented by a tuple of ★-matrices:

Analogously, the exchange balance of stages of type II is represented as a tuple . Therefore, the general exchange balance can be represented by the tuple:

where, for

The material, virtual and general results of social exchange processes in a multi-agent system, given in (8-12), are then evaluated by

respectively.

Thus, a multi-agent system is said to be in equilibrium if

It is in material equilibrium when

If a social exchange process between any pair of agents α and β, which occurred in T, is not in material equilibrium according to α's point of view, then the least є-quasi-symmetric value of α's material result, given in (1), is said to be the compensation value from α's point of view. The compensation value for a material result that is in equilibrium is the interval [0, 0]. A compensation ★-matrix for a ★-matrix of material results MT is the ★-matrix M' each of whose entries is the compensation value of the corresponding entry of MT. It follows that

4.4 SOLVING THE EQUILIBRATION PROBLEM USING A QI-MDP

We conceive, in the context of a social exchange process in a multi-agent system, a special agent, called equilibrium supervisor, which analyzes the exchange processes between each pair of agents and suggests exchanges to each pair in order to keep the material results of exchanges in equilibrium. The equilibrium supervisor also takes into account the virtual results of the exchanges in order to decide which type of exchange stage it should suggest to the two agents.

To achieve that purpose, the equilibrium supervisor models the exchanges between each pair of agents as simultaneous Markov Decision Processes (MDP) [37], where the states of the models represent "possible material results of the overall exchanges" and the optimal policies represent "sequences of actions that the equilibrium supervisor recommends that the interacting agents execute".

In the following, the Qualitative Interval Markov Decision Process (QI-MDP) is introduced. An initial formulation of this model, considering a society of just two agents, was presented in [14].

4.4.1 THE BASIS OF A QI-MDP FOR A MULTI-AGENT SYSTEM

Consider an admissible tolerance є > 0, a bound L ∈ R (L > 0) for the set of real intervals and n ∈ N (n > 0). Let Ê = {E−n, ..., En} be the set of 2n + 1 equivalence classes of intervals X ∈ IRL, defined, for i = −n, ..., n, as:

The classes Ei ∈ Ê are the supervisor representations of classes of unfavorable (i < 0), equilibrated (i = 0) and favorable (i > 0) material results of social exchange processes. Whenever it is understood from the context, we shall denote by E− (or E+) any class Ei with i < 0 (or i > 0).

The accuracy of the equilibrium supervisor is denoted by κn. The range of the midpoints of the intervals that belong to a class Ei is called the representative of the class Ei. Whenever it is clear from the context, we shall identify a class Ei with its representative.
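To make the idea of classes of material results concrete, the sketch below maps the midpoint of a material result to a class index in −n, ..., n. The partition used here is an assumption of this example, not the definition given by the paper: midpoints within [−є, є] form the equilibrated class, and the remaining range up to L is split into n bands of equal width (L − є)/n on each side. The function name `class_index` is hypothetical, and n = 4 is an arbitrary choice.

```python
import math


def class_index(m, eps, L, n):
    """Index i in -n..n of the class containing the midpoint m.

    Assumed partition: |m| <= eps is class 0, and the remaining range
    (eps, L] is cut into n equal bands on each side of zero.
    """
    if abs(m) <= eps:
        return 0
    width = (L - eps) / n                           # assumed band width
    i = min(n, math.ceil((abs(m) - eps) / width))   # band number, capped at n
    return i if m > 0 else -i


# Tolerance and bound borrowed from the simulation section; n is hypothetical.
indices = [class_index(m, eps=25.0, L=105.0, n=4)
           for m in (10.0, 30.0, -50.0, 105.0)]
```

A larger n gives narrower bands, that is, a supervisor that distinguishes material results more finely.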

The states of the QI-MDP model are ★-matrices, where each entry is the class representing the material results of the social exchange process between α and β, from the point of view of the agent α. For the analysis of the equilibrium, we shall consider each pair of correlated classes of material results.

The ★-matrix of equilibrated classes is the terminal state, representing that the system is in equilibrium. However, in some applications, just a subset B of agents may be considered for the analysis of the equilibrium, even when all agents are involved in the exchange processes. In this case, the terminal state is defined with respect to the agents in B only.

The actions considered in the model are state transitions

with , where is an interval ★-matrix operator such that mid . is an interval action that should be one of the following types:

- a compensation interval Ci, which is the least quasi-symmetric value of a class representative Ei;

- a go-forward-k-step interval Fk, which is an interval that transforms a class Ei into Ei+k, with i ≠ n and i + k ≠ 0;

- a go-backward-k-step interval, which is an interval that transforms a class Ei into Ei−k, with i ≠ −n and i − k ≠ 0.

The set of compensation intervals, denoted by C, is shown in Table 1. The set F of go-forward intervals and their respective effects are partially presented in Table 2. The set of go-backward intervals, denoted by B, can be specified analogously.

For example, consider the case of a society with just two agents α and β. For the pair of classes of material results (using the notation given in (13))

it follows that the compensation-compensation action and the go-backward-3-go-forward+2 action are specified by

respectively, resulting in the following state transitions, with -n < i < -1 and 1 < j < n:

The equilibrium supervisor has to find, for each state [Ei]★, the action that shall achieve the terminal state or, at least, another state from which the terminal state can be achieved, with the least number of steps and the least final value uncertainty (by value uncertainty of a state [Ei]★ we mean the diameters of the intervals Ei). We observe that the choice of such actions is also regulated by the rules of the social exchanges, and, therefore, there are some state transitions that are not allowed.

Based on an optimal policy, the equilibrium supervisor may be asked to recommend that the agents act optimally. An optimal exchange recommendation consists of a function that gives, for each actual material result (represented by a state of the model), a partially defined exchange stage that shall restore or establish the material equilibrium or, at least, create the conditions for it to be achieved in the least number of steps with the least value uncertainty. This partial definition shall be completed by the analysis of the virtual results, which allows the specification of which particular types of exchange stages (I or II) should be considered.

Although the interacting agents acknowledge the optimal recommendations from the equilibrium supervisor, they are autonomous in the sense that they may not follow the recommendations exactly. The agents may have different personalities, interests, needs etc., which may lead them not to always follow the recommendations. This means that there is a probability that the system reaches a state different from the one suggested by the supervisor and, therefore, there may be a great deal of uncertainty about the effects of the agents' actions.

Even if the agents follow a recommendation exactly, we will show that the effect may not be the one expected by the supervisor, since it depends on the ratio between the equilibrium supervisor accuracy κn and the admissible tolerance є. On the other hand, in this paper, we assume that there is never any uncertainty about the current state of the system, that is, the equilibrium supervisor always has access to the current configuration of exchange values and has complete and perfect abilities to evaluate the current material result.

Definition 2 A Qualitative Interval Markov Decision Process (QI-MDP), for keeping in equilibrium the social exchanges in a multi-agent system of m agents, is a tuple (E, A, F, R), where:

- E, the set of the states of the model, is the set of m × m ★-matrices of classes of material results, as specified in (14).

- A, the set of the actions of the model, is the set of m × m ★-matrices of compensation, go-forward and go-backward intervals.

- F : E × A → Π(E) is the state-transition function, which gives, for each state and each action, a probability distribution over the set of states;

- R : E × A → R is the reward function, giving the expected immediate reward gained by choosing a given action in the current state.

In this model, the next state and the expected reward depend only on the previous state and the action taken, satisfying the so-called Markov property.

4.4.2 THE OPTIMAL POLICY AND THE REWARD FUNCTION

The reward function plays an important role when the equilibrium supervisor is choosing the action that will generate a recommendation of agents interaction, in each state. The supervisor aims to maximize the utility of sequences of actions, evaluated according to the reward function, thus trying to bring the system back to equilibrium (or keep it there) with the least possible number of state transitions.
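The search for a policy that reaches equilibrium in as few transitions as possible can be illustrated with textbook value iteration on a toy model. This is a generic finite MDP solver, not the QI-MDP itself: the states are class indices of a single agent, the two actions shift the class by one step, the probability 0.8 that a recommendation takes effect and the reward of −1 per step are all assumptions of this sketch.

```python
def value_iteration(states, actions, transition, reward, terminal,
                    gamma=0.95, tol=1e-9):
    """Return the state values and a greedy policy of a finite MDP."""
    def q(s, a, V):
        return reward(s, a) + gamma * sum(
            p * V[s2] for s2, p in transition(s, a).items())

    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in terminal:
                continue
            best = max(q(s, a, V) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: q(s, a, V))
              for s in states if s not in terminal}
    return V, policy


N = 2                                   # classes E-2 .. E2 of one agent
STATES = list(range(-N, N + 1))
ACTIONS = ["forward", "backward"]
STEP = {"forward": 1, "backward": -1}


def transition(s, a, p_follow=0.8):
    """The recommended move happens with probability p_follow, else nothing changes."""
    target = max(-N, min(N, s + STEP[a]))
    if target == s:
        return {s: 1.0}
    return {target: p_follow, s: 1.0 - p_follow}


def reward(s, a):
    return -1.0                         # every extra transition is penalized


V, policy = value_iteration(STATES, ACTIONS, transition, reward, terminal={0})
```

With a uniform penalty per transition, maximizing utility is the same as minimizing the expected number of steps to the equilibrated class, so the resulting policy moves unfavorable classes forward and favorable classes backward.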

A sample reward function R : E × A → R that conforms to the idea of supporting a recommendation function that is able to direct pairs of agents into social equilibrium is partially sketched in Table 3, using the notation given in (13). This particular sample function illustrates various requirements that should be satisfied by any reward function of the model.

Observe, for instance, that if the current state is of the type (E−, E+), then the best action to be chosen is the compensation-compensation action (C, C), which results in a state transition (E−, E+) → (E0, E0). Any other choice would make the agents either take a long way to the equilibrium or move away from it.

On the other hand, if the current state is of type (E−, E−), then a compensation-compensation action (C, C) would generate a recommendation of agent exchanges of satisfaction-satisfaction type, which is impossible according to the model of social interactions [33], since it is impossible for an agent to get a satisfaction value from no service at all. The reward function R should state that (C, C) is a very bad action to be chosen in such a situation.

Any optimal policy π* : E → A solving the social equilibrium problem should satisfy the set of requirements expressed by the schema partially sketched in Table 4 (for a pair of agents). Notice that it is a non-deterministic policy.

The optimal value recommendation associated to an optimal policy π* is a ★-matrix operator ρπ* that gives, for each state and optimal action, recommendations of exchange stages, possibly partially undefined. These recommendations are ★-matrices whose elements in symmetric positions pair a service performance with the corresponding satisfaction, where (rδλ, W) means the performance, by the agent λ, of a service with investment value W < 0, and (sδλ, W') means δ's satisfaction with interval value W'. The optimal value recommendation ρπ*, corresponding to the optimal policy shown in Table 4, is partially sketched in Table 5.

Finally, the equilibrium supervisor has to decide which types of exchange stages (I or II) should be recommended. This is done by the analysis of the virtual results from the points of view of each pair of agents α (given in (10)) and β (given in (11)):

- If vαβ > 0, then α is able to charge β the credit for services previously done. Thus, an exchange stage of type IIαβ should be recommended.

- If vβα > 0, then the agent β can charge α the credit for services previously done, indicating that an exchange stage of type IIβα should be recommended.

- If vαβ < 0, then the agent α does not have any credit to charge β. Therefore, the service done by β must be spontaneous. In this case, an exchange stage of type Iβα should be recommended.

- If vβα < 0, then the agent β does not have any credit to charge α, so that an exchange stage of type Iαβ should be chosen.

Table 6 shows the criteria used by the equilibrium supervisor to reason about the possible stage recommendations, based on virtual results, according to the discussion presented in the paragraph above. Observe that these alternatives are not mutually exclusive. The final decision about which type of exchange stage shall be executed is left to the agents.
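The criteria on virtual results can be read as a small decision function: a positive virtual result lets an agent charge a credit (a stage of type II), while a negative one means that the service performed for that agent has to be spontaneous (a stage of type I started by the partner). The function name, the string labels and the use of the midpoint as the sign of an interval virtual result are assumptions of this sketch.

```python
def candidate_stages(v_ab_mid, v_ba_mid):
    """Stage types the supervisor may recommend for the pair (alpha, beta).

    v_ab_mid and v_ba_mid are the midpoints of the virtual results from
    alpha's and beta's points of view (midpoint comparison is an assumption).
    """
    stages = set()
    if v_ab_mid > 0:
        stages.add("II_ab")   # alpha charges beta a credit
    if v_ba_mid > 0:
        stages.add("II_ba")   # beta charges alpha a credit
    if v_ab_mid < 0:
        stages.add("I_ba")    # alpha has no credit: beta's service is spontaneous
    if v_ba_mid < 0:
        stages.add("I_ab")    # beta has no credit: alpha's service is spontaneous
    return stages


both_options = candidate_stages(10.0, -5.0)
```

Because the result is a set, more than one stage type can be offered at once, and the final choice stays with the agents.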

The stage effects of the recommendations (Table 6) are sketched in the simplified state transition diagram shown in Fig. 2. The dotted lines represent alternative paths to the equilibrium state that were not considered as optimal recommendations, since they are impossible according to the structure of the exchange stages. The symbol ★ of the ★-matrices was omitted.


5 ANALYSIS OF THE MODEL

For the analysis of the model, we consider a multi-agent organization whose agents are classified by the degree in which they follow the recommendations given by the equilibrium supervisor. The obedient agent always follows the recommendations; the disobedient agent may not follow the recommendations.

5.1 THEORETICAL RESULTS

The analysis concerns the reachability of the terminal state and the complexity of the regulation process in terms of number of steps that are necessary to achieve the equilibrium.

If no agent follows the recommendations of the equilibrium supervisor, the latter is unable to regulate the system. Its effectiveness increases with the number of agents that follow its recommendations.

Thus, for the theoretical analysis, we shall consider only the case in which the agents always follow the recommendations given by the equilibrium supervisor. We show that, even in this favorable case, the decision process is a non-trivial one, due to the qualitative nature of exchange values and to the restrictions imposed by the definition of exchange, which always requires a service (with a definite cost) to be done in any exchange stage. However, we show that, under some conditions, it is always possible to have the system equilibrated in at most four steps.

Consider a multi-agent system composed of just two obedient agents α and β, and use the notation given in (13). Consider the material results, at step t, of a social exchange process performed by the agents α and β. For a given tolerance є and equilibrium supervisor accuracy κn, the following results hold:

Proposition 2 If and , then the system achieves the equilibrium in one step if and only if .

Proof. Considering that the system is at the state (E-1,E ), then, for β's material result, it holds that , and the optimal recommendation (Table 5, R11) is based on the optimal action . It follows that:

where , with h > 1. If the system achieves the equilibrium in the step 1, then it holds that . It follows that , and therefore, , since . The proof for α's material result is analogous. The proof of (⇐) is almost straightforward. □

Proposition 3, with 1 < i < n, then it is possible to getin at most t = 2 steps if and only if ; (ii) If ,with -n < i < -1, then it is possible to get in at most t = 2 steps if and only if .

Proof. (i)(⇒) Considering that and that the optimal recommendation (Table 5, R3), thinking about each agent individually, is based on the optimal action , it follows that:

Therefore, it holds that G EC. From Prop. 2, it follows that with one more step we can get the desired result. The proofs of (i)(⇐) and (ii) are analogous. □

The following result is almost immediate:

Proposition 4 If is such that then .

From Prop. 3, it follows that an individual transition from a material result that belongs to a class Ei, with 1 < i < n or −n < i < −1, to the equilibrium can be done in at most two steps (Ei → E1 (or E−1) → E0). However, in any interaction between two agents, combined transitions departing from a state (Ei, Ej) or (Ej, Ei), with 1 < i < n and −n < j < −1, may result in a state different from (E1, E−1), (E−1, E1) or (E0, E0). We may have, for example, (E−1, E0), and, in this case, it will not be possible to reach the equilibrium in one more step, since any compensation or go-forward action for α is not allowed without a corresponding service by β. The solution is then to have a transition to (E1, E−1) and then, finally, to reach (E0, E0). Thus, the overall process takes three steps.

The worst case is when the interaction presents material results that belong to the state (Ei, Ej), with -n < i, j < -1, since two simultaneous positive compensation actions (that would require a recommendation of satisfaction values for the two agents without any service at all) are not allowed. In this case, the optimal recommendation (Table 5) leads the agents to get the material equilibrium in at most four steps, by one of the following transitions:

5.2 SIMULATION RESULTS

Simulations of supervised social exchange processes were developed in the Python programming language, generating two types of reports: tables with the configurations of exchange values and material results at each time t ∈ T = 0, ..., 1000, and graphs showing the trajectory of the midpoints of the material results of the exchanges between two agents of a (simulated) system. The material and virtual values that agents could use at each exchange stage were set to vary in the range −100 ... +100. A tolerance of [−є, є] = [−25, 25] was adopted for the definition of the equilibrium point.

First, we considered simulations of social exchange processes in organizations where the supervisor was inactive (Fig. 3, with exchange values bounded by L = 2400). In those simulations, the social exchanges were totally random along the whole time interval T. Then we activated the supervisor and let it make recommendations, which the agents followed or not, according to their degree of obedience to the supervisor.


In successive experiments, we increased the percentage of agent obedience to the supervisor, generating five different simulations: obedience during 1% of the time (Fig. 4, exchange values bounded by L = 1400), obedience during 25% of the time (Fig. 5, exchange values bounded by L = 700), obedience during 50% of the time (Fig. 6, exchange values bounded by L = 350), obedience during 75% of the time (Fig. 7, exchange values bounded by L = 185) and obedience during 100% of the time (Fig. 8, exchange values bounded by L = 105).
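The flavour of these experiments can be reproduced with a very reduced sketch, which is not the simulator used for the figures. It tracks only the midpoint of one agent's material result: at each step the agent either follows a recommended compensation (with probability `obedience`) or performs a random exchange. The function name, the seeding scheme and the single-agent simplification are assumptions of this example, while the value range of ±100, the tolerance of 25 and the 1000 time steps follow the setup described above.

```python
import random


def simulate(obedience, steps=1000, L=2400.0, eps=25.0, vmax=100.0, seed=0):
    """Fraction of time steps in which the material result is in equilibrium."""
    rng = random.Random(seed)
    m = 0.0                        # midpoint of the material result
    in_equilibrium = 0
    for _ in range(steps):
        if rng.random() < obedience:
            # Recommended compensation, limited to the allowed value range.
            delta = max(-vmax, min(vmax, -m))
        else:
            # Unsupervised behavior: a random exchange value.
            delta = rng.uniform(-vmax, vmax)
        m = max(-L, min(L, m + delta))
        in_equilibrium += abs(m) <= eps
    return in_equilibrium / steps


unsupervised = simulate(0.0)
fully_obedient = simulate(1.0)
```

Intermediate obedience levels such as `simulate(0.25)` or `simulate(0.75)` can be run the same way to compare how often the tolerance band is respected.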






In the unsupervised exchange processes shown in Fig. 3, agents α and β presented behaviors such that β profited from the interaction much more than α, which was forced to keep its material results at a negative level for most of the experiment. Considering the given tolerance, one finds that the system started in equilibrium, but was unable to keep it.

Figure 4 shows a 1% supervised exchange process. The figure shows that even this level of supervision is enough to make the two agents alternate their kinds of behaviors during the exchanges, thus preventing one of them from profiting from the interaction at the expense of the other. Also, the system was able to achieve the equilibrium on several occasions (e.g., at t = 196 and at t = 893), but remained in disequilibrium almost all of the time.

Figures 5, 6, 7 and 8 show the increasing effect of the supervisor's recommendations, according to the increasing level of the agents' obedience to it. In particular, one can see that the range of deviations of the material results from the equilibrium was progressively reduced as the agents progressively adhered to the supervisor's recommendations.

The simple simulations that we produced seem to confirm well the theoretical predictions that we could derive from the supervisor model in section 5.1.

6 CONCLUSION

This paper introduced the QI-MDP version of the Markov Decision Process. The combination of interval-based modelling and qualitative approach to the comparison of values of the model made it well suited for solving the problem of keeping social exchanges in equilibrium.

From the point of view of Piaget's theory of social interactions, it provides a sound way of making practical use of the INRC group of social exchanges that structures the social interactions and defines their equilibrium problem [9, 10].

The QI-MDP model is general enough to be applied to other problems besides that of keeping social interactions in equilibrium. It can be adapted to model situations in which the social interactions should be kept stable, but in disequilibrium. This can be done by choosing a non-null terminal state for the supervisor.
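The idea of moving the terminal state away from zero can be sketched as follows. This is a minimal illustration, not the QI-MDP defined in the paper: the `Interval` type, the three-way classification, and the convention that any overlap with the target counts as "at target" are assumptions made here for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed numerical interval [lo, hi] representing an exchange value."""
    lo: float
    hi: float


def classify(balance: Interval, target: Interval) -> str:
    """Qualitatively compare an interval-valued balance with a target interval."""
    if balance.hi < target.lo:
        return "below"
    if balance.lo > target.hi:
        return "above"
    # Assumed convention: overlapping intervals are treated as the terminal state.
    return "at_target"


def recommend(balance: Interval, target: Interval) -> str:
    """Hypothetical supervisor recommendation for the current balance."""
    actions = {"below": "increase", "above": "decrease", "at_target": "keep"}
    return actions[classify(balance, target)]


if __name__ == "__main__":
    equilibrium = Interval(-2.0, 2.0)     # null terminal state with tolerance
    stable_offset = Interval(48.0, 52.0)  # non-null terminal state
    current = Interval(-1.0, 1.0)
    print(recommend(current, equilibrium))
    print(recommend(current, stable_offset))
```

With the null target the balance in the example is already terminal, whereas with the non-null target the same balance lies below the terminal state, so the recommendation changes accordingly.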

The model can also be applied to equilibrium problems of other kinds of systems, besides systems of social exchanges, provided that such systems have one single terminal (equilibrated or disequilibrated) state.

Regarding the notion of social control introduced in [23], and explored in [12] in connection with multi-agent systems, the social rules concerned with exchange values introduce the notion of exchange values-based social control, first analyzed in [19]. Thus, the present paper can be seen as a preliminary step in the computational formalization of this notion.

Immediate future work will be concerned with the case of an equilibrium supervisor that is not able to determine the material balance of social exchange processes with complete reliability (i.e., it is not allowed to know all the exchange values of the two agents). In this case, a partially observable Markov decision process (POMDP) shall be considered (see, e.g., [15]), since the equilibrium supervisor shall be able to make external (also probabilistic) observations to help it decide on its recommendations.
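In a POMDP the supervisor would keep a probability distribution (a belief) over the possible states and revise it after each recommendation and observation. The sketch below shows the standard belief update described in [15]; the two states, the action, the observations and all probabilities are hypothetical values chosen only for illustration, not taken from the model of this paper.

```python
def belief_update(belief, action, observation, T, O):
    """Standard POMDP belief update.

    b'(s') is proportional to O[action][s'][observation] * sum_s T[s][action][s'] * b(s).
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Prediction step: probability of reaching s_next from the old belief.
        predicted = sum(T[s][action][s_next] * belief[s] for s in states)
        # Correction step: weight by the likelihood of the observation.
        unnormalized[s_next] = O[action][s_next][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical two-state example: is the exchange process in equilibrium?
T = {
    "eq": {"recommend": {"eq": 0.9, "diseq": 0.1}},
    "diseq": {"recommend": {"eq": 0.6, "diseq": 0.4}},
}
O = {
    "recommend": {
        "eq": {"balanced": 0.8, "unbalanced": 0.2},
        "diseq": {"balanced": 0.3, "unbalanced": 0.7},
    }
}

if __name__ == "__main__":
    prior = {"eq": 0.5, "diseq": 0.5}
    print(belief_update(prior, "recommend", "balanced", T, O))
```

In this example, observing a balanced exchange after a recommendation raises the supervisor's belief that the process is in equilibrium above its prior value.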

Further future work will deal with (i) the internalization, in each agent of the organization, of the decision process model introduced in this paper, so that the mechanism of exchange values-based social control that it supports can be performed in a decentralized way, and (ii) further exploration of the role of exchange values in dependence-based agent interactions.

ACKNOWLEDGEMENTS

This work was partially supported by CNPq and FAPERGS.

The authors thank the anonymous referees for their valuable suggestions, especially the requirement to distinguish between equilibrium and order (stability).

  • [1] G. Alefeld and J. Herzberger. Introduction to Interval Computations. Academic Press, New York, 1983.
  • [2] L. Antunes and H. Coelho. Decisions Based Upon Multiple Values: the BVG Agent Architecture. In Proceedings of 9th Portuguese Conference on Artificial Intelligence, EPIA'99, Evora, 1999. P. Barahona and J.J. Alferes (Eds.), Springer Lecture Notes in Artificial Intelligence, Vol. 1695, pages 297-311, 1999.
  • [3] R.H. Bordini, A.C.R. Costa, J.F. Hübner, and R.M. Viccari. Linguistic Support for Agent Migration. In V. Lesser and L. Gasser (Eds.), Proceedings of the 1st International Conference on Multi-Agent Systems, San Francisco, page 441, 1995.
  • [4] A.C. Rocha Costa, J.M.V. Castilho, and D. Claudio. Functional Processes and Functional Roles in Societies of Computing Agents. In Proceedings of Brazilian Symposium on Artificial Intelligence, SBIA 93, Porto Alegre, pages 267-276, 1993.
  • [5] A.C.R. Costa, J.F. Hübner, and R.H. Bordini. On Entering an Open Society. In Proceedings of Brazilian Symposium on Artificial Intelligence, SBIA 94, Fortaleza, pages 535-546, 1994.
  • [6] A.C.R. Costa, J.M.V. Castilho, and D.M. Claudio. Toward a Constructive Notion Of Functionality. Cybernetics and Systems, 6(4):443-480, 1995.
  • [7] A.C.R. Costa and Y. Demazeau. Toward a Formal Model of Multi-agent Systems with Dynamic Organization. In J.W. Perram and J.-P Müller (Eds.), Proceedings of 2nd. International Conference on Multi-agent Systems, ICMAS'96, Kyoto, page 431, 1996.
  • [8] A.C.R. Costa. The Piagetian Theory of Social Exchanges and its Application to Learning Environment. Informática na Educação: Teoria e Prática, 6(2):77-900, 2003. (in Portuguese)
  • [9] A.C.R. Costa and G.P. Dimuro. About INRC Groups of Social Exchange Stages: a Formal Critical Analysis of Piaget's Social Exchange Model. Technical Report, ESIN/UCPel, Pelotas, 2003. (available at http://gmc.ucpel.tche.br/valores, Mar. 2005, in Portuguese)
  • [10] A.C.R. Costa and G.P. Dimuro. The Case for Using Exchange Values in the Modelling of Collaborative Learning Interactions. In J. Mostow and P. Tedesco (Eds.), Proceedings of the 7th International Conference on Intelligent Tutoring Systems/2nd International Workshop on Designing Computational Models of Collaborative Learning Interaction, Maceió, pages 19-24, 2004.
  • [11] C. Castelfranchi, M. Miceli, and A. Cesta. Dependence Relations among Autonomous Agents. In E. Werner and Y. Demazeau (Eds.), Decentralized A.I.-3, pages 215-227. Elsevier, Amsterdam, 1992.
  • [12] C. Castelfranchi. Engineering Social Order. In A. Omicini, R. Tolksdorf, and F. Zambonelli (Eds.), Engineering Societies in the Agents World, pages 1-18. Springer, Berlin, 2000.
  • [13] Y. Demazeau and A.C.R. Costa. Populations and Organizations in Open Multi-Agent Systems. In Proceedings of 1st National Symposium on Parallel and Distributed AI, Hyderabad, 1996.
  • [14] G.P. Dimuro and A.C.R. Costa. Interval-based Markov Decision Processes for Regulating Interactions Between Two Agents in Multi-Agent Systems. In: Extended Abstracts of the Workshop on State-of-the-Art in Scientific Computing/Mini-Symposium on Interval Methods, pages 51-58, Lyngby, 2004. (the full paper will appear in Springer Lecture Notes in Computer Science)
  • [15] L.P. Kaelbling, M.L. Littman, and A.R. Cassandra. Planning and Acting in Partially Observable Stochastic Domains. Artificial Intelligence, 101(1):99-134, 1998.
  • [16] R.B. Kearfott and V. Kreinovich (Eds). Applications of Interval Computations. Kluwer, Boston, 1996.
  • [17] U. Kulisch and W.L. Miranker (Eds). A New Approach to Scientific Computation. Academic Press, New York, 1983.
  • [18] U. Kulisch. Advanced Arithmetic for the Digital Computer, Design of Arithmetic Units. Electronic Notes in Theoretical Computer Science, 24, 1999.
  • [19] G.C. Homans. Social Behavior as Exchange. American Journal of Sociology, 63(1958):597-606. Reprinted in [21]
  • [20] G.C. Homans. Social Behavior - Its Elementary Forms. Harcourt, Brace & World, New York, 1961.
  • [21] G.C. Homans. Sentiments and Activities. The Free Press of Glencoe, New York, 1962.
  • [22] G.C. Homans and C.P. Curtis, Jr. An Introduction to Pareto - His Sociology. Alfred A. Knopf, New York, 1934.
  • [23] G.C. Homans. The Human Group. Harcourt, Brace & World, New York, 1950.
  • [24] W.S. Lovejoy. A Survey of Algorithmic Methods for Partially Observable Markov Decision Processes. Annals of Operations Research, 28(1):47-65, 1991.
  • [25] M. Miceli and C. Castelfranchi. The role of evaluation in cognition and social interaction. In K. Dautenhahn (Ed.), Human cognition and agent technology, pages 225-262. John Benjamins, Amsterdam, 2000.
  • [26] R.E. Moore. Interval Analysis. Prentice-Hall, Englewood Cliffs, 1966.
  • [27] R.E. Moore. Methods and Applications of Interval Analysis. SIAM, Philadelphia, 1979.
  • [28] J. Piaget. Biology and knowledge : an essay on the relations between organic regulations and cognitive processes. The University of Chicago Press, Chicago, 1971.
  • [29] J. Piaget. The development of thought: equilibration of cognitive structures. The Viking Press, New York, 1977.
  • [30] J. Piaget. L'Explication en Sociologie. In J. Piaget (Ed.), Introduction à l'Épistémologie Génétique - Tome III: La Pensée Biologique, La Pensée Psychologique et la Pensée Sociologique. Presses Universitaires de France, Paris, 1950. Reprinted in [33]
  • [31] J. Piaget. Essai sur la Théorie des Valeurs Qualitatives en Sociologie Statique ("synchronique"). In [32]
  • [32] J. Piaget. Études Sociologiques. Librairie Droz, Paris, 1965. English translation in [33]
  • [33] J. Piaget. Sociological Studies. Routledge, London, 1995.
  • [34] M.R. Rodrigues, A.C.R. Costa, and R. Bordini. A System of Exchange Values to Support Social Interactions in Artificial Societies. In Proceedings of the Second International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2003, Melbourne, pages 81-88, 2003.
  • [35] M.R. Rodrigues and A.C.R. Costa. Using Qualitative Exchange Values to Improve the Modelling of Social Interactions. In D. Hales, B. Edmonds, E. Norling, and J. Rouchier (Eds.), Proceedings of 4th Workshop on Agent Based Simulations, Melbourne, 2003. Springer Lecture Notes in Computer Science Series, Vol. 2927, pages 57-72, 2003.
  • [36] W.E. Walsh and M.P. Wellman. A Market Protocol for Distributed Task Allocation. In Proceedings of the Third International Conference on Multiagent Systems, Paris, pages 325-332, 1998.
  • [37] D.J. White. Markov Decision Processes. Wiley, New York, 2002.
  • 1
    Values have been extensively used in the MAS area, through value-based and market-oriented decision, and value-based social theory; see, e.g., [2, 36, 25]. The merits of and interest in using Piaget's notion of exchange values are discussed in [34, 35], which show its complementarity to the dependence theory approach.
  • 2
    A scale of exchange values is a set of qualitative, ordinal values, which can be compared for their magnitudes (less than, equal, greater than), but cannot be algebraically operated in an unrestricted way, as fully quantitative values can. E.g., the values can be added or subtracted: if a < b and a < c then a < b + c. However, the differences between values cannot be compared: if a < b and c < d it is not possible to decide which of b - a < d - c or b - a = d - c, or else b - a > d - c, is true.
  • 3
    But this mutual dependence was not fully exploited by Piaget, who concentrated on stable ("static") societies.
  • 4
    The term virtual value refers precisely to the fact that they represent services that are yet to be performed, in the future.
  • 5
    Homans and Piaget were both sympathetic readers of Pareto, see [22] and [30].
  • 6
    We follow [11] in considering the dependence relations as one of the main reasons for the establishment of social exchanges.
  • 7
    We leave the details of the sets in Def. 1 undetermined, in order to have an abstract notion of social organization, which may be instantiated in various ways, in various applications.
  • 8
    See [12] for a discussion of the role of social control processes in artificial societies.
  • 9
    Whenever it can be understood from the context, we shall not use the notation
    to distinguish interval operations.
  • 10
    The values are undefined if no service is done at all at a given moment t ∈ T.
  • 11
    To extend this representation to interactions between two groups A and B of agents one should take into account the sum total of the interactions of every agent of A with every agent of B.
  • 12
    By value uncertainty of a state [Ez] we mean the diameters of the intervals Ez.
  • Publication Dates

    • Publication in this collection
      01 Dec 2010
    • Date of issue
      July 2005
    Sociedade Brasileira de Computação - UFRGS, Av. Bento Gonçalves 9500, B. Agronomia, Caixa Postal 15064, 91501-970 Porto Alegre, RS - Brazil, Tel. / Fax: (55 51) 316.6835 - Campinas - SP - Brazil
    E-mail: jbcs@icmc.sc.usp.br