1. INTRODUCTION

The issue of time inconsistency has been extensively studied in the monetary policy
literature following ^{Kydland and Prescott (1977)}
and ^{Barro and Gordon (1983)}. A central result
emerging from these studies is that monetary policy will be more effective if it is both
transparent and credible. Meanwhile, a credible monetary policy is believed to feature
pursuit of an explicit target, and authors such as ^{Svensson (1999)} and ^{Ball, Mankiw, and Reis
(2005)} claim that the optimal monetary policy entails targeting the price
level. Moreover, there is robust historical and empirical evidence for several countries
that the credibility of the monetary authority changes over time frequently and
significantly (^{Bordo & Siklos, 2014}, ^{2015}). In this context, this paper derives a
best-reply monetary policy when price-setters’ confidence in the commitment of the
monetary authority to price level targeting is sticky and, therefore, may be incomplete.
Because the optimal price of an individual firm depends positively upon the other firms’
prices, there is strategic complementarity in price setting. While the
corresponding foresight problem can be solved with certainty by paying an optimization cost, a
costless but unsure rule-of-thumb strategy is to confidently use the price level target
periodically announced by the monetary authority in seeking to set the optimal
individual price.

In an innovative approach, the frequency distribution of foresight strategies across firms (or the state of confidence in the officially announced price level target) follows evolutionary dynamics, with finite-horizon firms periodically revising their foresight strategies in response to the payoffs they earn. Meanwhile, the infinite-horizon monetary authority uses best-reply monetary policy to manage the confidence held by price setters. As it turns out, when such an evolutionary game of conquering confidence is subject to exogenous perturbations analogous to mutations in a biological system, confidence is not fully conquered but the price level target is nonetheless continuously achieved. Absent such exogenous perturbations, however, the continuous reaching of the price level target ensures that complete confidence (or full credibility) is eventually conquered. Therefore, our central analytical result, which carries significant policy implications, is that complete confidence (or full credibility) is not a necessary condition for reaching a price level target even when heterogeneity in firms’ price level expectations is endogenously time-varying and may emerge as a long-run equilibrium outcome. In fact, in the absence of exogenous perturbations to the dynamic of confidence building, it is the reaching of a price level target for long enough that, due to stickiness in the state of confidence, rather ensures the conquering of full credibility.

2. STRUCTURE OF THE MODEL

Consider a monopolistically competitive economy populated by a continuum of firms as in
^{Ball and Romer (1991)}. In their model money is
introduced by assuming that it is a means of exchange required for transactions, which
allows taking the money supply as a proxy for the nominal aggregate demand. We draw on
the model developed in ^{Ball and Romer (1991)} due
to its focus on the product market (the economy is populated by yeoman farmers who sell
differentiated goods produced with their own labor and purchase the products of all
other farmers) and its assumption of a continuum of firms. This is a more convenient
formal structure for the evolutionary game-theoretic modeling developed in what follows,
which is based on random pairwise comparisons of payoffs by a large measure of
firms.

As in this monopolistically competitive economy the optimal individual price depends
positively upon the other firms’ prices, there is strategic complementarity in price
setting. A fraction *λ* ∈ [0,1] ⊂ ℝ of price setters, by trusting
the monetary authority’s commitment to price level targeting, solves the corresponding
foresight problem by following the costless but unsure rule-of-thumb strategy of using the
price level target periodically announced by the monetary authority to set the optimal
individual price (we call them confident firms). What
we are suggesting here is that: (a) in general, by specifying transparent policy rules,
policy makers can perform an intrinsically useful function by creating conventional
anchors for expectations formation; and (b) a price level target (the pursuit of a
desirable price level enshrined in a clearly announced target that policy makers
credibly and accountably commit to achieve) exemplifies the creation of a conventional
anchor for expectations by policy makers. The remaining fraction 1 −
*λ* of price setters, by not trusting the monetary authority’s
commitment to price level targeting, rather solves the same foresight problem by paying
a cost to perfectly predict the aggregate price level and, accordingly, set the optimal
individual price (we call them non-confident firms).

In fact, ^{Diron and Mojon (2005)} use data for
seven inflation targeting countries to provide evidence that the forecast error incurred
when assuming that future inflation will be equal to the inflation target announced by
the central bank is typically at least as small as (and often smaller than) the forecast
errors of model-based and published inflation forecasts. Meanwhile, ^{Brazier, Harrison, King, and Yates (2008)} develop a
model in which agents use two heuristics to forecast inflation: one is based on
one-period lagged inflation, the other on an inflation target announced by the central
bank (which is the steady-state value of inflation). Agents switch between these
heuristics based on an imperfect assessment of how each has performed in the past.
Agents observe such performance with some noise, but the better the true past
performance of a heuristic, the greater chance there is that an agent uses it to make
the next period’s forecast. The authors find that, on average, the majority of agents
use the inflation-target heuristic, even though there are times when everyone does, and
times when no one does. While ^{Brazier et al.
(2008)} embed those two forecasting heuristics in a monetary
overlapping-generations model and heuristic switching is described by a discrete choice
model, in this paper a different pair of forecasting heuristics is embedded in a
macroeconomic model featuring a monetary authority that conducts a best-reply monetary
policy and private decision makers switch between heuristics based on evolutionary
dynamics with and without exogenous perturbations analogous to mutations in a biological
system. De Grauwe (2011) develops a macro model in which agents have cognitive
limitations and use simple but biased heuristics to forecast future inflation. The
author follows ^{Brazier et al. (2008)} in allowing
for two inflation forecasting rules. One heuristic is based on the announced inflation
target, while the other heuristic uses last period’s inflation to forecast next period’s
inflation. The market forecast is a weighted average of these two forecasts, with these
weights being subject to predictor selection dynamics based on discrete choice theory.
While ^{De Grauwe (2011)} formulates an extended
three-equation model generating endogenous and self-fulfilling waves of optimism and
pessimism, the model of this paper explores whether there is convergence towards an
equilibrium consistent with the price level targeted by policy makers when private
decision makers switch between forecasting heuristics based on evolutionary
dynamics.

This paper is also related to ^{Arifovic, Dawid,
Deissenberg, and Kostyshyna (2010)}, who investigate the role of announcements
by the policy maker as a means to sustain a Pareto superior macroeconomic outcome. Each
private agent can choose in any period between two strategies: believe, that is, act as
if the policy announcement was true; or not believe, and compute the best possible
forecast of the policy maker’s next action. In each period, word of mouth information
exchange allows a fraction of the agents to compare their last-period payoffs with the
ones obtained by agents who followed the other strategy, with each agent then adopting
the strategy that provided the highest payoff. Therefore, the proportion of believers
may change over time and can be interpreted as a measure of the policy maker’s
credibility. However, while in ^{Arifovic et al.
(2010)} the policy maker has limited abilities for dynamic optimization and
forecasting, and uses individual evolutionary learning to improve his strategy, in this
paper the policy maker performs dynamic optimization knowing the evolutionary dynamics
that govern the distribution of forecasting strategies in the population of agents.
Moreover, in this paper a slightly different pair of forecasting strategies is embedded
in a macroeconomic model featuring a monetary authority that computes the best-reply
monetary policy and private agents switch between strategies based on evolutionary
dynamics with and without mutation. Also, while the dynamic extension set forth in ^{Arifovic et al. (2010)} is an agent-based model of a
more complex environment whose results have to be derived through simulation, in the
model below all results are derived analytically.

The optimal individual price, *P _{n}*, which is the price set by
non-confident firms at a strictly positive prediction cost, is given as in
^{Ball and Romer (1991}, p.542, eq.11), varying positively with the actual aggregate price level
*P* and the (publicly known) nominal stock of money, *M*:

where *p _{n}* = ln *P _{n}*, *p* = ln *P*, *m* = ln *M* and *Φ* ∈ (0,1) ⊂ ℝ is a constant denoting the elasticity of each individual price with respect to the actual aggregate price level. Given the functional form of the utility function adopted by ^{Ball and Romer (1991}, p.540, eq.1), the parameter *Φ* depends positively on the elasticity of substitution between any two goods and negatively on the rate of change of the marginal disutility of labor.

Meanwhile, confident firms follow the costless but unsure strategy of using the price
level target periodically announced by the monetary authority,
*P ^{T}*, to predict the optimal individual price, *P _{c}*:

where *p _{c}* ≡ ln *P _{c}* and *p ^{T}* ≡ ln *P ^{T}*. Therefore, while in ^{Ball and Romer (1991)} the alternative strategy to paying an exogenously fixed cost of adjusting prices is to keep prices unchanged, here the corresponding strategy is to set prices using the other publicly announced policy variable (whose reach, however, unlike any implicit target for the publicly known *m*, can be confirmed only after all prices have been set).^{1}

Based on ^{Ball and Romer (1991}, p.541, eq.6), the aggregate price level,
*P*, is approximated by the geometric average of the price set by
confident firms, *P _{c}*, and the price set by non-confident
firms, *P _{n}*, that is,^{2}

Substituting (1) and (2) in (3), we obtain:

Therefore, the aggregate price level is a weighted average between
*p ^{T}* and *m*. Note that as *λ* converges to one, the aggregate price level converges to *p* = *Φp ^{T}* + (1 − *Φ*)*m* (or, equivalently, *P* = (*P ^{T}*)^{Φ}*M*^{1−Φ}). Meanwhile, as *λ* converges to zero, the aggregate price level converges to *p* = *m* (or, equivalently, *P* = *M*), which is (per (1)) the symmetric Nash equilibrium price. Consequently, when all firms confidently come to use the price level target announced by the monetary authority in seeking to establish the optimal individual price (*λ* = 1), the monetary authority has to set *p ^{T}* = *m* for the symmetric Nash equilibrium price to obtain.
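For reference, the four relations invoked in this derivation can be written out in log form. The block below is a reconstruction from the definitions and limiting cases stated in the text (the elasticity *Φ*, the geometric average weighted by *λ*, and the limits at *λ* = 0 and *λ* = 1), so its exact layout is an assumption rather than a transcription of the original displays:

```latex
\begin{align}
p_n &= \Phi\, p + (1-\Phi)\, m, \tag{1}\\
p_c &= \Phi\, p^{T} + (1-\Phi)\, m, \tag{2}\\
p   &= \lambda\, p_c + (1-\lambda)\, p_n, \tag{3}\\
p   &= \frac{\lambda \Phi\, p^{T} + (1-\Phi)\, m}{1-(1-\lambda)\Phi}. \tag{4}
\end{align}
```

The two weights in (4) sum to one because *λΦ* + (1 − *Φ*) = 1 − (1 − *λ*)*Φ*. Setting *λ* = 1 gives *p* = *Φp ^{T}* + (1 − *Φ*)*m*, and setting *λ* = 0 gives *p* = *m*, which matches the two limits noted above.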

While the distribution of foresight strategies (*λ*, 1 −
*λ*) is given in the short run, it varies over time according to
evolutionary dynamics, with firms periodically revising their strategies in response to
changes in expected payoffs. A non-confident firm, by setting the actual optimal
individual price, does not suffer any loss caused by foresight errors. However, as a
non-confident firm faces a fixed cost *c* > 0 to perfectly predict the
aggregate price level, it suffers a loss given by:

Meanwhile, a confident firm is subject to a quadratic loss by using the official aggregate price level target to predict the optimal individual price. Using (1), (2) and (4), the loss of a confident firm can be expressed as follows:

By taking (5) and (6) as the expected payoffs of the existing foresight strategies, we
obtain the following replicator dynamic:^{3}

The intuition underlying the above expression is the following. In every revision
period, each firm with probability one (for simplicity) learns the payoff to another
randomly chosen firm and changes to the other’s foresight strategy if it perceives that
the other’s payoff is higher. As the expression between brackets in (7) is the
difference given by *L _{c}* − *L _{n}*, the proportion of confident firms increases (decreases) if the absolute value of the expected loss of confident firms is smaller (larger) than the cost to perfectly predict the aggregate price level. Under the replicator dynamic (7), to put it alternatively, the frequency of a foresight strategy increases exactly when it has above-average payoff.

Moreover, we assume that the evolutionary dynamic (7) operates in the presence of
disturbances analogous to mutations in natural environments. In a biological setting,
mutation is interpreted literally, consisting of random changes in genetic codes. In
economic settings, as pointed out by ^{Samuelson
(1997}, ch.7), mutation refers to a situation in which a player refrains from
comparing payoffs and changes strategy at random. Hence the present model features
mutation as an exogenous disturbance in the evolutionary selection mechanism (7) leading
some firms to choose a foresight strategy at random. This disturbance component is
intended to capture the effect, for instance, of exogenous institutional factors such as
changes of administration in the monetary authority or other changes in the
policy-making framework (which nonetheless do not involve an abandonment of the price
level targeting regime).^{4} A question that then
arises is whether the occurrence of such an exogenous disturbance (or noise) precludes
the continuous reaching of the price level target. As shown shortly, the answer is no.
Yet the long-run equilibrium distribution of foresight strategies does depend on whether
the evolutionary dynamic (7) is so perturbed.

Drawing on ^{Gale, Binmore, and Samuelson (1995)},
mutation can be incorporated into the selection mechanism (7) as follows. Let
*θ* ∈ (0,1) ⊂ ℝ be the measure of mutant
firms that choose a foresight strategy in a given revision period independently of the
related payoffs. Therefore, there are *θλ* confident firms
and *θ*(1 − *λ*) non-confident firms acting
as mutants. We assume that mutant firms choose one of the two foresight strategies with
the same probability, so that there are *θλ*/2 confident mutant firms and
*θ*(1 − *λ*)/2 non-confident mutant firms changing
strategies. The net flow of mutant firms becoming confident firms in a given revision
period, which can be either positive or negative, is the following:

This noise can be incorporated into the evolutionary selection mechanism (7) to yield the following noisy (perturbed) replicator dynamic:

Therefore, for given values of *p ^{T}* and *m*, the state transition of the economy is determined by the dynamic system (9), whose state space is [0,1] ⊂ ℝ.
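The payoffs and dynamics described in words above can be collected in one place. This is a sketch consistent with the text under two labeled assumptions: the quadratic loss of a confident firm is taken to be *β*(*p _{c}* − *p _{n}*)^{2} with a scale parameter *β* > 0, and mutation enters in the manner of Gale, Binmore, and Samuelson (1995), with a share 1 − *θ* of firms revising through payoff comparisons:

```latex
\begin{align}
L_n &= -c, \tag{5}\\
L_c &= -\beta\,(p_c - p_n)^2 = -\beta\,\Phi^2 (p^{T} - p)^2
     = -\beta \left[\frac{\Phi(1-\Phi)(p^{T}-m)}{1-(1-\lambda)\Phi}\right]^{2}, \tag{6}\\
\dot{\lambda} &= \lambda(1-\lambda)\,\bigl[L_c - L_n\bigr], \tag{7}\\
\text{net mutant inflow} &= \tfrac{1}{2}\theta(1-\lambda) - \tfrac{1}{2}\theta\lambda
     = \theta\left(\tfrac{1}{2}-\lambda\right), \tag{8}\\
\dot{\lambda} &= (1-\theta)\,\lambda(1-\lambda)\,\bigl[L_c - L_n\bigr]
     + \theta\left(\tfrac{1}{2}-\lambda\right). \tag{9}
\end{align}
```

The second equality in (6) uses *p _{c}* − *p _{n}* = *Φ*(*p ^{T}* − *p*), and the third uses the expression for the aggregate price level in terms of *p ^{T}* and *m*. Setting *θ* = 0 in (9) recovers the unperturbed dynamic (7).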

3. BEST-REPLY MONETARY POLICY AND EQUILIBRIUM DISTRIBUTION OF FORESIGHT STRATEGIES

The monetary authority is assumed to aim to minimize
∫_{0}^{Շ}(*p* − *p ^{T}*)^{2}*e*^{−θt}d*t*, where *Շ* > 0 is the terminal planning time and *θ* > 0 is a constant and exogenously given intertemporal discount rate. Using (4), the monetary authority’s objective functional can be expressed as

Therefore, the best-reply monetary policy is derived as the path of control variable
*m* that minimizes the objective functional (10) subject to the noisy
replicator dynamic (9), given the initial proportion of confident firms, denoted by
*λ*_{0} = *λ*(0), and the
terminal time *Շ*.^{5}

The current-value Hamiltonian of this optimal control problem is given by

where *μ* is the current-value costate variable.

Using the Maximum Principle (MP), we can determine the path of *m* that
minimizes *H _{c}* at each moment of time. Given that *m* ∈ ℝ_{++}, we can use the first-order condition for an interior minimum:

The above condition excludes *m* ≠ *p ^{T}*
for all *t* ∈ [0,*Շ*] ⊂ ℝ. Indeed, for all *Φ* ∈ (0,1) ⊂ ℝ and *λ* ∈ [0,1] ⊂ ℝ, if *m* ≠ *p ^{T}* for all *t* ∈ [0,*Շ*] ⊂ ℝ, the sign of the derivative of the integrand in (10) with respect to the state variable is strictly negative. In sum,

Based on Lemma 3.1 stated in ^{Caputo (2005}, p.56),
it follows from (13) that *μ*(*t*) < 0 for all
*t* ∈ [0,*Շ*) ⊂ ℝ. Considering this fact and the
transversality condition given by the free terminal boundary condition
*μ*(*Շ*) = 0 (as established in ^{Caputo 2005}, p.32), it then follows that 1 −
*μ*(1 − *θ*)*λ*(1 −
*λ*)*βΦ*^{2} > 0 for
all *β* > 0, *Φ* ∈ (0,1) ⊂ ℝ,
*λ* ∈ [0,1] ⊂ ℝ, and *t* ∈
[0,*Շ*] ⊂ ℝ. Therefore, the inequality given by
*m* ≠ *p ^{T}* for all *t*
∈ [0,*Շ*] ⊂ ℝ implies that ∂*H _{c}*/∂*m* ≠ 0. To put it another way, with
*m* ≠ *p ^{T}* along the optimal path of
the control variable the first-order condition (12) would not be satisfied.
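The argument can be made concrete by writing out the derivative of the Hamiltonian with respect to the control. The expressions below are a sketch, assuming that the current-value Hamiltonian combines the integrand of (10) with the costate times the right-hand side of (9), and that the confident firm's loss is quadratic with scale *β*:

```latex
\begin{align*}
H_c &= (p-p^{T})^2 + \mu\left\{(1-\theta)\,\lambda(1-\lambda)
       \bigl[c-\beta\Phi^2(p-p^{T})^2\bigr]
       + \theta\left(\tfrac12-\lambda\right)\right\},\\
p-p^{T} &= \frac{(1-\Phi)(m-p^{T})}{1-(1-\lambda)\Phi},\\
\frac{\partial H_c}{\partial m} &=
       \frac{2(1-\Phi)^2\,(m-p^{T})}{\bigl[1-(1-\lambda)\Phi\bigr]^2}
       \Bigl[1-\mu\,(1-\theta)\,\lambda(1-\lambda)\,\beta\Phi^2\Bigr].
\end{align*}
```

Under these assumptions the bracketed factor is strictly positive whenever *μ* ≤ 0, so the derivative can vanish only at *m* = *p ^{T}*.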

Meanwhile, if

the inequalities in (13) become equalities for all *t* ∈
[0,*Շ*] ⊂ ℝ. Thus, it follows from the Lemma 3.1
recalled above that

Given (15), the control-variable path (14) satisfies the necessary condition (12).

Indeed, the control-variable path (14) minimizes (11) given that

Substituting (14) into (4), it follows that the aggregate price level is given by

The equation of motion for the state variable *λ* is obtained from
the following MP’s condition:

which is the noisy replicator dynamic (9). Given the optimal path of the control variable (14), (9) becomes

The above expression implies that there is a unique equilibrium value for
*λ*, which is given by

The path of the costate variable, given by (15), satisfies the transversality condition
for an optimal control problem with finite-horizon planning and free terminal state,
given by *μ*(*Շ*) = 0. Moreover, if (15)
holds for all *t* ∈ ℝ_{+}, the transversality
condition for an infinite-horizon planning and free terminal state, given by
lim_{t→∞} *μ*(*t*)*e*^{−θt} = 0, is satisfied as well. And the other corresponding transversality condition given by lim_{t→∞} *H*(*t*) = 0 is also satisfied. In fact, the present-value Hamiltonian can be alternatively expressed as *H* = *H _{c}e*^{−θt}. Given that *m* = *p ^{T}*, *μ*(*t*) = 0, and *λ* ∈ [0,1] ⊂ ℝ for all *t* ∈ ℝ_{+}, it then follows from (11) that *H _{c}* = 0 for all *t* ∈ ℝ_{+}, so that lim_{t→∞} *H*(*t*) = lim_{t→∞} *H _{c}e*^{−θt} = 0.

The graph of the function *g*(*λ*), given by (18),
is a concave-down parabola with intercept given by *g*(0) =
*θ/*2 > 0. This quadratic function has two distinct real
roots, one of them being

and the other one being given by (19). Therefore, for all *λ*
∈ [0,*λ*^{*}) ⊂ ℝ it follows that d*λ*/d*t* =
*g*(*λ*) > 0, while for all
*λ* ∈ (*λ*^{*},1] ⊂ ℝ
it follows that d*λ*/d*t* =
*g*(*λ*) < 0, so that the equilibrium point
*λ*^{*} ∈ (0,1) ⊂ ℝ is asymptotically stable.
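The root structure and the sign pattern described above can be checked numerically. The sketch below assumes that, under *m* = *p ^{T}*, the function *g* takes the form (1 − *θ*)*cλ*(1 − *λ*) + *θ*(1/2 − *λ*), which is consistent with the stated intercept *g*(0) = *θ*/2 and with the concave-down shape; the parameter values and function names are hypothetical and chosen only for illustration.

```python
import math


def g(lam, theta, c):
    """Assumed form of (18): confidence dynamic when m = p^T, so L_c - L_n = c."""
    return (1.0 - theta) * c * lam * (1.0 - lam) + theta * (0.5 - lam)


def roots_of_g(theta, c):
    """Return both roots of g, smaller first, via the quadratic formula."""
    a = (1.0 - theta) * c                      # coefficient on lam * (1 - lam)
    disc = math.sqrt(a * a + theta * theta)    # square root of the discriminant
    return (a - theta - disc) / (2.0 * a), (a - theta + disc) / (2.0 * a)


# Illustrative parameter values (hypothetical, not taken from the paper).
theta, c = 0.1, 0.5
negative_root, lam_star = roots_of_g(theta, c)

print(f"roots of g: {negative_root:.4f} and {lam_star:.4f}")
print("g > 0 just below lam*:", g(lam_star - 0.01, theta, c) > 0)
print("g < 0 just above lam*:", g(lam_star + 0.01, theta, c) < 0)
```

Under this assumed form, one root is negative and therefore outside the state space, while the other lies strictly between 1/2 and 1, since *g*(1/2) > 0 and *g*(1) < 0.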

Consequently, the best-reply monetary policy is to set *m* =
*p ^{T}* at each moment of the planning horizon, which
ensures *p* = *p ^{T}* all along. We can interpret this best-reply policy-making in two alternative ways, according to which variable is assigned the role of policy instrument (or control variable). First, for an exogenously given aggregate price level target, the best-reply monetary policy is to set *m* = *p ^{T}* throughout. Second, for an exogenously given nominal money supply, the best-reply monetary policy-making is to set *p ^{T}* = *m* all along.
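A small numerical check illustrates why this policy hits the target regardless of the state of confidence. It is a sketch that assumes the aggregate price level in (4) takes the form *p* = [*λΦp ^{T}* + (1 − *Φ*)*m*] / [1 − (1 − *λ*)*Φ*], which reproduces the limits stated for *λ* = 0 and *λ* = 1; the function name and parameter values are hypothetical.

```python
def aggregate_price(lam, phi, p_target, m):
    """Log aggregate price level, assumed form of (4)."""
    return (lam * phi * p_target + (1.0 - phi) * m) / (1.0 - (1.0 - lam) * phi)


phi, p_target = 0.6, 2.0            # hypothetical values for illustration
confidence_levels = (0.0, 0.3, 0.7, 1.0)

# Best reply m = p^T: the price level equals the target at every confidence level.
on_target = [aggregate_price(lam, phi, p_target, p_target)
             for lam in confidence_levels]

# Any other m (here 2.5) leaves a gap between the price level and the target.
off_target_gaps = [aggregate_price(lam, phi, p_target, 2.5) - p_target
                   for lam in confidence_levels]

print("price level with m = p^T:", on_target)
print("gap p - p^T with m = 2.5:", off_target_gaps)
```

In this sketch the gap *p* − *p ^{T}* is proportional to *m* − *p ^{T}* for every *λ*, so it vanishes only when *m* = *p ^{T}*.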

Meanwhile, the fact that the best-reply monetary policy given by *m* =
*p ^{T}* ensures *p* = *p ^{T}* all along even when imperfect credibility emerges as an asymptotically stable equilibrium outcome, which is the case when there is mutation, as given by (19), does not mean that the monetary authority ignores, or can afford to ignore, the importance of conquering credibility for monetary policy. In fact, the monetary authority is permanently striving to conquer or maintain the highest possible credibility, and clear and reliable evidence of such confidence-building behavior is that the best-reply monetary policy is derived as the path of the nominal stock of money that minimizes the objective functional (10) subject to the noisy replicator dynamic (9) that governs the evolution of the state of confidence across price setters.

Furthermore, if we draw once again on ^{Ball and Romer
(1991)} and incorporate into the model an aggregate demand relation given by
γ = *m* − *p*, where γ is the logarithm of
the real output,^{6} we find that the best-reply
monetary policy yields an equilibrium real output gap which is zero. In fact, it follows
from (4) that the best-reply monetary policy given by *m* =
*p ^{T}* implies that the equilibrium price level is given
by *p* = *m*, so that the equilibrium log real output is zero (and, therefore, the equilibrium level of real output is one). As a result, the equilibrium real output is the natural real output.^{7}

Meanwhile, when the evolutionary dynamic driving the distribution of foresight
strategies in the population of price setters is subject to exogenous perturbations, the
unique equilibrium point *λ*^{*} ∈ (0,1) ⊂ ℝ is asymptotically
stable. Even though confidence is never fully conquered, the price level target is
nonetheless continuously achieved. Meanwhile, absent perturbations, which is equivalent
to setting *θ* = 0, the equation of motion (18) yields two
equilibrium solutions, *λ*^{**} = 0 and
*λ*^{***} = 1. As the graph of the function
*g*(*λ*) under *θ* = 0 is
a concave-down parabola, it follows that the equilibrium point
*λ*^{**} = 0 is unstable, while the equilibrium point
*λ*^{***} = 1 is asymptotically stable.
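The contrast between the perturbed and unperturbed cases can be illustrated with a short simulation. This is a minimal sketch assuming that, under *m* = *p ^{T}*, the equation of motion is d*λ*/d*t* = (1 − *θ*)*cλ*(1 − *λ*) + *θ*(1/2 − *λ*); the forward-Euler scheme, the step size, and the parameter values are illustrative choices, not part of the model.

```python
def g(lam, theta, c):
    """Assumed right-hand side of the confidence dynamic when m = p^T."""
    return (1.0 - theta) * c * lam * (1.0 - lam) + theta * (0.5 - lam)


def simulate(lam0, theta, c, horizon=100.0, dt=0.01):
    """Forward-Euler path of lam; returns the value reached at the horizon."""
    lam = lam0
    for _ in range(int(round(horizon / dt))):
        lam += dt * g(lam, theta, c)
    return lam


c = 0.5  # hypothetical prediction cost

# Without mutation (theta = 0): any positive initial confidence grows to one,
# while lam = 0 is a rest point that is never left.
no_noise_from_low = simulate(0.05, theta=0.0, c=c)
no_noise_from_zero = simulate(0.0, theta=0.0, c=c)

# With mutation (theta = 0.1): paths from both sides settle at an interior value.
with_noise_from_low = simulate(0.05, theta=0.1, c=c)
with_noise_from_high = simulate(0.99, theta=0.1, c=c)

print("theta = 0.0:", no_noise_from_low, no_noise_from_zero)
print("theta = 0.1:", with_noise_from_low, with_noise_from_high)
```

In this sketch the two perturbed paths end at the same interior value, whereas the unperturbed path that starts with positive confidence approaches one.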

Therefore, when the evolutionary selection dynamic is not perturbed by the interference
of mutations, the continuous achievement of the official price level target comes to
ensure that complete confidence (or full credibility) is eventually conquered.
Paraphrasing how ^{Samuelson (1997}, p.3) aptly
describes an equilibrium emerging from rule-of-thumb behavior through evolutionary
dynamics:^{8} an equilibrium solution with the
aggregate price level target being achieved does not appear because all price setters
are fully confident, but rather price setters eventually become fully confident because
a best-reply monetary policy ensuring the achievement of that target as an equilibrium
solution has been followed for long enough. Though derived in a specific macroeconomic
setting, this analytical result due to confidence stickiness has broader implications
for the design of monetary policy in pursuit of price stability. One such implication is
that setting a price level target matters more as a means to provide monetary policy
with a sharper focus on price stability than as a device to conquer credibility. As for
the conquering of credibility for monetary policy, it turns out that actions speak
louder than words, as the continuing achievement of price stability is what ultimately
performs better as a confidence-building device. Thus, the role of central banking as
management of expectations (^{Woodford, 2004}) is
performed in the context of stickiness in the state of confidence in the monetary
policy-making held by price setters, which implies that the best-reply monetary policy
has to be formally and explicitly derived also subject to such stickiness constraint
(recall that the best-reply monetary policy (14) is derived subject to the noisy
replicator dynamic (9) that drives the evolution of the frequency distribution of
confidence across price setters).

4. CONCLUSIONS

This paper contributes to the literature on the importance of the public’s confidence in the monetary policy stance by deriving a best-reply monetary policy when price-setters’ state of confidence in the commitment of the monetary authority to price level targeting is sticky and, therefore, may be persistently incomplete. While confident price setters use the official price level target to form price expectations which may, therefore, prove incorrect, non-confident price setters attain perfect foresight by paying an optimization cost.

In an innovative approach, the frequency distribution of the confidence in the price level targeting across firms follows evolutionary dynamics wherein firms revise their strategies in response to expected payoff differentials. Meanwhile, the monetary authority uses best-reply monetary policy to manage the state of confidence of the price setters. When such an evolutionary game of conquering confidence is subject to noise, in the long-run equilibrium not all firms form price expectations using the price level target, yet the latter is nonetheless achieved. Absent such noise, however, in the long-run equilibrium both the price level target is achieved and all firms form price expectations using the price level target. Consequently, complete confidence in the price level target is not a necessary condition for achieving such a target even when heterogeneity in firms’ price expectations is endogenously time-varying and may emerge, due to stickiness in the state of confidence, as a long-run equilibrium outcome.

In deriving our qualitative results, we assume that the best-reply monetary policy is
conducted by exogenously setting the optimal nominal money stock. However, whether
monetary policy is conducted by exogenously setting the optimal nominal money stock (and
letting the interest rate adjust) or the optimal nominal interest rate (and supplying
the required nominal money stock) is just a difference in the operational procedures
which does not change our qualitative results. Moreover, although we assume that the
monetary authority targets a given price level, our qualitative results likewise do not
change if we assume an inflation target instead. In this case, monetary policy should
follow a variant of the κ-percent rule advocated by ^{Friedman (1960)} and others, according to which the nominal stock of
money should grow steadily at a rate of κ. In a monetary regime with inflation
target, therefore, κ becomes intuitively equal to the inflation target.

Although derived in a specific macroeconomic setting, our analytical results due to stickiness in the state of confidence have broader implications for the design of monetary policy. One main implication is that for the conquering of credibility for monetary policy, actions speak louder than words, as the steady achievement of price stability is what works better as a confidence-building mechanism. As the role of central banking as management of expectations is played under confidence stickiness, the best-reply monetary policy has to be derived also subject to such stickiness.