Abstract
It is emphasized that non-Markovian processes, which occur for instance in the case of colored noise, cannot be considered merely as corrections to the class of Markov processes but require special treatment. Many familiar concepts, such as first-passage times, no longer apply and need to be reconsidered. Several methods of dealing with non-Markov processes are discussed. As an example, a recent application to the transport of ions through a membrane is briefly mentioned.
* This text corresponds to an invited talk at the "Workshop on the Foundations of Statistical Mechanics and Thermodynamics" held in Natal, Brazil, in October 1997.
N.G. van Kampen
Institute for Theoretical Physics, University Utrecht
Princetonplein 5, 3584 CC Utrecht,
The Netherlands
Received March 9, 1998
I. Definition of Markov processes
The term 'non-Markov process' covers all random processes with the exception of the very small minority that happens to have the Markov property.
FIRST REMARK. Non-Markov is the rule, Markov is the exception.
It is true that this minority has been extensively studied, but it is not proper to treat non-Markov processes merely as modifications or corrections of the Markov processes; that is as improper as, for instance, treating all nonlinear dynamical systems as corrections to the harmonic oscillator. I therefore have to start by reviewing some general facts [1, 2].
A stochastic process is a collection of random variables X_{t}, labeled by an index t which may be discrete but more often covers all real numbers in some interval. The stochastic properties of {X_{t}} are expressed by the joint distribution functions
$$P_n(x_1,t_1;\, x_2,t_2;\, \ldots;\, x_n,t_n). \tag{1}$$
The process is uniquely defined by the entire set of these distribution functions for n = 1, 2, ..., which in general is infinite.
When the values X_{t_1} = x_{1}, ..., X_{t_k} = x_{k} are given, the remaining variables obey the conditional probability distribution function
$$P(x_{k+1},t_{k+1};\, \ldots;\, x_n,t_n \mid x_1,t_1;\, \ldots;\, x_k,t_k) = \frac{P_n(x_1,t_1;\, \ldots;\, x_n,t_n)}{P_k(x_1,t_1;\, \ldots;\, x_k,t_k)}.$$
This is a probability distribution of X_{t_{k+1}}, ..., X_{t_n}, in which x_{1}, ..., x_{k} enter as parameters. Let us take the t_{i} in chronological order; then the process is Markov if this conditional probability depends on the latest value x_{k} at t_{k} alone and is independent of the earlier values x_{1}, ..., x_{k-1}. This must hold for all n, for any choice of k, and for any t_{1}, ..., t_{n} and x_{1}, ..., x_{n}. If this is true, all P_{n} can be constructed once P_{1} and P_{2} are given. For example,
$$P_3(x_1,t_1;\, x_2,t_2;\, x_3,t_3) = P_1(x_1,t_1)\, P(x_2,t_2 \mid x_1,t_1)\, P(x_3,t_3 \mid x_2,t_2). \tag{2}$$
SECOND REMARK. The reason for the popularity of Markov processes is the fact that they are fully determined by these two functions alone. For non-Markov processes the distribution functions (1) must be determined by some other means, usually an entirely different mathematical construction. For M-processes it therefore makes sense to honor the function P(x_{2}, t_{2} | x_{1}, t_{1}) with the name transition probability.
II. Example
Symmetric random walk in 1 dimension. Here t = 0, 1, 2, ... and x takes integer values i. The process is Markovian with symmetric transition probability
$$P(i, t+1 \mid i', t) = \tfrac{1}{2}\,\delta_{i,\,i'+1} + \tfrac{1}{2}\,\delta_{i,\,i'-1}.$$
But suppose the walker has a tendency to persist in his direction: probability p to step in the same direction, and q = 1 - p to return [3]. Then X_{t} is no longer Markovian, since the probability of X_{t} depends not just on x_{t-1} but also on x_{t-2}. This may be remedied by introducing the two-component variable {X_{t}, X_{t-1}}. This joint variable is again Markovian, with transition probability
$$P(i_1, i_2, t+1 \mid i_1', i_2', t) = \delta_{i_2,\,i_1'} \left[\, p\,\delta_{i_1 - i_2,\; i_1' - i_2'} + q\,\delta_{i_1 - i_2,\; i_2' - i_1'} \right].$$
THIRD REMARK. A physical process (i.e. some physical phenomenon evolving in time) may or may not be Markovian, depending on the variables used to describe it.
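The persistent walk of Section II can be checked numerically. The following is only a sketch: the function names are hypothetical, and the pair transition rule is written in the form that the text describes in words (the new step repeats the previous one with probability p and reverses it with probability q = 1 - p).

```python
import random

def simulate_persistent_walk(n_steps, p, seed=0):
    """Return positions X_0..X_n of a walk that repeats its last step with probability p."""
    rng = random.Random(seed)
    step = rng.choice((-1, 1))      # the first step has no predecessor: pick at random
    xs = [0, step]
    for _ in range(n_steps - 1):
        if rng.random() >= p:       # with probability q = 1 - p the walker turns back
            step = -step
        xs.append(xs[-1] + step)
    return xs

def pair_transition(new, old, p):
    """Transition probability (X_t, X_{t-1}) = old  ->  (X_{t+1}, X_t) = new.

    The pair is Markov because the previous step is part of the state.
    """
    (i1, i2), (j1, j2) = new, old
    if i2 != j1 or abs(j1 - j2) != 1:
        return 0.0                  # inconsistent pair, or not a unit step
    if i1 - i2 == j1 - j2:
        return p                    # same direction as before
    if i1 - i2 == j2 - j1:
        return 1.0 - p              # reversed direction
    return 0.0

def conditional_up_frequency(xs, previous_step):
    """Empirical P(next step = +1 | last step = previous_step)."""
    hits = total = 0
    for a, b, c in zip(xs, xs[1:], xs[2:]):
        if b - a == previous_step:
            total += 1
            hits += (c - b == 1)
    return hits / total
```

For p different from 1/2 the two conditional frequencies differ, which is the statement that X_{t} alone does not carry enough information, while the pair does.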
If the memory of our random walk involves more preceding steps, more additional variables are needed. That no longer works, however, if the memory extends over all previous steps. Example: polymers with excluded volume. This problem is often modelled as a random walk, but it is irremediably non-Markov and has not been solved [4].
III. The master equation
Take a Markov process in which t is the time while X_{t} takes discrete values i = 0, 1, 2, ... In eq. (2) take t_{3} = t_{2} + Δt,
$$P_3(i_1,t_1;\, i_2,t_2;\, i_3,t_2+\Delta t) = P_1(i_1,t_1)\, P(i_2,t_2 \mid i_1,t_1)\, P(i_3,t_2+\Delta t \mid i_2,t_2).$$
Sum over i_{2} and take the limit Δt → 0 to obtain the master equation
$$\frac{\partial}{\partial t} P(i,t \mid i_1,t_1) = \sum_{i'} \left\{ W_{ii'}\, P(i',t \mid i_1,t_1) - W_{i'i}\, P(i,t \mid i_1,t_1) \right\}. \tag{3}$$
The W_{ii'} are transition probabilities per unit time and are properties belonging to the physical system (such as squares of matrix elements), while P refers to the state of the system. The parameters i_{1}, t_{1} are often not written, which may lead to the misconception that P in (3) is the same as P_{1} in (1).
FOURTH REMARK. The master equation is an equation for the transition probability of a Markov process, valid for any initial i_{1}, t_{1}. If one knows in addition P_{1}(i_{1}, t_{1}), the whole hierarchy (1) and thus the process is uniquely determined (for t ≥ t_{1}).
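As an illustration of the Fourth Remark, the sketch below solves a master equation of the usual gain-loss form for a three-state system and verifies that its solution behaves as a transition probability. The rate matrix `W_example` is invented for the example, and the matrix exponential is computed by a simple scaled Taylor series rather than a library routine.

```python
import numpy as np

def generator(W):
    """Matrix M with dP/dt = M P, built from rates W[i, j] for jumps j -> i."""
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)
    return W - np.diag(W.sum(axis=0))   # gain terms minus total loss rate of each state

def propagator(W, t, terms=30, squarings=10):
    """Transition matrix P(i, t1 + t | i1, t1) = exp(M t), via scaling and squaring."""
    A = generator(W) * t / 2**squarings
    term = np.eye(A.shape[0])
    total = term.copy()
    for k in range(1, terms):           # Taylor series of exp(A) for the scaled matrix
        term = term @ A / k
        total = total + term
    for _ in range(squarings):          # undo the scaling: exp(A)^(2^squarings)
        total = total @ total
    return total

# Hypothetical rates (per unit time), chosen only for illustration.
W_example = [[0.0, 2.0, 0.5],
             [1.0, 0.0, 0.3],
             [0.2, 0.7, 0.0]]
```

Each column of `propagator(W_example, t)` is a normalized distribution over final states, and propagators for successive intervals multiply, which is the Chapman-Kolmogorov property that a Markov transition probability must satisfy.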
Warning. In the literature one occasionally encounters something called a "master equation with memory",
$$\frac{\partial}{\partial t} P(i,t \mid i_1,t_1) = \int_{t_1}^{t} dt' \sum_{i'} \left\{ W_{ii'}(t-t')\, P(i',t' \mid i_1,t_1) - W_{i'i}(t-t')\, P(i,t' \mid i_1,t_1) \right\},$$
with the claim that it defines a non-Markov process. Objections. (i) A non-Markov process is not defined when merely P(i, t | i_{1}, t_{1}) is known. (ii) The equation cannot be true for every i_{1}, t_{1}. (iii) The equation is no guarantee that the process is not Markovian [5].
IV. Diffusion
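Let x be continuous as well. Then W_{ii'} takes the form of an integral kernel W(x | x'). If the process is such that during an infinitely short Δt only infinitely small jumps are possible, then the kernel reduces to a differential operator [6]. The simplest example is the diffusion equation for the coordinate x of a Brownian particle,
$$\frac{\partial P(x,t)}{\partial t} = D\, \frac{\partial^2 P(x,t)}{\partial x^2}. \tag{4}$$
The solution of this equation specified by the initial condition P(x, t_{1}) = δ(x - x_{1}) is the transition probability P(x, t | x_{1}, t_{1}).
Here the coordinate is treated as Markovian, although the particle has a velocity v as well. One ought therefore to consider the joint variable {x, v} as Markovian, with master equation
$$\frac{\partial P(x,v,t)}{\partial t} = -v\, \frac{\partial P}{\partial x} + U'(x)\, \frac{\partial P}{\partial v} + \gamma \left\{ \frac{\partial}{\partial v}\, vP + T\, \frac{\partial^2 P}{\partial v^2} \right\}. \tag{5}$$
This is Kramers' equation [7]; γ is a friction coefficient and U(x) an external potential.
In this picture x by itself is not Markov. How is that compatible with (4)? The answer is that for large γ the collisions are so frequent that the velocity distribution rapidly becomes locally Maxwellian,
$$P(x,v,t) \longrightarrow P(x,t)\, (2\pi T)^{-1/2} \exp\left[ -\frac{v^2}{2T} \right]. \tag{6}$$
This makes it possible to eliminate v so that there remains an equation for P(x, t) by itself [7,1], namely
$$\frac{\partial P(x,t)}{\partial t} = \frac{1}{\gamma}\, \frac{\partial}{\partial x} \left\{ U'(x)\, P + T\, \frac{\partial P}{\partial x} \right\}. \tag{7}$$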
V. First-passage problems
Consider one-dimensional diffusion in a potential field as given by (7). Let it take place in a finite medium x_{a} < x < x_{c} (Figure 1). When the diffusing particle starts at an interior point x_{b}, what are the chances that it will exit through x_{a} or x_{c}, respectively?
The answer is obtained by solving (7) with absorbing boundary conditions: P(x_{a}, t) = 0 and P(x_{c}, t) = 0. (The solution can be obtained explicitly thanks to the fact that the time does not occur in the equation for the probabilities.) It is clear that when x_{b} is at the top of a high maximum of U(x) the exit probabilities will be fifty-fifty. It is also possible to find the mean time for either exit [2].
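The exit probabilities can be evaluated by quadrature. The sketch below assumes that (7) is the usual overdamped diffusion equation with drift -U'(x)/γ and diffusion constant T/γ; for that equation the probability of leaving through x_{c} when starting at x_{b} is the standard ratio of the integral of exp[U(x)/T] from x_{a} to x_{b} to the same integral from x_{a} to x_{c}. The function names and the test potentials are illustrative only.

```python
import numpy as np

def _trapezoid(y, x):
    """Plain trapezoidal rule on a grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def exit_probability_right(U, x_a, x_b, x_c, T, n=20001):
    """Probability of exiting through x_c rather than x_a, starting at x_b.

    U must accept a NumPy array. The exponent is shifted by its maximum so
    that high barriers do not overflow; the shift cancels in the ratio.
    """
    left = np.linspace(x_a, x_b, n)
    right = np.linspace(x_b, x_c, n)
    e_left = U(left) / T
    e_right = U(right) / T
    shift = max(e_left.max(), e_right.max())
    i_left = _trapezoid(np.exp(e_left - shift), left)
    i_right = _trapezoid(np.exp(e_right - shift), right)
    return i_left / (i_left + i_right)

# Free diffusion: the exit probability is linear in the starting point.
flat = exit_probability_right(lambda x: 0.0 * x, 0.0, 0.25, 1.0, T=1.0)

# Start on top of a high, narrow maximum inside an asymmetric interval.
on_top = exit_probability_right(lambda x: -x**2, -1.0, 0.0, 2.0, T=0.01)
```

In the second case the interval is deliberately asymmetric, yet `on_top` comes out very close to 1/2, because the integrals are dominated by the immediate neighbourhood of the maximum.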
In three dimensions the question is: When I surround the potential maximum by a closed surface, what is the probability distribution of the exit points on that surface and how long does it take? This problem can be formulated in the same way, but cannot usually be solved analytically.
For a particle described by (5), however, the coordinate is not Markovian and it does not suffice therefore to know that x(t_{1}) = x_{b}: one also has to know its preceding history. In the present case, that history is represented by the value of v at t_{1}. Of course it is possible to simply pick an initial P_{1}(x, v, t_{1}), but the correct choice depends on the problem one wants to solve. For instance, I want to compute the autocorrelation function of x,
$$\langle x(t_1)\, x(t_2) \rangle = \int x_1 x_2\, P_1(x_1,v_1,t_1)\, P(x_2,v_2,t_2 \mid x_1,v_1,t_1)\; dx_1\, dv_1\, dx_2\, dv_2.$$
Evidently one needs to know the correct initial distribution P_{1}(x, v, t_{1}). Only in the limit of large γ may it be replaced with P_{1}(x, t_{1}) with the aid of (6).
FIFTH REMARK. For a non-Markov process the initial value problem is not well-defined unless further information about the problem is supplied.
Of more interest is the question of escape from a potential minimum such as x_{a} in Fig. 2. How long does it take to get across the barrier? Here the ambiguity of the initial velocity is harmless because the particle moves around in the potential valley long enough for the Maxwellian (6) to prevail.
In the case of diffusion described by (7) one may take the mean time of first arrival at x_{b} and multiply it by 2, because once in x_{b} there is equal probability to escape or go back. The mean first-passage time can again be computed analytically. In more dimensions, however, the question is: How long does it take to escape from a minimum x_{a} surrounded by a potential ridge? This mean time is determined by the lowest mountain pass on the ridge and an elegant approximation is available [8].
In Kramers' equation (5), however, it is not enough to compute the time for x(t) to reach x_{b} because x(t) is not Markovian and one cannot tell the probability of subsequent escape without taking into account v. It is necessary to reformulate the problem by picking a large x_{c} and solving (5) with boundary condition: P(x_{c}, v, t) = 0 for v < 0 (Fig. 3). This is the much discussed Kramers problem [7,9].
SIXTH REMARK. For non-Markov processes the first-passage problem may be formulated but its physical relevance is questionable.
VI. Langevin equation and colored noise
We start from Kramers' equation (5) and consider the spatially homogeneous case U' = 0. Then it is possible to integrate over x and obtain an equation for the distribution P(v, t) of v alone,
$$\frac{\partial P(v,t)}{\partial t} = \gamma \left\{ \frac{\partial}{\partial v}\, vP + T\, \frac{\partial^2 P}{\partial v^2} \right\}. \tag{8}$$
The Markov process v(t) described by this master equation is the Ornstein-Uhlenbeck process. A mathematically equivalent way of describing this process is the Langevin equation,
$$\frac{dv}{dt} = -\gamma v + \xi(t), \tag{9}$$
where ξ(t) is a stochastic function, called Gaussian white noise, whose stochastic properties are determined by
$$\langle \xi(t) \rangle = 0, \qquad \langle \xi(t)\, \xi(t') \rangle = C\, \delta(t-t'). \tag{10}$$
As a result, the solution v(t) (with given initial value v(t_{1}) = v_{1}) is also a stochastic function and Markovian, its master equation being (8) if one takes C = 2γT. The Langevin equation (9) with (10) is popular, because it is more intuitive than (8), but it is not better.
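The equivalence between the Langevin picture and the master equation can be probed with a simulation. The sketch below assumes the standard linear Langevin equation dv/dt = -γv + ξ(t) with white noise of strength C = 2γT, and integrates it with a plain Euler-Maruyama step; the step size, sample size and function name are arbitrary choices for illustration.

```python
import numpy as np

def simulate_ou(gamma, T, v0, t_end, dt=5e-3, n_paths=20000, seed=0):
    """Euler-Maruyama samples of v(t_end) for dv/dt = -gamma*v + xi(t), C = 2*gamma*T."""
    rng = np.random.default_rng(seed)
    C = 2.0 * gamma * T
    v = np.full(n_paths, float(v0))
    for _ in range(int(round(t_end / dt))):
        # white noise integrated over dt has variance C*dt
        v += -gamma * v * dt + np.sqrt(C * dt) * rng.standard_normal(n_paths)
    return v

# Long time: the memory of v0 is lost and the variance approaches T.
v_late = simulate_ou(gamma=2.0, T=0.5, v0=3.0, t_end=4.0)

# Short time: the mean still remembers v0 and decays as v0*exp(-gamma*t).
v_early = simulate_ou(gamma=2.0, T=0.5, v0=3.0, t_end=0.5)
```

The sample variance of `v_late` should approach T (up to a discretization bias of order γ·dt and sampling error), while the sample mean of `v_early` should lie close to v0·exp(-γt), as expected for the Ornstein-Uhlenbeck process.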
Numerous authors have generalized (9) by taking ξ(t) to be non-white or colored noise, i.e., not delta-correlated [10],
$$\langle \xi(t)\, \xi(t') \rangle = \varphi(t-t'), \tag{11}$$
with some even function φ.
SEVENTH REMARK. When the noise in the L-equation is not white the variable v(t) is not Markov. Hence for the Langevin equation with colored noise one cannot formulate a meaningful initial condition or a first-passage time.
A different generalization of the Langevin equation (9) is the nonlinear version [11],
$$\frac{dx}{dt} = f(x) + g(x)\, \xi(t). \tag{12}$$
When ξ(t) is white, then x(t) is Markov. Unfortunately the equation as it stands has no well-defined meaning and has to be supplemented by an "interpretation rule", either Itô or Stratonovich. To avoid the vagaries of the Itô calculus we choose the latter. It corresponds to the M-equation,
$$\frac{\partial P(x,t)}{\partial t} = -\frac{\partial}{\partial x}\, f(x) P + \frac{C}{2}\, \frac{\partial}{\partial x}\, g(x)\, \frac{\partial}{\partial x}\, g(x) P.$$
When ξ(t) is colored, then x(t) is not Markov and no M-equation exists. Attempts at constructing analogous partial differential equations are doomed to fail. Eq. (12) belongs to the general class of stochastic differential equations. They may be treated by approximation methods developed for the case that ξ(t) is off-white, i.e. has a short correlation time and a sharply peaked φ(t - t') [12]. The M-equation appears as the first approximation.
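Another device, applicable when this time is not short, was used by Kubo [13] and has been rediscovered many times [14]. Suppose ξ(t) is itself a Markov process governed by an M-equation, for example the O.U. process governed by (8). Then the joint variable (x, ξ) is Markov, with M-equation
$$\frac{\partial P(x,\xi,t)}{\partial t} = -\frac{\partial}{\partial x} \left\{ f(x) + g(x)\, \xi \right\} P + \gamma \left\{ \frac{\partial}{\partial \xi}\, \xi P + T\, \frac{\partial^2 P}{\partial \xi^2} \right\}. \tag{13}$$
Again, the non-Markov x has been tamed by introducing an additional variable.
In practice this is not of much help unless one chooses for ξ(t) an even simpler non-white process, viz., the "dichotomic Markov process". That is, ξ has two possible values +1, -1, and jumps with a constant probability γ per unit time, as described by the M-equation
$$\frac{dP_{+}}{dt} = -\gamma P_{+} + \gamma P_{-}, \qquad \frac{dP_{-}}{dt} = \gamma P_{+} - \gamma P_{-}.$$
With this choice equation (13) reduces to
$$\frac{\partial P_{\pm}(x,t)}{\partial t} = -\frac{\partial}{\partial x} \left\{ f(x) \pm g(x) \right\} P_{\pm} - \gamma P_{\pm} + \gamma P_{\mp},$$
which is less forbidding, but still the subject of many articles [15]. Other choices for ξ(t), two-valued but not Markovian, have also been considered [16].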
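To see in what sense dichotomic noise is colored, one can solve its two-state M-equation in closed form. The sketch assumes the symmetric form in which each of the two values ±1 is left with the same probability γ per unit time; the function names are hypothetical.

```python
import math

def dichotomic_probabilities(p_plus0, gamma, t):
    """Solve dP+/dt = -gamma*P+ + gamma*P- with P+ + P- = 1 and P+(0) = p_plus0."""
    p_plus = 0.5 + (p_plus0 - 0.5) * math.exp(-2.0 * gamma * t)
    return p_plus, 1.0 - p_plus

def dichotomic_autocorrelation(gamma, t):
    """Stationary <xi(t) xi(0)> for xi = +1 or -1.

    By symmetry this equals the conditional mean of xi(t) given xi(0) = +1.
    """
    p_plus, p_minus = dichotomic_probabilities(1.0, gamma, abs(t))
    return p_plus - p_minus
```

Under these assumptions the autocorrelation is exp(-2γ|t|): an even function of the time difference with correlation time 1/(2γ), rather than a delta function.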
VII. A class of non-Markov processes
In various connections the following type of process occurs [17,18]. Let X_{t} have two or more possible states j, and in each it has a probability per unit time γ_{ij} to jump to i. If the γ_{ij} are constants, {X_{t}} is a Markov process with M-equation (3). However, suppose they are functions γ_{ij}(τ) of the time τ elapsed since arrival in j. Examples: Molecules in solution may get stuck to the wall temporarily; a bacterium produces offspring after reaching a certain age. Other examples in [18,1].
The probability that X is still in j at a time τ after entering it is given by
$$u_j(\tau) = \exp\left[ -\int_0^{\tau} \sum_i \gamma_{ij}(\tau')\, d\tau' \right]. \tag{14}$$
When starting at t = 0 in j_{0}, it may have arrived in the state j at time t through a sequence of s transitions, at times t_{1}, t_{2}, ..., t_{s}, taking it through the states j_{1}, j_{2}, ..., j_{s} = j. The probability for this particular history to happen is
$$u_j(t-t_s)\, \gamma_{j j_{s-1}}(t_s - t_{s-1})\, u_{j_{s-1}}(t_s - t_{s-1}) \cdots \gamma_{j_1 j_0}(t_1)\, u_{j_0}(t_1).$$
The probability P_{jj_0}(t) to be at time t in state j is obtained by summing over all histories, that is: summing over all s and all j_{1}, ..., j_{s-1} and integrating over all intermediate times t_{1}, ..., t_{s}. The result, written in Laplace transforms, is
$$\hat{P}_{jj_0}(\lambda) = \hat{u}_j(\lambda) \left[ \left( 1 - \hat{V}(\lambda) \right)^{-1} \right]_{jj_0}, \tag{15}$$
where \hat{u}_j is the transform of (14), and the matrix \hat{V} is
$$\hat{V}_{ij}(\lambda) = \int_0^{\infty} \gamma_{ij}(\tau)\, u_j(\tau)\, e^{-\lambda\tau}\, d\tau.$$
In some cases the result (15) can be evaluated explicitly.
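A simple consistency check is the case of constant jump rates, where the process must reduce to the Markov process of Section III. The sketch assumes that (15) has the renewal form: the Laplace-transformed occupation probability equals the transformed survival probability times the inverse of (1 - V), with V[i, j] the transform of the jump rate from j to i multiplied by the survival probability in j. The rate matrix and function names are invented for the illustration.

```python
import numpy as np

def occupation_transform_constant_rates(rates, lam):
    """diag(u_hat) @ inv(1 - V_hat) for age-independent rates[i, j] (jump j -> i)."""
    rates = np.array(rates, dtype=float)
    np.fill_diagonal(rates, 0.0)
    escape = rates.sum(axis=0)          # total rate out of each state j
    u_hat = 1.0 / (lam + escape)        # Laplace transform of exp(-escape_j * tau)
    V_hat = rates * u_hat               # V_hat[i, j] = rates[i, j] / (lam + escape_j)
    n = rates.shape[0]
    return np.diag(u_hat) @ np.linalg.inv(np.eye(n) - V_hat)

def markov_resolvent(rates, lam):
    """Laplace transform inv(lam - M) of exp(M t) for the ordinary master equation."""
    rates = np.array(rates, dtype=float)
    np.fill_diagonal(rates, 0.0)
    M = rates - np.diag(rates.sum(axis=0))
    return np.linalg.inv(lam * np.eye(rates.shape[0]) - M)

# Hypothetical constant rates for a three-state system.
rates_example = [[0.0, 2.0, 0.5],
                 [1.0, 0.0, 0.3],
                 [0.2, 0.7, 0.0]]
```

For constant rates the two functions agree for every positive λ, and each column of the result sums to 1/λ, which expresses conservation of probability in the Laplace domain.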
Example: transport of ions through a membrane. The following model has been suggested [19]. A membrane separates two reservoirs and ions may enter from the left or from the right reservoir. The rates at which they enter (i.e., the probabilities per unit time) are determined by the two outside liquids, with the restriction that there can be at no time more than one ion inside the membrane. Once inside, the ion may exit after a while on either side. Thus the interior of the membrane is a system having 5 states:
(0) empty;
(1) one ion that entered at left and is destined to exit at right;
(2) one ion from the left destined to exit at left;
(3, 4) two similar states with ions from the right.
The exits occur with probabilities γ_{0j} per unit time (j = 1, 2, 3, 4), which depend on the time τ elapsed since the ion entered. This is a non-Markov process of the type described by (15). The equation can be solved and makes it possible to compute such physically relevant quantities as the average of the transported current, and the spectrum of fluctuations [20].
References
[1] N.G. van Kampen, Stochastic Processes in Physics and Chemistry (North-Holland, Amsterdam 1981, 1992).
[2] C.W. Gardiner, Handbook of Stochastic Methods (Springer, Berlin 1983).
[3] F. Zernike, in Handbuch der Physik III (Geiger and Scheel eds., Springer, Berlin 1926).
[4] M. Doi and S.F. Edwards, The Theory of Polymer Dynamics (Oxford Univ. Press 1986).
[5] For a counterexample see ref. 1, ch IV, eq. (2.9).
[6] W. Feller, An Introduction to Probability Theory and its Applications I (2^{nd} ed., Wiley, New York 1966) p. 323.
[7] H.A. Kramers, Physica 7, 284 (1940).
[8] Z. Schuss and B. Matkowsky, SIAM J. Appl. Math. 35, 604 (1970).
[9] P. Hänggi, P. Talkner, M. Borkovec, Rev. Mod. Phys. 62, 251 (1990).
[10] E.g. S. Faetti and P. Grigolini, Phys. Rev. A 36, 441 (1987); M.J. Dykman, Phys. Rev. A 42, 2020 (1990).
[11] P. Hänggi and P. Riseborough, Phys. Rev. A 27, 3379 (1983); C. Van den Broeck and P. Hänggi, Phys. Rev. A 30, 2730 (1984). For an application to lasers see S. Zhu, A.W. Yu, and R. Roy, Phys. Rev. A 34, 4333 (1986).
[12] N.G. van Kampen, Physica 74, 215 and 239 (1974); Physics Reports 24, 171 (1976); J. Stat. Phys. 54, 1289 (1989); R.H. Terwiel, Physica 74, 248 (1974).
[13] R. Kubo, in Stochastic Processes in Chemical Physics (K.L. Shuler ed., Interscience, New York 1969).
[14] E.g. A.J. Bray and A.J. McKane, Phys. Rev. Letters 62, 493 (1989); D.T. Gillespie, Am. J. Phys. 64, 1246 (1996).
[15] Much literature is given in A. Fulinski, Phys. Rev. E 50, 2668 (1994).
[16] R.F. Pawula, J.M. Porrà, and J. Masoliver, Phys. Rev. E 47, 189 (1993); J.M. Porrà, J. Masoliver, and K. Lindenberg, Phys. Rev. E 48, 951 (1993).
[17] G.H. Weiss, J. Stat. Phys. 8, 221 (1973); K.J. Lindenberg and R. Cukier, J. Chem. Phys. 67, 568 (1977).
[18] N.G. van Kampen, Physica A 96, 435 (1979).
[19] E. Barkai, R.S. Eisenberg, and Z. Schuss, Phys. Rev. E 54, 1 (1996).
[20] N.G. van Kampen, Physica A 244, 414 (1997).
Publication Dates
Publication in this collection: 03 May 1999
Date of issue: June 1998
Received: 09 Mar 1998