1. Introduction

Put simply, the role of a scientist is to observe a system, explain it, and translate such explanations into mathematical form. With this mathematical model, it is possible to reproduce the observed behavior of the system, which facilitates its theoretical examination, speculation about its temporal evolution, and the prediction of new phenomena. In this work, we consider two types of mathematical models: deterministic and stochastic. The former yields a temporal evolution that repeats itself whenever we use the same set of initial conditions. The latter depends on a probability density function, so the same set of initial conditions does not always lead to the same result. From the point of view of physics teaching, we feel there is a lack of literature addressing the difference between determinism and stochasticity. For this reason, we decided to write this manuscript as a modest attempt to fill this gap. We chose to examine a system that draws attention in the real world because it moves a large amount of money, and knowing how to model it is the dream of many people: the financial market. In these notes, we scrutinize one component of the financial market, the Stock Exchange of the State of São Paulo.

Some physicists consider the financial market a complex system that presents real problems deserving detailed examination ^{[1]}–^{[4]}. Specifically, the essential works in references ^{[5]}–^{[7]} treat the index returns of stock markets as stochastic processes. This procedure generated exciting discussions ^{[8]}–^{[13]}. Recent works tried to answer the following question: can a stock market be studied from the point of view of a deterministic system? ^{[11]}–^{[13]}. These works performed time series analysis based on the theory of dynamical systems. Since irregularity is present in the time series, it is reasonable to consider the stock market as a chaotic dynamical system. The most direct link between chaos theory and the real world is the analysis of time series from real systems in terms of nonlinear dynamics. However, one might think of incorporating a stochastic component into the description as well. In this work, we assume that this stochastic component is small and does not change the nonlinear properties. We would like to answer the abovementioned question for the Brazilian stock market index returns, which were studied as a stochastic series in ^{[14]}, ^{[15]}. The authors of references ^{[16]} and ^{[17]} also studied the deterministic character of the stock market and considered the time series of such a system in the context of recurrence plots (RP). Works like these inspired us to conduct our research with the same tools. Regarding the Bovespa index, references ^{[18]}–^{[20]} present detailed studies of the time series based on the first Poincaré return. There are also detailed studies that address other aspects of this system. In reference ^{[21]}, the authors analyze the Ibovespa from a statistical point of view and find that the index is described by an exponentially truncated Lévy flight. The detection of long-range correlations in the Bovespa index is a result of reference ^{[22]}. The authors of reference ^{[23]} present an introductory course on econophysics. Finally, among other works, reference ^{[24]} presents a quantification of fluctuations in the Brazilian stock market.

Here we do not deal with stochastic processes or Poincaré returns, nor with the probability that people gain or lose money. Instead, we are interested in understanding the mechanisms that govern the stock market index returns from a deterministic point of view, using the return map of the stock returns.

We organized the article as follows. To give the reader a quick grasp of the primary theoretical tool used in this work, we introduce the concept of the recurrence plot graphically in Section 2. From then on the manuscript becomes more technical, but the question of the difference between determinism and stochasticity permeates the whole discussion. In Section 3 we briefly present the Brazilian stock market index, or Bovespa Index, which is the focus of our study. Section 4 gives an overview of phase space, chaotic attractors, and phase space reconstruction; we also reconstruct the Brazilian market attractor to estimate its fractal dimension. In Section 5 we give an overview of recurrence plots (RP) and recurrence quantification analysis (RQA). In Section 6 we use the RQA of the Bovespa Index time series to suggest that it has a deterministic component. In Section 7 we present some scaling laws obtained through the RQA. Finally, in Section 8 we present our conclusions.

2. An intuitive approach to recurrence plots

A significant part of the experimental data collected in the real world comes in the form of long time series. These series are analyzed using several techniques, but the purpose of the analysis is always the same: to understand the dynamics behind the process that generated the series, to establish a mathematical model based on suitable variables and parameters, and then to use the model to predict the dynamics at a future time. In general, a time series can be generated by a stochastic or a deterministic system. In the deterministic case, the system creates a sequence according to a rule. A notable example is the trajectory described by a particle under the action of an external field. As we learn in classical mechanics, once the forces acting on a particle are known, Newton's laws allow us to determine the future motion from a given initial condition; that is, we can determine future states (the points of the trajectory) uniquely from the past states. In the case of a stochastic system, the series arises without any previously established rule, and we say that the data are random: a typical realization of such a system is the flipping of a fair coin each day to determine the price of an asset on the next day.

Another example of a deterministic series is the one provided by the logistic map *x*_{n+1} = *a* *x*_{n}(1 − *x*_{n}), in which *a* ∈ [0,4] and *x*_{n} ∈ [0,1] ^{[26]}. In this equation the state variable at time *n* + 1, *x*_{n+1}, obtained from *x*_{n}, is reintroduced into the equation in order to generate a new state. In this way, we can generate a time series {*x*_{0}, *x*_{1}, *x*_{2}, ...} from a certain initial condition *x*_{0}. A system ruled by this equation exhibits a great diversity of dynamics, according to the value of the parameter *a*: stationary solutions (fixed points), cycles, and aperiodicity (chaos).

In Figure 1(a), we show a small fragment of a long time series generated by the logistic map with *a* = 3.83. The dynamics is that of a cycle of period 3, that is, {0.505, 0.957, 0.156, 0.505, 0.957, ...}. In this case, the system returns to the same state every three iterations. This return of the system to a certain state, or to a neighborhood of that state, is known as recurrence. A graphical way to see this recurrence is to construct the recurrence plot, see Figure 1(b): in this type of graph, we compare the value of the variable *x* at a certain instant *i* with its value at another instant *j*, and we repeat the procedure for all pairs of points in the analyzed series. Every time the values of the variable at two instants *i* and *j* are within a small neighborhood ε of each other, we draw a black dot at the coordinates (*i*,*j*) of a graph of *i* versus *j*; if the values are outside this neighborhood, we draw a white dot. We will formally introduce this type of graph in a later section, but we can already mention some of its striking features. In Figure 1(b) we notice the regular spacing between the strips inclined at 45 degrees, a striking feature of an RP for periodic dynamics.

In Figure 1(c), we show a logistic map series for *a* = 4.00. At first glance, the dynamics does not seem to differ much from stochastic dynamics (Figure 2(a)), but this is only apparent. The dynamics generated by the logistic map with *a* = 4.00 has exceptional characteristics: absence of cycles (aperiodicity), confinement to the interval [0,1], and sensitivity to the initial conditions ^{[26]}. In this case, we say that the dynamics is chaotic. It is interesting to compare the RP of the chaotic series with that of the stochastic one (Figures 1(d) and 2(b)). We note two very striking differences: the RP of the chaotic series presents greater recurrence, and it also shows a set of strips of various sizes inclined at 45 degrees, the so-called diagonal structures. Such characteristics are not present in stochastic dynamics.
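The construction described in this section can be tried out numerically. The following is a minimal sketch, not the code used to produce Figures 1 and 2: the function names, the initial condition 0.3, the number of discarded transient iterations, and the threshold ε = 0.01 are all illustrative assumptions.

```python
import numpy as np

def logistic_series(a, x0, n, discard=5000):
    """Iterate x_{k+1} = a * x_k * (1 - x_k) and keep n points after a transient."""
    x = x0
    out = np.empty(n)
    for k in range(discard + n):
        if k >= discard:
            out[k - discard] = x
        x = a * x * (1.0 - x)
    return out

def recurrence_matrix_1d(x, eps):
    """R[i, j] = 1 (black dot) when |x_i - x_j| < eps, else 0 (white dot)."""
    return (np.abs(x[:, None] - x[None, :]) < eps).astype(int)

periodic = logistic_series(a=3.83, x0=0.3, n=300)   # period-3 window
chaotic = logistic_series(a=4.00, x0=0.3, n=300)    # chaotic regime
rng = np.random.default_rng(0)
noise = rng.uniform(0.0, 1.0, size=300)             # stochastic comparison series

for name, series in [("a = 3.83", periodic), ("a = 4.00", chaotic), ("noise", noise)]:
    R = recurrence_matrix_1d(series, eps=0.01)
    print(name, "density of black dots:", R.mean())
```

Plotting each matrix `R` as an image (for instance with a plotting library of your choice) should show evenly spaced diagonal stripes for the periodic case and shorter diagonal segments for the chaotic one.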

3. The Bovespa Index

The movements of prices in a stock market reflect changes in the business environment of a market or of a sector of a market. These movements can be captured using stock indices, which express the performance of the stock prices of a collection of important companies in a market. The Bovespa index, hereafter referred to as Ibovespa, is the main indicator of the Brazilian stock market ^{[27]}. For the analysis of this index, represented in this work by *I(t)*, we selected a dataset with 846,000 successive observations sampled every 30 seconds, from January 2003 to February 2007 (50 months or 1050 trading days). Figure 3(a) shows *I(t)* as a function of time *t*. Visual examination of this figure shows that *I(t)* increases over time intervals of about 20 months. However, we can also observe decreasing behavior of the index over shorter time intervals, as illustrated in the small inset. This oscillation is one of the most studied features of stock market indices, since it is important to know whether the index will increase or decrease after a given time interval: this determines how much money people gain or lose with the change of the index over that period. A device for measuring changes in the stock market index is the return of this economic variable, defined by

where δ*t* is an arbitrary time interval multiple of 30 seconds. In this work we shall use δ*t*= 30s.

A normalized version of that quantity is the standard return index, which is defined by

$$z(t) = \frac{Z(t) - \langle Z \rangle}{\sigma}, \qquad (2)$$

where 〈*Z*〉 and σ are the mean value and the standard deviation of the distribution of *Z*, respectively. The transformation (2) provides a distribution with mean value equal to zero and standard deviation equal to 1. Therefore, all fluctuations about the mean are measured in units of σ. Figure 3(b) displays the dependence of the standard return index *z* on time *t* for the data used in Figure 3(a).
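As a hedged illustration of the standardization step, the sketch below computes returns over a lag of one sampling interval and rescales them to zero mean and unit standard deviation. The return is taken here as the logarithmic return, ln *I*(*t* + δ*t*) − ln *I*(*t*), which is an assumption made for this example; the function name and the synthetic index series are also illustrative and are not Ibovespa data.

```python
import numpy as np

def standardized_returns(index, lag=1):
    """Return z = (Z - <Z>) / sigma for returns Z over `lag` sampling steps.

    Assumption: Z is the log-return ln I(t + dt) - ln I(t).
    """
    index = np.asarray(index, dtype=float)
    Z = np.log(index[lag:]) - np.log(index[:-lag])
    return (Z - Z.mean()) / Z.std()

# Synthetic stand-in for I(t): a geometric random walk, NOT real market data.
rng = np.random.default_rng(1)
I = 10000.0 * np.exp(np.cumsum(rng.normal(0.0, 1e-3, size=5000)))

z = standardized_returns(I, lag=1)
print("mean:", z.mean(), "std:", z.std())
```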

4. Phase Space and Attractor Reconstruction

The phase space of a dynamical system is a vector space with *f* orthogonal axes representing the *f* independent variables, or degrees of freedom, needed to specify the instantaneous state of the system. As time evolves, this collection of variables, merged into a state vector, traces an orbit in the *f*-dimensional space. In dissipative systems, the orbits are attracted asymptotically from a basin of initial conditions to a limiting set of states called an attractor. The volume of an attractor is always zero and its dimension *d* is typically smaller than the dimension *f* of the phase space. Despite these simple characteristics, some attractors can have very intricate structures ^{[28]}. A chaotic attractor, for instance, is a set of states on which the orbit wanders forever, visiting regions of the phase space in a non-periodic and highly disordered way. Although the phase space of a dissipative system may be high dimensional, the dynamics on the attractor is nevertheless low dimensional; in other words, the effective number of degrees of freedom needed to characterize the long-term dynamics is relatively small. This low dimensionality is an advantage, since we are interested in studying the evolution on the attractor of a dynamical system and not the evolution in the full, high dimensional phase space. Following this idea, we suppose in this work that the time series emerges from a chaotic attractor of a possibly high dimensional system, the stock market itself.

In computer experiments we generally know the state vector. With experimental data, however, usually only a scalar time series is available, and the attractor has to be reconstructed from it. The reconstruction method was proposed ^{[29]} from a purely empirical point of view, but it also has a rigorous mathematical basis in the works of Whitney ^{[30]}, Takens ^{[31]} and Mañé ^{[32]}. We shall describe it below empirically by considering the time series of the Ibovespa standard return *z(t)* as the single-variable time series obtained from the Brazilian market attractor.

Suppose the single-variable experimental time series {*z*_{n}, *n* = 1,…,*N*, *z*_{n} ∈ ℝ} consists of *N* successive points on a *d*-dimensional attractor (*d* < *f*). In order to reconstruct the dynamics on the attractor we must transform the scalar measurements *z*_{n} into state vectors **x**_{n} in an *m*-dimensional phase space. Each vector **x**_{n} is an *m*-tuplet of consecutive values (delayed coordinates) of the time series,

$$\mathbf{x}_n = \left(z_n, z_{n+\tau}, \ldots, z_{n+(m-1)\tau}\right), \qquad (3)$$

where the delay time τ is some multiple of the spacing between successive measurements.

When this set of vectors is graphed (embedded) in the new *m*-dimensional space, it can be analyzed as if it were an orbit of a dynamical system. Though the reconstructed attractor usually has a different shape from the original one, it preserves the original dynamical properties provided we determine an adequate value for *m*.
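The delay-coordinate construction can be sketched in a few lines. This is an illustrative implementation under the assumption that delays run forward in time, which is consistent with the planes (*z*_{n}, *z*_{n+1}) and (*z*_{n}, *z*_{n+3}) used later for *m* = 4 and τ = 1; the function name `delay_embed` is hypothetical.

```python
import numpy as np

def delay_embed(z, m, tau=1):
    """Build delay vectors; row n is (z_n, z_{n+tau}, ..., z_{n+(m-1)tau})."""
    z = np.asarray(z, dtype=float)
    n_vectors = len(z) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this choice of m and tau")
    # Column k holds the series shifted forward by k * tau samples.
    return np.column_stack([z[k * tau : k * tau + n_vectors] for k in range(m)])

z = np.arange(10.0)             # toy series 0, 1, ..., 9
X = delay_embed(z, m=4, tau=1)
print(X.shape)                  # number of vectors, embedding dimension
print(X[0])                     # first reconstructed state
```

Note that a series of *N* points yields only *N* − (*m* − 1)τ vectors, because the last few points do not have enough successors to fill all *m* coordinates.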

If we choose *m* too small, the reconstructed attractor may be too folded, meaning that two distant points on the attractor in the original phase space overlap each other in the reconstructed space. These overlapping points, known as false neighbors, produce errors in the calculation of many dynamical properties, since the calculations are based on the trajectories in the phase space. The overlap can also be seen as a self-intersection of the reconstructed orbit at a given instant of time, which clearly violates the deterministic nature of a trajectory: it is not possible to have two distinct future states from just one prior state (or initial condition). To separate these false neighbors, the attractor must be represented in spaces of higher dimension. According to Takens' theorem, *m* = 2*d*+1 is a sufficient condition for the attractor to be completely unfolded ^{[33]}.

| *m* | Estimated slope *d*_{c} | Standard deviation | Correlation coefficient |
|---|---|---|---|
| 2 | 1.821353 | 0.005061482 | 0.9998186 |
| 4 | 3.275242 | 0.007255244 | 0.9998524 |
| 6 | 4.465634 | 0.01414984 | 0.999734 |
| 8 | 6.285727 | 0.005242163 | 0.9988355 |
| 10 | 6.003654 | 0.004297553 | 0.9995703 |
| 12 | 6.002467 | 0.005634016 | 0.9999990 |

A pictorial example of attractor reconstruction (*m* = 4, τ = 1) for the data series of the Ibovespa standard return *z(t)* is presented in Figure 4 by means of the planes (*z*_{n}, *z*_{n+1}) and (*z*_{n}, *z*_{n+3}) (Figures 4(a) and 4(b), respectively). The set of points does not resemble a cloud (or ball) of uniformly scattered points, which is typical of data emerging from a random source; instead, it seems to have a strong spatial correlation. The straight horizontal and vertical lines appearing in Figure 4 result from a near-zero return followed by a sharp increase or decrease in the return (or vice versa). These lines appear for the data collected at the opening and closing of the stock exchange. At these two times, the volume of transactions decreases and the Ibovespa variation is practically non-existent.

According to Takens' theorem, the required value of the embedding dimension *m* is known only once we determine the attractor dimension *d*. A good estimate of *d* is the *correlation dimension* *d*_{c} ^{[34]}. To calculate *d*_{c} we first need to evaluate the *correlation integral*,

$$C(\varepsilon) = \frac{1}{N^{2}} \sum_{\substack{i,j=1 \\ i \neq j}}^{N} \Theta\left(\varepsilon - \lVert \mathbf{x}_i - \mathbf{x}_j \rVert\right), \qquad (4)$$

for the reconstructed orbit, where *N* is the number of points on the attractor, ε is a threshold distance, ‖…‖ is the Euclidean norm and Θ(·) is the Heaviside step function. Grassberger and Procaccia established that for small ε the correlation integral *C*(ε) grows like a power law, *C*(ε) ≈ ε^{*d*_{c}}, and the exponent *d*_{c} can be taken as a measure of the dimension of the chaotic attractor ^{[35]}. It is clear that the value obtained for this exponent is affected by the chosen embedding dimension *m*, since *C*(ε) itself depends on the distances between points in the *m*-dimensional space. We conclude that if we calculate *d*_{c} while increasing *m* successively (*m* = 1, 2, 3, ...), then once the attractor is fully unfolded *d*_{c} must cease to change as *m* changes; in other words, *d*_{c} saturates. At that point we have simultaneously obtained estimates of both *d*_{c} and *m*. On the basis of these assumptions we have estimated the correlation dimension *d*_{c} of the Brazilian market attractor as well as a satisfactory value for *m*. Figure 4(c) presents the result. In this figure we can observe a range of values of ε over which the slope *d*_{c} (the correlation exponent) is constant; we used this range to fit the power law scaling region. We observe a reasonable saturation beyond *m* = 10 (see table), and we shall use the value *m* = 10 throughout this work. The corresponding fractal dimension is *d*_{c} ≈ 6.004.
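To make the saturation procedure concrete, the sketch below estimates the correlation exponent for increasing *m* on a short logistic-map series. It is a brute-force illustration, assuming a normalization by *N*² over ordered pairs with *i* ≠ *j* and a fixed, hand-picked range of ε; the helper names are hypothetical, the scaling region would normally be chosen by inspecting the log-log plot, and the O(*N*²) memory use makes it suitable only for short series.

```python
import numpy as np

def delay_embed(z, m, tau=1):
    """Row n is (z_n, z_{n+tau}, ..., z_{n+(m-1)tau})."""
    z = np.asarray(z, dtype=float)
    n_vectors = len(z) - (m - 1) * tau
    return np.column_stack([z[k * tau : k * tau + n_vectors] for k in range(m)])

def correlation_integral(X, eps_values):
    """C(eps): number of ordered pairs i != j closer than eps, divided by N^2."""
    N = len(X)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pair_dist = dist[np.triu_indices(N, k=1)]      # each pair i < j once
    # The factor 2 restores the count over ordered pairs (i, j) and (j, i).
    return np.array([2.0 * np.count_nonzero(pair_dist < e) / N**2
                     for e in eps_values])

def correlation_exponent(X, eps_values):
    """Slope of ln C(eps) versus ln eps over the supplied eps range."""
    C = correlation_integral(X, eps_values)
    slope, _ = np.polyfit(np.log(eps_values), np.log(C), 1)
    return slope

# Illustration on a chaotic logistic-map series (a = 4), not on market data.
x = np.empty(1200)
x[0] = 0.3
for n in range(len(x) - 1):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])

eps_values = np.logspace(-2.0, -1.0, 8)
for m in (1, 2, 3):
    print("m =", m, "slope =", correlation_exponent(delay_embed(x, m), eps_values))
```

For a series whose slope keeps growing with *m*, as expected for uncorrelated noise, no saturation appears; a slope that levels off suggests a low-dimensional attractor.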

5. Recurrence Plots and Recurrence Quantification Analysis

We saw that in a chaotic dynamical system the orbit exhibits bounded, aperiodic and highly erratic behavior. Though the state variables never return exactly to their previous values, they may return very close to them. Recurrence plots (RPs) ^{[36]}, ^{[37]} provide a graphical representation of how closely an orbit, or a reconstructed orbit, approaches its own past states; in other words, RPs exhibit the recurring patterns of a system. Such a graphical representation can be introduced by the *N* × *N* matrix

$$R_{i,j} = \Theta\left(\varepsilon - \lVert \mathbf{x}_i - \mathbf{x}_j \rVert\right), \qquad i,j = 1,\ldots,N, \qquad (5)$$

where **x**_{i} is the state of the system at instant *i* in the *m*-dimensional phase space. The RP is thus obtained by assigning a black dot, called a recurrence point, to the position with coordinates (*i*,*j*) provided that the distance between the system states at instants *n* = *i* and *n* = *j* is smaller than ε. The construction of an RP requires the specification of the time delay τ, but for discrete time series such as financial data, τ = 1 is usually appropriate ^{[17]}, ^{[42]}.
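For state vectors stored as the rows of an array, the recurrence matrix can be sketched as follows. The function name is an assumption, the strict inequality follows the wording "smaller than ε", and the all-pairs distance array makes this practical only for a few thousand states.

```python
import numpy as np

def recurrence_matrix(X, eps):
    """R[i, j] = 1 when the Euclidean distance between rows i and j of X is below eps."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)     # N x N matrix of pairwise distances
    return (dist < eps).astype(np.uint8)

# Three toy states in a 2-dimensional phase space.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [0.0, 0.5]])
R = recurrence_matrix(X, eps=1.0)
print(R)
```

By construction the matrix is symmetric and its main diagonal is entirely black, since every state is at zero distance from itself.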

The recurrence points (black dots) may form two small-scale structures which contribute to the overall pattern of the RP: vertical and diagonal lines. A vertical line identifies a state which does not change considerably during an interval of time ^{[38]}; the length of this interval is precisely the length of the vertical line. A diagonal line, that is, any line parallel to the main diagonal, identifies two similar segments of the trajectory beginning at different instants of time; the length of a diagonal line is the interval of time during which those distant segments remain similar. A priori, diagonal and vertical lines should not occur in an RP arising from a random time series ^{[41]}.

Different time series exhibit different RPs; we thus have a huge number of visual patterns, which depend on particular details of the dynamics ^{[39]}–^{[41]}, ^{[43]}. Introduced by Zbilut and Webber ^{[45]}, the *recurrence quantification analysis* (RQA) assigns a set of real numbers to each RP regardless of its visual appearance. These numbers, or measures, were initially based mostly on the statistical analysis of diagonal structures, but a few years later further measures based on vertical structures were integrated into the analysis ^{[39]}. Today the set of all measures provides a clear and sound criterion for comparing RPs from different types of dynamics. Since RPs and RQA were introduced, they have been used in a wide variety of applications: quantification of complex behavior in heart-rate variability ^{[38]} and electroencephalographic data ^{[47]}, and quantification of correlations between data from different climatological phenomena ^{[46]}. More recently they were also extended beyond the domain of time series analysis, to the quantitative analysis of spatial disorder or correlation in complex spatial patterns at a fixed time ^{[48]}, ^{[49]}. In this work we deal with four measures, namely the *recurrence rate* (REC), the *determinism* (DET), the *entropy* (ENT), and the *laminarity* (LAM) ^{[38]}, ^{[43]}, ^{[44]}.

The simplest measure of the RQA is the recurrence rate

$$REC = \frac{1}{N^{2}} \sum_{i,j=1}^{N} R_{i,j}, \qquad (6)$$

which is a measure of the density of recurrence points in the RP.

The laminarity is defined as the ratio of recurrence points forming vertical structures of length *v* ≥ *v*_{min} to all recurrence points,

$$LAM = \frac{\sum_{v=v_{min}}^{N} v\,P(v)}{\sum_{v=1}^{N} v\,P(v)}, \qquad (7)$$

where *P(v)* denotes the histogram of the vertical lines. The determinism *DET* is defined as the ratio of recurrence points that form diagonal structures of length *l* ≥ *l*_{min} to all recurrence points (points belonging to the main diagonal are always excluded), and reads

$$DET = \frac{\sum_{l=l_{min}}^{N} l\,P(l)}{\sum_{l=1}^{N} l\,P(l)}, \qquad (8)$$

in which *P(l)* is the histogram of diagonal lines of length *l*. *Stochastic processes display an absence of short diagonals even for large values of ε, whereas deterministic processes cause longer diagonals and fewer single, isolated recurrence points*. Another measure based on the length of diagonal lines is the Shannon entropy of the probability *p(l)* = *P(l)*/*N*_{l} of finding a diagonal line of length *l* in the RP,

$$ENT = -\sum_{l=l_{min}}^{N} p(l)\,\ln p(l), \qquad (9)$$

where *N*_{l} is the total number of diagonal lines. *An RP with a large (small) diversity of diagonal lines renders a high (small) value for ENT; e.g., for uncorrelated noise the value of ENT is quite small, indicating low diversity of diagonal lines*. The measure *ENT* is thus inversely related to the amount of disorder (dynamical complexity) in the time series under analysis. In the present analysis we have used *l*_{min} = 3 and *v*_{min} = 2.
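The four measures can be computed directly from a binary recurrence matrix by collecting the lengths of diagonal and vertical runs of black dots. The sketch below follows the verbal definitions given above and is not the software used for the results in this article; in particular, treating lines of length exactly *l*_{min} or *v*_{min} as included is an assumption, and the function names are illustrative.

```python
import numpy as np

def run_lengths(bits):
    """Lengths of the consecutive runs of 1s in a one-dimensional 0/1 array."""
    padded = np.concatenate(([0], np.asarray(bits, dtype=int), [0]))
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)
    ends = np.flatnonzero(change == -1)
    return ends - starts

def rqa_measures(R, l_min=3, v_min=2):
    """REC, DET, LAM and ENT from a square binary recurrence matrix R."""
    N = R.shape[0]
    rec = R.sum() / N**2
    # Diagonal lines on every diagonal parallel to the main one (main one excluded).
    diag = np.concatenate([run_lengths(np.diagonal(R, k))
                           for k in range(-(N - 1), N) if k != 0])
    # Vertical lines, column by column.
    vert = np.concatenate([run_lengths(R[:, j]) for j in range(N)])
    long_diag = diag[diag >= l_min]
    det = long_diag.sum() / diag.sum() if diag.sum() else 0.0
    lam = vert[vert >= v_min].sum() / vert.sum() if vert.sum() else 0.0
    if long_diag.size:
        _, counts = np.unique(long_diag, return_counts=True)
        p = counts / counts.sum()            # p(l) over lines with l >= l_min
        ent = float(-(p * np.log(p)).sum())
    else:
        ent = 0.0
    return {"REC": rec, "DET": det, "LAM": lam, "ENT": ent}

# Toy RP of a period-3 signal with 9 states: R[i, j] = 1 when i - j is a multiple of 3.
idx = np.arange(9)
R = ((idx[:, None] - idx[None, :]) % 3 == 0).astype(int)
print(rqa_measures(R, l_min=3, v_min=2))
```

In this toy example every off-diagonal recurrence point lies on a long diagonal line and no vertical line is longer than one point, which is the signature of purely periodic dynamics.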

6. RQA for original and shuffled financial data

Shuffling a data series does not affect its statistical distribution: measures like the mean, variance and standard deviation are preserved. On the other hand, shuffling destroys the ordering of the data, so measures like correlation and recurrence are strongly affected: the more the set is shuffled, the more uncorrelated it becomes. Since the RP and the RQA measures are based on recurrence, it is in principle possible to find out whether a time series has a deterministic or a random nature by shuffling the data and then comparing the RPs and the RQA results for the original and shuffled series: if they differ, we may suppose that there is a deterministic component in the original series.
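The logic of the shuffling test can be illustrated on a series whose deterministic origin is known. In the sketch below the statistic `diagonal_continuation` is a crude stand-in for DET introduced only for this example: it measures how often a recurrence point (*i*, *j*) is followed along the diagonal by another recurrence point at (*i*+1, *j*+1). The logistic-map series, the threshold, and all names are assumptions.

```python
import numpy as np

def diagonal_continuation(x, eps):
    """Fraction of off-diagonal recurrence points (i, j) with (i+1, j+1) also recurrent."""
    R = np.abs(x[:, None] - x[None, :]) < eps
    np.fill_diagonal(R, False)               # ignore the trivial main diagonal
    base = R[:-1, :-1]
    follow = R[1:, 1:]
    return (base & follow).sum() / base.sum()

# Deterministic test series: chaotic logistic map (a = 4).
x = np.empty(800)
x[0] = 0.3
for n in range(len(x) - 1):
    x[n + 1] = 4.0 * x[n] * (1.0 - x[n])

rng = np.random.default_rng(2)
x_shuffled = rng.permutation(x)              # same values, random order

print("mean:", x.mean(), x_shuffled.mean())  # distribution is preserved
print("std: ", x.std(), x_shuffled.std())
print("original:", diagonal_continuation(x, 0.01))
print("shuffled:", diagonal_continuation(x_shuffled, 0.01))
```

A clear drop of the statistic after shuffling indicates that the diagonal structures come from the ordering of the data rather than from its distribution.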

For the RQA, we selected two intervals of the time series of the standard return index. The intervals are distant in time and have different degrees of disorder, which can be estimated through the average of the absolute value of the returns, 〈|*z*|〉. In Figs. 5(a) and 5(b) we present the time series and the corresponding RPs for the intervals *I*_{1} = [1; 4800] and *I*_{2} = [150001; 154800], respectively, whereas Figs. 5(c) and 5(d) are based on the shuffled data series of intervals *I*_{1} and *I*_{2}, respectively. For the construction of the RPs, we chose ε = 1.08 for Figs. 5(a) and 5(c) and ε = 1.18 for Figs. 5(b) and 5(d). We selected the values of the cutoff distance ε so that *REC* stays close to 1%. Although the time series of the standard return index display highly disordered data, the RPs display some interesting large-scale structures or “clusters” of recurrence points. Such clusters disappear in the RPs for the shuffled data. The RQA reveals a large decrease in all four measures when we compare the original and shuffled data (Table 2). The qualitative and quantitative analyses show that consecutive returns from the stock market should not be considered a set of disconnected and random data.
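One simple way to pick a cutoff that gives a prescribed recurrence density is to take the corresponding quantile of all pairwise distances. This is only a sketch of one possible procedure, not necessarily how the values ε = 1.08 and ε = 1.18 were obtained; the function name is hypothetical, and the density is measured here over off-diagonal pairs, which differs slightly from a rate that also counts the main diagonal.

```python
import numpy as np

def eps_for_target_rec(X, target=0.01):
    """Distance threshold below which a fraction `target` of off-diagonal pairs fall."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    off_diag = dist[~np.eye(len(X), dtype=bool)]   # drop the zero self-distances
    return float(np.quantile(off_diag, target))

# Toy state vectors standing in for an embedded series (not market data).
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 3))

eps = eps_for_target_rec(X, target=0.01)
dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
off_diag = dist[~np.eye(len(X), dtype=bool)]
print("eps:", eps, "achieved density:", (off_diag < eps).mean())
```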

7. Scaling laws arising from diagonal and vertical lines

The RP of a periodic time series consists of a series of parallel stripes at 45 degrees, whose lengths decrease from that of the main diagonal to that of the shortest diagonal line at the lower-right corner of the RP. The vertical distance between two successive stripes is exactly the period. Following this idea, it is possible to obtain information about the periodicity of a reconstructed orbit from an arbitrary dynamical system by analyzing the set of vertical blank gaps, also called recurrence times, in the corresponding RP. The set of recurrence times can be obtained as follows: for each state *i* on the horizontal axis, we compute the number and the lengths of the vertical blank gaps by varying *j* from 1 to *N*, the number of states of the reconstructed series; we then repeat this for every *i* from 1 to *N*.
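The counting procedure can be sketched as follows. Here a gap length is taken as the number of white cells between two consecutive black dots in a column, so that the distance between the dots is the gap length plus one; this convention and the function names are assumptions made for the example.

```python
import numpy as np

def recurrence_times(R):
    """Lengths of the vertical white gaps between recurrence points, over all columns."""
    gaps = []
    for i in range(R.shape[1]):
        rows = np.flatnonzero(R[:, i])       # positions j of the black dots in column i
        spacing = np.diff(rows) - 1          # white cells between consecutive black dots
        gaps.append(spacing[spacing > 0])    # adjacent dots leave no gap
    return np.concatenate(gaps)

def recurrence_time_distribution(gaps):
    """Empirical probability p(q) of each observed gap length q."""
    counts = np.bincount(gaps)
    q = np.flatnonzero(counts)
    return q, counts[q] / counts.sum()

# Toy RP of a period-3 signal with 9 states.
idx = np.arange(9)
R = ((idx[:, None] - idx[None, :]) % 3 == 0).astype(int)

gaps = recurrence_times(R)
q, p = recurrence_time_distribution(gaps)
print(q, p)
```

For this periodic toy matrix all gaps have the same length, one less than the period, so the distribution collapses onto a single value.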

Once the set of recurrence times is obtained, we can compute the probability *p(q)* of finding a recurrence time of length *q*. We computed this distribution for the RPs based on the intervals *I*_{1} and *I*_{2} studied in the last section. The empirical distribution, presented in Figure 6 for both sets of recurrence times, can be exceptionally well fitted (see straight line) over the full range of *q* by the mixed function

$$p(q) = \alpha\, q^{-\beta}\, e^{-q/\gamma}. \qquad (10)$$

In an earlier work ^{[25]}, the authors proposed a stretched exponential scaling law to describe recurrence events in the case of long-term correlations. This proposition is in agreement with our result, since the index time series exhibits long-range correlations. For interval *I*_{1} we obtained (α; β; γ) = (0.187; 0.987; 478.77), whereas for interval *I*_{2} we obtained (α; β; γ) = (0.246; 1.093; 549.12). The power law behavior is noticeable for low values of *q*. Since *q* is a time interval, we may conclude that there is scale invariance in the set of recurrence times, at least for *q* < 100, or equivalently 3000 seconds. The mixed function *p(q)* proved to be persistent throughout the analysis of other intervals of the time series of Ibovespa returns and also of other time series with different values of δ*t*. Concerning the question of the deterministic nature of the time series under analysis, we observed that the RPs for the corresponding shuffled data did not exhibit any recurrence pattern. As a result, the corresponding distribution of recurrence times should approximately follow a uniform distribution, and scaling laws will not arise.
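A product of a power law and an exponential becomes linear in its parameters after taking logarithms, which allows a simple least-squares fit. The sketch below assumes the form *p(q)* = α *q*^{−β} exp(−*q*/γ) and recovers the parameters from synthetic, noise-free data generated with the values quoted for *I*_{1}; it illustrates the idea only and is not the fitting procedure used for Figure 6. It also requires all *p* values to be strictly positive.

```python
import numpy as np

def fit_power_exp(q, p):
    """Least-squares fit of ln p = ln(alpha) - beta * ln(q) - q / gamma."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    # Design matrix for the unknowns (ln alpha, beta, 1 / gamma).
    A = np.column_stack([np.ones_like(q), -np.log(q), -q])
    coef, *_ = np.linalg.lstsq(A, np.log(p), rcond=None)
    ln_alpha, beta, inv_gamma = coef
    return np.exp(ln_alpha), beta, 1.0 / inv_gamma

# Synthetic check: generate data from known parameters and recover them.
q = np.arange(1, 2001)
p = 0.187 * q ** (-0.987) * np.exp(-q / 478.77)
alpha, beta, gamma = fit_power_exp(q, p)
print(alpha, beta, gamma)
```

With real histograms, fitting in logarithmic space gives more weight to the sparsely populated tail, so a weighted or nonlinear fit may be preferable.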

Motivated by having obtained a simple function describing the distribution of recurrence times, we also computed the probability *p(l)* of finding a diagonal line of length *l* from the RPs based on the intervals *I*_{1} and *I*_{2}. These distributions, for lengths *l* larger than 3, are displayed in Figure 7. Again, the fit with a mixed function,

$$p(l) = \alpha\, l^{-\beta}\, e^{-l/\gamma}, \qquad (11)$$

with parameters (α; β; γ) = (22767.20; 0.419; 3.272) for *I*_{1} and (α; β; γ) = (26048.4; 0.902; 4.949) for *I*_{2}, proved to be remarkably good (see the straight and dashed lines). However, unlike the distribution of recurrence times, the power law behavior extends only over a very small range. Moreover, we observe a quite small diversity of diagonal lines as well as the absence of long ones: the maximum value is *l* ≈ 40, which represents a time interval of only about 1200 seconds (40 × 30 s); in other words, the longest similar segments of the reconstructed series last 1200 seconds. As for the corresponding set of vertical lines, we did not obtain a well-behaved distribution, and therefore a fit with a simple function was not possible.

The distribution of diagonal lines can be useful when we want a fast computation of the measures *DET* and *ENT*, equations 8 and 9, respectively. A comparison between the two ways of computing these measures is summarized in Table 3. We see that the values of *DET* and *ENT* obtained by using equation 11 are in good agreement with the direct counting based on the RP; see the deviation presented in the last column.
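As a rough illustration of the shortcut, the entropy can be evaluated from a fitted line-length distribution instead of from the RP itself. The sketch below assumes the fitted form α *l*^{−β} exp(−*l*/γ) and a finite range of lengths from *l*_{min} to a cutoff `l_max`; both the cutoff parameter and the function name are hypothetical, and the printed values are not claimed to reproduce Table 3. The prefactor α drops out when the distribution is normalized, so it does not appear in the function. A similar shortcut for *DET* would additionally need the total number of recurrence points, which the fitted curve alone does not provide.

```python
import numpy as np

def ent_from_fit(beta, gamma, l_min=3, l_max=40):
    """Shannon entropy of a normalized l^(-beta) * exp(-l / gamma) distribution."""
    l = np.arange(l_min, l_max + 1, dtype=float)
    P = l ** (-beta) * np.exp(-l / gamma)    # alpha omitted: it cancels on normalization
    p = P / P.sum()
    return float(-(p * np.log(p)).sum())

# Parameter pairs (beta, gamma) quoted in the text for the two intervals.
print("I1:", ent_from_fit(0.419, 3.272))
print("I2:", ent_from_fit(0.902, 4.949))
```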

8. Conclusions

This paper aimed to investigate whether there is some deterministic behavior ruling the stock market. To this end, we hypothesized the existence of a “hidden” dynamical system controlling a specific stock market, the Brazilian stock market. We then applied the Grassberger-Procaccia method to estimate the dimension of the attractor of the dynamical system. Unlike data arising from a random source, the correlation dimension computed for the Ibovespa dataset tends to saturate around an embedding dimension of 10, which leads to a fractal dimension of about 6.004. We laid particular emphasis on the method of Recurrence Plots and Recurrence Quantification Analysis; the breaking of patterns in the recurrence plots and the decrease of the RQA measures after shuffling the Ibovespa dataset support our hypothesis about the existence of a dynamical system.

We have analyzed the distributions of recurrence times and diagonal lines from the obtained RPs; both distributions proved to be extremely well fitted by a product of two elementary functions: a power law and an exponential function. This scaling does not seem to depend on any particular segment of the time series.

Acknowledgments

We would like to thank BOVESPA for providing us with the data, without which it would not have been possible to carry out this work. Some numerical results displayed in this article were obtained with computer programs developed by Profs. Charles L. Webber Jr. and Norbert Marwan. These programs can be downloaded at http://homepages.luc.edu/~cwebber and http://www.agnld.uni-potsdam.de/~marwan. We would also like to thank CAPES and CNPq for their financial support.