HIGH-ORDER MULTIVARIATE MARKOV CHAIN APPLIED IN DOW JONES AND IBOVESPA INDEXES

In this paper we analyze the state transition probabilities between the Ibovespa and Dow Jones indexes using a High-order Multivariate Markov Chain. While the stock market may be profitable, its risks can lead to large losses. A mathematical model capable of considering different data sources can aid decision making. This model works with stochastic data, transforming different databases into matrices of transitions between states. For this purpose, daily variation data from January 2008 to March 2018 were used. The application showed an interaction between the indexes: the most frequent event was a variation of -0.49 to 0.5% in Dow Jones followed by a variation of -0.49 to 0.5% in Ibovespa, with 428 cases, and the probability of this situation occurring again, from Dow Jones at time t to Ibovespa at time t + 2, is 27.21%. Empirical results suggest that this application can help investors make decisions based on transition probabilities.


INTRODUCTION
To invest in the stock market it is important to calculate the risks. Related to this idea, it is important to understand that smaller markets are affected by larger markets (Achcar et al. (2012)). The authors comment that the Brazilian stock exchange is affected by the American stock exchange, since the values of transitions of the latter are considerably higher. Shephard (2008) presents a collection of relevant models and studies. This collection presents the ARCH (autoregressive conditional heteroscedasticity) models applied to univariate stochastic volatility (SV), as well as multivariate ARCH (MARCH) models. This type of problem is often studied using Econometric Theory (Yu & Meier (2006)). Although these models exist, the relationship between the two stock exchanges has not been analyzed simultaneously. In order to analyze this relationship, we propose an application of the High-order Multivariate Markov Chain. Doubleday & Esunge (2011) used a Markov Chain to determine the relationship between a diverse portfolio of stocks and the market as a whole, and comment on the feasibility of applying the tool to support decision making. Along the same line, Fitriyanto & Lestari (2018) applied a Markov Chain to the PT HM Sampoerna stock price. These articles used 1st-order Markov Chains.
Although applications of 1st-order chains are more widespread in the literature, higher-order models have begun to gain ground in different areas, for example in the articles of Ching et al. (2004), Ching et al. (2008) and Yang et al. (2011). The investment area is also attractive for these models, as shown by the publication of Ky & Tuyen (2018). The authors emphasize the importance of this technique for forecasting time series. They also comment that, while ARIMA methods, exponential smoothing and even artificial intelligence exist, these methods hardly fit nonlinear data. The authors present a novel High-order Markov model for time series forecasting in which the state space of the Markov chain is constructed from different levels of changes of the time series.
The objective of this paper is to calculate the probability of variation of the main indexes of the Brazilian and American stock exchanges, respectively Ibovespa and Dow Jones, using the High-order Multivariate Markov Chain. The behavior of these two indexes was also studied separately to obtain the expected recurrence time, which can be useful information for investors.

Financial market volatility
Interest in forecasting stock market volatility has been around for some time (Bollerslev et al. (1992)). According to Mello (2009), volatility, in the financial area, shows the intensity and the frequency of fluctuations in the prices of a financial asset, which may be stock, bond, investment fund or stock market indices, in a given period of time. The more the price of a stock fluctuates over a short period of time, the greater the risk of gaining or losing money by trading this stock, and therefore volatility is one of the parameters most frequently used as a way of measuring the risk of an asset, according to Ishizawa (2008).
The more volatile an asset is, the more significant its variation in relation to market fluctuations. In other words, it is a riskier investment. Savings are at the extreme of low volatility and profitability, while derivatives markets such as options and futures are at the other extreme, with high volatility and possibility of return. Stochastic data analysis can help reduce investment risk. Ribeiro (2009) comments that the measurement of a risk is based on the history of the asset. According to Mello (2009) and Almeida (2013), one of the ways to analyze the volatility of an asset is by measuring its oscillation or the standard deviation of its value or profitability. This measure can be used in many ways; in the stock market, for example, it is possible to analyze overbought/oversold levels and to define targets for gains and losses.
As the financial world is influenced by several external factors, such as politics itself, it is also subject to lesser or greater volatility. Chiu et al. (2018) comment that long-term volatility is related to macroeconomic fundamentals associated with future cash flows and rates, while short-term volatility is related to transitory determinants, such as investor sentiment. This is related to the hypothesis that volatility reflects both market expectations of future cash flows and rates, and the short-term behavioral effects that are not directly linked to economic activity.
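As a small illustration of the standard-deviation measure mentioned above, the following Python sketch computes the sample standard deviation of a series of daily percentage changes. The values in `daily_changes` are hypothetical and were chosen only for illustration; they are not taken from the Ibovespa or Dow Jones data.

```python
import statistics

# Hypothetical daily percentage changes of an index (illustrative values only).
daily_changes = [0.4, -1.2, 0.8, -0.3, 1.5, -0.9, 0.1]

# Sample standard deviation of the daily changes, used here as a simple
# volatility measure: the larger the value, the stronger the oscillation.
volatility = statistics.stdev(daily_changes)

print(f"Daily volatility: {volatility:.2f} percentage points")
```

Other volatility measures exist; the standard deviation is used here only because it is the one named in the text.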
For Fornari & Mele (2013), one characteristic of the capital markets is the countercyclical behavior of asset price volatilities. For example, in the last 50 years, the annualized return volatility of the S&P 500 was 14.18% on average. However, during recessions, this figure increased to 17.39%, 23% more than the overall average. During the expansions, however, this same volatility reached an average of 13.5%, being 4% below the general average.
The work of Schwert (1989) studies the changes in stock market volatility over time. In particular, it relates stock market volatility to time-varying volatility of various economic variables. For the period 1857-1987, volatility was remarkably high from 1929 to 1939 for many economic series, including inflation, monetary growth, industrial production, and other measures of economic activity. This is because stock market volatility increases during recessions.
Schwert (1989) also comments that the value of corporate equity depends on the health of the economy. If discount rates are constant over time, the price movement of securities is proportional to the variation of expected future cash flows. It is possible that a change in the level of uncertainty of future macroeconomic conditions would cause a proportional change in the volatility of stock returns. A good understanding of how the asset, its profitability and its volatility interact with the economy will enable policymakers and finance and investment professionals to obtain more accurate forecasts, influenced by macroeconomic factors.

Markov Chain
In a Markov Chain, $X_t$ represents a variable at time t, and $p_{ij}^{(n)}$ is the probability that a process passes from state i to state j in n steps, starting at time t. As these are conditional probabilities, Hillier & Lieberman (2005) state that their values cannot be negative.
The authors demonstrate that the transition of states occurs from the row index to the column index, that is, the probability $p_{ij}$ corresponds to the transition from state i to state j. Taha (2007) comments that the sum of each row of the matrix must be equal to 1, and that a matrix is classified as ergodic when it is possible to go from any state to any other in n time steps.
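The properties above can be checked numerically. The sketch below, in plain Python, verifies that each row of a transition matrix is non-negative and sums to 1, and computes the n-step matrix as a matrix power. The 3-state matrix `P` is hypothetical, and testing whether every entry of the 2-step matrix is positive is only one simple way to confirm that every state can reach every other state in this example.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_power(p, n):
    """Return the n-step transition matrix, i.e. p multiplied by itself n times."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result

def is_row_stochastic(p, tol=1e-9):
    """Check that all entries are non-negative and every row sums to 1."""
    return all(min(row) >= 0 and abs(sum(row) - 1.0) <= tol for row in p)

# Hypothetical 3-state transition matrix (row i = current state, column j = next state).
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.4, 0.5]]

P2 = mat_power(P, 2)  # probabilities of going from i to j in two steps
all_reachable = all(value > 0 for row in P2 for value in row)

print(is_row_stochastic(P), is_row_stochastic(P2), all_reachable)
```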
Taha (2007) explains that, after a large number of transitions, the probability of finding the process in a given state, for example j, tends to the value $\pi_j$, independent of the probability distribution of the initial state. Hillier & Lieberman (2005) comment that, because there are M + 2 equations, M + 1 unknowns and a single solution, it is necessary to exclude one of the equations, but it cannot be Equation 5, because $\pi_j = 0$ for all j would satisfy the other M + 1 equations.
Furthermore, it is possible to analyze the expected recurrence time, represented by $\mu_{ii}$ for the case j = i, which, according to Taha (2007), is the expected number of transitions until the process returns to the initial state i, shown in Equation 6.
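A minimal numerical sketch of these two quantities follows. It assumes the standard relations $\pi = \pi P$ with $\sum_j \pi_j = 1$ for the steady state and $\mu_{ii} = 1/\pi_i$ for the expected recurrence time. Instead of eliminating one equation and solving the linear system, it repeatedly applies the transition matrix until the distribution stops changing, which reaches the same limit for an ergodic chain. The 2-state matrix `P` is hypothetical.

```python
def steady_state(p, iterations=10_000, tol=1e-12):
    """Approximate the steady-state distribution by iterating pi <- pi * P."""
    n = len(p)
    pi = [1.0 / n] * n  # any initial distribution works for an ergodic chain
    for _ in range(iterations):
        nxt = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical 2-state transition matrix (rows sum to 1).
P = [[0.9, 0.1],
     [0.5, 0.5]]

pi = steady_state(P)

# Expected recurrence time of each state, assuming mu_ii = 1 / pi_i.
recurrence = [1.0 / value for value in pi]

print([round(v, 4) for v in pi], [round(v, 2) for v in recurrence])
```

For this matrix the exact steady state is (5/6, 1/6), so state 0 recurs on average every 1.2 steps and state 1 every 6 steps.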

High-order Multivariate Markov Chains
As mentioned before, the probabilities associated with various state changes are called transition probabilities. The process is characterized by a transition matrix describing the probabilities of transitions from one initial state to another. The High-order Multivariate Markov Chain is used when there is more than one time series of data and the transition occurs over more than one time step, that is, it can be considered of order n.
In order to facilitate the understanding of this concept, this section has been divided into three subsections.

High-order Markov Chains
According to Ching et al., in the High-order Markov Chain model $\lambda_h$ represents the weight and is a non-negative real number, with the weights summing to 1; $x_r$ is the probability distribution of the state at time r; and $P_h$ is the h-step transition matrix.
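The roles of the weights, the state distributions and the h-step matrices can be sketched in code. The example below assumes that the next distribution is the weighted mixture $x_{r+1} = \sum_{h=1}^{n} \lambda_h \, x_{r-h+1} P_h$, with distributions written as row vectors so that they are compatible with row-stochastic matrices. This form and convention are assumptions made for illustration; the matrices, weights and distributions are hypothetical.

```python
def vec_mat(x, p):
    """Multiply a row vector x by a matrix p (list of rows)."""
    return [sum(x[i] * p[i][j] for i in range(len(x))) for j in range(len(p[0]))]

def high_order_step(history, matrices, weights):
    """Mix the lagged distributions into the next one.

    history[0] is x_r (most recent), history[1] is x_{r-1}, and so on.
    matrices[h] and weights[h] belong to lag h + 1.
    """
    nxt = [0.0] * len(history[0])
    for lam, p_h, x in zip(weights, matrices, history):
        contribution = vec_mat(x, p_h)
        nxt = [a + lam * c for a, c in zip(nxt, contribution)]
    return nxt

# Hypothetical second-order example with two states.
P1 = [[0.7, 0.3], [0.4, 0.6]]   # 1-step transition matrix
P2 = [[0.5, 0.5], [0.2, 0.8]]   # 2-step transition matrix
weights = [0.6, 0.4]            # non-negative, summing to 1
history = [[1.0, 0.0],          # x_r: process currently in state 0
           [0.0, 1.0]]          # x_{r-1}: process previously in state 1

x_next = high_order_step(history, [P1, P2], weights)
print([round(v, 4) for v in x_next])
```

Because the weights are non-negative and sum to 1, the result is again a probability distribution.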

First-order Multivariate Markov Chains
In the First-order Multivariate Markov Chain model, $x_0^{(j)}$ is the initial probability distribution of the jth sequence; $\lambda_{jk}$ is a non-negative real number whose sum equals 1; and 1 ≤ j, k ≤ s.

High-order Multivariate Markov Chains
The High-order Multivariate Markov Chain combines the concepts presented earlier, that is, it models the behavior of multiple data sequences for more than one step in time, in the nth order. Kárný (2016) understands that the state probability distribution of the jth sequence at time r + 1 depends on the state probability distribution of all sequences, including itself, at times t = r, r − 1, . . . , r − n + 1. Then, the High-order Multivariate Markov Chain (nth order) model is presented in Equation 11, according to Ching et al. (2008).
where $\lambda_{jk}$ is a non-negative real number whose sum equals 1; and 1 ≤ j, k ≤ s.
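The dependence described above, in which the next distribution of one sequence is built from the lagged distributions of all sequences, can be sketched as follows. The code assumes the mixture $x_{r+1}^{(j)} = \sum_{k=1}^{s} \sum_{h=1}^{n} \lambda_{jk}^{(h)} \, x_{r-h+1}^{(k)} P_h^{(jk)}$, with row-vector distributions and weights that sum to 1 over k and h for each j. This form is an assumption for illustration; the sequences are indexed from 0 in the code, and all numbers are hypothetical.

```python
def vec_mat(x, p):
    """Multiply a row vector x by a matrix p (list of rows)."""
    return [sum(x[i] * p[i][j] for i in range(len(x))) for j in range(len(p[0]))]

def multivariate_step(histories, matrices, weights, j):
    """Next distribution of sequence j from the lagged distributions of all sequences.

    histories[k][h]     : distribution of sequence k at lag h (h = 0 is the most recent)
    matrices[(j, k)][h] : transition matrix from sequence k at lag h to sequence j
    weights[(j, k)][h]  : non-negative weight of that term
    """
    n_states = len(histories[0][0])
    nxt = [0.0] * n_states
    for k, lagged in enumerate(histories):
        for h, x in enumerate(lagged):
            contribution = vec_mat(x, matrices[(j, k)][h])
            lam = weights[(j, k)][h]
            nxt = [a + lam * c for a, c in zip(nxt, contribution)]
    return nxt

# Hypothetical example: s = 2 sequences, order n = 2, two states.
A = [[0.8, 0.2], [0.3, 0.7]]
B = [[0.5, 0.5], [0.1, 0.9]]
matrices = {(0, 0): [A, B], (0, 1): [B, A]}
weights = {(0, 0): [0.4, 0.1], (0, 1): [0.3, 0.2]}  # sums to 1 for j = 0
histories = [[[1.0, 0.0], [0.0, 1.0]],   # sequence 0 at lags 0 and 1
             [[0.0, 1.0], [1.0, 0.0]]]   # sequence 1 at lags 0 and 1

x_next = multivariate_step(histories, matrices, weights, j=0)
print([round(v, 4) for v in x_next])
```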
The state probability of the jth sequence is expressed through Q, the one-step transition probability matrix of the multivariate model, which determines the probability of making a transition depending on the current state; the case i = j is treated separately.

METHODOLOGY AND RESULTS
In order to achieve the proposed goal of this study, the elaborated methodology was divided into four stages, as shown in Figure 1.

Data gathering
For this study, the values of the percentage variation of the Ibovespa index and the Dow Jones index were collected. The Ibovespa index is a theoretical portfolio of companies, which serves to indicate the average performance of the prices of the most traded and most representative assets of the Brazilian stock market, while the Dow Jones is one of the main indicators of the movements of the American market. The data were taken from the InfoMoney electronic address. The observations correspond to the trading days of the São Paulo and New York stock exchanges, and the index values cover January 2, 2008 to March 21, 2018. Figure 2 shows the Ibovespa data and Figure 3 the Dow Jones data.

Elaboration of the transition matrix
In order to calculate the transition matrices, it was necessary to define intervals for the percentage changes of the two indexes, as shown in Table 1. The development of the transition matrix was separated into two subsections, one for the Ibovespa index and another for Dow Jones.

Transition matrix for Ibovespa
In this step the transition matrix of the Ibovespa index was created. For this, the first stage was the creation of the frequency matrix, which represents the total number of times the transition occurred, shown in Figure 4.  Based on this matrix the cumulative percentage that occurred in the variations of each of the intervals was observed, as shown in Figure 5.
This chart shows that the highest frequency of variation occurred in the range of -0.49 to 0.5%. Following the calculation procedure and considering this information, it was possible to calculate the matrix with the transition probabilities. The value $P_{ij}$ is the probability of price variation from state i to state j in one step of time, and the resulting matrix is presented in Figure 6.
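The procedure of going from daily percentage changes to a frequency matrix and then to transition probabilities can be sketched in Python. The interval edges below (-0.5 and 0.5, giving three states) and the sample changes are hypothetical; the study uses the intervals of Table 1, which are not reproduced in this sketch.

```python
from bisect import bisect_right

def to_states(changes, edges):
    """Map each percentage change to the index of the interval it falls in."""
    return [bisect_right(edges, change) for change in changes]

def frequency_matrix(states, n_states, lag=1):
    """Count how often state i is followed by state j after `lag` steps."""
    freq = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[lag:]):
        freq[current][following] += 1
    return freq

def transition_matrix(freq):
    """Divide each row of the frequency matrix by its total."""
    rows = []
    for row in freq:
        total = sum(row)
        rows.append([count / total if total else 0.0 for count in row])
    return rows

# Hypothetical daily percentage changes and interval edges.
changes = [0.2, -0.8, 0.1, 0.9, 0.3, -0.1, -1.2, 0.4]
edges = [-0.5, 0.5]          # state 0: below -0.5, state 1: middle, state 2: 0.5 or more

states = to_states(changes, edges)
F = frequency_matrix(states, n_states=3)
P = transition_matrix(F)

print(states)
print(F)
print(P)
```

Rows with no observed transitions are left as zeros here; how to treat such rows in practice is a modeling choice.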

Transition matrix for Dow Jones
Based on data from the Dow Jones daily percentage change, it was possible to analyze the frequency of variations, shown in Figure 7. As before, the cumulative percentage that occurred in the variations of each of the intervals was calculated, as shown in Figure 8. In this specific case, the graph shows that the highest frequency of variation also occurred in the range of -0.49 to 0.5%. The Markov transition probabilities are shown in Figure 9.

Scenario analysis
The analysis of the scenarios of the two indices was also separated into two subsections.

Scenario analysis for Ibovespa
Scenario analysis requires the calculation of steady state, as presented in Equations 16 and 17.
In Table 3 it is observed that, when the range of variation is -0.49 to 0.5%, for example, the expected recurrence time for this variation is 3.4 days; this means that a value ranging from -0.49 to 0.5% is expected to occur again in 3.4 days.

Probability of steady-state $\pi_i$    Expected recurrence time $\mu_{ii}$
$\pi_0$                                17.2 days
$\pi_1$                                11 days
$\pi_2$                                5.1 days
$\pi_3$                                3.4 days
$\pi_4$                                5.1 days
$\pi_5$                                9.8 days
$\pi_6$                                17.9 days
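The link between recurrence times and steady-state probabilities can be checked with the Ibovespa values reported above. The sketch assumes the standard relation $\mu_{ii} = 1/\pi_i$, so each implied probability is the reciprocal of the reported recurrence time; because the reported times are rounded to one decimal place, the implied probabilities sum only approximately to 1.

```python
# Expected recurrence times (in days) reported for Ibovespa, states 0 to 6.
recurrence_days = [17.2, 11, 5.1, 3.4, 5.1, 9.8, 17.9]

# Implied steady-state probabilities, assuming mu_ii = 1 / pi_i.
implied_pi = [1.0 / days for days in recurrence_days]
total = sum(implied_pi)

for state, probability in enumerate(implied_pi):
    print(f"pi_{state} = {probability:.3f}")
print(f"sum = {total:.3f}")
```

Under this assumption, the 3.4-day recurrence time of the -0.49 to 0.5% interval corresponds to a steady-state probability of about 0.29.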

Scenario analysis for Dow Jones
Scenario analysis requires the calculation of steady state, as shown in Equations 26 and 27.
With Table 5, it is observed that, when analyzing a variation of -0.49 to 0.5%, for example, the expected recurrence time for this variation is 1.9 days, that is, a value ranging from -0.49 to 0.5% is expected to occur again in 1.9 days.

Probability of steady-state $\pi_i$    Expected recurrence time $\mu_{ii}$
$\pi_0$                                47.8 days
$\pi_1$                                21.8 days
$\pi_2$                                6.6 days
$\pi_3$                                1.9 days
$\pi_4$                                5.1 days
$\pi_5$                                27.7 days
$\pi_6$                                44.8 days

Application of High-order Multivariate Markov Chains
Considering 1 as the Dow Jones data set and 2 as the Ibovespa data set, both with the same intervals stipulated previously, $F_1^{(12)}$ represents the transition frequency matrix from Dow Jones to Ibovespa with one step in time, that is, Dow Jones at time t and Ibovespa at t + 1. Likewise, $F_2^{(12)}$ represents the transition frequency matrix with two steps in time, that is, Dow Jones at time t and Ibovespa at t + 2. Using this information, it was possible to calculate the transition matrices, where $P_1^{(12)}$ represents the probability of the transition from the Dow Jones data to the Ibovespa data with one step in time, that is, Dow Jones at time t and Ibovespa at t + 1. So, $P_2^{(12)}$ represents the probability of the transition with two steps in time, that is, Dow Jones at time t and Ibovespa at t + 2. These matrices are shown in Figures 12 and 13. Then, the probability of steady state was calculated, as presented in Equations 4 and 5. The results for the probabilities with one and two steps in time are presented in Table 6.
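The cross-index counting described above can be sketched as follows: each Dow Jones state at time t is paired with the Ibovespa state at time t + lag, the pairs are counted, and each row is normalized. The two state sequences are hypothetical, use three states instead of the seven intervals of the study, and are assumed to be aligned on the same trading days, which real data from two exchanges with different holidays would not guarantee without preprocessing.

```python
def cross_frequency(source_states, target_states, n_states, lag):
    """Count pairs (source state at time t, target state at time t + lag)."""
    freq = [[0] * n_states for _ in range(n_states)]
    for source, target in zip(source_states, target_states[lag:]):
        freq[source][target] += 1
    return freq

def normalize_rows(freq):
    """Turn counts into row-wise transition probabilities."""
    rows = []
    for row in freq:
        total = sum(row)
        rows.append([count / total if total else 0.0 for count in row])
    return rows

# Hypothetical, already discretized and date-aligned state sequences.
dow_states = [1, 0, 1, 2, 1]
ibovespa_states = [1, 1, 0, 2, 1]

F1 = cross_frequency(dow_states, ibovespa_states, n_states=3, lag=1)  # t -> t + 1
F2 = cross_frequency(dow_states, ibovespa_states, n_states=3, lag=2)  # t -> t + 2
P1 = normalize_rows(F1)
P2 = normalize_rows(F2)

print(F1, F2)
print(P1[1], P2[1])
```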

DISCUSSION OF RESULTS
In this analysis of the data from January 2, 2008 to March 21, 2018, it was observed that the highest probability of variation of the two indexes is within the range of -0.49 to 0.5%. For Ibovespa, this variation is expected to occur again in 3.4 days, while for Dow Jones it is expected to occur again in 1.9 days. This information can be important, since repeating patterns can help in choosing the time to buy or sell stocks.
The application studied the behavior of the probabilities by taking the Dow Jones data at time t and the Ibovespa data at t + 1 and t + 2. The variation between these probabilities was small. It was also observed that there is a 27.21% chance that the Ibovespa variation, two steps ahead in time, is in the range of -0.49 to 0.5%, given that Dow Jones also varies within these limits at time t.
In relation to the steady state, its behavior was similar for the two indexes, that is, a low probability for the first and last data intervals, and a high probability for the intervals close to 0% of variation. This behavior was repeated for the expected recurrence time, but the expected number of days until the most probable interval occurs again is, for Ibovespa, practically twice that of Dow Jones. This indicates that the Brazilian stock exchange is more volatile than the American stock market, a fact that can be verified with the tables of the historical variations of these two indexes, since the Ibovespa shows a greater intensity in the oscillations of its values.
Information such as this can be advantageous for investors who trade on the stock exchange, since knowledge of the behavior of the markets can assist in decisions regarding the purchase and sale of shares. For practical applications, it is suggested that the database be fed in the desired time step, and that the analysis be done from one period to the next, maximizing the possibility of gain through the transition probabilities.

CONCLUSIONS
The applications of High-order Multivariate Markov Chains are still little explored in the literature. However, it can be a tool of great potential. This paper studied an application to analyze the variation of the main indexes of the Brazilian and American stock exchanges, which was efficient in obtaining results.
Thus, this work aimed to present Markov Chains and their possible practical applications, providing information to investors and, consequently, enabling a better targeting of their efforts and investments, as well as better risk management. Other techniques may be associated with valuation for investors; the technique presented here may be considered one more option.
From this application, it was possible and feasible to forecast the range of variation of the Ibovespa index, using its data and the Dow Jones data, because the Brazilian economy is related to the American stock market, as mentioned in the literature.
This work had the purpose of presenting the application of High-order Multivariate Markov Chains to two stock indexes. As a suggestion for future work, more comparisons with other tools can be made from this implementation.