SciELO - Scientific Electronic Library Online

 

RAM. Revista de Administração Mackenzie

On-line version ISSN 1678-6971

RAM, Rev. Adm. Mackenzie vol.20 no.4 São Paulo  2019  Epub Aug 12, 2019

http://dx.doi.org/10.1590/1678-6971/eramf190146 

Strategic Finance

APPLYING SINGULAR SPECTRUM ANALYSIS AND ARIMA-GARCH FOR FORECASTING EUR/USD EXCHANGE RATE

1Ipanema Comercial Exportadora e Importadora Ltda., Alfenas, MG, Brazil.

2Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil.

3Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, MG, Brazil.

ABSTRACT

Purpose:

The objective of this article is to model a minute series of exchange rates for the EUR/USD pair using the singular spectrum analysis (SSA) and ARIMA-GARCH methods and evaluate which one offers better forecasts for a five-minute horizon.

Originality/value:

Despite being a successful technique in other branches of science, the application of SSA in finance is quite new. Furthermore, exchange rate modeling is a complex problem, comprising statistical concepts and properties. However, despite the complexity, the analysis of this series is extremely important for several agents playing, directly or indirectly, a role in the economy and the financial market.

Design/methodology/approach:

Time series models were estimated using the ARIMA-GARCH and SSA techniques, taking into account three samples of the ask exchange rate (closing): uptrend, downtrend, and no well-defined trend.

Findings:

The forecasts carried out by the SSA were the ones closest to the original observations for the three cases. Regarding the quality measurements, SSA obtained the best results for both uptrend and downtrend samples; for the sample with no well-defined trend, the findings indicated that the ARIMA-GARCH technique attained better results. However, it was concluded that the SSA forecasts, regarding exchange rates during the studied period, are more appropriate than the ones obtained by the ARIMA-GARCH model, regardless of the market movement.

Keywords Exchange market; Exchange rates; Dollar; Euro; Time-series forecast

1. INTRODUCTION

In the last two decades, it has been observed that with the intense increase in globalization, international trade flow has reached levels previously unimaginable, wherein transactions involving different currency pairs have become fundamental for the various economy agents. According to the World Trade Organization (2014), in 2014, merchandise exports among its member countries totaled US$18 trillion. Trade growth is a necessary condition for the development of the global economy and the fight against poverty. Many factors have influenced this growth, such as international cooperation treaties and new statistical approaches made possible by advances in information technology for measuring transactions in terms of added value.

In this context, Atkočiūnas, Mačiulis, Klimavičiene, and Kalendiene (2010) argue that exchange rates play a central role, since they allow an easy comparison of the prices of goods and services produced in different countries, and because governments, companies, and individuals participating in global commerce are exposed to the risk of exchange rate fluctuations. Therefore, the forecasting of exchange rates is a relevant object of research.

The foreign exchange (FOREX) market comprises spot and forward transactions, exchange swaps, exchange options, and other derivatives exposed to more than one currency. According to the Bank for International Settlements (2013), in 2013, the FOREX market handled, on average, 5.3 trillion dollars per day, one trillion more than in 2010 and two trillion more than in 2007, making it the most liquid financial market in the world. The US dollar was the predominant currency, used in 87% of that year’s transactions, and the euro was the second most traded currency, with a 33% share.

Although trade in different currency pairs is not recent, the technological boom has given it greater prominence. Over the last few years, the possibility of obtaining data in real time, together with the increased capacity to process these data, has made the use of sophisticated statistical and mathematical models increasingly common in the study of financial market phenomena. These models have been developed and applied, for example, to test whether the capital asset pricing model (CAPM) or the arbitrage pricing theory (APT) better determines the return on risky assets; explain the determinants of credit ratings; obtain the optimal hedge and minimize risks; test whether the disclosure of changes in dividend policy affects a company’s stock price; and measure and predict share volatility and prices, as well as the exchange rate between currency pairs (Brooks, 2008).

Due to the importance of this matter, several studies have aimed at better describing and modeling exchange rate behavior, as well as forecasting exchange rates or their volatilities. Yao and Tan (2000), for example, used technical analysis indicators to feed neural networks and forecast the exchange rates between the US dollar and the Japanese yen, the German mark, the pound sterling, the Swiss franc, and the Australian dollar. Dacorogna, Müller, Pictet, and Vries (2001) demonstrated how the use of FOREX high-frequency data improves the efficiency of risk value estimates. Abraham (2005) compared soft computing (artificial intelligence, evolutionary computing, probabilistic logic, and fuzzy logic) and hard computing (numerical analysis and binary logic) methods for forecasting monthly average return rates in FOREX. Lai, Yu, and Wang (2005) presented a neural network with a decision support system (DSS) to forecast changes in the exchange rate trend. Alamili (2011) compared the exchange rate forecasts obtained by a support vector machine against the ones obtained by neural networks. Ravi, Lal, and Kiran (2012) presented the application of the following computational intelligence methods for forecasting FOREX rates: neural networks with wavelets, multivariate adaptive regression splines, support vector machines, evolutionary dynamic systems, and genetic programming. Chaudhuri and Ghosh (2016) used neural networks fed by macroeconomic variables to predict the price of the Indian rupee in relation to the US dollar.

Classical time-series techniques require specific conditions, which limits their practical use in exchange rate modeling, since financial series often fail to satisfy the assumptions of these techniques. Recently, it has been recognized that methods based on the singular value decomposition (SVD) are able to reduce noise in non-stationary series (Hassani & Thomakos, 2010).

This work proposes the following research question: Between the SSA and ARIMA-GARCH, which modeling method for EUR/USD exchange rates performs better when using high-frequency data to predict future information?

Thus, the general objective of this study is to evaluate which method for modeling the EUR/USD exchange rate produces better forecasts when using high-frequency data. To achieve this goal, we adopted the following steps: 1. modeling the one-minute series of EUR/USD exchange rates with the SSA (non-parametric) and ARIMA-GARCH (parametric) estimation methods; 2. forecasting using the two models; and 3. evaluating which model is the most appropriate by means of the root mean square error (RMSE), the mean absolute percentage error (MAPE), and Theil’s inequality coefficient (TIC).

Despite being a successful technique in other disciplines, such as meteorology, oceanography, biomedicine, digital image processing, and digital signal processing, SSA has only recently been applied in finance. Additionally, exchange rate modeling is a complex issue, involving statistical properties such as non-stationarity, the absence of autocorrelation (random walk), unstable volatility, high-kurtosis distributions, high asymmetry, seasonality, and noise. However, in spite of this complexity, the analysis of these series is very important for several agents acting, directly or indirectly, in the economy and financial markets.

Our goal is to apply the SSA within the scope of the FOREX market and verify the advantages or disadvantages of this method when compared to the classic ARIMA-GARCH model.

In order to reach the objectives outlined here, in addition to this introduction, this work is organized as follows: section 2 presents a literature review; section 3 presents the methodology and the database where, initially, a description of the data collection and analysis is presented, followed by the mathematical aspects of each model; section 4 discusses the obtained results; and section 5 presents the final considerations.

2. LITERATURE REVIEW

2.1 Time-series classical methods

According to Makridakis and Hibon (1997), after reflecting on the spurious correlation issue, the British statistician George Udny Yule introduced autoregressive (AR) models in the 1920s. The economist Eugen Slutsky proposed the moving average (MA) model, suggesting the possibility of measurement errors in economic data and indicating that the cyclic components of these data can be caused by random events. In 1938, Herman Wold demonstrated that the combination of autoregressive and moving average elements can model stationary time series, thus arriving at the ARMA model.

In Makridakis and Hibon’s (1997) opinion, the utilization of Wold’s model became possible in the mid-1960s, when computers became readily available, rendering the ability to perform the required calculations and optimize parameters. Still, according to these authors, within this context, in 1970 (original edition, later published in 1976), Box and Jenkins introduced the methodology for the ARIMA models (adding the “I” to the ARMA model, where the “I” represents the integration parameter necessary to transform non-stationary series into stationary ones), which has been widely used in academia since then.

The method proposed by Box and Jenkins was questioned when other, less sophisticated techniques proved more accurate. Groff (1973), for example, made forecasts for 63 sales series using the Box and Jenkins methodology and exponential weighting; he concluded that the forecast errors of the best models estimated by the Box-Jenkins method were larger than the ones arising from exponential weighting. In addition, Makridakis, Hibon, and Moser (1979) applied several forecasting methods to 111 time series and concluded that, for all series, the simplest methods obtained better results than the ARIMA models.

In 1982, the journal Econometrica published the article “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation”, in which Robert Engle showed that the constant-variance assumption of the ARIMA models is not plausible and introduced the ARCH model. This model was the predecessor of many others dealing with conditional variance, with GARCH being one of the most widely used.

As the name suggests, the GARCH model, proposed by Bollerslev (1986), is a generalization of the autoregressive conditional heteroscedasticity model introduced by Engle. The classical GARCH models express conditional variance as a linear function of the squared past values of the series, which is useful for modeling different economic and financial phenomena (Bollerslev, 1986). The generalization of the original model captures the major features of financial series, such as the random behavior of returns, volatility clustering, and asymmetries. Consequently, it has been widely applied in this area. According to Asai, McAleer, and Yu (2006), a wide range of multivariate GARCH models have been developed, analyzed, and applied in the characterization of the inherent volatility of time series in the field of finance. According to Bauwens, Laurent, and Rombouts (2006), after Engle’s publication, the ARIMA models were adapted to incorporate the conditional variance, and the models of the ARCH family have been regularly used for describing and forecasting changes in the volatility of financial series.

2.2 Time-series contemporary methods

Non-stationarity in mean or variance and the structural breaks of financial series are challenges to modeling and forecasting them, since they are not intrinsic to the data but come from factors such as exogenous shocks, political and technological changes, changes in consumer preferences, and informational asymmetry. The classical models fail to deal with these problems and, for this reason, new techniques have been developed to overcome them. Among them, we can mention applications of the state-space model and the Kalman filter, which consider the variability of coefficients (such as the CAPM betas) over time, semi- and non-parametric models, models with co-integration, and intercept corrections, among others (Hassani & Thomakos, 2010).

Structural or autoregressive time-series models assume normality, linearity, and stationarity of the data or residuals. This limits their practical use, as financial series often do not satisfy these assumptions. Furthermore, another element able to restrict the predictive capacity of the models is the occurrence of noise. According to Hassani and Thomakos (2010), there are, in general, two main approaches to forecasting time series with noise. The first one ignores the noise and fits the model that best extracts the underlying deterministic dynamics; the second approach, a more effective one, tries to decompose the series into subcomponents, identify the noise component and remove it, and then predict new information from the filtered series.

Within the second approach, we can mention the singular spectrum analysis, which is a non-parametric technique that incorporates classical elements of time-series analysis, multivariate geometry, dynamical systems and signal processing (Hassani, 2007). It can be applied to arbitrary, linear or non-linear, stationary or non-stationary, Gaussian or non-Gaussian statistical processes (Hassani, 2010). SSA has two properties that justify its use in the analysis of financial time-series: first, it does not make assumptions about the data; second, unlike other methods, it can be applied to small samples (Hassani, 2007).

SSA has the characteristic of being a technique applicable to time-series with few observations. This allowed its use in a range of studies in which the data are not frequent or where their collection at high frequency is impracticable. Hassani and Zhigljavsky (2009), for example, have demonstrated that the use of SSA was satisfactory for predicting both small and large macroeconomic series.

Although SSA has become a widely used tool in the analysis and prediction of climatic (Vautard & Ghil, 1989), meteorological (Ghil & Vautard, 1991), hydrological (Menezes et al., 2014), biomedical (Sanei & Hassani, 2015), digital image processing (Rodríguez-Aragón & Zhigljavsky, 2010), digital signal processing series (Vautard, Yiou, & Ghil, 1992), and other fields of knowledge such as the social sciences and physics (Hassani, 2010), its application in economic and financial series is still a recent development. Hassani and Thomakos (2010) reviewed the development of the use of SSA in the analysis of this type of data.

Another important feature that should be emphasized is that, although some probabilistic and statistical concepts are employed in SSA-based methods, they assume neither the stationarity of the analyzed series nor the normality of the residuals (Hassani & Thomakos, 2010).

According to Golyandina, Nekrutkin, and Zhigljavsky (2001), SSA can be modified in different manners, resulting in variations such as the Single and Double Centering SSA, Toeplitz SSA, and Sequential SSA. A modification often found in studies and publications about the SSA is the multi-channel singular spectrum analysis (MSSA). This modification applies to cases in which the time-series is constituted by several variables correlated with each other (Vautard et al., 1992).

Unlike the SSA, the application of the ARIMA-GARCH model depends on the sample size and on assumptions such as the sample distribution. According to Ng and Lam (2006), different sizes of a time series guide the estimation of the model towards different locally optimal solutions. In their work, these authors used sample sizes ranging from 200 to 3000 observations of the daily returns of the NASDAQ index and concluded that the sample with 1000 observations presented the best result, although the starting day of the observations affected it.

Therefore, since the SSA technique has allowed for better forecasts than other models usually used for time series in different knowledge areas (e.g., Esquivel, Senna, & Gomes, 2013; Menezes et al., 2014; Royer, Wilhelm, & Patias, 2015), we considered the following research hypothesis: SSA can more accurately predict the EUR/USD exchange rate compared to the ARIMA-GARCH model.

3. METHODOLOGY AND DATABASE

3.1 Database

In order to achieve the proposed objectives, a descriptive research effort was carried out in two parts. The first is the modeling of the EUR/USD exchange rate by the SSA and the ARIMA (p, d, q)-GARCH (r, s) model. The second part consists of predicting future data using the optimal models obtained in the previous step. The results of the second step are used to measure the accuracy of the estimates and determine which technique generates better forecasts.

For each sample set, the SSA model and the ARIMA (p, d, q)-GARCH (r, s) model were estimated. The R software, version 3.3.0 for Windows®, as well as the RStudio® integrated development environment, version 0.99.484, were used. For the modeling of the series through the SSA, the “Rssa” package, developed by Anton Korobeynikov, Alex Shlemov, Konstantin Usevich, and Nina Golyandina and available in the official repository of the statistical software, was used. For the modeling of the series through ARIMA-GARCH, the “rugarch” package developed by Alexios Ghalanos, also available in the official repository, was used.

Three samples of the EUR/USD pair ask closing price were used, with 1-minute frequency. The data were collected with the Marketscope 2.0® software, from the FOREX broker FXCM. This broker was chosen because it provides quotations through the electronic communication network (ECN), i.e., directly from liquidity providers (major market participants), and does not interfere with them.

Unlike the stock exchanges, the FOREX market, in addition to being extremely liquid, operates 24 hours a day from Sundays at 5:00 pm (UTC), when the Asian session opens, with liquidity coming from Japan (Tokyo being the most important hub), China, Australia, New Zealand, Russia, and others, until 5:00 pm (UTC) on Fridays, when the American session closes, New York being the most important hub. Therefore, there was no need to worry about gaps (even on weekends).

The data were selected so that the three typical movements of the financial markets could be obtained: the uptrend or bull trend; the downtrend or bear trend; and the absence of a trend (range). The analysis of the predictions for both models was made for each sample.

According to Murphy (1999), “trend” is the direction in which the market moves; however, this movement does not occur in a straight line, but in a sequence of “zigzags”. Such “zigzags” form a successive series of peaks and bottoms. Thus, as per Murphy (1999), one can define the upward trend as a successive series of increasingly higher peaks and bottoms, the downward trend as a successive series of lower and lower peaks and bottoms, and the absence of a trend as a series of horizontal peaks and bottoms.

Figure 3.1.1 shows the evolution of the minute-to-minute exchange rate close for the first sample. The time interval extends from 08/23/2016 08:02 am (UTC) through 08/24/2016 03:02 am (UTC). Figures 3.1.2 and 3.1.3 present, respectively, samples 2 and 3 used for estimation of models in uptrend (from 03:21 pm UTC on 08/17/2016 to 04:05 pm UTC on 08/18/2016) and in range (from 09:41 am UTC on 08/16/2016 to 01:54 pm UTC on 08/17/2016).

Source: Elaborated by the authors with data obtained through Marketscope 2.0®.

Figure 3.1.1 SAMPLE 1 – EUR/USD PAIR EXCHANGE RATE CLOSE EVOLUTION IN DOWNTREND 

Source: Elaborated by the authors with data obtained through Marketscope 2.0®.

Figure 3.1.2 SAMPLE 2 – EUR/USD PAIR EXCHANGE RATE CLOSE EVOLUTION IN UPTREND 

Source: Elaborated by the authors with data obtained through Marketscope 2.0®.

Figure 3.1.3 SAMPLE 3 – EUR/USD PAIR EXCHANGE RATE CLOSE EVOLUTION WITH NO WELL-DEFINED TREND 

However, it is important to highlight that, unlike in the stock market, the frequency and impact of news on FOREX, especially with regard to the EUR/USD pair, occur for the most part on a global rather than local scale. In view of this aspect of the exchange rate market, we discarded from sample 3 four outliers resulting from the disclosure of economic news and/or reports (GDP, personal spending, core PCE price index, ISM manufacturing, ADP employment survey, FOMC - fed funds rate, sovereign debt to be rated, PMI, among others).

In the FOREX, the exchange rate can also be understood as price, because when transacting a currency pair, the agent sells one currency at the same time as buying another. Therefore, the criterion defining outlier was:

$$|r_t| \times 100{,}000 > 50 \;\Rightarrow\; x_t = \text{outlier} \tag{1}$$

where $r_t$ is:

$$r_t = \ln\left(\frac{x_t}{x_{t-1}}\right) \tag{2}$$

Here, $x_t$ is the observation $x$ at time $t$. The absolute value of the return was multiplied by 100,000 because the FOREX standard unit (named pip) is 0.0001 and the quotation differences between subsequent sample moments are, in general, on the order of 0.00001.
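The criterion in equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function name `flag_outliers` and the quotes below are hypothetical and are not taken from the article's samples.

```python
import math

def flag_outliers(prices, threshold=50.0, scale=100_000):
    """Return the indices t whose scaled absolute log return exceeds the threshold.

    Implements |r_t| * 100,000 > 50 with r_t = ln(x_t / x_{t-1}).
    """
    flagged = []
    for t in range(1, len(prices)):
        r_t = math.log(prices[t] / prices[t - 1])
        if abs(r_t) * scale > threshold:
            flagged.append(t)
    return flagged

# Hypothetical one-minute quotes (illustrative only): the third change is
# a jump of 0.00100, far larger than the usual 0.00001-order moves.
quotes = [1.13000, 1.13010, 1.13020, 1.13120, 1.13125]
outliers = flag_outliers(quotes)
print(outliers)
```

With these made-up quotes, only the observation that follows the 0.00100 jump exceeds the threshold.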

3.2 ARIMA-GARCH model

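According to Makridakis and Hibon (1997), a stationary ARMA series $x_t$ can be modeled as a combination of its past values as well as past errors:

$$x_t = \varphi_1 x_{t-1} + \varphi_2 x_{t-2} + \ldots + \varphi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \ldots - \theta_q \varepsilon_{t-q} \tag{3}$$

$$x_t = \sum_{i=1}^{p} \varphi_i x_{t-i} - \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \varepsilon_t \tag{4}$$

where $p$ is the number of AR terms and $q$ is the number of MA terms.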
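As a minimal sketch of how equation (4) produces a forecast, the snippet below computes the one-step-ahead value by setting the unknown current error to its expected value of zero. The function name `arma_one_step` and all coefficients are hypothetical; they are not estimates from the article.

```python
def arma_one_step(history, residuals, phi, theta):
    """One-step forecast under equation (4), with the current error set to zero.

    history[-i] is x_{t-i}; residuals[-i] is eps_{t-i}.
    """
    ar_part = sum(p * history[-i] for i, p in enumerate(phi, start=1))
    ma_part = sum(q * residuals[-i] for i, q in enumerate(theta, start=1))
    # The MA terms enter with a minus sign, following the article's convention.
    return ar_part - ma_part

# Hypothetical ARMA(2, 1) coefficients and recent data (illustrative only).
phi = [0.5, 0.2]
theta = [0.4]
history = [1.0, 2.0]   # x_{t-2}, x_{t-1}
residuals = [0.1]      # eps_{t-1}
forecast = arma_one_step(history, residuals, phi, theta)
print(forecast)
```

Here the AR part is 0.5 × 2.0 + 0.2 × 1.0 and the MA part is 0.4 × 0.1, so the forecast is their difference.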

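A summary of the mathematical definition of the GARCH model presented by Bollerslev (1986) is as follows. Let $\varepsilon_t$ be a discrete stochastic process and $\psi_t$ the set of all information through time $t$. The GARCH ($r$, $s$) process is then given by:

$$\varepsilon_t \mid \psi_t \sim N(0, h_t) \tag{5}$$

$$h_t = a_0 + \sum_{i=1}^{s} a_i \varepsilon_{t-i}^2 + \sum_{i=1}^{r} \beta_i h_{t-i} = a_0 + A(L)\varepsilon_t^2 + B(L)h_t \tag{6}$$

where $r \geq 0$; $s > 0$; $a_0 > 0$; $a_i \geq 0$, $i = 1, \ldots, s$; and $\beta_i \geq 0$, $i = 1, \ldots, r$.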
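The recursion in equation (6) can be illustrated for the simplest case, GARCH (1, 1). The sketch below is only an illustration: the function name `garch11_variance`, the parameter values, the residuals, and the starting variance are hypothetical, and in practice the parameters would be estimated (the article uses the rugarch package for that).

```python
def garch11_variance(residuals, a0, a1, beta1, h_start):
    """Conditional variances from h_t = a0 + a1 * eps_{t-1}^2 + beta1 * h_{t-1}."""
    if a0 <= 0 or a1 < 0 or beta1 < 0:
        raise ValueError("parameters must satisfy a0 > 0, a1 >= 0, beta1 >= 0")
    h = [h_start]
    for eps in residuals:
        h.append(a0 + a1 * eps ** 2 + beta1 * h[-1])
    return h

# Hypothetical parameters and residuals (illustrative only).
h = garch11_variance([0.2, -0.1], a0=0.01, a1=0.1, beta1=0.8, h_start=0.05)
print(h)
```

Each new variance combines the constant, the latest squared shock, and the previous variance, which is how the model reproduces periods of persistently high or low volatility.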

The specification of an ARIMA-GARCH model can then be represented by:

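$$\sum_{i=1}^{p} \varphi_i x_{t-i} - \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} + \sum_{i=1}^{s} a_i \varepsilon_{t-i}^2 + \sum_{i=1}^{r} \beta_i h_{t-i} + \varepsilon_t + a_0 \tag{7}$$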

3.3 Singular spectrum analysis (SSA)

The emergence of the SSA is usually associated with the publication of the articles by Broomhead and King (1986) and Broomhead, Jones, King, and Pike (1987). Many works studying the methodology and applications of the SSA have since been published: Vautard et al. (1992), Allen and Smith (1996), Danilov and Zhigljavsky (1997), and Golyandina et al. (2001, p. 14).

The most basic SSA method is the transformation of a nonzero real-valued time series into a sum of series, so that each component of that sum can be recognized as trend, seasonality, or noise (Hassani, 2007). For this, the algorithm of this technique comprises two stages: decomposition and reconstruction. The first stage is executed by transforming the data into a trajectory matrix and applying the singular value decomposition (SVD) to it. The second stage, in turn, deals with grouping the elements of the decomposed trajectory matrix to form subgroups from which time series will be obtained again (Hassani, 2010).

In the decomposition, Elsner and Tsonis (1996) demonstrated the possibility of applying multivariate statistical analysis to a single time-series sample. According to the authors, the records of a dynamic system are the result of the interaction of all variables, and therefore a single sample must contain information about the dynamics of the main variables involved in the evolution of the system. Furthermore, it is possible to assume that such variables satisfy a set of $p$ first-order differential equations:

$$\begin{aligned} \dot{x}_1 &= f_1(x_1, x_2, \ldots, x_p), \\ \dot{x}_2 &= f_2(x_1, x_2, \ldots, x_p), \\ &\;\;\vdots \\ \dot{x}_p &= f_p(x_1, x_2, \ldots, x_p), \end{aligned} \tag{8}$$

where $\dot{x}_i$ indicates the first derivative of variable $x_i$ with respect to time, $dx_i/dt$, and so on. Through successive differentiations, this system can be transformed into a single differential equation of order $p$ that represents the whole system without any loss of information:

$$x_1^{(p)} = f\left(x_1, \dot{x}_1, \ddot{x}_1, \ldots, x_1^{(p-1)}\right) \tag{9}$$
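As a small worked illustration (a hypothetical system, not taken from the article), take $p = 2$. Differentiating the first equation once and substituting the second gives a single second-order equation in $x_1$ alone, which has the form of equation (9) with $f(x_1, \dot{x}_1) = -x_1$:

```latex
\begin{aligned}
\dot{x}_1 &= x_2, \\
\dot{x}_2 &= -x_1
\end{aligned}
\quad\Longrightarrow\quad
\ddot{x}_1 = \dot{x}_2 = -x_1 .
```

The single variable $x_1$ and its derivatives therefore carry the dynamics of the whole two-variable system, which is the idea behind working with one series and its lags.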

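In an analogous way, it is possible to make the inverse transformation of a discrete time series and its successive lags. Let us suppose a time series $x_t$, where $t = 1, 2, \ldots, n$:

$$x_1, x_2, \ldots, x_n \tag{10}$$

Let $m$ be the number of lags, such that $2 < m < n$. Then, from the original series, we can obtain the following $n - m + 1$ subsets of size $m$, each formed by consecutive elements:

$$\begin{aligned} v_1 &= (x_1, x_2, \ldots, x_m) \\ v_2 &= (x_2, x_3, \ldots, x_{m+1}) \\ &\;\;\vdots \\ v_{n-m+1} &= (x_{n-m+1}, x_{n-m+2}, \ldots, x_n) \end{aligned} \tag{11}$$

These are then written in vector form to obtain the trajectory matrix $X$:

$$X = \begin{bmatrix} v_1 \\ \vdots \\ v_{n-m+1} \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & \cdots & x_m \\ \vdots & \vdots & \ddots & \vdots \\ x_{n-m+1} & x_{n-m+2} & \cdots & x_n \end{bmatrix} \tag{12}$$

Therefore, the decomposition process occurs by mapping the one-dimensional series $x_t = (x_1, \ldots, x_n)$ into a matrix $X$ of dimension $(n - m + 1) \times m$.

The trajectory matrix is composed of the rows $X_1, \ldots, X_{n-m+1}$, with $X_i = (x_i, \ldots, x_{i+m-1}) \in \mathbb{R}^m$; $m$ represents the size of the decomposition window (lags) and is the only parameter to be defined. The resulting trajectory matrix is a Hankel matrix. According to Pan, Rami, and Wang (2002), a Hankel matrix is one whose elements satisfy $a_{i,j} = a_{i-1,j+1}$ and can be represented as follows:

$$\begin{bmatrix} h_0 & h_1 & \cdots & h_{n-1} \\ h_1 & h_2 & \cdots & h_n \\ \vdots & \vdots & \ddots & \vdots \\ h_{n-1} & h_n & \cdots & h_{2n-2} \end{bmatrix} \tag{13}$$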
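The embedding in equations (11) and (12) and the Hankel property can be sketched as follows. The function name `trajectory_matrix` and the toy series are hypothetical; the Rssa package used in the article performs this step internally.

```python
def trajectory_matrix(series, m):
    """Build the (n - m + 1) x m trajectory matrix whose rows are lagged windows."""
    n = len(series)
    if not 2 < m < n:
        raise ValueError("the window m must satisfy 2 < m < n")
    return [list(series[i:i + m]) for i in range(n - m + 1)]

# Toy series with n = 6 observations and window m = 3 (illustrative only).
series = [1, 2, 3, 4, 5, 6]
X = trajectory_matrix(series, m=3)

# Hankel property: every element equals its upper-right neighbour,
# a[i][j] == a[i-1][j+1], so anti-diagonals are constant.
is_hankel = all(
    X[i][j] == X[i - 1][j + 1]
    for i in range(1, len(X))
    for j in range(len(X[0]) - 1)
)
print(X, is_hankel)
```

For this toy series the matrix has four rows, `[1, 2, 3]` through `[4, 5, 6]`, and each anti-diagonal repeats a single observation.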

The symmetric trajectory matrix X is then transformed into a sum of elementary matrices.

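Let $S = XX^{T}$, let $\lambda_1 \geq \ldots \geq \lambda_m$ be the eigenvalues of $S$ arranged in decreasing order of magnitude, and let $U_1, \ldots, U_m$ be the orthogonal system of eigenvectors corresponding to these eigenvalues. Let $d = \operatorname{rank}(X) = \max\{i \mid \lambda_i > 0\}$. Then, if $V_i = X^{T}U_i/\sqrt{\lambda_i}$ $(i = 1, \ldots, d)$, the SVD of the trajectory matrix can be expressed as $X = X_1 + \ldots + X_d$, where $X_i = \sqrt{\lambda_i}\,U_i V_i^{T}$. The collection $(\sqrt{\lambda_i}, U_i, V_i)$ is known as the $i$-th eigentriple.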
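To make the eigentriple concrete, the sketch below approximates only the leading one using plain power iteration on $S = XX^{T}$. Power iteration is a stand-in chosen here to keep the example dependency-free; it is not the SVD routine used by the article or by Rssa. The function names and the toy geometric series are hypothetical. Because a geometric series yields a rank-one trajectory matrix, the first elementary matrix reproduces $X$ itself.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def leading_eigentriple(X, iterations=200):
    """Approximate (sqrt(lambda_1), U_1, V_1) by power iteration on S = X X^T."""
    S = matmul(X, transpose(X))
    u = [1.0] * len(S)
    for _ in range(iterations):
        w = [sum(s * x for s, x in zip(row, u)) for row in S]
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    # Rayleigh quotient u^T S u gives the leading eigenvalue for unit u.
    lam = sum(ui * sum(s * x for s, x in zip(row, u)) for ui, row in zip(u, S))
    sigma = math.sqrt(lam)
    # V_1 = X^T U_1 / sqrt(lambda_1)
    v = [sum(X[i][j] * u[i] for i in range(len(X))) / sigma for j in range(len(X[0]))]
    return sigma, u, v

# Toy geometric series (illustrative only): its trajectory matrix has rank 1.
series = [2.0 ** t for t in range(6)]
X = [series[i:i + 3] for i in range(4)]
sigma, U1, V1 = leading_eigentriple(X)

# First elementary matrix X_1 = sqrt(lambda_1) * U_1 * V_1^T
X1 = [[sigma * ui * vj for vj in V1] for ui in U1]
max_error = max(abs(X[i][j] - X1[i][j]) for i in range(4) for j in range(3))
print(max_error)
```

For series that are not rank one, the remaining eigentriples would be needed as well, and a full SVD routine is the practical choice.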

According to Golyandina et al. (2001), in the reconstruction stage, once the trajectory matrix has been decomposed, it is possible to group the elementary matrices into subsets and turn the matrix resulting from each group into a new time series of the same length as the original series. The mathematical demonstration of this stage is beyond the scope of this study and can be found in the works cited above.
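Although the article leaves the details of reconstruction to the cited works, the usual way of turning a grouped matrix back into a series in the SSA literature is diagonal averaging: each element of the new series is the mean of the matrix entries on one anti-diagonal. The sketch below assumes that procedure; the function name `diagonal_average` and the toy matrices are hypothetical.

```python
def diagonal_average(matrix):
    """Average each anti-diagonal (constant i + j) to recover a series."""
    rows, cols = len(matrix), len(matrix[0])
    n = rows + cols - 1
    sums = [0.0] * n
    counts = [0] * n
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]

# A Hankel matrix is mapped back to the series it was built from.
hankel = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
recovered = diagonal_average(hankel)

# A non-Hankel matrix (as grouped elementary matrices usually are) is smoothed.
smoothed = diagonal_average([[1.0, 4.0], [2.0, 3.0]])
print(recovered, smoothed)
```

Applied to an exact trajectory matrix, the procedure returns the original series unchanged; applied to a grouped matrix, it yields the component series of the same length.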

Hassani (2007) proposed the investigation of supplementary information as a method to choose the trajectory matrix components to be grouped and transformed into separate trend, seasonality, and noise series. For the author, such information is like a bridge linking decomposition and reconstruction; the idea is to make the identification of the eigentriples constituting each component more practical. The eigenvalues and eigenvectors of matrix S are considered “supplementary information”, as is the weighted correlation matrix. Hassani (2007) showed that breaks or jumps between eigenvalues indicate different components of the time series and that, as a rule, noise produces a slowly decreasing sequence of eigenvalues. These characteristics can be observed in the plot of the eigenvalues of matrix S obtained from the original series. Besides the eigenvalue plot, we can also analyze the plot of the elements of a given eigenvector and check whether this eigenvector belongs to an eigentriple associated with trend, seasonality, or noise. In order to verify the separability of the eigentriples, that is to say, to confirm that the formed groupings are not correlated with each other, the weighted correlation matrix is used.
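The weighted correlation between two reconstructed components can be sketched as below, assuming the standard definition from the SSA literature in which each observation is weighted by the number of times it appears in the trajectory matrix. The function name `w_correlation` and the two toy components are hypothetical; Rssa provides its own routine for this matrix.

```python
import math

def w_correlation(f1, f2, window):
    """Weighted correlation of two equal-length series for a given window size."""
    n = len(f1)
    l_star = min(window, n - window + 1)
    # Weight of observation i (0-based): how many trajectory-matrix entries hold it.
    weights = [min(i + 1, l_star, n - i) for i in range(n)]

    def inner(a, b):
        return sum(w * x * y for w, x, y in zip(weights, a, b))

    return inner(f1, f2) / math.sqrt(inner(f1, f1) * inner(f2, f2))

# Toy components (illustrative only): a constant level and a pure oscillation.
level = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
oscillation = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
rho = w_correlation(level, oscillation, window=3)
print(rho)
```

Values near zero suggest that two components are well separated, while values near one in absolute terms suggest that they should be grouped together.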

3.4 Quality measurements

The following computations were carried out in order to test the models’ predictive capacity:

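  • Mean Absolute Percentage Error:

    $$\mathrm{MAPE} = \frac{100}{T}\sum_{i=1}^{T}\left|\frac{x_i - x'_i}{x_i}\right| \tag{14}$$

  • Root Mean Square Error:

    $$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{i=1}^{T}\left(x'_i - x_i\right)^2} \tag{15}$$

  • Theil’s Inequality Coefficient:

    $$\mathrm{TIC} = \frac{\mathrm{RMSE}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x'_i\right)^2} + \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}} \tag{16}$$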

where xi is the i-th observation, x'i is the i-th forecast, and n is the forecast horizon. For all measurements, the following rule is valid: the closer to zero, the better the fit of the models.
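The three measurements can be sketched in Python as follows, assuming the usual forms of MAPE, RMSE, and Theil's inequality coefficient. The function names and the two-point example are hypothetical and unrelated to the article's results.

```python
import math

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 / len(actual) * sum(abs((a - f) / a) for a, f in zip(actual, forecast))

def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def theil_ic(actual, forecast):
    """Theil's inequality coefficient: RMSE scaled by the root mean squares of both series."""
    n = len(actual)
    denominator = (math.sqrt(sum(f * f for f in forecast) / n)
                   + math.sqrt(sum(a * a for a in actual) / n))
    return rmse(actual, forecast) / denominator

# Hypothetical observations and forecasts (illustrative only).
actual = [2.0, 4.0]
forecast = [1.0, 5.0]
scores = (mape(actual, forecast), rmse(actual, forecast), theil_ic(actual, forecast))
print(scores)
```

A perfect forecast drives all three measurements to zero, which matches the rule that values closer to zero indicate a better fit.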

4. RESULTS

The SSA and ARIMA (p, d, q)-GARCH (r, s) models were implemented for each of the three samples: 1. downtrend, consisting of 1140 observations; 2. uptrend, consisting of 1471 observations; and 3. absence of a trend, with 1686 observations.

The last five observations of each sample were held out and not used in the modeling. That is, the best models were defined according to the criteria of each technique separately, using 1135 observations from the downtrend series, 1466 from the uptrend series, and 1681 from the sample with no trend. In the end, out-of-sample predictions were made for five minutes ahead using the two models fitted to each of the three samples, and the predictions were compared to each other.
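The hold-out scheme described above can be sketched as follows. The function name `split_holdout` is hypothetical, and the placeholder series only reproduce the sample sizes reported in the article, not the actual quotes.

```python
def split_holdout(series, horizon=5):
    """Split a series into an estimation part and the last `horizon` observations."""
    if not 0 < horizon < len(series):
        raise ValueError("horizon must be positive and shorter than the series")
    return series[:-horizon], series[-horizon:]

# Sample sizes reported in the article; the values themselves are placeholders.
sample_sizes = {"downtrend": 1140, "uptrend": 1471, "range": 1686}
train_sizes = {}
for name, total in sample_sizes.items():
    placeholder = list(range(total))
    train, holdout = split_holdout(placeholder)
    train_sizes[name] = len(train)
print(train_sizes)
```

The resulting estimation sizes match the 1135, 1466, and 1681 observations stated in the text, with five observations left for the out-of-sample comparison.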

4.1 Modeling by SSA

According to Elsner and Tsonis (1996), the decomposition window (mentioned in section 3.3) is the only parameter to be defined and should be one-quarter of the number of observations. However, there is no consensus on this in the literature. Golyandina et al. (2001), for example, believe the decomposition window depends on the intrinsic properties of each series being modeled. For Hassani (2007), it is possible that $2 \leq L \leq T$, where $L$ is the decomposition window and $T$ is the number of observations. However, the author suggests that $\max L = T/2$.

This parameter has a great influence on the series “separability”. According to Golyandina et al. (2001), the main purpose of the SSA is the decomposition of the original series into a sum of series, in a way that each component in this series can be identified as a trend, regular component, or noise. For the authors, the decomposition will only succeed if the additive components are separable from one another.

To obtain the best window, several decompositions were conducted through trial and error, starting from the maximum value suggested by Hassani (2007), $L = T/2$. The “separability” of each decomposition was verified using the weighted correlation matrix of the vectors from the trajectory matrix. Figure 4.1.1 displays some weighted correlation matrices resulting from different decomposition windows, and Figure 4.1.2 shows the optimum parameters found for each sample set.

Source: Elaborated by the authors using the Rssa package, R statistical software.

Figure 4.1.1 WEIGHTED CORRELATION MATRIX FOR DIFFERENT SAMPLE DECOMPOSITION WINDOWS 

Figure 4.1.2 DECOMPOSITION WINDOW SIZE FOR EACH SAMPLE SET  

| Sample* | T | L |
|---|---|---|
| 1 - Downtrend | 1135 | ROUND(T/4) = 284 |
| 2 - Uptrend | 1466 | ROUND(T/11) = 133 |
| 3 - Range | 1681 | ROUND(T/11) = 841 |

Source: Elaborated by the authors using the Rssa package, R statistical software.

*Does not include the forecast horizon

Once the optimum parameters of the series decomposition windows were found, we proceeded to analyze the eigenvectors of each sample set. Figure 4.1.3 shows the reconstructed series using only the first eigenvector (in red) and the original series of each sample (in black). For all three samples, the first eigenvector is the most significant.

Source: Elaborated by the authors using the Rssa package, R statistical software.

Figure 4.1.3 RECONSTRUCTION OF SAMPLE SET BY EIGENVECTOR 1 

By analyzing together the matrix of weighted correlations and the other eigenvectors, we added other components that presented periodic characteristics. The other eigenvectors were considered noise components and were excluded from the reconstruction used to perform the predictions. Figure 4.1.4 shows the reconstruction of the noise components of each sample.

Source: Prepared by the authors using the Rssa package, R statistical software.

Figure 4.1.4 RECONSTRUCTION OF NOISE COMPONENTS OF EACH SAMPLE 

4.2 ARIMA-GARCH modeling

We used the method suggested by Box and Jenkins (1976) for modeling the series with ARIMA (p,d,q)-GARCH (r,s). Makridakis and Hibon (1997) stated that the method can be summarized as follows: (1) check whether the series is stationary and, if it is not, transform it into a series that is stationary in mean and variance; (2) determine the p and q parameters by means of the autocorrelation and partial autocorrelation coefficients; (3) check model adequacy.

Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) tests were deployed for checking if the series were stationary. Figure 4.2.1 presents the test results for each series.

Figure 4.2.1 AUGMENTED DICKEY-FULLER AND PHILLIPS-PERRON TESTS FOR THE THREE SAMPLES 

| | Sample 1 (downtrend) | Sample 2 (uptrend) | Sample 3 (range) |
|---|---|---|---|
| ADF test | | | |
| Number of observations | 1135 | 1466 | 1681 |
| Dickey-Fuller statistic | -3.45 | -2.54 | -4.33 |
| Lag order | 10 | 11 | 11 |
| p-value | 0.05 | 0.35 | < 0.01 |
| PP test | | | |
| Number of observations | 1135 | 1466 | 1681 |
| Z (alpha) | -19.23 | -21.14 | -35.04 |
| Lag order | 7 | 7 | 8 |
| p-value | 0.08 | 0.06 | < 0.01 |

Alternative hypothesis for both tests: stationarity.

Source: Prepared by the authors using the Rssa package, R statistical software.

The results in Figure 4.2.1 indicate that the series of samples 1 and 2 are not stationary. Both were differenced once, that is, treated as I(1), and tested again. Figure 4.2.2 shows the new results.
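The differencing step can be sketched in Python as follows. This is only an illustration: the function name `difference` and the toy series are assumptions, and the ADF and PP tests themselves are not reproduced here.

```python
def difference(series, order=1):
    """Apply first differences `order` times; each pass shortens the series by one."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


# Toy series with a linear trend (illustrative only).
trend = [1.0, 3.0, 5.0, 7.0]
print(difference(trend))           # the trend becomes a constant level
print(difference(trend, order=2))  # a second pass removes the level as well
```

A series that needs one such pass to become stationary is said to be integrated of order one, I(1), which corresponds to d = 1 in the ARIMA (p,d,q) notation.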

Figure 4.2.2 AUGMENTED DICKEY-FULLER AND PHILLIPS-PERRON TESTS FOR THE DIFFERENCED SERIES 

| | Sample 1 (downtrend) | Sample 2 (uptrend) |
|---|---|---|
| ADF test | | |
| Number of observations | 1135 | 1466 |
| Dickey-Fuller statistic | -10.24 | -11.86 |
| Lag order | 10 | 11 |
| p-value | < 0.01 | < 0.01 |
| PP test | | |
| Number of observations | 1135 | 1466 |
| Z (alpha) | -1099.3 | -1403.60 |
| Lag order | 7 | 7 |
| p-value | < 0.01 | < 0.01 |

Alternative hypothesis for both tests: stationarity.

Source: Prepared by the authors using the Rssa package, R statistical software.

Figure 4.2.3 displays the autocorrelations and the partial autocorrelations for each sample. To keep the models parsimonious, parameters p and q were limited to a maximum order of 5, and parameters r and s to a maximum order of 1.

Source: Prepared by the authors using the Rssa package, R statistical software.

Figure 4.2.3 SERIES ACF AND PACF 
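For reference, the sample autocorrelations and partial autocorrelations plotted in such a figure can be computed as in the Python sketch below. It is a minimal illustration rather than the R routine used in this study: the function names `acf` and `pacf`, the biased autocovariance estimator, and the toy series are assumptions. The partial autocorrelations are obtained with the Durbin-Levinson recursion.

```python
def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    phi_prev = []  # coefficients of the AR(k-1) fit
    out = [1.0]
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        out.append(phi_kk)
    return out


# Toy series (illustrative only, not the study's data).
toy = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0, 6.0, 8.0, 9.0, 8.0]
print(acf(toy, 3))
print(pacf(toy, 3))
```

In the Box-Jenkins reading, the lag at which the partial autocorrelations cut off suggests the autoregressive order p, and the lag at which the autocorrelations cut off suggests the moving-average order q.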

The optimum model for each sample set was chosen according to the Akaike information criterion (AIC). Figure 4.2.4 shows the orders of the optimum models and the respective AIC values.

Figure 4.2.4 PARAMETERS AND AIC CRITERIA OF SELECTED MODELS 

| Sample* | ARIMA | GARCH | AIC |
|---|---|---|---|
| 1 - Downtrend | (1, 1, 1) | (1, 0) | -15.782 |
| 2 - Uptrend | (5, 1, 1) | (1, 0) | -14.909 |
| 3 - Range | (4, 0, 1) | (1, 1) | -15.547 |

Source: Prepared by the authors using the Rssa package, R statistical software.

*Does not include the forecast horizon.
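The selection procedure amounts to a grid search over the candidate orders, keeping the model with the lowest AIC. The Python sketch below shows only that search logic. The function `toy_fit` is a hypothetical placeholder that returns made-up log-likelihoods, not an ARIMA-GARCH estimator, and the choice of letting every order start at zero, as well as the parameter count, are assumptions made for the example. The AIC is computed in its standard form, 2k - 2 ln(L), which may differ in scale from the values reported by a given software package.

```python
from itertools import product


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def candidate_orders(max_pq=5, max_rs=1):
    """All (p, q, r, s) combinations up to the stated maximum orders."""
    pq = range(max_pq + 1)
    rs = range(max_rs + 1)
    return list(product(pq, pq, rs, rs))


def select_by_aic(candidates, fit):
    """Return (best AIC, best order); fit(order) -> (log_likelihood, n_params)."""
    return min((aic(*fit(order)), order) for order in candidates)


def toy_fit(order):
    """HYPOTHETICAL stand-in for a real estimator: made-up log-likelihood."""
    p, q, r, s = order
    log_likelihood = 100 - 10 * ((p - 1) ** 2 + (q - 1) ** 2 + (r - 1) ** 2 + s ** 2)
    n_params = p + q + r + s + 2  # assumed count: orders plus two constants
    return log_likelihood, n_params


best_aic, best_order = select_by_aic(candidate_orders(), toy_fit)
print(best_order, best_aic)
```

In practice, `toy_fit` would be replaced by a call that estimates each ARIMA-GARCH specification on the differenced or original series and returns its maximized log-likelihood.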

4.3 Comparison of results

Once the parameters were estimated, the 5-minute forecast horizon was established. Figures 4.3.1, 4.3.2, and 4.3.3 present the forecasts made by each model as well as the results of the quality measures for samples 1, 2, and 3, respectively.

Figure 4.3.1 FORECAST RESULTS FOR SAMPLE 1 

Sample 1 - Downtrend. Observations: 1135. Models: ARIMA (1, 1, 1) GARCH (1, 0); SSA with L = T/4.

| Observations | ARIMA-GARCH forecast | SSA forecast |
|---|---|---|
| 1.12922 | 1.129090 | 1.129181 |
| 1.12930 | 1.129080 | 1.129194 |
| 1.12928 | 1.129070 | 1.129226 |
| 1.12926 | 1.129060 | 1.129274 |
| 1.12921 | 1.129050 | 1.129330 |

| Quality measure | ARIMA-GARCH | SSA |
|---|---|---|
| MAPE | 0.01628% | 0.005898% |
| RMSE | 0.000187 | 0.000078 |
| THEIL | 0.000083 | 0.000034 |

Source: Prepared by the authors from data obtained through Marketscope 2.0®.
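As a check on how the quality measures relate to the forecasts, the Python sketch below recomputes MAPE, RMSE, and Theil's coefficient from the five observations and forecasts reported for sample 1 in Figure 4.3.1. The function names are assumptions. The form of Theil's coefficient used here, RMSE divided by the sum of the root mean squares of the observed and forecast values, is also an assumption, although it yields values that agree with the figure to the reported precision. Small differences, notably in the ARIMA-GARCH MAPE, are expected because the forecasts in the figure are rounded.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def theil_u(actual, forecast):
    """Theil's inequality coefficient (assumed form, bounded between 0 and 1)."""
    n = len(actual)
    rms_actual = math.sqrt(sum(a * a for a in actual) / n)
    rms_forecast = math.sqrt(sum(f * f for f in forecast) / n)
    return rmse(actual, forecast) / (rms_actual + rms_forecast)


# Values copied from Figure 4.3.1 (sample 1, downtrend).
observed = [1.12922, 1.12930, 1.12928, 1.12926, 1.12921]
arima_garch = [1.129090, 1.129080, 1.129070, 1.129060, 1.129050]
ssa = [1.129181, 1.129194, 1.129226, 1.129274, 1.129330]

for name, forecast in (("ARIMA-GARCH", arima_garch), ("SSA", ssa)):
    print(name, mape(observed, forecast), rmse(observed, forecast), theil_u(observed, forecast))
```

The same functions can be applied to the values in Figures 4.3.2 and 4.3.3 to compare the two techniques for the other samples.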

Figure 4.3.2 FORECAST RESULTS FOR SAMPLE 2 

Sample 2 - Uptrend. Observations: 1466. Models: ARIMA (5, 1, 1) GARCH (1, 0); SSA with L = T/11.

| Observations | ARIMA-GARCH forecast | SSA forecast |
|---|---|---|
| 1.13589 | 1.136392 | 1.135931 |
| 1.13606 | 1.136406 | 1.135931 |
| 1.13575 | 1.136390 | 1.135920 |
| 1.13586 | 1.136372 | 1.135857 |
| 1.13596 | 1.136371 | 1.135836 |

| Quality measure | ARIMA-GARCH | SSA |
|---|---|---|
| MAPE | 0.042457% | 0.008645% |
| RMSE | 0.000492 | 0.000113 |
| THEIL | 0.000217 | 0.000050 |

Source: Prepared by the authors from data obtained through Marketscope 2.0®.

Figure 4.3.3 FORECAST RESULTS FOR SAMPLE 3 

Sample 3 - Range. Observations: 1681. Models: ARIMA (4, 0, 1) GARCH (1, 1); SSA with L = T/2.

| Observations | ARIMA-GARCH forecast | SSA forecast |
|---|---|---|
| 1.12631 | 1.126359 | 1.126352 |
| 1.12639 | 1.126355 | 1.126379 |
| 1.12651 | 1.126348 | 1.126406 |
| 1.12648 | 1.126351 | 1.126435 |
| 1.12626 | 1.126348 | 1.126465 |

| Quality measure | ARIMA-GARCH | SSA |
|---|---|---|
| MAPE | 0.008221% | 0.007227% |
| RMSE | 0.000104 | 0.000107 |
| THEIL | 0.000046 | 0.000047 |

Source: Prepared by the authors from data obtained through Marketscope 2.0®.

According to the results shown in the figures, the forecasts made by the SSA were the closest to the original observations in the three cases in which the technique was applied, and the SSA also obtained the best values for all quality measures, except in sample 3, in which RMSE and Theil's inequality coefficient (TIC, reported as THEIL in the figures) were slightly better for ARIMA-GARCH. We therefore conclude that the SSA forecasts, with respect to the exchange rates in this period, are more adequate than those obtained by the ARIMA-GARCH model regardless of the market movement, that is, in the uptrend, in the downtrend, and when there was no well-defined trend.

5. FINAL CONSIDERATIONS

The objective of this work was to evaluate which model of the EUR/USD exchange rate (closing ask price) provides more satisfactory forecasts when using high-frequency (one-minute) data. To this end, the research was divided into two parts: the estimation of the parameters of each technique and the use of the estimated models to predict the series over a 5-minute horizon.

The results showed that the forecasts were more satisfactory when performed through the SSA. Only for the series with no well-defined trend (sample 3) did the ARIMA-GARCH model present lower prediction errors, and only for the RMSE and TIC metrics; its MAPE was still higher than that of the SSA. These results agree with those in the literature (e.g., Esquivel et al., 2013), which highlight the greater accuracy of the SSA when compared to other time-series models. In addition, according to Hassani (2007), the SSA makes no assumptions about the data and, unlike other methods, can be applied to small samples, which is an advantage of the method.

It is important to note that the two techniques, the SSA and the classic ARIMA-GARCH model, produce forecasts of different kinds. In the first case, we seek to identify and break up the series into its trend component, regular components, and noise. The noise component is then eliminated, and the predictions are made. In the second case, noise is assumed to be a stochastic error with conditional variance, and we seek to model it along with the other autoregressive and seasonal components.

Regarding the limitations of the research, it should be mentioned that the samples used in the modeling and forecasting of the exchange rate have different sizes. This can be a relevant factor when estimating the parameters of the ARIMA-GARCH model. According to Ng and Lam (2006), sample size may lead the estimation to different locally optimal solutions, and the samples selected here are relatively small.

Another issue to consider is the period from which the samples were selected. If the observations were from another time of year, would the predictions made by the classical model still be less satisfactory? This is a pertinent question since the FOREX market presents different volatilities in different periods of the year and even in periods within the same day.

In view of the results found and the limitations of this work, we suggest the following subjects for further research: predicting the volatility of the EUR/USD exchange rate; assessing the impact of volatility differences across periods on the model forecasts; predicting the returns and/or the volatility of returns in the Brazilian stock market; and using the SSA for forecasting with dynamic samples (i.e., rolling observations) and for specifying a non-parametric value-at-risk model.

4 SSA is not a model, but rather a spectral estimation method. For teaching purposes, however, the terms “SSA model” and “SSA modeling” will be used in this work to refer to the spectral estimates made.

References

Abraham, A. (2005). Hybrid soft and hard computing based Forex monitoring systems. In N. Nedjah & L. de M. Mourelle (Eds.), Fuzzy systems engineering (pp. 113-129). Berlin: Springer. doi:10.1007/11339366_5

Alamili, M. (2011). Exchange rate prediction using support vector machines. Doctoral thesis, Delft University of Technology, Delft, Netherlands.

Allen, M. R., & Smith, L. A. (1996). Monte Carlo SSA: Detecting irregular oscillations in the presence of colored noise. Journal of Climate, 9(12), 3373-3404. doi:10.1175/1520-0442(1996)009<3373:MCSDIO>2.0.CO;2

Asai, M., McAleer, M., & Yu, J. (2006). Multivariate stochastic volatility: A review. Econometric Reviews, 25(2-3), 145-175. doi:10.1080/07474930600713564

Atkočiūnas, V., Mačiulis, N., Klimavičienė, A., & Kalendienė, J. (2010). Short-term currency exchange rate forecasting with econometric models. Doctoral thesis, ISM Vadybos ir ekonomikos universitetas, Vilnius, Lithuania.

Bank for International Settlements (2013). Triennial Central Bank Survey. Foreign exchange turnover in April 2013: Preliminary global results. Retrieved from http://www.bis.org/publ/rpfx13fx.pdf

Bauwens, L., Laurent, S., & Rombouts, J. V. (2006). Multivariate GARCH models: A survey. Journal of Applied Econometrics, 21(1), 79-109. doi:10.1002/jae.842

Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307-327. doi:10.1016/0304-4076(86)90063-1

Box, G. E., & Jenkins, G. M. (1976). Time series analysis: Forecasting and control. San Francisco: Holden-Day.

Brooks, C. (2008). Introductory econometrics for finance. Cambridge: Cambridge University.

Broomhead, D. S., Jones, R., King, G. P., & Pike, E. R. (1987). Singular system analysis with application to dynamical systems. In E. R. Pike & L. A. Lugiato (Eds.), Chaos, noise and fractals (pp. 15-27). Boca Raton, FL: CRC Press.

Broomhead, D. S., & King, G. P. (1986). Extracting qualitative dynamics from experimental data. Physica D: Nonlinear Phenomena, 20(2-3), 217-236. doi:10.1016/0167-2789(86)90031-X

Chaudhuri, T. D., & Ghosh, I. (2016). Artificial neural network and time series modeling based approach to forecasting the exchange rate in a multivariate framework. Journal of Insurance and Financial Management, 1(5), 92-123. doi:arxiv.org/abs/1607.02093

Dacorogna, M. M., Müller, U. A., Pictet, O. V., & Vries, C. G. (2001). Extremal forex returns in extremely large data sets. Extremes, 4(2), 105-127. doi:10.1023/A:1013917009089

Danilov, D., & Zhigljavsky, A. (1997). Principal components of time series: The “Caterpillar” method. St. Petersburg: University of St. Petersburg.

Elsner, J. B., & Tsonis, A. A. (1996). Singular spectrum analysis: A new tool in time series analysis. New York: Plenum Press.

Esquivel, R. M., Senna, V., & Gomes, G. S. S. (2013). Análise espectral singular: Comparação de previsões em séries temporais. Revista ADM. MADE, 16(2), 87-101.

Ghil, M., & Vautard, R. (1991). Interdecadal oscillations and the warming trend in global temperature time series. Nature, 350(6316), 324-327. doi:10.1038/350324a0

Golyandina, N., Nekrutkin, V., & Zhigljavsky, A. A. (2001). Analysis of time series structure: SSA and related techniques. Boca Raton: Chapman & Hall, CRC.

Groff, G. K. (1973). Empirical comparison of models for short range forecasting. Management Science, 20(1), 22-31. doi:10.1287/mnsc.20.1.22

Hassani, H. (2007). Singular spectrum analysis: Methodology and comparison. Journal of Data Science, 5, 239-257.

Hassani, H. (2010). Singular spectrum analysis based on the minimum variance estimator. Nonlinear Analysis: Real World Applications, 11(3), 2065-2077. doi:10.1016/j.nonrwa.2009.05.009

Hassani, H., & Thomakos, D. (2010). A review on singular spectrum analysis for economic and financial time series. Statistics and its Interface, 3(3), 377-397. doi:10.4310/SII.2010.v3.n3.a11

Hassani, H., & Zhigljavsky, A. (2009). Singular spectrum analysis: Methodology and application to economics data. Journal of Systems Science and Complexity, 22(3), 372-394. doi:10.1007/s11424-009-9171-9

Lai, K. K., Yu, L., & Wang, S. (2005). A neural network and web-based decision support system for forex forecasting and trading. In Y. Shi, W. Xu, & Z. Chen (Eds.), Data mining and knowledge management (pp. 243-253). Berlin: Springer. doi:10.1007/978-3-540-30537-8_27

Makridakis, S., & Hibon, M. (1997). ARMA models and the Box-Jenkins methodology. Journal of Forecasting, 16(3), 147-163. doi:10.1002/(SICI)1099-131X(199705)16:3<147::AID-FOR652>3.0.CO;2-X

Makridakis, S., Hibon, M., & Moser, C. (1979). Accuracy of forecasting: An empirical investigation. Journal of the Royal Statistical Society. Series A (General), 142(2), 97-145. doi:10.2307/2345077

Menezes, M. L. D., Cassiano, K. M., Souza, R. M. D., Teixeira Júnior, L. A., Pessanha, J. F. M., & Souza, R. C. (2014). Modelagem e previsão de demanda de energia com filtragem SSA. Revista da Estatística da Universidade Federal de Ouro Preto, 3(2), 170-187.

Murphy, J. J. (1999). Technical analysis of the financial markets: A comprehensive guide to trading methods and applications. New York: New York Institute of Finance.

Ng, H. S., & Lam, K. P. (2006). How does sample size affect GARCH models? Proceedings of the 2006 Joint Conference on Information Sciences, Kaohsiung, Taiwan. Retrieved from https://www.researchgate.net/publication/221556756_How_does_Sample_Size_Affect_GARCH_Models

Pan, V. Y., Rami, Y., & Wang, X. (2002). Structured matrices and Newton’s iteration: Unified approach. Linear Algebra and its Applications, 343, 233-265. doi:10.1016/S0024-3795(01)00336-6

Ravi, V., Lal, R., & Kiran, N. R. (2012). Foreign exchange rate prediction using computational intelligence methods. International Journal of Computer Information Systems and Industrial Management Applications, 4, 659-670.

Rodríguez-Aragón, L. J., & Zhigljavsky, A. (2010). Singular spectrum analysis for image processing. Statistics and Its Interface, 3(3), 419-426. doi:10.4310/SII.2010.v3.n3.a14

Royer, J. C., Wilhelm, V. E., & Patias, J. (2015). Previsão de séries temporais de subpressão de barragens com filtragem SSA e regressão múltipla com modelagem ARIMA dos resíduos. Congresso de Métodos Numéricos em Engenharia, Lisboa, Portugal.

Sanei, S., & Hassani, H. (2015). Singular spectrum analysis of biomedical signals. Boca Raton: CRC Press.

Vautard, R., & Ghil, M. (1989). Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time series. Physica D: Nonlinear Phenomena, 35(3), 395-424. doi:10.1016/0167-2789(89)90077-8

Vautard, R., Yiou, P., & Ghil, M. (1992). Singular-spectrum analysis: A toolkit for short, noisy chaotic signals. Physica D: Nonlinear Phenomena, 58(1-4), 95-126. doi:10.1016/0167-2789(92)90103-T

World Trade Organization (2014). International Trade Statistics. Retrieved from https://www.wto.org/english/res_e/statis_e/its2014_e/its14_toc_e.htm

Yao, J., & Tan, C. L. (2000). A case study on using neural networks to perform technical forecasting of forex. Neurocomputing, 34(1-4), 79-98. doi:10.1016/S0925-2312(00)00300-3

Received: August 31, 2018; Accepted: December 17, 2018

Correspondence concerning this article should be addressed to Rafael J. Abreu, Rua Joaquim Borges, 97, ap. 101, Jardim Aeroporto, Alfenas, Minas Gerais, Brazil, CEP 37130-810. E-mail: abreu.fael@gmail.com

Rafael J. Abreu, Faculdade de Ciências Econômicas (Face), Universidade Federal de Minas Gerais (UFMG); Rafael M. Souza, Departamento de Engenharia Elétrica, Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio); & Joice G. Oliveira, Departamento de Administração e Contabilidade, Universidade Federal de Viçosa (UFV).

Rafael J. Abreu is now commercial analyst at Ipanema Comercial Exportadora e Importadora Ltda.; Rafael M. Souza is now associate professor at Department of Accounting Sciences at Federal University of Minas Gerais (UFMG); & Joice G. Oliveira is now doctoral student in controllership and accounting at Department of Accounting Sciences at Federal University of Minas Gerais (UFMG).

Creative Commons License This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited