
Granger causality in the frequency domain: derivation and applications

Abstract

Physicists are starting to work in areas where noisy signal analysis is required. In fields such as Economics, Neuroscience, and Physics, the notion of causality should be interpreted as a statistical measure. We introduce to the lay reader the Granger causality between two time series and illustrate ways of calculating it: a signal X “Granger-causes” a signal Y if the observation of the past of X increases the predictability of the future of Y when compared to the same prediction done with the past of Y alone. In other words, for Granger causality between two quantities it suffices that information extracted from the past of one of them improves the forecast of the future of the other, even in the absence of any physical mechanism of interaction. We present derivations of the Granger causality measure in the time and frequency domains and give numerical examples using a non-parametric estimation method in the frequency domain. Parametric methods are addressed in the Appendix. We discuss the limitations and applications of this method and other alternatives to measure causality.

Keywords:
Granger causality; autoregressive process; conditional Granger causality; non-parametric estimation

1. Introduction

The notion of causality has been the concern of thinkers at least since the ancient Greeks [1]. More recently, Clive Granger [2], in his 1969 paper entitled “Investigating Causal Relations by Econometric Models and Cross-spectral Methods”, elaborated a mathematical framework to describe a form of causality, henceforth called Granger causality (GC, also referred to as Wiener-Granger causality) in order to distinguish it from other definitions of causality. Given two stochastic variables, X(t) and Y(t), there is a causal relationship (in the sense of Granger) between them if the past observations of Y help to predict the current state of X, and vice-versa. If so, then we say that Y Granger-causes X. Granger was inspired by the definition of causality from Norbert Wiener [3], in which Y causes X if knowing the past of Y increases the efficacy of the prediction of the current state X(t) when compared to the prediction of X(t) by the past values of X alone. (Other notions of causality have been defined; one worth mentioning is Pearl's causality [4], which has been revised by Pearl and colleagues in a series of published works [5, 6].)

In the multidisciplinary science era, more and more physicists are involved in research in other areas, such as Economics and Neuroscience, areas that usually produce large data sets. Data analysis tools such as GC come in handy to extract meaningful knowledge from these sets.

Causality inference via GC has been widely applied in different areas of science, such as: prediction of financial time series [7-9], earth systems [10], atmospheric systems [11], solar indices [12], turbulence [12, 13], inference of information flow in the brains of different animals [14-18], and inference of functional networks of the brain using fMRI [19, 20], MEG [21], and EEG [22]. It appears as an alternative to measures like linear correlation [23], mutual information [24, 25], partial directed coherence [26], ordinary coherence [27], the directed transfer function [28], spectral coherence [29], and transfer entropy [30, 31], and it is usually easier to calculate since it does not rely on the estimation of probability density functions of one or more variables.

The definition of GC involves the prediction of future values of stochastic time series (see Fig. 1). The GC between variables may be measured in both the time and the frequency domains [26, 32-34].

Figure 1
When time series Y(t) Granger-causes time series X(t), the patterns in Y(t) are approximately repeated in X(t) after some time lag (two examples are indicated with arrows). Thus, past values of Y can be used for the prediction of future values of X.

In the present work, we will focus on the frequency domain representation of GC [26, 32] and, for pedagogical purposes, will discuss illustrative examples from previous works by other authors [26, 34]. Our main goal is to provide a basic notion of the GC measure to a reader not yet introduced to this subject.

This work is organized as follows: in Section 2, we present the concept of an autoregressive process, the linear regression model on which GC is based (it is also possible to formulate GC for nonlinear systems, but such a formulation results in a more complex analysis that is beyond the scope of this work [35, 36]). Sections 3 and 4 develop the mathematical concepts and definitions of GC in the time and frequency domains, respectively. In Section 5.1, we introduce the nonparametric method to estimate GC through Fourier and wavelet transforms [34]. In Section 6 we introduce examples of the conditional GC (cGC) used to determine known links between the elements of a simple network. We then close the paper by discussing applications, implications and limitations of the method.

2. Autoregressive process

Autoregressive processes form the basis for the parametric estimation of GC, so in this section we introduce the reader to the basic concepts of such processes [37]. A process X(t) is autoregressive of order n (denoted AR(n)) if its state at time t is a function of its n past states:

(1) $X(t) = \sum_{i=1}^{n} a_i X(t-i) + \epsilon(t),$

where t is the integer time step and the real coefficients $a_i$ indicate the weighted contribution of the state i steps in the past to the current state of X. The term $\epsilon(t)$ is a noise source with variance Σ that models any external additive contribution to the determination of X(t). If Σ is large, then the process is weakly dependent on its past states and X(t) may be regarded as just noise. Fig. 2 shows examples of an AR(2) (a) and an AR(4) (b) process.
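
As an illustration, the following minimal Python sketch simulates the AR(2) process of Fig. 2a directly from equation (1) (the noise variance Σ = 1 and the simulation length are our own choices):

    import numpy as np

    rng = np.random.default_rng(0)
    a = [0.3, 0.5]                   # AR(2) coefficients a_1, a_2 from Fig. 2a
    T = 1000                         # number of time steps (our choice)
    eps = rng.normal(0.0, 1.0, T)    # noise source with variance Sigma = 1 (our choice)
    X = np.zeros(T)
    for t in range(len(a), T):
        # equation (1): X(t) = sum_i a_i X(t-i) + eps(t)
        X[t] = sum(a[i] * X[t - 1 - i] for i in range(len(a))) + eps[t]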

Figure 2
Autoregressive processes. (a) Time series of an AR(2) process with coefficients $(a_1, a_2) = (0.3, 0.5)$. (b) Time series of an AR(4) process with coefficients $(a_1, a_2, a_3, a_4) = (0.2, 0.5, 0.6, 0.2)$.

Fitting the autoregressive coefficients $a_i$ and the noise variance Σ to a recorded signal is usually done by solving the Yule-Walker equations [15, 38]. For a brief review of this topic, see Section A of the Appendix.
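
A minimal sketch of such a fit, assuming a zero-mean stationary signal and a biased autocovariance estimator (the function name is our own):

    import numpy as np
    from scipy.linalg import solve_toeplitz

    def yule_walker(x, order):
        """Estimate AR coefficients a_i and noise variance Sigma
        from a 1-D signal x by solving the Yule-Walker equations."""
        x = np.asarray(x) - np.mean(x)
        n = len(x)
        # biased autocovariance estimates r[k], k = 0..order
        r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
        a = solve_toeplitz(r[:-1], r[1:])   # symmetric Toeplitz system
        sigma = r[0] - np.dot(a, r[1:])     # residual (noise) variance
        return a, sigma

Applied to a long realization of the AR(2) process above, yule_walker(X, 2) should return coefficients close to (0.3, 0.5) and a variance close to 1.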

3. Granger causality in the time domain

In this section we develop the mathematical concepts and definitions of GC in the time domain. Consider two stochastic signals, $X_1(t)$ and $X_2(t)$. We first assume that these signals may be modeled by autoregressive stochastic processes of order n that are independent of each other, such that their states at time t are estimated from their n past values:

(2) $X_1(t) = \sum_{i=1}^{n} a_i X_1(t-i) + \epsilon_1(t),$
(3) $X_2(t) = \sum_{i=1}^{n} d_i X_2(t-i) + \epsilon_2(t),$

where the variances of $\epsilon_1$ and $\epsilon_2$ are, respectively, $\Sigma_{11}$ and $\Sigma_{22}$, and the coefficients $a_i$ and $d_i$ are adjusted in order to minimize $\Sigma_{11}$ and $\Sigma_{22}$.

However, we may also assume that each signal is modeled by a combination of its own past and the past of the other signal, yielding

(4) $X_1(t) = \sum_{i=1}^{n} a_i X_1(t-i) + \sum_{i=1}^{n} b_i X_2(t-i) + \epsilon'_1(t),$
(5) $X_2(t) = \sum_{i=1}^{n} c_i X_1(t-i) + \sum_{i=1}^{n} d_i X_2(t-i) + \epsilon'_2(t),$

where the covariance matrix is given by

(6) $\boldsymbol{\Sigma}' = \begin{bmatrix} \Sigma'_{11} & \Sigma'_{12} \\ \Sigma'_{21} & \Sigma'_{22} \end{bmatrix}.$

Here, $\Sigma'_{11}$ and $\Sigma'_{22}$ are the variances of $\epsilon'_1$ and $\epsilon'_2$, respectively, and $\Sigma'_{12} = \Sigma'_{21}$ is the covariance of $\epsilon'_1$ and $\epsilon'_2$. Again, the coefficients $a_i$, $b_i$, $c_i$, and $d_i$ are adjusted to minimize the variances $\Sigma'_{11}$ and $\Sigma'_{22}$ (they are refitted, so in general they differ from the coefficients in equations (2) and (3)).

If $\Sigma'_{11} < \Sigma_{11}$, then the addition of the past of $X_2(t)$ to the model of $X_1(t)$ generated a better fit to $X_1(t)$, and thus enhanced its predictability. In this sense, we may say there is a causal relation from $X_2$ to $X_1$, or simply that $X_2$ Granger-causes $X_1$. The same applies to the other signal: if $\Sigma'_{22} < \Sigma_{22}$, then $X_1$ Granger-causes $X_2$, because adding the past of $X_1$ to the dynamics of $X_2$ enhanced its predictability.

We may summarize this concept in the definition of the total causality index, given by

(7) $F_{1.2} = \log\left(\frac{\Sigma_{11}\,\Sigma_{22}}{\det(\boldsymbol{\Sigma}')}\right) = \log\left(\frac{\Sigma_{11}\,\Sigma_{22}}{\Sigma'_{11}\Sigma'_{22} - (\Sigma'_{12})^2}\right).$

If $F_{1.2} > 0$, there is some dependence between $X_1$ and $X_2$: either a Granger-causal relation, because $\Sigma'_{11} < \Sigma_{11}$ or $\Sigma'_{22} < \Sigma_{22}$, or an instantaneous correlation, due to $\Sigma'_{12} \neq 0$. If neither Granger causality nor instantaneous correlation is present, then $F_{1.2} = 0$.

To know specifically whether there is Granger causality from 1 to 2 or from 2 to 1, we may use the specific indices:

(8) $F_{1\to 2} = \log\left(\frac{\Sigma_{22}}{\Sigma'_{22}}\right),$
(9) $F_{2\to 1} = \log\left(\frac{\Sigma_{11}}{\Sigma'_{11}}\right),$
(10) $F_{1\cdot 2} = \log\left(\frac{\Sigma'_{11}\,\Sigma'_{22}}{\det(\boldsymbol{\Sigma}')}\right),$

such that

(11) $F_{1.2} = F_{1\to 2} + F_{2\to 1} + F_{1\cdot 2},$

where $F_{1\to 2}$ defines the causality from $X_1(t)$ to $X_2(t)$, $F_{2\to 1}$ the causality from $X_2(t)$ to $X_1(t)$, and $F_{1\cdot 2}$ is called the instantaneous causality, due to correlations between $\epsilon'_1$ and $\epsilon'_2$. Just as for the total causality, these specific indices are greater than zero if the corresponding Granger causality is present, and zero otherwise.
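
As an illustration, the sketch below estimates $F_{2\to 1}$ and $F_{1\to 2}$ by fitting the restricted models, equations (2) and (3), and the full models, equations (4) and (5), with ordinary least squares (helper names are our own):

    import numpy as np

    def residual_variance(y, regressors):
        """Least-squares fit of y on the lagged regressors; returns
        the variance of the residuals."""
        beta, *_ = np.linalg.lstsq(regressors, y, rcond=None)
        return np.var(y - regressors @ beta)

    def granger_time_domain(x1, x2, n):
        """Equations (8) and (9): returns (F_{2->1}, F_{1->2}) for AR order n."""
        T = len(x1)
        # matrices of lagged values; row t holds x(t-1), ..., x(t-n)
        L1 = np.column_stack([x1[n - i:T - i] for i in range(1, n + 1)])
        L2 = np.column_stack([x2[n - i:T - i] for i in range(1, n + 1)])
        y1, y2 = x1[n:], x2[n:]
        S11  = residual_variance(y1, L1)                    # Sigma_11
        S11p = residual_variance(y1, np.hstack([L1, L2]))   # Sigma'_11
        S22  = residual_variance(y2, L2)                    # Sigma_22
        S22p = residual_variance(y2, np.hstack([L1, L2]))   # Sigma'_22
        return np.log(S11 / S11p), np.log(S22 / S22p)       # F_{2->1}, F_{1->2}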

4. Granger causality in the frequency domain

In order to derive the GC in the frequency domain, we first define the lag operator $L^k$,

(12) $L^k X(t) = X(t-k),$

which delays X(t) by k time steps, yielding $X(t-k)$. We may then rewrite equations (4) and (5) (dropping, from here on, the primes on the bivariate-model quantities to simplify the notation) as:

(13) $X_1(t) = \left(\sum_{i=1}^{n} a_i L^i\right) X_1(t) + \left(\sum_{i=1}^{n} b_i L^i\right) X_2(t) + \epsilon_1(t),$
(14) $X_2(t) = \left(\sum_{i=1}^{n} c_i L^i\right) X_1(t) + \left(\sum_{i=1}^{n} d_i L^i\right) X_2(t) + \epsilon_2(t),$

and rearrange their terms to collect X1(t) and X2(t):

(15) $\left(1 - \sum_{i=1}^{n} a_i L^i\right) X_1(t) - \left(\sum_{i=1}^{n} b_i L^i\right) X_2(t) = \epsilon_1(t),$
(16) $-\left(\sum_{i=1}^{n} c_i L^i\right) X_1(t) + \left(1 - \sum_{i=1}^{n} d_i L^i\right) X_2(t) = \epsilon_2(t).$

We define the coefficients $a(L) = 1 - \sum_{i=1}^{n} a_i L^i$, $b(L) = -\sum_{i=1}^{n} b_i L^i$, $c(L) = -\sum_{i=1}^{n} c_i L^i$ and $d(L) = 1 - \sum_{i=1}^{n} d_i L^i$, and rewrite equations (15) and (16) in matrix form:

(17) $\begin{pmatrix} a(L) & b(L) \\ c(L) & d(L) \end{pmatrix} \begin{pmatrix} X_1(t) \\ X_2(t) \end{pmatrix} = \begin{pmatrix} \epsilon_1(t) \\ \epsilon_2(t) \end{pmatrix},$

where a(0)=d(0)=1 and b(0)=c(0)=0.

We apply the Fourier transform to equation (17) in order to switch to the frequency domain,

(18) $\underbrace{\begin{pmatrix} \tilde{a}(\omega) & \tilde{b}(\omega) \\ \tilde{c}(\omega) & \tilde{d}(\omega) \end{pmatrix}}_{\mathbf{A}(\omega)} \underbrace{\begin{pmatrix} X_1(\omega) \\ X_2(\omega) \end{pmatrix}}_{\mathbf{X}(\omega)} = \underbrace{\begin{pmatrix} \epsilon_1(\omega) \\ \epsilon_2(\omega) \end{pmatrix}}_{\boldsymbol{\epsilon}(\omega)},$

where ω is the frequency and $\mathbf{A}(\omega)$ is the coefficient matrix, whose elements are given by

$\tilde{a}(\omega) = 1 - \sum_{i=1}^{n} a_i e^{-j\omega i}, \quad \tilde{b}(\omega) = -\sum_{i=1}^{n} b_i e^{-j\omega i}, \quad \tilde{c}(\omega) = -\sum_{i=1}^{n} c_i e^{-j\omega i}, \quad \tilde{d}(\omega) = 1 - \sum_{i=1}^{n} d_i e^{-j\omega i}.$

The expressions above are obtained by representing the lag operator in the spectral domain as $L^i = e^{-j\omega i}$. This derives from the z-transform, where the representation of the z variable on the unit circle ($|z| = 1$) is $z^{-i} = e^{-j\omega i}$ [40, 41]. (The lag operator L is similar to the z-transform; however, z is treated as a variable and is often used in signal processing, while L is an operator [39].)

To obtain the power spectra of $X_1(\omega)$ and $X_2(\omega)$, we first isolate $\mathbf{X}(\omega)$ in equation (18):

(19) $\begin{pmatrix} X_1(\omega) \\ X_2(\omega) \end{pmatrix} = \underbrace{\begin{pmatrix} H_{11}(\omega) & H_{12}(\omega) \\ H_{21}(\omega) & H_{22}(\omega) \end{pmatrix}}_{\mathbf{H}(\omega)} \begin{pmatrix} \epsilon_1(\omega) \\ \epsilon_2(\omega) \end{pmatrix},$

where $\mathbf{H}(\omega) = \mathbf{A}^{-1}(\omega)$ is called the transfer matrix, resulting in the following spectra:

(20) $\mathbf{S}(\omega) = \langle \mathbf{X}(\omega)\,\mathbf{X}^{\dagger}(\omega) \rangle = \mathbf{H}(\omega)\,\boldsymbol{\Sigma}\,\mathbf{H}^{\dagger}(\omega),$

where $\langle\cdot\rangle$ is the ensemble average, $\dagger$ denotes the transposed conjugate of the matrix, and $\mathbf{S}(\omega)$ is the spectral matrix, defined as:

(21) $\mathbf{S}(\omega) = \begin{bmatrix} S_{11}(\omega) & S_{12}(\omega) \\ S_{21}(\omega) & S_{22}(\omega) \end{bmatrix}.$

In equation (21), S11(ω) and S22(ω) are called the autospectra, and the elements S12(ω) and S21(ω) are called the cross-spectra.
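
For a known bivariate AR model, equations (18) to (20) can be evaluated directly. The following minimal sketch (names are our own) builds $\mathbf{A}(\omega)$ from the lag coefficient matrices $A_i$ (with entries $a_i$, $b_i$, $c_i$, $d_i$), inverts it to obtain $\mathbf{H}(\omega)$, and forms $\mathbf{S}(\omega)$:

    import numpy as np

    def spectral_matrix_parametric(A_lags, Sigma, freqs, fs):
        """Equations (18)-(20): A(w), H(w) = A(w)^{-1} and
        S(w) = H(w) Sigma H(w)^dagger for frequencies in Hz."""
        m = Sigma.shape[0]
        S = np.empty((len(freqs), m, m), dtype=complex)
        H = np.empty_like(S)
        for k, f in enumerate(freqs):
            w = 2 * np.pi * f / fs                    # frequency in rad/sample
            A = np.eye(m, dtype=complex)
            for i, Ai in enumerate(A_lags, start=1):  # A(w) = I - sum_i A_i e^{-jwi}
                A -= Ai * np.exp(-1j * w * i)
            H[k] = np.linalg.inv(A)                   # transfer matrix
            S[k] = H[k] @ Sigma @ H[k].conj().T       # spectral matrix
        return S, H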

We can expand the product in equation (20) to obtain S11(ω) and S22(ω) (see Section B of the Appendix for details) as:

(22) $S_{11}(\omega) = \underbrace{\bar{H}_{11}(\omega)\,\Sigma_{11}\,\bar{H}_{11}^{*}(\omega)}_{\text{Intrinsic}} + \underbrace{H_{12}(\omega)\left(\Sigma_{22} - \frac{\Sigma_{12}^2}{\Sigma_{11}}\right)H_{12}^{*}(\omega)}_{\text{Causal}},$
(23) $S_{22}(\omega) = \underbrace{\hat{H}_{22}(\omega)\,\Sigma_{22}\,\hat{H}_{22}^{*}(\omega)}_{\text{Intrinsic}} + \underbrace{H_{21}(\omega)\left(\Sigma_{11} - \frac{\Sigma_{12}^2}{\Sigma_{22}}\right)H_{21}^{*}(\omega)}_{\text{Causal}},$

where the bar and hat symbols differentiate the transformed transfer functions below from the original $H_{11}$ and $H_{22}$:

$\bar{H}_{11}(\omega) = H_{11}(\omega) + \frac{\Sigma_{12}}{\Sigma_{11}} H_{12}(\omega), \qquad \hat{H}_{22}(\omega) = H_{22}(\omega) + \frac{\Sigma_{12}}{\Sigma_{22}} H_{21}(\omega).$

Once we have written the $S_{11}(\omega)$ and $S_{22}(\omega)$ spectra as the sum of an intrinsic and a causal term, we may define indices to quantify GC in the frequency domain, just as we did in the time domain (Section 3). To compute the causal index, we divide each autospectrum by its intrinsic term in order to eliminate the intrinsic influence. Thus, the causality index $I_{2\to 1}(\omega)$ is defined as:

(24) $I_{2\to 1}(\omega) = \log\left(\frac{S_{11}(\omega)}{\bar{H}_{11}(\omega)\,\Sigma_{11}\,\bar{H}_{11}^{*}(\omega)}\right),$

and, analogously, $I_{1\to 2}(\omega)$:

(25) $I_{1\to 2}(\omega) = \log\left(\frac{S_{22}(\omega)}{\hat{H}_{22}(\omega)\,\Sigma_{22}\,\hat{H}_{22}^{*}(\omega)}\right).$

The instantaneous causality index $I_{1\cdot 2}(\omega)$ is defined as:

(26) $I_{1\cdot 2}(\omega) = \log\left(\frac{\left[\bar{H}_{11}(\omega)\,\Sigma_{11}\,\bar{H}_{11}^{*}(\omega)\right]\left[\hat{H}_{22}(\omega)\,\Sigma_{22}\,\hat{H}_{22}^{*}(\omega)\right]}{\det(\mathbf{S}(\omega))}\right).$

In equations (24) to (26) there is one index for each value ω of the frequency, whereas in the time domain there was a single index for the GC between the two signals $X_1$ and $X_2$. Just as discussed in Section 3, the indices $I_{2\to 1}(\omega)$, $I_{1\to 2}(\omega)$ and $I_{1\cdot 2}(\omega)$ are greater than zero if the corresponding relation between the time series is present, and zero otherwise.
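
Given $\mathbf{S}(\omega)$, $\mathbf{H}(\omega)$ and $\boldsymbol{\Sigma}$ (from the parametric sketch above or from the nonparametric factorization of Section 5), equations (22) to (26) translate almost line by line into code. A minimal sketch for the bivariate case:

    import numpy as np

    def gc_indices(S, H, Sigma):
        """Spectral GC indices of equations (24)-(26); S and H have
        shape (n_freqs, 2, 2), Sigma is the 2x2 noise covariance."""
        H11, H12 = H[:, 0, 0], H[:, 0, 1]
        H21, H22 = H[:, 1, 0], H[:, 1, 1]
        Hb11 = H11 + (Sigma[0, 1] / Sigma[0, 0]) * H12     # \bar{H}_11
        Hh22 = H22 + (Sigma[0, 1] / Sigma[1, 1]) * H21     # \hat{H}_22
        intr1 = (Hb11 * Sigma[0, 0] * Hb11.conj()).real    # intrinsic part of S11
        intr2 = (Hh22 * Sigma[1, 1] * Hh22.conj()).real    # intrinsic part of S22
        S11, S22 = S[:, 0, 0].real, S[:, 1, 1].real
        detS = np.linalg.det(S).real
        I21 = np.log(S11 / intr1)             # eq. (24), GC from 2 to 1
        I12 = np.log(S22 / intr2)             # eq. (25), GC from 1 to 2
        Iinst = np.log(intr1 * intr2 / detS)  # eq. (26), instantaneous
        return I21, I12, Iinst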

Just like in the time domain, the total GC in the frequency domain is the sum of its individual components:

(27) $I(\omega) = I_{2\to 1}(\omega) + I_{1\to 2}(\omega) + I_{1\cdot 2}(\omega) = \log\left(\frac{S_{11}(\omega)\,S_{22}(\omega)}{\det(\mathbf{S}(\omega))}\right).$

The total GC is related to the so-called coherence C12(ω) between signals (see Section C of the Appendix):

(28) $I(\omega) = \log\left(\frac{1}{1 - C_{12}(\omega)}\right).$

Moreover, we recover the GC in the time domain through [15, 34]:

(29) $F_{i\to j} = \frac{1}{\omega_f - \omega_0} \int_{\omega_0}^{\omega_f} I_{i\to j}(\omega)\, d\omega.$
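
Numerically, equation (29) is just an average of the spectral index over the estimated frequency grid; a minimal sketch:

    import numpy as np

    def band_average(I, freqs):
        """Equation (29): time-domain GC as the mean of the spectral
        index over the band [freqs[0], freqs[-1]]."""
        return np.trapz(I, freqs) / (freqs[-1] - freqs[0])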

5. Estimating Granger causality from data

In the last two sections we defined GC mathematically in both the time and frequency domains. Here, we discuss how to calculate GC in practice. In Section 5.1, we address a nonparametric estimation method that involves computing the Fourier and wavelet transforms of $X_1(t)$ and $X_2(t)$ [42-44]. In Section D of the Appendix, we address the parametric estimation of GC, which involves fitting the signals $X_1(t)$ and $X_2(t)$ to autoregressive models (Section 2).

5.1. Calculating GC through Fourier and Wavelet Transforms

Here, we give a numerical example of calculating and interpreting GC using a nonparametric estimation approach based on Fourier and wavelet transforms [34]. Our example consists of calculating the spectral matrix $\mathbf{S}(\omega)$ through the Fourier transform of the signals. For two stationary signals $X_1(t)$ and $X_2(t)$ (stationarity refers to time-shift invariance of the underlying process statistics, which implies that all statistical moments are constant over time [45]; there are several types of stationarity, and the conditions required here, for defining power spectral densities, are that the mean is constant and that the covariance between any two samples a given time lag apart does not depend on their absolute position in time), the i,j element of the spectral matrix in equation (21) is

(30) $S_{ij}(\omega) = \frac{\langle \tilde{X}_i(\omega)\,\tilde{X}_j^{*}(\omega) \rangle}{T}, \quad i = 1, 2 \text{ and } j = 1, 2,$

where T is the total duration of the signal, $\tilde{X}_i(\omega)$ is the discrete Fourier transform of $X_i(t)$ (calculated by a fast Fourier transform, FFT, algorithm), $\tilde{X}_j^{*}(\omega)$ is the complex conjugate of $\tilde{X}_j(\omega)$, and $\langle\cdot\rangle$ denotes the ensemble average (in practice, an average over trials).

The variable ω contains the values of the frequency in the interval $[0, f_{max}]$ where the FFT was calculated. If Δt is the sampling time interval of the original signals, then the sampling frequency is $f_s = 1/\Delta t$ and $f_{max} = f_s/2$, and the frequency interval contains $n_\omega = 1 + T/(2\Delta t)$ points. Then, for m signals (m = 2 in our example), we have a total of $n_\omega$ spectral matrices $\mathbf{S}$ of dimensions m×m. Recall that the diagonal elements of $\mathbf{S}$ are called the autospectra, whereas the off-diagonal elements are called the cross-spectra.
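
A minimal sketch of this estimate, assuming the trials are stacked in an array X of shape (n_trials, m, n_samples) and using the one-sided FFT (normalization conventions vary across references; here we follow equation (30)):

    import numpy as np

    def spectral_matrix_fft(X, fs):
        """Equation (30): spectral matrix estimated by averaging FFT
        cross-products over trials; returns freqs and S of shape
        (n_freqs, m, m)."""
        n_trials, m, n = X.shape
        T = n / fs                              # total duration in seconds
        Xf = np.fft.rfft(X, axis=-1)            # (n_trials, m, n_freqs)
        freqs = np.fft.rfftfreq(n, d=1.0 / fs)  # 0 .. fs/2
        # S_ij(w) = <X_i(w) X_j*(w)> / T, ensemble average over trials
        S = np.einsum('rif,rjf->fij', Xf, Xf.conj()) / (n_trials * T)
        return freqs, S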

The transfer matrix $\mathbf{H}(\omega)$ and the covariance matrix $\boldsymbol{\Sigma}$ are obtained by decomposing $\mathbf{S}(\omega)$ into the product of equation (20). Wilson's algorithm [34, 46, 47] (see also Section E of the Appendix) may be used for this decomposition of spectral matrices.

After determining these two matrices, we may calculate the GC indices through the direct application of equations (24) to (26).

For example, consider the autoregressive system studied in Ref. [34], given by:

(31) $X_1(t) = 0.55\, X_1(t-1) - 0.8\, X_1(t-2) + C\, X_2(t-1) + \epsilon_1(t),$
$X_2(t) = 0.55\, X_2(t-1) - 0.8\, X_2(t-2) + \epsilon_2(t).$

Here, $X_1(t)$ and $X_2(t)$ are AR(2) processes. The variable t is the integer time step index, such that the actual time is $t\Delta t = t/f_s$. Moreover, we know by construction that $X_2(t)$ influences $X_1(t)$ through the coupling constant C (while the opposite does not happen). The terms $\epsilon_1(t)$ and $\epsilon_2(t)$ are defined to have variances $\Sigma_{11} = \Sigma_{22} = 1$ and covariance $\Sigma_{12} = 0$ (they are independent random processes). To obtain a smooth power spectrum, we simulated 5000 trials of the system in equation (31) and computed the average spectra across trials. We set the parameters as C = 0.25, $f_s = 200$ Hz and T = 25 s, resulting in 5000 data points per trial.
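
A minimal sketch of this simulation (the seed is arbitrary; reduce the number of trials if memory is a concern):

    import numpy as np

    def simulate_system31(C=0.25, fs=200, T=25.0, n_trials=5000, seed=0):
        """Equation (31): returns X of shape (n_trials, 2, n_samples)."""
        rng = np.random.default_rng(seed)
        n = int(T * fs)
        eps = rng.normal(0.0, 1.0, (n_trials, 2, n))  # Sigma_11 = Sigma_22 = 1
        X = np.zeros((n_trials, 2, n))
        for t in range(2, n):
            X[:, 0, t] = (0.55 * X[:, 0, t - 1] - 0.8 * X[:, 0, t - 2]
                          + C * X[:, 1, t - 1] + eps[:, 0, t])
            X[:, 1, t] = (0.55 * X[:, 1, t - 1] - 0.8 * X[:, 1, t - 2]
                          + eps[:, 1, t])
        return X

Feeding the output to spectral_matrix_fft above yields the averaged spectral matrix of Fig. 3a.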

When C = 0, the processes $X_1(t)$ and $X_2(t)$ are independent and both oscillate mainly at 40 Hz (Fig. 3a). For C > 0, the process $X_1$ receives input from $X_2$, generating a causal coupling that is captured by the GC indices in equations (24) and (25): a peak at 40 Hz in $I_{2\to 1}$ indicates that process 2 (which oscillates at 40 Hz) drives process 1 at this very frequency (Fig. 3c). The flat spectrum of $I_{1\to 2}$ indicates that, on the other hand, process 2 does not receive input from process 1. The magnitude of C changes the intensity of the GC peak. The instantaneous causality index $I_{1\cdot 2}(\omega) = 0$ for all ω, from equation (26), because $\Sigma_{12} = 0$. The total GC in the system is obtained from the spectral coherence, equation (28); however, only the specific GC indices reveal the directionality of the interactions between 1 and 2.

Figure 3
GC of a pair of autoregressive processes. GC for the system given in equation (31): by construction, process 2 causes process 1 by providing it input through the coupling constant C = 0.25. Parameters: total time T = 25 s and sampling frequency $f_s = 200$ Hz, resulting in 5000 time steps. a. Spectral matrix components calculated via equation (30). b. Coherence between signals 1 and 2, equation (28). c. GC from 2 to 1 and from 1 to 2, equations (24) and (25): a peak at 40 Hz in the $I_{2\to 1}$ index indicates that 2 Granger-causes 1, whereas the flat zero $I_{1\to 2}$ shows, as expected, that 1 does not influence 2. The peak is at 40 Hz because process 2 has its main power at this frequency (see panel a).

This simple example illustrates the meaning of causality in the GC framework: a Granger-causal link is present if a process runs under the temporal influence of the past of another signal. We could have assumed C to be a time-varying function C(t), chosen different parameters for the autoregressive part of each process, or used processes of different orders, implying complex individual power spectra. These scenarios are more usual in real-world applications [43, 48]. Then, instead of observing a clear peak in the GC indices, we could observe a more complex pattern, with peaks that vary in time.

Instead of using the Fourier transform (which yields a single static spectrum for the whole signal), we may use the wavelet transform [43, 49, 50] to obtain time-varying spectra [51, 52]. Then, the auto- and cross-spectra from equation (30) may be written as

(32) $S_{ij}(t, \omega) = \frac{\langle W_i(t, \omega)\,W_j^{*}(t, \omega) \rangle}{T},$

where $W_i(t, \omega)$ is the wavelet transform of $X_i(t)$ and $W_j^{*}(t, \omega)$ is its complex conjugate. To compute the wavelet transform, we use a Morlet kernel [43] with scale s = 6 oscillation cycles within a wavelet, a typical value for this parameter [53]. Similarly to what we did for the power spectrum, we compute the wavelet transforms of 5000 trials of the system in equation (31) in order to average the results. It is important to stress that the wavelet transform is applicable in this case because ensemble averages are being taken; otherwise, the estimates would be too unreliable for any meaningful inference.
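
A minimal sketch of the Morlet transform by direct convolution (the kernel support of ±4 standard deviations and the unit-energy normalization are our own choices; library implementations differ in such conventions):

    import numpy as np

    def morlet_cwt(x, freqs, fs, w=6.0):
        """Complex Morlet wavelet transform of a 1-D signal x,
        one row per frequency in Hz; w is the number of cycles."""
        n = len(x)
        W = np.empty((len(freqs), n), dtype=complex)
        for k, f in enumerate(freqs):
            s = w * fs / (2 * np.pi * f)          # wavelet scale in samples
            t = np.arange(-4 * s, 4 * s + 1)      # kernel support (our choice)
            kernel = np.exp(1j * w * t / s) * np.exp(-t**2 / (2 * s**2))
            kernel /= np.sqrt(np.sum(np.abs(kernel)**2))   # unit energy
            W[k] = np.convolve(x, kernel, mode='same')
        return W

The auto- and cross-spectra of equation (32) then follow by averaging $W_i(t,\omega)\,W_j^{*}(t,\omega)$ over trials.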

In practice, there is one matrix $\mathbf{S}$ for each pair (t, ω); or, more intuitively, we have $n_T = T/\Delta t$ matrices $\mathbf{S}(\omega)$, one for each time step t. The decomposition of each $\mathbf{S}(\omega)$ as in equation (20) is done through Wilson's algorithm. Then, we may calculate the GC indices via equations (24) to (26) for each matrix $\mathbf{S}(t, \omega)$ with fixed t. This calculation results in $I_{2\to 1}(\omega)$, $I_{1\to 2}(\omega)$ and $I_{1\cdot 2}(\omega)$ for each time step t. Finally, we concatenate these spectra along the temporal dimension, yielding $I_{2\to 1}(t, \omega)$, $I_{1\to 2}(t, \omega)$ and $I_{1\cdot 2}(t, \omega)$.

For example, consider the same pair of processes of equation (31), but with a time-varying coupling $C(t) = 0.25\,H(t_0 - t)$, where H(x) = 1 for x ≥ 0 (zero otherwise) is the Heaviside step function. The parameter $t_0$ is the time step at which the coupling from 2 to 1 is turned off. This scenario is equivalent to having a set of concatenated constant-C processes, such that the processes with $t > t_0$ have C = 0. Then, we expect the analysis in Fig. 3 to be valid for all time steps $t < t_0$, and no coupling should be detected for $t > t_0$.

That is exactly what is shown in Fig. 4: a sharp transition in $I_{2\to 1}(t, \omega)$ happens exactly at $t = t_0$, when C is turned off. The index $I_{1\to 2}(t, \omega)$ remains zero throughout the simulation. Again, this illustrates the meaning of GC in our system: whenever there is a directional coupling from one variable to another, there is nonzero GC in that link; in this example, from signal $X_2(t)$ to $X_1(t)$.

Figure 4
Time-varying GC in the frequency domain. GC of the system defined in equation (31), but with time-varying $C(t) = 0.25\,H(t_0 - t)$. The spectral matrix is calculated via the wavelet transform, equation (32), and decomposed for each time step t, yielding a temporal decomposition of the frequencies of the signals. a. Coupling constant as a function of time. b. GC index from 2 to 1, $I_{2\to 1}(t, \omega)$. c. GC index from 1 to 2, $I_{1\to 2}(t, \omega)$.

6. Conditional Granger Causality

The concepts developed so far may be extended to a system with m variables. In this case, in order to infer the directionality of the interaction between two signals (i.e., whether i Granger-causes j or vice-versa), we may use the so-called conditional Granger causality (cGC) [15, 33, 54, 55]. The idea is to infer the GC between signals i and j given the knowledge of all the other m − 2 signals of the system. This is done by comparing the variances obtained considering only i and j with the variances obtained considering all the signals in the system. The AR model from equations (4) and (5) then has a total of m variables.

We may write the cGC in the time domain as

(33) $F_{i\to j\,|\,k, \ldots, m},$

or in the frequency domain as

(34) $I_{i\to j\,|\,k, \ldots, m}(\omega).$

But one may ask: “Isn't it simpler to just calculate the standard GC between every pair of signals in the system, always reducing the problem to a two-variable case?”

To answer that question, consider the case depicted in Fig. 5a: node 1 ($X_1(t)$) sends input to node 2 ($X_2(t)$) with a delay $\delta_{12}$ and to node 3 ($X_3(t)$) with a delay $\delta_{13}$. Measuring the pairwise GC between $X_2(t)$ and $X_3(t)$ suggests the existence of a coupling between them even though none physically exists (as in Fig. 5b). This occurs because the signals $X_2(t)$ and $X_3(t)$ are correlated due to their common input from $X_1(t)$, and the simple pairwise GC between $X_2(t)$ and $X_3(t)$ fails to represent the correct relationship between the three nodes of Fig. 5a. The cGC solves this issue by considering the contribution of a third signal ($X_1(t)$ in this example) to the analyzed pair ($X_2(t)$ and $X_3(t)$), as described below.

Figure 5
A system that pairwise GC fails to describe. a. Node 1 ($X_1(t)$) sends input to node 2 ($X_2(t)$) with delay $\delta_{12}$ and to node 3 ($X_3(t)$) with delay $\delta_{13}$. b. A simple GC calculation wrongly infers a link from $X_2(t)$ to $X_3(t)$ if $\delta_{12} < \delta_{13}$, or from $X_3(t)$ to $X_2(t)$ if $\delta_{13} < \delta_{12}$. These links are not physically present in the system and appear only due to the cross-correlation between $X_2(t)$ and $X_3(t)$ caused by the common input from $X_1(t)$.

To describe the system in Fig. 5, equation (19) may be written as

(35) $\begin{pmatrix} X_1(\omega) \\ X_2(\omega) \\ X_3(\omega) \end{pmatrix} = \begin{pmatrix} H_{11}(\omega) & H_{12}(\omega) & H_{13}(\omega) \\ H_{21}(\omega) & H_{22}(\omega) & H_{23}(\omega) \\ H_{31}(\omega) & H_{32}(\omega) & H_{33}(\omega) \end{pmatrix} \begin{pmatrix} \epsilon_1(\omega) \\ \epsilon_2(\omega) \\ \epsilon_3(\omega) \end{pmatrix},$

where $X_3(t)$ has the noise term $\epsilon_3(t)$ with variance $\Sigma_{33}$. The corresponding spectral matrix $\mathbf{S}(\omega)$ is

(36) $\mathbf{S}(\omega) = \begin{bmatrix} S_{11}(\omega) & S_{12}(\omega) & S_{13}(\omega) \\ S_{21}(\omega) & S_{22}(\omega) & S_{23}(\omega) \\ S_{31}(\omega) & S_{32}(\omega) & S_{33}(\omega) \end{bmatrix},$

and the noise covariance matrix is

(37) $\boldsymbol{\Sigma} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} & \Sigma_{13} \\ \Sigma_{21} & \Sigma_{22} & \Sigma_{23} \\ \Sigma_{31} & \Sigma_{32} & \Sigma_{33} \end{bmatrix}.$

We want to calculate the cGC from $X_2(t)$ to $X_3(t)$ given $X_1(t)$, i.e., $F_{2\to 3|1}$ in the time domain and $I_{2\to 3|1}(\omega)$ in the frequency domain. The first step is to build a partial system from equation (35), ignoring the coefficients related to the probe signal $X_2(t)$, which results in the partial spectral matrix $\mathbf{S}^{p}(\omega)$:

(38) $\mathbf{S}^{p}(\omega) = \begin{bmatrix} S_{11}(\omega) & S_{13}(\omega) \\ S_{31}(\omega) & S_{33}(\omega) \end{bmatrix}.$

We can then factorize both $\mathbf{S}(\omega)$ and $\mathbf{S}^{p}(\omega)$ using the nonparametric method already discussed above. Suppose that from $\mathbf{S}(\omega)$ we obtain the transfer matrix $\mathbf{H}(\omega)$ and the covariance matrix $\boldsymbol{\Sigma}$ (equation (37)), whereas from $\mathbf{S}^{p}(\omega)$ we obtain the transfer matrix $\mathbf{G}(\omega)$ and the covariance matrix $\boldsymbol{\rho}$:

(39) $\boldsymbol{\rho} = \begin{bmatrix} \rho_{11} & \rho_{13} \\ \rho_{31} & \rho_{33} \end{bmatrix}.$

The matrices $\mathbf{H}(\omega)$ and $\boldsymbol{\Sigma}$ are 3×3. The matrices $\mathbf{G}(\omega)$ and $\boldsymbol{\rho}$ always have one dimension less than the original ones, because they are built from the leftover rows and columns of the original system, without the coefficients of the probe signal.

In the time domain, $F_{2\to 3|1}$ is defined as

(40) $F_{2\to 3|1} = \log\left(\frac{\rho_{33}}{\Sigma_{33}}\right),$

or, in general,

(41) $F_{i\to j|k} = \log\left(\frac{\rho_{jj}}{\Sigma_{jj}}\right),$

which gives the cGC from i to j given k in the time domain. Note that if the link between i and j is totally mediated by k, then $\rho_{jj} = \Sigma_{jj}$, yielding $F_{i\to j|k} = 0$, whereas the standard pairwise GC between i and j would still report a link between these variables. For our example in Fig. 5, we obtain $F_{2\to 3|1} \approx 0$, meaning that the apparent influence of $X_2(t)$ on $X_3(t)$ disappears when conditioned on the signal $X_1(t)$.

In the frequency domain, we first define the transfer matrix $\mathbf{Q}(\omega) = \mathbf{G}(\omega)^{-1}\mathbf{H}(\omega)$. However, the dimensions of the matrix $\mathbf{G}(\omega)$ do not match the dimensions of the matrix $\mathbf{H}(\omega)$. To fix that, we insert rows and columns of an identity matrix at the positions that were removed from the total system in equation (35) when we built the partial system (i.e., the rows and columns corresponding to the probe signal $X_2(t)$ that was removed to generate $\mathbf{S}^{p}(\omega)$), such that:

(42) $\mathbf{G}(\omega) = \begin{bmatrix} G_{11}(\omega) & G_{13}(\omega) \\ G_{31}(\omega) & G_{33}(\omega) \end{bmatrix} \rightarrow \begin{bmatrix} G_{11}(\omega) & 0 & G_{13}(\omega) \\ 0 & 1 & 0 \\ G_{31}(\omega) & 0 & G_{33}(\omega) \end{bmatrix}.$

We can now safely calculate $\mathbf{Q}(\omega) = \mathbf{G}(\omega)^{-1}\mathbf{H}(\omega)$, from which we obtain $I_{2\to 3|1}(\omega)$:

(43) $I_{2\to 3|1}(\omega) = \log\left(\frac{\rho_{33}}{\left|Q_{33}(\omega)\,\Sigma_{33}\,Q_{33}^{*}(\omega)\right|}\right),$

or, in general,

(44) $I_{i\to j|k}(\omega) = \log\left(\frac{\rho_{jj}}{\left|Q_{jj}(\omega)\,\Sigma_{jj}\,Q_{jj}^{*}(\omega)\right|}\right).$
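
A minimal sketch of equations (42) to (44) for this three-variable example (array layouts and names are our own; H, Σ, G and ρ are assumed to come from Wilson's factorization of $\mathbf{S}(\omega)$ and $\mathbf{S}^{p}(\omega)$, respectively):

    import numpy as np

    def conditional_gc_2_to_3_given_1(H, Sigma, G, rho):
        """I_{2->3|1}(w): H has shape (n_freqs, 3, 3), Sigma is 3x3;
        G has shape (n_freqs, 2, 2) for the partial system (X1, X3),
        rho is its 2x2 covariance (rho[1, 1] stores rho_33)."""
        n_freqs = H.shape[0]
        I = np.empty(n_freqs)
        for k in range(n_freqs):
            Gexp = np.eye(3, dtype=complex)        # eq. (42): embed identity
            Gexp[np.ix_([0, 2], [0, 2])] = G[k]    # row/column 1 is the probe X2
            Q = np.linalg.inv(Gexp) @ H[k]         # Q(w) = G(w)^{-1} H(w)
            denom = np.abs(Q[2, 2] * Sigma[2, 2] * np.conj(Q[2, 2]))
            I[k] = np.log(rho[1, 1].real / denom)  # eq. (43)/(44) with j = 3
        return I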

To illustrate the procedure for determining the cGC, consider the system defined in Fig. 6, composed of 5 interacting elements [26] (Baccalá and Sameshima [26] analysed this system using partial directed coherence and the directed transfer function):

(45) $X_1(t) = 0.95\sqrt{2}\, X_1(t-1) - 0.9025\, X_1(t-2) + \epsilon_1(t)$
$X_2(t) = 0.5\, X_1(t-2) + \epsilon_2(t)$
$X_3(t) = -0.4\, X_1(t-3) + \epsilon_3(t)$
$X_4(t) = -0.5\, X_1(t-2) + 0.25\sqrt{2}\, X_4(t-1) + 0.25\sqrt{2}\, X_5(t-1) + \epsilon_4(t)$
$X_5(t) = -0.25\sqrt{2}\, X_4(t-1) + 0.25\sqrt{2}\, X_5(t-1) + \epsilon_5(t).$
Figure 6
System of many interacting components. Illustration of a system with 5 interacting signals with physical relations between them. We want to check whether GC or cGC is capable of capturing these interactions.

Here, $X_1(t)$ sends its signal to $X_2(t)$, $X_3(t)$ and $X_4(t)$ with coupling intensities 0.5, −0.4 and −0.5, respectively. Also, $X_4(t)$ sends input to $X_5(t)$ and vice-versa, with couplings $-0.25\sqrt{2}$ and $0.25\sqrt{2}$, respectively. Note that $X_1$ sends signals to $X_2$ and $X_4$ with 2 time steps of delay, and to $X_3$ with 3 time steps of delay. $X_4$ and $X_5$ exchange signals with only 1 time step of delay.
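
A minimal simulation sketch of equation (45) (trial count and seed are our own choices):

    import numpy as np

    def simulate_system45(n=5000, n_trials=1000, seed=0):
        """Equation (45): returns X of shape (n_trials, 5, n)."""
        rng = np.random.default_rng(seed)
        e = rng.normal(0.0, 1.0, (n_trials, 5, n))  # unit-variance noises
        X = np.zeros((n_trials, 5, n))
        r2 = 0.25 * np.sqrt(2)
        for t in range(3, n):
            X[:, 0, t] = (0.95 * np.sqrt(2) * X[:, 0, t - 1]
                          - 0.9025 * X[:, 0, t - 2] + e[:, 0, t])
            X[:, 1, t] = 0.5 * X[:, 0, t - 2] + e[:, 1, t]
            X[:, 2, t] = -0.4 * X[:, 0, t - 3] + e[:, 2, t]
            X[:, 3, t] = (-0.5 * X[:, 0, t - 2] + r2 * X[:, 3, t - 1]
                          + r2 * X[:, 4, t - 1] + e[:, 3, t])
            X[:, 4, t] = -r2 * X[:, 3, t - 1] + r2 * X[:, 4, t - 1] + e[:, 4, t]
        return X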

Calculating the cGC indices through equations (41) and (44), we recover the expected structure of the network (Figs. 7a and 8, respectively). The gray shades in Fig. 7 and the amplitudes of the peaks in Fig. 8 are proportional to the coupling constants between each pair of elements. For comparison, Fig. 7b shows the simple pairwise GC, which detects connections that are not physically present in the system. Again, this occurs because the hierarchy of the network generates correlations between many pairs of signals that are not directly connected, as discussed in the example of Fig. 5.

Figure 7
cGC and GC matrices in the time domain for a system of many interacting components. The system is given by equation (45) and is depicted in Fig. 6. a. The conditional GC, equation (41), captures exactly the physical interactions of the system. b. The simple pairwise GC, equation (29), captures the interactions, but also captures underlying correlations coming from the hierarchy of the network. Real systems often do not yield a cGC matrix as clean as that of panel a, due to second-order effects. The covariance matrices $\boldsymbol{\Sigma}$ and $\boldsymbol{\rho}$ were calculated using the nonparametric method.
Figure 8
cGC and GC matrices in the frequency domain for a system of many interacting components. The same system as depicted in Fig. 6. Each panel in row i and column j corresponds to the cGC index between elements i and j in the frequency domain, equation (44).

It is important to note that the cGC connectivity does not always reflect the underlying physical (or structural) connectivity between elements [35]. The example system in Fig. 6 is an illustrative simple case in which we obtained a neat result. However, real-world applications, such as inferring neuronal connectivity from brain signals, result in a noisier cGC matrix due to multiple incoming signals and multiple delays. Thus, cGC is most generally said to measure “functional” connectivity, rather than structural connectivity.

7. Conclusion

Granger causality is becoming increasingly popular as a method to determine the dependence between signals in many areas of science. We presented its mathematical formulation and showed examples of its application in systems of interacting signals. This article also gives a contemporary scientific application of the Fourier transform, a subject that is studied in theoretical physics courses but usually lacks practical applications in the classroom. We also used wavelet transforms, which may motivate students to learn more about the decomposition of signals in the time and frequency domains, and about its limitations through the uncertainty principle.

We showed numerical examples and explained them in an algorithmic way. We included the inference of steady and time-varying couplings, and the inference of connectivity in hierarchical networks via the conditional GC. A limitation of GC is that it is ultimately based on linear regression models of stochastic processes (the AR models introduced in Section 2). Other measures, such as transfer entropy, are more suitable for describing nonlinear interactions and do not need to be fitted to an underlying model. Even though the nonparametric estimation of GC does not rely on model fitting, GC is still a measure of linear interaction. It is also possible to show that GC and transfer entropy yield the same results for Gaussian variables [56].

In spite of the existing debate about what exactly GC captures, especially in the neuroscience community [57, 58], GC has become a well-established measure of the flow of information in the nervous system [59]. Here, we hope to have provided the necessary tools to those who wish to learn the basic principles and applications underlying GC.

Acknowledgments

This article was produced as part of the São Paulo Research Foundation (FAPESP) Research, Innovation and Dissemination Center for Neuromathematics (CEPID NeuroMat, Grant No. 2013/07699-0). The authors also thank FAPESP for support through Grants No. 2013/25667-8 (R.F.O.P.), 2015/50122-0 (A.C.R.), 2016/03855-5 (N.L.K.), 2017/07688-9 (R.O.S.), 2018/20277-0 (A.C.R.) and 2018/09150-9 (M.G.-S.). V.L. is supported by a CAPES PhD scholarship. A.C.R. thanks the National Council of Scientific and Technological Development (CNPq) for financial support, Grant No. 306251/2014-0. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

Supplementary material

The following online material is available for this article:

Appendix A Appendix B Appendix C Appendix D Appendix E

All codes were developed in Python and are available at: https://github.com/ViniciusLima94/pyGC.

References

  • [1] A. Falcon, in: The Stanford Encyclopedia of Philosophy, edited by E.N. Zalta (Stanford University, Palo Alto, 2019).
  • [2] C.W.J. Granger, Econometrica 37, 424 (1969).
  • [3] N. Wiener, in: Modern Mathematics for Engineers edited by E.F. Beckenbach (McGraw-Hill, New York, 1956), v. 1.
  • [4] J. Pearl, Biometrika 82, 669 (1995).
  • [5] J.Y. Halpern and J. Pearl, Br. J. Philos. Sci. 56, 843 (2005).
  • [6] J. Halpern, in: 24th International Joint Conference on Artificial Intelligence (Buenos Aires, 2015).
  • [7] T. Vỳrost, Š. Lyócsa and E. Baumöhl, Physica A 427, 262 (2015).
  • [8] B. Candelon and S. Tokpavi, J. Bus. Econ. Stat. 34, 240 (2016).
  • [9] H. Ding, H. Kim and S.Y. Park, Energ. Econ. 59, 58 (2016).
  • [10] J. Runge, S. Bathiany, E. Bollt, G. Camps-Valls, D. Coumou, E. Deyle, C. Glymour, M. Kretschmer, M.D. Mahecha, J. Muñoz-Marí et al., Nat. Commun. 10, 1 (2019).
  • [11] D.A. Smirnov and I.I. Mokhov, Phys. Rev. E 80, 016208 (2009).
  • [12] P.O. Amblard and O.J.J. Michel, Entropy 15, 113 (2013).
  • [13] G. Tissot, A. Lozano–Durán, L. Cordier, J. Jiménez and B.R. Noack, J. Phys. Conf. Ser. 506, 012006 (2014).
  • [14] A. Brovelli, M. Ding, A. Ledberg, Y. Chen, R. Nakamura and S.L. Bressler, P. Natl. A. Sci. 101, 9849 (2004).
  • [15] M. Ding, Y. Chen and S.L. Bressler, in: Handbook of Time Series Analysis: Recent Theoretical Developments and Applications, edited by B. Schelter, M. Winterhalder and J. Timmer (Wiley-VCH, Weinheim, 2006).
  • [16] F.S. Matias, L.L. Gollo, P.V. Carelli, S.L. Bressler, M. Copelli and C.R. Mirasso, Neuroimage 99, 411 (2014).
  • [17] S. Hu, H. Wang, J. Zhang, W. Kong, Y. Cao and R. Kozma, IEEE T. Neur. Net. Lear. 27, 1429 (2015).
  • [18] A.K. Seth, A.B. Barrett and L. Barnett, J. Neurosci. 35, 3293 (2015).
  • [19] M. Havlicek, J. Jan, M. Brazdil and V.D. Calhoun, Neuroimage 53, 65 (2010).
  • [20] W. Liao, D. Mantini, Z. Zhang, Z. Pan, J. Ding, Q. Gong, Y. Yang and H. Chen, Biol. Cybern. 102, 57 (2010).
  • [21] L. Pollonini, U. Patidar, N. Situ, R. Rezaie, A.C. Papanicolaou and G. Zouridakis, in: 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology (Buenos Aires, 2010).
  • [22] A.B. Barrett, M. Murphy, M.A. Bruno, Q. Noirhomme, M. Boly, S. Laureys and A.K. Seth, PLoS ONE 7, e29072 (2012).
  • [23] M.E. Lynall, D.S. Bassett, R. Kerwin, P.J. McKenna, M. Kitzbichler, U. Muller and E. Bullmore, J. Neurosci. 30, 9477 (2010).
  • [24] S.H. Jin, P. Lin, S. Auh and M. Hallett, Movement Disord. 26, 1274 (2011).
  • [25] V. Lima, R.F.O. Pena, C.A.C. Ceballos, R.O. Shimoura and A.C. Roque, Rev. Bras. Ensino. Fis. 41, e20180197 (2019).
  • [26] L.A. Baccalá and K. Sameshima, Biol. Cybern. 84, 463 (2001).
  • [27] P. Fries, Trends Cogn. Sci. 9, 474 (2005).
  • [28] M.J. Kamiński and K.J. Blinowska, Biol. Cybern. 65, 203 (1991).
  • [29] D. La Rocca, P. Campisi, B. Vegso, P. Cserti, G. Kozmann, F. Babiloni and F. De Vico Fallani, IEEE T. Bio-Med. Eng. 61, 2406 (2014).
  • [30] T. Schreiber, Phys. Rev. Lett. 85, 461 (2000).
  • [31] R. Vicente, M. Wibral, M. Lindner and G. Pipa, J. Comput. Neurosci. 30, 45 (2011).
  • [32] J. Geweke, J. Am. Stat. Assoc. 77, 304 (1982).
  • [33] J.F. Geweke, J. Am. Stat. Assoc. 79, 907 (1984).
  • [34] M. Dhamala, G. Rangarajan and M. Ding, Phys. Rev. Lett. 100, 018701 (2008).
  • [35] A. Seth, Scholarpedia 2, 1667 (2007).
  • [36] D. Marinazzo, W. Liao, H. Chen and S. Stramaglia, Neuroimage 58, 330 (2011).
  • [37] G.E.P. Box, G.M. Jenkins, G.C. Reinsel and G.M. Ljung, Time Series Analysis: Forecasting and Control (Wiley, Hoboken, 2015), 5th ed.
  • [38] G. Eshel, Internet resource 2, 68 (2003).
  • [39] W. Van Drongelen, Signal Processing for Neuroscientists (Academic Press, London, 2018), 2nd ed.
  • [40] R. Takalo, H. Hytti and H. Ihalainen, J. Clin. Monit. Comput. 19, 401 (2005).
  • [41] R. Meddins, Introduction to Digital Signal Processing (Newnes, Oxford, 2000), 1st ed.
  • [42] I. Daubechies, Ten Lectures on Wavelets (Society for Industrial and Applied Mathematics, Philadelphia, 1992).
  • [43] C. Torrence and G.P. Compo, Bull. Amer. Meteor. Soc. 79, 61 (1998).
  • [44] P. Stoica and R.L. Moses, Spectral Analysis of Signals (Pearson Prentice Hall, Upper Saddle River, 2005).
  • [45] W.A. Woyczyński, A First Course in Statistics for Signal Analysis (Birkhäuser, Boston, 2019), 3rd ed.
  • [46] G.T. Wilson, SIAM J. Appl. Math. 23, 420 (1972).
  • [47] G.T. Wilson, J. Multivariate Anal. 8, 222 (1978).
  • [48] P. Holme and J. Saramäki, Phys. Rep. 519, 97 (2012).
  • [49] A.D. Poularikas, Transforms and Applications Handbook (CRC Press, Florida, 2010).
  • [50] P.S. Addison, The Illustrated Wavelet Transform Handbook: Introductory Theory and Applications in Science, Engineering, Medicine and Finance (CRC Press, Florida, 2017).
  • [51] M.J.A. Bolzan, Rev. Bras. Ensino. Fis. 28, 563 (2006).
  • [52] M.O. Domingues, O. Mendes, M.K. Kaibara, V.E. Menconi and E. Bernardes, Rev. Bras. Ensino. Fis. 58, 228 (2016).
  • [53] M. Farge, Annu. Rev. Fluid Mech. 24, 395 (1992).
  • [54] Y. Chen, S.L. Bressler and M. Ding, J. Neurosci. Meth. 150, 228 (2006).
  • [55] S. Malekpour and W.A. Sethares, Biol. Cybern. 109, 627 (2015).
  • [56] L. Barnett, A.B. Barrett and A.K. Seth, Phys. Rev. Lett. 103, 238701 (2009).
  • [57] P.A. Stokes and P.L. Purdon, P. Natl. A. Sci. 114, E7063 (2017).
  • [58] L. Barnett, A.B. Barrett and A.K. Seth, Neuroimage 178, 744 (2018).
  • [59] S.L. Bressler and A.K. Seth, Neuroimage 58, 323 (2011).

Publication Dates

  • Publication in this collection
    14 Sept 2020
  • Date of issue
    2020

History

  • Received
    03 Jan 2020
  • Reviewed
    25 June 2020
  • Accepted
    27 July 2020