Acessibilidade / Reportar erro

INTRODUCING A CAUSAL PAR( p ) MODEL TO EVALUATE THE INFLUENCE OF CLIMATE VARIABLES IN RESERVOIR INFLOWS: A BRAZILIAN CASE

ABSTRACT

The Brazilian electricity energy matrix is essentially formed by hydraulic sources which currently account for 70% of the installed capacity. One of the most important characteristics of a generation system with hydro predominance is the strong dependence on the inflow regimes. Nowadays, the Brazilian power sector uses the PAR(p) model to generate scenarios for hydrological inflows. This approach does not consider any exogenous information that may affect hydrological regimes. The main objective of this paper is to infer on the influence of climatic events in water inflows as a way to improve the model’s performance. The proposed model is called “causal PAR(p)” and considers exogenous variables, such as El Niño and Sunspots, to generate scenarios for some Brazilian reservoirs. The result shows that the error measures decrease approximately 3%. This improvement indicates that the inclusion of climate variables to model and simulate the inflows time series is a valid exercise and should be taken into consideration.

Keywords:
Reservoir inflow modelling; Periodic models; Climate predictors

1 INTRODUCTION

The Brazilian electricity generation is mainly composed by hydroelectric plants owned by multiple players; the Brazilian National Interconnected System (NIS) integrate power and trasmission lines from the South, Southeast, Midwest, Northeast and part of the North region. Only 1.7% of the country’s electricity production capacity relies outside the NIS, in small isolated systems located mainly in the Amazon region2121 ONS. 2014. October. Operador Nacional do Sistema Elétrico. www.ons.com.br
www.ons.com.br...
.

Planning the Brazilian energy sector means, basically, making decisions about the dispatch of hydroelectric and thermoelectric plants, with the risk of financial losses or energy rationing, as happened such strong in 20012525 PINGUELLI LR, FIDELIS NS, GIANNINI MP & DIAS LL. 2013. Evolution of Global Electricity Markets: New Paradigms, New Challenges, New Approaches. Academic Press. 435-459., affecting almost all Brazilian regions.

One of the main characteristics of the hydraulic generation system is its strong dependence onhydrological regimes. Thus, the dispatch operation planning has to define generation goals for both hydroelectric and thermal plants along the study horizon, considering the electricity demand, the plants and electrical operating constraints2424 PEREIRA MVF. 1989. Optimal stochastic operations scheduling of large hydroeletric systems.International Journal of Eletric Power and Energy Systems, 11: 161-169..

Considering the dependence on the hydrological regimes, the existing uncertainty of the Brazilian power planning requires an appropriate and consistent stochastic modelling of hydrological series. Therefore, it is possible to identify how important it is to build models to generatehydrological scenarios, in order to optimize the system operation performance, adding reliability to the system and reducing its costs1818 OLIVEIRA FLC. 2010. Nova abordagem para geração de cenários de afluências no planejamento da operação energética de médio prazo. Pontifícia Universidade Católica do Rio de Janeiro.. This optimization process has a stochastic variable: natural inflow.

There are, basically, two approaches to predict the natural inflow: physical and statistical models, where the first one includes the rainfall-runoff hydrological model and the second covers data-driven methods such as time series. To perform monthly forecasts and simulation the classical Periodic Autoregressive (p) model2929 TERRY LA, PEREIRA MVF, NETO TA, SILVA LFCA & SALES PRH. 1986. Coordinating the Energy Generation of the Brazilian National Hydrothermal Electrical Generating System. Interfaces, 16: 16 p., has been widely used. This type of model adjusts the series using the estimated parameters of the historical data1313 MACEIRA MEP & DAMÁZIO JM. 2006. Use of the Par(p) Model in the Stochastic Dual Dynamic Programming Optimization Scheme Used in the Operation Planning of the Brazilian Hydropower System. Probability in the Engineering and Informational Sciences, 20: 143-156., and does not consider any exogenous information that could affect the hydrological regimes and, consequently, the electricity generation. Several examples of the application of PAR(p) can be found in the literature, see for example1616 MONDAL MS & WASIMI SA. 2006. Generating and forecasting monthly flows of the ganges river with PAR model. Journal of Hydrology, 323: 41-56. who generate and forecast monthly inflows of the Ganges River with the PAR model. A quick literature search also returns an extensive set of univariate models applied to reservoir inflows (e.g.2727 RAVINES RR, SCHMIDT AM, MIGON HS & RENNO CD. 2008. A joint model for rainfall-runoff: the case of Rio Grande basin. Journal of Hydrology, 353: 189-200.).

However, models that incorporate explanatory variables, specifically climate variables, havebeen only recently developed. Some examples of these works in chronological order, are:3232 UVO CB & GRAHAM NE. 1998. Seasonal runoff forecast for northern South America: a statistical model. Water Resources Research, 34(12): 3515-3524. studied the relation between the Sea Surface Temperature (SST) pattern over the Atlantic and the Pacific Oceans and the variability of water availability in the Amazon Basin;88 KELMAN J, VIEIRA AM & RODRIGUEZ-AMAYA JE. 2000. El niño influence on streamflow forecasting. Stochastic Environmental Research and Risk Assessment, 14: 123-138. introduced a procedure for further conditioning the inflow probability distributions by considering the recent measurements of climatic variables,2828 SOUZA FILHO FA & LALL. 2003. Seasonal to interannual ensemble streamflow forecasts for Ceara, Brazil: applications of a multivariate, semiparametric algorithm. Water Resources Research, 39(11): 1-13. developed a semiparametric approach for forecasting inflow at multiple gaging locations on climate precursors;1414 MAITY R & KUMAR DN. 2008. Basin-scale stream-flow forecasting using the information oflarge-scale atmospheric circulation phenomena. Hydrological Processes, 22: 643-650. applied Artificial Neural Network to model the complex relationship between inflow and climatic phenomenon;1010 KUMAR DN & MAITY R. 2008. Bayesian dynamic modelling for nonstationary hydroclimatic time series forecasting along with uncertainty quantification. Hydrological Processes, 22: 3488-3499. investigated the potential of the Bayesian dynamic modelling approach through an application to forecast a hydrologic time series using relevant climate index information;1111 LIMA CHR & LALL U. 2010. Climate informed monthly streamflow forecasts for the brazilian hydropower network using a periodic ridge regression model. Journal of Hydrology, 380: 438-449. included climate information in a periodic auto-regressive model in order to provide monthly inflow forecasts for 54 hydropower sites in Brazil; and1212 LIMA MLM, POPOVA E & DAMIEN P. (2014). Modeling and forecasting of Brazilian reservoir inflows via dynamic linear models. International Journal Forecasting, 30: 464-474. applied Bayesian Dynamic Models to model and forecast the water inflow for Brazilian hydropower reservoirs, and concluded that the incorporation of climate variables such as rainfall precipitation and El Niño variables, increased the accuracy of both modelling and prediction.

Considering the context and the relevance of the subject, this paper aims to investigate and propose methodological advances in time series modelling and stochastic simulation to generate synthetical hydrological scenarios to model the Brazilian hydrothermal dispatch.

The proposed method, called causal PAR(p), intends to include, exogenously, the meteorological phenomena influence by adjusting a Dynamic Regression (Autoregressive DistributedLags Model) to the traditional PAR(p) residuals and climate series, incorporating the regression coefficient in the traditional modelling. Several preliminary analyses will be concieved in order to obtain a greater understanding of the involved series and its applicability into the proposed method.

Besides this main goal, which is to present a new approach to model the inflow series, this work also intends to generate synthetical scenarios that better represent the original historical series as well as confidence intervals for out-of-sample forecasts.

The paper is organized as follows: section 2 presents the theoretical background, with a brief description of the Periodic Autoregressive model, the mathematical details of the proposed approach (causal PAR(p)), and a short description of the Bootstrap technique, used to generatesynthetical scenarios. Section 3 describes the input variables, their connection with reservoir inflows the underlying system and the Brazilian case study; section 4 presents a exploratory analysis of the available variables. The results from the traditional model and the proposed approach are shown on section 5, and section 6 sums up the work and summarizes its conclusion and final remarks.

2 THEORETICAL BACKGROUND

The proposed framework to include the climate series behaviour in the PAR(p) modelling,named as causal PAR(p), is composed using two main techniques: PAR(p) and Dynamic Regression Model. These techniques are briefly presented in what follows.

2.1 Traditional PAR(p)

Periodic Autoregressive models can also be referred to PAR(p), where p corresponds to the order of the model, in other words, the number of autoregressive terms identified in the model. The PAR(p) model fits to each series period an AR(p) model. Generally, p is a vector, p =[p 1, p 2,...,p 12], where each element provides the order for each period (month, in case of monthly series). For more details about the PAR(p) model see77 HIPEL KW & MCLEOD AI. 1994. Time Series Modelling of Water Resources and Environmental Systems. Elsevier..

PAR(p) model is mathematically described as follows:

(1)

where,

Zt is the seasonal series of period S.

S is the number of periods (S=12 to monthly series).

T is the time index, t=1,2,…,SN, function of the T year (T=1,2,…,N) and the m period (m=1,2,…,S).

N is the number of years.

μm is the seasonal average of the period m.

σm is the seasonal standard deviation of the period m.

φm is the i-th autoregressive coefficient of the period m.

Pm is the order of the autoregressive operator of the period m.

at is the series of independent noises with average zero and variance σa 2( m ).

2.2 Dynamic Regression Model

A Dynamic Regression model can be described by the following general equation:

(2)

Where at is the dependent (or output) variable; Xi,t are the explanatory (or inputs) variables;βi(L) =bi (L)/a(L) and a(L) b 1(L),...,bk (L)are finite order lag polynomials of degrees r, s 1,...,sk , respectively, and εt is assumed to be white noise. Such a formulation can be seen in2323 OTEXTS. 2016. February. Forecasting: principles and practice, advanced forecasting methods, dynamic regression models. Texts: online, open-access textbooks. https://www.otexts.org/fpp/9/1
https://www.otexts.org/fpp/9/1...
and for more mathematical details see55 COCHRANE D & ORCUTT GH. 1949. Application of least squares regression to relationships containing auto- correlated error terms. Journal of the American Statistical Association, 44: 32-61..

2.3 Causal PAR(p)

The step by step sequence to perform the causal PAR(p) follows the three steps described below:

  1. Estimate the traditional PAR(p) model;

  2. Find the significant explanatory variables by applying the Dynamic Regression model; and

  3. Estimate the causal PAR(p) model.

In the first step the traditional PAR(p) is estimated and the residuals series are extracted to be used in the second step, that fits the Dynamic Regression model. In this step, the exogenous variables are one of the inputs and the residuals generated by the traditional PAR(p) are the outputs.Then, the coefficients obtained in step two and the exogenous variables are inserted in the mathematical formulation of the traditional PAR(p), generating the causal PAR(p).

(3)

Where, Xi,t are the exogenous variables, βi the coefficient and is the white noise.

Therefore the innovation at the proposed approach is to introduce exogenous variables in the PAR(p) modelling. Further studies would consider the development of some kind of periodic transfer function.

Afterwards, the synthetical scenarios generation are carried out applying the Booststrap technique to the residual series as well as in obtaining confidence interval. The methodology isdetailed in the next section.

2.4 Synthetical scenarios’ generation and confidence interval

In order to simulate synthetical scenarios the Bootstrap technique is used. This technique, first developed by66 EFRON B. 1979. Bootstrap methods: Another look at the jacknife. The Annals of Statistics, 7: 1-26., is a method of sampling with replacement the observations of a random sample that allows the assessment of the variability of an estimator. Such technique generates as many new samples as one wishes, called “Bootstrap sample”, usually with the same size of the original sample. In the context of time series, there are basically two ways to apply this technique: Bootstrap in the residuals and the method called Moving Blocks99 KUENSCH HR. 1989. The jackknife and the bootstrap for general stationary observations. The Annals of Statistics, 17: 1217-1241..

In this paper Bootstrap is used in the residuals, due to the fact that for all the studied series it is possible to extract residuals, thus ensuring the hypotheses of a random sample (i.e., independent and identically distributed observations) a required condition to apply Bootstrap.

A formal description of the method is: consider R 1,...,RN the random sample and B the number of residuals series to be generated. B residual series are drawn with replacement from the original sample, generating B Bootstrap residuals series of size N each: r 1,...,rB whereri = ri 1,...,.riN, i = 1 ,...,B

Anderson et al.22 ANDERSON PL, MEERSCHAERT MM & ZHANG K. 2012. Forecasting with prediction intervals for periodic autoregressive moving average models. Journal of Time Series Analysis, 34: 187-193. computed Gaussian prediction intervals for the estimated parameters of a periodic autoregressive moving average (PARMA) models. In the present study the 5% confidence interval is developed from the simulated series by calculating the quantiles 2.5% and 97.5%.

3 DATA ANALYSIS

In Brazil, the fifteen major river basins have an installed capacity of approximately 92 Giga Watts [GW]. The Parana river basin has the highest hydroelectric potential (around 54 GW). In these major rivers there are around 190 hydroelectric power plants currently in operation2222 ONS. 2015. Atualização de Séries Históricas de Vazões, período 1931 a 2014. Technical report,Operador Nacional do Sistema Elétrico., and these plants operate in a cascade scheme. As an illustration Figure 1 displays the cascade scheme for Paranaíba and Grande basins with 19 reservoirs, represented by triangles, and 15 hydroelectric with no reservoir (circles).

Figure 1
Example of a cascade scheme. Source: Adapted from2121 ONS. 2014. October. Operador Nacional do Sistema Elétrico. www.ons.com.br
www.ons.com.br...

This way decisions taken at the upstream reservoirs will impact the inflow of the downstream reservoirs. The historical data available is the natural inflow for each reservoir, on a monthly basis, starting in January 1931 and ending in December 2014, measured in cubic meters per second [m 3/s].

The climate variables were selected trough a literature search1717 NOAA. 2016. February. Climate Prediction Center. National Oceanic and Atmospheric Administration. www.cpc.ncep.noaa.gov/data/indices/
www.cpc.ncep.noaa.gov/data/indices/...
. However, the authors intend to broaden this search for other variables to be considered in the model building. The selected variables are basically related to El Niño and the Sunspots numbers; the variables representing El Niño/La Niña phenomenon are: Southern Oscillation Index (SOI), Equatorial SOI, Niño variations and ONI.

The Southern Oscillation Index (SOI) is calculated based on the difference between the atmospheric pressure at sea level in the regions of Tahiti (in the Western Pacific) and Darwin (Australia, Western Pacific). The Equatorial SOI measures the average difference of atmosphericpressure at sea level between two regions centered on the equator: Indonesia and East Pacific.

The range to indicate the presence or absence of El Niño/La Niña is the same for both the SOI index and Equatorial SOI. Consecutive periods of negative figures indicate El Niño phenomenon occurrence; meanwhile consecutive positive figures denote the presence of La Niña and values close to zero indicate a normal situation, where none of the two phenomenon occur. The official historical monthly series of these indices are provided by the National Oceanic and Atmospheric Administration (NOAA). SOI index data starts in 1951 and Equatorial SOI in 1949.

The sea surface temperature anomaly is a proxy for El Niño and La Niña. Thus, this index is used to classify and quantify such phenomena in four Niño regions: Niño 1+2, Niño 3, Niño 4 and Niño 3.4, defined as follows by NOAA in 2014. Through the location of the Niño regions it is possible to conclude that regions Niño 1+2 and Niño 3 better identify temperature anomalies for the Eastern Pacific Ocean sea surface and region Niño 4 for the Western Pacific. The Niño 3.4 region is centralized in the Pacific, which allows a better understanding of anomalies across it. Therefore, currently the Niño 3.4 region is the official measure used to represent SST (Sea Surface Temperature). However, depending on the study, other regions may be a betteralternative.

The threshold for the normal state of this index is between -0.5°C and +0.5°C. The criteria commonly used to define an El Niño phenomenon consists of five consecutive averages of SST anomalies above +0.5°C. Similarly, for La Niña, this criterion remains, but now the SST anomaly should be below -0.5°C. The time series for all regions are provided by NOAA, on a weekly and monthly basis. For the weekly series, the data starts in 1990 and ends at the current week and the monthly data starts at 1982 and ends up at the current month.

The Oceanic Niño Index (ONI) measures the average sea surface temperature anomalies for the region Niño 3.4, removing the existing warming trend on it. The ONI uses multiple periods based on thirty years to perform the calculation for five successive years. The base periods uses a fifteen year interval, for the lower and upper bound, for example, for 1950 and 1955 the base period considered starts in 1936 and ends in 1965. The El Niño and La Niña are indicated in the same manner as the SST index, the time series is monthly and is provided by NOAA.

Sunspots comprehends solar surface regions of high magnetic field, which have considerably lower temperature than its surroundings and thus appears as a dark area. The magnetic flux amount on the sun surface varies over eleven year periods, known as sunspot and solar cycles. During this cycle there is a minimum and a maximum magnetic flux, which is not only difficultto identify the sunspots and but also they appear almost all the time. The cycle reachs its maximum aproximately every eleven years, therefore the observed cycle duration corresponds to eleven years.

The daily and monthly number of sunsposts calculation is accomplished with the Relative Index American number of sunspots. This index indicates the solar phenomenon occurrence taking into account their relationship with the Earth, including geomagnetic variations and ionosphere effects. The Solar Division from American Association of Variable Star Observers coordinates the data collection program and the analysis of this phenomenon. Thus, the National Geophysical Data Center (NGDC), provides the historical data from the number of sunspots per month since 1749 and forecasts have been produced until December 2019.

4 EXPLORATORY ANALYSIS

In order to provide a better understanding of the series used in this research, this section presents an exploratory analysis that includes: evolution of the series through time using line’s graphic; variables’ probability distribution using the histogram; descriptive statistics and correlation between the reservoir series and the climate variables.

Also, to reduce the problem dimension (currently with 192 reservoir inflow series), were selected the eight basins that present the biggest correlation with each one of the climate variables. As an example, for the SOI Standard variable the reservoir Curuá-una was the one with the highest correlation among all 192 possibles. Table 1 presents the selected hydroelectric power plants and the Pearson’s correlation with the climate variables.

Table 1
Correlation between reservoir inflow series and climate variables.

Note that the biggest values were found between the reservoirs Quebra Queixo and Monjolinho with variable Niño 1+2 (0.369 and 0.353, respectively). This does not mean that in the climate variables selecting process this particular variable necessarily will be picket up in the model formulation. This is due to the fact that the variable selection is carried out via a backward process that selects only the significant variables.

The descriptive statistics for each one of the selected reservoirs is shown in Table 2. See that Balbina contains the higher values for mean, median, standard deviation and quartiles. On the other hand Lajes is the one with the lowest values.

Table 2
Descriptive analyses of the reservoirs inflows.

Observing Figure 2, which shows the reservoirs time series plot, it is possible to identify, a strong periodicity in all inflow series.

Figure 2
Reservoirs inflow, time series from January 1982 up to December 2014. Source: The authors.

In the first and third columns of Figure 3 the histogram of each reservoir series is presented.It seems quite clear that there is a strong asymmetry indicating that the data might follow a Weibull distribution, which is the distribution that usually models natural events. The second and fourth columns shows the Autocorrelation Functions (ACF) and from them it is possible to confirm the presence of periodicity by observing significantly picks each six months.

Figure 3
Reservoir inflows histogram, Autocorrelation Functions (ACF) and Partial Autocorrelation Functions (PACF). Source: The authors.

Moving now to the climate variables, in Table 3 it is presented the descriptive statistics for each climate variables available. Note that the values presented in the table rely on the same range except for the Sunspots variable. In fact the first seven variables measure exactly the same events: El Niño/La Niña, while the latter variable measures the sunspots events.

Table 3
Descriptive statistics of the climate variables.

One important point to check is the presence of significant correlations between the various El Niño series. Table 4 shows the Pearson correlations , and as expected, there is a negative correlation between the Equatorial SOI and the SOI index with the others, as for those, El Niño is represented by negative values, while for other variables, this same phenomenon is represented by positive values. Also note that ONI and Niño 3.4 show the highest correlation, a consistent result with their concepts, since both measure the sea surface temperature anomaly in the same region.

Table 4
Correlation between El Niño proxies.

Figure 4 shows each climate variable evolution through time, highlighting that for the variables representing El Niño events, the extreme values coincide with stronger El Niño or La Niña occurrences. It is also observed the strong 11 year cycle presented in the time series of Sunspots.

Figure 4
Climate variables, time series from January 1982 up to December 2014. Source: The authors.

The last analysis presented is a map that contains the localization of the selected reservoirs and the climate variables measured area, that is where the El Niño/La Niña proxy is measured. From Figure 5 one can see that four of the selected reservoirs are in the same region (South), one at the Southeast, one at the Midwest and two at the North. Figure 5 displays each of the Niño regions, where Niño 3.4 is in the middle of Niño 3 and Niño 4, and the ONI index is measured in theNiño 3 region.

Figure 5
Niño regions and basins location. Source: The authors.

In Figure 6 the SOI regions are represented, i.e.: the measurement areas for the SOI Standard and the Equatorial SOI index, together with the reservoir localization.

Figure 6
SOI regions and basins location. Source: The authors.

5 RESULTS

In order to evaluate the proposed methodology, the causal PAR(p) approach was applied to model the eight reservoirs inflow series previously presented. The set of possible input variables are the variables representing El Niño/La Niña and Sunspots phenomenon. Since the proposed approach has been developed to improve the current model it will be shown comparisons between the traditional PAR(p) and the causal. The last result obtained is the generation of scenarios with both approaches and a comparison with real values.

In order to carry out an out-of-sample analysis, the last year of the available data (2014) was omitted from the initial analysis. So, the modelling set is the data ranging from January 1982 up to December 2013, resulting in 384 observations or 32 years. It is worth mentioning that all the results were generated using the R software2626 R CORE TEAM. 2015. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/
https://www.R-project.org/...
and the packages pear1515 MCLEOD AI. 1994. Diagnostic checking periodic autoregression models with applications. Journal of Time Series Analysis, 15: 221-233.) TSA44 CHAN K-S & RIPLEY B. 2012. TSA: Time Series Analysis. R package version 1.01. https://CRAN.R-project.org/package=TSA
https://CRAN.R-project.org/package=TSA...
, tseries3030 TRAPLETTI A & HORNIK K. 2015. Tseries: Time Series Analysis and Computational Finance. R package version 0.10-34. http://CRAN.R-project.org/package=tseries
http://CRAN.R-project.org/package=tserie...
and dynlm3333 ZEILEIS A. 2014. Dynlm: Dynamic Linear Regression. R package version 0.3-3. http://CRAN.R-project.org/package=dynlm
http://CRAN.R-project.org/package=dynlm...
.

To estimate the models performance, three metrics were considered: the Mean Absolute Scaled Error (MASE), given by

(4)

the Mean Absolute Percentage Error (MAPE), given by

(5)

and the Root Mean Squared Error (RMSE), given by

(6)

where Yt is the observed value, Ft is the fitted value and n is the sample size.

5.1 Modelling reservoirs inflow series

As detailed before, the first step of the model is to fit the traditional PAR(p) to each of the reservoir inflow series. There are many ways to select the best PAR(p) model, i.e. the model’s order,3131 URSU E & TURKMAN KF. 2012. Periodic autoregressive model identification using genetic algorithms. Journal of Time Series Analysis, 33: 398-405., for example, applied a genetic algorithm, while1919 OLIVEIRA FLC & SOUZA RC. 2011. A new approach to identify the structural order of par (p) models. Pesquisa Opracional, 31: 487-498. uses the Bootstrap technique. Using information criterion is a common approach to select the best model, more specifically Akaike’s Information Criterion (AIC) and Schwarz’s Bayesian Information Criterion (BIC)11 AHO K, DERRYBERRY D & PETERSON T. 2014. Model selection for ecologists the worldviews of aic and bic. Ecology, 95: 631-636.. Both criteria are based on the likelihood function and a penalization term for the number of parameters in the model, however AIC is better for prediction as it is asymptotically equivalent to cross-validation, while BIC is best for explanation as it allows consistent estimation of the underlying data generating process. Since the ultimate goal is to analyse the out-of-sample model performance, in this work it is used the AIC criterion to select the PAR(p) orders.

To better explain the causal PAR(p) model fitting step by step, Monjolinho reservoir is used as example series in what follows. In Table 5 the model orders obtained and their respectiveweights are displayed.

Table 5
Monjolinho PAR(p) model orders and coefficients.

From the previous step the residuals series is obtained and with these numerous tests is performed using dynamic regression to select the best climate variables. The idea is to select only the significant climate variables (p-value <0.05) in the fitting exercise. By that all the explanatory variables are considered as candidates and, those that show no statistical significance are eliminated one at a time until the final formulation is reached.

Using again the Monjolinho, Table 6 displays the causal variables selecting process, where the “x" means the variable is selected and “-" means it was removed. In Model 1 all the variables are included since the backward approach is performed, in Model 2 the most insignificant variable were removed from the previous Model and so on until the last model where only significant variables remain: SOI Standard, Equatorial SOI and Equatorial SOI (-1). The “(-1)" means that a lagged variable is used.

Table 6
Monjolinho causal variables process.

Thus, Table 7 shows which variables were selected to join the causal PAR(p) modelling, for each reservoir inflow series. Note that some of them consider lags and also it is possible to visualize that the variable Equatorial SOI was the most used, whereas Niño 1+2 and Niño 3 were not at all picked by any reservoir.

Table 7
Causal PAR(p) coefficients.

Moving to Step 3 of the model building, the causal model must be estimated in accordance with the explanatory variables selected. In Table 8 the three error measurements statistics obtained to both, PAR(p) and causal PAR(p) models are displayed. Note that the inclusion of climate variables leads to a small reduction in the error statistics, i.e., 1% in the MASE, 5% in the MAPE and 2% in the RMSE.

Table 8
Fitted errors with PAR(p) and causal PAR(p).

5.2 Generating synthetical scenarios

The synthetical scenarios generation is performed, with the residuals series, using the Bootstrap technique. First, the BDS test33 BROCK WA, DECHERT WD & SCHEINKMAN JA. 1987. A test for independence based on the correlation dimension. Departament of Economics, University of Wisconsin at Madison, University of Houston and University of Chicago. confirms that the residuals obtained by both, traditional and causal PAR(p) models are white noise (p-value greater than 0.05).

The first procedure in this section is the generation of in-sample scenarios and the respectiveconfidence intervals for the period 1982 up to 2013. Table 9 shows the error statistics comparing the estimated means of the scenarios generated using the traditional PAR(p) and the causal PAR(p) with the actual value of the series. Note that, in general, the causal PAR(p) slightly outperforms the traditional PAR(p): 3% in MASE, MAPE and RMSE.

Next, for the generation of synthetical reservoir inflow scenarios, one should ensure that all simulated values are positive; as a negative value does not make sense. The initial approach does not guarantee this constraint, so the strategy employed was to resample with replacement the corresponding residuals until a positive value for it is obtained, similar to the proposal of2020 OLIVEIRA FLC, SOUZA RC & MARCATO ALM. 2015. A time series model for building scenarios trees applied to stochastic optimisation. International Journal of Electrical Power & Energy Systems, 67: 16-38..

Table 9
Comparison between the average of the generated scenarios and the in-sample forecasts.

Table 10 presents the coverage of the 95% empirical confidence interval using the series’ historical (in-sample values). Note that the proposed approach performs rather better, since the overall coverage across all reservoirs ranges from 94.53% to 95.57% with an average of 94.79%, better than the traditional PAR(p) that ranges from 92.45% to 93.49% (average 92.97%).

Table 10
Accounting for the number of observations covered by the 95% confidence interval in-sample.

Hence, the average obtained for each month with the synthetical scenarios is checked against the corresponding average of the test set, i.e., year 2014. Table 11 contains the out-of-sample statistics for both approaches, again the causal PAR(p) performs better for all error statistics.

These results revealed that the causal PAR(p), produces better fit and coverage in comparision with the traditional PAR(p). This results is aligned and corroborates the findings of1212 LIMA MLM, POPOVA E & DAMIEN P. (2014). Modeling and forecasting of Brazilian reservoir inflows via dynamic linear models. International Journal Forecasting, 30: 464-474.. The improvement provided by the proposed approach compared with Lima’s work relies on the fact that in this study eight climate variables were tested for each reservoir while Lima’s study includes only two variables (Niño 3.4 and precipitation).

Table 11
Comparison between the average of the generated scenarios and the out-of-sample values of the series.

6 CONCLUSIONS AND FINAL REMARKS

The PAR(p) is currently used to estimate the operational costs of the Brazilian hydro-thermal optimal dispatch, and it does not take into account any possible exogenous information that may possibly affect the hydrological regimes and therefore the power generation. Causal PAR(p) is a novel approach that proposes the inclusion of explanatory variables exogenously, in the PAR(p) model, using Dynamic Regression. In this study, the inclusion of climate variables related to the El Niño and Sunspots variable were considered to model the reservoirs inflow series. This new approach was able to generate better results when compared with the current PAR(p).

Another important contribution of this research is the literature review made to identify possible climate variables that could influence the hydrological regime, resulting in eight possible variables; seven related to the El Niño phenomenon and one representing the Sunspots.

The major contribution of this paper is the upgrading of the current approach by including climate variables, that result in better results than the traditional PAR(p). Depending on the location of the reservoir different climatic variables were considered in the model fitting.

Finally, the causal PAR(p) was used to generate the synthetic scenarios and its comparison with the measurements each month of 2014 shows that the averages generated with the simulated scenarios reproduce quite well the future values.

In general the proposed approach performs slightly better than the currently used methodology. As a further improvement of this research one can envisage the construction of a model similar to a periodic transfer function, in such way that the explanatory variables are related to the original series and not only with the residuals generated by the PAR(p). Besides that, the search for other variables that could influence the hydrological regime is another possible improvement. Another possible future step is to aggregate the reservoirs by major basins, aiming to analyse if the climate variables can further improve the results by increasing the range of inflow data.

As a final word, although the causal PAR(p) produces only minor improvements this result is very important as it may result in different electricity prices when introduced in the optimization model used in the Brazilian hydrothermal dispatch.

ACKNOWLEDGEMENTS

The authors would like to thank the R&D program of the Brazilian Electricity Regulatory Agency (ANEEL) for the financial support (PD-0387-0315/2015). They also thank the Coordination for the Improvement of Higher Education Personnel (CAPES) for the doctoral financial support. F. L. Cyrino Oliveira also thanks the support of the National Council of Technological and Scientific Development (CNPq) (research project 443595/2014-3) and FAPERJ (research projects E-26/202.806/2015 and E-26/201.912/2015).

REFERENCES

  • 1
    AHO K, DERRYBERRY D & PETERSON T. 2014. Model selection for ecologists the worldviews of aic and bic. Ecology, 95: 631-636.
  • 2
    ANDERSON PL, MEERSCHAERT MM & ZHANG K. 2012. Forecasting with prediction intervals for periodic autoregressive moving average models. Journal of Time Series Analysis, 34: 187-193.
  • 3
    BROCK WA, DECHERT WD & SCHEINKMAN JA. 1987. A test for independence based on the correlation dimension. Departament of Economics, University of Wisconsin at Madison, University of Houston and University of Chicago.
  • 4
    CHAN K-S & RIPLEY B. 2012. TSA: Time Series Analysis. R package version 1.01. https://CRAN.R-project.org/package=TSA
    » https://CRAN.R-project.org/package=TSA
  • 5
    COCHRANE D & ORCUTT GH. 1949. Application of least squares regression to relationships containing auto- correlated error terms. Journal of the American Statistical Association, 44: 32-61.
  • 6
    EFRON B. 1979. Bootstrap methods: Another look at the jacknife. The Annals of Statistics, 7: 1-26.
  • 7
    HIPEL KW & MCLEOD AI. 1994. Time Series Modelling of Water Resources and Environmental Systems. Elsevier.
  • 8
    KELMAN J, VIEIRA AM & RODRIGUEZ-AMAYA JE. 2000. El niño influence on streamflow forecasting. Stochastic Environmental Research and Risk Assessment, 14: 123-138.
  • 9
    KUENSCH HR. 1989. The jackknife and the bootstrap for general stationary observations. The Annals of Statistics, 17: 1217-1241.
  • 10
    KUMAR DN & MAITY R. 2008. Bayesian dynamic modelling for nonstationary hydroclimatic time series forecasting along with uncertainty quantification. Hydrological Processes, 22: 3488-3499.
  • 11
    LIMA CHR & LALL U. 2010. Climate informed monthly streamflow forecasts for the brazilian hydropower network using a periodic ridge regression model. Journal of Hydrology, 380: 438-449.
  • 12
    LIMA MLM, POPOVA E & DAMIEN P. (2014). Modeling and forecasting of Brazilian reservoir inflows via dynamic linear models. International Journal Forecasting, 30: 464-474.
  • 13
    MACEIRA MEP & DAMÁZIO JM. 2006. Use of the Par(p) Model in the Stochastic Dual Dynamic Programming Optimization Scheme Used in the Operation Planning of the Brazilian Hydropower System. Probability in the Engineering and Informational Sciences, 20: 143-156.
  • 14
    MAITY R & KUMAR DN. 2008. Basin-scale stream-flow forecasting using the information oflarge-scale atmospheric circulation phenomena. Hydrological Processes, 22: 643-650.
  • 15
    MCLEOD AI. 1994. Diagnostic checking periodic autoregression models with applications. Journal of Time Series Analysis, 15: 221-233.
  • 16
    MONDAL MS & WASIMI SA. 2006. Generating and forecasting monthly flows of the ganges river with PAR model. Journal of Hydrology, 323: 41-56.
  • 17
    NOAA. 2016. February. Climate Prediction Center. National Oceanic and Atmospheric Administration. www.cpc.ncep.noaa.gov/data/indices/
    » www.cpc.ncep.noaa.gov/data/indices/
  • 18
    OLIVEIRA FLC. 2010. Nova abordagem para geração de cenários de afluências no planejamento da operação energética de médio prazo. Pontifícia Universidade Católica do Rio de Janeiro.
  • 19
    OLIVEIRA FLC & SOUZA RC. 2011. A new approach to identify the structural order of par (p) models. Pesquisa Opracional, 31: 487-498.
  • 20
    OLIVEIRA FLC, SOUZA RC & MARCATO ALM. 2015. A time series model for building scenarios trees applied to stochastic optimisation. International Journal of Electrical Power & Energy Systems, 67: 16-38.
  • 21
    ONS. 2014. October. Operador Nacional do Sistema Elétrico. www.ons.com.br
    » www.ons.com.br
  • 22
    ONS. 2015. Atualização de Séries Históricas de Vazões, período 1931 a 2014. Technical report,Operador Nacional do Sistema Elétrico.
  • 23
    OTEXTS. 2016. February. Forecasting: principles and practice, advanced forecasting methods, dynamic regression models. Texts: online, open-access textbooks. https://www.otexts.org/fpp/9/1
    » https://www.otexts.org/fpp/9/1
  • 24
    PEREIRA MVF. 1989. Optimal stochastic operations scheduling of large hydroeletric systems.International Journal of Eletric Power and Energy Systems, 11: 161-169.
  • 25
    PINGUELLI LR, FIDELIS NS, GIANNINI MP & DIAS LL. 2013. Evolution of Global Electricity Markets: New Paradigms, New Challenges, New Approaches. Academic Press. 435-459.
  • 26
    R CORE TEAM. 2015. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/
    » https://www.R-project.org/
  • 27
    RAVINES RR, SCHMIDT AM, MIGON HS & RENNO CD. 2008. A joint model for rainfall-runoff: the case of Rio Grande basin. Journal of Hydrology, 353: 189-200.
  • 28
    SOUZA FILHO FA & LALL. 2003. Seasonal to interannual ensemble streamflow forecasts for Ceara, Brazil: applications of a multivariate, semiparametric algorithm. Water Resources Research, 39(11): 1-13.
  • 29
    TERRY LA, PEREIRA MVF, NETO TA, SILVA LFCA & SALES PRH. 1986. Coordinating the Energy Generation of the Brazilian National Hydrothermal Electrical Generating System. Interfaces, 16: 16 p.
  • 30
    TRAPLETTI A & HORNIK K. 2015. Tseries: Time Series Analysis and Computational Finance. R package version 0.10-34. http://CRAN.R-project.org/package=tseries
    » http://CRAN.R-project.org/package=tseries
  • 31
    URSU E & TURKMAN KF. 2012. Periodic autoregressive model identification using genetic algorithms. Journal of Time Series Analysis, 33: 398-405.
  • 32
    UVO CB & GRAHAM NE. 1998. Seasonal runoff forecast for northern South America: a statistical model. Water Resources Research, 34(12): 3515-3524.
  • 33
    ZEILEIS A. 2014. Dynlm: Dynamic Linear Regression. R package version 0.3-3. http://CRAN.R-project.org/package=dynlm
    » http://CRAN.R-project.org/package=dynlm

Publication Dates

  • Publication in this collection
    Jan-Apr 2017

History

  • Received
    10 June 2016
  • Accepted
    16 Mar 2017
Sociedade Brasileira de Pesquisa Operacional Rua Mayrink Veiga, 32 - sala 601 - Centro, 20090-050 Rio de Janeiro RJ - Brasil, Tel.: +55 21 2263-0499, Fax: +55 21 2263-0501 - Rio de Janeiro - RJ - Brazil
E-mail: sobrapo@sobrapo.org.br