Eta Model and CMIP5 Climate Change Projections for the São Francisco and Paraíba do Sul River Basins, Brazil

Abstract The objective of this article is to analyze the precipitation projections of the Eta regional climate model for the São Francisco (SF) and Paraíba do Sul (PS) river basins, Brazil. To this end, the dynamic response of the downscaling was discussed in comparison with the CMIP5 models and, in particular, with the HadGEM2-ES and MIROC5 models for the future climate horizon from 2011 to 2040 under two scenarios of the fifth report of the Intergovernmental Panel on Climate Change: RCP4.5 and RCP8.5. The results indicate that the Eta model nested in the global models adequately represented the precipitation simulations for the present climate. Regarding the evaluation of the representation of the climate projections, the Eta projections disagreed on the sign of change with their forcing GCMs and other CMIP5 models in the analyzed basins, amplifying the drier signal. Reductions in precipitation were indicated for both basins, with greater intensity in the RCP8.5 scenario, reaching up to -20% (SF) and 15% (PS) for the CMIP5 models and almost -40% for the Eta-HadGEM2-ES and Eta MIROC5 models (SF and PS).


Introduction
Global Climate Models (GCMs) are widely used in the study of climate and its changes, projecting changes based on a set of anthropogenic forcing scenarios (Marengo et al., 2012; Chou et al., 2014; Moise et al., 2015; Brasil, 2016; Giorgi, 2019; Pickler and Mölg, 2021). Changes in radiative forcing, mostly caused by anthropogenic factors through significant emissions of carbon dioxide (CO2) and other gases that contribute to the greenhouse effect, have been identified as the main cause of these changes (IPCC, 2013; PBMC, 2016; Silva et al., 2020).
The Intergovernmental Panel on Climate Change (IPCC) is the main scientific body for the assessment of impacts associated with climate change. It was created by the United Nations Environment Programme (UNEP) and the World Meteorological Organization (WMO) in 1988 (IPCC, 2013). Since its inception, the IPCC has released Assessment Reports (AR) on climate change, which disclose projection scenarios (including the present scenario, future scenarios, paleoclimate simulations and idealized simulations) produced with GCMs (an abbreviation also expanded as General Circulation Models) and based on the emission of greenhouse gases (IPCC, 2013). So that the results of the various models can be compared, all simulations follow a common protocol, from the input data to the simulations themselves, coordinated among cooperating groups in the Coupled Model Intercomparison Project (CMIP), which is linked to the World Climate Research Programme (WCRP).
The fact that global climate models use different physical representations of processes, on a relatively low-resolution grid, introduces a degree of uncertainty in these future climate change scenarios and, consequently, in the assessment of the vulnerability and impacts arising from them. In addition, the low-resolution grid of global models limits the simulation of smaller spatial scale systems (Cruz et al., 2017; Borges de Amorim and Chaffe, 2019; Pickler and Mölg, 2021). Thus, obtaining more detailed projections for a given region, at a higher spatial resolution than that provided by a GCM, is particularly useful for studies of the impacts of climate change on the management and operation of water resources (PBMC, 2014; Valério and Fragoso Júnior, 2015; Brasil, 2016). However, such uncertainties become greater in the case of Regional Climate Models (RCMs), as they are strongly dependent on the boundary conditions and on the methods used to adapt the RCM variables to the lower spatial resolution of the GCM; according to Pielke and Wilby (2012), large-scale weather errors in GCMs are not only inherited by RCMs but can be amplified, owing to their higher spatial resolution. Furthermore, if the RCM is nested in just a few GCMs, the high-resolution scenarios do not cover the full range of projected changes that a larger number of GCMs would indicate as plausible futures, thus increasing the uncertainty of the results obtained (Hewitson et al., 2014).
Chou et al. (2014), Adam et al. (2015) and Tejadas et al. (2016) recommend the application of regionalization techniques for the assessment of extreme events and for better use of the climate information from global projections of future climate in studies of regional impacts. The dynamic downscaling technique, which consists of forcing an RCM (with smaller grid spacing) with GCM data (with larger grid spacing), is presented as a way to better represent subgrid forcings such as topography, coastline and vegetation heterogeneity, among others. It can thereby reduce the uncertainties of their impacts on the regional climate, since the scale of GCM data is generally incompatible with the scales required for impact studies (Solman, 2013; Chou et al., 2014; Sales et al., 2015; Brasil, 2016; Oliveira et al., 2017; Pickler and Mölg, 2021).
The Eta model, among the wide variety of existing RCMs, has been used by several authors and Brazilian institutions to assess the impacts of climate change on different sectors of society (Chou et al., 2012; Marengo et al., 2012; Chou et al., 2014; Oliveira et al., 2017; Dereczynski et al., 2020; Farias et al., 2020; Silva et al., 2020). Brito et al. (2019), for example, showed that the biases of the HadGEM2-ES and Eta models are similar for most of the indicators of climatic precipitation extremes in the Amazon Basin; HadGEM2-ES is a climate model participating in CMIP5 (the 5th phase of the Coupled Model Intercomparison Project). Eta performed better when simulating extreme rainfall. The authors treated as expected, however, the effect of the HadGEM2-ES simulations on the quality of the Eta simulations, given the different physical and dynamical formulations of the two models, and also recommended the use of other global models as boundary conditions for Eta. Zakhia et al. (2020) identified a reduction in monthly precipitation (RCP8.5, at the end of the 21st century) in the projections of the Eta-HadGEM2-ES model for the Ribeirão Jaguara Hydrographic Basin, in Minas Gerais. The greatest increase in monthly precipitation was observed in the RCP4.5 scenario, during the period 2041-2070, in the projections of the Eta-MIROC5 model. The largest monthly increases in average temperature occurred towards the end of the 21st century (2071-2099) for all models under the RCP8.5 scenario.
However, few studies have evaluated how the Eta regional model behaves in relation to its global forcings. This gap may have led to the uncertainty of climate change in the Northeast region being neglected, or to the omission of a signal associated with the bias of the regional atmospheric model, which can interfere with the interpretation of the information (PBMC, 2014; Giorgi, 2019). RCM simulations are often biased due to systematic errors (Piani et al., 2010; Teutschbein and Seibert, 2010; Themeßl et al., 2011). In this sense, several studies show that the climate change signal can differ significantly between RCMs and their respective forcing GCMs due to the effects of local forcings (Di Luca et al., 2012; Rummukainen, 2016).
Hence, the present study was motivated by the following research questions: how do the Eta projections behave in relation to their respective forcing GCMs in the São Francisco and Paraíba do Sul River basins? As for the evaluation of the representation of climate projections on a small scale, do the Eta projections agree on the sign of change with their forcing GCMs and other CMIP5 models in the analyzed basins? Therefore, the objective was to analyze the projections of the Eta regional model and of some CMIP5 global models for the São Francisco and Paraíba do Sul River basins, discussing, in particular, the response of dynamic downscaling to large-scale forcing by comparing precipitation projections between Eta and its respective global forcings (HadGEM2-ES and MIROC5) for scenarios RCP4.5 and RCP8.5.

Study region
The study was developed for the Paraíba do Sul and São Francisco River basins (Fig. 1). These basins are strategically important for the country, as they serve several consumptive and non-consumptive uses, including human consumption, irrigation, industry and energy generation, among others.
The São Francisco River, one of the largest rivers in the country, is one of the most important sources of water for the entire Northeast region. The area of its basin is equivalent to about 7.5% of the total area of Brazil and partially covers the Federal District and six Brazilian states: Minas Gerais, Goiás, Bahia, Pernambuco, Alagoas and Sergipe. The São Francisco River basin interconnects the Southeast and Northeast regions and is characterized by climatic and environmental diversity (CBHSF, 2016).
The interannual cycle of meteorological variables makes it possible to identify similar patterns between the climate of the Alto São Francisco (Aw type, hot and humid with summer rains, according to the Köppen climate classification) and that of the Médio São Francisco (predominantly Aw, with a BShw semi-arid variation). Similar patterns are also found between the Sub-médio (BShw type, semi-arid) and the Baixo São Francisco (predominantly As, hot and humid with winter rains), this latter pair with lower precipitation intensity. In the Alto and Médio São Francisco, the rainiest months run from November to March and the driest from May to September, which in turn are the months with the lowest temperatures. Maximum evaporation occurs at the end of the winter months, coinciding with the month of highest wind intensity and lowest relative humidity. The Sub-médio and Baixo São Francisco are milder regions, with higher wind intensity and lower precipitation intensity than the Alto and Médio São Francisco. The precipitation climatology shows that the months with the heaviest rainfall occur from January to April in the Sub-médio São Francisco and from March to August in the Baixo São Francisco (CBHSF, 2016). The mean annual precipitation ranges from 1500 mm (Alto São Francisco) to 350 mm (Baixo and Médio São Francisco) (Bezerra et al., 2019).
The Paraíba do Sul River basin, in turn, is located in the Southeast region of Brazil (Fig. 1). According to the National Water Agency (2010), it has a drainage area of about 55,500 km² distributed across the states of São Paulo, Rio de Janeiro and Minas Gerais. The basin area corresponds to about 0.7% of the country's area and to approximately 6% of the Southeast region of Brazil. The region is characterized by a predominantly hot and humid tropical climate, with variations determined by differences in altitude and inflows of marine winds (CEIVAP, 2021), and the average annual temperature varies between 18 °C and 24 °C. According to the detailed Köppen classification for Brazil (Alvares et al., 2013), the climate is humid subtropical with dry winter and hot or temperate summer (Cwa or Cwb) in most of the basin, except for the northwest region, which is classified as tropical with dry winter (Aw). The rainfall regime is characterized by a dry period, which extends from June to September, and a very rainy period, which covers the months from November to January, when the great floods of the Paraíba do Sul River occur.
The highest rainfall occurs in the São Paulo section of the Serra do Mar, in the regions of the Itatiaia massif and its foothills, and in the Serra dos Órgãos, a section of the Serra do Mar that runs along the Região Serrana of the state of Rio de Janeiro, where the annual precipitation exceeds 2000 mm. In these three high-altitude regions, the average minimum temperatures fall below 10 °C. The lowest rainfall occurs in a narrow strip of the middle Paraíba do Sul and in the lower reaches of the basin (North and Northwest regions of Rio de Janeiro), with annual rainfall between 1000 mm and 1250 mm. Summer is rainy, with accumulated precipitation between 200 and 250 mm/month. The driest period occurs in winter, with accumulated rainfall of less than 50 mm/month (CEIVAP, 2021).

Global-based projections
In this study, monthly precipitation data from eighteen global climate models participating in CMIP5 were used; an overview of CMIP5 and the experiment design can be found in Taylor et al. (2012). Table 1 lists the models used in this study and their respective institutions and/or agencies and countries. The database comprised the monthly series from 1971 to 2005 (reference period) and from 2011 to 2040 (projection period). No pre-processing was applied to the GCM data described above.
In this work, the projections are based on the scenarios called Representative Concentration Pathways (RCPs), published by the IPCC in the Fifth Assessment Report (AR5), which represent sets of projections of the components of anthropogenic radiative forcing used as input data in the GCMs for climate and atmospheric chemistry modeling (IPCC, 2013). The possible scenarios for the end of the 21st century are: RCP 3.0-PD (Peak and Decline), with a peak in radiative forcing at 3 W/m² in the middle of the 21st century that decays to 2.6 W/m² by 2100, also known as RCP 2.6; RCP 4.5, with stabilization at 4.5 W/m² before the end of the 21st century; RCP 6.0, with stabilization at 6 W/m² after 2100; and RCP 8.5, with an increasing pathway, reaching 8.5 W/m² in 2100 and 12 W/m² after the 21st century (Van Vuuren et al., 2011; IPCC, 2013).
Only two global climate models (Table 2) had their results used as initial and boundary conditions in the dynamic downscaling process for the Eta regional model. These two GCMs were chosen because they are commonly used in studies that refine the global grid scale.
The Hadley Centre Global Environment Model version 2, Earth System (HadGEM2-ES) is a grid-point model of resolution N96, which is approximately equivalent to 1.875 degrees in longitude and 1.25 degrees in latitude, with 38 levels in the atmosphere (Collins et al., 2011; Martin et al., 2011). In the ocean, the model has 40 vertical levels, and the horizontal resolution varies from 1/3 degree in the tropics to 1 degree at latitudes higher than 30°. It is an Earth system model with representation of the carbon cycle. Over land, the carbon cycle is modelled by the dynamic vegetation model TRIFFID (Top-down Representation of Interactive Foliage Including Dynamics) (Cox, 2001), which distinguishes 5 plant functional types: broadleaf and needleleaf trees, C3 and C4 grasses, and shrubs. The model includes atmospheric chemistry and an aerosol model with organic carbon and dust representation (Collins et al., 2011; Martin et al., 2011).
The Model for Interdisciplinary Research on Climate version 5 (MIROC5) is a cooperatively developed Japanese model (Watanabe et al., 2010). Its atmospheric component is spectral, with resolution T85 (approximately 150 km in the horizontal) and 40 vertical levels. It is coupled to the COCO 4.5 ocean model, with 50 levels in depth and 1° of horizontal resolution. The radiative fluxes are calculated by a k-distribution scheme (Hasumi, 2007; Sekiguchi and Nakajima, 2008). The aerosol model, SPRINTARS, is coupled to the cloud microphysics and radiation schemes. The model uses the MATSIRO land surface scheme with 6 soil layers (Takata, Emori and Watanabe, 2003), in which each grid box is formed by three tiles: potential vegetation, cropland and lake. The scheme also contains river routing and the effects of snow on albedo. Sea ice thermodynamics and dynamics are represented (Watanabe et al., 2010).

Regionally based projections
The regional climate model used was the Eta-CPTEC/INPE, as its dynamic downscaling simulations showed higher success rates than the predictions of the global models (Chou et al., 2012; Marengo et al., 2012). The Eta model is a grid-point mesoscale model based on the primitive equations. The version that runs operationally at CPTEC/INPE is hydrostatic, with horizontal resolutions of 40 km and 20 km, both with 38 vertical layers and covering practically all of South America. Forecasts are provided twice a day, with initial conditions at 00:00 UTC and 12:00 UTC. The Eta regional model (Mesinger et al., 1988; Black, 1994) was developed at the University of Belgrade. It uses Arakawa's E grid (Arakawa and Lamb, 1977) and the vertical coordinate η (Mesinger, 1984). Time integration uses the 'split-explicit' technique (Gadd, 1978). Turbulent processes are treated using the Mellor-Yamada scheme (1974). The longwave (Fels and Schwarzkopf, 1975) and shortwave (Lacis and Hansen, 1974) radiation parameterization schemes were developed by the Geophysical Fluid Dynamics Laboratory. The model uses a modified Betts-Miller scheme to parameterize convection (Janjic, 1994). The land surface is represented by the OSU scheme (Chen et al., 1997). In this work, an updated version of the Eta model adapted for climate change studies is used. The sea surface temperature is taken from the global coupled models HadGEM2-ES and MIROC5 and is updated daily. Initial soil moisture and soil temperatures are derived from the global models. Lateral boundaries are updated every 6 hours. The first year of integration is discarded from the analysis. The model was set up at 20 km resolution with 38 vertical levels, and the model top is set at 25 hPa. The simulation area of the regional model was bounded by the coordinates 100° W-30° W and 30° N-50° S.
Eta simulations were obtained for the Historical, RCP4.5 (intermediate) and RCP8.5 (very high greenhouse gas, GHG, emissions) runs.

Observational data
The observed data used in this study were obtained from the Global Precipitation Climatology Centre (GPCC) and correspond to the precipitation climatology from 1971 to 2005, provided as global monthly precipitation data at a spatial resolution of 0.5° × 0.5° (Schneider et al., 2015). The GPCC is a climate data center that provides global precipitation data. It uses rain gauge measurements collected from over 75,000 stations worldwide, along with additional satellite-based measurements, to create high-quality precipitation datasets. The GPCC offers monthly and daily precipitation data with a spatial resolution of 0.5° and 1.0°, respectively. The dataset covers the period from 1901 to the present, making it a valuable source of long-term climate information. The GPCC data have been widely used in climate research, including the evaluation and validation of climate models, the assessment of droughts and floods, and the analysis of climate variability and change (Schneider et al., 2018).

Descriptive statistics
To validate the Eta and CMIP5 model outputs against the observed data set, a comparative analysis was performed based on the statistics described below: mean, median, Pearson correlation coefficient (R), standard deviation (SD) and root mean square error (RMSE).
The mean ($\bar{x}$) summarizes the sample of N values $x_i$ that make up the series as a single value that represents the center of the data distribution, while the median is a measure of position and represents the value that is exceeded by 50% of the sample points. The SD, in turn, is a measure of dispersion, indicating how uniformly the sample values are spread around the mean; for more information and equations, see Wilks (2019).
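As an illustration of the three descriptive statistics above, the following minimal sketch uses only the Python standard library. The monthly values are hypothetical and are not taken from the GPCC or from any model in this study; the population form of the SD (division by N) is an assumption, since the text does not state which form was used.

```python
import statistics

# Hypothetical monthly precipitation totals (mm), January to December.
# Illustrative values only; not data from this study.
monthly_precip_mm = [210.0, 180.0, 150.0, 60.0, 20.0, 5.0,
                     3.0, 4.0, 25.0, 90.0, 170.0, 230.0]

# Mean: single value representing the center of the distribution.
mean_mm = statistics.mean(monthly_precip_mm)

# Median: value exceeded by 50% of the sample points.
median_mm = statistics.median(monthly_precip_mm)

# Population standard deviation (divides by N); statistics.stdev
# would give the sample form (divides by N - 1) instead.
sd_mm = statistics.pstdev(monthly_precip_mm)

print(f"mean={mean_mm:.1f} mm  median={median_mm:.1f} mm  SD={sd_mm:.1f} mm")
```

For a strongly seasonal series such as this one, the median sits well below the mean, which is why both are reported.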
To obtain a good indicator of precision when correlating estimated and observed values, the Pearson correlation coefficient, also called the correlation index, is used. Following the formulation adopted by Jolliffe and Stephenson (2003), Eq. (1) defines R as:

$$R = \frac{\frac{1}{N}\sum_{i=1}^{N} \varphi'_i \, \psi'_i}{\sigma_\varphi \, \sigma_\psi} \qquad (1)$$

where $\varphi'_i$ is the deviation of the estimated series from its mean, $\psi'_i$ is the deviation of the observed series from its mean, and $\sigma_\varphi$ and $\sigma_\psi$ are the standard deviations of the estimated and observed series, respectively. Values of R of -1 and 1 indicate a perfect correlation, while R = 0 indicates no linear relationship between the studied variables (Moore, 2007).
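The correlation in Eq. (1) can be sketched as follows, assuming population standard deviations (division by N) so that the normalization matches the covariance term. The function name `pearson_r` and the two series are hypothetical and serve only as an illustration.

```python
import math

def pearson_r(estimated, observed):
    """Pearson correlation between two equal-length series."""
    n = len(estimated)
    if n != len(observed) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_est = sum(estimated) / n
    mean_obs = sum(observed) / n
    # Deviations of each series from its own mean.
    dev_est = [x - mean_est for x in estimated]
    dev_obs = [y - mean_obs for y in observed]
    covariance = sum(a * b for a, b in zip(dev_est, dev_obs)) / n
    sd_est = math.sqrt(sum(a * a for a in dev_est) / n)
    sd_obs = math.sqrt(sum(b * b for b in dev_obs) / n)
    if sd_est == 0.0 or sd_obs == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (sd_est * sd_obs)

# Hypothetical monthly climatologies (mm); not data from this study.
model_mm = [200.0, 160.0, 120.0, 50.0, 15.0, 5.0,
            5.0, 10.0, 30.0, 80.0, 150.0, 210.0]
gpcc_mm = [210.0, 180.0, 150.0, 60.0, 20.0, 5.0,
           3.0, 4.0, 25.0, 90.0, 170.0, 230.0]
r = pearson_r(model_mm, gpcc_mm)
print(f"R = {r:.3f}")
```

A high R only shows that the two seasonal cycles rise and fall together; it says nothing about differences in amplitude or mean, which is why the SD and RMSE are examined alongside it.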
According to Fox (1981), the RMSE is an error measure widely used in statistical analysis, as it is more sensitive to the growth of deviations between values. The RMSE is computed through Eq. (2):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\varphi_i - \psi_i\right)^2} \qquad (2)$$

where $\varphi_i$ and $\psi_i$ are the estimated and observed values, respectively. The closer the error measure is to zero, the greater the similarity between the estimated and measured values.
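A minimal sketch of the RMSE in Eq. (2) is given below. The function name `rmse` and the sample values are hypothetical.

```python
import math

def rmse(estimated, observed):
    """Root mean square error between two equal-length series."""
    n = len(estimated)
    if n != len(observed) or n == 0:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(e - o) ** 2 for e, o in zip(estimated, observed)]
    return math.sqrt(sum(squared_errors) / n)

# Hypothetical monthly values (mm); not data from this study.
estimated_mm = [200.0, 160.0, 120.0, 50.0]
observed_mm = [210.0, 180.0, 150.0, 60.0]
print(f"RMSE = {rmse(estimated_mm, observed_mm):.1f} mm")
```

Because the errors are squared before averaging, a single large monthly deviation weighs more than several small ones, which is the sensitivity the text attributes to this measure.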
To assess the quality of the climate models covered in this study, the Taylor diagram (TD) (Taylor, 2001) was used, which provides a graphic summary of how close these models are to the observational data. The TD improves the discussion of performance and model choice, as it allows a series of statistics from observed and estimated data to be analyzed simultaneously (Pereira et al., 2014). In this study, the TD was built with the three metrics mentioned above: R, SD and RMSE.
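The three metrics can share one diagram because of the geometric relation described by Taylor (2001): the squared centered RMS difference equals the sum of the two variances minus twice the product of the two SDs and R. The sketch below checks this relation numerically. Note, as an assumption about how it maps onto this study, that the centered RMS difference removes the mean of each series first, so it coincides with the RMSE of Eq. (2) only when the mean bias is zero. The function name and data are hypothetical.

```python
import math

def taylor_statistics(model, reference):
    """Return (sd_model, sd_reference, correlation, centered_rms_difference)."""
    n = len(model)
    if n != len(reference) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_m = sum(model) / n
    mean_r = sum(reference) / n
    dev_m = [x - mean_m for x in model]
    dev_r = [y - mean_r for y in reference]
    sd_m = math.sqrt(sum(a * a for a in dev_m) / n)
    sd_r = math.sqrt(sum(b * b for b in dev_r) / n)
    if sd_m == 0.0 or sd_r == 0.0:
        raise ValueError("statistics are undefined for a constant series")
    corr = sum(a * b for a, b in zip(dev_m, dev_r)) / (n * sd_m * sd_r)
    # Centered RMS difference: means are removed before differencing.
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(dev_m, dev_r)) / n)
    return sd_m, sd_r, corr, crmsd

# Hypothetical monthly climatologies (mm); not data from this study.
model_mm = [200.0, 160.0, 120.0, 50.0, 15.0, 5.0,
            5.0, 10.0, 30.0, 80.0, 150.0, 210.0]
gpcc_mm = [210.0, 180.0, 150.0, 60.0, 20.0, 5.0,
           3.0, 4.0, 25.0, 90.0, 170.0, 230.0]

sd_m, sd_r, corr, crmsd = taylor_statistics(model_mm, gpcc_mm)
# Law-of-cosines relation that underlies the diagram geometry.
law_of_cosines = sd_m ** 2 + sd_r ** 2 - 2.0 * sd_m * sd_r * corr
print(abs(crmsd ** 2 - law_of_cosines) < 1e-6)
```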

Analysis techniques
The analysis techniques described below are intended to compare the results of the Eta models with those of most of the CMIP5 GCMs, in order to highlight the existing divergence in the sign of change and to evaluate the spatial distribution of precipitation anomalies in relation to the global forcings in the scenarios and basins considered.

Cluster analysis
To identify characteristics that indicate heterogeneity in the sign of the changes projected by the Eta models and the CMIP5 models, the cluster analysis (CA) technique was applied. In the CA, the hierarchical agglomerative method proposed by Ward (1963) was used, with the squared Euclidean distance as the dissimilarity measure (Jackson and Weinand, 1995; Lyra et al., 2006; Lyra et al., 2014). Ward's method forms groups by minimizing the dissimilarity, or total sum of squares, within groups. The Euclidean distance is the square root of the sum of the squares of the differences in values for each variable (Lyra et al., 2006; Lyra et al., 2014):

$$d_e = \sqrt{\sum_{j=1}^{n}\left(P_{pj} - P_{kj}\right)^2} \qquad (3)$$

where $d_e$ is the Euclidean distance, $P_{pj}$ and $P_{kj}$ are the values of quantitative variable j for individuals p and k, respectively, and n is the number of variables. The choice of the number of clusters, although it may be arbitrary, was based on the average internal similarity or dissimilarity of the clusters obtained, with a chosen level dictating the number of clusters.
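The clustering step can be sketched in pure Python as follows. This is an illustrative implementation, not the code used in the study: it starts from squared Euclidean distances and updates them with the Lance-Williams recurrence for Ward's method. The model labels and anomaly vectors are hypothetical. In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would normally be preferred.

```python
def squared_euclidean(p, k):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, k))

def ward_linkage(points):
    """Agglomerate points with Ward's method.

    Returns a list of merges (members_a, members_b, dissimilarity),
    where members are tuples of original point indices.
    """
    clusters = {i: (i,) for i in range(len(points))}
    dist = {}
    for a in clusters:
        for b in clusters:
            if a < b:
                dist[(a, b)] = squared_euclidean(points[a], points[b])
    merges = []
    next_id = len(points)
    while len(clusters) > 1:
        # Merge the pair of clusters with the smallest dissimilarity.
        (i, j), d_ij = min(dist.items(), key=lambda item: item[1])
        n_i, n_j = len(clusters[i]), len(clusters[j])
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            n_k = len(clusters[k])
            d_ki = dist[(min(k, i), max(k, i))]
            d_kj = dist[(min(k, j), max(k, j))]
            # Lance-Williams update for Ward's criterion.
            updated[k] = ((n_i + n_k) * d_ki + (n_j + n_k) * d_kj
                          - n_k * d_ij) / (n_i + n_j + n_k)
        merges.append((clusters[i], clusters[j], d_ij))
        members = tuple(sorted(clusters[i] + clusters[j]))
        del clusters[i], clusters[j]
        dist = {pair: d for pair, d in dist.items()
                if i not in pair and j not in pair}
        for k, d in updated.items():
            dist[(k, next_id)] = d  # new ids are always the largest
        clusters[next_id] = members
        next_id += 1
    return merges

# Hypothetical anomaly vectors (%), one per model; not study results.
labels = ["GCM-A", "GCM-B", "GCM-C", "RCM-X"]
anomalies = [[-5.0, -8.0], [-6.0, -9.0], [3.0, 1.0], [-30.0, -38.0]]
for left, right, d in ward_linkage(anomalies):
    print([labels[m] for m in left], [labels[m] for m in right], round(d, 2))
```

In this toy example the two similar global models merge first and the strongly negative regional model joins last, which is the kind of structure a dendrogram makes visible.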

Projections analysis
To obtain the precipitation projections of the GCMs and the RCM, the anomaly was calculated as the percentage change between the future values of the RCP4.5 and RCP8.5 scenarios and the historical average of the Historical scenario, as expressed in Eq. (4):

$$\text{Anomaly}\,(\%) = \frac{P^{a}_{XXI} - P^{a}_{XX}}{P^{a}_{XX}} \times 100 \qquad (4)$$

where $P^{a}_{XXI}$ is the annual precipitation of the projections for the 21st century scenarios and $P^{a}_{XX}$ is the average annual precipitation for the 20th century.
The Taylor diagrams, presented together, provide a graphic summary of how close the climate models HadGEM2-ES and MIROC5 and the regional Eta model forced by these global models (hereinafter called Eta-HadGEM2-ES and Eta MIROC5, respectively) are to the observed GPCC precipitation data.
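The percentage anomaly of Eq. (4) can be sketched as below, assuming that each projected annual total is compared with the mean of the historical annual totals. The function name and the values are hypothetical.

```python
def percent_anomaly(future_annual_mm, historical_annual_mm):
    """Percentage change of each future annual total relative to the
    historical mean annual total."""
    if not future_annual_mm or not historical_annual_mm:
        raise ValueError("both series must be non-empty")
    historical_mean = sum(historical_annual_mm) / len(historical_annual_mm)
    if historical_mean == 0.0:
        raise ValueError("historical mean is zero; anomaly is undefined")
    return [100.0 * (p - historical_mean) / historical_mean
            for p in future_annual_mm]

# Hypothetical annual totals (mm); not data from this study.
historical_mm = [950.0, 1050.0, 1000.0]   # reference period
projected_mm = [800.0, 1000.0, 700.0]     # scenario period

anomalies_pct = percent_anomaly(projected_mm, historical_mm)
mean_anomaly_pct = sum(anomalies_pct) / len(anomalies_pct)
print(anomalies_pct, mean_anomaly_pct)
```

A negative value indicates a projected reduction in precipitation relative to the reference period.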

Precipitation validation analysis
The Taylor diagrams show that most models presented a strong correlation with the observed data, with R ≥ 0.85. In both basins, the Eta-HadGEM2-ES model showed the lowest SD values. Eta MIROC5, in turn, showed an SD in Paraíba do Sul closer to the SD of the set of observations. MIROC5, despite showing a high correlation (R ≅ 0.99), presented the highest values of RMSE (≥ 50 mm) and SD (> 100 mm). HadGEM2-ES performed well, with R values above 0.95 and a lower SD. In summary, in the Paraíba do Sul River basin, the correlation between the models and the observed data lies between 0.9 and 1.0.
With respect to seasonality, discrepant precipitation values (outliers) were identified in both basins. The lowest first quartile in both basins and the lowest median in São Francisco occur in August, indicating that 25% of its precipitation data were close to zero and that 50% of these data were also around zero. In general, São Francisco showed low variability from May to September. The highest third-quartile and median values were identified in December for São Francisco and in January for Paraíba do Sul.
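The quartiles and outliers discussed above can be computed per calendar month as in the sketch below. The rule used to flag outliers (values beyond 1.5 times the interquartile range from the quartiles) is the common box-plot convention and is an assumption here, since the text does not state which criterion was applied. The data are hypothetical.

```python
import statistics

def boxplot_summary(values, whisker=1.5):
    """Return (q1, median, q3, outliers) for one calendar month."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return q1, median, q3, outliers

# Hypothetical August totals (mm) over ten years; not study data.
august_mm = [0.0, 0.5, 1.0, 0.0, 2.0, 0.0, 1.5, 0.2, 30.0, 0.8]
q1, median, q3, outliers = boxplot_summary(august_mm)
print(q1, median, q3, outliers)
```

In a dry month, a first quartile and median near zero combined with a few flagged values reproduce the pattern described for August.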
In the São Francisco basin, specifically, a greater dispersion of precipitation data, also indicated by the greater occurrence of extremes or outliers, was identified between October and April. The interannual variations in rainfall in this region can be attributed to anomalies in the position and intensity of the Intertropical Convergence Zone (ITCZ), caused by positive sea surface temperature anomalies in the Atlantic, according to Moura and Shukla (1981) and Nobre (1994), and by the occurrence of El Niño in the equatorial Pacific.
In Paraíba do Sul, the months from September to March presented greater dispersion of precipitation data, owing to the greater occurrence of extremes or outliers. A greater variability of precipitation was identified for this basin. The South Atlantic Convergence Zone (SACZ) is responsible for intense and persistent periods of precipitation, as well as for reduced rainfall in the surrounding areas (Santana, 2021).
In the precipitation validation analysis, the patterns of the Eta models and their global forcings were strongly correlated with the observed data, especially for the Eta-HadGEM2-ES, HadGEM2-ES and MIROC5 models. In addition, Eta-HadGEM2-ES showed the lowest standard deviation values in the two basins. Chou et al. (2017) also found that Eta model simulations forced by HadGEM2-ES and MIROC5 data for the present climate were able to reproduce the seasonality of rainfall over South America.
As for seasonality, the observed data indicated greater precipitation variability in the Paraíba do Sul River basin than in the São Francisco River basin. The latter showed low variability from May to September, which may be associated with the dry period in the basin. The highest median values were identified in December for São Francisco and in January for Paraíba do Sul.

Cluster analysis
The differences between models were explored in more detail using cluster analysis. Figure 3 shows the dendrograms obtained by applying Ward's hierarchical agglomerative method to the percentage anomaly of the average annual precipitation for 2011 to 2040 from the Eta-HadGEM2-ES and Eta MIROC5 models and the sixteen CMIP5 models in the São Francisco and Paraíba do Sul River basins under the RCP4.5 and RCP8.5 scenarios.
Models from the same institute are separated by smaller distances than other models and therefore tend to be grouped closer together. The level of similarity is highest for pairs of models that share many components, for example (i) MIROC-ESM-CHEM and MIROC-ESM and (ii) IPSL-CM5A-MR and IPSL-CM5B-LR. A greater dissimilarity was nevertheless identified between some sets of models from the same institution, for example between (i) MIROC5 and MIROC-ESM / ESM-CHEM and (ii) IPSL-CM5A-LR and IPSL-CM5A-MR / IPSL-CM5B-LR.
It is interesting to note that the Eta-HadGEM2-ES model presented the largest Euclidean distance from the rest of the model sample in both scenarios in Paraíba do Sul and in scenario RCP8.5 in São Francisco (and less markedly in RCP4.5), revealing a greater dissimilarity in precipitation anomaly in relation to the other groups of models. Eta MIROC5, in turn, also showed a level of dissimilarity with respect to homogeneous groups, but established itself with the family of the MIROC-ESM / ESM-CHEM model set in most scenarios and basins.

Projections analysis
In the analysis of the average percentage rainfall anomaly, the CMIP5 models project more intense reductions in rainfall in the RCP8.5 scenario than in the RCP4.5 scenario in the short term for both basins, as shown in Fig. 4. Considering both scenarios, most models indicated reductions in the São Francisco River basin. Precipitation anomalies of most GCMs reached as low as -20% for the scenario with the lowest GHG emissions, most notably for the global model CanESM2. The regional models Eta-HadGEM2-ES and Eta MIROC5 showed markedly enhanced reductions in precipitation in both projections. It is interesting to note that the Eta-HadGEM2-ES model presented the largest negative anomaly in both scenarios and among all models, particularly in the RCP8.5 scenario, with a reduction of almost 40%.
In Paraíba do Sul, considering the RCP8.5 scenario, most GCMs indicated a decrease in precipitation in the basin of just over 15%. In contrast, in the lower GHG emission scenario, an increase in precipitation of around 10% was indicated. The Eta-HadGEM2-ES model showed a more pronounced anomaly in this scenario than in the higher emission one. Again, this model showed the largest negative anomaly in both scenarios and among all models.
The precipitation anomalies of the Eta model indicated that a large part of the basins may suffer a severe lack of rainfall in the period from 2011 to 2040. It is important to highlight, however, that a comparison of the anomalies simulated by the two global models, HadGEM2-ES and MIROC5, with and without the Eta regional model makes it evident that Eta tends to amplify the drier signal in the analyzed basins. Among the CMIP5 models, CanESM2 stood out in terms of negative percentage change. These results agree with those presented by Brasil (2015), Silveira et al. (2016) and Chou et al. (2017), in which the CanESM2 model showed a reduction in rainfall in the São Francisco for both scenarios. Brasil (2015) associates the intensification of negative precipitation anomalies shown by the Eta regional model with the increase in the temporal variability of the annual series in relation to the global models.
The following figures show spatially the average percent precipitation anomaly of the climate models Had-GEM2-ES and Eta-HadGEM2-ES (Fig. 5) and MIROC5 and Eta MIROC5 (Fig. 6) for scenarios RCP4.5 and RCP8.5 in the period 2011 to 2040 over the basins of the São Francisco and Paraíba do Sul Rivers.
The HadGEM2-ES global model showed a more pronounced increase in precipitation in most of the Paraíba do Sul and São Francisco River basins in the RCP4.5 scenario, mainly in the Sub-basin Médio São Francisco, with anomalies between 0 and 36% as shown in Fig. 5.In a more pessimistic scenario, its anomalies were more weakened, indicating percentages of increase of a maximum of 12% in part Médio and Sub-médio São Francisco and in part of Paraíba do Sul.When used as a boundary condition for the Eta regional model, the model showed significant reductions in the entire Paraíba do Sul basin and in most of the São Francisco basin (except for part of the Sub-médio and Baixo São Francisco sub-basins).This decrease was greater in module for the RCP8.5 scenario, with magnitudes above 48%.This indicates the possibility of most of the analyzed basins becoming even drier.
In Fig. 6, the MIROC5 global model showed a strong increase in precipitation of up to 24% in most of the Paraíba do Sul basin and throughout the São Francisco River basin when compared to HadGEM2-ES (Fig. 5A and Fig. 5B), mainly in RCP4.5, with emphasis on the Baixo São Francisco sub-basin. Positive anomalies were identified in both scenarios, although they weakened to at most 12% in the most pessimistic scenario, mainly in a large part of the Médio São Francisco. For the Eta MIROC5 model, significant reductions in precipitation were observed in most of the basins (except in the Sub-médio and Baixo São Francisco), and this decrease was also greater in magnitude for the RCP8.5 scenario, reaching up to 36% relative to the historical average. In the Sub-médio and Baixo São Francisco sub-basins, a large spatial divergence in the sign of the anomaly was observed in both scenarios, and the anomaly itself was considerably weaker than in the other regions.
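The percent anomalies quoted above express the change in mean precipitation of the future period relative to a historical mean. A minimal sketch of that calculation follows; the function names and the precipitation values are hypothetical and invented for illustration only, and the baseline period is an assumption, since the exact processing chain is not reproduced here.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def percent_anomaly(future_mean, historical_mean):
    """Percent change of a future-period mean relative to the historical mean."""
    if historical_mean == 0:
        raise ValueError("historical mean must be non-zero")
    return 100.0 * (future_mean - historical_mean) / historical_mean


# Hypothetical annual precipitation totals (mm) for one grid cell.
# These numbers are NOT model output; they only illustrate the arithmetic.
historical = [1000.0, 1100.0, 900.0, 1000.0]  # stands in for the baseline period
projected = [700.0, 650.0, 600.0, 650.0]      # stands in for 2011-2040

anomaly = percent_anomaly(mean(projected), mean(historical))
print(f"{anomaly:.1f}%")  # -35.0%
```

A negative value indicates a drier future period than the baseline. Applying the same function cell by cell would yield maps analogous to Figs. 5 and 6.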

Discussion
In the validation of precipitation, the patterns of the Eta models and their global forcings showed a strong correlation with the observed data, especially for the Eta-HadGEM2-ES, HadGEM2-ES and MIROC5 models. In addition, Eta-HadGEM2-ES showed the lowest standard deviation values in the two basins. This shows that the present-climate simulations of the Eta model nested in the global models are well able to reproduce the seasonality of rainfall in the São Francisco and Paraíba do Sul River basins.
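The statistics summarized in a Taylor diagram (correlation, standard deviation and centered root-mean-square difference) can be computed as in the sketch below. The function name `taylor_stats` and the monthly values are hypothetical and invented for illustration; they are not GPCC data or model output.

```python
import math


def taylor_stats(model, obs):
    """Return (correlation, model std, observed std, centered RMS difference)."""
    n = len(obs)
    if len(model) != n or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    m_bar = sum(model) / n
    o_bar = sum(obs) / n
    m_dev = [m - m_bar for m in model]
    o_dev = [o - o_bar for o in obs]
    # Population standard deviations (divide by n).
    sd_m = math.sqrt(sum(d * d for d in m_dev) / n)
    sd_o = math.sqrt(sum(d * d for d in o_dev) / n)
    if sd_m == 0 or sd_o == 0:
        raise ValueError("series must not be constant")
    corr = sum(a * b for a, b in zip(m_dev, o_dev)) / (n * sd_m * sd_o)
    # Centered RMS difference: means are removed before differencing.
    crmsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(m_dev, o_dev)) / n)
    return corr, sd_m, sd_o, crmsd


# Hypothetical monthly climatologies (mm/month), January to December.
observed = [180, 150, 140, 70, 30, 15, 10, 12, 30, 80, 150, 190]
simulated = [200, 170, 150, 90, 40, 20, 15, 15, 40, 100, 170, 210]

corr, sd_m, sd_o, crmsd = taylor_stats(simulated, observed)
print(f"R={corr:.3f}  sd_model={sd_m:.1f}  sd_obs={sd_o:.1f}  cRMSD={crmsd:.1f}")
```

The three quantities are linked by the relation cRMSD² = sd_model² + sd_obs² − 2·sd_model·sd_obs·R, which is what allows them to be shown together on a single diagram.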
As for seasonality, the observed data showed a marked pattern of low variability from May to September in the São Francisco River basin, which may be associated with the dry period. In the Paraíba do Sul River basin, greater variability in precipitation was identified than in the São Francisco River basin, even in the dry season. The highest median values were identified in December and January for São Francisco and Paraíba do Sul, respectively.
From the analysis of the dendrograms shown in Fig. 3, it was observed that models from the same institute tend to be grouped closer together and that pairs of models sharing more components showed a correspondingly high level of similarity. Similar patterns were obtained in studies from different parts of the globe (Mizuta et al., 2014; Gibson et al., 2017; Pathak et al., 2019). The Eta-HadGEM2-ES model showed greater dissimilarity in precipitation anomaly relative to the other groups of models. Eta MIROC5 also showed some dissimilarity with respect to the homogeneous groups, but clustered with the MIROC-ESM/ESM-CHEM family of models in most scenarios and basins. It is noteworthy that the dissimilarity identified between some sets of models from the same institution may be associated with changes in atmospheric components during the model development process (Pathak et al., 2019).
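The grouping behind a dendrogram can be illustrated with a small agglomerative clustering sketch. Euclidean distance and average linkage are assumptions made here for illustration, as the linkage and distance actually used for Fig. 3 are not restated in this section. The model names and anomaly values are hypothetical.

```python
import math
from itertools import combinations


def standardize(vectors):
    """Z-score each feature (column) across all models."""
    names = list(vectors)
    n_features = len(vectors[names[0]])
    out = {name: [] for name in names}
    for j in range(n_features):
        column = [vectors[name][j] for name in names]
        mu = sum(column) / len(column)
        sd = math.sqrt(sum((x - mu) ** 2 for x in column) / len(column))
        for name in names:
            out[name].append((vectors[name][j] - mu) / sd)
    return out


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(vectors):
    """Merge the two closest clusters repeatedly; return the merge history."""
    def dist(c1, c2):
        # Average pairwise distance between the members of two clusters.
        pairs = [(p, q) for p in c1 for q in c2]
        return sum(euclidean(vectors[p], vectors[q]) for p, q in pairs) / len(pairs)

    clusters = [(name,) for name in vectors]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical percent anomalies per model (two illustrative features each).
anomalies = {
    "GCM-A1": [5.0, 3.0],
    "GCM-A2": [6.0, 2.0],
    "GCM-B": [-2.0, -4.0],
    "RCM-X": [-30.0, -35.0],
}

merges = average_linkage(standardize(anomalies))
for left, right, d in merges:
    print(left, "+", right, f"at distance {d:.2f}")
```

In this toy example the two models of the same hypothetical family merge first, and the strongly drying regional model joins last. This is analogous to the isolated position reported for Eta-HadGEM2-ES.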
The fact that global climate models use different physical representations of processes, on a relatively low-resolution grid, introduces a degree of uncertainty in these future climate change scenarios and, consequently, in the assessment of the resulting vulnerability and impacts. Such uncertainties become greater in the case of regional models, as they are strongly dependent on the boundary conditions and the methods used to adapt the RCM variables to the lower spatial resolution of the GCM. Thus, according to Pielke and Wilby (2012), large-scale weather errors in GCMs are not only absorbed by RCMs but can be amplified, due to their higher spatial resolution. Furthermore, if the RCM is nested in just a few GCMs, the high-resolution scenarios do not cover the full range of projected changes that a larger number of GCMs would indicate as plausible futures, thus increasing the uncertainty of the results obtained (Hewitson et al., 2014).

Conclusions
In order to better understand how the Eta regional model behaves in relation to some GCMs and, in particular, to its global forcings, this research compared future climate change projections from Eta with those of its own forcing models and of the other global models. The aim was to verify whether the Eta results agree with the GCMs, under the assumption that the climate variable represented by RCMs presents a bias of intensification of its signal due to systematic errors. Obtaining more detailed projections for the analyzed regions, with a higher spatial resolution than that provided by a GCM, is of great use, particularly for studies of the impacts of climate change, aiding in the management and operation of water resources. Based on the results obtained, the main contributions of the study are:
• The strong correlation of the Eta model nested in the global models with observational data showed its good ability to represent the present climate and to reproduce the seasonality of rainfall in the São Francisco and Paraíba do Sul River basins. The Paraíba do Sul River basin showed greater variability than the São Francisco River basin;
• Precipitation anomaly projections obtained from the Eta model indicated that a large part of the basins may suffer a significant shortage of rainfall in the period 2011 to 2040 and, above all, pointed to disagreement in the sign of change between Eta and both its forcing GCMs and the other CMIP5 models. When comparing the anomalies simulated by the two global models, HadGEM2-ES and MIROC5, with and without the use of the regional model, it was evident that Eta amplifies the drying signal;
• This opposite sign between the global models alone and the regional model forced by them may be related to the fact that Eta is able to resolve local and mesoscale physical processes, which have more influence on precipitation. Much of the dispersion identified in the results of the global models participating in the fifth IPCC report (AR5) for the Brazilian Northeast comes from their difficulty in satisfactorily resolving and reproducing the physical and dynamic processes that drive climate variability in the region, such as ENSO and Tropical Atlantic variability (Bellenger et al., 2014);
• The information generated suggests the need for a joint analysis of a set of models with different physical representations, given that relying on only one model can neglect the uncertainty associated with climate change. This can reduce the robustness of the water resources management strategies adopted, because not all plausible scenarios are considered. Especially for sectors that need information on a smaller spatial scale, such as water resources, agriculture and health, among others, decisions that disregard the uncertainty of the future climate can lead to high-regret strategies and undermine their robustness;
• Finally, further studies need to discuss more carefully how the regional model is fed by the global one and how changes in the parameters of the regional model affect this divergence. In addition, given the computational cost and the need for more than one regional model to enable an analysis of climate projections, it is imperative that institutions evaluate the joint use of dynamic and statistical downscaling, although the latter technique has limitations, especially those associated with stationarity.
Given the points mentioned, the central question is not the reliability of result A or B, but the treatment of uncertainty in the future climate, which cannot be neglected, especially in such important sectors of society as water resources and in the basins addressed in this work.

Figure 1 - Location map of the study region.

Figure 2 presents the boxplot of the monthly average precipitation in the Paraíba do Sul and São Francisco River basins considering the GPCC data series for the period from 01/1971 to 11/2005. The Taylor diagram presented alongside it provides a graphic summary of how close the climate models HadGEM2-ES and MIROC5 and the regional Eta model forced by these global models, hereinafter called Eta-HadGEM2-ES and Eta MIROC5, respectively, are to the observed GPCC data for the precipitation variable.

Figure 2 - GPCC monthly average rainfall boxplot for the period from 01/1971 to 11/2005 in a) São Francisco and c) Paraíba do Sul; Taylor diagram for the Eta-HadGEM2-ES, Eta MIROC5, HadGEM2-ES and MIROC5 models in relation to observed GPCC precipitation in b) São Francisco and d) Paraíba do Sul. The empty circle on the abscissa axis represents the set of observations.

Figure 3 - Dendrogram showing the dissimilarity of the models for standardized data of the percentage anomaly of the mean annual precipitation in the São Francisco River basin (a) RCP4.5 (b) RCP8.5 and the Paraíba do Sul River basin (c) RCP4.5 (d) RCP8.5. The allocation of models to each cluster is indicated by colors.

Table 2 - CMIP5 global models that acted as initial and boundary conditions for the RCM Eta.