## Bragantia

*Print version* ISSN 0006-8705

### Bragantia vol.70 no.4 Campinas 2011

#### http://dx.doi.org/10.1590/S0006-87052011000400031

**AGROMETEOROLOGY**

**Incorporating climate trends in the stochastic modeling of extreme minimum air temperature series of Campinas, state of São Paulo, Brazil**

**Incorporando tendências climáticas na modelagem estocástica da série de temperatura do ar mínima absoluta de Campinas, Estado de São Paulo**

**Gabriel Constantino Blain ^{*}**

Instituto Agronômico, Centro de Ecofisiologia e Biofísica, Caixa Postal 28, 13012-970 Campinas (SP), Brasil

**ABSTRACT**

Under the hypothesis that the presence of climate trends in the annual extreme minimum air temperature series of Campinas (Tminabs; 1891-2010; 22º54'S; 47º05'W; 669 m) may no longer be neglected, the aim of this work was to describe the probabilistic structure of this series based on the generalized extreme value distribution (GEV) with parameters estimated as a function of a time covariate. The results of the likelihood ratio test and of the percentile-percentile and quantile-quantile plots indicate that a time-dependent model provides a feasible description of the process under evaluation. In this non-stationary GEV model, the location and scale parameters were expressed as time-dependent functions, while the shape parameter remained constant. Although this non-stationary model indicates an average increase in the values of the analyzed data, it does not allow us to conclude that the region of Campinas is now free from frost occurrence, since the same model also reveals an increasing trend in the dispersion of the variable under evaluation. However, since the location and scale parameters of this probabilistic model are significantly conditioned on time, the presence of climate trends in the analyzed time series is proven.

**Key words:** extreme value, non-stationary, time-dependent functions.

**RESUMO**

Sob a hipótese de que a presença de tendências climáticas na série anual de temperatura do ar mínima absoluta de Campinas, Estado de São Paulo (Tminabs; 1891-2010; 22º54'S; 47º05'W; 669 m), não pode mais ser negligenciada, o objetivo do trabalho foi descrever a estrutura probabilística com base na distribuição geral dos valores extremos (GEV), e parâmetros estimados em função da covariável tempo. Os resultados observados após a aplicação dos testes de Mann-Kendall, razão da verossimilhança e gráficos percentil-percentil e quantil-quantil, indicaram que a utilização de um modelo dependente do tempo resulta em uma descrição verossímil do processo sob avaliação. Neste modelo não estacionário, os parâmetros de localização e escala foram expressos como funções do tempo. O parâmetro de forma ou calda permaneceu constante. Observou-se também que embora o modelo tenha indicado aumento médio nos valores de Tminabs, não é possível concluir que a região de Campinas está atualmente livre da ocorrência de geadas, uma vez que o modelo também revela a tendência de elevação na dispersão dos dados da variável sob análise. Contudo, uma vez que os parâmetros de localização e de escala são significativamente condicionados no tempo, a presença de tendência climática, na referida série, está comprovada.

**Palavras-chave:** Valores extremos, não estacionário, funções dependente do tempo.

**1. INTRODUCTION**

Statistical evaluations applied to meteorological time series have been used to improve society's capacity to deal with the influence of climate variability on almost all human activities. For instance, based on the relationship between agricultural production and the air temperature values observed during the crop seasons, several studies have evaluated the probability of occurrence associated with extreme minimum air temperature values in order to provide important information for frost^{(1)} mitigation procedures (CAMARGO et al., 1993; ASTOLPHO et al., 2004; BLAIN and LULU, 2011).

Based on the Extreme Value Theory (EVT), CAMARGO et al. (1993) and ASTOLPHO et al. (2004) used a particular case of the generalized extreme value distribution (GEV), called Gumbel or Fisher-Tippett type I, to describe the probabilistic structure of the annual extreme minimum air temperature series (Tminabs) available from the weather station of Campinas (hereafter referred to simply as Campinas). This time series is composed of the lowest air temperature value observed during each year.

Considering that the GEV has all the flexibility of its three particular cases (type I, Gumbel; type II, Fréchet; and type III, Weibull; NADARAJAH and CHOI, 2007), BLAIN and LULU (2011) used the GEV distribution to assess the probability of occurrence associated with Tminabs values (1948-2007) observed in six regions of the State of São Paulo.

According to LEADBETTER et al. (1983) and WILKS (2006), a basic result of the EVT is that the distribution of the maxima of independent and identically distributed (iid) random variables converges to one of the particular cases of the GEV. This result is called the Extremal Types Theorem. Thus, the probability of occurrence of an extreme event observed in any year t (x_{t}) can be described as Pr{X ≤ x_{t}}=GEV(x_{t}; µ, σ, ξ), where µ, σ, and ξ are, respectively, the location, scale, and shape parameters. Also according to WILKS (2006), although the EVT is often related to the description of the behavior of the maximum values observed in a data sample, its approach is equally applicable to the evaluation of the smallest values observed in this data sample. In this case it is only necessary to transform the variable x into -x and, consequently, -µ into µ.

It is worth emphasizing that the use of the GEV(x_{t}; µ, σ, ξ) model is frequently called "the stationary approach" (COLES, 2001; EL ADLOUNI et al., 2007; PUJOL et al., 2007; MÉNDEZ et al., 2007; FURIÓ and MENEU, 2011), since the probability associated with any x_{t} is independent of t. In other words, once the GEV(x_{t}; µ, σ, ξ) model is fitted to an already observed time series (e.g., the 1891-2010 time span), it is assumed that the values of µ, σ, and ξ will remain the same during the following years (e.g., 2011 to 2050). However, the Intergovernmental Panel on Climate Change (IPCC, 2001 and 2007) indicates that the intensity and the frequency of extreme events (such as extreme low air temperatures) will change due to global warming. Furthermore, the IPCC (2007) also states that the recent increases in global air temperature already have perceptible impacts on many natural systems.

As indicated by BLAIN et al. (2009) the evaluation of the average annual minimum air temperature (Tmin) of Campinas seems to agree with this last statement of the IPCC (2007) since this time series has shown remarkable increasing trends during the last years. Indeed, considering the years between 1890 and 2007, this annual series has shown a linear increasing trend at the rate of (approximately) 2.5 ºC per 100 years (BLAIN et al., 2009).

These last considerations have allowed us to (i) work under the hypothesis that the use of a stationary GEV model may no longer be valid since the presence of climate trends in the Tminabs series of Campinas must not be neglected, and to (ii) assume that the development of a time-dependent model may provide a more feasible description of future realizations of the variable under evaluation. Thus, the aim of the work was to describe the probabilistic structure of the Tminabs data sample based on a GEV model in which the parameters are estimated as a function of a time covariate.

**2. MATERIAL AND METHODS**

Annual extreme minimum air temperature data from the weather station of Campinas, State of São Paulo, Brazil (22º54'S; 47º05'W; 669 m; 1890-2010) were used. This dataset was obtained from the Instituto Agronômico (IAC/APTA/SAA; Figure 1).

**Theoretical background**

According to COLES (2001), EL ADLOUNI et al. (2007), MÉNDEZ et al. (2007), PUJOL et al. (2007), and FURIÓ and MENEU (2011), a non-stationary GEV model may be described by a probability density function whose parameters depend on time (equation 1).
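Equation 1 is not reproduced in this copy of the article. As a hedged reconstruction, the density of a GEV distribution with time-dependent parameters can be written in the parameterization of COLES (2001) as shown here, assuming ξ_{t} ≠ 0. The symbols µ_{t}, σ_{t} and ξ_{t} are those of the four models that follow; the exact layout of the original equation 1 is an assumption.

```latex
f(x_t;\mu_t,\sigma_t,\xi_t)
  = \frac{1}{\sigma_t}
    \left[1+\xi_t\left(\frac{x_t-\mu_t}{\sigma_t}\right)\right]^{-1/\xi_t-1}
    \exp\left\{-\left[1+\xi_t\left(\frac{x_t-\mu_t}{\sigma_t}\right)\right]^{-1/\xi_t}\right\},
  \qquad 1+\xi_t\left(\frac{x_t-\mu_t}{\sigma_t}\right)>0
```

With µ_{t}, σ_{t} and ξ_{t} held constant, this expression reduces to the stationary GEV density.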

Under the framework of equation 1, we have proposed four GEV models with increasing numbers of parameters to be estimated.

Model 1: GEV(µ_{t}=µ, σ_{t}=σ, ξ_{t}=ξ) - The stationary model. The parameters are constant.

Model 2: GEV(µ_{t}=µ_{o}+βt, σ_{t}=σ, ξ_{t}=ξ) - The homoscedastic model. Only the location parameter changes with time.

Model 3: GEV(µ_{t}=µ'_{o}+β't, σ_{t}=exp(σ_{o}+αt), ξ_{t}=ξ) - The location and scale parameters change with time.

Model 4: GEV(µ_{t}=µ''_{o}+β''t, σ_{t}=exp(σ'_{o}+α't), ξ_{t}=ξ_{o}+δt) - All three parameters change with time.
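The nesting of the four models can be made concrete with a short sketch. The function name `model_parameters`, the dictionary keys, and the constant `N_PARAMETERS` are hypothetical names chosen for this illustration; only the functional forms come from the model definitions above.

```python
import math

# Number of free parameters in each model (3 for the stationary case,
# one more for each coefficient that is allowed to vary with time).
N_PARAMETERS = {1: 3, 2: 4, 3: 5, 4: 6}


def model_parameters(model, t, coef):
    """Return (mu_t, sigma_t, xi_t) for models 1 to 4 at time index t.

    `coef` is a dict with hypothetical keys: mu0, beta (location),
    sigma (constant scale) or sigma0, alpha (log-linear scale),
    xi0, delta (shape).
    """
    if model not in N_PARAMETERS:
        raise ValueError("model must be 1, 2, 3 or 4")
    # Location: constant in model 1, linear in time in models 2 to 4.
    mu_t = coef["mu0"] + (coef["beta"] * t if model >= 2 else 0.0)
    # Scale: constant in models 1 and 2, log-linear in time in models 3 and 4,
    # which keeps sigma_t positive for every t.
    if model >= 3:
        sigma_t = math.exp(coef["sigma0"] + coef["alpha"] * t)
    else:
        sigma_t = coef["sigma"]
    # Shape: constant in models 1 to 3, linear in time in model 4.
    xi_t = coef["xi0"] + (coef["delta"] * t if model == 4 else 0.0)
    return mu_t, sigma_t, xi_t


# Example call with arbitrary illustrative coefficients (not fitted values).
example = {"mu0": 3.0, "beta": 0.03, "sigma0": 0.3, "alpha": 0.005, "xi0": -0.2}
print(model_parameters(3, 120, example))
```

Each model is recovered from the next one by fixing one coefficient at zero (β, then α, then δ), which is what allows the models to be compared as nested hypotheses.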

Following COLES (2001), the maximum likelihood estimators of the parameters of equation 1 were obtained by maximizing the likelihood function (equation 2). Note that t=1 refers to the year 1891 and, consequently, t=120 refers to the year 2010.
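Equation 2 is not reproduced in this copy. Assuming it is the usual product of the GEV densities over t=1,...,n, its logarithm is the sum of the log densities, and a minimal sketch for model 3 could look as follows. The function names are hypothetical, ξ ≠ 0 is assumed, and the series is assumed to be already expressed in the orientation to which the GEV is fitted (for minima, the text suggests working with -x).

```python
import math


def gev_log_density(x, mu, sigma, xi):
    """Log of the GEV density for xi != 0; -inf outside the support."""
    if sigma <= 0.0:
        return -math.inf
    s = 1.0 + xi * (x - mu) / sigma
    if s <= 0.0:
        return -math.inf
    return -math.log(sigma) - (1.0 / xi + 1.0) * math.log(s) - s ** (-1.0 / xi)


def log_likelihood_model3(params, series):
    """Sum of log densities with mu_t = mu0 + beta*t and sigma_t = exp(sigma0 + alpha*t)."""
    mu0, beta, sigma0, alpha, xi = params
    total = 0.0
    for t, x in enumerate(series, start=1):  # t = 1 is the first year of the record
        mu_t = mu0 + beta * t
        sigma_t = math.exp(sigma0 + alpha * t)
        total += gev_log_density(x, mu_t, sigma_t, xi)
    return total
```

The maximum likelihood estimates are the values of `params` that maximize this function; in practice the negative of the function would be passed to a general-purpose numerical minimizer.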

**Supporting the non-stationary approach**

The Mann-Kendall (MK) test (MANN, 1945; KENDALL and STUART, 1967) was used as a methodology for trend detection, since non-stationary features may be identified by detecting trends in the Tminabs series (ZHANG et al., 2001; 2004). The null hypothesis (H_{0}) associated with the MK test assumes that the sample is free from trends (the absence of significant serial correlation is also assumed). H_{0} is usually rejected if the p-value is less than or equal to 0.05.

As indicated by ANDERSON (1976), if the probabilistic structure of the variable under evaluation does not change with time, the process that generates the Tminabs values can be seen as strictly stationary. Otherwise it is non-stationary. Thus, based on model 1 (stationary approach), the parameters of equation 1 were estimated for the following five time spans: 1891-1950, 1951-2010, 1891-1930, 1931-1970, and 1971-2010. The first year of the series (1890) was not considered in order to obtain periods of equal length. The likelihood ratio test (Λ*; as described in WILKS, 2006) was used to evaluate the differences between the probabilistic structures associated with each of these periods. The Λ* test (equation 3) can be described by the difference of the log-likelihoods [*l(.)*] associated with the null and alternative hypotheses.

As indicated by WILKS (2006), under H_{0} the sampling distribution of the statistic in equation 3 is chi-square with three (equation 3.1) and six (equation 3.2) degrees of freedom. For equation 3.1, H_{0} assumes that the two data records (1891-1950 and 1951-2010) were drawn from the same GEV distribution. Thus, the rejection of H_{0} at a particular significance level indicates that the probabilistic structures associated with these two time spans are significantly different. For equation 3.2, H_{0} assumes that the three data records (1891-1930, 1931-1970, and 1971-2010) were drawn from the same GEV distribution.
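Equations 3, 3.1 and 3.2 are not reproduced in this copy. Following the general form of the likelihood ratio statistic in WILKS (2006), they can be sketched as shown here. The subscripted symbols l_{a-b}, denoting the maximized log-likelihood of the stationary model fitted to the years a to b, are shorthand introduced for this sketch and are not the author's notation.

```latex
\begin{align}
\Lambda^{*} &= 2\left[\,l(H_{A}) - l(H_{0})\,\right] \tag{3}\\
\Lambda^{*} &= 2\left[\,l_{1891\text{-}1950} + l_{1951\text{-}2010} - l_{1891\text{-}2010}\,\right] \tag{3.1}\\
\Lambda^{*} &= 2\left[\,l_{1891\text{-}1930} + l_{1931\text{-}1970} + l_{1971\text{-}2010} - l_{1891\text{-}2010}\,\right] \tag{3.2}
\end{align}
```

The degrees of freedom equal the difference in the number of fitted parameters: 2 × 3 - 3 = 3 for equation 3.1 and 3 × 3 - 3 = 6 for equation 3.2, in agreement with the values quoted in the text.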

**Choosing the model**

As indicated by EL ADLOUNI et al. (2007), the most general model is frequently the best at representing the data sample under evaluation. However, EL ADLOUNI et al. (2007) also indicate that, according to the parsimony principle, when the differences between two models Model_{i} and Model_{j} (in which Model_{i} ⊂ Model_{j} if j>i) are not significant, it is better to use the simpler one (Model_{i}), because the uncertainty in quantile estimation tends to increase with the number of parameters to be estimated. In this view, model 1 can be seen as a particular case of model 2. At the same time, both of these models can be seen as particular cases of model 3. These three models are, in turn, particular cases of model 4. According to COLES (2001), the Λ* algorithm can also be applied to compare a more general model against a particular case (p-values less than or equal to 0.05 were taken as evidence that the more general model is better than the particular one).

As pointed out by SANSIGOLO (2008), although the chi-square and the Kolmogorov-Smirnov tests are commonly used goodness-of-fit tests, both methods are only appropriate for evaluating the central part of a distribution. Since this work deals with extreme data, the adequacy of the models must be evaluated with goodness-of-fit procedures that focus on the upper (lower, in the present case) tail of the distribution. In this view, the quantile-quantile plot (QQ) and the percentile-percentile plot (PP) were used to compare the observed data with the fitted distribution. However, it is worth emphasizing that a natural consequence of adopting a time-dependent model is that the data cannot be considered iid. Thus, before plotting the QQ and PP, we had to transform the data to ensure that each point has the same distribution. This transformation was carried out by adopting the procedure described in MÉNDEZ et al. (2007) and FELICI et al. (2007). For both PP and QQ plots, the displacement of the Cartesian points from the diagonal line (1:1) indicates low quality of the parametric estimation (FELICI et al., 2007).
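One standard way of carrying out such a transformation is the standardization given in COLES (2001), which maps each observation to a value that follows a standard Gumbel distribution if the fitted model holds. The sketch below assumes that this form is equivalent to the procedure of the cited references, that ξ ≠ 0, and that the series is already in the orientation to which the GEV was fitted; the function names are hypothetical.

```python
import math


def standardize(series, mu_of_t, sigma_of_t, xi):
    """Map each x_t to z_t = log(1 + xi*(x_t - mu_t)/sigma_t) / xi."""
    z = []
    for t, x in enumerate(series, start=1):
        s = 1.0 + xi * (x - mu_of_t(t)) / sigma_of_t(t)
        if s <= 0.0:
            raise ValueError(f"observation at t={t} lies outside the fitted support")
        z.append(math.log(s) / xi)
    return z


def pp_qq_points(z):
    """Return the PP and QQ coordinate pairs of the standardized values."""
    ordered = sorted(z)
    m = len(ordered)
    pp, qq = [], []
    for i, z_i in enumerate(ordered, start=1):
        empirical = i / (m + 1.0)  # plotting position of the i-th order statistic
        pp.append((empirical, math.exp(-math.exp(-z_i))))    # standard Gumbel CDF
        qq.append((-math.log(-math.log(empirical)), z_i))    # standard Gumbel quantile
    return pp, qq
```

When the fitted model describes the data well, both sets of points lie close to the 1:1 line, which is the criterion described above.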

**3. RESULTS AND DISCUSSION**

The visual analysis of Figure 1 seems to indicate an average increase in the Tminabs data. However, it is also apparent that this increasing trend is associated with a greater dispersion of the data values observed during the most recent years of Figure 1. These subjective results were confirmed by the statistical methods applied in this study.

The MK final value (4.18) indicates the presence of a significant trend component in the analyzed time series, since the p-value associated with this statistic is equal to 0.0052 (considerably lower than the significance level of 0.05). A similar feature is also indicated by the results of the Λ* test (Table 1).

A comparison of the distinct sub-periods of the Tminabs series (Table 1) makes the increasing trend of these extreme data over the years evident. The highest values of the parameters µ and σ are found in the most recent time spans (1951-2010 and 1971-2010). In addition, since the Λ* test rejects the null hypothesis at a confidence level greater than 0.99 in every comparison (Table 1), we could not accept the hypothesis that all the data records were drawn from the same population with a stationary GEV distribution. The results of both the MK and Λ* tests indicate that the probabilistic structure of the Tminabs series does change with time. Consequently, the use of a GEV model with fixed parameters, in which the presence of climate trends is neglected, may no longer be valid. Thus, a time-dependent model should provide a more realistic description of the process under evaluation.

**Choosing the non-stationary model**

As expected, the Λ* test indicated that models 2, 3, and 4 are significantly better than model 1 for describing the process that generated the data observed during 1891-2010. The Λ* values were 21.52, 28.57, and 28.58 when models 2, 3, and 4, respectively, were tested against model 1, all with p-values lower than 0.01. The Λ* test also indicates that models 3 and 4 are significantly better than model 2, since the Λ* values were 7.04 (p-value=0.008; model 3 against model 2) and 7.06 (p-value=0.03; model 4 against model 2). Finally, the test indicated the absence of a significant difference between models 3 and 4 (Λ*=0.02; p-value=0.89). Thus, following EL ADLOUNI et al. (2007), i.e., considering the parsimony principle, we assumed that model 3 is the most adequate for modeling the Tminabs series of Campinas. The parameters of this model are: µ_{t}=3.095+0.0301t, σ_{t}=exp(0.316+0.0053t), ξ_{t}=ξ=-0.177. Thus, based on model 3, one is able to estimate the probability of occurrence associated with a given Tminabs value in a given year t.
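The p-values quoted above can be checked from the Λ* values alone. The sketch below assumes that the degrees of freedom of each comparison equal the difference in the number of parameters between the two models (3, 4, 5 and 6 parameters for models 1 to 4), which the text does not state explicitly. The helper `chi2_sf` is a hypothetical name for a closed-form chi-square upper-tail probability valid for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series with terms
    # sqrt(2x/pi) * exp(-x/2) * x^(j-1) / (1*3*...*(2j-1)).
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return total


# (comparison, reported statistic, assumed degrees of freedom)
comparisons = [
    ("model 2 vs model 1", 21.52, 1),
    ("model 3 vs model 1", 28.57, 2),
    ("model 4 vs model 1", 28.58, 3),
    ("model 3 vs model 2", 7.04, 1),
    ("model 4 vs model 2", 7.06, 2),
    ("model 4 vs model 3", 0.02, 1),
]
for label, statistic, df in comparisons:
    print(f"{label}: statistic={statistic:.2f}, df={df}, p={chi2_sf(statistic, df):.4f}")
```

Under this assumption the three p-values given explicitly in the text (0.008, 0.03 and 0.89) are recovered to the reported precision, and the three comparisons against model 1 all fall below 0.01.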

The small displacement of the Cartesian points in both PP and QQ plots (Figure 2) supports our decision to use model 3 to represent the probability density function [Pr(Tminabs_{t=1}, Tminabs_{t=2},...,Tminabs_{t=n})].

In order to provide an example of the use of this non-stationary model, the probabilities associated with 5 ºC, 4 ºC, 3 ºC, 2 ºC, and 1 ºC in the years 2020, 2030, 2040, and 2050 are presented in Table 2 (for more information about the relationship between Tminabs values and frost occurrence, see SENTELHAS et al., 1995).
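A calculation of this kind can be sketched from the fitted coefficients of model 3. The sketch rests on an assumption about the sign convention: following the x to -x transformation mentioned in the Introduction, -Tminabs is taken to follow a GEV for maxima with location -µ_{t}, scale σ_{t} and shape ξ. The article does not spell out this detail, so the values printed by the code are illustrative only and are not taken from Table 2. The function names are hypothetical.

```python
import math

BASE_YEAR = 1890  # t = 1 corresponds to 1891, as stated in the text
XI = -0.177       # constant shape parameter of model 3


def model3_parameters(year):
    """Location and scale of model 3 for a calendar year."""
    t = year - BASE_YEAR
    mu_t = 3.095 + 0.0301 * t
    sigma_t = math.exp(0.316 + 0.0053 * t)
    return mu_t, sigma_t


def prob_at_or_below(threshold, year):
    """Pr(Tminabs <= threshold) under the assumed minima convention."""
    mu_t, sigma_t = model3_parameters(year)
    # ASSUMPTION: -Tminabs follows a GEV for maxima with location -mu_t.
    s = 1.0 + XI * ((-threshold) - (-mu_t)) / sigma_t
    if s <= 0.0:
        # With XI < 0, -threshold lies above the upper bound of -Tminabs,
        # so the threshold is below the smallest value the model allows.
        return 0.0
    g = math.exp(-s ** (-1.0 / XI))  # Pr(-Tminabs <= -threshold)
    return 1.0 - g


for year in (2020, 2030, 2040, 2050):
    mu_t, sigma_t = model3_parameters(year)
    probs = ", ".join(f"{temp} C: {prob_at_or_below(temp, year):.3f}" for temp in (5, 4, 3, 2, 1))
    print(f"{year}: mu_t={mu_t:.3f}, sigma_t={sigma_t:.3f} | {probs}")
```

Because µ_{t} and σ_{t} both grow with t, the printed probabilities reflect two competing effects: the upward shift of the distribution and its widening.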

As can be seen, the probabilities of occurrence presented in Table 2 tend, in general, to decrease over the years. This feature is a natural consequence of the increasing trend in the values of µ_{t}. However, within the analyzed period (2020-2050), model 3 also indicates a continuous decrease in the rate of change of the probabilities of occurrence associated with the lowest Tminabs values (Table 2). Although this last feature seems to contradict the increasing values of the location parameter, it is worth emphasizing that model 3 also shows an increasing trend in the σ_{t} values, which indicates an increasing dispersion of the Tminabs values over the years. In other words, although model 3 indicates an increasing trend in the average values of the Tminabs data, it does not give us enough evidence to speculate that extreme low temperatures (such as Tminabs < 2 ºC) will no longer be observed at the weather station of Campinas. Also according to model 3, this trend will be accompanied by an increase in the variance of future Tminabs data. From an agrometeorological point of view, this greater dispersion may be linked with future frost events.

Finally, it is worth mentioning that all proposed models (2, 3, and 4) are linear or log-linear equations in which, as pointed out by CANNON (2010), the dependence of the distribution parameters on the covariates had to be specified *a priori*. Although several authors, such as COLES (2001), EL ADLOUNI et al. (2007), MÉNDEZ et al. (2007), FELICI et al. (2007), PUJOL et al. (2007), and FURIÓ and MENEU (2011), also work under this assumption of linearity, the development of a more flexible framework that can be used to model non-linear relationships is of great interest. In this view, the study carried out by CANNON (2010), which specifies the parameters of the GEV by using a probabilistic extension of the multilayer perceptron neural network, can be seen as an important future alternative to the linear approach adopted in this study.

**4. CONCLUSION**

A non-stationary GEV model in which the parameters of location and scale are expressed as time-dependent functions is recommended to describe the probabilistic structure of the annual extreme minimum air temperature series available from the weather station of Campinas, State of São Paulo, Brazil. In addition, since two of the three parameters of this probabilistic model are significantly conditioned on time, the presence of climate trends in the analyzed time series becomes evident.

Although this non-stationary model has indicated an average increase in the values of the analyzed data (represented by the time variability of µ), it does not allow us to conclude that the region of Campinas is now free from frost occurrence. The increasing temporal trend in the scale parameter reveals an increasing trend in the dispersions of the data sample. This greater dispersion may be linked with future values of this variable that still may cause the death of plant tissues.

**REFERENCES**

ANDERSON, O.D. Time series analysis and forecasting: the Box-Jenkins approach. London and Boston: Butterworths, 1976.

ASTOLPHO, F.; CAMARGO, M.B.P.; BARDIN, L. Probabilidades mensais e anuais de ocorrência de temperaturas mínimas do ar adversas à agricultura na região de Campinas (SP), de 1891 a 2000. Bragantia, v.63, p.141-147, 2004.

BLAIN, G.C.; PICOLI, M.C.A.; LULU, J. Análise estatística das tendências de elevação nas séries anuais de temperatura mínima do ar no Estado de São Paulo. Bragantia, v.68, p.807-815, 2009.

BLAIN, G.C.; LULU, J. Considerações estatísticas relativas a seis séries mensais de temperatura do ar da Secretaria de Agricultura e Abastecimento do Estado de São Paulo. Revista Brasileira de Meteorologia, v.26, p.29-40, 2011.

CAMARGO, M.B.P.; PEDRO JÚNIOR, M.J.; ALFONSI, R.R.; ORTOLANI, A.A.; BRUNINI, O. Probabilidades de ocorrência de temperaturas mínimas absolutas mensais e anual no Estado de São Paulo. Bragantia, v.52, p.161-168, 1993.

CANNON, A.J. A flexible nonlinear modeling framework for nonstationary generalized extreme value analysis in hydroclimatology. Hydrological Processes, v.24, p.673-685, 2010.

COLES, S. An introduction to statistical modeling of extreme values. London: Springer, 2001.

EL ADLOUNI, S.; OUARDA, T.B.M.J.; ZHANG, X.; ROY, R.; BOBÉE, B. Generalized maximum likelihood estimators for the nonstationary generalized extreme value model. Water Resources Research, v.43, p.1-13, 2007.

FELICI, M.; LUCARINI, V.; SPERANZA, A.; VITOLO, R. Extreme value statistics of the total energy in an intermediate-complexity model of the midlatitude atmospheric jet. Part II: trend detection and assessment. Journal of the Atmospheric Sciences, v.64, p.2159-2175, 2007.

FURIÓ, D.; MENEU, V. Analysis of extreme temperatures for four sites across Peninsular Spain. Theoretical and Applied Climatology, v.104, p.83-99, 2011.

IPCC. Climate Change 2001: Impacts, adaptation and vulnerability, contribution of working group 2 to the third assessment report of the intergovernmental panel on climate change. HOUGHTON, J.T. (Ed.). Cambridge: University Press, 2001.

IPCC. Climate Change 2007: The Physical Science Basis, Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. HOUGHTON, J.T. (Ed.). Cambridge: University Press, 2007.

KENDALL, M.G.; STUART, A. The advanced theory of statistics. 2.ed. London: Charles Griffin, 1967. v.2, 690p.

LEADBETTER, M.R.; LINDGREN, G.; ROOTZEN, H. Extremes and Related Properties of Random Sequences and Processes. New York: Springer, 1983. 336p.

MANN, H.B. Non-parametric tests against trend. Econometrica, v.13, p.245-259, 1945.

MÉNDEZ, F.J.; MENÉNDEZ, M.; LUCEÑO, A.; LOSADA, I.J. Analyzing Monthly Extreme Sea Levels with a Time-Dependent GEV Model. Journal of Atmospheric and Oceanic Technology, v.24, p.894-911, 2007.

NADARAJAH, S.; CHOI, D. Maximum daily rainfall in South Korea. Journal of Earth System Science, v.116, p.311-320, 2007.

PUJOL, N.; NEPPEL, L.; SABATIER, R. Regional tests for trend detection in maximum precipitation series in the French Mediterranean region. Hydrological Sciences Journal, v.52, p.952-973, 2007.

SANSIGOLO, C.A. Distribuições de extremos de precipitação diária, temperatura máxima e mínima e velocidade do vento em Piracicaba, SP (1917-2006). Revista Brasileira de Meteorologia, v.23, p.341-346, 2008.

SENTELHAS, P.C.; ORTOLANI, A.A.; PEZZOPANE, J.R.M. Estimativa da temperatura mínima de relva e da diferença de temperatura entre o abrigo e a relva em noites de geada. Bragantia, v.54, p.437-445, 1995.

WILKS, D.S. Statistical methods in the atmospheric sciences. 2.ed. San Diego: Academic Press, 2006. 629p.

ZHANG, X.; HARVEY, K.D.; HOGG, W.D.; YUZYK, T.R. Trends in Canadian streamflow. Water Resources Research, v.37, p.987-999, 2001.

ZHANG, X.; ZWIERS, F.W.; LI, G. Monte Carlo experiments on the detection of trends in extreme values. Journal of Climate, v.17, p.1945-1952, 2004.

Received: February 9, 2011

Accepted: July 20, 2011

(*) Corresponding author: gabriel@iac.sp.gov.br

(1) Defined from an agronomic point of view: death of the plant tissues caused by extreme low air temperature values.