MODELLING AIR TEMPERATURE FOR THE STATE OF SÃO PAULO, BRAZIL

Spatial models of air temperature (maximum, mean and minimum) for the State of São Paulo (Brazil) were built using multiple regression analysis and ordinary kriging. Climatic data (mean values of five or more years) were obtained from 256 meteorological stations distributed uniformly over the State. The relationship between the climatic dependent variables and the independent variables latitude and altitude was significant and explained most of the spatial variability. The coefficients of determination (P < 0.05) ranged from 0.924 to 0.953, showing that multiple regression analysis is an accurate method for modelling air temperature in the State of São Paulo. Finally, the regression equations were used together with kriged maps of the residual errors to build 15 digital maps of air temperature using a 0.5 km Digital Elevation Model in a Geographic Information System.


INTRODUCTION
Knowledge of the spatial distribution of climatic data is an essential tool for the management of natural resources, and the prediction of climatic data is useful in a wide range of scientific disciplines (e.g. agronomy, geography and ecology). Moreover, the impact of fossil fuel burning and deforestation on climatic change has raised awareness of the need for adequate and reliable prediction models of air temperature around the world.
The objectives of this study were i) to calculate spatial models of both monthly and annual air temperatures (mean, minimum and maximum) for the State of São Paulo (Brazil) using multiple regression analysis and ordinary kriging; and ii) to use these models to derive digital raster maps of air temperature for the study area.

CASE STUDY
The State of São Paulo is located in the southern hemisphere (19-25ºS, 44-53ºW) in Brazil, with an area of 248,816 km². The State is divided into five geomorphologic units (Figure 1). The Western Plateau constitutes the main unit (60% of the area). Sedimentary deposits form this plateau, and its morphology is dominated by flat to undulating landscapes of low hills with slopes between 0-20% (Cacheiro-Pose et al., 2002) and altitudes ranging between 250-650 m a.s.l. The Basaltic Cuestas spread from SW to NE in the middle of the State, forming a range of traps with alternating layers of basaltic rocks and eolian sandstones, with altitudes between 500-900 m a.s.l. The Peripheral Depression, situated in the central portion of the State, constitutes a topographically lower zone between the Basaltic Cuestas and the Atlantic Region, with sediments including shales, siltstones, sandstones and occasional basaltic intrusions. The Atlantic Region is the most elevated area, formed by a plateau and two mountainous regions (Serra do Mar and Serra da Mantiqueira), with slopes between 0-20% on the plateau and 10-40% in the mountains. It is formed mainly by igneous and metamorphic rocks, with altitudes ranging between 650-800 m a.s.l. on the plateau and up to 2703 m a.s.l. in the mountain ranges. These mountain chains act as topographic barriers for oceanic fronts, so an important annual rainfall gradient is found between the two sides of the ranges. The Coastal Region is a flat area along the coastline that narrows in the SW-NE direction, where the ranges of the "Serra do Mar" come closer to the coast.
The predominant climatic type in São Paulo is tropical moist with dry winter (Aw), but moist climates with mild winters, either with a dry winter and hot summer (Cwa) or humid with a hot summer (Cfa), are also frequent according to Köppen's classification. Mean temperatures are mild, ranging from 4.7-23ºC in winter and up to 29ºC in summer. Mean annual precipitation ranges from 1,350 to 1,550 mm. The rainy season occurs during October-March, coinciding with the higher temperature values (austral summer). The dry season starts in April and ends in October.
A total of 256 stations recording air temperature were used in these analyses (CIIAGRO database). The temperature records available from the CIIAGRO database span 5 to 15 years. According to the World Meteorological Organization (WMO, 1967), data series should be at least 30 years long to ensure optimal climate modelling. Such long time series are often unavailable; however, good results have also been obtained using shorter time series (Wotling et al., 2000; Marquinez et al., 2003).
Air temperature was modelled using multiple regression analysis combined with ordinary kriging. The mean values of the climatic variables were the dependent variables in the multiple regression analysis. The nominal altitude (ALT), latitude (LAT) and longitude (LON) of the meteorological stations were considered as possible independent variables.
The adjusted regression model takes the form:

$$y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n + \varepsilon$$

where $y$ is the estimated value of the dependent variable, $b_0$ is the intercept, $b_n$ are the multiple regression coefficients, $X_n$ are the significant independent variables and $\varepsilon$ is the residual error of the estimation.
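As an illustration of the regression form above, the following minimal sketch fits temperature against altitude and latitude by ordinary least squares through the normal equations, and returns the coefficients, the residuals and R². It is not the authors' implementation: the station values, the coefficients used to generate them and the function names `solve` and `fit_multiple_regression` are hypothetical, and the stepwise variable selection used in the study is not reproduced.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Least-squares fit of y = b0 + b1*X1 + ... + bn*Xn.

    Returns (coefficients, residuals, r_squared).
    """
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, x)) for x in X]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    mean_y = sum(y) / len(y)
    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, residuals, 1.0 - ss_res / ss_tot


# Hypothetical stations: (ALT in m, LAT in decimal degrees, negative = south).
stations = [(10, -24.0), (450, -21.0), (600, -22.5), (760, -23.5),
            (900, -22.0), (1200, -22.8), (350, -20.5), (1600, -22.7)]
noise = [0.1, -0.1, 0.05, -0.05, 0.08, -0.08, 0.02, -0.02]
# Hypothetical generating model, used only to create example temperatures.
temps = [40.0 - 0.006 * alt + 0.7 * lat + e
         for (alt, lat), e in zip(stations, noise)]

coef, residuals, r2 = fit_multiple_regression(stations, temps)
print("b0, b_ALT, b_LAT:", coef)
print("R2:", r2)
```

The residuals returned here play the role of ε at each station, which is the quantity later interpolated by kriging.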
For each model, the multiple coefficient of determination (R²) was computed; the regression coefficients that build the model equation and the level of significance of each independent variable are also reported. Regression analyses were made using the stepwise method with inclusion of variables. Only independent variables significant at the 95% level (P < 0.05) were accepted to build the model. These regression equations were converted to air temperature maps using map algebra with a Geographic Information System (ArcGIS 8.3), processing the independent variables as map layers in raster format. The altitude raster layer, in meters, was obtained from a 0.5 km² Digital Elevation Model (DEM) of the State (GTOPO30, 1996). Latitude and longitude raster layers, in decimal degrees, were computed using the central cell coordinates from the same DEM. At each station, the value of ε, which expresses the difference between the empirical and the modelled values of temperature, was also calculated. The multiple regression results were improved by analyzing the variogram functions of the residual errors. The variogram function describes the average dissimilarity between the residual errors in relation to their spatial distance (Goovaerts, 1997). Sample variograms could be fitted to spherical models (Figure 2), which were then used to obtain maps of the residual errors by ordinary kriging. These maps were added to the regression maps to reduce the errors of the regression model.
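The map algebra step described above can be sketched as a cell-by-cell evaluation of the regression equation on the altitude and latitude layers, followed by the addition of the kriged residual layer. This is only an illustration: the study used ArcGIS 8.3, whereas here small nested lists stand in for raster layers, and the coefficients, grid values and function names are hypothetical.

```python
def regression_map(alt, lat, b0, b_alt, b_lat):
    """Evaluate T = b0 + b_alt*ALT + b_lat*LAT for every raster cell."""
    return [[b0 + b_alt * a + b_lat * l for a, l in zip(alt_row, lat_row)]
            for alt_row, lat_row in zip(alt, lat)]


def add_layers(layer_a, layer_b):
    """Cell-by-cell sum of two rasters with the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(layer_a, layer_b)]


# Hypothetical 2 x 2 layers: altitude (m), latitude (decimal degrees, negative
# = south) and kriged residuals (degrees C).
alt = [[0.0, 500.0], [1000.0, 2000.0]]
lat = [[-20.0, -20.0], [-21.0, -21.0]]
kriged_residuals = [[0.2, -0.1], [0.0, 0.4]]

# Hypothetical coefficients, not taken from Table 1.
trend = regression_map(alt, lat, b0=40.0, b_alt=-0.006, b_lat=0.7)
final_map = add_layers(trend, kriged_residuals)
print(final_map)
```

In practice the same operation would be applied to the full DEM-derived layers with a raster library or a GIS raster calculator.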

RESULTS AND DISCUSSION
For each multiple regression analysis (Table 1), the significance value for longitude was higher than 0.05, indicating that this variable does not contribute to the prediction of air temperature in the study area. The regression models explained more than 90% of the variance in all cases, as shown by the values of the coefficients of determination.
The spatial structure of the residual values was also analyzed for ordinary kriging. The shapes of the empirical semivariograms show that the residuals present a pattern of spatial dependency (Figure 2). Spherical models gave the best goodness of fit to the empirical variogram functions of the residuals according to the minimum weighted least squares criterion.
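For reference, the spherical variogram model and a weighted least squares criterion can be sketched as follows. This is a sketch under stated assumptions: weighting each lag by its number of pairs is one common choice and is not necessarily the one used in the study, and the lags, pair counts, partial sill and candidate ranges are hypothetical. The nugget and range values are only of the order reported in the text.

```python
def spherical(h, nugget, partial_sill, a):
    """Spherical variogram: zero at h = 0, nugget + partial_sill for h >= a."""
    if h <= 0:
        return 0.0
    if h >= a:
        return nugget + partial_sill
    r = h / a
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


def weighted_sse(lags, gammas, pairs, nugget, partial_sill, a):
    """Weighted sum of squared differences between empirical and model values.

    Each lag is weighted by its number of pairs (an assumed weighting scheme).
    """
    return sum(n * (g - spherical(h, nugget, partial_sill, a)) ** 2
               for h, g, n in zip(lags, gammas, pairs))


# Hypothetical empirical variogram: lag distances in km and pairs per lag.
lags = [10, 30, 50, 70, 90, 110, 130]
pairs = [40, 120, 200, 260, 300, 320, 330]
# Empirical values generated from a known model so the example is checkable.
gammas = [spherical(h, 0.05, 0.2, 100) for h in lags]

# Pick the candidate range with the smallest weighted criterion.
candidates = [50, 100, 150]
best = min(candidates,
           key=lambda a: weighted_sse(lags, gammas, pairs, 0.05, 0.2, a))
print("best range (km):", best)
```

A full fit would also search over the nugget and partial sill, typically with a numerical optimizer rather than a short list of candidates.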
Nugget values range from 0.05 to 0.1, and the variograms stabilize at approximately 100 km. These residual values account for the local variations of temperature not captured by the regression models. The accuracy of the regression model can be increased by adding the maps of the kriged residuals (Figure 3) to those obtained with the regression equations. Residuals are within ±0.5ºC over most of the State, indicating a good fit between the model and the empirical data. The results for annual temperatures are shown in Figures 4 to 6, and monthly temperature results are in Figure 7. Table 2 shows the mean minimum, mean maximum and mean temperature results for the raster layers on a monthly and annual basis. The extreme months are June (5.41ºC) and February (28.39ºC). Although minimum temperatures in May, June and July are lower than 5ºC, mean temperatures over the whole study area are much higher, around 18ºC. These low temperature values are located in high-altitude areas in the NE of the coastal mountains.
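The kriging of the regression residuals mentioned above can be illustrated with a minimal ordinary kriging estimate at a single location, using a spherical variogram. This is a sketch, not the procedure run in the study: the station coordinates (planar, in km), the residual values and the partial sill are hypothetical, the nugget and range are only of the order reported in the text, and all stations are used rather than a search neighbourhood.

```python
import math


def spherical(h, nugget, partial_sill, a):
    """Spherical variogram: zero at h = 0, nugget + partial_sill for h >= a."""
    if h <= 0:
        return 0.0
    if h >= a:
        return nugget + partial_sill
    r = h / a
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(points, values, target, nugget, partial_sill, a):
    """Return (estimate, weights) at `target` from the ordinary kriging system.

    The system is [[Gamma, 1], [1^T, 0]] [w, mu] = [gamma_0, 1], which forces
    the weights to sum to one.
    """
    n = len(points)

    def gamma(p, q):
        return spherical(math.dist(p, q), nugget, partial_sill, a)

    lhs = [[gamma(points[i], points[j]) for j in range(n)] + [1.0]
           for i in range(n)]
    lhs.append([1.0] * n + [0.0])
    rhs = [gamma(p, target) for p in points] + [1.0]
    weights = solve(lhs, rhs)[:n]  # drop the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights


# Hypothetical stations (x, y in km) and regression residuals (degrees C).
points = [(0.0, 0.0), (50.0, 0.0), (0.0, 80.0), (120.0, 60.0), (70.0, 130.0)]
residuals = [0.3, -0.2, 0.1, -0.4, 0.5]

estimate, weights = ordinary_kriging(points, residuals, (40.0, 40.0),
                                     nugget=0.05, partial_sill=0.2, a=100.0)
print("kriged residual:", estimate)
print("sum of weights:", sum(weights))
```

Repeating the estimate for every cell centre of the DEM would give a residual raster of the kind that is added to the regression map.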
Finally, a linear regression analysis was carried out to isolate the influence of altitude and latitude on the interpolated temperature values. Figure 8 shows the scatterplot of annual mean temperature against altitude, together with the least-squares linear regression model. There is a good correlation between altitude and temperature (R² = 0.762). Temperature declines as altitude increases. Three groups of points can be distinguished in this scatterplot.
Points belonging to zone A correspond to meteorological stations in the Coastal Region. The oceanic influence in this area causes annual mean temperatures to be lower than the values predicted by the simple linear regression of temperature on altitude. For similar altitude values, there is a gradient of temperature from points in zone C to those in zone B that can be explained by their latitude. The cut point between the two zones is located at 22.11ºS. Real temperatures in zone B tend to be higher than the temperatures modelled by this simple regression, and the contrary occurs in zone C. This shows that, in addition to the effect of altitude, latitude makes a certain contribution to the spatial distribution of temperature.
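The reasoning above, that the residuals of an altitude-only regression separate by latitude, can be checked on synthetic data. In this hedged sketch the stations and the generating coefficients are hypothetical; only the 22.11ºS cut point is taken from the text. Latitudes are negative in the southern hemisphere, so stations north of the cut point have latitudes greater than -22.11.

```python
def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical stations: the same four altitudes at two latitudes.
altitudes = [400.0, 600.0, 800.0, 1000.0]
stations = [(alt, lat) for lat in (-20.5, -23.5) for alt in altitudes]
# Hypothetical generating model with both altitude and latitude effects.
temps = [40.0 - 0.006 * alt + 0.7 * lat for alt, lat in stations]

slope, intercept = simple_regression([s[0] for s in stations], temps)
residuals = [t - (intercept + slope * alt)
             for (alt, _), t in zip(stations, temps)]

CUT_LAT = -22.11  # cut point reported in the text, as a negative latitude
north = [e for (_, lat), e in zip(stations, residuals) if lat > CUT_LAT]
south = [e for (_, lat), e in zip(stations, residuals) if lat <= CUT_LAT]
mean_north = sum(north) / len(north)
mean_south = sum(south) / len(south)
print("mean residual north of cut point:", mean_north)
print("mean residual south of cut point:", mean_south)
```

With these inputs the altitude-only model underestimates the northern stations and overestimates the southern ones, which is the pattern a latitude term in the multiple regression absorbs.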
Plotting annual mean temperatures against latitude (Figure 9) again reveals a group of points (zone A) corresponding to the same stations of the Coastal Region. The coefficient of determination is low, showing that latitude carries little weight relative to altitude in explaining the distribution of temperatures. For each latitude there is a temperature gradient due to the effect of altitude, with the cut point between zones B and C located at 570 m.
Therefore, the multiple regression models proposed in this study improve on the accuracy of simple linear regression models of temperature on altitude.
Altitudes and latitudes for the map raster layers of this study were computed using the GTOPO30 DEM, and the low resolution of this DEM can be a source of uncertainty in the temperature maps. The use of a more accurate DEM of the State would lead to a more realistic distribution of air temperatures.

CONCLUSIONS
Multiple regression analysis is a suitable method to model air temperature in the State of São Paulo. The models explain more than 90% of the variance in air temperature in all cases. The spatial distribution of temperature can be explained using only latitude and altitude as independent variables. Longitude was a non-significant predictor of air temperature in all analyses. Kriging the residuals makes it possible to incorporate local anomalies into the regression models and thus to improve the final results.

Figure 1 - Hypsometry of the State of São Paulo and location of the 256 meteorological stations used for temperature modelling (black points).

Figure 2 - Empirical (red dotted line) and adjusted (blue line) variogram functions of the residuals from the regression models. The number of pairs at each lag distance and the standard deviation (black dotted line) are also indicated. Distance units correspond to km/10.

Figure 3 - Maps of the kriged residuals for temperature in the State of São Paulo, Brazil.

Figure 4 - Annual mean minimum temperature in the State of São Paulo, Brazil.

Figure 5 - Annual mean maximum temperature in the State of São Paulo, Brazil.

Figure 6 - Annual mean temperature in the State of São Paulo, Brazil.

Figure 7 - Monthly mean temperature in the State of São Paulo, Brazil.

Figure 8 - Scatterplot of annual mean temperature over altitude.

Figure 9 - Scatterplot of annual mean temperatures as a function of latitude.

Table 1 - Results of the multiple regression analyses for air temperature.
All the months have similar predictors. The lowest fits were obtained for mean temperatures in June, July and August (austral winter), with R² values ranging from 0.924 to 0.930. This is expected because mean values in months with extreme temperatures are more difficult to predict than those in months with mean or modal values. Considering annual data, the lowest fit was obtained for the minimum air temperature. Regression coefficients for altitude always have negative values: as expected, air temperature decreases as altitude increases. For latitude, regression coefficients are positive. In this case, increases in latitude mean decreases in final temperature values, since latitudes in the State of São Paulo were treated as negative values (southern hemisphere). The regression coefficients and intercepts were used to build 15 equations predicting the spatial distribution of air temperature by means of map algebra in a Geographic Information System.

Table 2 - Descriptive statistics for the monthly and annual air temperature layers (451,534 raster cells).