
Climate Regionalization in Mato Grosso do Sul: a Combination of Hierarchical and Non-hierarchical Clustering Analyses Based on Precipitation and Temperature

Abstract

The climatic zones of Mato Grosso do Sul (MS) were defined using cluster analysis (CA). Average annual maximum and minimum temperatures and total annual precipitation from 77 weather stations, covering 1978 to 2013, were used. Hierarchical (Ward) and partitional, or non-hierarchical (k-means), CA algorithms, two of the most widely used approaches, were chosen to carry out the regionalization. The optimum number of clusters was determined by the elbow, silhouette and gap statistic methods. The stability of the clusters was also tested statistically, and four homogeneous groups were found, as in conventional climatic zones, but with considerable differences in their borders. Pearson's correlation coefficient (r) between the series in each cluster helps to explain the dynamics of these clusters. Hierarchical cluster analysis combined with the elbow method for the optimal number of clusters was the most appropriate and satisfactory approach, and was able to identify and validate climatically homogeneous regions in the state. The efficient application of these methodologies is confirmed by the delimitation of four distinct clusters (homogeneous climate regions), consistent with the recorded rainfall depths and temperatures (maximum and minimum) and with geographical characteristics such as topography.

Keywords:
cluster analysis; climatic zones; climate regionalization; Mato Grosso do Sul

HIGHLIGHTS

  • Hierarchical cluster analysis was efficient in determining climatic areas.

  • The elbow method for the optimal number of clusters was the most suitable.

  • Notched boxplot analysis and comparison tests can help determine the number of clusters.

  • The clusters formed were correlated with geographical and climatic variables.

  • Meteorological systems and topography assist in the discrimination of clusters.

INTRODUCTION

Climate similarity measures between regions, typically presented as climate classes, are useful for representing spatial environmental characteristics. The most practical way to establish the climate of a region is through the analysis of monthly average temperature and monthly total precipitation [1,2,3]. This requirement is a key barrier when studying regions with a substantial lack of climate data, such as Brazil and other developing countries. Another important issue in climate classification studies is that climate class boundaries detected from time series covering different periods may reflect climate change [4,5]. Approaches for regionalizing climate areas according to the similarity of regional climate characteristics include rule-based precipitation and temperature schemes [2,6], clustering [7-8] and machine learning-based classification [6-8].

Cluster analysis (CA) is the most widespread technique for climate classification [2,3,8] and is based on the similarity or dissimilarity between objects. The Euclidean distance is a consistent measure of dissimilarity between objects and is widely used as a basis for agglomerating these objects into clusters [8,9]. The agglomeration may be carried out by hierarchical or partitional (non-hierarchical) methods, such as single, average and complete linkage, Ward's method and k-means clustering, to define the climate characteristics of the different sub-groups. The suitability of a clustering method for climate classification is assessed by comparing standard deviations within and between clusters [7,10,11].

Likewise, CA has been widely used for climate regionalization [9,11,12,13]. Broadly, CA attempts to maximize similarity within a group while minimizing similarity between groups [14]. The numerous clustering algorithms are generally classified as hierarchical or non-hierarchical (optimization) techniques [14]. One disadvantage of hierarchical methods is that entities misclassified at an early stage of the clustering process are not relocated later on [15]. In addition, defining the final number of clusters (i.e. the stopping point) can be challenging, since different hierarchical methods can give quite different results. Optimization techniques, by contrast, allow the reallocation of entities.

According to [10,17,18,19], the non-hierarchical k-means method outperforms the classical climate classifications, and non-hierarchical methods have outperformed hierarchical ones. In addition, Huth [19] showed that each clustering algorithm has its strengths and weaknesses when applied to climatic data sets. Currently, there is no preferred clustering method for regionalizing the climate. However, if a data set is well-nucleated, the solutions of different clustering algorithms should be broadly similar [21].

Regardless of whether a hierarchical or non-hierarchical clustering method is applied, the optimum number of clusters (k) can be evaluated by several methods, such as the elbow method [9], the silhouette method and the gap statistic method [21]. There is, however, no consensus on which method is best. In climate data clustering, it is advisable to compare the clusters formed by each method to establish the optimal number of clusters and to examine the spatial correlation between the geographical and climate variables [2,3].

The climate of a region is a key determinant of the functional requirements to be considered in engineering projects, agricultural activities and water resources management [8,9,18]. However, understanding tropical climates is difficult, both because of a paucity of data and because studies on climate classification are scarce [8]. Tropical regions are characterized by daily variations, whereas at higher latitudes seasonal variation is the dominant characteristic [1]. The state of Mato Grosso do Sul (MS), in the Center-West of Brazil, stands out for its robust agricultural sector, its main economic activity, particularly soy and cattle production [22]. Knowledge of its climatic characteristics is important for designing strategies that define comfortable living conditions, ideal crop management, and water and natural resources conservation. In this study, we aim to define spatially homogeneous regions on the basis of the most relevant hydrometeorological variables of MS, using different CA approaches.

MATERIAL AND METHODS

Characterization of the study area

The state of MS is located in the Midwest Region of Brazil (Figure 1a), with a total area of approximately 358,159 km². The state stands out for its agricultural activity, with soy and cattle production as its main economic products. Elevations (Figure 1b) range from 24 to 1,100 m [23], mean annual temperatures range from 20 to 26 °C and mean annual rainfall ranges from 1,000 mm to 1,900 mm. The state has a well-defined dry season between April and September, during which the highest rainfall is recorded in the southern portion of the state. In the rainy season (between October and March), however, the northern region receives more rainfall than the southern region.

Köppen’s climate classification divides MS into four climatic regions: (i) “Aw” (tropical zone with dry winter), in the Southeast and North of the state; (ii) “Am” (tropical monsoon zone), in the central region; (iii) “Af” (tropical zone without dry season), in the Southwest; and (iv) “Cfa” (humid subtropical zone with hot summer), in the South of the state (Figure 1c). In the Southwest of Mato Grosso do Sul, south of the Pantanal (between latitudes -21º and -22º), the climate is characterized as tropical forest (“Af”), with rainfall distributed throughout the year. The central portion of the state is predominantly characterized by a tropical monsoon climate (“Am”), with a short dry season during winter. In the North, in a small part of the central region and in the Southeast of the state, the climate is savannah (“Aw”), with dry winters and rainy summers. Only in the South of the state is the climate humid in all seasons, with hot summers (“Cfa”) (temperatures > 22 ºC) [1,24].

The biomes of Mato Grosso do Sul, shown in Figure 1d, include areas of the Atlantic Forest, Cerrado and Pantanal (covering 14%, 61% and 25% of the state’s area, respectively). The Atlantic Forest is an extremely important biome because of its abundant biological diversity, and has attracted great interest as a conservation area, since it has been considerably reduced. The Brazilian Cerrado, a vast tropical savanna ecoregion, is widely known for its native habitats and rich biodiversity, and is the second largest biome in South America, after the Amazon. The Cerrado of Mato Grosso do Sul lies in two hydrographic regions of Brazil, Paraná and Paraguay. The Pantanal is the world’s largest inland wetland. It is home to rich wildlife and is known for its unique biome; however, it is also considered a biodiversity hotspot because of environmental degradation and damage [23].

Figure 1
Mato Grosso do Sul state: (a) location in South America and in Brazil; (b) digital elevation model (DEM); (c) Köppen climate classification; (d) biomes.

Rainfall and temperature data set

Annual average maximum temperature, annual average minimum temperature and total annual precipitation data were collected from 77 weather stations (Figure 1) in Mato Grosso do Sul (MS) state for 1978-2013, from the National Institute of Meteorology (INMET) network (Table A1). The total annual precipitation (Prec, mm), the average maximum temperature (Tmax, ºC) and the average minimum temperature (Tmin, ºC) were obtained for each weather station. These climatic variables were used for the cluster analysis. Series with at most 10% of missing data were admitted for analysis [8]. Table 1 presents the Prec, Tmax and Tmin variables and their respective standard deviations.

Cluster Analysis (CA)

The classificatory variables (Prec, Tmax and Tmin) were submitted to a grouping process, which aims to compose groups with high internal homogeneity and high external heterogeneity between groups. To weight all variables equally in the CA, the data were standardized to zero mean and unit variance. There are two types of grouping methods: nonhierarchical methods, which produce a fixed number of groups, and hierarchical methods, which form groups through a sequence of successive partitions (divisive approach) or successive joins (agglomerative approach). For both hierarchical and nonhierarchical clustering methods, the group structure contained in the data was evaluated using the Euclidean distance ($d_E$) [8,11]:

$$d_E = \left[ \sum_{j=1}^{n} \left( P_{p,j} - P_{k,j} \right)^2 \right]^{0.5} \qquad (1)$$

Where $d_E$ is the Euclidean distance, $P_{p,j}$ and $P_{k,j}$ are the values of quantitative variable j for elements p and k, respectively, and n is the number of variables.
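
The standardization step and Equation 1 can be sketched in a few lines. This is only an illustration in plain Python, not the authors' R workflow; the station values are hypothetical, and the population standard deviation is assumed for the scaling.

```python
import math
import statistics

def standardize(rows):
    """Rescale each column (variable) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]  # assumption: population SD
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def euclidean(p, k):
    """Equation 1: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, k)))

# Hypothetical stations (Prec in mm, Tmax in ºC, Tmin in ºC); not study data.
stations = [
    [1100.0, 31.5, 19.0],
    [1500.0, 29.0, 17.5],
    [1800.0, 28.0, 16.0],
]
z = standardize(stations)
d_raw = euclidean(stations[0], stations[1])  # dominated by precipitation
d_std = euclidean(z[0], z[1])                # all three variables contribute
```

On the raw values the distance is driven almost entirely by precipitation, whose numerical range is far larger than that of the temperatures; after standardization the three variables carry comparable weight.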

The hierarchical method most used in environmental studies for cluster analysis is Ward’s algorithm [9,25,26]. Ward’s algorithm minimizes the dissimilarity, measured as the total sum of squares (TSS), when establishing groups: at each step, groups are merged so that the solution has the minimum TSS within each group [9,26].
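
The merging rule described above can be illustrated with a deliberately naive sketch. It assumes the usual formulation of Ward's criterion, in which merging clusters a and b raises the within-cluster sum of squares by n_a·n_b/(n_a+n_b) times the squared distance between their centroids. The data are hypothetical, and the function names are illustrative rather than taken from any package.

```python
from itertools import combinations

def centroid(points):
    """Mean point of a cluster."""
    return [sum(col) / len(points) for col in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2

def ward(points, k):
    """Start from singletons and merge the cheapest pair until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters

# Hypothetical standardized observations (two variables); not study data.
data = [(-1.2, -1.0), (-1.0, -1.1), (-0.9, -0.8),
        (1.0, 1.1), (1.2, 0.9), (0.9, 1.0)]
groups = ward(data, 2)
```

This version recomputes every pairwise cost at each step, which is acceptable for a few dozen stations but slow for large data sets; library implementations use more efficient update formulas.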

The unsupervised (non-hierarchical) classification method aims to minimize the sum of squared errors over all groups. This requires three specific parameters: the number of groups, the group initialization and the distance metric. The squared error between the mean of cluster k ($\mu_k$) and the points in that cluster ($C_k$) is defined as [18,27]:

$$J(C_k) = \sum_{X_i \in C_k}^{n} \left( X_i - \mu_k \right)^2 \qquad (2)$$

Where $J(C_k)$ is the squared error between $\mu_k$ and the points in $C_k$; $X = \{X_i\}$, $i = 1, 2, ..., n$, is the set of n d-dimensional points; $C = \{C_k\}$, $k = 1, 2, ..., K$, is the set of K clusters; and $\mu_k$ is the mean of cluster $C_k$. Since the goal is to minimize the sum of squared errors, $J(C)$, over all clusters, the equation is rewritten as:

$$J(C) = \sum_{k=1}^{K} \sum_{X_i \in C_k}^{n} \left( X_i - \mu_k \right)^2 \qquad (3)$$

Unsupervised clustering methods such as k-means reduce this error iteratively. The k-means algorithm has been successfully used to regionalize temperature and precipitation in Europe [3,28], Brazil [8,9,11,29], Bolivia [2] and Iran [20].
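
As a minimal sketch of how k-means lowers the objective in Equation 3, the block below runs Lloyd-style iterations (assign each point to the nearest center, then recompute the centers) from user-supplied initial centers. The points and the initialization are hypothetical, and the code is not the authors' implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Cluster mean (mu_k in Equations 2 and 3)."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, centers, max_iter=100):
    """Alternate assignment and update steps; returns (labels, centers)."""
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
                  for p in points]
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            # Keep the previous center if a cluster ends up empty.
            new_centers.append(mean_point(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def objective(points, labels, centers):
    """Equation 3: squared error summed over all clusters."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))

# Hypothetical observations and initial centers; not study data.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
j_total = objective(points, labels, centers)
```

The result depends on the initial centers, which is why the text lists group initialization among the required parameters; in practice several random starts are usually compared.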

Optimum number of Clusters

The optimal number of clusters was determined by three different methods: the elbow [9], silhouette and gap statistic methods [21].

The Elbow method uses the total within-cluster sum of squares (WSS) as the criterion for defining the optimal number of clusters: the number of clusters should be chosen so that adding another cluster does not substantially reduce the total WSS. The clustering algorithm is run for different values of k, here from 1 to 10 clusters. For each k, the calculated WSS is plotted against the number of clusters, resulting in a curve. The location of the elbow (bend) of the curve is typically taken as an indicator of the appropriate number of clusters. As groups are formed, their Euclidean distances are initially small and increase at each step of the process [9].
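
The elbow is normally read from the plot, but the idea can be made concrete with a simple heuristic. The sketch below assumes the WSS values have already been computed for each k and scores the bend by the second difference of the curve; this scoring rule is one common choice made here for illustration, not a rule stated in the text, and the WSS values are hypothetical.

```python
def elbow(wss):
    """Return the number of clusters at the sharpest bend of a WSS curve.

    wss[i] holds the total within-cluster sum of squares for i + 1 clusters.
    The bend at each interior point is scored by the second difference.
    """
    scores = {}
    for i in range(1, len(wss) - 1):
        scores[i + 1] = wss[i - 1] - 2 * wss[i] + wss[i + 1]
    return max(scores, key=scores.get)

# Hypothetical WSS values for k = 1..8; not results from the study.
wss = [230.0, 140.0, 85.0, 40.0, 33.0, 28.0, 24.0, 21.0]
best_k = elbow(wss)
```

With these values the curve drops steeply up to four clusters and flattens afterwards, so the heuristic returns 4. On curves without a clear bend the scores are nearly equal, and the choice should be cross-checked with the other criteria.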

The gap statistic approach, proposed by [21], can be applied to any clustering algorithm. It assesses the separation between the resulting clusters by comparing the change in within-cluster dispersion with that expected under an appropriate reference null distribution. It includes the following steps: (i) Cluster the observed data for k = 1, …, $k_{max}$, and compute the corresponding total within-cluster dispersion ($W_k$). (ii) Generate B reference data sets with a random uniform distribution, cluster each of them with k = 1, …, $k_{max}$, and compute the corresponding total within-cluster dispersion $W_{kb}$. (iii) Compute the estimated gap statistic as the deviation of the observed $W_k$ value from its expected value under the null hypothesis, estimated from the $W_{kb}$:

$$gap(k) = \frac{1}{B} \sum_{b=1}^{B} \log(W_{kb}) - \log(W_k) \qquad (4)$$

(iv) Compute the standard deviation of the statistics. (v) When a point is reached where the error stops decreasing, take as the appropriate number of clusters for the data set the smallest value of k such that:

$$gap(k) \geq gap(k+1) - S_{k+1} \qquad (5)$$

Equation 5 compares successive pairs k and k+1 to find the smallest value of k that maximizes the gap statistic once the expected variability ($S_{k+1}$) is accounted for, meaning that the clustering structure is far from a random uniform distribution of points.
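
Steps (iii) to (v) can be sketched as follows, assuming the dispersions $W_k$ and $W_{kb}$ are already available from steps (i) and (ii). The numbers are hypothetical. The factor sqrt(1 + 1/B) applied to the standard deviation follows the usual definition of the gap statistic and is an assumption here, since the text only mentions "the standard deviation of the statistics".

```python
import math
import statistics

def gap_table(w_obs, w_ref):
    """Return [(gap_k, s_k)] for k = 1..K.

    w_obs[k-1] is W_k for the observed data; w_ref[k-1] lists the W_kb
    values obtained from the B reference data sets.
    """
    table = []
    for wk, refs in zip(w_obs, w_ref):
        logs = [math.log(w) for w in refs]
        gap = statistics.fmean(logs) - math.log(wk)  # Equation 4
        s = statistics.pstdev(logs) * math.sqrt(1 + 1 / len(refs))
        table.append((gap, s))
    return table

def choose_k(table):
    """Smallest k with gap(k) >= gap(k+1) - s(k+1), as in Equation 5."""
    for k in range(1, len(table)):
        gap_k = table[k - 1][0]
        gap_next, s_next = table[k]
        if gap_k >= gap_next - s_next:
            return k
    return len(table)  # no k satisfied the rule within the tested range

# Hypothetical dispersions for k = 1..4 with B = 3 reference sets each.
w_obs = [100.0, 40.0, 12.0, 10.0]
w_ref = [[110.0, 105.0, 100.0], [70.0, 66.0, 72.0],
         [50.0, 48.0, 52.0], [40.0, 41.0, 39.0]]
table = gap_table(w_obs, w_ref)
k_opt = choose_k(table)
```

In this toy example the gap keeps rising until k = 3 and then drops slightly, so the rule stops at three clusters. Real applications use many more reference sets than the three shown here.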

The Silhouette index, developed by [30], can be used to validate the grouping of data points into clusters. The silhouette width evaluates the quality of the resulting clusters, considering both compactness (distance between data points within the same group) and separation (distance between data points in two neighboring groups). This method makes it possible to assess the appropriate number of groups, such that the chosen value of k gives the best mean silhouette value [31,32]. The Silhouette coefficient (S) is calculated by:

$$s(i) = \frac{b_i - a_i}{\max(b_i, a_i)} \qquad (6)$$

Where $a_i$ is the mean distance between instance i and all other instances in the same cluster, and $b_i$ is the mean distance from i to the instances of the nearest neighboring cluster:

$$b(i) = \min_{C_k \neq C(i)} \sum_{j \in C_k} \frac{dist(i,j)}{n(C_k)} \qquad (7)$$

Where C(i) is the cluster containing instance i, dist(i, j) is the distance (e.g. Euclidean) between instances i and j, and n(C) is the cardinality of cluster C. The silhouette width thus varies within the interval [−1, 1] and should be maximized [30]. A value close to 1 implies that the instance is close to its cluster and belongs to the appropriate cluster, while a value close to -1 means that the instance is assigned to the wrong cluster. When the value is zero, it is not possible to identify the group to which the instance belongs.

The mean silhouette width of a group, s(k), is defined as the mean of the individual silhouettes of all instances i in that group, where n is the number of objects in the dataset, as follows:

$$s(k) = \frac{\sum_{i=1}^{n} s(i)}{n} \qquad (8)$$

Finally, constructing a silhouette plot, which graphically represents the consistency within the clusters of the data, can provide the means to visually assess cluster quality.
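
Equations 6 to 8 can be checked on a small example. The sketch below assumes Euclidean distance and at least two clusters, and assigns a width of zero to singleton clusters (a common convention, assumed here). The points and labels are hypothetical.

```python
import math

def silhouette_widths(points, labels):
    """Return s(i) for every point (Equation 6)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    widths = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # assumed convention for singleton clusters
            continue
        # a_i: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b_i: smallest mean distance to the members of another cluster (Eq. 7)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        widths.append((b - a) / max(a, b))
    return widths

# Hypothetical observations; not study data.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good = silhouette_widths(points, [0, 0, 1, 1])  # compact, well separated
bad = silhouette_widths(points, [0, 1, 0, 1])   # neighbors split apart
mean_good = sum(good) / len(good)               # mean silhouette width (Eq. 8)
mean_bad = sum(bad) / len(bad)
```

The sensible labelling gives widths close to 1 for every point, while the labelling that separates neighboring points gives negative widths, matching the interpretation of the [−1, 1] range given above.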

Clusters Validation

The number of clusters was validated by notched boxplot analysis [8]. The notched boxplot displays a confidence interval around the median of Prec, Tmax and Tmin for each cluster, so whether the cluster medians differ can be judged visually from the overlap of the notches. The Wilcoxon-Mann-Whitney test was performed to test whether Tmax, Tmin and Prec differ between the clusters. For this purpose, we considered the optimum number of clusters obtained by each methodology (section “Optimum number of Clusters”) and compared the clusters. If the variables did not differ between clusters, this was considered evidence for reducing the number of groups.
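
The notch comparison can be approximated numerically. The sketch below assumes the commonly used notch half-width of 1.58 × IQR / √n around the median and the "inclusive" quartile definition of Python's `statistics.quantiles`; plotting software may compute quartiles slightly differently. The Tmax values are hypothetical.

```python
import math
import statistics

def notch_interval(values):
    """Approximate interval for the median: median ± 1.58 * IQR / sqrt(n)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half = 1.58 * (q3 - q1) / math.sqrt(len(values))
    med = statistics.median(values)
    return med - half, med + half

def notches_overlap(a, b):
    """True when the two notch intervals share at least one value."""
    lo_a, hi_a = notch_interval(a)
    lo_b, hi_b = notch_interval(b)
    return lo_a <= hi_b and lo_b <= hi_a

# Hypothetical Tmax values (ºC) for two clusters; not study data.
cluster_a = [30.1, 30.4, 30.8, 31.0, 31.3, 31.6, 31.9]
cluster_b = [27.9, 28.2, 28.5, 28.8, 29.0, 29.3, 29.6]
distinct = not notches_overlap(cluster_a, cluster_b)
```

Non-overlapping notches suggest that the medians differ; overlapping notches for all three variables would point toward merging the two clusters, to be confirmed by the rank test.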

Next, the clusters formed were plotted to check their spatial distribution and their coincidence with climate, biomes and geographical characteristics (latitude, longitude and altitude). Moreover, Pearson's correlation was computed to analyze the behavior of Tmax, Tmin and Prec in relation to the geographical characteristics of MS state, in order to verify the spatial differences within each cluster.
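For reference, Pearson's coefficient between a climatic variable and a geographical variable can be computed as in the sketch below (Python, standard library only). The altitude and Tmax values are hypothetical and are not taken from the stations of this study.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical stations of one cluster: altitude (m) and mean Tmax (deg C)
altitude = [130, 250, 400, 550, 700]
tmax = [31.5, 30.8, 30.0, 29.1, 28.4]

r = pearson_r(altitude, tmax)
print(round(r, 3))
```

A coefficient near −1, as in this synthetic case, would indicate that Tmax decreases almost linearly with altitude within the cluster.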

Software used to perform analysis and maps

All data analyses were performed in R software [32], using the following packages: cluster [33], factoextra [34], NbClust [35] and ggcorrplot [36]. The maps were produced with ArcGIS [37].

RESULTS AND DISCUSSION

The criteria for the optimal number of clusters gave similar results for the hierarchical and non-hierarchical clustering algorithms; however, differences were observed among the methods (Elbow, Gap Statistic and Silhouette) (Figure 2). The optimal number of clusters varied from 1 to 4, and the most frequent results were two (three times) and three (twice). The hierarchical clustering algorithm followed by the Elbow method indicated 4 climatically homogeneous regions on the basis of the most relevant hydrometeorological variables in MS state, corroborating the number of groups found in a previous analysis by [11] and close to the five groups found by [23]. These studies [8,23] considered only precipitation as the clustering variable. The formation of groups in [8] and [23] was justified by the meteorological systems operating in MS state, which shows an advantage of the hierarchical method followed by the Elbow method in discriminating groups.

Figure 2
Optimal number of clusters by the Elbow, Silhouette and Gap Statistic methods, using hierarchical clustering algorithms (panels a, c and e) or non-hierarchical clustering algorithms (panels b, d and f).
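The Elbow criterion can be illustrated with the following sketch, which computes the within-cluster sum of squares (WSS) of a plain k-means partition for several values of k. It is written in Python with the standard library only and is not the code used in the study. The deterministic farthest-first initialization and the rule for locating the elbow (the largest change between successive WSS reductions) are assumptions made for this example, and the station coordinates are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(members):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(members) for c in zip(*members))


def farthest_first(points, k):
    """Deterministic initial centers: repeatedly take the farthest point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def kmeans_wss(points, k, max_iter=100):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    centers = farthest_first(points, k)
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            groups[nearest].append(p)
        new_centers = [
            centroid(g) if g else centers[i] for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow(wss):
    """Heuristic elbow: k where the WSS reduction slows down the most."""
    ks = sorted(wss)
    return max(
        ks[1:-1],
        key=lambda k: (wss[k - 1] - wss[k]) - (wss[k] - wss[k + 1]),
    )


# Hypothetical stations forming three compact groups
stations = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 0), (10, 1), (11, 0), (11, 1),
            (5, 9), (5, 10), (6, 9), (6, 10)]

wss = {k: kmeans_wss(stations, k) for k in range(1, 6)}
best_k = elbow(wss)
print(wss)
print(best_k)
```

In practice the elbow is usually read from a plot of WSS against k, as in Figure 2; the numeric rule above is only one possible way of automating that reading.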

Figure 3 shows the notch boxplots of the clusters obtained with the hierarchical (Figure 3 a, c and e) and non-hierarchical (Figure 3 b, d and f) clustering algorithms for precipitation (Prec) and maximum (Tmax) and minimum (Tmin) temperatures. The notches in the boxplots of Prec, Tmax and Tmin show a confidence interval around the median and do not overlap between the different groups (except for cluster 1 and cluster 3 for the annual precipitation in the hierarchical cluster analysis, Figure 3 a), which is strong evidence that their medians differ. Table 1 shows the Wilcoxon-Mann-Whitney test results, which confirm the differences between the clusters formed by the hierarchical and non-hierarchical clustering algorithms. The optimal number of clusters in MS, considering the Prec, Tmax and Tmin variables, is four, because of the differences detected between these clusters and because the identification of four clusters by the hierarchical method was the most consistent. The seasonal cycle of precipitation and temperature for each cluster clearly shows that the clustering procedure results in climatically distinct clusters for these variables.

For the hierarchical cluster analysis, cluster 4 had the highest precipitation and the lowest temperatures (Tmin and Tmax). Cluster 3 had the highest Tmax and Tmin. Clusters 3 and 1 had the lowest precipitation, with no significant difference between them. Cluster 2 had intermediate values of precipitation and temperature. For the non-hierarchical cluster analysis, precipitation was statistically equivalent in clusters 2 and 3 and greater in cluster 1. Tmax and Tmin were highest in cluster 2, intermediate in cluster 3 and lowest in cluster 1. The two algorithms (hierarchical and non-hierarchical) thus determined clusters whose precipitation amounts and temperatures differed statistically from each other.

Table 1
Wilcoxon-Mann-Whitney statistical test for cluster analysis results using either hierarchical or non-hierarchical clustering algorithms for precipitation, maximum and minimum temperature.

Figure 4 presents the spatial arrangement of the CA results for the hierarchical (Figure 4 a, c and e) and non-hierarchical (Figure 4 b, d and f) clustering algorithms, overlaid on the digital elevation model (Figure 4 a and b), the Köppen climate classification (Figure 4 c and d) and the biomes (Figure 4 e and f) of MS. Considering these factors (orography, climate and biomes) and the spatial distribution produced by each clustering methodology (hierarchical and non-hierarchical), the results strongly indicate that the best data stratification was obtained with four homogeneous regions describing the spatial distribution of Tmax, Tmin and Prec in MS. These four clusters, formed by the hierarchical technique followed by the Elbow method for choosing the optimum number of clusters, can be clearly distinguished according to their altitude, climatic classification and biome. Thus, the discussion of the spatial arrangement of the clusters is based on these four groups.
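As an illustration of the hierarchical technique, the sketch below implements a simple agglomerative procedure with Ward's criterion [24]: at each step it merges the two clusters whose union gives the smallest increase in the within-cluster sum of squares, n_a·n_b/(n_a + n_b)·‖c_a − c_b‖², where n and c are the cluster sizes and centroids. It is written in Python with the standard library only, uses a brute-force search suited to small datasets, and is not the R implementation used in the study; the station coordinates and the function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def ward_clusters(points, k):
    """Agglomerate the points into k clusters using Ward's merging cost."""
    # Each cluster is stored as (member indices, centroid)
    clusters = [([i], tuple(float(v) for v in p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members_a, center_a = clusters[a]
                members_b, center_b = clusters[b]
                n_a, n_b = len(members_a), len(members_b)
                # Increase in within-cluster sum of squares if a and b merge
                cost = n_a * n_b / (n_a + n_b) * sq_dist(center_a, center_b)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        members_a, center_a = clusters[a]
        members_b, center_b = clusters[b]
        merged = members_a + members_b
        # Size-weighted mean of the two centroids
        center = tuple(
            (len(members_a) * x + len(members_b) * y) / len(merged)
            for x, y in zip(center_a, center_b)
        )
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((merged, center))
    labels = [None] * len(points)
    for lab, (members, _) in enumerate(clusters):
        for i in members:
            labels[i] = lab
    return labels


# Hypothetical stations forming three compact groups
stations = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 0), (10, 1), (11, 0), (11, 1),
            (5, 9), (5, 10), (6, 9), (6, 10)]

labels = ward_clusters(stations, 3)
print(labels)
```

Stopping the merging at a chosen number of clusters corresponds to cutting the dendrogram at that level; the label values themselves are arbitrary and only the grouping matters.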

Table 2 presents the position and dispersion statistics of Tmax, Tmin and Prec in each cluster. The standard deviation (SD) was lower when computed for each cluster than for MS state as a whole, an important point for present and future planning, management and activities related to the economic, agricultural, engineering and environmental areas [8,9,18].
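The reduction in dispersion within clusters follows from the decomposition of the total sum of squares into within-cluster and between-cluster parts. The Python sketch below (standard library only) checks this decomposition on hypothetical precipitation values; the cluster names and figures are illustrative and are not those of Table 2.

```python
import statistics

# Hypothetical annual precipitation totals (mm) grouped by cluster
prec_by_cluster = {
    "C1": [1250, 1300, 1280],
    "C2": [1450, 1480, 1420, 1500],
    "C4": [1640, 1660, 1700],
}

all_values = [v for values in prec_by_cluster.values() for v in values]
grand_mean = statistics.fmean(all_values)

# Total, within-cluster and between-cluster sums of squares
total_ss = sum((v - grand_mean) ** 2 for v in all_values)
within_ss = sum(
    sum((v - statistics.fmean(values)) ** 2 for v in values)
    for values in prec_by_cluster.values()
)
between_ss = sum(
    len(values) * (statistics.fmean(values) - grand_mean) ** 2
    for values in prec_by_cluster.values()
)

# Sample SD for the whole set and for each cluster
state_sd = statistics.stdev(all_values)
cluster_sd = {
    name: statistics.stdev(values) for name, values in prec_by_cluster.items()
}

print(round(state_sd, 1))
print({name: round(sd, 1) for name, sd in cluster_sd.items()})
```

Because the within-cluster sum of squares cannot exceed the total, the pooled dispersion inside well-separated clusters is smaller than that of the whole set, although an individual cluster is not guaranteed to have a smaller SD in every dataset.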

Cluster 1 (C1) comprises 17 municipalities, which mainly belong to the Cerrado biome and its transition with the Atlantic Forest. The group includes three climatic types, all of them with tropical characteristics (“Af”, “Am” and “Aw”). The altitude ranges from 132 m to 658 m. C1 is characterized by intermediate maximum and minimum temperatures and the lowest annual precipitation totals. The longitudinal profile of C1 corresponds to the profile of precipitation in the dry season, showing a subdivision that separates the transition months at the beginning and end of the season (April and May, September and October), which have higher precipitation totals, from the dry months (June, July and August) [8].

Cluster 2 (C2) comprises 26 municipalities, with a greater number of weather stations in the Atlantic Forest and at high altitudes (from 276 to 712 m) in the Cerrado. The main climate in C2 is the tropical monsoon (“Am”). Tmax and Tmin are low compared with the other clusters (excluding cluster 4), and C2 had the second highest annual precipitation.

Cluster 3 (C3) surrounds the Pantanal in its eastern portion. It is the region with the lowest precipitation totals during the driest months (April to September), but it includes areas with a large volume of rain during the rainiest months (October to March). As a result, this cluster has the second highest total annual precipitation. In addition, C3 has the lowest altitudes, in a transition to the very characteristic physiognomy of the Pantanal, the largest floodplain in the world. The predominant climate of C3 is “Am”, always in transition with the “Aw” climate (Figure 4).

The most compact cluster in area is Cluster 4 (C4), located in the southern region of the state, in the Atlantic Forest biome, and characterized by the “Cfa” climate and altitudes ranging from 324 to 786 m. C4 has the lowest Tmax and Tmin among the groups and the highest precipitation totals during the year. During the driest months of the year (April to September), precipitation totals reach up to 7 times those observed in the northern region of the state [8,23].

Figure 5 displays Pearson's correlation coefficient (r) between the climatic variables (Tmin, Tmax and Prec) and the geographical variables for the entire series, in each cluster. The groups showed different spatial behavior. In C1, a high correlation was observed between latitude and Tmin, indicating higher minimum temperatures at weather stations in the north of MS state. In C2 and C4, the positive correlations of latitude with Tmin, Tmax and Prec indicate that the weather stations in the north have lower values of Tmax, Tmin and Prec. In C3, strong negative correlations of longitude with Tmin, Tmax and altitude were observed, indicating a longitudinal gradient with higher temperatures in the Pantanal region. Likewise, the orographic effect on Tmax, in which air masses are forced to flow over high topography, promotes lower temperatures in the mountains of MS.

Figure 3
Notch boxplot for cluster results using hierarchical (a, c and e) and non-hierarchical (b, d and f) clustering algorithms for precipitation, maximum and minimum temperatures.

The clustering and the Pearson correlation analyses show that, beyond the tropical characteristics of MS (latitude distribution), altitude is among the relevant physiographic factors, as it contributes to higher precipitation totals; its observed influence on precipitation resembles results obtained in previous studies conducted in other Brazilian states [8,23,38,39,40]. The topographic alignment, arranged in the longitudinal direction (NE - SW), shows two clearly defined morphological features: the plain and the plateau. This arrangement has a marked influence on the rainfall behavior of the rainfall groups and on the meteorological systems operating in MS [41] and at the border of C2 and C4 (Figure 4).

Another aspect is that the state lies at the confluence of the main active meteorological systems that define the rainfall of the Midwest [41], and therefore has more than one type of rainfall regime, as identified in this study. The atmospheric circulations that affect it most have tropical and extratropical origins and are influenced by local warming, moisture transport from northern South America (SA), Frontal Systems (FS) and dry air masses from the subtropical South Atlantic region [23,41].

Figure 4
Spatial arrangement of the cluster analysis results using hierarchical (a, c and e) and non-hierarchical (b, d and f) clustering algorithms for precipitation, maximum and minimum temperatures, overlaid on the digital elevation model (a and b), the Köppen climate classification (c and d) and the biomes (e and f) of Mato Grosso do Sul.

The total annual precipitation shows a rainier nucleus in the south of MS and decreases from east to west. In the extreme west of MS, in the Pantanal, the annual total falls to 1,200 mm, whereas the highest values, of 1,660 mm, occur in the southern region. However, this precipitation is not evenly distributed throughout the year. In almost every region, more than 70% of the total annual rainfall falls from November to March, mainly from November to January [8,23,41].

Table 2
Position and dispersion statistics of precipitation (Prec) and maximum and minimum temperature (Tmax and Tmin, respectively) for the clusters formed by the hierarchical clustering algorithm.

The November to January quarter is generally the rainiest, with an average of 45-55% of the total annual rainfall. By contrast, winter is excessively dry. At this time of the year rain is exceedingly rare, occurring on an average of 4-5 days per month, and is even rarer in the western part of MS, where at least one month does not register a single rainy day. The dry season occurs in the winter quarter, typically June-July-August.
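The seasonal shares quoted above are simple ratios of monthly totals to the annual total. The Python sketch below shows the arithmetic on a hypothetical monthly climatology chosen only to resemble the regime described; the values are not observations from this study.

```python
# Hypothetical mean monthly precipitation (mm) for one station
monthly_mm = {
    "Jan": 240, "Feb": 200, "Mar": 160, "Apr": 70, "May": 60, "Jun": 25,
    "Jul": 20, "Aug": 25, "Sep": 60, "Oct": 120, "Nov": 190, "Dec": 240,
}

annual_total = sum(monthly_mm.values())


def share(months):
    """Fraction of the annual total that falls in the given months."""
    return sum(monthly_mm[m] for m in months) / annual_total


wet_season = share(["Nov", "Dec", "Jan", "Feb", "Mar"])
wettest_quarter = share(["Nov", "Dec", "Jan"])
winter_quarter = share(["Jun", "Jul", "Aug"])

print(annual_total)
print(round(100 * wet_season, 1))
print(round(100 * wettest_quarter, 1))
print(round(100 * winter_quarter, 1))
```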

Figure 5
Pearson’s correlation coefficients (r) between climatic and geographical variables for the entire series in clusters 1-4 (Cluster 1, Cluster 2, Cluster 3 and Cluster 4). Blue and red colors indicate positive and negative correlations, respectively. Color saturation reflects the magnitude of the correlation.

Owing to its latitudinal location, MS state is characterized by transitional regions between warm low-latitude climates and temperate-type mid-latitude climates [41]. MS is affected by most of the synoptic systems that reach the south of the country, with some differences in system intensity and seasonality. According to [42], inverted troughs act mainly during winter, causing moderate weather conditions mainly in MS. Upper Tropospheric Cyclonic Vortices (UTCV) from the Pacific region are organized with intense convection associated with instability caused by the subtropical jet. Prefrontal instability lines, generated by the association of large-scale dynamic factors and mesoscale characteristics, are responsible for intense precipitation [43].

Especially over MS state, the Bolivian High (BH), generated by the strong convective heating (release of latent heat) of the atmosphere during the summer months of the Southern Hemisphere (SH) [44], is considered a typical semi-stationary system of the region. MS state is characterized by the action of systems that combine characteristics of tropical systems with those of typical mid-latitude systems. During the months of greatest convective activity, the South Atlantic Convergence Zone (SACZ) is one of the main phenomena that influence the rainfall regime of these regions [45,46]. The fact that the cloud and rainfall band remains semi-stationary for days at a time favors the occurrence of flooding in the affected areas.

The averages of Tmax and Tmin in MS state are high during spring-summer, with September and October being the warmest months (averages above 23°C), and mild in autumn-winter, but rarely below 18°C. During June and July, the months with the lowest thermal averages, the averages of Tmax and Tmin are between 18°C and 21°C [24,47]. However, this monthly variation in temperature is not spatially homogeneous in the state, as is also the case for precipitation, as reported by [8,23,41,47]. The average annual temperature values recorded indicate that the spatial and seasonal variation of this climate variable follows the characteristics of the groups in MS state. The highest thermal averages are observed between October and March, which corresponds to summer in the domain of tropical climates in the SH, with October showing the highest averages, since it is characterized by the transition between the dry and rainy periods [24]. Thus, changes in atmospheric circulation patterns, high evapotranspiration rates, low average wind speeds and precipitation, together with low air humidity, favor rising temperatures, which indicates an early summer.

An additional analysis of the temperatures concerns the thermal amplitude observed between the months with higher and lower temperatures, which shows a variation of 10°C on average [30], with low-altitude areas having higher averages and high-altitude areas having lower averages of Tmax and Tmin [8,23,41].

The topographic distribution in MS state is irregular, with high elevations in its eastern part, and it is well characterized by a temperature gradient, from a maximum in the eastern portion of the state to a minimum in the west. This defined C3, in the Pantanal region. The highest altitudes in MS state are aligned in the longitudinal direction, between 56° and 52° of longitude, from the south toward the northeast of the state. This relief influences C4 and C2, with lower Tmax and Tmin and higher Prec, and C1, with lower Prec. These clusters have weather stations in different areas of MS state (Figure 4).

The latitudinal pattern influenced the Tmax and Tmin dynamics of the state in most of the clusters. The least influence of latitude was observed in C1, which has intermediate maximum and minimum temperatures in the state and whose pattern is best described by low precipitation. In the C2, C3 and C4 regions, the south of the state had the lowest temperatures, owing to the influence of higher altitudes, FS, polar masses and other extratropical systems [8,41].

CONCLUSION

From the results obtained in this study, it can be concluded that hierarchical cluster analysis combined with the elbow method for the optimal number of clusters was the most suitable and satisfactory approach and was capable of forming and validating climatically homogeneous regions in Mato Grosso do Sul state. The efficient application of these methodologies is confirmed by the delimitation of four distinct clusters (climatically homogeneous regions), consistent with the recorded rainfall depths and temperatures (maximum and minimum) and with geographical characteristics such as topography in the state of Mato Grosso do Sul.

The climatological study of the homogeneous regions made it possible to understand the rainfall and thermal structure of the regions, enabling research targeted at specific areas of Mato Grosso do Sul state. The occurrence of the same region with nuclei located in different parts of the state confirmed the hypothesis that physical proximity guarantees climatic similarity between meteorological stations, and showed the determining influence of the topographic structure, latitude, longitude and altitude, and of variations in the passage of air mass systems and in front formation.

Rainfall and temperature in Mato Grosso do Sul are influenced by weather systems that determine different rainfall patterns in the state. The determination of these homogeneous regions, besides contributing to the understanding of climate variation in this region, can be a useful support tool for the management and planning of water resources in Mato Grosso do Sul, which is an agricultural state.

Acknowledgments

The authors thank their universities for their support.

REFERENCES

  • 1
    Alvares CA, Stape JL, Sentelhas PC, Gonçalves JLM, Sparovek G. Köppen’s climate classification map for Brazil. Meteorol. Z. 2013 Dec.;22(6):711-28.
  • 2
    Abadi AM, Rowe CM, Andrade M. Climate Regionalization in Bolivia: A Combination of Nonhierarchical and Consensus Clustering Analyses Based on Precipitation and Temperature. Int. J. Climatol. 2019 Dec.; 40(10):4408-21.
  • 3
    Carvalho MJ, Melo-Gonçalves P, Teixeira JC, Rocha A. Regionalization of Europe based on a K-Means Cluster Analysis of the climate change of temperatures and precipitation. Phys. Chem. Earth. 2016 Aug.; 94: 22-8.
  • 4
    Beck C, Grieser J, Kottek M, Rubel F, Rudolf B. Characterizing global climate change by means of Köppen climate classification. Klimastatusbericht. 2005 Jan.; 51:139-49.
  • 5
    Park S, Park H, Im J, Yoo C, Rhee J, Lee B. Delineation of high-resolution climate regions over the Korean Peninsula using machine learning approaches. PLoS One. 2019 Oct.;14(10):e0223362.
  • 6
    Köppen W. Das geographische System der Klimate. Köppen, W., R. Geiger (Eds.): Handbuch der Klimatologie. Gebrüder Bornträger. Berlin, 1, part C. 1936. 44 p.
  • 7
    Bunkers MJ, Miller JR, DeGaetano AT. Definition of climate regions in the Northern Plains using an objective cluster modification technique. J Climate. 1996 Jan.; 9(1):130-46.
  • 8
    Abreu MC, Souza A, Lyra GB, Pobocikova I, Cecílio RA. Analysis of monthly and annual rainfall variability using linear models in the state of Mato Grosso do Sul, Midwest of Brazil. Int. J. Climatol. 2020b Sep.; 41(S1): E2445-E2461.
  • 9
    Lyra GB, Oliveira-Júnior JF, Zeri M. Cluster analysis applied to the spatial and temporal variability of monthly rainfall in Alagoas state, Northeast of Brazil. Int. J. Climatol. 2014 Feb.; 34: 3546-58.
  • 10
    Rhee J, Im J, Carbone GJ, Jensen JR. Delineation of climate regions using in‐situ and remotely‐sensed data for the Carolinas. Remote Sens. Environ. 2008 Jun.; 112:3099-111.
  • 11
    Oliveira-Júnior JF, Xavier FMG, Teodoro PE, Gois G, Delgado RC. Cluster analysis identified rainfall homogeneous regions in Tocantins state, Brazil. Bios. J. 2017 Mar. - Apr.; 33(2): 333-40.
  • 12
    Stooksbury DE, Michaels PJ. Cluster‐analysis of southeastern United‐States climate stations. Theor. Appl. Climatol. 1991 Sep.; 44:143-50.
  • 13
    Rocha Júnior RL, Silva FDS, Costa RL, Gomes HB, Silva MCL, Pinto DDC, et al. Long-Term Change and Regionalization of Reference Evapotranspiration in the Brazilian Northeast. Rev. Bras. Meteorol. 2021 Oct.-Dec.; 35: 891-902.
  • 14
    Jolliffe IT, Philipp A. Some recent developments in cluster analysis. Phys. Chem. Earth. 2010 Jan.; 35(9-12): 309-15.
  • 15
    Everitt B. Cluster Analysis. Halsted Press: New York, New York; 1980. 348 p.
  • 16
    Gong XF, Richman MB. On the application of cluster‐analysis to growing‐season precipitation data in North America East of the Rockies. J. Clim. 1995 Apr.; 8(4):897-931.
  • 17
    Zscheischler J, Mahecha MD, Harmeling S. Climate classifications: the value of unsupervised clustering. Procedia Comput. Sci. 2012 Aug.; 9: 897-906.
  • 18
    Bieniek PA, Bhatt US, Thoman RL, Angeloff H, Partain J, Papineau J, et al. Climate divisions for Alaska based on objective methods. J. Appl. Meteorol. Climatol. 2012 Jul.; 51(7):1276-89.
  • 19
    Huth R. An intercomparison of computer‐assisted circulation classification methods. Int. J. Climatol. 1996 Aug.; 16(8):893-922.
  • 20
    Tibshirani R, Walther G, Hastie T. Estimating the number of clusters in a data set via the gap statistic. J. R. Statist. Soc. B. 2001 Jan.; 63(Part 2): 411-23.
  • 21
    Torres FE, Cargnelutti Filho A, Teodoro PE, Corrêa CCG, Ribeiro LP, Cunha ER. Dimensionamento amostral para a estimação da média de precipitação pluvial mensal em locais do Estado do Mato Grosso do Sul. Ciênc Rural. 2016 Jan.; 46(1): 60-9.
  • 22
    Teodoro PE, Oliveira-Júnior JF, Cunha ER, Correa CCG, Torres FE, Bacani VM, et al. Cluster analysis applied to the spatial and temporal variability of monthly rainfall in Mato Grosso do Sul State, Brazil. Meteorol. Atmos. Phys. 2016 Oct.; 128: 197-209.
  • 23
    Souza A, Fernandes WA, Albrez EA, Galvíncio JD. Análise de agrupamento da precipitação e da temperatura no Mato Grosso do Sul. Acta Geográfica. 2012 Jan.;6(12):109-24.
  • 24
    Ward JH. Hierarchical grouping to optimize an objective function. J. Am. Stat. Assoc. 1963 Apr.; 58(301): 236-44.
  • 25
    Lima AO, Lyra GB, Abreu MC, Oliveira-Júnior JF, Zeri M, Cunha-Zeri G. Extreme rainfall events over Rio de Janeiro State, Brazil: Characterization using probability distribution functions and clustering analysis. Atmos. Res. 2021 Jan.;247:105221.
  • 26
    Salehnia N, Salehnia N, Ansari H, Kolsoumi S, Bannayan M. Climate data clustering effects on arid and semi-arid rainfed wheat yield: a comparison of artificial intelligence and K-means approaches. Int. J. Biometeorol. 2019 Jul.;63(7):861-72.
  • 27
    Fereday DR, Knight JR, Scaife AA, Folland CK. Cluster analysis of North Atlantic-European circulation types and links with tropical Pacific sea surface temperatures. J. Clim. 2008 Aug.;21(15):3687-703.
  • 28
    Costa RL, Baptista GMM, Gomes HB, Silva FDS, Rocha Júnior RL, Salvador MA, et al. Analysis of climate extremes indices over northeast Brazil from 1961 to 2014. Weather. Clim. Extremes. 2020 Jun.;28:100254.
  • 29
    Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987 Nov.;20:53-65.
  • 30
    Gil VO, Ferrai F, Emmendorfer L. Investigação da aplicação de algoritmos de agrupamento para o problema astrofísico de classificação de galáxias. Rev. Bras. Comput. Apl. 2015 May.;7(2):52-61.
  • 31
    Santos EB, Lucio PS, Silva CMS. Precipitation regionalization of the Brazilian Amazon. Atmos. Sci. Let. 2015 Sep.;16(3):185-92.
  • 32
    R Core Team. 2020. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available from: https://www.R-project.org/.
  • 33
    Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K. 2019. cluster: Cluster Analysis Basics and Extensions. R package version 2.1.0.
  • 34
    Kassambara A, Mundt F. 2020. factoextra: Extract and Visualize the Results of Multivariate Data Analyses. R package version 1.0.7. https://CRAN.R-project.org/package=factoextra
  • 35
    Charrad M, Ghazzali N, Boiteau V, Niknafs A. NbClust: An R Package for Determining the Relevant Number of Clusters in a Data Set. J. Stat. Soft. 2014 Nov.;61(6):1-36.
  • 36
    Kassambara A. 2019. ggcorrplot: Visualization of a Correlation Matrix using 'ggplot2'. R package version 0.1.3. https://CRAN.R-project.org/package=ggcorrplot
  • 37
    ESRI 2011. ArcGIS Desktop: Release 10.
  • 38
    Ávila LF, Mello CR, Viola MR. Mapeamento da precipitação mínima provável para o sul de Minas Gerais. Rev. Bras. Eng. Agric. Ambient. 2009 Dec.;13(suppl):906-15.
  • 39
    Uliana TM, Reis EF, Silva JGF, Xavier AC. Precipitação mensal e anual provável para o estado do Espírito Santo. Irriga 2013 Nov.;18(1):139-47.
  • 40
    Abreu MC, Souza A, Lins TMP, Oliveira-Júnior JF, Oliveira SS, Fernandes W, et al. Comparison and validation of TRMM satellite precipitation estimates and data observed in Mato Grosso do Sul state, Brazil. Rev. Bras. Climatol. 2020 Jul-Dec.;27:2237-8642.
  • 41
    Nimer E. Clima da região Sudeste. In: IBGE - Instituto Brasileiro de Geografia e Estatística: Geografia do Brasil. Rio de Janeiro, Rio de Janeiro. 1971. 427 p.
  • 42
    Fernandes KA, Satyamurty P. Cavados invertidos na região central da América do Sul. Congresso Brasileiro de Meteorologia, Anais II, Belo Horizonte, MG: 1994; 8: 93-94.
  • 43
    Cavalcanti IA. Um estudo sobre as interações entre os sistemas de circulação de escala sinótica e circulações locais. INPE 2494 TDL/097. São José dos Campos, São Paulo. 1982.
  • 44
    Virji H. A preliminary study of summertime tropospheric circulation patterns over South America estimated from cloud winds. Mon. Weather Rev. 1981 Jul.; 109(3): 549-610.
  • 45
    Quadro MFL. Estudo de episódios de zonas de convergência do Atlântico Sul (ZCAS) sobre a América do Sul. Rev. Bras. Geof. 1999 Nov.; 17(2-3):210-32.
  • 46
    Souza A, Santos CM, Ihaddadene R, Cavazzana G, Abreu MC, Oliveira-Júnior JF, et al. Analysis of extreme monthly and annual air temperatures variability using regression model in Mato Grosso do Sul, Brazil. Model. Earth Syst. Environ. 2021 Feb.; 1(1):1-17.
  • Funding:

    This research received no external funding.

APPENDIX

Table A1
Weather stations in Mato Grosso do Sul, with latitude (°), longitude (°), altitude (m), and the average and standard deviation of precipitation, minimum temperature and maximum temperature.

Edited by

Editor-in-Chief:

Alexandre Rasi Aoki

Associate Editor:

Raja Soosaimarian Peter Raj

Publication Dates

  • Publication in this collection
    16 May 2022
  • Date of issue
    2022

History

  • Received
    04 June 2021
  • Accepted
    04 Nov 2021
Instituto de Tecnologia do Paraná - Tecpar, Rua Prof. Algacyr Munhoz Mader, 3775 - CIC, 81350-010 Curitiba, PR, Brazil. Tel.: +55 41 3316-3052/3054, Fax: +55 41 3346-2872
E-mail: babt@tecpar.br