Assessing the spatial variation of functional diversity estimates based on dendrograms in phytoplankton communities

Variation in phytoplankton functional diversity is partitioned and mapped using several indices and linkage methods based on dendrograms. The relationships between diversity metrics and major environmental predictors, including zooplankton density, were assessed in 29 phytoplankton communities of floodplain lakes distributed along the Middle Araguaia River in central Brazil. The dendrogram-based functional diversity indices were Functional Group Richness, Functional Diversity, Mean Pairwise Distance and Mean Nearest Taxon Distance, and seven different hierarchical agglomerative linkage methods were used. The performance of the indices was compared using ANOVA, and their spatial variation in response to major environmental predictors was evaluated. The results indicate that variation in functional diversity values is primarily a product of the type of index chosen. This variation was statistically significant in 90 % of the floodplain lakes studied; however, a spatial pattern of variation in index values along the river was not detected. Furthermore, environmental constraints, including zooplankton density, were weak predictors of functional diversity indices. Therefore, the mathematical characteristics of indices are of primary importance in explaining variation among functional diversity values.


Introduction
Functional diversity is an important descriptor of biological communities (Mason et al. 2005), and is able to predict ecosystem dynamics and factors affecting ecosystem structure (Petchey & Gaston 2006). Functional diversity is defined as the values or range of values for existing functional traits in a community capable of affecting ecosystem functioning, such as primary production, stability, nutrient cycles and the provision of ecosystem services (Díaz et al. 2007; Tilman 2001; Cardinale et al. 2012). Functional diversity has gained prominence in the scientific literature (Schleuter et al. 2010), which has led to the development of several indices for measuring it (e.g., Walker et al. 1999; Petchey & Gaston 2002; Heemsbergen et al. 2004; Botta-Dukát 2005; Villéger et al. 2008; Laliberté & Legendre 2010). Mason et al. (2005) maintain that functional diversity can be divided into three distinct facets: functional richness, divergence and evenness/regularity. Indices that express richness usually indicate how much of niche space is occupied by a given species assemblage, while divergence refers to how species abundance diverges in niche space, and evenness indicates how regular the distribution of species abundance is within niche space (Schleuter et al. 2010). These three aspects are complementary (Mouchet et al. 2010) and, when used in conjunction with one another, reveal the processes that link biodiversity to ecosystem functioning more clearly (Villéger et al. 2008).
One of the simplest ways to measure functional diversity in a community is to calculate Functional Group Richness (FGR). In the FGR index, the functional similarity between species may be determined from a distance matrix, constructed based on the functional traits of species, and converted into a dendrogram via a linkage method (Ricotta 2005). A dendrogram threshold is then defined to determine the similarity level between species, allowing similar species to be grouped within the same functional group (Petchey & Gaston 2006). The dendrogram threshold is usually arbitrary (but see Teresa et al. 2015) and is defined according to the purpose of the study or the researchers' own criteria (Pla et al. 2012).
Other continuous methods (i.e., which do not divide the species into functional groups according to their traits) for assessing functional diversity have been developed, many of which are derived from the calculation of phylogenetic diversity (Pavoine & Bonsall 2011). The Functional Diversity (FD) index is the sum of dendrogram branch lengths (Petchey & Gaston 2002), generated from a functional traits distance matrix, while the Mean Pairwise Distance (MPD) index is the average distance between pairs of species that compose a community (Webb 2000) and the Mean Nearest Taxon Distance (MNTD) index is equal to the average dendrogram distance between the functionally most similar pairs of species in the community (Webb 2000). Although there are other continuous indices of functional diversity based on dendrograms (e.g., GFD, Mouchet et al. 2008; NMDS, Cadotte et al. 2009), the above-mentioned indices are some of the most frequently used to describe functional richness and divergence in aquatic (Colzani et al. 2013; Carvalho & Tejerina-Garro 2015; Dunck et al. 2015) and terrestrial communities (e.g., Bihn et al. 2010; Cianciaruso et al. 2012; Hidasi-Neto et al. 2012).
Calculating functional diversity based on dendrograms usually requires three methodological steps: the construction of a distance matrix from species functional traits, the selection of a linkage method and the construction of the functional dendrogram (Petchey & Gaston 2006). The selection of the distance matrix and linkage method generates variation in the resulting indices because the indices are sensitive to the methodological steps used in dendrogram construction (Poos et al. 2009). However, another potential source of variation is the measurement of functional diversity, which depends on the choice of index itself, from the various options available in the literature (e.g., FGR, FD, MPD, MNTD).
In addition to methodological controversies, some indices may be differentially sensitive to environmental conditions, causing differences in their predictive power (Pakeman 2011). These differences in performance are commonly associated with the mathematical characteristics of the metrics (Petchey et al. 2004; Ricotta 2005). For example, MPD and MNTD differ from functional richness indices (e.g., FGR and FD) because they are divergence indices, based on the average distance between species rather than the sum of functional entities. Patterns of average pairwise differences between species have been used to address the processes underlying species co-occurrence (Tucker et al. 2017). In fact, the co-occurrence of functionally similar species has been associated with processes of environmental filtering, while limiting similarity processes have been associated with the co-occurrence of functionally distinct species (Mouillot et al. 2007; Sobral & Cianciaruso 2015; but see Mayfield & Levine 2010). Therefore, we would expect that metrics describing how different, on average, species are among communities could be more informative for predicting niche-based processes.
While previous studies have compared the performance of different functional diversity indices (e.g., Petchey et al. 2004; Schmera et al. 2009; Teresa & Casatti 2017), none of these studies were dedicated exclusively to indices based on dendrograms, despite their increased use and occurrence in the scientific literature (for the use of FD see Mouchet et al. 2008; Petchey et al. 2009). Similarly, there are ample discussions on their methodological issues (Podani & Schmera 2006; 2007; Petchey & Gaston 2007), although there is still no general consensus (Mouchet et al. 2008). Here, we present a methodological approach for evaluating variation among functional diversity metrics based on dendrograms (see a similar approach to evaluating variation of ecological niche models by Diniz-Filho et al. 2009) in geographical space, using phytoplankton community data from 29 floodplain lakes in the Middle Araguaia River, central Brazil. Thus, we address the following questions: i) How important are the type of index and linkage method as sources of variation in functional diversity values of lake communities? ii) Is the variation among values of functional diversity spatially structured along the floodplains, that is, are more proximate lakes more similar in terms of functional diversity than more distant lakes? iii) Do diversity-environment relationships vary according to the type of index and/or linkage method?
We expect that the choice of index and linkage method will explain variation among functional diversity values. We further expect that this variation originates from index characteristics that depend upon how each index is associated with environmental variables. To answer the first question, we used the decomposition of the sum of squares (ANOVA), considering variation of the indices and linkage methods within each lake. To answer the second question, we measured the spatial pattern of variation in functional diversity through Moran's I index, and to answer the third question, we associated environmental variables with functional diversity indices through a Canonical Correspondence Analysis.

Dataset
We sampled 29 floodplain lakes located along 500 km of the Middle Araguaia River and four large tributaries (Crixás, Mortes, Vermelho and Cristalino) in central Brazil (Fig. 1). The floodplain lakes are distributed among the states of Goiás, Tocantins and Mato Grosso (14°72'86" to 10°54'73"S 51°03'57'' to 50°55'22''W). Samples were collected in January 2012 during the high-water period. We chose this month because it is in the middle of the rainy period (which started in November), when the characteristics of the phytoplankton community, landscape, morphometrics and limnology of the lakes have already stabilized and therefore differ from those of low-water periods (Nabout et al. 2006; 2009). Moreover, although we used one temporal sample, the present study was developed on a wide spatial scale, thus taking into consideration biological, limnological, morphometric and landscape variation. During the collection period, 19 of the lakes were connected to the main river channel and 10 were isolated. The floodplain lakes varied in terms of limnological, morphometrical and soil use characteristics (Machado et al. 2015; 2016) and represent 29 communities that vary in phytoplankton species richness (6-22 species, see details in Machado et al. 2015).
Phytoplankton was collected from subsurface water (0.5 m), stored in dark bottles and fixed with acetic acid modified Lugol's (Vollenweider 1974). Individuals were identified to the lowest taxonomic level possible. We obtained 10 functional traits (Weithoff 2003; Kruk et al. 2010) of the 115 phytoplankton species found in the region: maximum linear dimension, individual surface area, individual biovolume, biological form, mucilage, demand for silica, heterocytes, mixotrophy, flagella and aerotopes. For the traits "biological form" and "mucilage", species were classified according to the states of the functional trait, where the same species can show more than one state (Tab. 1). The traits were obtained through screening the samples and consulting the literature or specialists (Tab. 1). The counting of individuals and the measurements to obtain the functional traits were performed in separate steps, since these two approaches require the use of different microscopes (inverted microscope for counting individuals and optical microscope for performing cell measurements). Thus, not all counted individuals were found again for the measurements. We measured the cell dimensions for all the individuals of a species found in the samples (between one and four individuals) and then calculated the average value to estimate maximum linear dimension, biovolume and individual surface area. The individual surface area and biovolume of species were estimated using the equations described in Hillebrand et al. (1999).
We assessed the following local environmental variables: oxygen saturation, total nitrogen, total phosphorus and transparency. These variables are important to the organization of phytoplankton communities and are known to regulate their dynamics in aquatic environments (Reynolds 2006). Zooplankton are one of the main predators of phytoplankton in aquatic environments and, along with environmental variables, play an important role in the structure and dynamics of these communities (Reynolds 2006). Thus, within the set of local environmental variables we included a variable "zooplankton density", which was the sum of the density of all zooplankton groups (i.e., cladocerans, copepods and rotifers). Oxygen saturation was estimated using a portable oximeter (Digimed DM-4P) and water transparency was measured using a Secchi disk. The concentrations of total nitrogen and total phosphorus were obtained following the protocols described in Zagatto et al. (1981). For the collection of zooplankton, 500 liters of water were filtered through a plankton net (68 µm mesh) using a suction pump. The number of organisms was estimated according to Bottrell et al. (1976) and the total density was expressed in individuals per cubic meter (ind/m³). All variables were assessed at the same place and depth as the phytoplankton. Further details on the sampling of zooplankton are described in Machado et al. (2015).
Landscape variables were represented by the types of land cover around each floodplain lake. Land cover data were obtained through interpretation of Landsat 5 TM satellite images (30-meter spatial resolution, orbits 223/67 to 223/70) that are freely available from the Instituto Nacional de Pesquisas Espaciais (INPE; http://www.dgi.inpe.br/CDSR/). The scenes were georeferenced based on Geocover images (GLS-Landsat, available at http://landsat.usgs.gov/), after which we constructed image mosaics. These procedures were performed using the software ERDAS. We created a buffer of 10 km around each floodplain lake and quantified the percentage of different land cover classes (i.e., agriculture, native Cerrado vegetation and pasture) found within the buffer with ArcGIS 9.3. The choice of spatial scale for this study was due to many factors, such as the biology of the studied taxa, dispersal capabilities, sampling interval and system heterogeneity (Legendre & Legendre 1998). We used a buffer of 10 km with the intention of capturing an impact gradient along the sampled lakes (i.e., native vegetation, pasture and agriculture).

Functional diversity indices and linkage methods
We used four distinct functional diversity indices based on dendrograms: Functional Group Richness (FGR); Functional Diversity (FD) proposed by Petchey & Gaston (2002); Mean Pairwise Distance (MPD) and Mean Nearest Taxon Distance (MNTD) described in Webb (2000). We selected these four indices because they are commonly used in functional ecology (Petchey et al. 2009; Hidasi-Neto et al. 2012; Best et al. 2013; Coyle et al. 2014), they represent different aspects of diversity and have different mathematical characteristics (Pavoine & Bonsall 2011). Although MNTD and MPD can be calculated with a raw distance matrix (Kembel et al. 2010; 2014), their interpretation is more intuitive when these indices are based on dendrograms, which is why their calculation using dendrograms is very common (e.g., Cianciaruso et al. 2012; Hidasi-Neto et al. 2012; Carvalho & Tejerina-Garro 2014; Dunck et al. 2015).
In each community, we obtained seven values for each functional diversity index from the seven linkage methods used (Fig. 2).
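The calculation of several index values per community can be sketched as follows. The authors worked in R with the vegan and picante packages; this sketch uses Python with NumPy and SciPy instead, purely for illustration. It maps the seven linkage methods onto SciPy method names, builds one dendrogram per method from a functional distance matrix, and computes FD as the total branch length and MPD and MNTD from cophenetic (dendrogram-based) distances. The function names, the toy distance matrix, the convention that node height equals merge distance, and the choice to build the dendrogram from the species of a single community are all assumptions, not details taken from the study.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

# Assumed correspondence between the paper's linkage names and SciPy methods.
LINKAGE_METHODS = {
    "Single": "single",
    "Complete": "complete",
    "UPGMA": "average",
    "WPGMA": "weighted",
    "UPGMC": "centroid",
    "WPGMC": "median",
    "Ward": "ward",
}


def total_branch_length(Z):
    """Sum of dendrogram branch lengths, taking node height = merge distance.

    Centroid and median linkage can produce inversions (a node lower than
    its child), in which case some branch lengths are negative.
    """
    n = Z.shape[0] + 1              # number of leaves
    height = np.zeros(2 * n - 1)    # leaves sit at height 0
    total = 0.0
    for i, (a, b, h, _) in enumerate(Z):
        height[n + i] = h
        total += (h - height[int(a)]) + (h - height[int(b)])
    return total


def dendrogram_indices(dist, method):
    """Return FD, MPD and MNTD for a square distance matrix and a linkage method."""
    Z = linkage(squareform(dist, checks=False), method=method)
    coph = squareform(cophenet(Z))          # pairwise distances along the dendrogram
    n = coph.shape[0]
    upper = coph[np.triu_indices(n, k=1)]   # each species pair once
    nearest = np.where(np.eye(n, dtype=bool), np.inf, coph).min(axis=1)
    return {"FD": total_branch_length(Z), "MPD": upper.mean(), "MNTD": nearest.mean()}


# Toy community of three species (hypothetical distances).
toy = np.array([[0.0, 0.2, 0.8],
                [0.2, 0.0, 0.8],
                [0.8, 0.8, 0.0]])
results = {name: dendrogram_indices(toy, m) for name, m in LINKAGE_METHODS.items()}
upgma = results["UPGMA"]
```

SciPy notes that centroid, median and Ward linkage are only well defined for Euclidean distances, so their output on other distance matrices should be read with that caveat in mind.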
Table 1. Phytoplankton functional traits used to calculate the functional diversity indices based on dendrograms via seven linkage methods. The classified categorical traits indicate that species may be simultaneously classified into more than one category. For example, the same species may be found in unicellular or colonial biological form. MLD: Maximum Linear Dimension; ISA: Individual Surface Area; 1: functional attributes were obtained during the screening of samples; 2: functional attributes were obtained from the literature and consulting databases and experts.

Acta Botanica Brasilica 31(4): 571-582. October-December 2017
Several methods have been proposed for the calculation of FGR, generally using environmental features (e.g., Pillar 1999; Pillar & Sosinski Junior 2003). However, to be consistent with the other indices (FD, MPD and MNTD), which do not consider environmental data in their calculations, we calculated FGR using only the functional traits of organisms (e.g., Pla et al. 2012). To date, there are no consolidated procedures for defining the cut-off point for dendrograms, and thus these criteria are usually defined according to the purposes of the researcher (Pla et al. 2012).
For our definition of FGR, we arbitrarily set the threshold for functional similarity between species at 0.70. A similarity greater than 0.70 is considered a high value (e.g., to define taxonomic resolutions, Heino 2010; or evaluation of collinearity, Tabachnick & Fidell 1989). Thus, this value was adopted because we believe that species that share more than 70 % of functional traits may be associated with similar ecosystem functions. Although the choice of cut-off point in the dendrogram can be another source of variation for functional diversity values, such variation does not occur in indices such as FD, MPD and MNTD, and it was therefore not included in this study. We standardized all indices to vary from zero (minimum functional diversity value) to one (maximum functional diversity value).
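A minimal sketch of an FGR count at a 0.70 similarity threshold is given below, in Python with SciPy rather than the R packages used by the authors. It assumes that distances lie between 0 and 1 and that similarity equals 1 minus distance, so the dendrogram is cut at a cophenetic distance of 0.30; the source does not state how the threshold was applied in practice. The function name, the default UPGMA linkage and the toy matrix are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def functional_group_richness(dist, method="average", similarity_threshold=0.70):
    """Count functional groups by cutting the dendrogram at a similarity threshold.

    Assumes similarity = 1 - distance, so species joined at or below a
    cophenetic distance of 1 - similarity_threshold share a group.
    """
    Z = linkage(squareform(dist, checks=False), method=method)
    cut_height = 1.0 - similarity_threshold
    groups = fcluster(Z, t=cut_height, criterion="distance")
    return len(np.unique(groups))


# Hypothetical distances: species 0 and 1 are nearly identical, 2 and 3 are distinct.
toy = np.array([[0.0, 0.1, 0.6, 0.7],
                [0.1, 0.0, 0.6, 0.7],
                [0.6, 0.6, 0.0, 0.5],
                [0.7, 0.7, 0.5, 0.0]])
n_groups = functional_group_richness(toy)
```

In the toy example only species 0 and 1 join below the cut, so they form one group and the remaining two species each form their own.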
Our functional attribute matrices had different mathematical characteristics, including quantitative, categorical and nominal traits. Thus, we constructed dendrograms using Gower distance matrices. The choice of distance matrix may also be identified as a cause of variation in the calculation of functional diversity indices (Mouchet et al. 2008). However, Gower distance is recommended as the best choice for data with different scales or missing values (Podani & Schmera 2006; Petchey & Gaston 2009), and accommodates different types of variables (e.g., categorical, numeric, nominal, ordinal; Podani & Schmera 2006). We constructed the Gower distance matrices using the generalized Gower distance method proposed by Pavoine et al. (2009).
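To illustrate how a Gower-type distance combines traits of different kinds, the sketch below implements the classic, equally weighted Gower distance in Python with NumPy. It is not the generalized version of Pavoine et al. (2009) that the study used: it does not handle missing values, ordinal traits or traits with several simultaneous states. The function name and the two toy traits are hypothetical.

```python
import numpy as np


def gower_distance(numeric, categorical):
    """Classic Gower distance for mixed traits (simplified illustration).

    numeric:     (species x traits) values, compared as |difference| / range.
    categorical: (species x traits) labels, compared by simple mismatch.
    Every trait gets equal weight; missing values are not handled.
    """
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical)

    ranges = numeric.max(axis=0) - numeric.min(axis=0)
    ranges[ranges == 0] = 1.0                 # constant traits contribute 0
    # Range-scaled absolute difference for every species pair and numeric trait
    num_part = np.abs(numeric[:, None, :] - numeric[None, :, :]) / ranges
    # 1 where labels differ, 0 where they match
    cat_part = (categorical[:, None, :] != categorical[None, :, :]).astype(float)

    per_trait = np.concatenate([num_part, cat_part], axis=2)
    return per_trait.mean(axis=2)


# Three hypothetical species: one size trait and one presence/absence trait.
size = [[10.0], [20.0], [30.0]]
flagella = [["yes"], ["yes"], ["no"]]
d = gower_distance(size, flagella)
```

Each entry of `d` is the mean of the per-trait dissimilarities, so it stays between 0 and 1 regardless of the measurement units of the numeric traits.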
To evaluate whether the indices are associated with species richness, we performed a Pearson correlation. The indices were calculated using the vegan (Oksanen et al. 2015) and picante (Kembel et al. 2014) packages in R 3.2.2 (R Core Team 2015).

Evaluating and mapping variation of functional diversity indices
We conducted a two-way analysis of variance (ANOVA) without replication (Zar 2010) to detect variation between the functional diversity indices and their linkage methods. We used an analysis of variance without replication, since we had only one functional diversity value for each combination of factors (type of index and type of linkage method). In situations where there is no replication, it is not possible to estimate the interaction (Zar 2010). The first factor in the ANOVA (hereinafter factor A) was the type of functional index, while the second factor (hereinafter factor B) was the type of linkage method used. We obtained the sum of squares of the indices (SQI), the sum of squares of the linkage methods (SQLM), the residual sum of squares (SQR; i.e., the portion of the variation that is not explained by the functional diversity indices and linkage methods) and the variation determined by the two factors together (i.e., the indices and linkage methods; SQT). We tested the significance level for components A and B using the F statistic (P < 0.05). This approach was adapted from Diniz-Filho et al. (2009), who used a similar approach to evaluate variation among niche models and climate scenarios in global climate change projections of New World birds.
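The sum-of-squares decomposition for a single lake can be sketched as below, using Python with NumPy as an illustration of the standard two-way layout without replication. Rows stand for the indices (factor A) and columns for the linkage methods (factor B). In the code, `ss_index`, `ss_linkage` and `ss_residual` are intended to correspond to SQI, SQLM and SQR; the function name and the toy table are hypothetical.

```python
import numpy as np


def two_way_anova_no_replication(values):
    """Partition the sum of squares of an (index x linkage) table for one lake."""
    x = np.asarray(values, dtype=float)
    a, b = x.shape                      # a indices (factor A), b linkage methods (factor B)
    grand = x.mean()

    ss_index = b * np.sum((x.mean(axis=1) - grand) ** 2)     # between rows
    ss_linkage = a * np.sum((x.mean(axis=0) - grand) ** 2)   # between columns
    ss_total = np.sum((x - grand) ** 2)
    ss_residual = ss_total - ss_index - ss_linkage

    df_index, df_linkage = a - 1, b - 1
    df_residual = df_index * df_linkage
    ms_residual = ss_residual / df_residual
    return {
        "ss_index": ss_index,
        "ss_linkage": ss_linkage,
        "ss_residual": ss_residual,
        "ss_total": ss_total,
        "F_index": (ss_index / df_index) / ms_residual,
        "F_linkage": (ss_linkage / df_linkage) / ms_residual,
    }


# Toy 2 x 2 table (hypothetical values); the study's tables would be 4 x 7.
out = two_way_anova_no_replication([[1.0, 2.0], [3.0, 5.0]])
```

Because there is a single value per cell, the residual term absorbs any index-by-linkage interaction, which is why the interaction cannot be tested separately. Each F value would be compared with an F distribution on (a - 1, (a - 1)(b - 1)) or (b - 1, (a - 1)(b - 1)) degrees of freedom.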
We calculated the ANOVA by performing spatial decomposition of the sum of squares; i.e., the variation caused by the indices and linkage methods was determined in each floodplain lake. Thus, we mapped the variation obtained for the indices and linkage methods along the middle floodplain of the Araguaia River. To statistically verify the spatial patterns identified in the maps, we built spatial correlograms for each component of variation (i.e., SQI, SQLM and SQT) using Moran's I coefficient. We employed six distance classes according to Sturges' rule (Legendre & Legendre 1998), and obtained significance levels using Bonferroni correction (Legendre & Legendre 1998). The spatial analyses were conducted with the software SAM (Rangel et al. 2010).
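As an illustration of the correlogram ingredients, the sketch below computes Moran's I for one distance class with binary weights and the number of classes suggested by Sturges' rule, in Python with NumPy. The authors used the SAM software, so this is not their implementation. It assumes planar coordinates and Euclidean distances, and it applies Sturges' rule to the number of lakes, which yields six classes for 29 lakes; whether the authors applied the rule this way is an assumption. Function names and toy data are hypothetical.

```python
import math
import numpy as np


def sturges_classes(n):
    """Number of classes from Sturges' rule: 1 + log2(n), rounded up."""
    return math.ceil(1 + math.log2(n))


def morans_i(values, coords, lower, upper):
    """Moran's I for pairs whose distance d satisfies lower < d <= upper."""
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    # Euclidean distance between every pair of sites
    dist = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
    w = ((dist > lower) & (dist <= upper)).astype(float)   # binary weights
    dev = x - x.mean()
    numerator = (w * np.outer(dev, dev)).sum()
    return (len(x) / w.sum()) * numerator / (dev ** 2).sum()


# Four hypothetical sites on a line with a steadily increasing value.
coords = [[0, 0], [1, 0], [2, 0], [3, 0]]
values = [1.0, 2.0, 3.0, 4.0]
i_first_class = morans_i(values, coords, lower=0.0, upper=1.0)
n_classes = sturges_classes(29)
```

A correlogram repeats the `morans_i` call for each distance class; with a Bonferroni correction over k classes, each class would be tested at the nominal significance level divided by k.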
Canonical Correspondence Analysis (CCA) was performed to verify whether the indices and linkage methods were associated with different local or landscape variables. The variables used were: oxygen saturation, total phosphorus, total nitrogen, transparency, zooplankton density and the land cover classes native Cerrado vegetation, agriculture and pasture. In this analysis, the limnological variables and zooplankton density were standardized using the Z-score method, and the land use variables were transformed using the arcsine of the square root, multiplied by 180/π. The CCA was performed using the vegan package (Oksanen et al. 2015) in R (R Core Team 2015). The overall significance of the CCA was verified with a Monte Carlo test with 1000 randomizations. A summary of the methodology used in this study is represented schematically in Figure 2.
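The two transformations applied before the CCA can be sketched as follows in Python with NumPy. The sketch assumes that land cover is supplied as a percentage (0 to 100) and converted to a proportion before the arcsine square-root step, and that the Z score uses the sample standard deviation; the source specifies neither detail. Function names and example values are hypothetical.

```python
import numpy as np


def z_score(x):
    """Center to mean 0 and scale to unit sample standard deviation (ddof=1 assumed)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)


def arcsine_sqrt_degrees(percent):
    """Arcsine of the square root of a proportion, expressed in degrees (x 180/pi)."""
    p = np.asarray(percent, dtype=float) / 100.0   # assumes input in percent
    return np.degrees(np.arcsin(np.sqrt(p)))


phosphorus_z = z_score([1.0, 2.0, 3.0])              # hypothetical measurements
pasture_deg = arcsine_sqrt_degrees([0.0, 50.0, 100.0])  # hypothetical cover values
```

The arcsine transform maps 0 %, 50 % and 100 % cover to 0, 45 and 90 degrees, stretching the ends of the percentage scale relative to the middle.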

Results
We found 115 phytoplankton species considering all evaluated floodplain lakes, with a mean of 14 species per lake (coefficient of variation = 27 %). On average, considering all linkage methods, we found 13 functional groups in the lake with the highest species richness and 5 functional groups in the lake with the lowest species richness. Mean values of FD, MPD and MNTD in the lake with the highest species richness were 1.00, 0.93 and 0.57, while in the lake with the lowest species richness they were 0.42, 0.96 and 0.99, respectively. The FGR and FD indices were positively correlated with species richness, while MPD and MNTD were negatively correlated with richness, although for MPD the correlations were not significant (Tab. 2).
The two-way ANOVA without replication applied to the functional diversity indices and linkage methods indicated that FGR, FD, MPD and MNTD values were distinct from each other for most of the assessed lakes (except for lakes 7, 17 and 24). However, we did not observe differences among linkage methods (Fig. 3). When we compared the indices and linkage methods, most of the variation in the values of functional diversity was assigned to the type of index used (Tab. 3).
Distinct values were observed for different functional diversity indices within the same community. However, we did not find a spatial pattern in the variation of the indices across the 29 sampling units. This is demonstrated by the dispersal pattern of the mapped points (Fig. 4) and by the absence of significant Moran's I values (Tab. 3).
The first and second CCA axes applied to the environmental variables and functional diversity indices together explained 29 % of the variation in the data. We observed the formation of three distinct groups of indices (Fig. 5); the first group corresponded to MNTD, the second to MPD and the third to FGR and FD. Local environmental and landscape variables did not exhibit statistically significant relationships with any of the functional diversity indices (Monte Carlo test, P = 0.32); in other words, these variables, which were measured in each lake, did not explain the variation among indices.

Discussion
The evaluation of variation in methodological approaches has been performed for species distribution models (e.g., Diniz-Filho et al. 2009; Tessarolo et al. 2014), richness estimators (e.g., Brose et al. 2003), beta diversity indices (e.g., Anderson et al. 2011) and functional diversity indices (e.g., Petchey et al. 2004; Schmera et al. 2009; Mouchet et al. 2010). In this study, we used an approach that enabled us to map the sources of variation (i.e., type of index and linkage method) when calculating functional diversity via dendrograms and to evaluate their relationships with environmental conditions. Our results indicate that the main source of variation for functional diversity values in phytoplankton communities is the type of index used, while the choice of linkage method seems to be less influential.
The number of planktonic species sampled here may be considered low when compared to other studies conducted in the Araguaia River floodplain (e.g., Nabout et al. 2006; 2007). This is probably due to the period in which the samples were collected, since the flood pulse tends to homogenize the environments of the plain (Thomaz et al. 2007). A relationship between species richness and functional diversity indices is commonly found in different communities (e.g., Petchey et al. 2009; Bihn et al. 2010; Hejda & de Bello 2013), with the strongest effects usually occurring when a large number of functional traits are used that are not correlated with each other (Petchey et al. 2009). FD and FGR were positively correlated with species richness, while MPD and MNTD presented the opposite pattern (although MPD showed no significant correlation). These differences are associated with the mathematical characteristics of the indices. While FD and FGR are metrics that sum functional differences, MPD and MNTD are based on the average of the pairwise difference between species, so that the addition of a redundant species can reduce the values of MPD and MNTD (Petchey et al. 2009; Tucker et al. 2017). Our results suggest that species added in richer communities tended to be functionally redundant, causing the average distances to be reduced and generating a negative correlation, a pattern that can be interpreted as resulting from communities structured by environmental filtering (Mouillot et al. 2007). In fact, by analyzing the same dataset, Machado et al. (2016) found that local environmental variables were important in driving variation in phytoplankton community composition.
The four indices showed different values within the same community and this pattern of variation was not spatially structured. That is, we did not find a spatial pattern for the variation of functional diversity along the Middle Araguaia River. Thus, it was not possible to identify sets of floodplain lakes that were geographically closer, in which variation in the indices used was higher or lower. Phytoplankton species exhibit different forms of dispersal (Reynolds [...]

Table 3. Variation attributed to the index (SQI), linkage method (SQLM) and the interaction between the two factors (SQT), considering the 29 floodplain lakes. The spatial variation in each component was evaluated using Moran's I coefficient. The values in the table indicate Moran's I coefficient for the first distance class. We did not obtain any significant correlograms according to the Bonferroni correction.

[...] group; that is, large interspecific functional differences define the formation of functional groups (for sensitivity of FGR see Petchey & Gaston 2002). However, more functionally unique species have relatively high contributions to variation in functional diversity measured by FD and FGR, which contributes to these indices being grouped in a similar way. Based on our results, it is not possible to recommend an index or linkage method that is more appropriate for verifying the association of functional diversity with environmental conditions. However, in this situation, we suggest that cophenetic correlation (Petchey & Gaston 2007) or consensus trees (Mouchet et al. 2008) should be adopted for the selection of a linkage method, and that the association of the different indices with environmental variables should be carefully considered.
In general, a consensus measure of functional diversity is not yet considered to exist (Ricotta 2005; Maire et al. 2015). Thus, the evaluation of differences in the performance of functional diversity indices is highly relevant, as is an evaluation of what different metrics are actually measuring. This study has provided a methodological approach for decomposing and mapping variation in functional diversity indices based on dendrograms and has further evaluated their relationships with environmental conditions. Although local environmental and landscape variables do not explain the differentiation among functional diversity indices, we encourage further studies using different taxa and spatial scales to adopt this approach. For example, at large scales it is possible to investigate variation in functional diversity indices in relation to environmental gradients and climatic and historical effects. Furthermore, this methodology can be tested in other communities with different richness levels or in relation to other facets of functional diversity, or include the variation generated by intraspecific variability (e.g., Cianciaruso et al. 2009), missing data or trait transformations (e.g., Májeková et al. 2016).
For the natural phytoplankton communities in the studied floodplain lakes, the linkage method was less important for functional diversity values. Thus, the most important criterion is the choice of index. In summary, the identification of a perfect functional diversity metric is a task that is far from complete, because each one expresses functional diversity differently (i.e., mathematically and methodologically). Nevertheless, approaches that measure variation of indices may be useful due to the great number of metrics that continue to arise and the increasing need for consolidation of functional diversity concepts.

Figure 1. Location of the 29 floodplain lakes sampled along the Middle Araguaia River and its tributaries. Numbers indicate the floodplain lakes studied along the basin.

Figure 2. Schematic representation of the methods used to verify variation in the performance of functional diversity indices based on dendrograms and their linkage methods. The abbreviations used are as follows: FGR (Functional Group Richness); FD (Functional Diversity); MPD (Mean Pairwise Distance); MNTD (Mean Nearest Taxon Distance); Single (Single Linkage Agglomerative Clustering); Ward (Ward Minimum Variance Method); Complete (Complete Linkage Agglomerative Clustering); UPGMA (Unweighted Arithmetic Average Clustering); WPGMA (Weighted Arithmetic Average Clustering); WPGMC (Weighted Centroid Clustering); UPGMC (Unweighted Centroid Clustering); CCA (Canonical Correspondence Analysis).

Figure 4. Variation observed among the four functional diversity indices (SQI) and the seven linkage methods (SQLM) across the sampling units. The sizes of the dots indicate the values obtained for the sum of squares of the indices. We did not find any significant differences for the sum of squares for the linkage methods.