Testing multivariate analysis in paleoenvironmental reconstructions using pollen records from Lagoa Salgada, NE Rio de Janeiro State, Brazil

Despite the indisputable significance of identifying modern analogs for Paleoecology research, relatively few studies have attempted to integrate modern and fossil samples in paleoenvironmental reconstructions, and Palynology is no exception to this pattern. This study demonstrates the practical application of modern pollen deposition data to paleoenvironmental reconstructions based on fossil pollen by using multivariate analysis. The main goal of this study was to use Detrended Correspondence Analysis (DCA) to compare pollen samples from two sediment cores collected at Lagoa Salgada, a coastal lagoon located in northeastern Rio de Janeiro State. Furthermore, modern surface samples were also statistically compared with samples from both cores, providing new paleoecological insights. DCA demonstrated that samples from both cores are more similar than previously expected, and that a strong pattern, related to a paleoenvironmental event, is present within the fossil data, clearly separating in the scatter plot the samples that represent pre- and post-environmental change. Additionally, it became apparent that modern vegetation and environmental conditions were established in this region 2500 years before present (BP). Multivariate Analysis allowed a more reliable integration of modern and fossil pollen data, proving to be a powerful tool in Paleoecology studies that should be employed more often in paleoclimate and paleoenvironmental reconstructions.

types and should always be carried out together with fossil pollen reconstructions.
Despite the early thoughts on tropical pollen analysis being very discouraging (Faegri 1966), mainly because of the extremely high plant diversity and because most of the plants display entomophilous pollination (pollen dispersal by insects), Flenley (1973) conducted one of the first studies on pollen rain in the tropics. This author found that, even though the tropical pollen flora presented a high diversity of plants, palynological studies were not unfeasible, and pollen production and dispersal were not as limiting as initially believed. Additionally, although the pollen of numerous tropical plant taxa was rarely or never found in sediment samples, enough pollen grains were being produced and dispersed to sample sites to allow the pollen assemblages to be related to specific vegetation types.

NUMERICAL TECHNIQUES IN PALEOECOLOGY
In the last two decades, Paleoecology has undergone a dramatic shift from being purely qualitative and subjective to being more quantitative, making use of many numerical techniques such as Multivariate Analysis. This change was probably due to the evolution of personal computers, which allowed significant improvements in existing statistical software packages and the design of new ones (Pielou 1984, McCune and Grace 2002).
The same trend was observed in Palynology. Most of the earlier studies with fossil pollen that used any numerical analysis employed only Constrained Cluster Analysis in order to facilitate the identification of changes in pollen assemblages. This classification technique was usually performed with CONISS (Grimm 1987), a program included in the Tilia package (Grimm 1992). In the last decade, however, an increasingly large number of studies have started to apply different techniques of Multivariate Analysis in Palynology, such as Detrended Correspondence Analysis (DCA), Principal Component Analysis (PCA), Two-way Indicator Species Analysis (TWINSPAN), Non-metric Multidimensional Scaling (NMDS), and even Canonical Correspondence Analysis (CCA) (e.g. Bush et al. 1990, 2005, Bush and Colinvaux 1990, Rodgers and Horn 1996, Behling et al. 1997, Haberle and Bennett 2001, Stutz and Prieto 2003, Gosling et al. 2005, Urrego et al. 2005, De Toledo and Bush 2008a, b).
Detrended Correspondence Analysis (DCA) (Hill and Gauch 1980) is an ordination technique based on Correspondence Analysis (CA, also known as Reciprocal Averaging, RA) and, like CA, it ordinates both samples and species simultaneously. However, DCA provides a correction for the "arch effect", a mathematical artifact that is almost always present when more than one dimension is extracted in CA, and in other ordination techniques as well (e.g. PCA). By dividing the first axis into segments and then setting the average score on Axis 2 within each segment to zero, DCA avoids this mathematical artifact. The main goal of this multivariate technique is to reduce the number of ecological dimensions with a minimum loss of information (McCune and Grace 2002), allowing the most relevant of a large number of patterns in the data to be observed and showing the similarity, association, and correlation among samples and species (Behling et al. 2005).
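The detrending step described above can be illustrated with a short sketch. This is a deliberately simplified version that assumes equal-width, non-overlapping segments; it is not the algorithm implemented in PC-ORD or in the original DECORANA program, which also smooth across segments and rescale the axes. The function name `detrend_by_segments` and the example scores are hypothetical.

```python
def detrend_by_segments(axis1, axis2, n_segments=4):
    """Subtract from each Axis 2 score the mean Axis 2 score of its Axis 1 segment."""
    lo, hi = min(axis1), max(axis1)
    # Segment width along Axis 1; fall back to 1.0 if all scores are equal.
    width = (hi - lo) / n_segments or 1.0
    # Index of the segment each sample falls into (the maximum goes in the last one).
    segment = [min(int((x - lo) / width), n_segments - 1) for x in axis1]
    detrended = list(axis2)
    for s in range(n_segments):
        members = [i for i, k in enumerate(segment) if k == s]
        if not members:
            continue  # empty segment, nothing to centre
        mean = sum(axis2[i] for i in members) / len(members)
        for i in members:
            detrended[i] = axis2[i] - mean
    return detrended


# Hypothetical scores: Axis 2 is an arch-shaped (quadratic) function of Axis 1.
axis1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
axis2 = [x * (7.0 - x) for x in axis1]
detrended = detrend_by_segments(axis1, axis2, n_segments=4)
print(detrended)
```

In this toy case the arch, which spans 0 to 12 on Axis 2, is reduced to residuals between -3 and 3, and the mean Axis 2 score within every segment is zero.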
Of all Multivariate techniques, DCA was chosen basically because it is not affected by mathematical artifacts (e.g. the "arch effect", very common in other popular techniques; McCune and Grace 2002), and also because the ordination axes are rescaled in standard deviation units (SD), allowing estimates of species turnover rates.

INTEGRATING MODERN POLLEN RAIN AND FOSSIL POLLEN DATA
Several studies on modern pollen deposition have been carried out worldwide. Three types of samples have been used in these studies: artificial pollen traps (Tauber 1974, Bush 1992, Behling et al. 1997, Bennett and Hicks 2005, Gosling et al. 2005); sedimentary surface samples from soil, peat bogs and lakes, the latter also known as mud-water interface samples (Rodgers and Horn 1996, Phillips et al. 1997, Vincens et al. 1997, Willard et al. 2001); and finally samples from moss cushions (Weng et al. 2004). Even though many studies of modern pollen rain use samples from different origins to characterize present vegetation, hardly any of them have attempted to evaluate potential differences in pollen contribution among sample types (Wilmshurst and McGlone 2005) and the consequences for paleovegetation and paleoclimate reconstructions.
Despite modern pollen rain being extremely important for paleovegetation and paleoclimate reconstructions, only a few studies have accomplished the integration of modern and fossil pollen data, even with the availability of numerical techniques to aid in the identification of characteristic communities. Phillips et al. (1997) conducted a vegetation survey in Panama using modern pollen data to characterize different vegetation types of coastal swamps. Afterwards, without any statistical method, they tried to correlate the vegetation types with fossil pollen assemblages from a sediment core.
In a study carried out in Florida (USA), Willard et al. (2001) showed the usefulness of modern pollen assemblages in estimating past trends of vegetation, hydrology, and the environmental parameters that control plant communities in the Everglades. They used Cluster Analysis and the Modern Analog Technique, and found that both methods were complementary.

An Acad Bras Cienc (2009) 81 (4)
Additional results demonstrating the applicability of modern pollen rain in paleoclimate and paleovegetation studies are available for southern South America as well. Markgraf et al. (2002) and Stutz and Prieto (2003) analyzed modern pollen samples crossing the Andes from southern Chile to Patagonia, and along a coast-inland transect in Mar Chiquita, Argentina, respectively. Techniques of Multivariate Analysis were successfully applied in both studies and, although a few fossil communities had no contemporary counterparts, it was possible to identify modern analogs for most of the fossil pollen assemblages, which allowed more reliable paleoclimate and paleoenvironmental reconstructions of these fossil records to be accomplished.
In Brazil, research on present vegetation using modern pollen rain has apparently received little attention, judging by the relatively small number of published studies. Salgado-Labouriau (1973) and Bush (1991) presented the earliest data on modern pollen deposition in cerrado vegetation from central Brazil and in rain forest from Central America and Amazonia, respectively. Still in Amazonia, Barth et al. (2002) carried out a pollen analysis of soil samples from a transect across a savanna-like area in the middle of the lowland rain forest in Pará State.
The first systematic pollen rain study on southern Atlantic rain forest was published by Behling et al. (1997). [...] Although these are the only studies on modern pollen distribution published for Rio de Janeiro State, these data were not systematically used in the interpretation of their respective fossil pollen records (Luz et al. 2006, Barreto et al. 2007).
This study attempts to demonstrate the practical application of modern pollen rain data on paleoclimate and paleoenvironmental reconstructions based on fossil pollen and using multivariate analysis.

STUDY GOALS
The main goal of this study was to test how an advanced statistical technique of multivariate analysis, Detrended Correspondence Analysis (DCA; Hill and Gauch 1980), performs when comparing and contrasting three different sets of pollen samples.
The specific goals were to statistically compare pollen samples from two sediment cores collected at Lagoa Salgada, which were analyzed four years apart, and to evaluate the existing paleoenvironmental interpretations. Additionally, samples from both cores were statistically compared with modern pollen rain samples in order to aid in paleoenvironmental reconstructions.

TESTED HYPOTHESES
H1: There will be a great dissimilarity among samples from cores T1 and T2.
The sample sets were obtained from two sediment cores (T1 and T2), collected less than 1000 m apart at Lagoa Salgada (northeastern Rio de Janeiro), and were analyzed four years apart. Although both cores were analyzed by the same person, there was a marked difference in pollen identification skills at the time of each analysis, which was confirmed by estimates of pollen richness and diversity from both cores. It is well known that core location may have some influence on the pollen spectra, due mainly to different pollen settling velocities and wind patterns. However, as the vegetation around Lagoa Salgada is basically the same, it is acceptable to assume that the learning curve of pollen identification will have a much greater impact on the pollen spectra, especially because a student under training will not be able to see the differences among relatively similar morphological features and will group them under the same category. Furthermore, it would be unreasonable to expect any senior Palynologist to check 200-300 pollen grains per sample (a minimum standard pollen sum) times an average of 10-15 samples, which would give a total of 2000-4500 pollen grains for each sediment core (for instance) per student.
Therefore, we expect samples from T1 and T2 to be placed separately on the scatter diagram, which would demonstrate their low similarity.
H2: Surface samples will have a greater similarity towards upper samples (younger in age) from T2 than towards the bottom ones.
Additionally, as samples of the transect and core T2 were analyzed within the same time frame (same level of identification skills), we expect the top samples from T2 and the surface samples to form a cluster on the scatter plot, suggesting their similarity, and maybe even indicating when the modern vegetation and environmental settings were established. We make no predictions on how samples from T1 will be placed, however.

STUDY SITE
Lagoa Salgada (21°54′47″S / 41°00′34″W) is a coastal lagoon located near São Tomé Cape, in northeastern Rio de Janeiro State, on the coastal plain formed by beach ridges at the mouth of the Paraíba do Sul River, less than 3 km from the Atlantic Ocean (Fig. 1). The lagoon has brackish water and is 8.6 km long, 1.9 km wide and only 1 m deep, with no tributaries. According to Martin et al. (1993), Lagoa Salgada started to form during a phase of coastal erosion (3900-3600 years BP) due to Holocene sea-level rise (Suguio et al. 1985), while radiocarbon dates on marine shells provide an age of 3000 years BP for its formation.
The modern climate is tropical and humid, with mean annual temperatures between 19 and 23 °C and mean annual precipitation of less than 1000 mm, falling mainly between December and March, with a dry season from May to September (IBGE 2002).
Even though changes in land use have replaced the natural vegetation, converting large areas to farming activities (e.g. cattle ranching and sugarcane agriculture), the pollen signal from Lagoa Salgada may be influenced by four major vegetation types: semi-deciduous forest, tropical rain forest, restinga (maritime vegetation with shrub and tree components), and ruderal plants (weeds).

MATERIALS AND METHODS
Two sediment cores were collected by hammering a pipe into the bottom sediments of Lagoa Salgada. While T1 (previously known as IIA2) was collected in 1992 (Lemos R.M.T., unpublished data), T2 (previously known as T-02) and a transect of surface samples were collected in 1996 (De Toledo M.B., unpublished data) (Fig. 1). Surface samples were collected on a zigzag-shaped transect in the middle of the lagoon, spaced approximately 50 m from each other, and consist of the top 3-4 cm of sediment. Both cores were split open, described and sub-sampled at the laboratory. A total of 33 samples were analyzed for their pollen contents: 9 samples from T1; 14 samples from T2; and 10 surface samples.
Processing of samples for pollen extraction with HCl, KOH, HF, and acetolysis followed the methodology proposed by Ybert et al. (1992). In order to calculate pollen concentrations, tablets of Lycopodium spores were added to the samples prior to processing (Stockmarr 1971). The pollen residues were mounted in glycerin jelly, and counts of at least 200 grains were conducted at 400× and 1000× magnification on a Zeiss Axiolab microscope. Pollen grains were identified by comparison with the reference collection of modern pollen housed in the Laboratório de Palinologia of the Universidade Federal do Rio de Janeiro, and with published catalogs containing photographs and morphological descriptions of pollen types (Hooghiemstra 1984, Roubik and Moreno 1991). The pollen sum, which included only the terrestrial taxa, and the pollen percentages and concentrations were all calculated in TILIA (Grimm 1992).
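As an illustration of the calculations mentioned above, the sketch below computes pollen percentages relative to a terrestrial pollen sum and a pollen concentration from an exotic Lycopodium marker. It is not the TILIA implementation; the function names, taxa, counts, number of marker spores added, and sample volume are all hypothetical values chosen only to make the arithmetic easy to follow.

```python
def pollen_percentages(counts, terrestrial_taxa):
    """Percentage of every taxon relative to the pollen sum (terrestrial taxa only)."""
    pollen_sum = sum(counts[taxon] for taxon in terrestrial_taxa)
    return {taxon: 100.0 * n / pollen_sum for taxon, n in counts.items()}


def pollen_concentration(grains_counted, marker_counted, marker_added, volume_cm3):
    """Grains per cm3, scaled by the fraction of added marker spores that was counted."""
    return grains_counted * marker_added / (marker_counted * volume_cm3)


# Hypothetical counts for one sample; Typha is aquatic, so it is left out of the sum.
counts = {"Poaceae": 120, "Myrtaceae": 60, "Cecropia": 20, "Typha": 50}
terrestrial = ["Poaceae", "Myrtaceae", "Cecropia"]

percentages = pollen_percentages(counts, terrestrial)
# Assumed: 10000 marker spores added, 100 of them counted, 1 cm3 of sediment.
concentration = pollen_concentration(200, 100, 10000, 1.0)
print(percentages, concentration)
```

Taxa outside the pollen sum (here the aquatic Typha) are still expressed as a percentage of that sum, which is why the percentages of all taxa together can exceed 100.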
Detrended Correspondence Analysis (DCA; Hill and Gauch 1980) was carried out on pollen data using the new algorithm provided in PC-ORD 4.0 (McCune and Mefford 1999). In order to have a better understanding of the factors affecting sample distribution along axes 1 and 2, three matrices containing percentage pollen data were prepared: a) matrix I - percentage pollen data from cores T1 and T2, including all species recorded (84 species); b) matrix II - percentage pollen data from cores T1 and T2, including only shared species (15 species); and c) matrix III - percentage pollen data from cores T1 and T2, plus surface samples, but including only shared species (15 species).
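The construction of an "all species" matrix and a "shared species" matrix can be sketched as follows. This is only an illustration under stated assumptions: the sample labels, taxa and percentages are invented, the helper names are hypothetical, and the text does not say whether percentages were recalculated after restricting the matrix to shared species, so the sketch simply keeps the original values.

```python
def taxa_recorded(samples):
    """Set of taxa with a non-zero value in at least one sample of a dataset."""
    return {t for row in samples.values() for t, v in row.items() if v > 0}


def shared_taxa(*datasets):
    """Taxa recorded in every dataset, sorted for a stable column order."""
    return sorted(set.intersection(*[taxa_recorded(d) for d in datasets]))


def build_matrix(taxa, *datasets):
    """Stack the samples of all datasets into rows, one column per taxon."""
    labels, rows = [], []
    for dataset in datasets:
        for label, row in dataset.items():
            labels.append(label)
            rows.append([row.get(t, 0.0) for t in taxa])  # absent taxon -> 0
    return labels, rows


# Hypothetical percentage data (sample label -> taxon -> percentage).
t1 = {"T1-10": {"Poaceae": 50.0, "Myrtaceae": 50.0},
      "T1-90": {"Poaceae": 20.0, "Cecropia": 80.0}}
t2 = {"T2-5": {"Poaceae": 40.0, "Myrtaceae": 30.0, "Arecaceae": 30.0}}

all_taxa = sorted(taxa_recorded(t1) | taxa_recorded(t2))  # analogue of matrix I
common = shared_taxa(t1, t2)                              # analogue of matrix II
labels, matrix_shared = build_matrix(common, t1, t2)
print(common, labels, matrix_shared)
```

A third matrix that adds surface samples would follow the same pattern, passing the surface dataset as an extra argument to `shared_taxa` and `build_matrix`.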

SEDIMENTARY RECORD
Core T1 (IIA2 in Lemos R.M.T., unpublished data) was 120 cm long, composed of gray clayed-mud at the bottom, overlain by calcareous mud, and with organic-rich mud on top. Core T2 was 130 cm long and showed the same composition as T1: gray clayed-mud at the bottom, replaced by calcareous mud in the middle, and overlain by organic mud on top. The conventional radiocarbon date of the carbonate lenses (55 and 50 cm) provided an age of about 2500 years BP (De Toledo M.B., unpublished data). It is worth mentioning that the sediment distribution in both cores appears to represent an environmental change, which is recorded by the replacement of gray clayed-mud by calcareous mud, indicating relatively drier conditions.

POLLEN RECORD
The pollen record from both cores (T1 and T2) also showed an environmental discontinuity, which allowed the recognition of two main phases, interpreted as humid conditions in the basal portion of the cores, replaced by relatively drier conditions in the layers above (Lemos R.M.T., unpublished data, De Toledo et al. 1996, De Toledo M.B., unpublished data). This major vegetation change seems to have taken place between 70 and 55 cm depth in both cores, right below the carbonate lenses, where the calcareous mud replaces the gray clayed-mud (ca. 2500 years BP). The shift to drier conditions is, therefore, confirmed by both pollen records (De Toledo M.B., unpublished data). As the goal of this research is to run Detrended Correspondence Analysis (DCA) to compare the T1 and T2 pollen records, and also to identify contemporary analogs from modern pollen rain, reviewing previous paleoclimate and paleoenvironmental interpretations is beyond the scope of this investigation. Therefore, the respective pollen diagrams are not presented here, as they do not serve the purpose of this study and can be found in the original literature (Lemos R.M.T., unpublished data, De Toledo et al. 1996, De Toledo M.B., unpublished data).
Regarding the surface samples, the highest pollen concentration was observed in the samples collected at the central portion of the lagoon, which could be expected, as these samples are located in the deepest area.The pollen richness of surface samples was higher than samples from T1, but comparable with samples from T2, which was also expected, as T2 and surface samples were analyzed within the same timeframe.

MULTIVARIATE ANALYSIS - FOSSIL RECORDS
Detrended Correspondence Analysis (DCA) on pollen data (including all species) from cores T1 and T2 displayed a strong polarization on Axis 1 (Fig. 2A), placing samples from "pre-environmental change" (from now on referred to as bottom samples) on the left side and "post-environmental change" (from now on referred to as top samples) on the right side of Axis 1. Interestingly, not all "post-change" samples are grouped together, as the very top samples from T2 (27-1 cm depth) scored higher values on Axis 2 and are spread over the top-right portion of the graph.
This particular sample distribution suggests that, although samples from very top of T2 may be to some extent unlike those from top of T1, both cores are indeed much more similar than expected, regarding their pollen composition.
To minimize the effect that the differences in pollen richness may have on a fair comparison of T1 and T2, as their pollen assemblages share only 15 species, DCA was performed once again on a matrix containing only the species common to both cores (Fig. 2B). The most noticeable changes in this second scatter plot were the inversion of Axis 1 and the formation of smaller clusters of samples. Nevertheless, the length of neither axis changed, and the very top samples of T2 are still polarized by Axis 2, being placed relatively apart from the other "post-environmental change" samples (Fig. 2B).
The fact that DCA shows a similarity among samples from T1 and T2, and even splits top from bottom samples, rather than just polarizing samples according to their respective cores, could suggest that numerical analyses, in this case DCA, are such powerful techniques that differences in pollen richness or diversity do not affect the analysis of similarity at all. However, pollen richness is basically a function of sample size or sampling effort (quantity of pollen grains counted per sample) and environment type (i.e. different types of vegetation yield different values of pollen richness, and larger lakes tend to yield richer and more diverse samples than smaller lakes). Therefore, if sample size is fixed (200 grains in this case), and the environment and vegetation are essentially the same, pollen richness could also vary according to depositional differences among areas within the same site. However, as there was a 4-year interval between the analyses of the two cores, which translates into different pollen identification skills, it is more plausible to accept that variations of such magnitude in pollen richness are due to differences in pollen identification skills. The fact that T2 was analyzed after T1 and has the highest pollen richness supports this suggestion. Although DCA and other multivariate analyses are undeniably remarkable tools that allow a graphic summarization of multidimensional data, showing the most important patterns and gradients, it would be incorrect to assume that these statistical techniques could fix bad data.
Matrices with differences in species richness of such magnitude (84 species - 15 species = 69 species) would be expected to yield pollen assemblages with a much greater dissimilarity than the one detected by DCA. One possible explanation for this is the presence of a very strong pattern within the data. When the pollen data reflect a drastic environmental change, such as the one described by De Toledo and Bush (2008a, b), it becomes easy for almost any statistical technique to identify the major trend with a strong polarization of samples. Furthermore, it is probable that the pollen assemblage that records this environmental change is composed of just a few major pollen types that were easily recognized in the analysis of T1.
Plotting the DCA scores from Axis 1 of both cores against sample depths demonstrates the major pattern of environmental change, as recorded by pollen assemblages (Fig. 3). The scores from T2 were multiplied by minus one (-1) in order to reverse the direction of the curve and allow better visual comparisons to be made. The most important thing to notice is that, in both cores, the changes in pollen assemblages are directional rather than chaotic, which shows a progression of communities from one state to another. An additional striking feature of these graphs is that the major paleoecological shift takes place between 70 and 55 cm depth (Fig. 3), which is close to 2500 years BP (De Toledo M.B., unpublished data). As DCA results pointed to a fairly high similarity between T1 and T2, even when using all pollen taxa recorded, hypothesis 1 is rejected. The potential effect of different stages of pollen identification skills was, apparently, not greater than the environmental change that took place in the region of Lagoa Salgada.
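Multiplying the scores by -1 is harmless because the sign of an ordination axis is arbitrary: it reverses the orientation of the curve but leaves the distances between samples unchanged. The sketch below illustrates this and locates the pair of adjacent depths with the largest change in Axis 1 score. The depths and scores are invented for illustration and are not the values plotted in Fig. 3; the function names are hypothetical.

```python
def flip_axis(scores):
    """Reverse the orientation of an ordination axis."""
    return [-s for s in scores]


def largest_shift(depths, scores):
    """Pair of adjacent depths (shallower, deeper) with the largest score change."""
    order = sorted(range(len(depths)), key=lambda i: depths[i])
    upper, lower = max(
        zip(order, order[1:]),
        key=lambda pair: abs(scores[pair[1]] - scores[pair[0]]),
    )
    return depths[upper], depths[lower]


# Hypothetical depths (cm) and Axis 1 scores (SD units) for one core.
depths = [1, 27, 37, 54, 70, 90, 110]
scores = [2.1, 2.0, 1.8, 1.7, 0.4, 0.3, 0.2]

flipped = flip_axis(scores)
shift_original = largest_shift(depths, scores)
shift_flipped = largest_shift(depths, flipped)
print(shift_original, shift_flipped)
```

Both calls return the same depth interval, which is the point of the example: flipping one core's axis only makes the two curves easier to compare by eye.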

MULTIVARIATE ANALYSIS - MODERN AND FOSSIL RECORDS
Integrating fossil and modern pollen data through Detrended Correspondence Analysis allowed a paleoenvironmental reconstruction of Lagoa Salgada based on statistical methods (Fig. 4). The scatter plot clearly shows the strongest polarization of samples along Axis 1, placing bottom samples from T1 and T2 (pre-environmental change) on the right side, and post-environmental change samples (top samples) on the left. It is remarkable that surface samples, which provide the modern pollen rain signal, or the proxy for present vegetation, form a cluster with near-modern pollen samples from T1 and T2, confirming that pollen assemblages from top samples have an analogue in the modern local community. It is also worth mentioning that the previously observed trend of directional change of communities (Fig. 3) persists when modern samples are added to the DCA. This pattern, which was probably caused by the environmental change recorded at Lagoa Salgada, is so strong in the dataset that axes 1 and 2 together account for 75% of the total variation within the data.
As surface samples are clustered with top samples (post-environmental change) from not only T2, but also T1, showing their high similarity, and bottom samples from both cores are found to have great dissimilarity with modern samples, hypothesis 2 is therefore accepted.
The observed pattern of sample distribution (Fig. 4) suggests that a vegetation type closely resembling the modern vegetation of Lagoa Salgada was established as early as 2500 years BP (between 70 and 55 cm depth). It is noteworthy that Axis 2 polarizes post-environmental change samples from T2, placing the lower ones (54 and 37 cm) at the bottom and the very top samples (27 and 1 cm) higher on the axis. This arrangement demonstrates that the highest similarity is between modern samples and the very top samples from T2.
The fact that there were no modern analogs for the pre-environmental change community could be due to the rather restricted variety of modern environments sampled, since all surface samples were taken from Lagoa Salgada. These results would probably change as new surface samples from different vegetation and environment types are added to the data.

CONCLUSIONS
The integration of modern and fossil pollen data from Lagoa Salgada, Rio de Janeiro State, was successfully accomplished using Detrended Correspondence Analysis, which provided more solid paleoecological interpretations.
The potential effect of different stages of pollen identification skills was, apparently, not greater than the impact of the environmental change that was recorded at Lagoa Salgada, northeastern Rio de Janeiro State. Furthermore, it has been demonstrated that Multivariate Analysis techniques, such as DCA, are very appropriate for identifying such strong patterns within paleoecological data.
The high similarity between surface samples and top-core (post-environmental change) samples suggests that vegetation and environmental conditions have been basically the same in this region for the last 2500 years.
Detrended Correspondence Analysis proved to be a powerful tool that can be very useful in Paleoecology, and definitely should be used more often on paleoclimate and paleoenvironmental reconstructions.

Fig. 1 - Map of the study area, showing the location of Lagoa Salgada in northeastern Rio de Janeiro State. The inset shows a portion of the lagoon with the sample locations (surface and cores).

Fig. 2 - Detrended Correspondence Analysis (DCA) of cores T1 and T2 showing a strong polarization of samples along Axis 1. A) DCA carried out on percentage pollen data of all species that were present in both cores together. B) DCA carried out on percentage pollen data of shared species. Various symbols were used to characterize different samples: black circles for bottom samples from T1; gray triangles for top samples of T1; black upside-down triangles for bottom samples of T2; and gray diamonds for top samples of T2.

Fig. 3 - DCA scores from Axis 1 plotted against sample depths (cm) from T1 (at the top) and T2 (at the bottom). Community changes are directional rather than chaotic, showing progression of the plant community from one state to another. The major change took place between 70 and 55 cm.

Fig. 4 - Detrended Correspondence Analysis (DCA) of cores T1 and T2, and surface samples. A strong polarization of samples is apparent: bottom samples (pre-environmental change) from T1 and T2 are placed on the right of Axis 1; top samples (post-environmental change) from T1 and T2 are placed on the left of Axis 1, together with surface samples; near-modern samples from T2 (27-1 cm depth) are separated from lower samples, being placed closer to surface samples. The same symbols from Figure 3 were used to characterize different samples: black circles for bottom samples from T1; gray triangles for top samples of T1; black upside-down triangles for bottom samples of T2; gray diamonds for top samples of T2; and open triangles for surface samples.
Luz et al. (2005) analyzed samples from Lagoa do Campelo, northeastern Rio de Janeiro State.