Improving collection efforts to avoid loss of biodiversity: lessons from comprehensive sampling of lycophytes and ferns in the subtropical Atlantic Forest

Estimating species richness with herbarium data and new collections allows us to understand the distribution of diversity. We investigated the accuracy of lycophyte and fern sampling along a vegetation gradient in the subtropical Atlantic Forest in southern Brazil. We compiled lycophyte and fern collection metadata, estimated species richness, and assessed sampling accuracy for sixty 50 x 50 km units using the ACE, Chao 1, Chao 2, Jackknife 1 and Jackknife 2 estimators. We compiled data for 12,779 lycophyte and fern specimens of 441 species, 67 of which were sampled in only one unit (singletons) and 35 in two units (duplicates). Of the 60 units examined, only 11 had observed values above 70% of their estimated values, and 14 had observed values between 65 and 70% of the estimated values, meaning that 35 units had a sampling accuracy of less than 65%. Despite the long history of lycophyte and fern collecting in the study area, there remain units with a lower than expected sampling accuracy for a subtropical forest. This finding indicates that a sizeable collection effort is needed in order to discover the actual distribution of species before the effects of fragmentation and deforestation become permanent.


Introduction
The worldwide interest in biodiversity issues has been growing steadily (Lughadha 2004; Lewinsohn & Prado 2005; Condon et al. 2008; Butchart et al. 2010; Barbault 2011; May 2011), manifested especially in the compilation of large surveys and lists of species (see Zuloaga et al. 2008; Forzza et al. 2012). However, most of the datasets are still spatially and taxonomically biased (Hortal et al. 2008; Sastre & Lobo 2009), especially when different sources are used, such as botanical collections. Nevertheless, these collections comprise key tools for ecological studies (Sánchez-Fernández et al. 2008; Maldonado et al. 2015), especially where the original vegetation still exists. Large on-line databases on biological diversity such as speciesLink (CRIA 2015) and the Global Biodiversity Information Facility (GBIF) can contribute to further analyses such as species modeling and geographical distribution (Feeley & Silman 2011) because they host data from various collections. Beyond the large amount of data currently available, scrutiny and validation by taxonomic experts are needed to ensure data quality (Maldonado et al. 2015).
Furthermore, these tools can help to identify insufficient sampling and gaps (Pyke & Ehrlich 2010; Maldonado et al. 2015), especially in tropical areas and biodiversity hotspots. Deforestation (Pandit et al. 2007), forest fragmentation (Rodríguez-Loinaz et al. 2011; Silva et al. 2014), human disturbances (Murphy & Romanuk 2014), undiscovered species (Tedesco et al. 2014), and climate change (Urban 2015) are some of the threats to biodiversity (Kim & Byrne 2006; Butchart et al. 2010), and all of these threats occur in the Brazilian Atlantic Forest. This biome has critical conservation status (like other hotspots; Sloan et al. 2014), retaining only ~12% of the original vegetation cover (Ribeiro et al. 2011), and deforestation continues (Fundação SOS Mata Atlântica & Instituto Nacional de Pesquisas Espaciais 2009).
In high-diversity regions, such as Brazil, two of the botanical groups that need more attention regarding sampling coverage are the lycophytes and ferns. In the Red Book of the Brazilian Flora (Martinelli & Moraes 2013), both groups are the most threatened when the estimated number of species in each taxon is considered. Despite this fact, the state of Santa Catarina has a well-known lycophyte and fern flora, as shown by large surveys such as the Flora of Santa Catarina project (1967-1984) and the Floristic and Forest Inventory of Santa Catarina (Vibrans et al. 2010; Gasper et al. 2012). However, even with these large projects, many areas were not sampled. The first project has approximately 180 collection points, the second has 597. Another concern is the time interval between these projects, because during this interval forest loss increased (Fundação SOS Mata Atlântica & Instituto Nacional de Pesquisas Espaciais 2009). Because of these kinds of initiatives, and also due to its physiognomic gradients, Santa Catarina constitutes an excellent environmental model for the investigation of floral distribution and sampling sufficiency (see also Rezende et al. 2014; 2015; Gasper et al. 2015).
Our aim was to analyze the sampling of lycophytes and ferns, evaluating both insufficiently and sufficiently sampled areas in Santa Catarina, Brazil, to discuss efforts for attaining more accurate sampling. We were guided by the following questions: 1) Even when a region is well studied, with standardized sampling units, can we see important collection gaps (insufficiencies) for lycophytes and ferns? We addressed this question based on richness estimators, which are powerful tools to check for sampling accuracy (Chiarucci et al. 2003); 2) Which geographic regions have a more complete sampling effort and which have inadequate collections? We expected that the coastal region, where Rainforest dominates, would contain the best-known vegetation, followed by the plateau (Mixed Forest) and the western region (Seasonal Forest), because, in spite of major projects conducted statewide, many collections have been made near protected areas and universities, and in the coastal region there is a higher concentration of these institutions. We discuss these issues in light of the conservation implications of under-sampling biodiversity in high-diversity and highly threatened regions.

Study area
The state of Santa Catarina is located in Southern Brazil, between 25°57'41'' - 29°23'55''S and 48°19'37'' - 53°50'00''W (Fig. 1). The climate is mesothermal and rather humid in the Southern Plateau (Cfb according to the Köppen classification) and humid subtropical along the coast and in the Atlantic Slope (Cfa according to the Köppen classification), with very high temperatures (Klein 1984). Nimer (1989) considered the region to be a temperate zone, with high cloud formation and rainfall regimes, the annual rainfall ranging from 1,250 mm to 2,000 mm. There are two very distinct seasons: cold winter and hot summer (Klein 1984). The first extends from June to August and the second from December to March.
Santa Catarina is covered by the Subtropical Atlantic Forest and its phytoecological zones are: Seasonal Semideciduous Forest, in the Uruguay River channel; Mixed Forest (Araucaria Forest), on the plateau and in the western region of Santa Catarina; and Rainforest, as well as the fluvial and marine influence zones, towards the coast (Mangrove and Coastal Dwarf Forest, respectively) (IBGE 1992; Oliveira-Filho 2009).

Data compilation
We used the same criteria adopted by Gasper et al. (2015) to delimit the sample units (SU) and compile the database, which had 12,779 lycophyte and fern records in Santa Catarina (Fig. 2), with all units represented by at least five samples. Of the registered species, three lacked municipality information (only Santa Catarina State was indicated) and were removed from the analysis. We prepared the database with information from collections made in Santa Catarina. The botanical material was deposited in the following herbaria: ASE, B, BHCB, BM, BOTU, CRI, ESA, F, FCAB, FIC, FLOR, FUEL, FURB, G, GH, GUA, HAS, HB, HBR, HCF, HRCB, HSJRP, HUCS, HUEFS, ICN, INPA, IRAI, JOI, JPB, K, LIL, MBM, MO, NY, P, PACA, R, RB, RBR, RJ, S, SP, SPF, UB, UC, UEC, UPBC and US (Thiers 2015); records also came from the IFFSC project (Vibrans et al. 2010), the 'Pteridófitas da Mata Atlântica' project (part of the dataset was compiled by Salino & Almeida 2009) and speciesLink (CRIA 2015). All information on the specimen labels was digitized. Theses and dissertations were used to compare the identifications in the herbaria with the database. Material that could not be checked for accuracy of identification was not considered in our analysis. Herbarium duplicates (same collector and number) were removed in order to avoid overestimates. We constructed two matrices: (i) a matrix with scientific names as well as longitude and latitude coordinates; (ii) a binary matrix (record presence or absence) by sampling unit. The first matrix was used to map the distribution of specimens and proceed with the grid estimators, and the second was used to estimate sampling effort. The coordinate used was the original one or, when the coordinate was lacking in the samples, we added a coordinate based on the collection locality (using other collections) or on the municipality coordinates. The municipalities of Santa Catarina are small; thus, when a coordinate was provided, the probability of changing the original SU was reduced.
Acta Botanica Brasilica - 30(2): 166-175
The layout and analysis maps of the sample units were developed using ArcGIS® 10 software (ESRI 2011) and the estimator grids in DIVA-GIS 7.5 (Hijmans et al. 2005). From the 60 sample units (Fig. 1), we obtained the sample number and the species number using the intersect function of ArcGIS® 10.
The Jackknife 1 estimator is based on the presence of uniques in sampling units, defined as the species recorded in only one sampling unit, and Jackknife 2 also considers the duplicates, i.e., species that occur in exactly two sampling units (Chiarucci et al. 2003). Chao 1 considers both singletons and doubletons, i.e., the number of species that have exactly one and two individuals (here, the number of records within a sampling unit), respectively, and Chao 2 considers both uniques and duplicates (Chao 1984; Chao & Lee 1992; Colwell & Coddington 1994). ACE is an abundance-based coverage estimator that considers species with 1-10 individuals (in our case, collections) plus the abundant species (>10 individuals) (Chao & Lee 1992; Colwell & Coddington 1994). All estimators assume that the higher the number of poorly represented species, the more likely it is that some other species occurring in the area are not represented at all in the data set.
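The role of uniques, duplicates, singletons and doubletons in these estimators can be illustrated with a short sketch. This is not the DIVA-GIS or EstimateS implementation: it uses the standard first- and second-order jackknife formulas and, as an assumption, the bias-corrected forms of Chao 1 and Chao 2 (the text does not state which variants were applied). All function names and the toy matrix are hypothetical, and ACE is omitted for brevity.

```python
from collections import Counter


def incidence_counts(matrix):
    """Summarise a binary matrix (rows = sampling units, columns = species).

    Returns (m, s_obs, q1, q2): number of sampling units, observed species,
    uniques (species in exactly one unit) and duplicates (exactly two units).
    """
    m = len(matrix)
    occupancy = [sum(column) for column in zip(*matrix)]
    tally = Counter(occupancy)
    s_obs = sum(1 for count in occupancy if count > 0)
    return m, s_obs, tally[1], tally[2]


def jackknife1(m, s_obs, q1):
    """First-order jackknife: adds the uniques, scaled by (m - 1) / m."""
    return s_obs + q1 * (m - 1) / m


def jackknife2(m, s_obs, q1, q2):
    """Second-order jackknife: uses uniques and duplicates (needs m >= 2)."""
    return s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2) ** 2 / (m * (m - 1))


def chao2(m, s_obs, q1, q2):
    """Bias-corrected Chao 2 (incidence-based), defined even when q2 == 0."""
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


def chao1(abundances):
    """Bias-corrected Chao 1 from per-species record counts in one unit."""
    counts = [n for n in abundances if n > 0]
    f1 = counts.count(1)  # singletons
    f2 = counts.count(2)  # doubletons
    return len(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical toy data: 4 sampling units, 5 candidate species.
toy = [
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 0],
]
m, s_obs, q1, q2 = incidence_counts(toy)
print(m, s_obs, q1, q2)
print(jackknife1(m, s_obs, q1), jackknife2(m, s_obs, q1, q2), chao2(m, s_obs, q1, q2))
```

With statewide totals such as those given in the abstract (60 units, 441 species, 67 uniques, 35 duplicates), the incidence-based functions could be called directly, e.g. `jackknife1(60, 441, 67)`; results need not match the published values exactly, because the software variants may differ from this sketch.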
We calculated the estimators for each SU independently, using DIVA-GIS (Hijmans et al. 2005) and a shapefile. Note that DIVA-GIS treats the number of records within each sampling unit as abundance data, although the original matrix was based only on presence/absence records.
For the whole study area, we used EstimateS 9.1 (Colwell 2013). We considered a SU to be well-sampled when the observed value reached at least 60% of the estimated value, according to the aforementioned estimators (near the value recommended by Heck Jr et al. 1975 and Jiménez-Valverde & Hortal 2003). This can be justified because when the records from the 20 SU below this cut-off were discarded, none of the species was removed from the general list. Hence, no species recorded in these 20 SU was different when compared with the other 40.
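The cut-off rule can be expressed as a simple ratio of observed to estimated richness. The sketch below is only an illustration of that rule; the function names are hypothetical, and the example values are the minimum and maximum observed and Jackknife 1 richness reported in the Results (5 vs. 9 and 248 vs. 307 species).

```python
def sampling_completeness(observed, estimated):
    """Proportion of the estimated richness that was actually observed."""
    if estimated <= 0:
        raise ValueError("estimated richness must be positive")
    return observed / estimated


def is_well_sampled(observed, estimated, cutoff=0.60):
    """Apply the 60% cut-off used in this study to one sampling unit."""
    return sampling_completeness(observed, estimated) >= cutoff


# Richest and poorest SU under Jackknife 1, as reported in the Results.
for observed, estimated in [(248, 307), (5, 9)]:
    ratio = sampling_completeness(observed, estimated)
    print(observed, estimated, round(ratio, 3), is_well_sampled(observed, estimated))
```

Under this rule the richest SU (about 81% of its estimate) counts as well-sampled, whereas the poorest (about 56%) does not.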

Results
Considering the entire state of Santa Catarina, the estimators showed values between 466 and 535 species (Tab. 1). For some SU, the observed values reached 70% of the estimated values, except for J2, where no SU had more than 67% (Fig. 3, Tab. 1). Considering the 60% cut-off value, we found between four and 27 well-sampled SU (Tab. 2). The western region, where Seasonal Forest predominates, had lower richness and poor sampling accuracy, which was almost achieved in two other SU (15 and 16). This also occurred in the Santa Catarina plateau (Mixed Forest) and ecotones with grasslands. The coastal region, where Rainforest predominates, showed the highest richness values. The coastal region also had the highest number of registered collections (Fig. 2).
Eleven out of the 60 SU tested showed values above 70% of the estimated richness values (using Jackknife 1), and 14 other SU had levels between 65 and 70%. In addition, 15 SU had values between 60 and 65%; thus, 67% of the SU had values above 60%, which was the cut-off used here to consider a SU accurately sampled. Twenty SU had observed values below 60% of the estimated ones. For the minimum of five and the maximum of 248 species recorded in a SU, the values estimated with Jackknife 1 were nine and 307 species, respectively. Twenty-six SU had estimated richness higher than 100 species, all of them located near the coastal zone (Tab. 1). Jackknife 2 did not exhibit any area with more than 70% of the observed value (compared to the predicted value), while each of the other indices exhibited more than 10 areas with such characteristics (Fig. 4, Tab. 2).

Discussion
Ferns and lycophytes are the most threatened plant group in the Brazilian flora (Martinelli & Moraes 2013). This threat has many causes, but one of them is habitat loss associated with low sampling efforts (Tedesco et al. 2014), especially before the accomplishment of two large surveys, namely the Illustrated Flora of Santa Catarina (Reitz 1965) and the IFFSC (Vibrans et al. 2010). This kind of survey can dramatically increase the number of recorded species, as recently occurred in Minas Gerais and Espírito Santo (T.E. Almeida, personal communication). These surveys made Santa Catarina a region with an even distribution of samples for angiosperms (Sousa-Baena et al. 2014); for lycophytes and ferns, we found areas with high sampling effort, but others with few collections. The overall study area was well-sampled, as could be expected for a large study area with standardized sampling units (Jiménez-Valverde & Hortal 2003; Rezende et al. 2014). When all data were analyzed together, we found 95% (ACE), 93% (C1), 88% (C2), 87% (J1) and 82% (J2) of fit between observed and estimated values, thereby indicating that Santa Catarina is well-sampled. However, all estimated values were higher than the number reported by the List of Species of the Brazilian Flora (2014), which is 448 species for Santa Catarina. This could be explained at least in part by a few records that we were not able to include in our database, such as those without a herbarium voucher, or because there are new species or records to be sampled. Within Santa Catarina Protected Areas, 4,012 samples and 296 species of lycophytes and ferns were recorded. Seven species were considered vulnerable, one species was classified as critically endangered, and seven other species were categorized as presumed extinct (Gasper & Salino 2015). No large protected area (greater than 20,000 hectares) was found in the central and west parts of the state; such areas are exactly those with few collections and few recorded species.
When we analysed each of the SU, at least 34% (J1) of them were not well-sampled (< 60% accuracy). This demonstrates that, even in a state with a tradition of collecting, much remains to be done to achieve satisfactory values for lycophytes and ferns. The data provided by herbaria (the aforementioned institutions), in digital or other formats, are crucial because, together with environmental data, they can help in the delineation of ranges and species descriptions (MacDougall et al. 1998). This information about species distribution is crucial to exploration, use, and conservation (Mutke & Barthlott 2005).
The different geographical areas have different sampling accuracy; the coastal region was the most thoroughly collected, followed by the plateau and western regions. Biased sampling occurred in the Rainforest region, where the Floristic and Forest Inventory of Santa Catarina State (IFFSC) collected epiphytes, a procedure not performed in other regions (Vibrans et al. 2010; Caglioni et al. 2012), with high sampling accuracy (>70%; SU in dark gray). The same bias in coastal areas (where the Rainforest occurs) was observed for angiosperms by Sousa-Baena et al. (2014). This bias may also be related to the heterogeneity of these SU (Ferrer-Castán & Vetaas 2005), since they cover montane coastal regions (Martinelli 2007), such as the Serra Geral, Serra do Mar, and their valleys (Klein 1980). Several authors indicated the importance of mountainous regions as areas of speciation (Kozak & Wiens 2012) and as providers of a favorable microclimate for some species (Holttum 1938; Parris 1985; Jones et al. 2011) because of the higher humidity of the region (Nimer 1989); humidity was identified by Gasper et al. (2015) as one of the variables that influence the composition of lycophytes and ferns in Santa Catarina. Other variables are the presence of active botanists (MacDougall et al. 1998) and available botanical inventories (Ahrends et al. 2011).
Most of the sample units (SU) that achieved sample sufficiency are located near universities or protected areas. These areas with a large sample number are concentrated in Rainforest vegetation, like the Serra do Itajaí National Park and the Florianópolis region (capital of Santa Catarina), or in a Mixed Forest plateau region, where some SU are located within the São Joaquim National Park, with 170 species recorded. These areas are generally not easily accessible, at least in their core area. Specific field trips in these areas explain the numerous collections and species richness, unlike easily accessible areas, where the number of samples is low (as in SU 46, 56, and 66), as they would not be attractive to large field expeditions because of the low vegetation cover (Fig. 2). In these cases, the museum effect (Nelson et al. 1990; MacDougall et al. 1998) appears to have had an important impact on some of the species-richest lycophyte and fern areas in Santa Catarina. The same bias in the coastal region was observed by Werneck et al. (2011), studying endemic angiosperm species of the Atlantic Forest. These authors found that few grid cells were well-surveyed and that the species richness in each cell depended on the sampling effort.
Based on the aforementioned data, it is possible to calculate how many areas would need to be sampled, giving researchers the ability to direct collection efforts. For example, it would probably be productive to sample at least the 20 SU that do not have a sufficiency above 60% and discover how much time would be required to achieve such a value; in addition, the cost of the field surveys could be ascertained (Soberón & Llorente 1993). Even in a well-sampled area where species are well-identified (which may assist in the extrapolation of data to be applied to poorly sampled areas), sampling inequalities can result in biased and partial descriptions of changes in biodiversity (Hortal et al. 2007).
It is essential to begin collecting in these areas since the forest cover is small and highly fragmented (Fundação SOS Mata Atlântica & Instituto Nacional de Pesquisas Espaciais 2009; Vibrans et al. 2013). The rates of forest-cover loss are growing in Santa Catarina (Fundação SOS Mata Atlântica & Instituto Nacional de Pesquisas Espaciais 2009), which directly affects the sampling accuracy found in the state and probably results in species loss (Tedesco et al. 2014). Rodríguez-Loinaz et al. (2011) showed that fragmentation had a negative effect on species richness and diversity, especially for ferns. Despite the fact that the number of endemic species was very low (five), the number of uniques was quite high (66). Considering this information, additional efforts should be taken to reduce the number of "rare" species (unicates or singletons), since the higher the number of unicates persisting in the data, the higher the chance that the total species richness has not been reached (Walther & Moore 2005). We believe that, due to the fact that at least 34% of SU did not exhibit an accurate sampling effort, the opportunity to assess the whole of the natural vegetation and its diversity is diminishing. In view of the several studies mentioned, species diversity could have been underestimated not only in Santa Catarina, but across Brazil. Even areas previously considered well-sampled may actually be undersampled, as demonstrated by Anthelme et al. (2011). Therefore, additional sampling effort should be directed towards areas that have not yet reached a minimum value of sampling intensity, here indicated to be between 60-70% for a subtropical zone, in order to improve the floristic sufficiency in each area (in the present case, for lycophytes and ferns), increasing precision and accuracy, and reducing the sampling bias (for further discussion, see Walther & Moore 2005). Well-sampled areas can still contain surprises, such as new records recently documented in certain protected areas (Funez & Gasper 2014), and therefore should not be ignored.
Our study demonstrates the relevance of this approach for the conservation of biodiversity, especially in regard to lycophytes and ferns, and might be seen as an additional warning tool in the process of moving towards more accurate sampling. We believe it is necessary to address these points for conservation purposes, since the state of Santa Catarina is a region with one of the greatest rates of loss of forest coverage in Brazil (Fundação SOS Mata Atlântica & Instituto Nacional de Pesquisas Espaciais 2009), and its forests are highly degraded and fragmented (Vibrans et al. 2013). Recent deforestation has expanded pastures, agricultural production, and plantations of exotic species, such as pine and eucalyptus, for timber and cellulose. Moreover, addressing these issues may be relevant not only for the state of Santa Catarina, but for the entire Subtropical Atlantic Forest. This study could call attention to the need to improve collection efforts before permanently losing lycophyte and fern biodiversity in the Subtropical Atlantic Forest, and indeed in other ecosystems outside Brazil.

Figure 1. Distribution of sample units (SU) across phytoecological regions in a sector of the Subtropical Atlantic Forest. Number of 50 x 50 km sample units in the grid distributed in a sector of the Subtropical Atlantic Forest, generated through Hawth's Tools in ArcGIS 10.

Figure 2. Records of lycophytes and ferns in a sector of the Subtropical Atlantic Forest.

Figure 3.

Table 2.