Georeferenced database and interactive online map of limnoterrestrial and freshwater Tardigrada from Central and South America

Abstract
Like other meiofaunal organisms, tardigrades suffer from a significant knowledge gap concerning many aspects of their biodiversity. The lack of an up-to-date digital collection with all species and details of limnoterrestrial and freshwater tardigrades in South and Central America is one of the most critical gaps to be filled. Therefore, the present work aims to develop a database containing all valid species of limnoterrestrial and freshwater tardigrades recorded from South and Central America up to 2023 and to provide open access to the results. Data for each species were obtained directly from the literature using Google Scholar and the website tardigrada.net. The compiled data resulted in a database with the species name, author and year of species description, genus, family, class, type country, type location, coordinates (longitude and latitude), whether the species is aquatic and/or limnoterrestrial, the substrate where it was found, the country and location of collection, and the manuscript containing the species identification. Furthermore, the coordinates of each occurrence were plotted on maps with political-administrative boundaries and with Neotropical and Andean biogeographic regions. In addition, a statistical analysis of the geographic distribution of the sampling effort was performed. From the literature, 2157 records of valid non-marine Tardigrada species, endemic or not, were compiled. From these records, 271 species of tardigrades have been identified in the two regions combined, with 223 species in South America and 129 species in Central America. We show that there are still many biases in the sampling of tardigrades in the Neotropical and Andean regions and that further studies are needed on the biogeography of these meiofaunal organisms in these biogeographic regions. We expect this database to help better understand the richness and distribution patterns of limnoterrestrial and aquatic tardigrade species in Central and South America.


Introduction
Water bears are free-living, microscopic animals (about 50-1200 μm in size) that belong to the phylum Tardigrada and are divided into the classes Heterotardigrada and Eutardigrada (Nelson et al. 2020). They have segmented bodies with four pairs of legs and inhabit terrestrial, aquatic, and marine environments (Peluffo et al. 2007, Vicente & Bertolani 2013, Schill 2018, Bartels et al. 2020, Nelson et al. 2020). Most tardigrade species are limnoterrestrial, inhabiting mosses, lichens, leaf litter, and soil; however, some are aquatic, living in sediments or roots of aquatic plants in inland waters or in marine sediments from the intertidal zone to abyssal depths (Guil & Cabrero-Sañudo 2007, Schill 2018, Bartels et al. 2020, Nelson et al. 2020).
Like other meiofaunal organisms, tardigrades suffer from the "meiofauna paradox": they are animals believed to have a cosmopolitan distribution but without dispersal capabilities (Giere 2008). At the same time, the "Everything is everywhere, but environment selects" (EiE) hypothesis (Finlay et al. 1996, Fenchel et al. 1997, Fenchel & Finlay 2004) was widely accepted for small metazoans, implying the absence of any discernible biogeographic pattern (Cerca et al. 2018, Morek et al. 2021). Very few of the available tardigrade distribution records were identified using an integrative taxonomic approach; this hinders the delimitation of species and the understanding of their distribution patterns (Morek et al. 2019, Gąsiorek et al. 2019a). Thus, it is essential to use both molecular and observational data to better comprehend tardigrade species distribution patterns (Gąsiorek et al. 2019a).
The presence of a geographic sampling bias and the fact that only a few species have been studied make it difficult to understand the limits of dispersal and, consequently, the distribution patterns and richness of tardigrades, especially in the Southern Hemisphere (Bini et al. 2006, Guil et al. 2009, Yang et al. 2013, Cerca et al. 2018, Azovsky et al. 2020, Garraffoni et al. 2021). In the Neotropical region, the number of recorded limnoterrestrial tardigrade species is much lower than in other regions, mainly because of the scarcity of specialized researchers, which in turn reduces the number of studies conducted there (Guil & Cabrero-Sañudo 2007, Fontaneto et al. 2012, Nelson et al. 2020, Garraffoni et al. 2021). Kaczmarek et al. (2014, 2015) compiled the records of non-marine tardigrades in Central and South America up to the respective years of their publication. However, the data need updating, and access to and use of them need to be made easier. Therefore, this study presents a georeferenced database created through an extensive literature search of all limnoterrestrial and freshwater tardigrades in Central and South America, along with statistical analyses and maps to represent their distribution graphically. This digital and updated collection of the occurrence data of limnoterrestrial and freshwater tardigrades in Central and South America should benefit future biological studies and large-scale analyses and interpretations of biodiversity and distribution data of tardigrades in the Neotropics.

Material and Methods
Distribution data for limnoterrestrial and freshwater tardigrades were mainly obtained from Kaczmarek et al. (2014), who listed all non-marine tardigrades from Central America, and Kaczmarek et al. (2015), who did the same for South America. In addition, a literature search covering 2014 to 2023 was conducted using Google Scholar and the "Recent papers" section of tardigrada.net. A set of terms was used to try to locate all possible publications that had limnoterrestrial tardigrade species as the central topic: "south american limnoterrestrial tardigrade OR central american limnoterrestrial tardigrade OR new south american tardigrade species OR new central american tardigrade species OR limnoterrestrial tardigrade south america OR limnoterrestrial tardigrade central america OR new species limnoterrestrial tardigrade south america OR new species limnoterrestrial tardigrade central america".
For our final dataset, we removed all records of species that present any kind of taxonomic problem (e.g., unknown type material, dubious name, dubious species and/or descriptions with insufficient morphological data) cited in Kaczmarek et al. (2014, 2015) or in the most recent checklist of Tardigrada species organized by Degma & Guidetti (2023) (Table S2). Species records that contained names with "cf." were also not used. Furthermore, we grouped the species sampled in both regions into endemic (i.e., the locus typicus is in Central or South America) and traditionally treated as allegedly cosmopolitan (i.e., the locus typicus is outside Central and South America; records of those species are usually scattered across the globe and unreliable in the light of modern tardigrade systematics), since many recent studies have shown that species records are prone to contain misidentifications (e.g., Michalczyk et al. 2012, Morek et al. 2019, Morek et al. 2021).
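The filtering and grouping rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` fields, the `REGION_COUNTRIES` subset, and the placeholder species names are hypothetical and are not taken from the published database.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, deliberately incomplete subset of Central and South American
# countries, used only to decide whether a type country lies inside the study area.
REGION_COUNTRIES = {"Argentina", "Brazil", "Chile", "Colombia", "Costa Rica", "Mexico"}


@dataclass
class Record:
    species: str                         # name as reported in the source paper
    type_country: str                    # country of the locus typicus
    has_taxonomic_problem: bool = False  # flagged in a checklist (assumed field)


def is_open_nomenclature(name: str) -> bool:
    """True when the reported name carries a 'cf.' (or 'c.f.') qualifier."""
    tokens = name.lower().split()
    return "cf." in tokens or "c.f." in tokens


def classify(record: Record) -> Optional[str]:
    """Return 'endemic', 'cosmopolitan', or None when the record is excluded."""
    if record.has_taxonomic_problem or is_open_nomenclature(record.species):
        return None
    if record.type_country in REGION_COUNTRIES:
        return "endemic"
    return "cosmopolitan"


# Placeholder names, not real taxa.
examples = [
    Record("Examplebiotus primus", "Argentina"),
    Record("Examplebiotus secundus", "Italy"),
    Record("Examplebiotus cf. tertius", "Chile"),
    Record("Examplebiotus quartus", "Brazil", has_taxonomic_problem=True),
]
labels = [classify(r) for r in examples]
```

Returning `None` for excluded records keeps the exclusion step and the endemic/cosmopolitan grouping in one place, so a record can never receive a group label without first passing both filters.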
To visualize the reported locations of all species, we plotted the geographic coordinates on two maps: one shows the reported localities from both regions, and the other shows the number of records in each area. Maps were prepared for both geopolitical boundaries and biogeographic regions in Central and South America. For the latter maps, we used the Andean and Neotropical provinces proposed by Morrone (2015a) and Morrone et al. (2022), respectively. The data on the biogeographic regions map had to be adjusted (removal of 27 records and four species) because northern Mexico is not fully represented in the Neotropical region. In addition, an interactive and free-to-use online map was created, where each point represents a sampling site of a single species. For this map, sampling sites were flagged in three ways: putatively cosmopolitan, endemic, or with taxonomic problems. Additionally, we created charts of observed species richness against the number of published articles for countries and biogeographic provinces to illustrate the relationship between observed species richness and sampling effort. We merged shapefiles of the Andean and Neotropical biogeographic regions produced by Löwenberg-Neto (2015) and Morrone et al. (2022). This procedure yields duplicate provinces that overlap (transition zone). To solve this, the Neotropical South American transition zone (Atacama, Comechigones, Cuyan High Andean, Desert, Monte, Paramo and Puna provinces) was kept, while the corresponding Andean unit (Atacama, Desert, Monte, Paramo, Prepuna, and Puna provinces) was removed.
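The handling of duplicated transition-zone provinces can be illustrated with a small Python sketch. It works on province names only and assumes a simple `(region, province)` tuple layout. The actual merge was done on shapefiles, and the short input lists here are a toy subset, not the full set of provinces.

```python
# Andean copies of transition-zone provinces that were removed, as listed in the text.
ANDEAN_DUPLICATES = {"Atacama", "Desert", "Monte", "Paramo", "Prepuna", "Puna"}


def merge_provinces(neotropical, andean):
    """Concatenate two province lists, dropping the Andean transition-zone copies.

    Both arguments are lists of province names; the result is a list of
    (region, province) tuples. This tuple layout is an assumption for illustration.
    """
    merged = [("Neotropical", name) for name in neotropical]
    merged += [("Andean", name) for name in andean if name not in ANDEAN_DUPLICATES]
    return merged


# Toy subset of province names for demonstration only.
neotropical_layer = ["Atacama", "Puna", "Pampean"]
andean_layer = ["Atacama", "Puna", "Prepuna", "Valdivian Forest"]

merged_layer = merge_provinces(neotropical_layer, andean_layer)
province_names = [name for _, name in merged_layer]
```

After the merge each province name appears once, so counting records per province does not double-count sites that fall inside the transition zone.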

Results
Between 2014 and 2023, we found nine published papers with descriptions of new species and new records for Central America and 27 for South America (Table S1).
With all these data compiled, the database is a comma-separated values (.csv) file consisting of a single table with 19 columns:
- Species: taxon of the collected species;
- Author and year of species' description: name(s) of the author(s) and year of species description;
- Genus: taxon of the species' genus;
- Family: taxon of the species' family;
- Class: taxon of the species' class;
- Type country: country of the collected specimen that gave the species its name;
- Type location: geographic location of the collected specimen that gave the species its name;
- Longitude (Lon): longitude of the species' occurrence;
- Latitude (Lat): latitude of the species' occurrence;
- Aquatic or limnoterrestrial: defines whether the species is limnoterrestrial or aquatic;
- Substrate where it was found: divided into six columns; there are three primary substrates (moss, lichen, and others) with a column for each, each followed by another column describing the location of the collected substrate;
- Country of collection: the country where the occurrence of the species was documented;
- Place of collection: geographic location where the occurrence of the species was documented;
- Manuscript containing the species' identification: work in the literature that recorded the occurrence of the species.

Valid species records of non-marine Tardigrada from South and Central America, endemic or not, totaled 271 species, of which 129 were found in Central America (33 endemic) and 223 in South America (110 endemic), amounting to 2157 sampling sites (Figure 1A). A total of 141 endemic species corresponded to 732 sites (Figure 1C), while 130 "cosmopolitan" species were recorded at 1425 sites (Figure 1E). Substantial sampling effort for endemic species was observed in Costa Rica (212 records), Argentina (176 records), and Colombia (77 records) (Figure 1D). "Cosmopolitan" species were most often recorded in Argentina (353 records), Costa Rica (341 records) and Chile (161 records) (Figure 1F). The highest observed endemic richness was recorded in Argentina (51 species), Mexico (25 species), and Costa Rica (23 species), and the highest number of "cosmopolitan" species was also recorded in these same countries (66 spp., 39 spp. and 35 spp., respectively). Belize, El Salvador, Guatemala, Guyana, Haiti, Honduras, Jamaica and Panama had no records of limnoterrestrial and freshwater tardigrades (Figure 1B). The most frequently recorded "cosmopolitan" species were Macrobiotus hufelandi, Milnesium tardigradum, and Paramacrobiotus richtersi, with 196, 150, and 84 sampling sites, respectively, while Barbaria bigranulata, Mesobiotus coronatus and Minibiotus continuus, with 79, 63 and 62 sites, respectively, were the most frequent endemic ones. A total of 66 species were observed only at their type locality.
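Per-country summaries such as those reported above (number of records and observed richness) could be derived from a single-table .csv file along the following lines. This is a minimal sketch: the column headers, the placeholder species names, and the toy rows are assumptions for illustration and do not reproduce the published file.

```python
import csv
import io
from collections import defaultdict

# Toy rows with hypothetical headers and placeholder species; not real data.
SAMPLE_CSV = """Species,Country of collection,Lon,Lat
Examplebiotus primus,Argentina,-58.4,-34.6
Examplebiotus primus,Argentina,-64.2,-31.4
Examplebiotus secundus,Argentina,-58.4,-34.6
Examplebiotus secundus,Costa Rica,-84.1,9.9
"""


def summarize(csv_text):
    """Return {country: (number of records, observed species richness)}."""
    records = defaultdict(int)
    species = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        country = row["Country of collection"]
        records[country] += 1                   # one row = one sampling site of one species
        species[country].add(row["Species"])    # a set counts each species once
    return {country: (records[country], len(species[country])) for country in records}


summary = summarize(SAMPLE_CSV)
for country, (n_records, richness) in sorted(summary.items()):
    print(f"{country}: {n_records} records, {richness} species")
```

With the real file, `io.StringIO(csv_text)` would be replaced by an opened file handle, and the header strings would need to match the actual column names.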
Valid species records of non-marine Tardigrada from the Andean and Neotropical biogeographic regions, endemic or not, totaled 267 species, of which 186 were found in the Neotropical region (96 endemic and 90 "cosmopolitan"), 105 in the Andean region (43 endemic and 62 "cosmopolitan") and 90 in the Transition Zone (36 endemic and 54 "cosmopolitan"), amounting to 2130 sampling sites (Figure 2A). A total of 139 endemic species corresponded to 715 sites (Figure 2C), while 128 "cosmopolitan" ones were recorded at 1415 sites (Figure 2E). Substantial sampling effort for endemic species was observed in the Guatuso-Talamanca province (112 records), the Puntarenas-Chiriqui province (100 records), and the Guajira province (61 records) (Figure 2D). "Cosmopolitan" species were most often recorded in the Puntarenas-Chiriqui province (167 records), the Guatuso-Talamanca province (155 records) and the Valdivian Forest province (141 records) (Figure 2F). The highest observed endemic richness was recorded in the Guajira province (21 species), the Valdivian Forest province (19 species), and the Puntarenas-Chiriqui, Guatuso-Talamanca and Magellanic Forest provinces (all three with 18 species), and the highest number of "cosmopolitan" species was recorded in the Valdivian Forest province (40 species), the Magellanic Forest province (35 species), and the Atlantic and Puna provinces (both with 30 species). The Bahama, Chapada Diamantina, Choco Darien, Comechigones, Ecuadorian, Falkland Islands, Guianan, Imeri, Jamaica, Juan Fernandez, Pará, Roraima, Southern Espinhaço, Trinidad, and Ucayali provinces had no records of limnoterrestrial and freshwater tardigrades (Figure 2B). Provinces such as Guatuso-Talamanca, Puntarenas-Chiriqui, Valdivian Forest, Magellanic Forest, Pampean, and Puna have more sampling sites than all the other 60 provinces. This discrepancy results from the fact that five of these six provinces overlap with countries where most tardigrades were sampled. The most frequently recorded "cosmopolitan" and endemic species in the biogeographical regions were the same as those seen for Central and South America.
Figure 3 shows screenshots from the ArcGIS Online platform of Limnoterrestrial and Freshwater Tardigrada of Central and South America. Four views are depicted: A map view showing all sampling sites included in our dataset; B map view showing selected valid endemic species records (blue triangles); C map view showing selected valid "cosmopolitan" species records (orange squares); D map view showing selected invalid species records due to taxonomic problems (green circles). Each occurrence on the map can be clicked, after which a window with information about the record appears (Fig. 3D). The map can be accessed at https://arcg.is/1jjO84.
When analyzing the influence of sampling bias on tardigrade records, we see a positive correlation (R = 0.78 and p < 0.0001 for countries and R = 0.77 and p < 0.0001 for biogeographic provinces) between sampling effort and observed species richness (Figure 4). Since Argentina is an outlier compared to all other countries (Figure 4A), it could affect the correlation between variables (Goodwin & Leech 2006). The model was therefore run without Argentina; the positive correlation was not only maintained, but we obtained a higher value with an even smaller p-value (Figure S1), confirming a consistent pattern in sampling bias.
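The outlier check described above can be sketched as follows. The caption of Figure 4 names a Kendall rank correlation, so the sketch implements the tau-b variant in plain Python and computes it with and without the largest data point. The function name and the toy numbers are assumptions for illustration; they do not reproduce the reported values, and no p-value is computed.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall rank correlation (tau-b), which corrects for ties in x or y."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied in both variables: counted nowhere
        if dx == 0:
            ties_x += 1         # tied only in x
        elif dy == 0:
            ties_y += 1         # tied only in y
        elif dx * dy > 0:
            concordant += 1     # both variables move in the same direction
        else:
            discordant += 1     # variables move in opposite directions
    denom = sqrt((concordant + discordant + ties_x) * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Toy values: number of published papers and observed richness per country.
# The last pair plays the role of a heavily sampled outlier.
papers = [1, 2, 3, 4, 20]
richness = [2, 3, 5, 4, 60]

tau_all = kendall_tau_b(papers, richness)
tau_without_outlier = kendall_tau_b(papers[:-1], richness[:-1])
```

Because a rank correlation depends only on the ordering of the values, a single extreme point shifts it less than it would shift a Pearson coefficient, which is one reason to compare both runs.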

Discussion
In this study, we demonstrate that general taxonomic literature (e.g., descriptions of new species, checklists, or faunal lists) can be used to create databases that summarize knowledge about species distributions, despite biases caused by predominant taxonomic approaches in each historical period or by the singular view of each researcher (Lewis 1990). These databases contain highly curated records that are an essential source of information to gain insights into species distributions and diversity patterns (Griffiths et al. 2003, Guénard et al. 2017). Decades or centuries of taxonomic information can be summarized in just one file or website, and records of species occurrences can become publicly available to the scientific community at no cost (Zizka et al. 2019). Furthermore, according to Griffiths et al. (2003), "...when relational databases are linked to a Geographical Information System (GIS), they become an even more powerful tool for taking on large-scale biogeographical patterns".
Critical evaluation of historical and contemporary tardigrade records is of utmost importance for understanding this taxon's phylogenetic diversity and distribution patterns around the globe (Morek et al. 2019). Most of the records in our dataset (1425 out of 2157) are from so-called "cosmopolitan" species and date to a period (early and middle 20th century) when the widespread distribution of many tardigrades was broadly accepted. One emblematic case is Milnesium tardigradum Doyère, 1840, which was considered ubiquitous for decades (Morek et al. 2021). In our study it had the second-highest number of records among all 271 species and was found in 15 countries across both regions. This view changed only recently, when Michalczyk et al. (2012) and Morek et al. (2019) applied an integrative approach to redescribe M. tardigradum and better understand its intraspecific variability, and when Tumanov et al. (2022) found that the distribution of this species is restricted to the Palearctic region. Together with M. tardigradum, many other widespread species (e.g., Macrobiotus hufelandi, Paramacrobiotus richtersi, Minibiotus intermedius, Pseudechiniscus (Pseudechiniscus) suillus) were described in the late 19th or early 20th centuries, which means that taxonomic problems may arise from incomplete descriptions, the lack of type series deposited in zoological museums and/or the non-use of modern techniques for morphological analyses. Thus, most identifications and records of these species should be considered dubious or invalid (Michalczyk et al. 2012, Morek et al. 2019, Gąsiorek et al. 2021). Although the definitive distribution range of many tardigrade species is far from known, the "EiE" hypothesis does not explain the wide geographic distribution of many of them. However, although it is not simple to distinguish natural from human-mediated dispersal, Gąsiorek et al. (2019a, b) extend the discussion regarding the latter, proposing a pivotal role for humans in the dispersal of some tardigrade species worldwide.
We plotted records on countries' geopolitical/administrative boundaries as well as on biogeographical regions, that is, geographical areas categorized according to climatic, geological, and biotic (including endemic taxa) criteria (Escalante et al. 2009, Morrone 2015a, Morrone 2015b, Morrone 2017, Morrone et al. 2017, Morrone 2018, Morrone et al. 2022), in Central and South America (Andean and Neotropical regions). When studying species distribution and diversity patterns, a widely used method treats geopolitical/administrative boundaries as valid units (Murphy 2021); however, these boundaries rarely coincide with ecological boundaries and are constantly subject to change (Wilson & Donnan 2012). Murphy (2021) demonstrated several critical issues of this practice, which include overestimating endemism, underestimating biodiversity metrics (particularly endemism estimates), hindering understanding of biodiversity discontinuity across the world (especially true for measures containing species range size), and identifying hotspots. Thus, biogeographic regionalization is essential to comprehend ecological and evolutionary aspects of life (Crisp et al. 2009, Holt et al. 2013, Flores-Tolentino et al. 2021).
The Andean and Neotropical regions are hierarchically arranged in five levels: kingdoms, regions, dominions, provinces, and districts (Morrone 2015b), meaning they do not represent countries' political boundaries. However, countries with considerable sampling effort and intensity will translate to provinces with more sampling sites and higher observed richness (Figure 4), as seen in this study. This phenomenon, a consequence of sampling bias, is known as the "specialist" effect, in which distribution data reflect who is researching these organisms rather than their actual distribution (Fontaneto et al. 2012). Argentina is a clear example of this effect, as it ranked second in the total number of records and first in observed richness. The substantial sampling effort in this country was enough to make it an outlier in our analysis, and removing it from the statistical analysis yields a higher positive correlation (Figure S1). Thus, Argentina outperforms all other countries in Central and South America regarding the relationship between observed richness and published articles. Another case is Costa Rica. Studies of tardigrade ecology had already been conducted in the country (Mehlen 1969, Kaczmarek et al. 2011, Stander 2016), which explains why it ranks first in overall sampling sites and third in observed richness. Consequently, biogeographical provinces that overlap with either country will have higher observed richness due to substantial localized sampling effort (e.g., the Guatuso-Talamanca and Puntarenas-Chiriqui provinces with Costa Rica).
This database and the online interactive map will significantly help future studies on the biogeography and ecology of limnoterrestrial and freshwater tardigrades in Central and South America. Although we have provided valuable insights into certain areas of knowledge of these organisms, their study continues to face obstacles due to numerous critical deficiencies that remain unresolved. We believe that implementing more homogeneous and widespread sampling across both regions and analyzing all specimens with an integrative taxonomic approach will greatly benefit the understanding of the diversity and distribution patterns of limnoterrestrial and freshwater tardigrades.

Supplementary Material
The following online material is available for this article:
Table S1 - List of publications from 2014 to 2023 on limnoterrestrial and freshwater Tardigrada from Central and South America.
Table S2 - List of species with their respective taxonomical issue(s) and reference(s) for species of non-marine tardigrades from Central and South America according to Degma & Guidetti (2023).
Figure S1 - Correlation between the number of published articles and known species richness for each country (excluding Argentina). Orange squares represent Central American countries, while purple diamonds represent South American ones. There are overlapping data represented in the chart. Countries with zero published papers (Belize, El Salvador, Guatemala, Guyana, Haiti, Honduras and Panama) were excluded. Pearson's R-value correlation and p-value are shown in the upper left corner.

Figure 1. A Central and South America map showing the recorded sampling sites (red circles) of valid limnoterrestrial and freshwater tardigrade species. B Documented localities of valid limnoterrestrial and freshwater tardigrade species in each Central and South American country. C Central and South America map showing the recorded sampling sites (red circles) of valid endemic limnoterrestrial and freshwater tardigrade species. D Documented localities of valid endemic limnoterrestrial and freshwater tardigrade species in each Central and South American country. E Central and South America map showing the recorded sampling sites (red circles) of valid "cosmopolitan" limnoterrestrial and freshwater tardigrade species. F Documented localities of valid "cosmopolitan" limnoterrestrial and freshwater tardigrade species in each Central and South American country.

Figure 2. A Map of the Neotropical and Andean biogeographic regions with the recorded sampling sites (red circles) of valid limnoterrestrial and freshwater tardigrade species. B Documented localities of valid limnoterrestrial and freshwater tardigrade species in each biogeographic province of the Andean and Neotropical regions. C Map of the Neotropical and Andean biogeographic regions with the recorded sampling sites (red circles) of valid endemic limnoterrestrial and freshwater tardigrade species. D Documented localities of valid endemic limnoterrestrial and freshwater tardigrade species in each biogeographic province of the Andean and Neotropical regions. E Map of the Neotropical and Andean biogeographic regions with the recorded sampling sites (red circles) of valid "cosmopolitan" limnoterrestrial and freshwater tardigrade species. F Documented localities of valid "cosmopolitan" limnoterrestrial and freshwater tardigrade species in each biogeographic province of the Andean and Neotropical regions.

Figure 4. A Correlation between the number of published articles and known valid species richness for each country. Orange squares represent Central American countries, while purple diamonds represent South American ones. There are overlapping data represented in the chart. Countries with zero published papers (Belize, El Salvador, Guatemala, Guyana, Haiti, Honduras and Panama) were excluded. The Kendall rank correlation R-value and p-value are shown in the upper left corner. B Correlation between the number of published papers and known valid species richness for each biogeographic province. Pink triangles represent Neotropical biogeographic provinces, blue circles represent Andean provinces, and green squares represent Transition Zone provinces. There are overlapping data in the chart. Provinces with zero published papers (Bahama, Chapada Diamantina, Choco Darien, Comechigones, Ecuadorian, Falkland Islands, Guianan, Imeri, Jamaica, Juan Fernandez, Pará, Roraima, Southern Espinhaço, Trinidad, and Ucayali provinces) were excluded. The Kendall rank correlation R-value and p-value are shown in the upper left corner.