Modeling of the potential distribution of Eichhornia crassipes on a global scale: risks and threats to water ecosystems

The water hyacinth (Eichhornia crassipes) is listed among the 100 worst invasive plants and was ranked as the 11th worst invasive species in Europe, being a threat to aquatic biodiversity and water provision. Predicting species distribution is the first step to understanding niche suitability, forecasting the invasion impact and building resilience against this species. In this study, we used a potential distribution model to assess the global risk of water hyacinth invasion by overlaying maps of highly suitable areas for water hyacinth occurrence with maps of areas of biological importance and water scarcity. The Maximum Entropy (MaxEnt) algorithm was used to build the model, which included five global bioclimatic layers and one layer of urbanized areas. Among the variables used, occurrence is mainly explained by urban areas, highlighting the importance of cities as a source or dispersion mechanism of the water hyacinth. Global biodiversity hotspots are predominantly situated in high suitability regions for the species. Ramsar sites and global protected areas are at a lower risk level compared to hotspots; however, future climate change and urban growth scenarios could put these areas at higher risk for invasion. Threats posed by the water hyacinth are possibly more acute in regions suffering from current or chronic drought. The results suggest that niche models that do not consider anthropic variables may be underestimating the potential distribution of invasive species. Furthermore, the ecological plasticity of the water hyacinth and its close association with cities increase the concern about the impact of this species on the environment and on water security.


INTRODUCTION
The water hyacinth (Eichhornia crassipes) is a free-floating aquatic macrophyte in the Pontederiaceae family and originates from the Brazilian Amazon (EPPO, 2008). It reproduces both vegetatively, via ramets formed from axillary buds on stolons, and sexually through seed production (EPPO, 2008). The species' growth is related to an environment's nutrient content, especially when the temperature ranges between 28 °C and 30 °C; however, growth sharply decreases below 10 °C or above 34 °C (EPPO, 2008). E. crassipes colonizes still or slow-moving water bodies, such as estuarine habitats, lakes, urban areas, watercourses, and wetlands. It can tolerate water level fluctuation extremes and seasonal variations in flow velocity, as well as extremes of nutrient availability, pH, temperature and toxic substances (Gopal, 1987).
There is currently no consensus on how and when this species was introduced into environments outside its natural habitat, but its use for ornamentation in lakes and gardens, as well as in controlling nutrients and algal blooms in eutrophic environments, certainly contributed to its spread (Kriticos and Brunel, 2016). The water hyacinth is present on all continents except Antarctica, having invaded more than 50 tropical and subtropical countries (EPPO, 2008). Due to its high dispersal and growth capacity, the species is ranked on the 100 worst invasive species list as reported by the International Union for Conservation of Nature (IUCN) and it is in the top 20 list of the Spanish Invasive Species Specialist Group (ISSG) (Téllez et al., 2008). According to Nentwig et al. (2018), E. crassipes was ranked as the 11th worst invasive species in Europe.
Environments colonized by E. crassipes have undergone significant changes in their structure and aquatic habitat diversity, including the proliferation of disease transmitters and high fish mortality due to low concentrations of dissolved oxygen in water (Lorenzi, 2000).
Rev. Ambient. Água vol. 15 n. 2, e2421 -Taubaté 2020 Moreover, multiple water body uses have been impacted, especially uses that affect power generation, navigation, recreation and drinking water supply (Liu et al., 2016). This effect is more pronounced in regions that suffer from chronic drought (e.g., the Mediterranean), countries with tourism-based economies (e.g., Tunisia), and countries whose principal electricity supply comes from hydroelectric generation (e.g., Brazil; Kriticos and Brunel, 2016).
In Sardinia (Italy), in 2010, the invasion of E. crassipes became evident when the Mare 'e Foghe River, in the Province of Oristano, was covered for 8 km over an area of 560,000 m². During this event, the recreational activities that usually occur in the watercourse were interrupted (Brundu et al., 2012). Countries such as Portugal, India, Sri Lanka, Bangladesh, Burma, Malaysia, Indonesia, Thailand, and the Philippines recorded negative impacts and large economic losses in rice fields of around US$ 15 million due to E. crassipes (Moreira et al., 1999).
In some cases, the economic impacts are so significant that they require the use of control techniques, such as in the State of Florida, in the United States, which spent more than US$ 43 million between 1980 and 1991 on the suppression of water hyacinths. Mullin et al. (2000) reported annual expenditures for the management of the species on the order of US$ 500,000 in California and US$ 3 million in Florida. Spain spent more than 14 million euros between 2005 and 2008 to control the species in the Guadiana River Basin (Téllez et al., 2008). In Lusaka, Zambia, the E. crassipes invasion on the Kafue River led to the suspension of water treatment and the reduction of the electric power generation capacity at the Gorge Dam for at least one week (EPPO, 2008). Hydroelectric plants in Malawi and in Jinja, Uganda, on the Nile River, are also frequently affected by turbine clogging caused by water hyacinths (Wise et al., 2007).
Given that invasive species commonly produce negative impacts, predicting which regions are at risk of biological invasions is important for developing successful monitoring programs and management strategies. In this context, Species Distribution Models (SDM) are tools used to predict the potential distribution of a particular species through the relationship between species occurrence and environmental condition data sets (Elith and Leathwick, 2009).
Many of the SDM-based modeling studies reported in the literature have focused on conserving and representing the distribution of rare and endemic species (Oliveira, 2011); biogeographic analyses (Whittaker et al., 2005); potential routes of infectious diseases (Peterson et al., 2006; Levine et al., 2007); predicting the effects of climate change on the geographical distribution of species (Peterson et al., 2002; Pearson et al., 2006; Wiens et al., 2009; Kriticos and Brunel, 2016); identifying priority areas for conservation (Ortega-Huerta and Peterson, 2004); and predicting the spread risks of invasive species (Peterson, 2003; Peterson and Robins, 2003; Campos et al., 2014; Kriticos and Brunel, 2016; Liu et al., 2016).
Maps generated from such models may be useful in predicting the invasive potential of exotic species, and for assessing the invasion risk in uncolonized environments (Rödder et al., 2009). We hypothesize that anthropogenic variables, such as proximity to urban areas, and climatic variables (temperature and precipitation), are determinants of the species distribution. To date, no global analyses of the potential impact of water hyacinth on biodiversity or ecosystem services have been carried out. Thus, the present study aimed to build a potential distribution model of the water hyacinth, on a global scale, in order to assess invasion risk. Additionally, the study sought to identify areas in terms of the threat level to biodiversity, water supply, and regions under chronic drought.

MATERIALS AND METHODS
Occurrence data acquisition and processing
The occurrence points of the species were obtained from the dataset available on the Global Biodiversity Information Facility website (GBIF, gbif.org) for the period between 1960 and 2017. This online platform was chosen due to the ease of access to occurrence records on a global scale, as highlighted in various recently reported studies (Syfert et al., 2013; Campos et al., 2014; Zeng et al., 2016; Liu et al., 2016). Because inconsistencies in the georeferencing and taxonomic identification of water hyacinth records were identified, the inconsistent records were removed.
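The record screening described above can be illustrated with a minimal sketch. It is not the authors' actual procedure: the field names (`lat`, `lon`, `year`), the rule of discarding records without a year, and the rounding precision used to detect duplicates are all assumptions made for illustration.

```python
def clean_occurrences(records, year_min=1960, year_max=2017, precision=4):
    """Keep georeferenced, in-period, non-duplicate occurrence records.

    `records` is a list of dicts with hypothetical keys 'lat', 'lon', 'year'.
    Duplicates are detected on coordinates rounded to `precision` decimals
    (an assumed rule, not one stated in the study).
    """
    seen = set()
    kept = []
    for rec in records:
        lat, lon, year = rec.get("lat"), rec.get("lon"), rec.get("year")
        # Drop records lacking coordinates or a collection year.
        if lat is None or lon is None or year is None:
            continue
        # Drop impossible coordinates.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        # Keep only the study period.
        if not (year_min <= year <= year_max):
            continue
        # Drop spatial duplicates.
        key = (round(lat, precision), round(lon, precision))
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept


# Made-up records, for illustration only.
sample = [
    {"lat": -3.1, "lon": -60.0, "year": 1985},
    {"lat": -3.1, "lon": -60.0, "year": 1990},   # duplicate coordinates
    {"lat": None, "lon": 10.0, "year": 2000},    # missing latitude
    {"lat": 28.5, "lon": -81.4, "year": 1950},   # outside the period
    {"lat": 39.9, "lon": 8.6, "year": 2010},
]
cleaned = clean_occurrences(sample)
print(len(cleaned))
```

Taxonomic checks, which the text also mentions, would need an additional field and rule and are left out of this sketch.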

Selection of environmental layers of interest
Nineteen bioclimatic layers were obtained from the WorldClim project (http://worldclim.org) at a spatial resolution of approximately 2.5 arc-minutes. In addition to these variables, a binary layer of urban areas worldwide was obtained from the Socioeconomic Data and Applications Center (SEDAC) (http://sedac.ciesin.columbia.edu/data/set/grump-v1-urban-ext-polygons-rev01). This layer was considered because urban areas provide favorable conditions for the distribution of E. crassipes (Dube et al., 2018). The layers were obtained in ESRI Grid format and were converted using DIVA-GIS 7.5.0 to the ASCII format, which is compatible with the MaxEnt data entry format. ArcGIS 10.3 was used to standardize the spatial input data for the algorithm and to generate a Pearson correlation matrix in order to evaluate the relationship between the bioclimatic variables and thus remove highly correlated environmental layers from the final set (|r| > 0.70) (Dormann et al., 2012).
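The correlation screening can be sketched as follows. The study computed its correlation matrix in ArcGIS and does not state how one layer of a correlated pair was chosen for removal; the greedy, order-dependent rule below, the layer names and the values are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def drop_correlated(layers, threshold=0.70):
    """Greedy filter: keep a layer only if |r| <= threshold against every
    layer already kept. The outcome depends on the order of `layers`
    (an assumed selection rule, not the one used in the study)."""
    kept = []
    for name, values in layers.items():
        if all(abs(pearson(values, layers[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up cell values sampled from three hypothetical layers.
layers = {
    "bio1": [1, 2, 3, 4, 5],
    "bio4": [3, 5, 1, 4, 2],        # weakly correlated with bio1 (r = -0.3)
    "bio12": [2, 4, 6, 8, 10.5],    # almost collinear with bio1
}
selected = drop_correlated(layers)
print(selected)
```

In practice the values would be sampled from the raster cells, and the order in which layers are considered would follow ecological relevance rather than dictionary order.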

The modeling algorithm
The Maximum Entropy algorithm (MaxEnt v. 3.3.3) was selected to build the potential distribution model (Phillips et al., 2006). This software estimates the probability of occurrence of certain phenomena even from incomplete information and demonstrates excellent performance for models that only consider presence data (Hernandez et al., 2006; Pearson et al., 2007; Wisz et al., 2008). The modeling parameters were left at their default values (regularization multiplier: 1; max number of background points: 10,000; replicates: 1; replicated run type: cross-validate; maximum iterations: 500; convergence threshold: 0.00001; adjust sample radius: 0). The obtained model used the best predictor variables, with 75% of the occurrence data for training and 25% for testing. The environmental suitability map resulting from the model was categorized into five levels defined by the natural breaks function in ArcGIS 10.3. The same software was also used to render the graphical outputs of MaxEnt.
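The 75%/25% partition of the occurrence data can be illustrated with a short sketch. MaxEnt performs this partition internally; the function below and its fixed random seed are assumptions for illustration only.

```python
import random


def split_train_test(records, train_fraction=0.75, seed=0):
    """Randomly partition records into training and test subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # reproducible shuffle
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# With the 1316 occurrence points retained in this study:
train, test = split_train_test(range(1316))
print(len(train), len(test))  # 987 329
```

These counts match the 987 training points and 329 test points reported in the Results.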

Model Evaluation and Validation
In order to statistically evaluate the MaxEnt performance, the outputs of the software were assessed using the Jack-Knife and the Area Under the Curve (AUC) tests. The former was carried out to evaluate the importance of the environmental layers in explaining the species distribution, and the latter is a statistical measure that assesses the agreement between the presence records and the predicted species distribution. An AUC value equal to 0.5 indicates that the model performs no better than chance, while values closer to 1.0 indicate better model performance (Phillips et al., 2006). The True Skill Statistic (TSS) was another performance measure used to evaluate the model. It ranges from -1 to +1, and values closer to +1 indicate better model performance. TSS was calculated from a confusion matrix composed of hits and misses related to the prediction of the model (Allouche et al., 2006; Tables 1 and 2).
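The two performance measures can be made concrete with a minimal sketch. It uses the standard definitions (TSS = sensitivity + specificity - 1; AUC as the probability that a presence point outscores a background point). The function names, the counts and the scores are made up for illustration and are not the study's data.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity and TSS from confusion-matrix counts.

    tp: presences predicted present   fn: presences predicted absent
    fp: absences predicted present    tn: absences predicted absent
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "tss": sensitivity + specificity - 1,
    }


def auc(presence_scores, background_scores):
    """Rank-based AUC: share of presence/background pairs in which the
    presence point has the higher score (ties count as half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Made-up counts and scores, for illustration only.
metrics = confusion_metrics(tp=40, fp=20, fn=10, tn=30)
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
print(metrics, example_auc)
```

A sensitivity higher than the specificity, as in this toy example, corresponds to few omission errors relative to commission errors.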
Subsamples of 700 and 1000 records were used to verify whether the sample size (number of presence records) had a significant influence on the algorithm's performance. Moreover, an independent dataset of species occurrence (25% of the total records) was used for the model validation. For this process, a threshold based on Fixed Cumulative Value 5 was adopted to binarize the environmental suitability map into a presence-absence map of the species, in order to compare the outputs of the model against actual distribution data (Phillips and Dudik, 2008).
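The binarization step can be sketched as follows, assuming the suitability map is expressed in MaxEnt's cumulative format (values from 0 to 100) and that cells at or above the threshold count as presence. Whether the comparison is inclusive is an assumption, and the scores are made up.

```python
def binarize(cumulative_scores, threshold=5.0):
    """Convert cumulative suitability scores (0-100) to presence (1) / absence (0)."""
    return [1 if score >= threshold else 0 for score in cumulative_scores]


def omission_rate(test_point_scores, threshold=5.0):
    """Fraction of independent presence records falling in predicted-absence cells."""
    missed = sum(1 for score in test_point_scores if score < threshold)
    return missed / len(test_point_scores)


# Made-up scores, for illustration only.
presence_map = binarize([0.4, 4.9, 5.0, 37.2, 88.0])
missed_share = omission_rate([2.0, 6.5, 40.0, 91.3])
print(presence_map, missed_share)  # [0, 0, 1, 1, 1] 0.25
```

A low threshold such as 5 classifies most cells with any appreciable suitability as presence, which keeps the omission rate low at the cost of more commission errors.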

[Table: performance measures and their formulas (columns: Measure, Formula), beginning with Accuracy]

RESULTS
After excluding species occurrence points lacking geographic coordinates and location description or identified as duplicates, a total of 1316 occurrence points were selected to develop the model. Of the records in this dataset, 62% are located between the tropics (23° N and 23° S), while 25% are above the Tropic of Cancer and 13% are below the Tropic of Capricorn. Occurrence points are distributed across all continents except Antarctica. Although E. crassipes is native to South America, only 22% of the occurrence records were on that continent, while North America accounted for about 48%, followed by Oceania (7.7%), Africa (6.1%), Europe (5.9%) and Asia (5.8%).
The Pearson correlation analysis indicated a high number of correlated variables among the 19 bioclimatic layers considered for model development. Six variables were selected for analysis (Dormann et al., 2012): five bioclimatic layers with no high pairwise correlations (|r| < 0.70) and the binary layer of urban areas around the world, which was not tested against the other variables as its data has no correlation with the other layers (Table 3). According to the Jack-Knife test, the "urban extent" and "temperature seasonality" variables individually contributed the most to the model. The developed model used 987 training points and 329 test points and performed better than a random model (AUC = 0.917 and TSS = 0.70). Sensitivity was higher than specificity, indicating that the model produced few errors of omission (Syfert et al., 2013; Table 4). Additional tests were performed using 1000 and 700 records to evaluate the efficiency of the model when using subsamples; these showed that reducing the sample size causes few changes in the model performance, with hit rates higher than 93% in all cases. The modeled distribution is consistent with the actual points of species occurrence used in this study, as well as with administrative regions in which the water hyacinth has established populations, either in its native or non-native habitats (Figure 1). The model indicated a broad spectrum of potential environments that could be invaded by E. crassipes, and the binarized distribution model transformed the environmental suitability map into a presence/absence map (Figure 2).
According to the results, E. crassipes could be affecting freshwater storage and supply in Central America, the Southeastern United States, Sub-Saharan Africa, Southern Europe, Southern and Southeastern Asia, and Oceania (note that more field data is required for confirmation). It should be highlighted that more than 33% of the main watercourses and 10% of the entire area of the world's lentic environments occur in regions that are suitable for the species occurrence. Approximately 44% of the world's lentic environments present conditions for colonization. For many tropical countries located in identified risk areas, almost all of their important watercourses are located in regions of high suitability (supplementary material).
Many global ecoregions are under threat since more than 43% of global river basins present ideal conditions for invasion. There are many basins located in both North and South America with a high degree of fish species endemism that also offer the highest suitability for invasion (see http://www.feow.org; supplementary material, Table B: http://doi.org/10.5281/zenodo.3708474; Figure 3).
About 52% of the world's Protected Areas (PAs) lie under conditions that allow the establishment of E. crassipes. Less than 1% of PAs are located in optimum conditions, corresponding to more than 279,551 km² of areas that can be or are already invaded. On the other hand, approximately 48% of the total land area of PAs lies outside regions suitable for the water hyacinth (Table 5). These PAs are predominantly either above the Tropic of Cancer or below the Tropic of Capricorn. Some of them are among the largest PAs on the planet, such as the Greenland biosphere reserve and the Chinese nature reserves of Sanjiangyuan and Qiangtang.
Approximately 28% of the world's biodiversity hotspot areas are located in regions of high suitability, while 6% are in optimal conditions for the occurrence of the water hyacinth. When considering the threshold adopted in the model, 79% of the global biodiversity hotspot areas can be invaded. There are large areas of potentially threatened hotspots in Mexico, the Southeastern United States, Brazil, Madagascar, and tropical Asia (Figure 4).
About 50% of Ramsar sites are in places that offer minimum conditions of suitability for the occurrence of the water hyacinth. Approximately 3% of the area of the sites, or 67.6 thousand km², occurs in optimal conditions and 18% is in places of high suitability (Table 5). The projected distribution indicates a high likelihood of species expansion, including into newly established Ramsar sites.
Approximately 30% of the world's drylands are under potential risk of colonization, such as in the Southwestern United States, Central-East and Southern Africa, Northern Asia, Northeastern Brazil, and Australia. Almost 50% of the available water resources in dry and subhumid lands are potentially threatened (Figure 5 and Table 6).
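Area shares such as those reported above come from overlaying the suitability map with maps of regions of interest. The following sketch shows one way to compute such shares from grid cells; it is not the GIS workflow used in the study, and the class labels, cell areas and region mask are hypothetical.

```python
from collections import defaultdict


def area_share_by_class(cells):
    """Percentage of a region's area falling in each suitability class.

    `cells` is an iterable of (area_km2, suitability_class, inside_region)
    tuples, one per grid cell. Cell areas are given explicitly because
    cells of a fixed angular size cover less ground toward the poles.
    """
    class_area = defaultdict(float)
    region_area = 0.0
    for area_km2, suitability_class, inside_region in cells:
        if inside_region:
            class_area[suitability_class] += area_km2
            region_area += area_km2
    return {cls: 100.0 * area / region_area for cls, area in class_area.items()}


# Made-up cells, for illustration only.
cells = [
    (10.0, "unsuitable", True),
    (10.0, "high", True),
    (20.0, "optimal", False),   # outside the region, ignored
    (5.0, "optimal", True),
    (15.0, "unsuitable", True),
]
shares = area_share_by_class(cells)
print(shares)  # {'unsuitable': 62.5, 'high': 25.0, 'optimal': 12.5}
```

The same function could be applied separately to each type of region (protected areas, Ramsar sites, hotspots, drylands) by changing the mask.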

DISCUSSION
We confirmed the hypothesis that both climatic and anthropic layers are important predictors for water hyacinth distribution. Our analyses showed that the distribution of E. crassipes is limited by low temperatures at high altitudes and latitudes, as well as by heat and aridity in desert regions in Africa, Australia, Chile, Argentina, and Asia. In contrast to the Northern Hemisphere, the Southern Hemisphere has few areas that are cold enough to prevent species establishment. There is little opportunity for E. crassipes to expand the boundaries of its occupation beyond the habitats already colonized in the Southern Hemisphere, given that the Andes Cordillera in South America and the desert lands of Australia constitute stress gradients due to cold and arid conditions, respectively.
We also found significant overlap between regions highly suitable for the species and both water-scarce and biologically important regions. World Protected Areas (PAs) are less threatened than Ramsar sites and biodiversity hotspots, considering water hyacinth suitability. The results obtained for the PAs were significantly influenced by the large number of PAs located in Asia and at high latitudes, which are not suitable for the species. The Ramsar sites are in an intermediate invasion potential condition. Despite this, the projected distribution indicates a high probability of expansion of the species to newly established Ramsar sites, such as the Marais de Sacy, in France; Lake Massaciuccoli, in the region of Tuscany, Italy; and the environmental protection area of Cananéia-Iguape-Peruíbe, in São Paulo, Brazil. Global biodiversity hotspots showed alarming results. Approximately 79% of their area is within suitable conditions for the occurrence of the water hyacinth, since the most biodiverse regions of the world are concentrated in the tropics, the portion of the planet where the water hyacinth is predominant.
Threats posed by this species are possibly more acute in regions suffering from current or chronic drought, such as Greece, Albania, Macedonia, Bosnia, and Croatia, which have an extremely dry summer period and where available water resources are essential for human survival. Thus, in these locations, the environmental and economic impacts can be much more serious.
Threshold selection (fixed cumulative value 5) aimed to reduce the percentage of omission errors, because the modeled species is a generalist, able to find adequate conditions for its survival throughout the projected area of occupation (Norris, 2014). The tests performed with this threshold on subsamples of 1000 and 700 records showed that reducing the sample size implies only a small reduction in the performance of the model. In all cases, the accuracy was higher than 93%, although it fell slightly, from 96% to 93%, due to the decrease in the independent set of data used in the validation (Zhang et al., 2015). The model presented good performance, obtaining an AUC of 0.917. However, the use of this statistical measure is criticized in some cases (Allouche et al., 2006). Therefore, the True Skill Statistic (TSS = 0.70) was also calculated, and it confirmed the good AUC result.
Urban areas had a major influence on the projected distribution of the water hyacinth and, based on our analysis, were the most influential factor explaining water hyacinth occurrence. This highlights the importance of cities as source locations of hyacinth propagules, due to the high levels of water pollution that contribute to species colonization. Moreover, cities serve as global dispersion vectors as they facilitate the spread of the water hyacinth far beyond its original distribution range. From the conservation and management point of view, the close association between the species and urban areas, coupled with its wide niche suitability, increases concern about the current and future impacts of the water hyacinth.
Results obtained by Gallardo et al. (2015) corroborate our findings, as their study indicates the importance of anthropic variables in the construction of SDMs by showing that anthropic variables explained a substantial amount (23% on average) of species distributions. Megacities, which are developing mainly in Asia, may accentuate the potential for invasion of the water hyacinth on that continent. In Europe, Rodríguez-Merino et al. (2017) showed that the best predictor of potential distribution for the majority of non-native aquatic macrophytes was the human footprint. In addition, the most vulnerable areas are located near the sea and near cities with high population density. An important part of the areas that these species could colonize coincides with territories where agricultural development is increasing.
Our projected distribution on the European continent suggests a much wider range than that found by Kriticos and Brunel (2016), who did not include urbanized areas in their model. Moreover, our projected distribution in South Africa also suggests a larger area at invasion risk, under current climatic conditions, than the areas identified by Hoveka et al. (2016), who also did not include anthropic variables in their model.
One limitation of this study is the small number of occurrence records obtained from the GBIF portal for South America. This limitation could be addressed by using other platforms that provide more information on the distribution of the species. Nevertheless, reducing the number of occurrence records had little effect on the model's performance, which suggests that the number of records was sufficient to test our hypothesis and strengthens the results.
Another limitation is collector's bias: in general, the most sampled areas are those of greater economic interest or more easily accessible, such as protected areas or areas near cities, roads and rivers (Oliveira, 2011; Norris, 2014). The use of more records would probably improve model performance. Nevertheless, although it is possible to measure collector's bias, it is not possible to eliminate it, and virtually all niche models have such bias. Finally, water-specific variables, which are extremely important for water hyacinth occurrence, were not used in our SDM because no reliable data are currently available on a global scale.

CONCLUSIONS
The present study consisted of the elaboration of a potential distribution model of the water hyacinth on a global scale. Risk areas were identified in terms of threats to habitat biodiversity, water supply, and chronic drought. The results of this model are consistent with the distribution of collected occurrence records. They can also be used to predict the distribution of the target species at a broad geographic scale for areas where no samples were collected, which can serve to complement and direct costly field surveys. Thus, the most vulnerable areas can be understood, directing quick response efforts.
Global biodiversity hotspots are predominantly situated in regions of high environmental suitability. Ramsar sites and global protected areas are in a more secure status, but climate change scenarios and the growth of urban areas may put them at risk of invasion. A more detailed and individual evaluation for each of these areas is suggested in order to categorize them according to their environmental suitability for invasion susceptibility and proximity to recorded E. crassipes locations. Furthermore, we recommend that SDMs should use anthropogenic layers to better represent species distribution.
From the methodological point of view, this work adds to the literature as it brings evidence that modeling invasive species niches needs to include anthropic layers as explanatory variables, otherwise potential distribution may be underestimated. In this case, more than one quarter of the hyacinth occurrence is explained by the presence of urban centers, greatly expanding the range of areas identified as highly suitable when compared to previous studies that only relied on bioclimatic conditions to model the occurrence of this species.
From the conservation and water security point of view, we demonstrate that the water hyacinth can occur in areas around the globe where humidity and heat levels are appropriate. Given increasing rates of urbanization, particularly in tropical and developing countries (D'Amour et al., 2017), urbanizing areas and their surroundings provide ideal environments for water hyacinth occurrence. Such findings increase concern about the current and future impact of this plant on aquatic biodiversity and water resources.
Finally, understanding the full invasion potential of this species is crucial for decisions that involve species management and to avoid negative impacts. The methodology used in this study could be used in evaluating the dispersion potential of other invasive species.