1 Introduction
Statistical methods and computational niche models, based on the concept of niche theory (Chase and Leibold, 2003), have recently been employed to investigate the potential geographical distribution of species (Peterson et al., 2011). These models are useful because they generate a measure of climate suitability (in different climate scenarios) that can be used as a predictor of demographic (Tôrres et al., 2012), genetic (Diniz-Filho et al., 2009) and ecological variables (Nabout et al., 2011). In addition, knowing the geographic distribution of a species is useful for conservation planning (Loyola et al., 2013), for phylogeographic studies (Collevatti et al., 2013), and for assessing the impacts of global climate change on the geographic distribution of native species (Diniz-Filho et al., 2012) and of cultivated species (Nabout et al., 2012a). Niche models are referred to by various names, such as ecological niche model, geographic distribution model, potential distribution model, habitat niche model, species distribution model (Peterson et al., 2011), bioclimatic envelope model (Araújo and Peterson, 2012), habitat suitability model, niche-based model and climate envelope model (Lobo et al., 2008).
Ecological niche models (henceforward ENMs) are classified into two groups: mechanistic and correlative (Franklin, 2009). Correlative models have been used more often (Elith et al., 2006), possibly due to the availability of species occurrence data. Mechanistic models require physiological information about the species, which hinders the production of ENMs (Nabout et al., 2012a). Generally, ENMs (whether based on species occurrence or on ecophysiology) seek to determine the environmental limits of the species and then spatialize its potential geographical distribution (Colwell and Rangel, 2009). There has often been discussion in the literature on what is actually being modeled, such as Grinnellian or Eltonian niches (Sillero, 2011). This discussion arises because the factors that determine the geographic distribution of species are biotic, abiotic and historical (dispersal) (Peterson et al., 2011), whereas ENMs have typically used only abiotic data to understand the potential geographical distribution.
There are several different modeling methods that attempt to associate species presence and absence with environmental data (Peterson et al., 2011; Franklin, 2009). Moreover, the models are quite idiosyncratic: even with the same set of occurrence and environmental data, the predicted potential distribution of a species may differ considerably between models (Nabout et al., 2010). Because of this variation between models, Araújo and New (2007) suggested combining different models and parameters to generate a consensus model. Generally, the consensus model estimates the potential geographic distribution of a species by identifying the regions where the predictions of the individual models overlap most. Recently, the uncertainty of ENMs has been explored in studies of climate change because the global circulation models (e.g., atmospheric, oceanic) and climate scenarios (e.g., emissions, baseline) are both sources of uncertainty for projections of species occurrence under future climate scenarios (Diniz-Filho et al., 2009).
Despite the conceptual discussions and the variety of modeling methods, ENMs have gained considerable space in the scientific literature (Guisan et al., 2013). Much of this space is due to advances in cataloging species occurrences. For example, several institutions, such as the Reference Center on Environmental Information (CRIA), provide the geographic coordinates of species records. Moreover, beyond computational advances, wide access to global environmental information at fine spatial resolution has also contributed to the use of ENMs.
The increase in the use of ENMs is evident in the available literature. However, this general impression does not provide a complete understanding of the trends and biases of studies on ENMs. Questions such as “which method is more widely used?” or “which groups of organisms have most often been modeled?” are difficult to answer. Therefore, techniques for measuring science, i.e., scientometrics, are useful tools to assess the quantity, quality and variety of publications on ENMs. Scientometric studies have been employed to investigate leading areas of science such as global climate change (Nabout et al., 2012b), groups of organisms (Kraan et al., 2013; Nabout et al., 2010) and genetic studies (Quixabeira et al., 2010). Studies on ENMs still lack scientometric research. To date, only a few papers have applied scientometrics to ENMs, and these studies focused on particular areas of application, such as invasion biology (Barbosa et al., 2012), tropical biology (Cayuela et al., 2009) and conservation biology (Guisan et al., 2013). In this study, we evaluate the biases of the ENM literature as a whole. Thus, we selected all papers about ENMs indexed in Thomson-ISI and analyzed the biases and trends of the scientific literature.
The aim of this study was to examine the trends and biases of the global scientific literature regarding ecological niche models. We characterized the main features of the work on ENMs by examining the temporal trends regarding the number of papers, the countries that publish on the topic, the methods, the groups of organisms investigated, and the relationship between the ENM modeling method and the taxonomic group. This characterization of ENM studies is important because ENMs are a widely used tool with great potential for various studies. Therefore, this work aims to understand the characteristics of the current studies that use ENMs and to guide future studies that use this tool.
2 Material and Methods
For quantitative analysis of the research publications on ecological niche models (ENMs), the literature used in this study was obtained from the Thomson Institute for Scientific Information online database. We used the terms “ecological niche model*” OR “geographic* distribution model*” OR “potential distribution model*” OR “niche model*” OR “habitat niche model*” OR “species distribution model*” OR “species geographic distributions” OR “predict* species distributions” OR “habitat model*” OR “habitat distribution model*” OR “habitat suitability model*” OR “niche-based model*” OR “bioclimatic envelope model*” OR “resource selection function” in documents published from 1991 to 2011. We included the years 2012 and 2013 only to assess the trends of the number of papers (see Figure 1). We believe that the exclusion of these years in other analyses did not affect scientometric interpretations for detecting biases of the methods, groups of organisms, countries and other analyses.
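To illustrate how the wildcard search terms listed above behave, the following Python sketch converts them into regular expressions and screens candidate titles locally. It is not the query engine of the ISI database: the function names (`term_to_regex`, `matches_any`) and the example titles are hypothetical, and treating `*` as "any run of word characters" is an assumption made only for this illustration.

```python
import re

# Search terms as listed in the text (the * is a truncation wildcard).
SEARCH_TERMS = [
    "ecological niche model*", "geographic* distribution model*",
    "potential distribution model*", "niche model*", "habitat niche model*",
    "species distribution model*", "species geographic distributions",
    "predict* species distributions", "habitat model*",
    "habitat distribution model*", "habitat suitability model*",
    "niche-based model*", "bioclimatic envelope model*",
    "resource selection function",
]

def term_to_regex(term):
    """Turn one search term into a case-insensitive regex.

    Assumption: '*' stands for zero or more word characters.
    """
    pattern = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(r"\b" + pattern + r"\b", re.IGNORECASE)

PATTERNS = [term_to_regex(term) for term in SEARCH_TERMS]

def matches_any(text):
    """Return True if the text matches at least one search term."""
    return any(rx.search(text) for rx in PATTERNS)

# Hypothetical titles, used only to show the behaviour of the terms.
for title in ["Species distribution modelling of alpine plants",
              "A study of soil chemistry"]:
    print(title, "->", matches_any(title))
```

Combining the terms with `any` mirrors the OR operator of the original query; restricting the results to 1991 to 2011 would be a separate filter on the publication year.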

Figure 1 Number of publications per year on ENMs (a) and diversity index of the journals that publish on the subject (b).
We analyzed each document according to the following criteria: (i) publication year; (ii) the journal of publication; (iii) the country of origin of the first author; (iv) taxonomic group (mammal, amphibian, bird, reptile, fish, insect, invertebrates, microorganisms, plant, fungus); (v) the study environment (aquatic, terrestrial); (vi) the modeling method(s) used to generate the model(s) in each study; and (vii) time period (past, present, future). The diversity of journals that published papers on ENMs each year was calculated using the Shannon–Wiener diversity index (H’). Some authors have also used ecological indexes in scientometric studies (e.g., Carneiro et al., 2008). The Shannon–Wiener index is used in ecological studies to investigate biological diversity in various locations, estimating species diversity from the abundance of each species at each location. For this paper, we adapted this index to estimate journal diversity over time. Thus, the journal name corresponded to the species, the year corresponded to the study location, and the number of papers in each journal corresponded to the abundance. We assessed the temporal trend of scientific production on ENMs (total, countries, journal diversity, taxonomic groups and modeling methods) using Pearson’s correlation coefficient (P<0.05). We divided the number of ENM papers each year by the overall number of papers indexed in ISI in that year to remove the effect of the general temporal increase in publications (Nabout et al., 2012b).
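The journal diversity index and the trend analysis described above can be sketched in Python as follows. This is a minimal illustration rather than the code used in the study: the function names, journal labels and counts are hypothetical, and the natural logarithm is assumed because the text does not state the base used for H’. The significance test of r (P<0.05) is omitted.

```python
import math
from collections import Counter

def shannon_index(journals):
    """Shannon-Wiener H' for one year.

    `journals` holds one journal name per paper, so each journal plays the
    role of a species and its paper count plays the role of abundance.
    Assumption: natural logarithm.
    """
    counts = Counter(journals)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def relative_production(enm_counts, total_counts):
    """ENM papers per year divided by all indexed papers in that year."""
    return [e / t for e, t in zip(enm_counts, total_counts)]

def pearson_r(xs, ys):
    """Pearson's correlation coefficient (undefined if either series is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data: journal of each ENM paper, by year.
papers_by_year = {
    2009: ["Journal A", "Journal A", "Journal B"],
    2010: ["Journal A", "Journal B", "Journal C", "Journal C"],
    2011: ["Journal A", "Journal B", "Journal C", "Journal D", "Journal D"],
}
all_indexed_papers = [1000, 1100, 1200]  # hypothetical totals per year

years = sorted(papers_by_year)
diversity = [shannon_index(papers_by_year[y]) for y in years]
share = relative_production([len(papers_by_year[y]) for y in years],
                            all_indexed_papers)

print("trend of journal diversity, r =", round(pearson_r(years, diversity), 3))
print("trend of relative production, r =", round(pearson_r(years, share), 3))
```

Correlating the year with the relative production, rather than with the raw counts, is what removes the general growth of the indexed literature from the trend.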
A correspondence analysis (CA) was performed to analyze the relationships between the taxonomic group and the main modeling method, based on the frequency of studies in these categories. We used a chi-squared test (P<0.05) to evaluate the dependence between the two matrices (taxonomic groups and modeling methods).
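The test of dependence between taxonomic groups and modeling methods can be illustrated with a contingency table of paper counts (rows are taxonomic groups, columns are modeling methods). The Python sketch below computes the chi-squared statistic, its degrees of freedom and P value, and the total inertia (chi-squared divided by the number of papers), which is the quantity that a correspondence analysis decomposes into axes. The function names and the toy counts are hypothetical. The sketch assumes that every row and column contains at least one paper, and it does not compute the CA axes themselves.

```python
import math

def chi2_sf(x, dof):
    """Upper-tail probability of a chi-squared variable with integer dof >= 1.

    Uses the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with y = x / 2.
    """
    if dof < 1:
        raise ValueError("dof must be a positive integer")
    y = x / 2.0
    if dof % 2:                      # odd dof: start from Q(1/2, y)
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:                            # even dof: start from Q(1, y)
        a, q = 1.0, math.exp(-y)
    while a < dof / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return q

def chi_squared_test(table):
    """Return (chi2, dof, p_value, total_inertia) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof, chi2_sf(chi2, dof), chi2 / n

# Hypothetical counts: 3 taxonomic groups x 3 modeling methods.
table = [
    [30, 20, 10],
    [25, 18, 12],
    [12, 10, 8],
]
chi2, dof, p, inertia = chi_squared_test(table)
print(f"chi2={chi2:.2f}, dof={dof}, P={p:.3f}, total inertia={inertia:.4f}")
```

A P value above 0.05 in such a test would indicate that the choice of modeling method does not depend on the taxonomic group.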
3 Results
We found 3042 publications on ENMs between the years 1991 and 2013. The increase in production over time was significant (r=0.77, P<0.001), especially in recent years (Figure 1a). These papers were found in 254 different journals. The ten journals with the most papers contained 38.25% of all the papers. Among these, the top five journals that published on ENMs were Ecological Modelling (92 papers), Diversity and Distributions (76), Biological Conservation (69), Journal of Biogeography (63), and Ecography (51), which together accounted for 24.45% of the total publications. The diversity of journals that published on ENMs also increased significantly over time (r=0.97, P<0.001, Figure 1b).
In relation to the country of origin of the first author, we found 54 countries that have published on ENMs, of which 15 account for only one publication each. The country with the highest number of publications on ENMs was the United States of America (USA) (550 papers). Furthermore, the top ten countries contributed 78.10% of the global scientific production on ENMs (Figure 2). Despite the predominance of USA papers, all top ten countries showed a temporal increase in production (indicated by a positive Pearson’s correlation coefficient) (Figure 2), with similar coefficients.

Figure 2 Number of publications of the top ten countries. The values above the bars indicate the trend in the number of papers, estimated using Pearson’s correlation coefficient. * indicates that r is significant (P<0.05).
Of the 1417 papers for which it was possible to identify the study environment, 85.03% concerned terrestrial environments and 14.96% concerned aquatic environments. Moreover, it was possible to classify the taxonomic group modeled for 1436 papers. We classified 10 taxonomic groups; plants were the most studied (402 papers, or 28.36% of the total), followed by birds and mammals (Figure 3a). The number of papers showed a similar increasing trend for each taxonomic group (indicated by the values of the correlation coefficient; Figure 3a).

Figure 3 The number of ENM papers for each taxonomic group (a) and each modeling method (b). The values above the bars indicate the trend in the number of papers, estimated using Pearson’s correlation coefficient. * indicates that r is significant (P<0.05). MAXENT = Maximum Entropy; GARP = Genetic Algorithm for Rule-Set Production; GLM = Generalized Linear Models; GAM = Generalized Additive Models; LR = Logistic Regression; ENSEMBLE = Multiple Methods; BIOCLIM = Bioclimatic Envelope Model; ENFA = Ecological Niche Factor Analysis; BRT = Boosted Regression Trees; RF = Random Forest; ANN = Artificial Neural Networks; CART = Classification and Regression Trees; FUZZY = Fuzzy Envelope; BA = Bayesian Analyses; DOMAIN = Gower distance; HSM = Habitat Suitability Maps; LMR = Multiple Logistic Regression; MARS = Multivariate Adaptive Regression Splines; CLIMEX = Climate Matching Tools; GAP = Gap Analysis.
For this same subset of publications, we identified the modeling methods used in 1401 publications and recorded 83 different modeling methods. The most used method was maximum entropy (MaxEnt), with 312 papers, followed by the genetic algorithm for rule-set production (GARP), with 179 papers. The top 20 most used methods comprised 88.93% of all publications (Figure 3b). Despite the higher number of papers that used MaxEnt, the MaxEnt, Ensemble, BRT, BA, DOMAIN, MARS, BIOCLIM, GLM, GAM, CART, RF and FUZZY methods showed similar increasing trends in the number of papers, whereas other methods (i.e., GAP, LMR and ANN) did not show an increase.
The correspondence analysis (Figure 4) showed some relationships between taxonomic groups and modeling methods (e.g., the use of the CLIMEX method in studies of fungi). In general, however, modeling methods and taxonomic groups were not related (χ2=4.8, P=0.15), showing that the modeling methods have been employed across different taxonomic groups.

Figure 4 Correspondence analysis to demonstrate the association between taxonomic groups and the methods of ecological niche modeling studies. For this relationship, we used only the top 20 modeling methods.
Publications on ENMs can also be characterized by the climate scenario used to build the maps that describe the geographic distribution of species. From the total number of papers, it was possible to classify the period studied (e.g., current, past or future) for 1508 publications. All publications modeled the current climate scenario. In most studies (78.11% of articles), species were modeled using only the current climate scenario, while 16.25% also projected the geographical distribution into a future scenario and 5.64% into a past climate scenario.
4 Discussion
Our results showed a significant increase in the number of publications on ENMs over the past 22 years, especially within the last six years. This growth is part of a global trend: scientometric studies conducted in different areas (see Nabout et al., 2015a) have shown exponential growth in the number of publications in recent decades. Examples include the study on the use of ecological niche models to predict the distribution of invasive species (Barbosa et al., 2012), studies of crabs in the genus Uca (Nabout et al., 2010), studies on algae and bioenergy (Konur, 2011), and studies on biofuels (Yaoyang and Boeing, 2013).
The increase in the number of publications is most likely related to the increasing interest in ENMs, which have been used as tools to study the geographic distribution of species in different areas of ecology (Peterson et al., 2011). Moreover, in recent years, there have been great advances in methods for generating and providing environmental variables in digital formats (Elith and Leathwick, 2009). The evolution of computational science and statistics is reflected in new methods for ecological niche modeling (Palialexis et al., 2011).
In relation to the journals that published on ENMs, we observed a number of different titles in different areas of biology. An analysis of journal titles found that although the first publication dates from 1991, only in 1999 did specific journals begin to publish on ENMs. This trend may be related to the fact that ENMs were used as a tool for behavioral studies and species distribution. This use broadens the field of research and covers different objectives such as the conservation of species (Razgour et al., 2011), invasive species (Jimenez-Valverde et al., 2011), vector-borne diseases (Fischer et al., 2011) and the impact of climate change (Wiens et al., 2010).
In relation to the country of origin of the first author, the United States led the ranking of publications, representing approximately half of all the articles we analyzed. This high production can be explained by higher financial investment (see Nabout et al., 2010 for an example of the influence of financial investment on the scientific production of nations) and by a tradition in ecological niche modeling. Nonetheless, all countries increased their scientific production over time (indicated by a positive correlation), showing growing interest in this topic in several regions of the world. Only two Latin American countries (Brazil and Mexico) were among the top ten in national production. These two countries have been conspicuous in scientific production when compared with other Latin American countries (e.g., Nabout et al., 2015b).
When observing the groups of study species, we found a research bias toward plants, birds, mammals and insects. The preference for such groups is due to their high species diversity (Mora et al., 2011), their use as indicators of environmental impact (Bagliano, 2012), and the availability of some reliable databases (e.g., GBIF, speciesLink, IABIN). In addition, terrestrial models have been most widely used, possibly due to the availability of atmospheric climate data (Robinson et al., 2011).
In relation to the methods used, there was a considerable change within the last 21 years of publications. Some methods, such as MaxEnt, have only been used since 2006, whereas the GARP method, which was heavily used until 2010, experienced a drastic drop in 2011. A variety of modeling methods are available to estimate occurrence probability from presence-only data (Elith et al., 2006; Franklin, 2009). Since its appearance in 2006 (Phillips et al., 2006), MaxEnt has been successful: the original paper (Phillips et al., 2006) has been cited over 1200 times, with more than 300 citations in 2012 alone (Fitzpatrick et al., 2013). This increase in use is because MaxEnt presents robust results compared with other modeling methods and performs well when only a small number of occurrence records is available for the species under investigation (Elith et al., 2006; Hernández et al., 2006). In addition, researchers have increasingly been discovering new methods and improving the performance of the models and the quality of the results.
A recent trend in the literature is the combination of different models (ENSEMBLE; Araújo and New, 2007), which allows for the generation of uncertainty maps. Although there are statistical tools to validate ENMs (see Allouche et al., 2006), choosing a single model is difficult and risky; therefore, one solution is to combine models. Recently, tests have been developed to understand the sources of uncertainty in ENSEMBLE forecasts (modeling methods, climate scenarios and circulation models) (Buisson et al., 2010; Déqué et al., 2012), and publications have generally found that the modeling methods generate greater uncertainty in the predictions than the other sources (see Diniz-Filho et al., 2010; Terribile et al., 2010). In addition, it is possible to map the uncertainties in geographic space (Diniz-Filho et al., 2009).
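The idea of combining models and mapping their disagreement can be illustrated with a small Python sketch. It assumes that each model yields a suitability value between 0 and 1 for the same set of grid cells, takes the mean across models as the consensus, and uses the variance across models as a simple measure of uncertainty for each cell. Other consensus rules exist, and the function name, model names and values below are invented for illustration only.

```python
from statistics import mean, pvariance

def consensus_and_uncertainty(predictions):
    """Combine per-cell suitability values from several models.

    `predictions` maps a model name to a list of suitability values,
    one per grid cell, in the same cell order for every model.
    Returns (consensus, uncertainty): the per-cell mean and the
    per-cell variance across models.
    """
    layers = list(predictions.values())
    if len({len(layer) for layer in layers}) != 1:
        raise ValueError("all models must cover the same grid cells")
    cells = list(zip(*layers))  # one tuple of model outputs per cell
    consensus = [mean(cell) for cell in cells]
    uncertainty = [pvariance(cell) for cell in cells]
    return consensus, uncertainty

# Hypothetical suitability values for three grid cells.
predictions = {
    "model_a": [0.9, 0.1, 0.8],
    "model_b": [0.7, 0.2, 0.2],
    "model_c": [0.8, 0.0, 0.5],
}
consensus, uncertainty = consensus_and_uncertainty(predictions)
for cell, (c, u) in enumerate(zip(consensus, uncertainty)):
    print(f"cell {cell}: consensus={c:.2f}, variance={u:.3f}")
```

Cells where the models agree have a variance near zero, whereas cells with conflicting predictions stand out in the uncertainty map even when their consensus value is intermediate.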
The correspondence analysis evaluated the association between taxonomic groups and modeling methods in the studies of ENMs and, in general, found no relationship between modeling method and taxonomic group. This information is important because it broadens the applicability of ENMs. Our literature review did not find any study that restricted the use of ENMs to certain taxonomic groups. By contrast, many authors have been successful in using ENMs with different species, including rare (Guisan et al., 2006), threatened (Chunco et al., 2013), and endangered species (Bartel and Sexton, 2009). It is important to note that more studies targeting aquatic species are needed. However, this is difficult because data are lacking for both terrestrial and aquatic species (Giannini et al., 2012). Thus, it is necessary to expand and improve the availability and quality of digital occurrence data, because they directly affect the robustness of ENMs and are a limiting factor: missing data often make it impossible to study a species (Robinson et al., 2011). Another measure would be to implement more extensive collections that are standardized, properly georeferenced and focused mainly on sparsely sampled species. In addition, it would be important to develop and implement filters that evaluate the quality of both the taxonomic and mapping data. These measures could therefore improve the quality of the generated models (Giannini et al., 2012).
In relation to the climate scenario, all studies modeled the current scenario. However, only 18% of papers projected the geographic distribution of species into a future scenario and 7% projected into the past. The uncertainty in the temporal variation of climate variables is a difficulty for researchers interested in projecting the geographic distribution of species under different climate scenarios. Certainly, the use of future climate scenarios with ENMs contributes to predicting changes in species distributions, and the interest in such projections can be explained by concern about the impact of global climate change on biodiversity (see Nabout et al., 2012b). However, to produce the future climatic scenarios, it is necessary to run many simulations, considering different geographical extents and resolutions of climate models (Peterson et al., 2011). Currently, there are several atmosphere-ocean general circulation models (AOGCMs) (e.g., CCSM4, GISS-E2-R, MIROC-ESM and MRI-CGCM3), with climate projections for 2050, 2080 and 2100 using climate variables available in the WorldClim database (Hijmans et al., 2005). As for the paleoclimatic scenario, the difficulty is the availability of monthly climate data at a refined spatial resolution (Lima-Ribeiro and Diniz-Filho, 2013). However, the same AOGCMs cited above already provide paleoclimate data, which in the short term should increase the number of papers with paleoclimate scenario data (e.g., Collevatti et al., 2013).
Finally, it was evident that modeling studies have increased over time and that researchers in various fields have tested many models and modeling methods. Several methods, such as MaxEnt and ENSEMBLE, are successful and widely used; however, there are still uncertainties about the errors of each method. We recognize the need to invest more in research aimed at improving existing methods, especially those whose use is increasing.