Bibliometric Analysis for Pattern Exploration in Worldwide Digital Soil Mapping Publications

Bibliometric analyses provide a clear understanding of scientific performance and relate it to the standards of global scientific production. Soil science is an outstanding and developing field among the environmental sciences. Knowledge about soil characteristics and their distribution in the environment has been enriched by the use of new geotechnologies, resulting in what is known as digital soil mapping. Thus, the objective of this work was to characterize the scientific production in digital soil mapping in Brazil and in the world, in the period from 1996 to 2017, in the Scopus and Web of Science databases. In the general context of increasing numbers of papers, the journal Geoderma published the highest number of related papers. Among the 10 journals with most published papers, the Revista Brasileira de Ciência do Solo is the only open access journal. Although there are countries at the cutting edge of digital soil mapping such as the United States and Australia, the position of Brazil in the number of papers and authors cannot be overlooked, showing the importance of the nation's participation in digital soil mapping, as a field of science that can provide guidelines for public policies for the development of agriculture in the country.


INTRODUCTION
One way to assess the strength and productivity of a scientific field is to measure the number of publications over time (Hartemink and McBratney 2008, Mao et al. 2015). Bibliometry arose directly from this need to evaluate scientific production, in view of the large amounts of information available in bibliometric databases (Wallin 2005, Moed 2009, Loudcher et al. 2015). Some bibliometric indices have been used for the strategic planning of research by institutions, universities and research funding agencies (Zhou et al. 2016). A clear understanding of institutional performance not only supports particular areas of research but also situates them in relation to global scientific production standards.
In view thereof, the most productive and influential researchers and countries should be investigated (Cancino et al. 2017), which also helps researchers to identify the leading journals in the development of a particular field of science (Shokraneh et al. 2012). In this respect, soil scientists have increasingly contributed to the myriad of publications.
LUCIANO C. CANCIAN, RICARDO S.D. DALMOLIN and ALEXANDRE T. CATEN
The importance of soils for ecosystems, food production, and climate regulation is increasingly viewed as fundamental (Sanchez et al. 2009, Amundson et al. 2015). A growing interest in agriculture has also put soil back on the global research agenda. The increasing need for up-to-date information on soil has been highlighted in several recent studies of the United Nations and other international organizations (Robinson et al. 2017). Soil science is a knowledge area that can help find answers to these challenges (McBratney et al. 2014). Nevertheless, more could be achieved through meta-analysis of what has already been published (Roudier et al. 2015, Arrouays et al. 2017).
Bibliometric studies are not new in soil science. Based on bibliometric tools and a study review, Warketin (1994) looked for trends in soil science studies as a basis for describing the evolution of this field and the main topics addressed in it. Another example is the review of the 100 volumes published by Geoderma between 1967 and 2001 (Hartemink 2001), which evaluated the temporal behavior and characteristics of these publications in one of the main journals of soil science, showing the development of subareas of soil science over the years. Studies such as that of Hartemink (2015), with inferences on how the new generation of soil scientists has been using soil classification, demonstrate the importance of bibliometric studies for science. They equip researchers, especially the new generation and young researchers, with an outlook on the headway already made on a given topic, identifying the difficulties and peculiarities and thus finding ways for its evolution.
In the search for an enhanced acquisition of soil information, digital soil mapping (DSM) emerged by integrating subareas as a technique to generate new soil studies and meet the demand for information in terms of detailed knowledge on spatial distribution and properties (Arrouays et al. 2017). DSM benefits from the increasing availability of spatial data of the Earth's surface (McBratney et al. 2012), and is a constantly growing field of science, in which additional possibilities of application are continuously being explored. Based on a search in the Scopus database for articles containing the keywords "digital soil mapping", Minasny and McBratney (2016) stated that publications in the area increased at a rate of 12 papers per year and the number of citations increased by 384 citations per year. Much effort has been invested so that research on DSM will contribute to the further development of soil science in the world.
In Brazil, with a huge territorial extension available for food and fiber production, there is a lack of information about sustainable land use that supports production activities. As the soil databases available in the country do not cover the entire territory, DSM would contribute in a practical way to complete this information. The application of DSM is a relatively new area of science in Brazil (ten Caten et al. 2012, Dalmolin and ten Caten 2015), and the first paper on DSM in Brazilian territory was published only in 2006 (Giasson et al. 2006). However, there is no information on the amount of scientific production or on how many researchers work with DSM in Brazil. This information would be useful for the specific orientation of public policies for compiling soil information, such as the program "Pronasolos" (Polidoro et al. 2016), dedicated to resuming pedological surveys in Brazil, for which DSM could be useful.
In this regard, there is still a lack of studies that characterize the main publications, not only in Brazil, but in other countries as well. Based on a comparison of the production of the main authors and their respective countries and between the DSM studies developed in Brazil and the global trends, the national and global research characteristics in DSM research could be identified, aside from predicting future scenarios and indicating new research lines. In this context, the objective of this study was to characterize, based on a set of bibliometric indicators, the scientific production on DSM for Brazil and worldwide, between 1996 and 2017, to identify characteristics and peculiarities in the national and global scientific production on DSM, making the prediction of growth trends in this area of knowledge possible and indicating paths to be followed.

DATA ORIGIN AND SEARCH PROCEDURE
For a pre-analysis of the databases, data were obtained from the Clarivate Analytics Web of Science (WoS) and Scopus databases. All records characterized as articles and bibliographic reviews detected by a query of the subject areas connected to agrarian sciences, published between September 1996 and December 2017, were stored and included in the study.
From combinations of terms referring to DSM, queries were carried out including searches for terms in the titles, abstracts, and keywords of papers. The words and terms used for this study were previously tested in the main scientific journals of the area. For our analysis, they were limited to those that were most relevant and published only in articles specifically about DSM. With this limitation of the search to terms in titles, abstracts and keywords, articles from other areas with complementary methodologies or that only cite DSM were not taken into consideration. If the number of results were too large, it would be necessary to check whether the publications found fit any topic of the research area, making the search subjective and not automatic. It is worth mentioning that unless some of the query terms were found in the keywords, title, or abstract, the papers were not included even if they used the term DSM, and some papers may not appear, due to this methodology. The database was searched for the following terms: "digital soil map" OR "digital soil mapping" OR "digital map of soil" OR "digital mapping of soil" OR "GlobalSoilMap". The plural terms were also included in the search. In both databases, the search was performed in the "Advanced search" field, making use of the boolean operator "OR" between the terms. Afterwards, the results were filtered, limiting the years from 1996 to 2017.
From the publications found by this method, all basic information was extracted, including the authors' names, affiliation, country, language of publication, type of document (article or bibliographic review), number of times it was cited, journal name, year of publication, keywords, and subject category. Thereafter the data were saved in BibTeX format, as recommended by Aria and Cuccurullo (2018).

BIBLIOMETRIC INDICES
The bibliometric analysis of the complete search results was performed using the package Bibliometrix (Aria and Cuccurullo 2018) version 1.9, in the R environment (R Core Team 2018). The two files in BibTeX format were uploaded, read with the "readFiles" function and converted to a data frame. Some 670 records were obtained from Scopus and 557 from WoS, consisting of journal articles and bibliographic reviews. After merging the two files, duplicated records were eliminated by the function "remove.duplicated". Then, to avoid any language conflict that would allow duplicate documents to be included, a manual screening was performed, removing the remaining duplicates. The total of 1,227 (Scopus + WoS) records was reduced to a final number of 727 records to be analyzed.
First, the "biblioAnalysis" function was applied, returning an object of class "bibliometrix", to which the "biblioNetwork" function is applied, which generates a set of bibliometric indices together with the results from the two databases. With this tool, the annual results of publications and citations were classified as DSM-related. In addition, a keyword network of all papers included in the study was created, allowing the subject trends in DSM research to be analyzed.
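The merging of the Scopus and WoS exports and the removal of duplicated records described in the methods can be sketched as follows. This is a minimal Python illustration, not the bibliometrix implementation: it assumes each record is a dictionary with hypothetical `title`, `year` and `doi` fields, and it treats two records as duplicates when they share a DOI or a normalized title and year. The DOIs in the toy data are placeholders.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and punctuation so near-identical titles compare equal."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def record_keys(record):
    """Identity keys of a record: normalized title plus year, and the DOI when present."""
    keys = [("title", normalize_title(record["title"]), record.get("year"))]
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        keys.append(("doi", doi))
    return keys


def merge_without_duplicates(*sources):
    """Concatenate record lists, keeping only the first record for each identity key."""
    seen = set()
    merged = []
    for source in sources:
        for record in source:
            keys = record_keys(record)
            if any(key in seen for key in keys):
                continue  # already present from an earlier source
            seen.update(keys)
            merged.append(record)
    return merged


# Toy data (illustrative only; DOIs are made-up placeholders).
scopus = [
    {"title": "On digital soil mapping", "year": 2003, "doi": "10.1000/example-a"},
    {"title": "Mapeamento digital de solos", "year": 2012, "doi": None},
]
wos = [
    {"title": "On Digital Soil Mapping.", "year": 2003, "doi": None},
    {"title": "A third, unrelated paper", "year": 2015, "doi": "10.1000/example-c"},
]
merged = merge_without_duplicates(scopus, wos)
print(len(scopus) + len(wos), "->", len(merged))
```

Automatic matching of this kind cannot catch titles recorded in different languages, which is consistent with the manual screening step the authors report.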
The scientific journals were evaluated as follows: the 10 journals with the highest number of publications on DSM were identified, also with the function "biblioAnalysis". The output of papers on DSM of the 10 most productive journals was recorded for the study period. In addition, as a way of evaluating journal quality, the following impact factors were analyzed: SCImago Journal Rank (SJR), a factor generated by Scopus, and the Journal Citation Reports (JCR), published by the Institute for Scientific Information. The JCR is a recognized basis for evaluating journals indexed in the WoS. As a complementary metric for the journals with most publications on DSM, the Eigenfactor Score was calculated, which is considered adequate to measure the quality of these journals (Cantín et al. 2015). The Eigenfactor is calculated from the number of times the articles published in the journal were cited by the JCR in the last five years, similarly to the Journal Impact Factor (JIF) or the 5-year JIF. In addition, the Eigenfactor assigns a weight or value to each citation, according to the citing journal.
The advantage of the bibliometrix package is that more than one database can be analyzed; however, only the country of the first author is taken into consideration for the calculation. Therefore, since this study investigated in more detail in which countries the DSM research was carried out, we used only the data obtained from the Scopus database for this purpose, allowing the countries of all authors to be counted. This database was preferred over the others included in the previous analyses due to the higher number of articles found in it (670 papers), the highest number of represented journals (Chadegani et al. 2013), the highest number of journals published only in this database (Barnett and Lascar 2012), and the highest number of papers on the subject "soil" (Minasny et al. 2013).
To evaluate the scientific output of a country, papers were counted for the entire study period from 1996 to 2017. For the number of citations, on the other hand, due to a limitation of the maximum time period of the database, only those of the period from 2002 to 2017 were counted, coinciding with the largest expansion of DSM research. The percentages of self-citation of each country were also recorded. A country's self-citation percentage is the share of the citations received by its papers that come from papers published in the same country.
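The country self-citation percentage can be written as a short computation. The sketch below assumes a hypothetical list of (cited country, citing country) pairs, one per citation; the study itself took these figures from Scopus, and the data here are illustrative only.

```python
from collections import Counter


def self_citation_percentages(citations):
    """Return {country: percentage of its received citations that come from itself}.

    `citations` is an iterable of (cited_country, citing_country) pairs.
    """
    received = Counter()
    own = Counter()
    for cited, citing in citations:
        received[cited] += 1
        if cited == citing:
            own[cited] += 1
    return {country: 100.0 * own[country] / total for country, total in received.items()}


# Illustrative data, not taken from the study.
toy_citations = [
    ("Brazil", "Brazil"),
    ("Brazil", "Australia"),
    ("Brazil", "France"),
    ("Brazil", "Brazil"),
    ("Australia", "Brazil"),
    ("Australia", "Australia"),
    ("Australia", "China"),
    ("Australia", "France"),
]
rates = self_citation_percentages(toy_citations)
print(rates)
```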
To evaluate the geographical distribution of the authors, the institutional addresses cited in the studies were captured in the Scopus database and their geographic coordinates recorded. To assess the increase of research in the countries over the years, this information was captured for the period from 1996 to 2007 and then from 1996 to 2017, in order to identify how the growth occurred in the first years of research and after the boost in publications on DSM. Maps were created with the software QGIS 2.18 (Qgis Development Team et al. 2018) using the point-in-polygon tool, considering the number of authors of each country in the evaluated periods. The data of the institutional address indicated by the authors were also plotted on the map, showing the geographical distribution of authors working with DSM. In addition, the most representative Brazilian institutions were mapped.
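The point-in-polygon count used to build the author maps can be illustrated with a ray-casting test. This is a minimal sketch and not the QGIS implementation: the polygons are hypothetical rectangles standing in for country borders, and coordinates are treated as planar (longitude, latitude) pairs.

```python
def point_in_polygon(point, polygon):
    """Ray casting: toggle `inside` each time a rightward ray from `point` crosses an edge."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The edge straddles the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_per_polygon(points, polygons):
    """Count how many points fall in each named polygon (first match wins)."""
    counts = {name: 0 for name in polygons}
    for point in points:
        for name, polygon in polygons.items():
            if point_in_polygon(point, polygon):
                counts[name] += 1
                break
    return counts


# Hypothetical rectangular regions and author locations (lon, lat).
regions = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
authors = [(2, 3), (5, 5), (15, 1), (25, 5)]
author_counts = count_points_per_polygon(authors, regions)
print(author_counts)
```

Real country borders would require geographic polygons and a projection-aware library; the sketch only shows the counting logic.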

CHARACTERISTICS OF THE GLOBAL DSM RESEARCH
From the 727 papers retrieved from the Scopus and WoS databases, we observed an increase in the number of publications on DSM at an annual rate of 19.6%, with an average of 15.4 citations per paper (Figure 1). Except in 1998, at least one publication was released per year, and a remarkable increase in the number of published papers was observed after 2006. A higher number of publications was also observed in 2006, with discrepant values in relation to the growth trend observed until then. This higher number of papers can be explained by the Second Global Workshop on Digital Soil Mapping, held in 2006. The annual number of publications increased steadily from 13 in 2007 to 100 in 2017, i.e., a nearly 8-fold increase in those 10 years.
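The growth between 2007 and 2017 can be checked with a compound annual growth rate. The sketch below uses only the two counts quoted above (13 papers in 2007 and 100 in 2017). The 19.6% annual rate reported for the whole study period comes from the authors' analysis and is not reproduced here, since the first-year count is not given in the text.

```python
def fold_increase(first, last):
    """Ratio between the last and the first value."""
    return last / first


def compound_annual_growth_rate(first, last, years):
    """Constant yearly rate r such that first * (1 + r) ** years == last."""
    return (last / first) ** (1.0 / years) - 1.0


papers_2007, papers_2017 = 13, 100
fold = fold_increase(papers_2007, papers_2017)
cagr = compound_annual_growth_rate(papers_2007, papers_2017, years=2017 - 2007)
print(f"fold increase: {fold:.1f}, annual growth: {cagr:.1%}")
```

With these two counts the ratio is about 7.7 and the implied constant growth is roughly 22.6% per year.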
The curve of citations per paper shows growth in the first years, becoming more pronounced in 2003, the year of publication of the paper "On digital soil mapping" (McBratney et al. 2003), which established the bases and defined the concepts of DSM, with 950 citations, serving as a reference for several subsequent studies. Thereafter, the curve declined, and the citations per paper rate was diluted by the higher number of papers published until 2017. Regarding this decrease, it should be noted that more recent articles are still in the so-called citation window, which is the time needed for a paper to be read and subsequently cited.
The great majority of papers investigated here deal with the application of DSM techniques in different regions of the world. The increase in data availability can break boundaries and leverage the coverage and availability of soil maps and properties constructed by DSM (Omuto et al. 2013, Sulaeman et al. 2013). A considerable portion of the most cited papers (data not shown) addresses the evaluation of different models and techniques, with a view to identifying the methodologies that are more adequate for the prediction of soil properties. Confirming the statements of Arrouays et al. (2017), there is also an increasing tendency to carry out studies on a local or regional scale. This contributes to a higher number of citations because the newly discovered methodologies can be applied in regions with similar characteristics. However, studies on a regional scale not only contribute to a higher number of citations, but also allow the already discovered and tested knowledge to be disseminated and applied in different regions of the world, generating new information about soils.
Taking into consideration all papers published in the last seven years, 80 studies were found in which legacy data were exploited or their use was suggested as a way to meet the demand for DSM input data.2014). This shows that one of the possible limiting factors, with regard to the scope of the papers, is the availability of samples for the training and validation of the generated prediction models. This is due to the need to consider a large number of samples, which limits DSM research in view of the time and financial resources required for sample collection and analysis.
As DSM addresses diverse soil information, an overview of the main topics covered in the papers helps identify the most frequent themes. Figure 2 shows the 15 most frequent keywords in the DSM papers in the survey period, demonstrating that the central theme is the use of "models" for "spatial prediction", mainly of "soil organic carbon", soil class mapping (from the soil "landscape" relation) or "land use" of the soil.
The keywords also draw attention to the issue of obtaining "information" at appropriate "scales" in the studies, as reported by Arrouays et al. (2017), performing the "digital mapping" of the soil, allowing the formation of "databases" or using information from them.
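Keyword frequencies like those in Figure 2 can be obtained by a simple count over the keyword lists of all records. The sketch below assumes hypothetical records with a `keywords` field holding semicolon-separated strings; the sample data are illustrative and not the study's.

```python
from collections import Counter


def top_keywords(records, n=15):
    """Return the n most frequent keywords, counting each keyword once per paper."""
    counts = Counter()
    for record in records:
        # Case-insensitive, whitespace-trimmed, de-duplicated within the paper.
        unique = {kw.strip().lower() for kw in record["keywords"].split(";") if kw.strip()}
        counts.update(unique)
    return counts.most_common(n)


# Illustrative records, not taken from the study.
sample = [
    {"keywords": "Digital soil mapping; Soil organic carbon; Models"},
    {"keywords": "models; spatial prediction; land use"},
    {"keywords": "Soil organic carbon; models; Models"},
]
ranking = top_keywords(sample, n=2)
print(ranking)
```

Counting each keyword once per paper is a design choice made here so that repeated spellings within one record do not inflate the ranking.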

PERIODIC EVALUATION OF DSM-RELATED PUBLICATIONS
The variety of journals publishing DSM-related articles is wide. A total of 727 papers were published in 171 journals, and approximately 44% were released in 10 journals (Figure 3). Geoderma published most papers on the subject (159, or 22% of the total number of publications), followed by the Soil Science Society of America Journal (28, or 3.8% of the total) and by the Revista Brasileira de Ciência do Solo (RBCS), with 26, or 3.6% of the total.
Representing not only the high number of publications on DSM, but also the representativeness and relevance of journals for soil science, the indices JCR and SJR reflect the importance and tradition of these journals adequately, with emphasis on Geoderma as one of the main journals in soil science (SJR 1.55, JCR 4.03) (Hartemink 2001).The only journals not represented in these indices are Developments in Soil Science, which was discontinued in 2010, and Geoderma Regional, which is a new journal with an insufficient publication period to calculate the indices.
In spite of the late start of research on DSM in Brazil, the RBCS, the country's leading soil science journal, ranks fifth among the journals with most DSM publications, indicating an increase in DSM research in Brazil. With a considerably higher number of papers than other journals, the RBCS is also the only one in the top 10 with open access to all publications. This is an important aspect of research, since this publication format ensures a more efficient distribution of scientific knowledge than the standard publication model (Martínez-Quintana and Penagos-Corzo 2012).
The restricted availability of the vast majority of articles in closed access journals possibly affects the availability of knowledge to the scientific community, especially in developing countries. In Brazil, the scientific and educational institutions have free access to the largest databases. However, papers provided by scientists and researchers through social networking sites such as ResearchGate have a noteworthy influence, facilitating the flow of scientific information (Thelwall and Kousha 2014), increasing the chances of citations and, consequently, raising the bibliometric indices.
Among soil science journals, the RBCS has an outstanding position (Minasny et al. 2013). Nevertheless, RBCS also has one of the lowest impact factors among the top 10 in DSM. In spite of its longstanding tradition in soil science in Brazil, its indices are still low, possibly due to the fact that the great majority of articles were published in Portuguese until 2013, when English became the only language of publication (Vargas et al. 2014), facilitating the reading and citation of articles by the international scientific community. However, an analysis of the Eigenfactor Score (63) shows that the RBCS approaches other journals with better classification in the other two metrics studied, SJR and JCR, with even higher scores than some other journals. This means that despite the low impact factor, RBCS is cited in influential articles and journals.

EVALUATION OF THE COUNTRIES AND THE POSSIBILITIES OF BRAZIL
The evolution of DSM research in recent years is evident. However, not only did the number of publications increase, but also the number of authors involved with the topic, from more countries. In addition to this increase, the continuity of the authors involved in the first studies on DSM was also observed (Figure 1). From 1996 to 2007, the highest numbers of publishing authors were concentrated in the United States, followed by China. The distribution from 1996 to 2017 shows that these countries remain prominent, alongside the emergence of countries in Europe. Aside from Australia and Brazil, followed by the Netherlands, France and Germany, which are already very well represented by the number of authors, it is worth mentioning the increasing participation of Iran, noted as outstanding in the DSM world scenario. Also noteworthy is the wide spread of authors in the United States, where, in addition to a high concentration of authors per area from 1996 to 2017, there is also a good distribution of authors across the country. This may be a result of the great efforts of the United States to harmonize and optimize the use of data and soil maps (Thompson et al. 2012) and to get a comprehensive coverage of the country's agricultural land (Lobry de Bruyn et al. 2017).
These countries, whether at the forefront of DSM research or through collaboration in studies in other countries, are directly involved in the development of DSM tools and technical applications. Even though to a lesser extent, several countries in the different continents have a significant concentration of authors, while other countries participated with the publication of at least one paper during the survey period. On the other hand, the absence of DSM researchers in several African countries was noted in both survey periods, since, although the soil of a good part of the territory with different properties is already mapped (Hengl et al. 2015), few countries have researchers working in institutions of the continent.
Observing the 10 countries with the highest output of DSM papers (Table I), the information shown in Figure 4 is confirmed. The United States is represented by 133 papers, closely followed by Australia, with 124. In addition to this ample dissemination in the United States, Australia stands out not only for the dissemination of research, but also of tools for end users (Minasny and McBratney 2016), because it has a qualified research team that has been making great efforts in the development of new techniques, highlighting this country in this research area.
Also noteworthy are France and the Netherlands, which, even with a lower number of papers, had mean citations per paper of 34.7 and 31.6, respectively. The lowest citation means were found for Germany and China, which have a larger production of papers, together with Belgium and Canada (17.2, 16.7, 15.9, and 14.3, respectively). When the self-citation of the top 10 countries in DSM was evaluated, Belgium and China obtained the highest percentages (26.9% and 25.7%, respectively). In comparison, Minasny et al. (2013), studying soil science journals, found an average of 12% self-citations. On the other hand, Minasny et al. (2010) also reported that in soil science publications, the countries with the highest self-citation percentages are China (63%) and the United States (48%), exceeding the values found for DSM publications. The study of self-citations also sheds light on the scientific output of a country, since the more papers a country produces, the more likely it is to cite articles from its own nation, whereas countries with fewer researchers and fewer published articles are more likely to cite papers from other nations (Minasny et al. 2010).
Evaluating the scientific production in Brazil, an important number of papers and citations was found. Seventy-nine papers were published in the analyzed period, with an average of 24.3 citations per paper, exceeding the averages of the United States and China. Despite the delay in starting the application of DSM at the national level which may be related to the later access to software  The good performance of DSM in Brazil is also an expression of the level of Brazilian scientific production, ranking among the world's top 25 countries in scientific quality and first in South America (Nature Index 2017), and of the evident growth of soil science in the country (Trajano et al. 2013). This is a sign of the potential and ability not only to leverage scientific production even more, but also to apply research to generate knowledge and information in the country.
Figure 5 shows the distribution of Brazilian institutions investing in DSM research. Most of these are located in the south and southeast of the country, e.g., Embrapa Solos, the unit of the Empresa Brasileira de Pesquisa Agropecuária specialized in soil research, which accounts for 20 papers, followed by the Universidade Federal de Santa Maria, with 14 publications, and ESALQ - Universidade de São Paulo, with 12 papers. Other institutions also made significant contributions, such as the Universidade Federal do Rio Grande do Sul (7), Universidade Federal de Lavras (7) and Universidade Federal de Santa Catarina (8).
Considering that there are already at least 200 researchers (data not shown) working directly or indirectly with DSM in Brazil, the creation and application of public policies, programs and research projects is fundamental, since this may not only leverage Brazilian journals, but will also increase the recognition and qualification of Brazilian researchers in scientific and social aspects. In Brazil, the implementation of projects such as SSURGO in the United States, where DSM is understood as a tool to map the country's entire arable land (Chaney et al. 2016), could supply the demand for information on Brazilian soils.
In this regard, Brazil already has fruitful initiatives like the Free Brazilian Repository for Open Soil Data (RBLDAS) (www.ufsm.br/febr), an unprecedented initiative that allows soil scientists to publish their datasets. The RBLDAS aims to centralize storage and allows the sharing of all types of soil information in Brazil. In this way, it is possible for soil legacy data to be used in other studies, also increasing collaboration among soil scientists (Samuel-Rosa et al. 2018). This knowledge, which has already been created and is still under development, could help significantly in programs such as "Pronasolos" (Polidoro et al. 2016), a long-term project to obtain information on soil in Brazil. In order to map the entire national territory, the possibility of using sophisticated techniques for high-precision, fine-resolution modelling of soil properties (Zhang et al. 2017) can be considered a renaissance of pedology in Brazil.

CONCLUSIONS
Publications on DSM are increasing at an accelerated pace, with the most significant contributions coming from Australia, the United States, China, Germany, and Brazil. The largest share of articles was published in Geoderma, but other journals have also been achieving notable success.
DSM research in Brazil has been gaining a prominent position in the world scenario, not only in the number of papers, but also with good citation rates. From the knowledge already generated and the apparent evolution of DSM in Brazil, public policies and financial support could contribute not only to Brazilian research, but also to the social and technological development of the country by participation in programs to obtain soil information in the country.

Figure 1 - Evolution of the number of papers and average citations per paper on DSM from 1996 to 2017.

Figure 2 - The 15 keywords with highest frequency in DSM papers.

Figure 3 - Comparison of the 10 journals with highest output of DSM-related publications between 1996 and 2017.

Figure 4 - Geographical distribution of authors of papers on DSM, (a) first decade of analysis from 1996 to 2007 and (b) second decade of analysis from 1996 to 2017.

Figure 5 - Brazilian institutions with production of DSM papers.

TABLE I Comparison of the 10 countries with most published DSM papers between 1996 and 2017 and citations, mean citations per paper and self-citations between 2002 and 2017.
Citations: citations of papers from the same country, also called self-citations, from 2002 to 2017.