Evolution of the scientific literature on esophageal cancer from 1945 to 2020: a bibliometric analysis

Abstract The aim of this study was to use bibliometric techniques to provide a longitudinal view of the evolution over more than 50 years of the literature on esophageal cancer without focusing on a specific area. The Web of Science Core Collection database was searched for published articles on esophageal neoplasm. Different aspects of the articles were analyzed: country, journal, authors, keywords, and topics. The search returned 24,215 articles. The journal Diseases of the Esophagus published the largest number of articles (n = 858), followed by Annals of Surgical Oncology (n = 475). The most cited article was by van Hagen et al. (2012) (2,807 citations). The most prevalent topic was oncology (n = 10,448), followed by surgery (n = 4,944). Most articles were original research (n = 22,697), mainly with a basic science study design and published by institutions in China. The analysis of the chosen variables identified China as the country with the highest number of articles and showed that authors and institutions in Asia stand out in the production of scientific information on esophageal cancer.


INTRODUCTION
Esophageal cancer is the seventh most common and the sixth most fatal type of cancer worldwide (Sung et al. 2021). It is often divided into two histological subtypes: esophageal adenocarcinoma (EAC) and esophageal squamous cell carcinoma (ESCC) (Zulfiqar et al. 2013, Pickens 2022). ESCC is the most frequently occurring subtype in the world, but EAC has been increasingly identified in developed countries (Peters et al. 2021, Lipenga et al. 2021). Alcohol consumption, smoking, Barrett's esophagus, gastric reflux, and obesity are risk factors for the development of esophageal neoplasia (Lu et al. 2021, Gokulan et al. 2019). Chemoradiation is the treatment of choice for patients considered ineligible for esophagectomy due to cardiac comorbidities or general clinical frailty (Allum et al. 2014, Wang & Marshall 2021).
Bibliometric analysis evaluates the importance of articles published in a specific research area (Mainwaring et al. 2020). Citation frequency is a measure of the impact of published articles on the scientific community. Bibliometrics is a valuable tool to promote and identify areas with underestimated potential (Akmal et al. 2020, Mörschbächer & Granada 2022). The R-tool Biblioshiny is software used for bibliometric analysis (Liu et al. 2020). This program lists and analyzes country networks, institutions, co-citations, journals, and references based on bibliographic records collected from the Web of Science (WoS) (Chang et al. 2015). The aim of this study was to use bibliometric techniques to provide a longitudinal view of the evolution over more than 50 years (1945 to 2020) of the literature on esophageal cancer without focusing on a specific area.

MATERIALS AND METHODS
Data were collected from WoS on a single day, September 17, 2021, to avoid biases and discrepancies caused by database updates. WoS is considered the main data source for bibliometric analyses, providing comprehensive and multidisciplinary data on the literature listed (Archambault et al. 2009, Yin et al. 2021, Zhang et al. 2022). To ensure that all relevant articles were identified, the appropriate search terms were compiled and combined as follows: TS = "cancer*" OR "Neoplasm*" OR "Carcinoma*" OR "Adenocarcinoma*" OR "Squamous cell*" AND TS = "Esophageal" OR "Oesophageal" OR "Esophagus".
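The way the two term groups are combined can be sketched in Python. This is an illustration only, not the authors' procedure: the function name build_query is hypothetical, and wrapping each group in parentheses after the field tag is an assumption about how the string would be entered in an advanced search form. Terms within a group are joined with OR, and the two groups are joined with AND.

```python
def build_query(disease_terms, site_terms, field="TS"):
    """Join terms with OR inside each group and AND between the two groups.

    The parenthesized TS=(...) layout is an assumption for illustration;
    the field tag defaults to the one quoted in the text.
    """
    def group(terms):
        quoted = " OR ".join(f'"{term}"' for term in terms)
        return f"{field}=({quoted})"

    return f"{group(disease_terms)} AND {group(site_terms)}"


# Term lists copied from the search strategy described in the text.
DISEASE_TERMS = ["cancer*", "Neoplasm*", "Carcinoma*", "Adenocarcinoma*", "Squamous cell*"]
SITE_TERMS = ["Esophageal", "Oesophageal", "Esophagus"]

query = build_query(DISEASE_TERMS, SITE_TERMS)
print(query)
```

Keeping the term lists as data makes it easy to document the exact strategy alongside the exported records.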
In this study, only articles and reviews written in English and published between 1945 and 2020 were considered. As no experiments on animals or humans were performed, no ethical approval was required. The thematic search strategy employed to collect all documents relevant to the subject was based on the article title. Search accuracy was increased by excluding documents classified as book chapters, notes, editorials, letters, or errata; only journal articles were considered.
The bibliometric information of the selected articles was exported as .CSV (comma-separated values) files using the WoS access of the University of Taquari Valley for subsequent data processing. The main bibliometric indicators analyzed were: (I) the most influential articles on esophageal cancer (total number of citations), (II) country/region, (III) productivity (number of publications), (IV) year of publication, (V) research institution, (VI) source/journal in which the articles were published, (VII) keywords, and (VIII) topic. Standard competition ranking (SCR) was used to rank some variables from one to ten.
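Standard competition ranking gives tied items the same rank and then skips the ranks they occupy (the "1, 2, 2, 4" pattern). A minimal Python sketch follows; the function name and the example counts are hypothetical and are not taken from the study's data.

```python
def standard_competition_rank(scores):
    """Return 1-based ranks in input order; ties share a rank and
    the ranks that follow a tie are skipped (1, 2, 2, 4)."""
    ordered = sorted(scores, reverse=True)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        # Keep only the first (best) position at which each value appears.
        first_position.setdefault(value, position)
    return [first_position[score] for score in scores]


# Hypothetical article counts for four institutions.
example_counts = [50, 40, 40, 30]
ranks = standard_competition_rank(example_counts)
print(ranks)
```

With this scheme, a top-ten list can contain more than ten entries when several items tie for tenth place.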
The Hirsch index (h-index) and impact factor were used as bibliometric indicators to evaluate research impact and/or quality. Hirsch created the h-index in 2005 to measure the scientific development of authors, journals, institutions, and countries. The index reflects both research productivity and number of citations: h is the largest number such that h publications have each been cited at least h times. A high h-index indicates that the publications have had a significant impact on the development of the scientific community's knowledge (Hirsch 2005). The quality of each article was measured by the impact factor of the journal that published it (Journal Citation Reports®, JCR) (Thomson Reuters 2020).
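As an illustration of the definition, the h-index of a set of publications can be computed from their citation counts as sketched below in Python. The function name and the example counts are hypothetical; the study itself took h-index values from WoS rather than computing them this way.

```python
def h_index(citations):
    """Largest h such that h publications each have at least h citations."""
    ordered = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ordered, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for five publications.
example_citations = [10, 8, 5, 4, 3]
print(h_index(example_citations))
```

In this example four publications have at least four citations each, but there are not five with at least five, so the h-index is 4.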

Data analysis
Bibliometric mapping and network visualization were developed in the R-tool Biblioshiny (Chang et al. 2015), and the most frequent terms in the titles of the articles were selected. Microsoft Excel 2021 (Microsoft Corporation, Redmond, Washington, USA) was used to analyze quantitative variables such as publication and citation counts and journal citation reports.

RESULTS
The WoS search returned 24,215 articles published since 1945. The ten most influential articles on esophageal cancer (by total number of citations) had 17,243 citations collectively (Table I). In 60% of those articles, the impact of surgery alone was compared with that of chemotherapy, radiotherapy, combined surgery, and monoclonal antibody use on disease-free survival (Herskovic et al. 1992, Walsh et al. 1996, Mandard et al. 1994, Cooper et al. 1999, Fuchs et al. 2014, Van Hagen et al. 2012). The most cited article (2,807 citations) was by Van Hagen et al. (2012), published in the New England Journal of Medicine, which compared surgery alone with chemoradiotherapy followed by surgery as treatment for patients with esophageal or esophagogastric junction cancer in the period from 2004 to 2008. It was published by the research group of the Erasmus MC Cancer Institute, located in Rotterdam, the Netherlands.
Analysis of the geographical distribution of scientific production (Figure 1) revealed the following: seven countries from three continents (Asia, America, and Europe) published approximately 90% of the articles. Of the 105 countries that published articles, 79 (75.23%) published fewer than 100 each, an indication that few groups have high scientific production rates in this field. China had the highest number of articles (n = 7,720 [31.88%]), followed by the US with 5,119 (21.13%) and Japan with 4,939 (20.39%), whereas some countries in South America, Africa, and Europe had none. The ten most productive authors published 2,181 articles collectively (Table II). The author name with the most articles was Li, Y., but because there are multiple authors from different institutions named Li, Y., this name was excluded from the analysis. Consequently, Kuwano, H. was the author with the highest number of articles (n = 257 [6.91%]) and was cited 7,704 times, followed by Wang, Y. with 248 articles (4.14%) and 4,615 citations and Zhang, Y. with 243 articles (4.01%) and 4,474 citations.
The first articles in this field were by de Nielsen 1945, Clark 1945, Sweet 1945a, Tomlinson & Wilson 1945, Boros 1945, Gaffney 1945, Sweet 1945b, and Hoover 1945. These articles ranged from case reports to surgical management in the treatment of esophageal cancer. The number of articles increased continuously over the study period, with most published from 2010 to 2020 (Figure 2). Among the institutions with the highest number of articles, a majority are located in China (Table III). Zhengzhou University leads the list with 875 articles, 16,688 citations, and an h-index of 57, followed by the Chinese Academy of Medical Sciences & Peking Union Medical College with 777 articles, 27,418 citations, and an h-index of 80, and the University of Texas System with 657 articles, 30,171 citations, and an h-index of 85.
The journal Diseases of the Esophagus had the highest number of articles (n = 858), totaling 14,237 citations (Table IV), followed by the Annals of Surgical Oncology with 475 articles and 12,136 citations. Although the journal Cancer had the third-highest number of articles (n = 450), it had the highest number of citations (30,424) and, consequently, the highest h-index (90). Furthermore, considering its high impact factor (7.396), the International Journal of Cancer has demonstrated publication specificity and relevance to the scientific community in the field of esophageal cancer research.
Keywords represent a brief summary of an article, while clustering and multiple correspondence analyses can quickly determine the specialty and the development of research articles in a specific area. These analyses were performed using the R-tool Biblioshiny. The 500 most cited articles in 2020 were selected to build a keyword co-occurrence network. Two main areas were identified in the articles, one associated with the genetic expression of tumors and prognosis, and the other related to different esophageal cancer histological types and treatment response. The word "cancer" was used to represent 119 articles, followed by "expression" (112 articles), "survival" (75 articles), and "proliferation" (54 articles) (Figure 3).
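The idea behind a keyword co-occurrence network can be sketched in a few lines of Python. This is not Biblioshiny's implementation; it only illustrates the counting step, in which each keyword is a node weighted by the number of articles that use it and each pair of keywords appearing in the same article is an edge weighted by the number of such articles. The function name and the three sample keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count keyword frequencies (nodes) and keyword pairs (edges).

    Each inner list holds the keywords of one article. Keywords are
    lowercased and de-duplicated per article before counting.
    """
    keyword_freq = Counter()
    pair_freq = Counter()
    for keywords in keyword_lists:
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        keyword_freq.update(unique)
        # Sorted input means each pair is always stored in the same order.
        pair_freq.update(combinations(unique, 2))
    return keyword_freq, pair_freq


# Hypothetical keyword lists for three articles.
sample = [
    ["cancer", "expression", "survival"],
    ["Cancer", "expression", "proliferation"],
    ["cancer", "survival"],
]
node_weights, edge_weights = cooccurrence_counts(sample)
print(node_weights.most_common(2))
print(edge_weights.most_common(2))
```

Clustering is then applied to the weighted edges to find groups of keywords that tend to appear together, which is how thematic areas like the two described above are typically identified.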
The most representative topic in the articles was oncology (43.15%), followed by surgery (20.42%) (Figure 4). Other topics such as experimental research in medicine and cell biology and pathology were covered in 18.5% of the articles. Of the 24,215 articles, 93.73% are original research and 42.66% are open access.

DISCUSSION
This is the first bibliometric analysis to verify all existing publications on esophageal cancer. It includes 24,215 articles published from 1945 to 2020, and the most cited article was published by van Hagen et al. (2012). Among the articles with the highest number of citations, eight are original studies on epidemiological data, different forms of treatment, and risk factors (Herskovic et al. 1992, Walsh et al. 1996, Mandard et al. 1994, Cooper et al. 1999, Fuchs et al. 2014, Van Hagen et al. 2012, Hoover 1945, Lagergren et al. 1999, Devesa et al. 1998). There are two reviews (Table I), in which data on pathogenesis, treatment, diagnosis, prognosis, management, and advances in the treatment of esophageal neoplasia were compiled (Enzinger & Mayer 2003, Pennathur et al. 2013). The analysis shows that the number of articles increased annually, with more than half published in the last decade (2011-2020), indicating research advancements in this field of study.
The New England Journal of Medicine and The Lancet published the articles with the highest number of citations. These journals have a high impact factor and are recognized for their influence and quality; thus, their articles are likely to be cited more by the scientific community. Diseases of the Esophagus is the journal with the highest number of articles, probably because it specifically publishes articles on pathologies of the esophagus and their etiology, diagnosis, and pharmacological and surgical treatment.
The analysis of institutions, authors, and countries with more articles shows the representativeness of the research conducted in Asia. Among the ten institutions that have conducted the most research in this field, eight are in Asia, as are the ten authors with the highest number of citations. Among countries, China has the highest number of articles. Advances in esophageal cancer research are probably owing to the estimated 35% increase in the number of new cases by 2030. In other words, in less than a decade, China is expected to have around 100,000 additional new esophageal cancer patients, an increase from 324,000 to 436,000 (IARC 2022, Li et al. 2022). In addition, China is a developing country continuously funding basic research in several areas (Li et al. 2022).
The result of the keyword analysis reflects the areas in which research worldwide was focused in 2020 and emerging areas for future research. Topics such as expression, cancer, metastasis, diagnosis, and survival indicate the need to understand the high mortality of esophageal cancer and identify possible correlations between genetic patterns and resistance to the chemoradiotherapy treatment used. Studies on these topics seek to identify genes or proteins linked to the development of esophageal neoplasia that can serve as biomarkers for screening and diagnosis (Wang et al. 2018, Chu et al. 2020).
However, while bibliometric analysis can be useful for identifying the main topics and publications within a specialty, it can be limited by several types of bias. The use of only one database (WoS), even if widely used, to search for articles on esophageal cancer could have led to the omission of articles published in journals not indexed in WoS. The search based on the article title could have led to the exclusion of articles not containing the keywords for esophageal neoplasia used in the search. Finally, the exclusion of textbooks, lectures, and conference abstracts could have led to a loss of information relevant to the area investigated. These biases could have influenced the accuracy of the results of the bibliometric search.
This study identified the most influential articles on esophageal cancer published from 1945 to 2020. Analysis of the chosen variables identified China as the country with the highest number of articles and showed that authors and institutions in Asia stand out in the production of scientific information on esophageal cancer.
The results of this bibliometric analysis have the potential to inform key stakeholders (academic journals, health policymakers, and funding agencies) about trends and gaps, such as biomarker research, in scientific production on esophageal cancer and to guide future research. Thus, these data can help avoid the duplication of research efforts and the waste of valuable resources.

Figure 1. Geographical distribution of articles on esophageal cancer. Software: Microsoft Excel 2021.

Figure 2. Growth trend of articles on esophageal cancer from 1945 to 2020. Software: Microsoft Excel 2021.

Figure 4. Distribution of articles by topic. Software: Microsoft Excel 2021.

Table I. The ten most cited articles on esophageal cancer.

Table II. The most productive authors in the field of esophageal cancer research.

Table III. The ten most productive institutions in the field of esophageal cancer research.

Table IV. Distribution of articles on esophageal cancer in journals.