Differences between h-index measures from different bibliographic sources and search engines

METHODS: The scientific output of Brazilian researchers holding CNPq 1-A productivity grants in the areas of public health, immunology and medicine was compared. The h-index of each researcher was estimated from Web of Science, Scopus and Google Scholar. Median h-indices were estimated for the groups of researchers in each area and, to compare the differences according to each source, the non-parametric Kruskal-Wallis test and Behrens-Fisher multiple comparisons were used.

INTRODUCTION

The growth in scientific production, its importance in economic and social development, and the consequent consolidation of science as an object of public policy in the second half of the 20th century brought with them the need to develop indicators capable of measuring and evaluating the performance of complex scientific activities in general, and of their components, researchers and institutions, in particular. In spite of their recognized limitations, bibliometric indicators are the most widely used means of evaluating scientific activity and its influence and impact. 13 Bibliometric measurements capable of measuring and qualifying scientific productivity are developed to report the performance of researchers, groups and research institutions, and to guide the promotion of scientists, the fostering of research and the training of personnel. Bibliometrics is a vast field of empirical study and one of the bases of scientometrics. Bibliometric measurements have evolved over time. Initially, they were limited to counting the number of publications. In a short time, however, the number of publications became increasingly less relevant in qualifying the productivity of a researcher unless it was related to some measure of quality, expressed by peer recognition. In bibliometrics, this quality was translated into the number of citations obtained by scientific publications. Using the gross number of citations as a measure of the influence of a publication, however, had its limitations and was not always a reflection of quality. 13
A series of citation-based indices have been suggested to replace this method. The h-index is the most popular. It was developed by a physicist interested in producing a measure that, based on citations, reduces the shortcomings of simply counting them and overcomes the problems of the denominators used in calculating the impact factor. Hirsch 8,9 (2005, 2007) suggested that the h-index was better than the other indices used up to that point (total number of articles, total number of citations, mean number of citations, number of 'significant' publications), as it combines the number of articles with the number of citations received by the most commonly cited ones.
The h-index became renowned because it uses a single measure, calculated in a particularly simple way, to characterize the impact of a researcher's scientific output. It is calculated by placing each piece of work by the author (or research group, journal, institution) in descending order of number of citations; the h-index is defined as the point at which the number of citations corresponds to the rank order. A researcher who has published 50 articles, of which 22 received 22 or more citations, would have an h-index of 22. It is a robust index, as it combines the quantity of scientific production (number of publications) with aspects of its quality or relevance (citations). 8 As it has become one of the most commonly used indicators for evaluating scientific production, it has also become the object of serious debate on aspects related to bibliometric measures in science. 4 The variability of the h-index across different scientific areas is well documented: areas with more prolific publication output register higher h-indices. There is also variability when different bibliographic databases or search engines are used to derive the indices, 12,14 because these sources differ in the coverage of their bibliographic and citation records. 1 For the social and human sciences, some databases are less representative, as fewer books, reports or conference proceedings are indexed. 15 Therefore, the choice of database used to calculate the h-index directly influences the values found.
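The ranking rule described above is simple to implement. The following is a minimal illustrative sketch (not part of the original study's methods) that computes an h-index from a list of citation counts; the sample data are hypothetical, chosen only to reproduce the paper's 50-article example:

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# The paper's example: 22 of 50 articles received 22 or more citations.
sample = [30] * 22 + [5] * 28  # hypothetical citation counts
print(h_index(sample))  # 22
```

Note that the index is bounded by the number of publications: ten articles with thousands of citations each still yield h = 10, a limitation discussed later in this article.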
Two bibliographic databases stand out for their wide-ranging coverage of scientific areas and for counting citations: the ISI Web of Science (WoS), with bibliographic records dating back to 1945; and Scopus, created more recently to compete with it, with records dating from 1960 and, more systematically, from 1996. Recently, Google Scholar (GS) has gained importance, although it is not a bibliographic database like the former two. It is a search engine that uses algorithms to identify scientific publications and their citations available on the internet. This characteristic means that Google Scholar embraces a greater diversity of bibliographic production, including books, seminars, lectures and others. In this article, the term bibliographic database will be used to refer to all three sources, including GS. This article aims to analyze the use of the h-index as a measure of the bibliographic impact of the scientific output of Brazilian researchers. The aim is to call attention to the use of databases appropriate to the specifics of each field of knowledge, highlighting the particularities of Public Health.

METHODS
Three areas were selected for the analysis: Public Health, Immunology and Medicine, which form part of the so-called life sciences, which include basic sciences, such as biology, and applied sciences, such as the agricultural and health sciences. Immunology was chosen as it is one of the subareas of biological science with the greatest impact factors, both for researchers and for the journals used to disseminate their output. Medicine was selected because it constitutes the range of health science areas with the highest numbers of articles and a high impact factor. Public Health showed the greatest internal diversity, including researchers in the subareas of epidemiology, social sciences in health, and health policy and care management, as well as having a smaller scientific community than Medicine. In spite of this, it stands out with regard to scientific output in Brazil. To qualify for a senior grant, the researcher should have completed 15 years, consecutive or not, with a category 1 grant. In category 1, the researcher falls into one of four subcategories (A, B, C or D), based on their performance in the last ten years compared with that of their peers. For category 2, productivity is evaluated on one level only, with emphasis on published work and supervision, both referring to the preceding five years. Category 1-A, the top of the hierarchy, contains researchers who show continued excellence in scientific production and human resources training, and who lead consolidated groups of researchers. The choice of this group restricted the comparison to researchers with high levels of productivity and scientific leadership in each of the three areas in question. The sample grouped together researchers at the same stage of their careers, as length of time spent in the profession strongly influences the h-index.
The list of researchers was obtained from the CNPq Carlos Chagas platform in April 2011. It included 98 researchers: 20 from Public Health, 59 from Medicine and 19 from Immunology. Access to the WoS and Scopus databases was obtained through the Capes Journal Portal. GS was accessed using the Publish or Perish interface, free software which organizes searches and calculates the h-index.b Searches were conducted in each database using the field "name in bibliographic citations" from the selected authors' CVs on the Lattes platform (CNPq). The area of research and institution of each researcher were checked. Documents found in the bibliographic databases were compared with those listed in each author's CV on the Lattes platform. This check enabled publications by homonymous authors to be excluded, as these could have distorted the results.
The h-index for each researcher was estimated using the three databases, considering the period covered by each. The total indexed production of each author in the different databases was recorded. Any distortion due to the different time spans covered is present in all three areas; therefore, it does not affect comparisons of h-index behavior across the three areas in the three databases. This process was carried out blind, i.e., without prior knowledge of the researcher's area, which avoided any bias on the part of the authors of this study.
Means and medians were obtained for each group of researchers by area and by source of the h-index estimates. To test the statistical significance of differences between the groups, the Kruskal-Wallis test for two or more groups, the non-parametric equivalent of analysis of variance, was used to compare the three areas within the same database, and the three databases within each area. In those cases in which the result of the Kruskal-Wallis test was significant (p < 0.05), the Behrens-Fisher multiple comparisons test, which tests the groups two by two, was used to ascertain between which groups the difference occurred. 5
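As a sketch of this testing procedure, the following uses SciPy's Kruskal-Wallis implementation on hypothetical h-index values. The data below are illustrative, not the study's; the Behrens-Fisher post-hoc test is not available in SciPy, so its place in the workflow is only indicated in a comment:

```python
from scipy.stats import kruskal

# Hypothetical h-index values for three groups of researchers
public_health = [8, 9, 10, 12, 15]
medicine = [14, 16, 18, 20, 22]
immunology = [25, 26, 27, 28, 30]

# Omnibus non-parametric test across the three groups
stat, p = kruskal(public_health, medicine, immunology)
print(f"H = {stat:.2f}, p = {p:.4f}")

if p < 0.05:
    # A significant omnibus result would be followed by pairwise
    # Behrens-Fisher (or similar non-parametric) multiple comparisons
    # to locate which pairs of groups differ.
    pass
```

The same call pattern applies in the other direction of the analysis, passing one area's h-indices from WoS, Scopus and GS as the three groups.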

RESULTS
The data for the Public Health researchers showed the greatest range of variation, with more extreme minimum and maximum values (Table 1). Panels a, b and c of the Figure show in more detail the distribution of the h-index values and their respective medians for the researchers in each area, generated from the three sources used in the study (WoS, Scopus and GS).
When GS was used, researchers from Public Health and Medicine had significantly higher h-index medians than with the other two databases (Tables 2 and 3). The three areas did not differ in the h-indices obtained using Scopus and GS, although Immunology had significantly higher medians than the other two areas when using WoS. This difference disappeared when the comparison was made using the Scopus and GS databases.
Among the researchers from Medicine and Public Health, there was a significant increase in the h-index medians when GS was used compared with the other two databases (23% and 47.5% higher, respectively).

DISCUSSION
Using GS generates higher h-indices for Public Health researchers, followed by those in Medicine, but not for Immunology, compared with the h-indices generated by WoS and Scopus. Immunology had significantly higher medians than the other two areas in WoS, probably because it is a basic science whose articles are traditionally published in English and widely cited among peers. According to the Journal Citation Reports (JCR, ISI Web of Knowledge), in 2011 there were 139 journals indexed for Immunology and 234 for Public Health, considering the social sciences collection. In spite of this, the citation indices are higher for Immunology.
Results of this nature are relevant, as deciding which base to use for the h-index calculation has implications on the ranking of the researchers and the academic areas.
Researchers in Immunology had similar h-index medians across the three databases, whereas for researchers in Medicine and Public Health there were significant differences between the estimates from GS and those from WoS and Scopus. This difference was greater than 50% in the case of Public Health. The area of health care, in general, has a greater number of journals published in Brazil which are not indexed, or have only recently been indexed, in the Scopus and WoS databases.
Using a different approach, Pereira & Bronhara 16 (2011) estimated the h-indices of all Brazilian lecturers active in Public Health post-graduate programs in 2009. They used WoS and found a national mean h-index of 3.1, with 29.8% of the lecturers obtaining an h-index of zero. The emergence of GS, and of interfaces which maximize its use, has brought a whole new dimension to discussions of measures of the bibliographic impact of scientific publications. Areas with strong professional components, in which the knowledge produced is also published, as it should be, in the native language, in addition to its dissemination to the international scientific community, have a different pattern of publication and citation from areas which are exclusively or predominantly academic.
The differences found among the h-indices from the three sources are related to the particular characteristics of these sources. WoS belongs to Thomson Reuters, the third largest publishing company in the world, and charges for access. It is the most traditional of the databases and the most commonly used, including in Brazilian research institutions. Scopus, developed by Elsevier, the second largest publishing company in the world, is also private and charges for access. Despite being more recent, it is a strong competitor to WoS. It has wider coverage of scientific journals, including those not in English and those published outside the North America-Western Europe axis. According to information available on the respective portals, in 2012 the Scopus database included 19,500 journals, more than 240 of which were Brazilian. In WoS, of the more than 12 thousand journals indexed, around 240 were Brazilian. GS, in turn, is part of the most popular search engine on the internet.
One of the first authors to compare these three sources highlighted differences between them which need to be better explained. 1 Whereas Bar-Ilan 1 found both higher and lower variations among h-indices estimated by GS compared with the other two sources, in this study GS systematically generated higher results, with statistically significant differences in the cases of Public Health and Medicine. These divergences may be due to the fact that, initially, the indices were estimated directly from GS, with no possibility of excluding citations not referring to the articles in question, or citations of homonymous authors. The creation of Publish or Perish and, more recently, of a new interface developed by Google itself (the My Citations feature in GS) has improved the system, reducing inconsistencies. The current differences between the h-indices generated in GS and those from the other two databases may therefore better reflect real differences in numbers of publications and citations.
The researcher who developed the Publish or Perish software called attention to GS's superiority in estimating h-indices, especially for researchers in the applied, social and human sciences, whose scientific journals are not well covered by the other two databases. 7 An advantage of GS is that it does not depend on closed commercial databases. As it indexes references and citations available on the internet, GS is open and gives access to a large body of literature which is not indexed in Scopus or WoS.
Search engines also have disadvantages. There is a greater degree of 'rubbish' in the data obtained, i.e., the inclusion of publications and citations of non-scientific documents. This brings limitations and demands more care when using GS and constructing indices based on it, which is why GS has both supporters and detractors. 6,10 On the other hand, since the h-index depends on a number of publications that is a small percentage of an active researcher's total output, it is relatively easy to verify the publications and respective citations which enter into the index calculation, excluding incorrect mentions.
Knowing the advantages and drawbacks of each of the sources allows a more productive use of such bibliometric indicators, better exploiting the potential of each. This may avoid their use as a form of academic control or for the creation of (false) hierarchies of researchers and research institutions.
Variation in the h-index according to the bibliographic source or search engine used is not a disadvantage inherent to the measure itself. However, various authors have highlighted drawbacks of the index as such. There are two types of criticism of this indicator: one, of a more general character, relates to the use of citation-based indices as a measure of scientific impact; the other relates to its specificities.
The more general criticism states that citation may be affected by various factors, whether social, political or geographic, and contests the relationship between the 'popularity' that generates a high number of citations and the transparent expression of effective scientific quality. 13 From a more radical perspective, some argue that the use of scientific metrics, especially bibliometrics, is part of a larger project aiming to impose 'quantified control' on academic activities. 3 More specific criticism of the h-index highlights the fact that it is time-dependent. It is cumulative, and related not only to the number of citations but also to the number of publications: an author with ten publications, each with thousands of citations, will never have an h-index higher than 10. This aspect is important, and Hirsch 9 (2007) himself stipulated in the original article that the index should be used to evaluate researchers at the same stage of their careers. The h-index is useful for making comparisons between the most productive scientists, who have generally been active in their fields for longer, which justifies the choice of CNPq 1-A researchers. 2 Another disadvantage of the h-index is that it can be manipulated through self-citation or other mechanisms.
Another relevant aspect is the variation in h-index between scientific areas. A comparison of the h-indices of members of the ten scientific areas of the Academia Brasileira de Ciências (Brazilian Academy of Sciences) showed that the highest mean h-indices calculated using WoS were in Biomedicine, Health and Chemistry (23, 20 and 19, respectively), lower means were found in Earth Sciences, Engineering and Mathematics (9, 8 and 7, respectively), and means were practically zero in the Human Sciences (1). 11 Using the appropriate database for each field of knowledge is critical. This enables a more robust use of the h-index for reporting the performance of researchers, groups and research institutions and for promoting scientists, fostering research and training personnel.
Only researchers holding a 1-A productivity grant from the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq - National Council for Scientific and Technological Development) were included in the analysis. The CNPq Research Productivity grant is aimed at researchers who stand out among their peers, valuing scientific output according to normative criteria. Grant holders are divided into three categories: 2, 1 (D, C, B and A) and senior.a In order to apply for a category 2 grant, the researcher should have completed a doctorate at least three years beforehand; for a category 1 grant, at least eight years. The criteria used to judge grant applications are: (a) the candidate's scientific production; (b) human resources training at post-graduate level; (c) scientific, technological and innovation contributions; (d) coordination of, or participation as lead researcher in, research projects; and (e) participation in editorial and scientific management activities and in the administration of institutions and centers of scientific and technological excellence.

Figure. Estimated h-indices and medians from the Web of Science, Scopus and Google Scholar (calculated with Publish or Perish) databases for 1-A researchers of the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) in the areas of Public Health, Medicine and Immunology.

Table 1.
Mean and median of h-indices for CNPq 1-A researchers in the areas of Public Health, Medicine and Immunology, estimated from different sources.

Table 2.
Median h-indices for CNPq 1-A researchers in the areas of Public Health, Medicine and Immunology estimated from different sources (Web of Science, Scopus and Google Scholar), with (A) p-values comparing the different sources for each area and (B) p-values comparing the different areas for each source.
[…]tered citation in WoS than in GS. This occurs because a significant part of publication in the social sciences takes the form of books or other types of documents, captured by GS but not indexed in Scopus or WoS.

Table 3.
Results of multiple comparison tests: comparison between sources for each area and comparison between areas for each source.
a Behrens-Fisher test