Evaluation of author's citation-based performance through the Relative Author Superiority Index

The aim of this paper is to further explore the recent conversation about indicators for research evaluation through citation-based indexes. It evaluates the citation-based performance of Cuban researchers in Biotechnology; Applied Microbiology, according to their scientific production in journals of the ISI Web of Science database, through the Relative Author Superiority Index. The methodology comprises six steps: (1) preparation of the data; (2) calculation of the Percentile Rank Index for each of the papers; (3) calculation of the Author Superiority Index for each of the authors; (4) calculation of the Relative Author Superiority Index; (5) comparison of the Author Superiority Index of each author to their Hirsch (H) and G citation indexes; and (6) individual or group evaluation of the citation-based performance. The findings suggest that the group of Cuban researchers in biotechnology achieved a high citation-based performance within the analyzed period. The results show the effectiveness of this index for assessing the citation performance of individual researchers or groups when the impact factor of the researcher or group under evaluation is not high. In addition, the Relative Author Superiority Index could be complementary to previous indicators such as the H-index, the G-index or citation counts, as it overcomes the limitations of the age of publications, the length of the author's career, and the self-citation problem that are present in other indicators.

Introduction
Garfield (2014), the founding father of the Institute for Scientific Information (ISI) database, states: "Citations are the currency of scholarship". Following this belief, research bodies throughout the world began to quantify research quality through citation analysis (FINCH, 2010). Thus, measuring the performance of a researcher using objective measurements has become one of the major challenges in science (GAJENDRA; SINGH, 2009). Evaluating individual research performance is a complex task that ideally examines productivity, scientific impact, and research quality (SAHEL, 2011). At present, the impact of scientific work is traditionally measured by the number of papers written by an author and the number of citations these publications receive (PAN; FORTUNATO, 2014). Thus, the core principle of citation metrics is the assumption that when an article is cited by another scholar, it has had an impact on that scholar's research (NEOPHYTOU, 2014).
The validity of this assumption is a matter of debate because there are many reasons why a scholar might choose to cite another's work, and those reasons do not always reflect the 'quality' of the cited work. As stated in an editorial published in the journal Nature (MEASURING IMPACT, 2011, p. 477): "Impact and reputation are more than just an intangible bonus for years of hard work; when considered in tenure and funding decisions they can make or break academic careers". Accordingly, scientometricians point to the following limitations when measuring the impact of an author through citation measures: (1) the highly skewed distributions of citations (DE BELLIS, 2009; DING et al., 2013; RADICCHI et al., 2008); (2) the age of publications and/or the length of the author's academic career (ANTONAKIS; LALIVE, 2008; JÄRVELIN; PERSSON, 2008); and (3) the self-citation effect (RAD et al., 2012; ZHIVOTOVSKY; KRUTOVSKY, 2008).
Nevertheless, despite the criticisms of citation-based evaluation, it is thus far the most evolved criterion for measuring the performance of a scientist, a group or even a nation, because citation indexes provide a way to measure the extent to which the academic community has engaged with a given piece of research. Interestingly, Thomson Reuters predicts Nobel Prize winners with reasonably high accuracy based on the citation records of scientists (GAJENDRA; SINGH, 2009).
Researchers have suggested many indexes to overcome these limitations of citation measures as indicators of scientific impact. Among these indexes, the most cited in the scientific literature is the H-index (HIRSCH, 2005) and its descendants. The second most cited are the G-index (EGGHE, 2013) and its descendants, and the Crown indicator (MOED et al., 1995; MOED, 2010; WALTMAN et al., 2011b), among others.
Finch (2010) suggests that a measurement for fairly assessing the performance of a researcher should meet four criteria. First, the citation metric would have to be unambiguous, so that two calculations of the metric could not reach different results. Second, it would fairly compare authors from different areas or regions. Third, it would take account of time, both the age of the articles and the length of the author's publication career. Finally, the citation metric must be easily calculated, particularly if it is to be systematically generated for large numbers of researchers.
When the H, G, and Crown indexes are analyzed against Finch's criteria for selecting a suitable indicator of an author's performance, these indexes fail in one, two or all of the criteria (FINCH, 2010). Furthermore, these indexes are more suitable when the scientific production or the impact factors of the authors under analysis are high. Citation-based performance measures that accurately evaluate authors who publish in average or low impact factor journals are lacking in the literature. Taking these factors into account, the aim of this paper is to assess the citation-based performance of Cuban Biotechnology researchers according to their scientific production in journals indexed in the Web of Science (WoS) database, through the Relative Author Superiority Index.

Literature review
The use of citation counts as an indicator of research impact has its roots in the Anglo-Saxon legal system. A court judging a case has to follow precedents laid down by higher courts; thus, in citing authorities to back up new arguments, a lawyer has to check that they are still valid and have not been overturned by later decisions (DE BELLIS, 2009). Following this procedure, the Californian citation index of sentences was developed in 1860.
In 1955, the chemist Eugene Garfield (GARFIELD, 1955) revolutionized research with his concept of the citation index and citation searching, which led to the present Science Citation Index in 1961. In the 1970s, the citation counts that articles received became an indicator of great relevance in the evaluation of research performance in science. In the literature, there are several metrics to measure an individual's research impact. Many of these indicators are ranking measures that provide quantitative estimates of the relative importance of a scientist within his/her field.
In the 1970s, the use of citation analysis to produce indicators of scientific performance based on citation counts began to attract exceptional attention from policy decision makers because these indicators produced numerical indexes that were easily applied, understood and calculated (DE BELLIS, 2009).
The introduction of citation-based measures to assess the performance of researchers, groups or even countries fostered a race to develop unbiased indicators. This long and winding road has witnessed the development and growth of many indexes. The most cited indexes are the H-index (HIRSCH, 2005), with 5,832 citations in Google Scholar at the time this paper was being written, and the G-index (EGGHE, 2013), with 1,332 citations in Google Scholar.

The H-index
According to De Bellis (2009), the H-index is a better predictor of future individual achievements than traditional indicators based on total citation counts, mean citations per paper, and number of papers. The H-index combines both quantity and impact in one measure (MINGERS, 2008). Hirsch (2005) defined the index as: "a scientist has index h if h of his/her N papers have at least h citations each and the other (N − h) papers have no more than h citations each". Thus, a researcher with an H-index of 10 has 10 publications with at least 10 citations each. The H-index was designed to be applied at a micro level (GLÄNZEL, 2006).
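As an illustration only, Hirsch's definition quoted above can be sketched in Python. The function name and the citation counts are hypothetical and are not taken from the study data.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break     # every later paper has even fewer citations
    return h


# Hypothetical citation counts for one author's seven papers.
example_citations = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example_citations))  # 3
```

In this hypothetical record, three papers have at least three citations each, but there are not four papers with at least four citations, so h = 3. The single paper with 25 citations does not raise the value, which is the "equal weight" limitation discussed in this section.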
The H-index was welcomed by the scientific community as a cornerstone in measuring the impact of a given author, subject matter, journal, institution or even country. Mingers (2008) offers an interesting review of the advantages of this index. Among many others, this author highlights the following: (a) it measures both productivity and influence or impact of papers in a single measure (VÍLCHEZ-ROMÁN, 2014); (b) it is simple to calculate and easily understood (VÍLCHEZ-ROMÁN, 2014); (c) it can be applied at several levels of aggregation: individual, research group, journal or department; (d) any type of research output can be included, it is not affected by outputs with zero citations, and it correlates with other standard bibliometric measures.
Although the H-index has been widely accepted, it has also been criticized because of its drawbacks. For example, Adams (2014) states that the H-index is innately flawed, as it is not responsive to field or career stage, it does not control for the age of documents or citations, and the same group of articles will yield a different H-index in different databases. Regarding the limitations of the H-index, Gajendra and Singh (2009) stated that "though H-index is a powerful technique for measuring performance, it has a number of limitations": (a) it depends on the number of publications; (b) it is affected by the length of the scientist's career (as Vílchez-Román (2014) pointed out, 'the H-index penalize young researchers or those who are at the beginning of their scientific careers'); and (c) 'all highly cited papers get equal weight'. Finally, Waltman and Van Eck (2012) concluded that the H-index cannot be considered an appropriate indicator of a scientist's overall scientific impact. Another drawback of the H-index is that every author receives full credit for a multi-authored paper.
To overcome these limitations, different authors have suggested a series of variations based on the procedure of the H-index. For example, the H2 index was developed by Kosmulski (2013), the Hw-index by Egghe and Rousseau (2013), the tapered H-index by Anderson et al. (2008), the Hi-index by Batista et al. (2006), the Hm by Schreiber (2008), the Hp by Wan et al. (2007), the generalized H-index by Radicchi et al. (2008), Burrell (2007), the rational H-index by Ruane and Tol (2008), the dynamic H-index by Egghe (2007) and the correction factor for H-index by Iglesias and Pecharromán (2007). More recently, Ferrara and Romero (2013) developed the Discounted H-index.

The G-index
The G-index (1,332 citations in Google Scholar) has been the most widely accepted and developed of the H-index variations. It was proposed by Egghe (2013). The G-index was introduced based on the average of highly cited papers (the G-index is greater than or equal to the H-index). Two scientists with equal H-indexes may have different G-indexes. The main advantage of this index is that it overcomes the H-index's limitation of counting highly cited papers the same as lower-cited ones (GAJENDRA; SINGH, 2009). The G-index favors highly cited articles.
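The text does not spell out the calculation, so the sketch below assumes the commonly cited definition of the G-index: the largest g such that the g most cited papers together have at least g² citations. The function name and the citation counts are hypothetical.

```python
def g_index(citations):
    """Largest g such that the top g papers together have at least g*g citations.

    Assumes the commonly cited definition; g is capped at the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    cumulative = 0
    g = 0
    for rank, cites in enumerate(ranked, start=1):
        cumulative += cites
        if cumulative >= rank * rank:
            g = rank  # the top `rank` papers jointly clear the rank^2 bar
    return g


# Hypothetical record: one highly cited paper lifts g well above h.
example_citations = [25, 8, 5, 3, 3, 1, 0]
print(g_index(example_citations))  # 6
```

For this hypothetical record the H-index would be 3, while the cumulative count (45 citations over the first six papers, against a bar of 36) gives g = 6, which illustrates how the G-index rewards highly cited articles.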

The Crown-index
The research group of the Centre for Science and Technology Studies, based at Leiden University (MOED et al., 1995), developed the Crown indicator (458 citations in Google Scholar). Leydesdorff and Opthof (2011) criticized the Crown indicator because of drawbacks in the statistical procedure used, mainly the use of the mean as the measure of central tendency when the median was the appropriate one. Later, the Crown indicator was modified and further developed (LEYDESDORFF; OPTHOF, 2010; MOED, 2010; WALTMAN et al., 2011a; WALTMAN et al., 2011b). Van Raan (2006) presented evidence to back the Crown indicator as an appropriate indicator to measure research performance. The main limitation of this measure is the complexity and the number of operations required to calculate the indicator.
Other authors such as Järvelin and Persson (2008) developed the Discounted Cumulated Impact measure to enhance sensitivity to the age of the publication and the quality of the cited articles. Antonakis and Lalive (2008) created the Index of Quality and Productivity to correct straight citation counts for scholarly productivity, the length of authors' academic careers and the specific citation trends of a field in relation to an expected citation rate. Lundberg (2007) developed the citation z-score to normalize citation impact at the level of the individual publication.
When all these indexes are analyzed against the four criteria proposed by Finch (2010) for the selection of a single good indicator of an author's performance, they fail in one, two or all of the criteria. Furthermore, these indexes are more effective when measuring the citation-based impact of high impact authors or authors who mainly publish their papers in journals with high impact factors.

The Author Superiority Index
Pudovkin and Garfield (2009) suggested the Percentile Rank Index and the Author Superiority Index. Their main idea is that in fields of science with a low citation intensity (e.g., mathematics, taxonomy, or national publications) some other measures might be more effective. This indicator shows the performance of a targeted author against the background of his/her peers by comparing the papers of the target author with the papers of other authors published in the same journal or book. When analyzing the Author Superiority Index (ASI) as a measure of a researcher's performance, Finch (2010) argued that the main limitation of ASI is its dependence on the number of papers. As a solution for this limitation, the Relative ASI (RASI) was introduced.
To measure the citation-based performance of authors from a country in a specific research area, we used the Relative Author Superiority Index. The reasons for using this index were as follows: (1) the index overcomes the self-citation effect; (2) the Relative ASI is an effective measure when authors have a low scientific production and a low number of citations.

Methodological procedures
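The data for the study consisted of 496 Cuban papers on Biotechnology; Applied Microbiology published between 1988 and 2013, included in journals in the WoS database. The search was conducted in March 2014.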

The strategy for data retrieval
At present, the most widely used databases for citation analyses are WoS, Scopus and, more recently, Google Scholar. For the present study, the data were retrieved from the WoS database. The decision to use WoS as the source for the study was based on the reasons given by Ronda-Pupo et al. (2014). It is the world's leading database of publications and citation reports (ADAMS; KING, 2009). Over 20 million researchers from over 3,800 institutions in 98 countries use WoS. This database provides access to information from approximately 8,500 of the most prestigious, influential research journals in the world. WoS produces annual indicators, generally accepted by the scientific community, that allow the performance of the countries analyzed to be measured. Finally, WoS includes the necessary fields for obtaining information and creating data matrixes for quantitative analyses.
The data retrieval process included four steps. First, a general search was run using the 'advanced search' feature, as follows: CU (Country) = Cuba and WC (WoS Category) = Biotechnology; Applied Microbiology, time span = 1988 to 2013, inclusive. We set the upper limit of the time frame at 2013 to avoid including recent articles with few citations in the study.
The document type was 'articles' and the citation index used was the Science Citation Index Expanded, retrieving 685 documents. Second, the records were ranked by the field 'authors' within the citation report window and we selected the 25 authors with the highest scientific production. The results evidenced the presence of Lotka's law (LOTKA, 1926) because 25 researchers account for 72% of the scientific production. Thus, a sample of these 25 authors was selected to measure their citation-based impact. A selected author was not necessarily from a Cuban research institution, but plays an important role in the scientific output of Cuban Biotechnology research.
Third, a citation report for each of the 25 authors was created, ranking the records from highest to lowest according to their number of citations. Finally, we calculated the values for the Relative ASI. In the following section, the procedure to calculate the Relative ASI is explained.

Assessing the Relative Author Superiority Index
The Relative Author Superiority Index is based on the methodology developed by Pudovkin and Garfield (2009). The calculation of this index requires a three-step procedure. First, a Percentile Rank Index (PRI) is obtained for each of the individual papers an author has published. Then, the ASI, which is based on PRI values, is assessed for all the papers published by the author. Finally, the Relative Author Superiority Index, which is based on the values of ASI, is calculated.

Calculation of the Percentile Rank Index
The PRI for each paper is based on the citation rank of the paper being evaluated among all the papers published in the same journal in the year the paper appeared. In this way the drawback of age in previous indexes is solved, because the papers are of the same age as the other papers under comparison. This suggestion is also in line with the approach of Pudovkin and Garfield (2004) for characterizing journal impact factors and overcoming problems with differences in citation frequency in different fields of science.
To calculate the PRI, all the papers published by the targeted author were downloaded from the selected database. The average citation rate was then calculated for each of the author's papers in the journal in which it had been published, by retrieving all the papers published by that journal in the same year. The Percentile Rank Index was calculated using the formula:
$$\mathrm{PRI} = \frac{NP - R + 1}{NP} \times 100$$
where NP stands for the number of papers in the year set of the journal and R for the descending citation rank of the paper (among all the papers published in the journal in the year the target paper appeared). In case of ties (several papers having the same citation frequency), each of the tied papers is assigned the average of the ranks for the tied set. For example, if a target paper is the most cited paper in a journal in a year, its PRI = 100.
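The PRI calculation, including the averaging of tied ranks, can be sketched as follows. This is a minimal illustration rather than the procedure actually run for the study: it takes the denominator to be NP (the number of papers in the journal's year set), and the function name and citation counts are hypothetical.

```python
def percentile_rank_index(target_citations, journal_year_citations):
    """PRI of one paper among all papers of the same journal and year.

    `journal_year_citations` holds the citation counts of every paper in the
    journal's year set, including the target paper itself.
    """
    n_papers = len(journal_year_citations)
    higher = sum(1 for c in journal_year_citations if c > target_citations)
    tied = sum(1 for c in journal_year_citations if c == target_citations)
    if tied == 0:
        raise ValueError("the target paper must be part of the year set")
    # Tied papers occupy descending ranks higher+1 .. higher+tied;
    # each one is assigned the average of those ranks.
    rank = higher + (tied + 1) / 2
    return (n_papers - rank + 1) * 100 / n_papers


# Hypothetical year set of ten papers in one journal.
year_set = [40, 12, 12, 7, 5, 3, 1, 0, 0, 0]
print(percentile_rank_index(40, year_set))  # 100.0 (most cited paper)
print(percentile_rank_index(12, year_set))  # 85.0  (ranks 2 and 3 averaged to 2.5)
print(percentile_rank_index(0, year_set))   # 20.0  (ranks 8 to 10 averaged to 9)
```

Because each paper is compared only with papers from the same journal and year, the age of the publication and the journal's impact factor do not enter the calculation.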
The expected PRI value draws the line that establishes whether an article is above or below expectation. The value of the expected PRI was calculated using the formula suggested by Pudovkin et al. (2012):
$$\mathrm{PRI}_{\mathrm{exp}} = 50 + \frac{1}{NP} \times \frac{100}{2}$$
where NP stands for the number of papers in the year set of the journal.
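One way to see where the expected value comes from is to average the PRI over every paper in a year set. The derivation below is a sketch under the assumption that the papers occupy the descending ranks R = 1, ..., NP (averaged ranks for ties leave the sum of ranks unchanged):

```latex
\frac{1}{NP}\sum_{R=1}^{NP} \frac{NP - R + 1}{NP}\times 100
  = \frac{100}{NP^{2}} \sum_{k=1}^{NP} k
  = \frac{100}{NP^{2}} \cdot \frac{NP\,(NP+1)}{2}
  = 50 + \frac{50}{NP}
```

For a hypothetical year set of NP = 10 papers the expected PRI is 55, and for NP = 200 it is 50.25, so the expected value approaches 50 as the year set grows.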

Assessing the Author Superiority Index
Once the PRI is calculated, the author's overall citation performance is obtained through the ASI. Pudovkin and Garfield (2009) propose three thresholds for ASI: the number of articles with a PRI ≥ 99, the number with a PRI ≥ 95, and the number with a PRI ≥ 75. For the present study we added a fourth threshold, PRI ≥ 50.
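Counting an author's papers at each threshold can be sketched as below. The function name and the PRI values are hypothetical; the thresholds are the four named in the text.

```python
def author_superiority_index(pri_values, thresholds=(99, 95, 75, 50)):
    """Count, for each threshold, the author's papers with PRI >= threshold."""
    return {t: sum(1 for pri in pri_values if pri >= t) for t in thresholds}


# Hypothetical PRI values for one author's seven papers.
author_pri = [100.0, 96.5, 80.0, 62.0, 50.0, 41.0, 12.5]
print(author_superiority_index(author_pri))
# {99: 1, 95: 2, 75: 3, 50: 5}
```

The counts are cumulative: a paper with a PRI of 100 is counted at every threshold.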

Calculation of the Relative Author Superiority Index
When analyzing the ASI as a measure of a researcher's performance, Finch (2010) argued that the main limitation of ASI is that it depends on the number of papers. To overcome this drawback, we created the RASI, which establishes the overall citation-based performance of each individual.

$$\mathrm{RASI}_{\geq 50} = \frac{\mathrm{ASI}_{\geq 50}}{NP}$$
where ASI≥50 is the number of papers of the author under evaluation with a PRI ≥ 50 and NP stands for the author's total number of papers.
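A minimal sketch of the RASI calculation follows. It assumes that ASI≥50 counts the author's papers with PRI ≥ 50 and that NP is the author's own number of papers. The `as_percent` flag is an assumption about the reporting scale, added because the Results quote RASI values such as 100; the function name and the PRI values are hypothetical.

```python
def relative_asi(pri_values, threshold=50, as_percent=False):
    """Share of an author's papers whose PRI is at or above `threshold`."""
    if not pri_values:
        raise ValueError("the author must have at least one paper")
    asi = sum(1 for pri in pri_values if pri >= threshold)
    share = asi / len(pri_values)
    # Assumption: multiplying by 100 reproduces the scale used in the Results.
    return share * 100 if as_percent else share


# Hypothetical authors: one prolific, one with only three papers.
prolific = [100.0, 96.5, 80.0, 62.0, 50.0, 41.0, 12.5]
small_output = [60.0, 70.0, 99.0]
print(round(relative_asi(prolific), 3))               # 0.714
print(relative_asi(small_output, as_percent=True))    # 100.0
```

Dividing by the author's number of papers is what removes the dependence on output size: the author with three papers scores higher than the author with seven, because all three papers are at or above the threshold.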

Results
Twenty-five Cuban researchers accounted for 72% of the overall Cuban scientific output in Biotechnology. These authors produced 496 papers between 1988 and 2013 in 61 international journals. The top ten of the 61 journals accounted for 58% of the articles, while 50% of the journals had fewer than three articles each. The distribution of papers per journal corresponds to Bradford's law (Figure 1).
Table 1 shows the distribution of journals in which more than 10 Cuban articles in Biotechnology appeared. These journals, from the United States of America, the Netherlands and the United Kingdom, accounted for 71% of the papers: 31% appeared in journals from the United States of America, 25% in journals from the Netherlands and 14% in journals from the United Kingdom. As for the impact factor quartile of the journals in the Journal Citation Reports, 33% were from the first quartile, 48% from the second, 16% from the third and 3% from the last quartile. These percentages show that about 80% of the articles appeared in top-tier journals of the discipline.

Assessing the researcher's citation-based performance

Percentile Rank Index
Table 2 shows the values of the indicators of research performance of scholars from the Cuban Biotechnology research group. The expected median value of PRI for the 496 papers is 50. Three hundred and fourteen papers (63%) out of 496 were above the expected median PRI. This shows that these papers are cited above the median of the papers published in the journals in which they appeared. Of the 25 evaluated authors, 23 (92%) have an average PRI above the median expected PRI for their papers. The median PRI value for the 496 papers is 66, which is significantly higher than the expected PRI global median value of 50. Because the variable is not normally distributed, we ran a Mann-Whitney test (U = 90921.500, T = 278350.500, n(small) = 496, n(big) = 496, p < 0.001). The difference in the median values between the two groups is greater than would be expected by chance, and it is statistically significant (p < 0.001).
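The reported test statistics can be cross-checked with the standard identity that links a rank sum to the Mann-Whitney statistic, U = T − n(n + 1)/2. The sketch below assumes that T is the rank sum of one of the two groups of 496 values and that the reported U is 90921.5; the function name is hypothetical.

```python
def u_statistics_from_rank_sum(rank_sum, n_ranked, n_other):
    """Return both Mann-Whitney U values implied by one group's rank sum."""
    u_ranked = rank_sum - n_ranked * (n_ranked + 1) / 2
    u_other = n_ranked * n_other - u_ranked  # the two U values sum to n1 * n2
    return u_ranked, u_other


u_a, u_b = u_statistics_from_rank_sum(278350.5, 496, 496)
print(u_a, u_b)       # 155094.5 90921.5
print(min(u_a, u_b))  # 90921.5
```

Under these assumptions the smaller of the two U values equals the reported statistic, so the reported T and U are mutually consistent.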
The use of the average PRI adds insight to the analysis. For example, authors with few published papers, such as Martin C., show a higher performance than authors with twice his scientific production. That is because 100% of his articles fall within ASI≥50, reflecting high quality papers. Perez R. has both a large scientific oeuvre and high quality papers: 77% of his papers are in ASI≥50. In addition, author Delafuente J., who has an average scientific production, has high quality papers, with 83% of his articles in ASI≥50. This result suggests that PRI is not as sensitive to the number of articles of an author as the H or G indexes are. In addition, there is no correlation between the number of papers and PRI (r_s = 0.071, p = 0.734). This suggests that the PRI focuses on the quality of the papers as defined by the citations they received compared with the rest of the articles published in the same year in the journal in which they appeared. This measure is not sensitive to the age of the publication or the length of the author's career.

published. However, his overall citation score is almost 3 times lower than that of the top author, Villalonga R., while his H-index is only nine, which is much lower than Villalonga's H-index of 16. This researcher is certainly an author of high standing, as 20% of his papers are among the most cited ones of the journal in the year they were published.

Relative Author Superiority index
The Relative ASI also adds interesting insights. For example, the author Martin C. has a low number of papers, an H-index below the median, and a relatively short career (seven years below the median), but his RASI is 100. This means that all his papers are above the 50th percentile; that is, he has a high citation-based performance. In addition, Fragoso A. and Silva R. are not highly productive authors and their citation rate is below the median, but their papers are above the median of the Percentile Rank Index. The results corroborate that the Relative ASI does not correlate with the number of papers, unlike the ASI as described by Finch (2010). This result suggests that the Relative ASI overcomes this drawback.

Conclusion
The results obtained suggest that Cuban researchers in biotechnology have achieved a high citation-based performance within the period analyzed. The evidence that supports this conclusion is as follows:
- 70% of the 496 papers published in journals of the WoS appeared in top-tier international journals.
- 63% of the papers are above the expected median PRI value.
- 11% of the articles are from the 95th to 99th percentiles. This means that those articles are among the most cited papers of the journal in the year they were published.
- The selected indexes to assess the citation-based performance of a group of researchers of a particular discipline in a country showed that:
1) The PRI is not sensitive to the age of the publication, since it analyzes each article individually, comparing it with all articles published within the same journal in the same year the article appeared. As this index assesses the citations of the papers within the same journal, there is no bias derived from a difference in the journal impact factor.
2) The Relative ASI indicator depends less directly on citation numbers. Thus, this index possibly reveals authors with a good relative citation standing when their overall citation score is not high.
According to the results obtained, the Relative ASI is a more effective indicator to assess citation-based performance for the following reasons:
• It correlates highly and significantly with the H and G indexes and it does not correlate with the number of papers, the length of the author's career or self-citations.
• It provides insightful information about the citation-based performance of individuals or a group and it is not mutually exclusive to the H, G or Crown indexes but rather complementary to them.
Although the conclusions are based on this particular field of study, they should hold true for other disciplines. According to the results, it is advisable, before using a single indicator to assess an individual's performance, to test whether that indicator is sensitive to any of the three variables that could hinder its effectiveness in evaluating the individual's performance. These variables are the length of the scientific career of authors, the number of papers and the number of self-citations.
The main limitation of the RASI for assessing authors' citation-based performance is that it is only useful for evaluating individuals. Its application to assessing the performance of journals could be limited. Although it is easy to interpret, it is not easy to calculate, as it requires data from numerous sources and several calculations.

Practical implication for research evaluation
The findings suggest practical implications for policy making at all decision-making levels. In the present study, the Relative Author Superiority Index appears to be very effective for assessing the research performance of a researcher or a group of researchers with an average impact factor in a discipline of a particular country. Furthermore, this indicator may be a useful tool to evaluate the research performance of authors, research groups or fields with low citation behavior. This result has practical implications for assessing the impact of a discipline. For example, the results suggest that when measuring the research performance of individuals with a relatively low impact factor, the Relative Author Superiority Index could be a complementary solution for the main limitations of previous indicators such as the H and G indexes.
The results suggest possible future lines of research such as the comparison of the citation-based performance between groups within a research field or from different countries or institutions.
, the tapered H-index by Anderson et al. (2008), the Hi-index by Batista et al. (2006), the Hm by Schreiber (2008), the Hp by Wan
THE RELATIVE AUTHOR SUPERIORITY INDEX https://doi.org/10.1590/2318-08892017000200006
published between 1988 and 2013, included in journals in the WoS database. The search was conducted in March 2014.

Table 1. Distribution of journals with 10 or more Cuban papers in Biotechnology.