Mapping cancer, cardiovascular and malaria research in Brazil

This paper presents performance indicators for the Brazilian cancer, cardiovascular and malaria research areas from 1981 to 1995. The data show an increasing number of papers since 1981 and author numbers indicate a continuous growth of the scientific community and suggest an expected impact of scientific activity on biomedical education. The data also characterize cardiovascular research as a well-established area and cancer research as a faster growing consolidating field. The 1989-1994 share of Brazilian articles among world publications shows a growing trend for the cancer (1.61) and cardiovascular (1.59) areas, and a decrease for the malaria area (0.89). The burden of the three diseases on society is contrasted by the small number of consolidated Brazilian research groups, and a questionable balance of thematic activity, especially with regard to malaria. Brazilian periodicals play an important role in increasing the international visibility of science produced in the country. Cancer and cardiovascular research is strongly concentrated in the Southeastern and in Southern regions of Brazil, especially in São Paulo (at least one address from São Paulo in 64.5% of the 962 cancer articles and in 66.9% of the 2250 cardiovascular articles, the second state being Rio de Janeiro with at least one address in 14.1 and 11% of those articles, respectively). Malaria research (468 articles) is more evenly distributed across the country, following the pattern of the endemic distribution of the disease. Surveying these national indicator trends can be useful to establish policies in the decision process about health sciences, medical education and public health.


Introduction
Knowledge has steadily increased life expectancy and there is a widespread hope for science-derived continuous improvement of health (1). Health research is becoming increasingly complex and expensive and, since the resources are limited, emphasis has been placed on applied research, raising questions about the adequacy of supporting basic biomedical science. Data from the World Health Organization (WHO) indicate that only 3.4% of the resources spent on health are directed at research and development (R&D) (1). The balance of competing demands on public spending between health care initiatives and biomedical research permeates the political debate and the establishment of priorities will be a central issue in the decades to come.
In developing countries such debate is further complicated by the changing profile in disease burden and an incomplete epidemiological transition (1). These countries face a considerable burden of infectious diseases along with an increasing frequency of chronic-degenerative diseases, a persistent incidence of maternal and perinatal diseases and a large incidence of external causes. Emergent and reemergent infectious diseases add further complexity to this scenario (1).
In Brazil these multiple challenges have to be faced with limited budgets and the support of a relatively small scientific community as compared to the material and human resources available in developed countries. Moreover, changes in the profile of science investment programs have frequently been made without reference to existing capacities or even with no relation to the most pressing health-related problems. Therefore, organized information on epidemiological data as well as on the output of different scientific fields may help map the existing scientific competence and trends in health research areas, and better inform the decision process of defining health priorities and resource allocations. Such information is distributed in different databases and should be retrieved and organized in order to become useful.
Cancer and cardiovascular diseases are among the most prevalent causes of death in Brazil (2). On the other hand, malaria is one of the increasing threats among emergent infectious diseases; its frequency has increased during the last decades and the disease has been a focus of interest for research in Brazil (3).
We present here the results obtained by mapping the cancer, cardiovascular and malaria research areas in Brazil. Such mapping is intended as a resource to assist the scientific community, funding agencies and government efforts to concentrate resources on strengthening the national research capacity, and enhancing efficiency to find new means and processes of prevention and treatment of diseases. It is expected that this study may help to adjust the focus of the current research developed in the country on the perceived needs as judged by members of the scientific community, the medical community, and the Ministry of Health. Furthermore, it is also expected that the mapping will help to identify not only the output of articles but also the human resources necessary to support science development and health care initiatives as the parameters needed to analyze the results of funding policies implemented during the last fifteen years.

Material and Methods
Brazilian scientific activity organized by subject field was analyzed by surveying papers published from 1981 to 1995, retrieved from the Institute for Scientific Information (ISI) and the US National Library of Medicine (Medline) databases, and using key word filters (4-6).
Information on scientific output from countries like Brazil is usually incomplete in databases like ISI (7). The ISI Science Citation Reports does not include journals or PhD theses written in Portuguese, especially in clinical areas. Furthermore, articles published overseas co-authored by Brazilian scientists, even when supported by government agencies, frequently do not show a Brazilian address. An alternative way to complement information in the health sciences is to retrieve data from the database produced by Medline, which covers a larger set of medical research journals, more clinically oriented and including some written in Portuguese. A major disadvantage of Medline is that it does not record all the addresses listed on a paper, but only that of the first author. Therefore, Brazilian authors are not always found and the articles retrieved are limited to those where Brazil appears in at least one of the Medline search fields (address, title, abstract, etc.). Furthermore, Medline does not provide information concerning citations of published papers.
Merging Medline with ISI information allowed us to partially overcome some of the problems mentioned above and increased by 25, 35 and 44%, respectively, the number of articles related to cancer, cardiovascular and malaria research in our database.

Thematic filters
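The traditional procedure followed to organize filters for subject field analysis is based on the names of specialized journals as categorized by the ISI (5,8). This process, however, is unsatisfactory for thematic analyses because Brazilian authors in the cancer and cardiovascular areas publish a large proportion of their papers in general journals, both local and international, rather than in specialized ones. The search for malaria-related articles is more difficult because journal titles and the ISI classification do not indicate that the subject of an article is malaria.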
In order to select relevant papers from journals other than those directly covering cancer and cardiovascular research and to retrieve malaria-related articles, key word filters were used. The filters were developed following an adaptation of the methods proposed by Lewison (5).
Briefly, an initial set of key words was obtained through a search of the Medline database. The search was carried out using comprehensive words in the following format: "heart or cardiovascular and Brazil" for cardiovascular diseases, "cancer or neoplasm and Brazil" for cancer, and "malaria or Plasmodium and Brazil" for malaria.
The words appearing in the titles of the retrieved articles were listed in descending order of frequency. After elimination of words such as "the" and "of", and technical words not specifically related to the themes targeted, the most frequently used words were selected, forming filter version 1. This filter was applied to the titles of ISI articles and the set of retrieved articles was checked to evaluate the recall and precision of the filter (5).
Maximizing both recall and precision led to filter version 2. This set of words and the articles they permitted us to retrieve from ISI were presented to scientists working in cardiovascular, cancer and malaria research for further corrections and suggestions. A version 3 filter was thus generated.
Titles from ISI and Medline were searched using the version 3 filter key words, and the corresponding articles were loaded into .mdb files allocated to predefined positions, together with the corresponding bibliometric information.
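The evaluation step described above can be sketched as follows. This is only an illustration: the titles, the relevance judgments and the key words are hypothetical, and the whole-word matching rule is an assumption rather than the procedure actually used. Recall is taken as the fraction of relevant titles that the filter retrieves, and precision as the fraction of retrieved titles that are relevant.

```python
def matches(title, keywords):
    """True if any filter key word occurs as a whole word in the title."""
    words = set(title.lower().replace("-", " ").split())
    return any(k in words for k in keywords)


def recall_precision(titles, relevant, keywords):
    """Evaluate a key word filter against a manually judged set of titles.

    titles   -- list of article titles
    relevant -- set of indices of titles judged relevant by a reader
    keywords -- set of lowercase key words forming the filter
    """
    retrieved = {i for i, t in enumerate(titles) if matches(t, keywords)}
    hits = retrieved & relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical titles, for illustration only.
titles = [
    "Plasmodium falciparum resistance in the Amazon region",
    "Malaria transmission in gold mining areas",
    "Vector control and malaria incidence",
    "Falciparum-like parasites in birds",
    "Hypertension in urban populations",
]
relevant = {0, 1, 2}  # hypothetical manual judgment

broad = recall_precision(titles, relevant, {"malaria", "falciparum"})
narrow = recall_precision(titles, relevant, {"malaria"})
print(broad, narrow)
```

In this toy set the broader filter retrieves every relevant title but also one irrelevant one, while the narrower filter is fully precise but misses a relevant title, which is the trade-off that successive filter versions try to balance.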

Loading the data warehouse
A data warehouse was created to collect the information in a relational format, thereby providing data integrity, allowing efficient information retrieval and the establishment of different relationships between parameters. Merging ISI and Medline data requires identifying the same institutions, authors' names, and journal titles appearing in each database in different forms so that the system recognizes this information as identical. Brazilian (Portuguese) names are especially error-prone from the point of view of indexing. Silva JB, da Silva JB, and Dasilva JB are different forms used by an author whose name may be João Baptista da Silva. Authors in Brazil, and indexing systems, complicate the search even further since rigid rules of institutional descriptions are not enforced. The University of São Paulo, one of the leading universities in the country, can be found in at least 23 different forms in the ISI. For all of these routines, a middleware processing set was developed. Data control included the identification of articles indexed in both databases. The elimination of duplicate articles was mostly automatic, but about 25% of them, which presented small differences in their titles, required operator-assisted, case-by-case decisions. The information thus processed was loaded onto a data warehouse star set of tables. This relational database was managed using the Access 97/Seagate Crystal Info 6 software running on a Compaq Deskpro Windows NT 4.0 machine.
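The kind of normalization this middleware has to perform can be sketched as below. The sketch is not the routine actually developed for the study: the function names, the particle list, the alias table and the 0.9 similarity threshold are all assumptions made for illustration, and names are assumed to arrive in the "Surname Initials" form used by the indexes.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Portuguese name particles that indexes keep or drop inconsistently.
PARTICLES = {"da", "de", "do", "das", "dos"}

# Hypothetical curated table for fused forms. Stripping a fused prefix
# automatically would be unsafe (e.g. "Dantas" is not "da Ntas").
ALIASES = {"dasilva": "silva"}


def author_key(name):
    """Map variants such as 'Silva JB' and 'da Silva JB' to one key."""
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    )
    tokens = re.findall(r"[a-z]+", ascii_name.lower())
    if len(tokens) < 2:
        return "".join(tokens)
    initials = tokens[-1]
    surname = "".join(t for t in tokens[:-1] if t not in PARTICLES)
    surname = ALIASES.get(surname, surname)
    return f"{surname} {initials}"


def normalize_title(title):
    """Lowercase a title and drop punctuation and extra spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", title.lower()))


def probably_same_article(title_a, title_b, threshold=0.9):
    """Flag two titles as likely duplicates; borderline pairs go to an operator."""
    ratio = SequenceMatcher(
        None, normalize_title(title_a), normalize_title(title_b)
    ).ratio()
    return ratio >= threshold


keys = {author_key(n) for n in ["Silva JB", "da Silva JB", "Dasilva JB"]}
print(keys)
```

The alias table stands in for the operator-assisted part of the process: cases that cannot be resolved by a safe rule are recorded once and reused, instead of being guessed by the program.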

Bibliometric analysis
Original research was identified from the Science Citation Index (SCI) by selecting three types of publications: articles, notes and reviews. Abstracts presented at meetings, discussions and editorials were not included. Only articles were selected from the Medline database.
The bibliometric analysis included quantitative indicators (number of publications and authors), qualitative indicators of the direct impact (number of citations), institutional and geographical mapping of expertise, and journal publication profile.
Citation numbers include only articles obtained from ISI. An equivalent criterion was adopted for proportional comparisons with the life-science area as a whole, because it was not possible to identify and retrieve with confidence the complete set of Brazilian life-science papers from Medline. Therefore, only ISI papers were used to calculate these proportions.
Authors' address data are very incomplete in both ISI and Medline. Only 60 to 70% of authors' addresses were completely or partially identified. The process of improving information about addresses involved incorporating data from GPESQ, a database organized by the Brazilian National Research Council (9), and required about 30% of operator-assisted corrections during the middleware processing.

Articles and authors
The number of cancer-, cardiovascular- and malaria-related papers has increased since 1981, the growth being more marked from 86-87 onwards. The increase was faster for articles related to cancer than for cardiovascular ones, and the number of malaria articles decreased in 94-95 (Figure 1A-C).
The total number of authors was proportional to the number of papers. Detailed analysis of the authors publishing in each field, organized into five-year periods (81-85, 86-90 and 91-95), showed informative features. A large population of authors, here called surrounding authors, appears in only one or two of the periods and consists mainly of Brazilian graduate students and collaborators, both Brazilian and foreign (Table 1). The authors publishing during all three periods, a core scientific community, are probably responsible for the training of the surrounding population of students who are co-authors of the articles published during each period. A qualitative inspection of the list of surrounding authors (data not shown) indicated that it also includes most of those associated with foreign addresses. The fast growth of the number of surrounding authors is probably related to the increasing state and federal support for graduate students in Brazil. Official governmental policy since the early 70s has resulted in a ten-fold increase of graduate students during the period covered by the present survey (10). From the data in Table 1 it is possible to speculate that, using the same definitions, the next five-year period (1996-2000) may exhibit a significant growth of core authors in the three areas studied. Consolidated human resources, as measured by core authors, are still small, although presenting a growing trend.
Cardiovascular-related research can be described as a long-term well-established area, presenting the largest proportion of papers, core and surrounding authors. Although the output in this area is twice as large as that in cancer, the difference has been decreasing gradually (Figure 1). Research in the general area defined here as cancer is better identified as a consolidating field in Brazil, presenting faster growing numbers of articles and authors during the last decade. This trend may reflect the fact that the growth of molecular biology and immunology during the last two decades was rapidly incorporated in the cancer area while this transition has been slower in cardiovascular research in Brazil.
Even though malaria is a major local health problem, the number of papers and authors in this field was markedly smaller than in the other two areas.
The comparison of the bibliometric data for a country with the total number of papers produced in the world (i.e., indexed in the same database) is an important element in the analysis of relative national emphasis on certain research areas. World and Brazilian data concerning total number of papers in each area were obtained using a similar methodology (4,6) (Table 2). The Brazilian contribution in terms of malaria-related papers was larger than in terms of cancer or cardiovascular papers in both 1989 and 1994 (5.0 and 4.4%, respectively; Table 2). The Brazilian contribution in terms of malaria-related papers, however, decreased by 11%, while its relative local contribution to cardiovascular and cancer papers grew by 59 and 61%, respectively, during the same period (Table 2). These patterns may reflect the proportion of the epidemiological impact of these diseases. If the number of papers is related to the national science investment policy, it should be pointed out that the Brazilian response to the malaria threat is lagging behind that of other developing countries like India and Thailand, where the incidence of malaria is equivalent to that occurring in Brazil (11). India and Thailand each published nearly twice as many malaria-related articles, 122 and 121, respectively (4), as Brazil, whose total in the same area was 68 in 1984, 1989 and 1994. The building of a more extensive science base in malaria-related research, therefore, in view of the reemergence of malaria infection in epidemic proportions (3), would require special programs and focused investments since, as shown above, the core base is small and the output in this area is losing the relative importance it once had in the world literature.
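The distinction between core and surrounding authors lends itself to a short sketch. The period boundaries and the definitions follow the text (core authors publish in all three five-year periods, surrounding authors in only one or two); the function names, the input format and the sample records are hypothetical.

```python
from collections import defaultdict

# Five-year periods used in the text.
PERIODS = [(1981, 1985), (1986, 1990), (1991, 1995)]


def period_index(year):
    """Return the index of the period containing the year, or None."""
    for i, (start, end) in enumerate(PERIODS):
        if start <= year <= end:
            return i
    return None


def classify_authors(records):
    """Split authors into core and surrounding sets.

    records -- iterable of (author_key, publication_year) pairs,
               one pair per author per article.
    """
    periods_seen = defaultdict(set)
    for author, year in records:
        idx = period_index(year)
        if idx is not None:
            periods_seen[author].add(idx)
    core = {a for a, seen in periods_seen.items() if len(seen) == len(PERIODS)}
    surrounding = set(periods_seen) - core
    return core, surrounding


# Hypothetical records, for illustration only.
records = [
    ("author a", 1982), ("author a", 1988), ("author a", 1994),
    ("author b", 1987), ("author b", 1989),
    ("author c", 1995),
]
core, surrounding = classify_authors(records)
print(sorted(core), sorted(surrounding))
```

Because the classification depends on the author key, it is only as good as the name normalization that precedes it: an author indexed under two spellings could be counted as two surrounding authors instead of one core author.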
The relative importance of these three areas in the health sciences in Brazil is shown
Biomedical research and epidemiological indicators in the areas studied can be compared by analyzing their relative proportions among the life sciences publications and disease morbidity and mortality. This correlation can be viewed as an approximate relationship between biomedical research and disease burden.
In order to relate research to disease burden we have defined two parameters: M, representing the relative weight, as percentage, of a particular disease in relation to the total number of cases, and P, representing the relative proportion, as percentage, of articles published in the area in relation to the total number of articles in the life sciences. The P/M ratios, therefore, represent the relationship between research and disease burden in each area.
The shares of cancer and cardiovascular publications have increased since 1981 (Figure 2). The morbidity and mortality Ms of the corresponding diseases remained relatively constant until 1995 (Table 3A and B).
The P/M ratios for cancer morbidity and mortality have slowly increased in the last 15 years (Table 3). In the cardiovascular area P/M ratios are significantly lower than in the cancer area and the rate of increase is only significant for morbidity. Such indicators, our large population (157 million in 1996; Ref. 12), and the burden of these diseases for society (see Table 3A and B) are in contrast to the small numbers of core biomedical research groups (see above) in Brazil, and a questionable balance of prevailing thematic activity.
The situation with respect to malaria is more worrisome. Deaths from malaria in Brazil are not frequent and were not included in the present analysis (3). The morbidity indices presented in Table 3C were obtained from positive blood testing for malaria (per 100,000 inhabitants) and cannot be directly compared with morbidity rates for cancer and cardiovascular diseases, which were obtained from data concerning patient hospitalization. Nevertheless, such numbers were taken as a more reliable approximation for calculating the share of malaria morbidity among the total number of hospitalizations for all diseases than hospital admissions recorded for this disease. Malaria epidemics doubled in the late 70s and early 80s, and have reached a plateau since 1991 (Table 3C; Ref. 3). The P/M ratios have remained in the 0.3 range for more than 14 years and, as discussed above (Table 1), the number of core authors in this field is remarkably low, even by Brazilian standards.
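A minimal sketch of the P/M calculation is given below. The function and parameter names are assumptions and the input numbers are hypothetical, chosen only to make the arithmetic easy to follow; they are not values from Table 3.

```python
def percentage(part, total):
    """Share of `part` in `total`, as a percentage."""
    return 100.0 * part / total


def pm_ratio(area_articles, life_science_articles, disease_cases, all_cases):
    """Ratio between research share (P) and disease burden share (M).

    P -- articles in the area as a percentage of all life-science articles
    M -- cases (or deaths) from the disease as a percentage of all cases
    """
    p = percentage(area_articles, life_science_articles)
    m = percentage(disease_cases, all_cases)
    return p / m


# Hypothetical inputs: 30 of 1000 articles (P = 3.0%) and
# 500 of 10000 cases (M = 5.0%).
ratio = pm_ratio(30, 1000, 500, 10000)
print(round(ratio, 2))
```

A ratio below 1 means that the research share of an area is smaller than its share of the disease burden, and a ratio above 1 means the opposite. Since P and M come from different sources, the ratio is best read as a trend indicator over time rather than as an absolute measure.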

P.S. Rodrigues et al.
Table 3 - Morbidity and mortality numbers per 100,000 inhabitants were used as indicators of disease burden.
Morbidity numbers for cancer and cardiovascular diseases were obtained from data about patient hospitalization, while morbidity numbers for malaria were obtained from positive blood testing for this disease. M: Represents morbidity/mortality for cancer, cardiovascular disease and malaria with the relative weight (%) of each disease within the total disease burden indices. P: Relative proportion (%) of article output in each area to the total number of articles in the life sciences. Cancer: The 1995 indices were 257.2 for morbidity and 63.6 for mortality. Cardiovascular: The 1995 indices were 808.9 for morbidity and 157.0 for mortality.

Impact of scientific output
The impact of Brazilian papers (citations during the 5 years following their publication divided by the respective numbers of papers) is shown in Table 4. The impact of papers on cardiovascular research ranged from 3.3 to 5.4 throughout this period. The 5.4 peak is the result of highly cited papers from 1981 to 1984: seven of the ten most cited cardiovascular papers of the 81-95 period were published during 81-84. The impact for papers in the cancer area grew from 3.6 (81-82) to 5.7 (89-90) and presented a peak of 11.2 in the years 83-84, due mostly to a single article that received 67% of the whole set of citations. The impact of malaria articles reached its highest values in 83-84 and 85-86, also due to citations concentrated on two articles published in 1984 and 1986; during the other years the citations were more uniformly distributed among articles and the impact ranged from 3.2 to 3.8. It is evident that the larger numbers of articles behind the life science impact produced more reliable values, and that impact analysis should be undertaken with caution when dealing with the relatively small numbers of research articles on cancer, cardiovascular disease and malaria. Nevertheless, the numbers in Table 4 show a group of highly cited papers published during the early eighties; they also show a trend toward an increasing impact in the life sciences and in the cancer area.
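The sensitivity of this indicator to a single highly cited paper can be illustrated with a few lines of code. The citation counts below are hypothetical and deliberately small; they are not the data behind Table 4, and the function names are assumptions.

```python
def impact(citations):
    """Mean number of citations per paper for one publication period.

    citations -- list with the citations each paper received during
                 the 5 years following its publication
    """
    return sum(citations) / len(citations)


def top_paper_share(citations):
    """Fraction of all citations received by the most cited paper."""
    return max(citations) / sum(citations)


# Hypothetical period: four ordinary papers and one highly cited paper.
citations = [2, 3, 4, 3, 48]

with_outlier = impact(citations)
without_outlier = impact(sorted(citations)[:-1])
share = top_paper_share(citations)
print(with_outlier, without_outlier, share)
```

In this toy period the mean falls from 12.0 to 3.0 when the most cited paper is removed, which is why the text recommends caution when the number of papers per period is small. Reporting the share of citations held by the top paper alongside the mean is one simple way to flag such cases.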

The impact of scientific output can also be tentatively considered from the viewpoint of author number, which can be seen as an indicator of the influence on biomedical education. The existence of a fast growing population of authors restricted to articles appearing in only one five-year period of time (Table 1) is probably related to the increasing number of graduate students co-authoring papers. A large proportion of such authors are thus being incorporated into the biomedical labor force, rather than into a formal scientific career, during or after a period of highly specialized scientific training, which involved the publication of at least one indexed paper. This figure led us to suggest that both article and human resource outputs should be considered when analyzing the performance of Brazilian science. Such numbers may be particularly useful when correlating performance with funding and accountability estimates, an analysis frequently based only on publication output.

Journal and article visibility
The distribution of Brazilian publications in the current literature, represented by journals and their visibility as measured by the citations the articles received and the ISI impact factor, is presented in Table 5. The table contains a ranking of 20 journals that either published the highest number of articles or received the highest number of citations during the 1981-1995 period. Citations and the citation/article proportion prevail among international journals, and 40% of journals listed in Table 5 have an ISI impact factor greater than 2.0. Over 30% of the cancer and cardiovascular output appearing in Table 5 journals is published in higher impact publications (≥2.0) while less than 10% of the malaria papers appearing in Table 5 are found in such higher impact journals.
The best ISI-rated national publication is the Brazilian Journal of Medical and Biological Research. Furthermore, it is the journal publishing most articles on cancer and cardiovascular research and its contribution to the visibility of national science is higher for cardiovascular than for cancer research. The presence of the Brazilian Journal of Medical and Biological Research in the two rankings of Table 5 shows its importance for the diffusion of Brazilian science, while articles appearing in the Revista Paulista de Medicina and Revista Brasileira de Medicina barely received any citations, although ranking second and third, respectively, in publication numbers. In malaria the presence of Brazilian publications is higher than in the other two areas: among the 7 journals publishing most of the Brazilian articles, 6 are Brazilian publications and the most important ones are those specialized in tropical diseases. Furthermore, in this area, the articles published in these low impact journals receive a fair number of citations (for publication data, ISI and Medline, and for citation data, ISI data only).

Table 5 - Ranking of the 20 journals that either published the highest number of articles or collected most citations during the 1981-1995 period.
The ranking is established in descending order of number of articles and citations. Numbers in parentheses are ISI's impact factor for each journal for 2 years. Some journals listed below are not indexed by ISI and had no citations or impact factor recorded in the ISI database.

Numbers and geographical distribution of articles and authors
Cancer and cardiovascular research is concentrated mainly in the Southeastern and in the Southern regions of the country: 64.5% of the cancer articles have at least one address from São Paulo, 14.1% at least one from Rio de Janeiro, 4.0% from Rio Grande do Sul and 4.1% from Minas Gerais (N = 962). Other state addresses are present in only 11.2% of the papers, and foreign Brazil-associated addresses in 20.9%. In the cardiovascular area São Paulo appears in 66.9% of the articles, Rio de Janeiro in 11%, Rio Grande do Sul in 6.4%, and Minas Gerais in 5.7% (N = 2250). Other states are present in 13.2% of the articles and foreign Brazil-associated addresses in 11.2%. The malaria research output is more evenly distributed across the country. The shares of São Paulo and Rio de Janeiro are closer, 36.1 and 21.6%, respectively (N = 468), probably due to the pioneering work of Fundação Oswaldo Cruz, a leading organization of research on tropical diseases located in Rio de Janeiro. Furthermore, the number of articles with addresses from states containing endemic regions like Amazonas (N = 39), Pará (N = 55), Brasília-DF (N = 36), Mato Grosso (N = 4), Goiás (N = 5), and Roraima (N = 5) corresponds to a 30.8% share, much higher than in the other two fields.
Although about 65% of all articles have a São Paulo address, the presence of authors from this state in articles is proportionally smaller (~32%). Furthermore, the ratios of author appearances per author (B/A) and of articles per author (C/A), which may represent productivity, are also more balanced among the Brazilian states (Table 6). These data also emphasize the sizeable contribution to the cardiovascular area by authors from Santa Catarina and especially Espírito Santo, which are states still hosting a relatively small and recently formed scientific community. The experience concerning the formation of these scientific groups can be used in the effort to extend scientific competence throughout the country. The share of malaria authors in each state, as also observed for articles, follows a more even geographical distribution strongly influenced by the endemic patterns of the disease, and the profile of article output has not changed during the 81-95 period (Figure 3C). Meanwhile the gap between São Paulo and the rest of the country in cardiovascular and cancer research areas has increased starting in the mid-eighties, and is represented in Figure 3A and B by a more accelerated pattern of the article output of the state. This trend probably results from the successful combination of support from Federal Government Agencies and from Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP, State Foundation for Research Support). In Brazil FAPESP is a paradigmatic example of a state funding agency.
The disparities in economic wealth between São Paulo and the rest of the country have raised much political concern in Brazil. These tensions have also permeated the scientific community since the gap in research conditions is progressively frustrating a larger number of Brazilian scientists. Furthermore, if research spending is also intended to enhance the capability to deliver high quality health care to patients throughout the country, then a more equitable distribution of clinical and basic research that takes into account the population needs all over the country may be desirable.
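The 30.8% share quoted above for the endemic states can be checked directly from the article counts given in the text. The sketch below simply sums those counts and divides by the 468 malaria articles; the variable names are illustrative. Since one article can carry addresses from more than one state, the sum is an upper bound on the number of distinct articles.

```python
# Malaria article counts by state address, as quoted in the text.
endemic_articles = {
    "Amazonas": 39,
    "Pará": 55,
    "Brasília-DF": 36,
    "Mato Grosso": 4,
    "Goiás": 5,
    "Roraima": 5,
}
total_malaria_articles = 468

endemic_total = sum(endemic_articles.values())
endemic_share = 100.0 * endemic_total / total_malaria_articles
print(endemic_total, round(endemic_share, 1))  # 144 30.8
```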

Concluding remarks
The results reported here show that the ISI and Medline databases can be merged to produce reliable performance indicators for the mapping of scientific capabilities and for monitoring scientific activity in Brazil in health-related areas such as cancer, cardiovascular disease and malaria.
The comparative profile of number of articles presents a positive trend for cancer and cardiovascular research, while indicating that malaria research is lagging behind. Article and author numbers together allow a better accountability of research results. Author numbers grouped in specified periods may help to define the core of the scientific community and also to identify the number of individuals only transiently involved in research projects. These numbers can be used to extend the analyses of the results of scientific activity, frequently based on publication outputs, to its educational impact on individuals not engaged in a scientific career but possibly joining the biomedical labor force after experiencing highly specialized scientific training.
The present data also characterize the cardiovascular area as a well-established research area and cancer as a faster growing consolidating field. Both areas present increasing shares of world publications. On the other hand, the burden of these diseases on society is contrasted by the small numbers of consolidated biomedical research groups in Brazil, and a questionable balance of prevailing thematic activity, the situation being more worrisome with regard to malaria.
Brazilian journals play an important role in increasing the international visibility of science produced in the country and therefore should be supported selectively in order to improve opportunities for global collaborative work and raise the interest of international funding agencies in scientific work based in Brazil.
The geography of national publications shows a remarkable trend toward concentrating scientific activity in the cardiovascular and cancer research fields in São Paulo. Such concentration is probably present in many other areas, malaria research being the exception rather than the rule. São Paulo's leading role is very probably the result of successful state-based funding policies directed at R&D. This concentration deserves special attention from Brazilian science administrators. Budgetary priorities should be intended to enhance the ability to deliver high quality health care to patients and to provide medical education in the country as a whole, and combine a more equitable distribution of clinical and basic research that takes into account the population needs all over the country, with the ability to support a small number of high quality science centers.
The survey of the trends for performance indicators presented here can be useful to inform the decision process about policies, especially long-term policies directed at health science research, medical education and public health priorities.
Brazilian authors involved in cancer and cardiovascular research publish a large proportion of their papers (N = 751/3,212 in our sample) in local general journals such as the Brazilian Journal of Medical and Biological Research (N = 301), Memórias do Instituto Oswaldo Cruz (N = 34), Revista Brasileira de Genética (N = 23), Revista Brasileira de Medicina (N = 118), Revista Paulista de Medicina (N = 58) and Revista de Saúde Pública (N = 61), and/or in international journals not specifically related to cancer or cardiovascular research such as the European Journal of Pharmacology (N = 29), American Journal of Physiology (N = 49), Journal of Urology (N = 27), American Journal of Tropical Medicine and Hygiene (N = 29), and British Journal of Pharmacology (N = 22).

Figure 1 - Article output and trends for author numbers. Ordinates: Left, number of authors; right, number of articles. Abscissa: Two-year periods from 82-83 to 94-95 in which author and article numbers were grouped for each research area analyzed: cancer (A), cardiovascular (B) and malaria (C). Data from both ISI and Medline.

Figure 2 - Values express the proportion (%) between number of articles on cancer, cardiovascular disease and malaria (ISI data only) and the articles of life sciences as a whole. These articles included all ISI catcodes (categories of specialized or multidisciplinary journals classified by the ISI; more details in Ref. 8) related to life sciences.

Figure 3 - Geographical distribution of scientific output trends. Values compare the number of articles with at least one address from São Paulo and the number of articles with at least one address from institutions from other states, from 82-83 to 94-95, in cancer (A), cardiovascular (B) and malaria (C) research fields. Data from both ISI and Medline.

Table 1 - Authors grouped into five-year periods. Numbers in columns do not overlap. Core authors: Numbers of authors appearing in articles published during the three periods.

Table 2 - Brazil's share of world publications in cancer, cardiovascular and malaria research.

Table 4 - Impact factor calculated by counting citations received by articles during the 5 years following their publication and dividing citations by the respective numbers of papers published in the same period.

Table 6 - Geographical distribution of science output (81-95) in the Southeast and Southern regions of Brazil. States: Southeast and South states. A: Number of authors with the indicated state address (in parentheses, proportion of identified state addresses as related to total number of authors); B: numbers indicating the frequency at which the respective state address and each author appear in the collected articles; C: number of articles presenting the indicated state address (the same article can be assigned to more than one state; in parentheses, proportions of total number of articles); D: total number of citations received by articles in C from 1981 to 1995. Foreign addresses: Authors and articles associated with foreign addresses.