Journal of the Brazilian Computer Society
Print version ISSN 0104-6500
MENA-CHALCO, Jesús Pascual and CESAR JUNIOR, Roberto Marcondes. ScriptLattes: an open-source knowledge extraction system from the Lattes platform. J. Braz. Comp. Soc. [online]. 2009, vol.15, n.4, pp. 31-39. ISSN 0104-6500. http://dx.doi.org/10.1007/BF03194511.
The Lattes platform is the main scientific information system maintained by the National Council for Scientific and Technological Development (CNPq). The platform manages the curricular information of researchers and institutions working in Brazil, based on the so-called Lattes Curriculum. However, the public information is available only individually for each researcher, and the platform does not automatically create reports on the scientific production of research groups. It is therefore difficult to extract and summarize useful knowledge for medium to large groups of researchers. This paper describes the design and implementation of, and experience with, scriptLattes: an open-source system that creates academic reports for groups based on curricula from the Lattes Database. The scriptLattes system is composed of the following modules: (a) data selection, (b) data preprocessing, (c) redundancy treatment, (d) collaboration graph generation among group members, (e) research map generation based on geographical information, and (f) automatic report creation for bibliographical, technical, and artistic production, and for academic supervisions. The system has been extensively tested on a wide variety of research groups at Brazilian institutions, and the generated reports offer an alternative way to easily extract knowledge from data in the context of the Lattes platform. The source code, usage instructions, and examples are available at http://scriptlattes.sourceforge.net/.
Keywords: academic production report; Lattes platform; knowledge discovery.
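To make modules (c) and (d) of the abstract more concrete, the following is a minimal sketch of how redundancy treatment and collaboration graph generation might fit together. It is not the scriptLattes implementation: the record layout, the function names, the member names, the similarity measure (`difflib.SequenceMatcher`), and the 0.9 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title):
    # Lowercase and replace punctuation with spaces so that trivial
    # formatting differences between curricula do not matter.
    kept = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(kept.split())


def same_publication(a, b, threshold=0.9):
    # Hypothetical matching rule: same year and near-identical titles.
    if a["year"] != b["year"]:
        return False
    ratio = SequenceMatcher(None, normalize(a["title"]), normalize(b["title"])).ratio()
    return ratio >= threshold


def merge_group_publications(curricula):
    # curricula maps a member name to a list of {"title", "year"} records.
    # Records judged to be the same publication are merged into one entry
    # that remembers every member who listed it.
    merged = []
    for member, pubs in curricula.items():
        for pub in pubs:
            for record in merged:
                if same_publication(record, pub):
                    record["members"].add(member)
                    break
            else:
                merged.append({"title": pub["title"], "year": pub["year"],
                               "members": {member}})
    return merged


def collaboration_graph(merged):
    # Each merged record shared by two members adds one to the weight of
    # the edge between them.
    edges = {}
    for record in merged:
        for a, b in combinations(sorted(record["members"]), 2):
            edges[(a, b)] = edges.get((a, b), 0) + 1
    return edges


# Fictitious group used only to exercise the sketch.
curricula = {
    "Ana": [{"title": "Graph Mining for Curricula", "year": 2008},
            {"title": "A Survey of Shape Analysis", "year": 2007}],
    "Bruno": [{"title": "Graph mining for curricula.", "year": 2008}],
    "Carla": [{"title": "A survey of shape analysis", "year": 2007},
              {"title": "Graph Mining for Curricula", "year": 2008}],
}
merged = merge_group_publications(curricula)
edges = collaboration_graph(merged)
```

In this toy group, the five listed entries collapse into two distinct publications, and the edge weights count how many merged publications each pair of members shares. A group report would then list each merged publication once instead of once per co-author.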