SciELO - Scientific Electronic Library Online

 

Ciência da Informação

Print version ISSN 0100-1965

Abstract

GOMES, Georgia Regina Rodrigues and MORAES FILHO, Rubens de Oliveira. Automatic categorization of digital documents. Ci. Inf. [online]. 2011, vol.40, n.1, pp. 68-76. ISSN 0100-1965. http://dx.doi.org/10.1590/S0100-19652011000100005.

The evolution of information technology and the spread of digital documents on the Web call for a mechanism to organize such documents in order to facilitate search and retrieval. In digital libraries and repositories of electronic works, for example, there is a need for tools that classify documents automatically, since classification (categorization) is currently done manually. Such a tool would be an important resource and support for cataloging. This article presents the development of a tool whose chief objective is to categorize digital documents automatically into pre-established categories, assigning each document to one or more categories according to its content and thus making classification more efficient and faster. Text mining techniques and algorithms were used to develop and validate the tool. For the case study, several categories were defined, such as information technology, law, and physics, together with the terms related to each.

Keywords: Information technology; Categorization; Digital libraries; Text mining; Digital documents.
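
The abstract describes assigning each document to one or more pre-established categories based on its content, using terms related to each category. The sketch below illustrates that general idea with simple term counting. It is not the authors' implementation: the abstract does not say which text mining algorithms were used, and the term lists, the `categorize` function, and the `min_hits` threshold are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical category vocabularies. The abstract names the categories
# but does not list their related terms, so these are illustrative only.
CATEGORY_TERMS = {
    "information technology": {"software", "algorithm", "database", "network"},
    "law": {"court", "statute", "contract", "jurisdiction"},
    "physics": {"quantum", "particle", "energy", "momentum"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def categorize(text, min_hits=2):
    """Return, in alphabetical order, every category whose related terms
    occur at least min_hits times in the text (assumed threshold)."""
    counts = Counter(tokenize(text))
    scores = {
        category: sum(counts[term] for term in terms)
        for category, terms in CATEGORY_TERMS.items()
    }
    return sorted(c for c, score in scores.items() if score >= min_hits)


if __name__ == "__main__":
    sample = (
        "The court ruled that the software contract was void; "
        "the algorithm and database were licensed separately."
    )
    print(categorize(sample))
```

Because a document is compared against every category independently, it can receive more than one label, which matches the multi-category assignment the abstract mentions. A fuller system would likely add stop-word removal, stemming, and term weighting, none of which is shown here.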
