Abstract
This article presents a strategy for inferring the meaning of words in different contexts by combining classification algorithms such as kNN, WiSARD, and 1NN with a language model. The main objective is to investigate how the term “archive” is used in journalistic articles and how this usage reflects the value placed on the work of archivists. To this end, texts published in the newspaper “A Tribuna” between 2003 and 2017 were analyzed. The method automatically classifies sentences containing the term “archive” into eleven categories, each representing a different interpretation of the term. The classification algorithm was trained to identify semantic patterns in these sentences. The study analyzes textual data extracted from the digital collection of a periodical and involved no direct participation of human subjects. The results indicate that combining the language model with the neural network significantly improves classification performance, surpassing traditional methods in metrics such as precision and recall. The analysis also showed that journalists use the term “archive” in a wide range of contexts, revealing multiple meanings and highlighting the importance of archivists in organizing and documenting records. The proposed approach shows potential for application in other domains, contributing to the automation of semantic inference and the classification of large volumes of textual data.
Keywords
Contextual meaning; Machine learning; Natural language processing; Semantic classification; Text analysis
Source: Prepared by the authors (2024).