How to find the most important keywords in a corpus with WordSmith Tools

One of the most sensitive issues in a keyword analysis with WordSmith Tools is selecting the subset of words in a corpus that deserve to be examined in greater detail. This selection is normally needed because a keyword list can run to several hundred items, and sometimes to 1,500 or more. One way to make the selection is to pull out 'exclusive keywords'. This key lexis is made up of keywords that occur in a single corpus only when its keyword list is compared with a bank of keyword lists. Comparing several keyword lists with one another is a demanding task, however, and most users of WordSmith Tools cannot be expected to undertake it. An alternative is to apply a general cut-off point, established through previous uses of the keyword bank. Such a cut-off point would indicate the section of a keyword list where exclusive keywords are most likely to be found. The results obtained here suggest that the area corresponding to the top 31% to 53% of a keyword list is more likely to contain exclusive keywords.
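
The idea of exclusive keywords and of a positional cut-off can be illustrated with a minimal sketch. It is not part of WordSmith Tools and is not the procedure used in the study. It assumes that each keyword list is already exported as a plain list ordered from highest to lowest keyness, and that the 31% to 53% band refers to the relative rank position within that list. All names (`bank`, `exclusive_keywords`, `relative_positions`, `in_band`) and the toy data are hypothetical.

```python
# Toy keyword bank (hypothetical data): corpus name -> keywords,
# assumed to be ordered from highest to lowest keyness.
bank = {
    "corpus_a": ["contract", "the", "party", "clause", "of"],
    "corpus_b": ["the", "goal", "of", "match"],
}


def exclusive_keywords(target, bank):
    """Return the keywords of `target` that occur in no other list in the bank."""
    seen_elsewhere = set()
    for name, keywords in bank.items():
        if name != target:
            seen_elsewhere.update(keywords)
    # Preserve the original keyness order of the target list.
    return [kw for kw in bank[target] if kw not in seen_elsewhere]


def relative_positions(target, bank):
    """Map each exclusive keyword to its rank position as a percentage of the list."""
    keywords = bank[target]
    exclusive = set(exclusive_keywords(target, bank))
    n = len(keywords)
    return {
        kw: 100 * (rank + 1) / n
        for rank, kw in enumerate(keywords)
        if kw in exclusive
    }


def in_band(position, low=31.0, high=53.0):
    """Check whether a percentage position falls inside the assumed cut-off band."""
    return low <= position <= high


if __name__ == "__main__":
    print(exclusive_keywords("corpus_a", bank))
    print(relative_positions("corpus_a", bank))
```

With real data, the share of exclusive keywords falling inside `in_band` could be compared across the lists in the bank to check how well a general cut-off holds for a given collection.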

Corpora; key words; WordSmith Tools; key lexis


Pontifícia Universidade Católica de São Paulo - PUC-SP PUC-SP - LAEL, Rua Monte Alegre 984, 4B-02, São Paulo, SP 05014-001, Brasil, Tel.: +55 11 3670-8374 - São Paulo - SP - Brazil
E-mail: delta@pucsp.br