Reclink: an application for database linkage implementing the probabilistic record linkage method

This paper presents a system for database linkage based on the probabilistic record linkage technique, developed in the C++ language with the Borland C++ Builder version 3.0 programming environment. The system was tested in the linkage of data sources of different sizes, evaluated both in terms of processing time and sensitivity for identifying true record pairs. Significantly less time was spent in record processing when the program was used, as compared to manual processing, especially in situations where larger databases were used. Manual and automatic processes had equivalent sensitivities in situations where we used databases with fewer records. However, as the number of records grew we noticed a clear reduction in the sensitivity of the manual process, but not in the automatic one. Although in its initial stage of development, the system performed well in terms of both processing speed and sensitivity. Although overall performance of algorithms was satisfactory, we intend to evaluate other routines in the attempt to improve the system's performance.

Information Systems; Software; Data Comparability


Escola Nacional de Saúde Pública Sergio Arouca, Fundação Oswaldo Cruz Rua Leopoldo Bulhões, 1480 , 21041-210 Rio de Janeiro RJ Brazil, Tel.:+55 21 2598-2511, Fax: +55 21 2598-2737 / +55 21 2598-2514 - Rio de Janeiro - RJ - Brazil
E-mail: cadernos@ensp.fiocruz.br