Electronic medical records in primary care: management of duplicate records and a contribution to epidemiological studies

Primary health care electronic medical records were analyzed in Rio de Janeiro for two chronic diseases, hypertension and diabetes, in a population-based study with a cross-sectional epidemiological design that considered the Rio de Janeiro population enrolled in Family Health Teams. Prevalence rates were stratified by gender and age group, and disease status was taken from the ICD-10 diagnoses recorded by family doctors during their visits. Except for the last two age groups (75-79 years and 80 years and over), in which diagnosed cases appear to be under-registered, a positive association was found between prevalence rates and age in both genders. The generation of objective and reliable statistical information is fundamental for local management, allowing the evaluation of demographic dynamics and the peculiarities of each territory, and assisting in the planning and monitoring of the quality of the records of Rio de Janeiro residents registered in each family health unit. Thus, the regular management of duplicate records in the registered user roster is essential to minimize the over-registration of clinical cases reported in the electronic medical records.


Introduction: contextualizing the use of electronic records in the health area
Health information technology (the hardware, software, and infrastructure needed to collect, store, and exchange patient information in clinical practice) has been changing healthcare around the world 1 . Its implementation is a complex process involving several technical, human, individual and organizational dimensions 2 . The first step in using the software consists of recording the registration and administrative data of all health care providers (the units and health professionals of the services that employ medical records). Usability must also be planned and maximized to gain adherence, efficiency, and quality in user registration. This requires that managers, doctors, and users first prepare and validate a data dictionary, an important component that must be developed when building an information system and that describes the layout of its variables 3 .
Hendy et al. 4 highlight the long period required to implement electronic health records in the UK experience. In 1998, it was estimated at 10 years. However, after four years, only 3% of the trusts had met the target. The reasons given were the decentralization of funding to the local level and the lack of standardized health information technologies. Rozenblum et al. 5 , on the other hand, describe a nationwide Canadian project started in 2001 that aimed to implement a single information system with interoperable electronic health records. As the main barriers, the authors noted the lack of an e-health policy and of engagement by clinical doctors, the decision to centralize on a single national interoperability model instead of outlining a regional approach and standards 6 , the difficulty of making the implementation plan more flexible, and the non-use of clinical pay-for-performance indicators. Moreover, it was hard to measure the goals and results against the investment made 7 . Analyzing the rate of implementation of electronic medical records by doctors in the provinces of Canada, Chang & Gupta 8 highlight, first of all, the regional gap in the level of implementation. They point out that funding is not the main barrier in this country, and that the existence of a "super-user" clinical physician, flexible regional leadership for the implementation, and local support are factors that have contributed to its success.
The literature highlights Denmark's role in successfully achieving this goal. In a comparative report on the health systems of Germany, the United Kingdom, France, the Netherlands and Denmark, Freudmann & Studer 9 highlight that Denmark achieved the best results in the use of electronic medical records, which reduced the time spent completing paper forms by an average of 50 minutes per day and increased the number of patients seen by 10%.
Some authors 10 point out that these records are a broad, low-cost data source that allows longitudinal comparisons, cohort studies, quality assessment, and clinical trials with large sample sizes 11 . They can also be used for health surveillance, particularly for chronic diseases such as diabetes, hypertension, and cardiovascular diseases, which contribute in no small part to the morbimortality of the Brazilian population 12,13 . In 2019, the Ministry of Health 14 published the results of the most recent edition of the Risk and Protective Factors for Chronic Diseases by Telephone Survey (VIGITEL 2018) 15 , highlighting the city of Rio de Janeiro as the federative unit with the highest prevalence of hypertension (31.2%) and diabetes (9.8%), above the mean of the country's capitals (24.7% and 7.7%, respectively). It is relevant to highlight that medical record data represent the population's health (clinical morbidity) with greater power and representativeness than population surveys such as VIGITEL, which consider self-reported morbidity.
In this paper, we use the term electronic health records to describe a set of integrated, interoperable and customizable modules used in the daily care of primary health care services 16 . These modules cover administrative, registration and clinical bases, with individual actions and group activities, procedures, medical consultations recorded as per the International Classification of Diseases (ICD-10) and the International Classification of Primary Care (ICPC), immunization, home visits, laboratory and exam results, and medication and problem lists, and they are accessed by all team professionals that are part of the services. These records can be linked with other databases using a unique identification code such as the Individual Taxpayer Registration Number (CPF). Furthermore, from a geographical standpoint, a country's postal code 17 and Census data can be used to obtain additional information on social determinants in health, including for academic research 18 .
This paper aims to analyze the electronic medical records of the Family Health Teams in the city of Rio de Janeiro, presenting the process of managing duplicate records, which allowed calculating the prevalence rates of two of the most important chronic diseases to be monitored in PHC, namely, hypertension and diabetes.

Methods
This is a population-based, cross-sectional epidemiological study that considered (i) the administrative bases related to the registration data of Rio de Janeiro's population enrolled with "Family Health Teams" in Municipal Centers or Family Clinics, and (ii) the demographic and clinical records of the population of a certain area of the city of Rio de Janeiro, which started the implementation and use of electronic medical records in PHC in the mid-2010s. This area was chosen because it was the first area of the city to universalize the use of electronic medical records and because it hosts a Family and Community Medicine Residency Program 19 .
The municipality of Rio de Janeiro is the second most populous in Brazil, the 1229th in geographical area, and the 18th in population density. Its strategic location in the country and its historical, political, economic and social importance have been attracting the attention of researchers worldwide to the PHC Reform implemented since 2009, as highlighted by the then President of WONCA, Professor Amanda Howe 20 . In 2016, the city held the Twenty-First World Conference of Family Doctors (WONCA) 21 and presented its "Primary Health Care Reform" 22 , under which more than four million Rio de Janeiro dwellers came to be served by 1,244 primary health care teams of their own, in operation by December 2016 (including Family Clinics launched in the last quarter of 2016, which still had no records in the National Registry of Health Establishments (CNES) in December of the same year). This represents around 66% real coverage among its residents, calculated from the registries of electronic medical records after management of duplicate records.
From the perspective of the Ministry of Health, the city was, in 2016, the municipality in Brazil with the second-largest number of people potentially covered by Family Health Teams in that period ("potential coverage", whose numerator is calculated as the number of ESF x an average of 3,450 people registered per ESF). However, after 2017, a significant overall reduction was observed in the number of Rio de Janeiro's teams and in the number of community workers per team. Consequently, the potential coverage of ESF dropped to 45.3% in September 2019 (882 teams 23 x an average of 3,450 people, divided by 6,718,903 inhabitants, the population estimated for 2019 by IBGE).
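The potential coverage arithmetic quoted above can be reproduced with a short sketch. The function name and its parameters are illustrative assumptions, not part of any Ministry of Health tool; only the figures (882 teams, 3,450 people per team, 6,718,903 inhabitants) come from the text.

```python
def potential_coverage(teams: int, population: int, people_per_team: int = 3450) -> float:
    """Return potential ESF coverage as a percentage of the population.

    Numerator: number of teams x average people registered per team.
    Denominator: estimated resident population.
    The function and argument names are hypothetical.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * teams * people_per_team / population


# September 2019 figures quoted in the text.
coverage_2019 = potential_coverage(teams=882, population=6_718_903)
print(f"{coverage_2019:.1f}%")  # 45.3%
```

Because the numerator assumes a fixed average roster per team, this indicator measures capacity rather than the people actually registered, which is why the text distinguishes it from real coverage.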
The units that develop the Primary Health Care actions in the municipality of Rio de Janeiro recorded in the National Registry of Health Establishments (CNES) of the Ministry of Health are the Municipal Health Centers and Family Clinics existing in ten health planning areas (AP), the city's "health districts". They are classified into 'type A' units, where only Family Health Teams (with one family doctor, one nurse, one nursing technician, six community health workers, and one health surveillance worker) exist, and 'type B' units, in which the Family Health Teams coexist with medical professionals from other specialties and other health professionals. All Family Clinics have been implemented as 'type A' units, and we can find 'type A' and 'type B' Municipal Health Centers.
We selected two chronic diseases, diabetes and arterial hypertension, to calculate the prevalence rate or point prevalence, as defined by Pereira 24 (considering the 12-month period from July 2014 to June 2015). These rates were stratified by gender and age group. Initially, we are interested in presenting the "duplicate records" management process, which is essential to keep the database of unique user identification clean and of sufficient quality to perform linkages between other clinical databases and individual data. Next, we aimed to evaluate the quality of the ICD-10 coding that doctors entered in the medical records during their visits.
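As an illustration of the stratified calculation described above, the following minimal sketch computes prevalence per gender and age-group stratum. The record layout (`sex`, `age_group`, `conditions`) and the sample roster are hypothetical and are not study data; in practice the condition flags would be derived from the ICD-10 codes recorded in the visits.

```python
from collections import defaultdict


def prevalence_by_stratum(people, condition):
    """Return {(sex, age_group): prevalence in %} for one condition.

    `people` is an iterable of dicts with keys 'sex', 'age_group' and
    'conditions' (a set of condition labels). This layout is an assumption
    made for the example.
    """
    cases = defaultdict(int)
    totals = defaultdict(int)
    for person in people:
        stratum = (person["sex"], person["age_group"])
        totals[stratum] += 1  # denominator: registered people in the stratum
        if condition in person["conditions"]:
            cases[stratum] += 1  # numerator: people with the diagnosis
    return {stratum: 100.0 * cases[stratum] / totals[stratum] for stratum in totals}


# Made-up records, for illustration only.
roster = [
    {"sex": "F", "age_group": "60-64", "conditions": {"hypertension"}},
    {"sex": "F", "age_group": "60-64", "conditions": set()},
    {"sex": "M", "age_group": "60-64", "conditions": {"hypertension", "diabetes"}},
    {"sex": "M", "age_group": "60-64", "conditions": set()},
    {"sex": "M", "age_group": "60-64", "conditions": set()},
    {"sex": "M", "age_group": "60-64", "conditions": set()},
]
rates = prevalence_by_stratum(roster, "hypertension")
```

The denominator counts each roster entry once, so a person registered twice would inflate it; this is the reason the roster must be deduplicated before such rates are computed.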
This study was approved by the Ethics Committee of the Municipal Health Secretariat of Rio de Janeiro within the scope of the Laboratory of Epidemiology and Health Statistics (LEES) Project for Primary Care and Health Surveillance in the city of Rio de Janeiro. The cadastral baseline with sociodemographic data was consolidated in the first half of 2013, through 171 decentralized face-to-face meetings involving a total of 2,560 community health workers (ACS) and directors/managers of the Municipal Centers and Family Clinics. These workshops were developed to qualify the definitive registrations (called "Sheet A") of the population enrolled in the entire municipality of Rio de Janeiro. These sheets contained administrative and sociodemographic data, essential to define the list of people for whom each primary health care team is responsible (with a reference limit of 3,000 people), allowing the construction of a baseline for the "management of duplicate user roster". This management was facilitated when the City of Rio de Janeiro defined the CPF for people over 16 and the number of the Live Birth Certificate (DNV) for people below 16 years of age as the unique identification number of each person.
The workshops were held after the list of variables in Sheet A had been printed and bound on A3 paper, by micro area and ESF, so that each community worker received a notebook and could start reviewing their list of registered and duplicated users. Thus, one by one, hundreds of thousands of people's records were updated in the electronic medical records over six months, under the supervision of the Health Observatory Network Stations (Rede OTICS-RIO/SMS-RJ).
After these workshops were held, the resulting projection indicated approximately 340,000 duplicate records of homonymous people, that is, 13.7% of the people registered in the municipality (Table 1).
These records were revised and deleted over the following months, thus expanding the possibility of access and registration of new residents in the areas covered by the ESF. With this set of workshops developed in 2013, it was possible to create a baseline that determined the denominators to be used for the calculation of clinical and epidemiological indicators in the ensuing years.
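The identifier rule described above (CPF for people over 16, DNV for younger people) can be sketched as a grouping step that flags records sharing the same identifier. This is only an illustration under stated assumptions, not the procedure used in the workshops: the field names (`record_id`, `age`, `cpf`, `dnv`) and the sample values are hypothetical, and because the text does not say which identifier applies at exactly 16 years, this sketch arbitrarily uses the DNV in that case.

```python
from collections import defaultdict


def unique_key(record):
    """Return ('CPF', digits) or ('DNV', digits), or None if the identifier is missing.

    Assumption: age 16 itself falls in the DNV group (the text leaves it open).
    """
    if record["age"] > 16:
        kind, value = "CPF", record.get("cpf")
    else:
        kind, value = "DNV", record.get("dnv")
    if not value:
        return None  # no identifier: leave for manual review, do not merge
    digits = "".join(ch for ch in value if ch.isdigit())  # drop dots and dashes
    return (kind, digits) if digits else None


def find_duplicates(records):
    """Group record ids by identifier and keep only groups with more than one record."""
    groups = defaultdict(list)
    for record in records:
        key = unique_key(record)
        if key is not None:
            groups[key].append(record["record_id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Made-up roster entries, for illustration only.
roster = [
    {"record_id": 1, "age": 42, "cpf": "123.456.789-09"},
    {"record_id": 2, "age": 42, "cpf": "12345678909"},
    {"record_id": 3, "age": 5, "dnv": "30-12345678-9"},
    {"record_id": 4, "age": 30, "cpf": None},
]
duplicates = find_duplicates(roster)
```

Records without an identifier are deliberately excluded from the grouping, since merging them automatically could join different people.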
In this study, we opted to analyze two of these population-based indicators, considering the resident population aged 18 years and over, which totaled 115,280 inhabitants and were monitored by the Family Health Teams in a planning area in the city of Rio de Janeiro.

Results and Discussion: the use of electronic medical records in Primary Health Care and the management of 'duplicate' records

Prevalence of hypertension and diabetes
The analysis of the health situation of a given area or territory is in constant construction/reconstruction. Thus, it can only be understood if analyzed as a result of a historical process in constant transformation. Ferreira 26 defines the health situation as the knowledge and interpretation of the quality of life of the population of a given territory, historically produced and in a permanent process of transformation. Hypertension and diabetes are two of the leading chronic diseases monitored by teams in primary health care. Of the total number of people monitored by the Family Health Teams, 4.1% of Rio de Janeiro's dwellers had both morbidities in the period studied. When observed in isolation, hypertension and diabetes rates stood at 16.7% and 9.6%, respectively (Table 2).
Our population-based findings show higher rates in females in most of the age groups considered and a positive association between prevalence rates and age group in both genders, except for the last two age groups (75-79 years and 80 years and over), in which under-registration of diagnosed cases is probable. Older adults in these age groups have a harder time traveling to health facilities, and this evidence suggests that family health teams could review and intensify home visits for this population subgroup in the area in question.
The quality of the records that allowed calculating the prevalence of hypertension and diabetes for the population in the south region of the city of Rio de Janeiro benefited from the importance that resident doctors working in most of the researched health units attach to recording data in order to generate information for decision-making.
The estimates found by gender and age range can be used in future studies in the area covered by the Family Health Teams in the south region of the city of Rio de Janeiro. Both our findings and those of the VIGITEL-2018 study showed that women had higher prevalence rates. However, for hypertension, the situation presented by the Ministry of Health for the municipality of Rio de Janeiro is quite different: 35.3% vs. 16.7% for the area surveyed in the municipality of Rio de Janeiro, among PHC users. We believe that this occurs mainly because the middle and upper-class population is present in the Brazilian study and is proportionally less frequent in the study with the Family Health Teams in the city of Rio de Janeiro, and also because of the difference in measurement (self-reported vs. registered in medical records).
On the other hand, in the period studied for the municipality of Rio de Janeiro, the management of duplicate records allowed prevalence to be calculated more appropriately than is traditionally done in local primary care health systems that do not "overhaul" their databases and end up working with aggregated (ecological) data. Several European countries (Portugal, Spain, the Netherlands) have routines/algorithms for managing duplicate records and annually update the list of each doctor/family health team, also using computer and probabilistic algorithms to identify duplicate cases and join records.

Final considerations
For the first time in the history of primary health care of the SUS in the city of Rio de Janeiro, this study used electronic medical records to calculate the population-based prevalence of chronic diseases in a city planning area, showing the need for periodic and regular management of "duplicate user records" in primary health care so that more reliable health indicators can be calculated. The results are consistent with what was expected from the literature, namely, the older the age, the higher the prevalence of hypertension or diabetes, with differences between men and women, thus showing the quality of clinical records.
The generation of objective and reliable statistical information is essential for the management of SUS from micro to macro level, allowing the assessment of demographic dynamics and the particularities of each territory, and assisting in the planning and monitoring of Brazilians registered in each family health unit, with continuous analysis of the list of users to remove citizens with more than one record.
In 2019, the Brazilian Ministry of Health defined some innovations to improve the management of teams, one of which provides that some monitoring indicators will be followed up in the new federal funding for primary care 27 . Thus, we believe it is essential to prepare management reports/algorithms that return to each Family Health Team its list of active users in the year, the list of registered users who have not used the services (non-users/regulars), and other information. Only in this way will it be possible to calculate the most appropriate clinical and epidemiological indicators, from the micro (family health team) to the macro (total in the municipality) level. Thus, finally, after 30 years of implementation of the Unified Health System (SUS) and after 25 years of the Family Health Strategy, the Federal Government defined that the CPF would be the unique identification number of Brazilian citizens in dozens of public services 28 . We believe that, given the regular administration by the Ministry of Economy of the CPF database (active, inactive, and invalid), this process may progressively favor a significant reduction in the millions of duplicate records of users of the Family Health Teams. After all, if the (conservative) results of the projection observed for the municipality of Rio de Janeiro (13%) are applied to the whole country (where the situation must be much worse, given the difficulty small municipalities have in managing duplicate records), we will find about 20 million duplicate entries in the Family Health Strategy through the CPF alone.

(*) The data consider a specific area of city planning, representative of the municipality and one of the first to implement the use of electronic medical records in all units with family health teams.
If we add to this the duplicates by name (homonyms; linkage of name, mother's name and date of birth) and those registered in more than one health unit or more than one municipality, another million records will be found in this situation.
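The linkage of name, mother's name and date of birth mentioned above can be sketched as an exact match on normalized fields. This is a simplified, deterministic stand-in, not the probabilistic linkage used in practice (which also tolerates spelling variants and typing errors); the field names and sample records are hypothetical.

```python
import unicodedata
from collections import defaultdict


def normalize(text):
    """Upper-case a name, strip accents and collapse repeated whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())


def linkage_key(record):
    """Composite key: normalized name, normalized mother's name, date of birth."""
    return (
        normalize(record["name"]),
        normalize(record["mother_name"]),
        record["birth_date"],
    )


def candidate_duplicates(records):
    """Return lists of record ids that share the same composite key."""
    groups = defaultdict(list)
    for record in records:
        groups[linkage_key(record)].append(record["record_id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up records, for illustration only.
people = [
    {"record_id": 10, "name": "José da Silva", "mother_name": "Maria da Silva", "birth_date": "1980-05-17"},
    {"record_id": 11, "name": "JOSE  DA SILVA", "mother_name": "maria da silva", "birth_date": "1980-05-17"},
    {"record_id": 12, "name": "José da Silva", "mother_name": "Ana Souza", "birth_date": "1980-05-17"},
]
candidates = candidate_duplicates(people)
```

Including the mother's name and date of birth in the key keeps true homonyms (record 12 in the example) apart, so only records that agree on all three fields are proposed for review.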
The Ministry of Health's renewed focus on the qualification of individual records is a positive step in this direction, as long as it is accompanied by continuous monitoring of electronic records and by support from the State Health Secretariats for small municipalities with less management capacity.
Expanding access also means improving the management of the lists of Brazilian citizens, uniquely identifying them, so that the real statistical denominators for calculating the relevant indicators are known in each case, starting with the actual coverage of the Family Health Strategy.

Collaborations
LF Pinto and LJ Santos participated equally in all stages of drafting this paper.