Prospective cohort study in the early stage of the COVID-19 pandemic, General Pueyrredón, Argentina (INECOVID): infection dynamics and risk factors

Objective: To establish the magnitude and risk factors for SARS-CoV-2 infection in General Pueyrredón, Buenos Aires, Argentina: the INECOVID study. Methods: Prospective cohort with participants from the district's general population. The follow-up period was from June 22nd to December 18th, 2020, with a minimum interval of 21 days between appointments. Data were obtained via questionnaires and serum or plasma samples. The primary event was the time to seroconversion (IgG) as evidence of SARS-CoV-2 infection. The cumulative risk of infection was estimated using the Kaplan-Meier method. Cox models were built with time-dependent variables. Results: 345 participants were recruited (n=222 women, 64.3%; 123 men, 35.7%), with a median age of 45 years in women (interquartile range: 19) and 49 in men (interquartile range: 26). 12.8% of participants (n=44) had evidence of SARS-CoV-2 infection [incidence density of 9.1 cases (women: 11.1, men: 5.1) per 10,000 person-days]. 36.4% of the cases (n=16) were asymptomatic. The following factors were associated with the risk of infection: being a close contact of a confirmed COVID-19 case (HR=5.56; 95%CI 2.85–10.83), being a health worker (HR=2.93; 95%CI 1.55–5.52), living in crowded conditions (HR=2.23; 95%CI 1.13–4.49), and age (HR=0.98; 95%CI 0.95–1.00). Conclusion: The identified risk factors support the protection policies and protocols adopted by the Argentine health authorities for the general population and the care programs for health workers in the pre-vaccination phase.


INTRODUCTION
The coronavirus disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has strained health systems in much of the world. Highly transmissible, the disease has spread worldwide since the first confirmed case was notified on December 31st, 2019, in Wuhan, China. Argentina reported its first confirmed case on March 3rd, 2020 1 .
As of June 22nd, 2020, the start date of this study, 8,860,331 confirmed cases and 465,740 deaths had been registered globally; the corresponding figures for the Americas region were 4,370,519 and 221,771, respectively 2 . The total number of confirmed cases in Argentina was 44,931, with 1,043 deaths 3 . The Argentine government promoted an early suppression strategy aimed at reducing viral circulation and avoiding exponential growth of the case curve while the country was still in a pre-epidemic stage. On March 19th, with 31 confirmed cases, social, preventive, and compulsory isolation (Aislamiento Social, Preventivo y Obligatorio, ASPO) was decreed, along with a variety of measures in other sectors, such as the suspension of classes in the educational system.
The objective of the research was to establish the magnitude and risk factors of SARS-CoV-2 infection in the General Pueyrredón District (Partido de General Pueyrredón, PGP), Buenos Aires, Argentina: the INECOVID study.

METHODS
A prospective cohort design was used. The population consisted of volunteers of any age and gender residing in the PGP. The PGP is located in the southeast of the province of Buenos Aires, Argentina; its estimated total population for the year 2020 was 656,456 inhabitants 6 .
The study was conducted during a pandemic, so its primary interest was to contribute clinical-epidemiological evidence to support timely and effective control of the situation 5 . Volunteers of all ages and genders residing in the PGP were eligible for INECOVID, provided they could remain in follow-up for 6 months and gave signed informed consent. People with a contraindication for venipuncture were excluded. Participants were recruited through an open call disseminated in press releases, advertisements in local newspapers and radio stations, social media, and e-mails. In order to include participants of pediatric and adolescent age, children and adolescents who periodically attended the Interzonal Specialized Maternal and Child Hospital "Don Victorio Tetamanti" (Hospital Interzonal Especializado Materno Infantil, HIEMI) for check-ups for any underlying pathology were invited, when those check-ups included a blood draw.
The recruitment target was 300 volunteer participants, taking into account local capacities and the availability of resources 5 . The follow-up period was from June 22nd to December 18th, 2020. From the beginning of the study, participants were included until the established sample size was reached; as losses were recorded, new participants were recruited up to a deadline of November 16th (dynamic cohort). Participants attended the INE on several occasions during follow-up, with a minimum interval of 21 days between visits. Appointments were arranged in advance by telephone with a member of the team, except in the case of children and adolescents from HIEMI.
The information was obtained through a questionnaire specifically designed for this study, administered in an interview by previously trained interviewers, and through whole blood samples obtained by venipuncture. The questionnaire used previously validated questions taken from the following data sources of the national public administration: questionnaires from the National Institute of Statistics and Censuses (Instituto Nacional de Estadísticas y Censos, INDEC), questionnaires used in the National Survey of Risk Factors 2018 7 , and the Record of Notification of suspected cases of COVID-19 8 . The proposal for incorporating the ethnic variable into the public health information system in times of COVID-19 was adopted, as agreed in the participatory pre-census design process with the INDEC and the Indigenous Professionals Meshwork 9 . A pilot test of the survey instrument was carried out, consisting of simulated interviews with INE workers and people from the community, with the sole purpose of evaluating its suitability. The questions that were difficult for the interviewee to interpret were adjusted, and the interviewer's manual was prepared.
Participants fasted for a minimum of two hours before blood collection. Serological determinations were carried out in the INE laboratory, under biosafety level II. The COVIDAR IgG (ANMAT PM 1545-4) and COVIDAR IgM (ANMAT PM 1545-5) tests were used, both co-developed by CONICET, Instituto Leloir, Universidad de San Martín and Laboratorio Lemos SRL 10 . These tests are non-competitive, heterogeneous, immunoenzymatic assays based on an indirect method for the in vitro detection of SARS-CoV-2 specific IgG/IgM antibodies in human serum or plasma samples.
The detection of both isotypes was carried out in parallel. When IgM-type antibodies were detected, a nasopharyngeal and oropharyngeal swab sample was taken to perform a real-time reverse transcription polymerase chain reaction (RT-PCR) test in order to rule out an active infection. In such cases, the participant was isolated and their close contacts were actively traced, in line with the provisions of the national epidemiological surveillance strategy. In all cases in which IgG was detected, the antibody level was titrated using serial dilutions in medium, according to the manufacturer's instructions. The titer was reported as the inverse of the last positive dilution, as specified by the technique.
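The titer rule described above (the inverse of the last positive dilution) can be illustrated with a minimal sketch. The function name `endpoint_titer`, the input layout, and the two-fold series starting at 1:50 are assumptions made for illustration only; the manufacturer's actual dilution scheme is not given in the text.

```python
from typing import Optional, Sequence, Tuple


def endpoint_titer(readings: Sequence[Tuple[int, bool]]) -> Optional[int]:
    """Return the reciprocal of the last positive dilution.

    `readings` holds (dilution_factor, reactive) pairs, e.g. (50, True)
    for a reactive result at a 1:50 dilution. Dilutions are read from the
    least to the most diluted, stopping at the first non-reactive well.
    Returns None when even the first dilution is non-reactive.
    """
    titer = None
    for dilution_factor, reactive in sorted(readings):
        if not reactive:
            break
        titer = dilution_factor
    return titer


# Hypothetical two-fold series (assumed, not taken from the assay insert).
series = [(50, True), (100, True), (200, True), (400, True),
          (800, False), (1600, False)]
example_titer = endpoint_titer(series)  # last reactive dilution is 1:400
```

Under these assumptions a sample that stays reactive up to 1:400 and turns non-reactive at 1:800 is reported with a titer of 400.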
The primary event to observe was the time from entry to the cohort (initial event) until the occurrence of seroconversion (measured in specific IgG antibodies for SARS-CoV-2) considered as serological evidence of SARS-CoV-2 infection (final event).
The variables were organized into four blocks: identification of the participant, sociodemographic data, clinical-epidemiological history, and history of symptoms.
Baseline characteristics were described; continuous variables were summarized by the median and the interquartile range, and categorical variables by absolute and relative frequencies.
The incidence density by gender was calculated using the total person-time at risk. The cumulative risk function was estimated by the Kaplan-Meier method, stratifying by the variables considered fixed: age (46 years old or less/older than 46 years), gender (male/female), level of education (elementary or less/middle school/high school or higher education), health professional (yes/no), number of residents (fewer than 2 people per room/2 or more people per room), immunosuppressive treatment (yes/no), and presence of comorbidities (yes/no). The curves were compared using log-rank tests (p<0.05).
Cox proportional hazards models (PHM) with time-dependent covariates were constructed; adjusted hazard ratios (HR) were estimated for each covariate, with 95% confidence intervals (CI). Variables that were significant in the log-rank test were considered for model construction, and the following time-dependent covariates were tested: current situation (isolated/essential exits/working/excepted/other) and history of close contact with a confirmed case of COVID-19.
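As an illustration of the Kaplan-Meier step, the sketch below computes the cumulative risk, 1 - S(t), at each observed event time. It is a plain-Python sketch, not the authors' analysis, which was run in R. The function name and the toy follow-up data are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier_cumulative_risk(
    durations: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return (time, cumulative risk) pairs at each distinct event time.

    durations: follow-up time per participant (days since cohort entry).
    events: 1 if the participant seroconverted at that time, 0 if censored.
    Participants who exit at time t still count as at risk at t, so events
    are handled before censoring when times are tied.
    """
    n_at_risk = len(durations)
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            survival *= 1 - d / n_at_risk
            curve.append((t, 1 - survival))
        n_at_risk -= exit_counts[t]
    return curve


# Toy data (hypothetical): five participants, two seroconversions.
toy_curve = kaplan_meier_cumulative_risk([21, 42, 42, 63, 84], [1, 0, 1, 0, 0])
```

In the toy data the cumulative risk is about 0.2 after the first event (1 of 5 at risk) and about 0.4 after the second (1 of the 4 still at risk). Stratified curves are obtained by calling the function once per stratum.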
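Time-dependent covariates are usually passed to Cox software in counting-process form, with one row per interval over which the covariate is constant (for example, `Surv(start, stop, event)` in the R survival package). The text does not describe how the authors laid out their data, so the sketch below is only an assumed illustration for the close-contact covariate. The function name, the row layout, and the rule that a participant remains "exposed" after the first reported contact are all hypothetical.

```python
from typing import List, Optional, Tuple

# One row: (start_day, stop_day, close_contact, event)
Row = Tuple[int, int, int, int]


def split_follow_up(
    exit_day: int, seroconverted: bool, contact_day: Optional[int] = None
) -> List[Row]:
    """Split one participant's follow-up at the first reported close contact.

    exit_day: day of seroconversion or censoring, counted from cohort entry.
    contact_day: day the close contact was first reported, or None if never.
    The event flag is attached only to the final interval.
    """
    event = int(seroconverted)
    if contact_day is None or contact_day >= exit_day:
        # Never exposed during the observed follow-up.
        return [(0, exit_day, 0, event)]
    if contact_day <= 0:
        # Exposed from cohort entry onward.
        return [(0, exit_day, 1, event)]
    return [(0, contact_day, 0, 0), (contact_day, exit_day, 1, event)]


# Hypothetical participant: contact reported on day 30, IgG detected on day 84.
rows = split_follow_up(exit_day=84, seroconverted=True, contact_day=30)
```

With this layout the first 30 days contribute unexposed person-time and the remaining 54 days contribute exposed person-time, which is what allows the hazard ratio to compare exposed and unexposed periods rather than exposed and unexposed people.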
The information recorded in the questionnaires was entered by a trained data entry operator into a database built in EpiInfo 7.2.4.0. The database was processed in R 3.6.3 11 , using functions from the tidyverse 12 , epiR 13

RESULTS
Between June 22nd and November 16th, 2020, 345 volunteer participants aged 11 months to 81 years were recruited from the general PGP population (n=222 women, 64.3%; 123 men, 35.7%). The participants were interviewed with the established periodicity; in the fieldwork stage, no missing data were recorded. Losses to follow-up represented 8.0% (n=27) and were due to leaving the study, 91.7% of whom did so at the last visit (n=22).
At baseline, the median age of the participants was 45 years for women [interquartile range (IQR) 19] and 49 for men (IQR 26). Regarding gender, 3 participants (0.8%) were assigned to the non-binary category. 1.8% of women (n=4) and 3.3% of men (n=4) recognized themselves as descendants of indigenous or Afro-descendant peoples. High school or higher education was the predominant level of education in both genders. 95% of the women (n=201) and 90% of the men (n=111) had social, mutual or prepaid insurance coverage prior to the start of the ASPO. The most common comorbidity in both genders was obesity, followed by arterial hypertension (Table 1).
12.8% of the participants (n=44) had a reactive result for IgG antibodies during follow-up, resulting in an incidence density of 9.1 cases per 10,000 person-days (women: 11.1, men: 5.1 cases per 10,000 person-days). 50% of the cases (n=22) were detected between epidemiological weeks (EW) 35 and 46, after the city entered phase 3, in a health situation characterized by an increase in community viral circulation and by COVID-19 outbreaks in health institutions and long-stay establishments (Figure 1). 36.4% of the participants who showed evidence of infection (n=16) had no symptoms in the immediately preceding period. Among those who had at least one symptom (63.6%, n=28), the most frequent were headaches (40.9%, n=18); myalgia (36.4%, n=16); odynophagia and anosmia (31.8%, n=14); cough and fatigue (29.5%, n=13), with percentages calculated over all 44 cases. Fever was reported by only 10 participants (22.7% of the 44 cases).
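The incidence density reported above is the number of cases divided by the total person-time at risk. The short sketch below restates that arithmetic and back-calculates the follow-up time implied by the published figures. The function names are hypothetical, and the implied person-days are only approximate because the published rate is rounded to one decimal.

```python
def incidence_density(cases: int, person_days: float, per: int = 10_000) -> float:
    """Cases per `per` person-days at risk."""
    return cases / person_days * per


def implied_person_days(cases: int, rate: float, per: int = 10_000) -> float:
    """Person-days at risk implied by a reported rate (approximate)."""
    return cases * per / rate


# 44 seroconversions at 9.1 cases per 10,000 person-days.
approx_person_days = implied_person_days(44, 9.1)
```

Under these assumptions the cohort accumulated roughly 48,000 person-days at risk, which is the denominator behind the overall rate of 9.1 per 10,000 person-days.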
Only 31.8% of the participants with reactive IgG (n=14) had had a swab test; 10 of them (71.4%) obtained a detectable RT-PCR result.
In 17 (38.6%) of the 44 participants who had positive IgG, IgM was detected at the same visit as IgG. In only one case, IgM detection occurred at the visit prior to IgG detection. Antibody titers ranged between 50 and 12,800, with a median of 400. Among people aged 46 years or less, only 7 (23.3%) exhibited titers higher than the median, while among those over 46 years of age this percentage was 57.1%.
None of the participants required hospitalization or died during follow-up.

The cumulative risk of SARS-CoV-2 infection is shown in Figure 2. The factors statistically associated with the probability of seroconversion (log-rank test <0.05) were gender, age, number of residents and being a health professional: females, individuals aged 46 years old or younger, health professionals and those living with two or more inhabitants per room had a higher probability of seroconversion.
In the multiple analysis, it was observed that having a history of close contact with a confirmed case of COVID-19 increased the risk of infection by more than 5 times [HR=5.6 (95%CI 2.9-10.

DISCUSSION
In this early pandemic cohort study, the incidence density was 9.1 cases of SARS-CoV-2 infection per 10,000 person-days. Being a close contact of a confirmed case, being a health professional, living in a crowded environment, and age were the main factors associated with infection.
REV BRAS EPIDEMIOL 2021; 24: E210055
The proportion of people with evidence of infection and an asymptomatic course found in our study (36.4%) was similar to that found in other investigations. A study in Vo', Italy, conducted in the general population during the 14-day lockdown imposed by the authorities, found that 42.5% of confirmed SARS-CoV-2 infections were asymptomatic 16 ; another study carried out in Iceland reported 43% 17 . In an investigation carried out on the Diamond Princess cruise ship, 46.5% of those who tested positive were asymptomatic at the time of testing 18 . This highlights one of the main uses of serological tests: longitudinal testing of a given population makes it possible to estimate the incidence of exposure to the virus, including among individuals whose infection was asymptomatic.
Regarding the risk factors for SARS-CoV-2 infection in this cohort, a history of close contact with a confirmed case of COVID-19, being a health professional and living in crowded conditions were the most relevant. The first showed the strongest association, increasing the risk more than fivefold. Similar results were obtained in a multiple conditional logistic regression analysis of types of contact: contact at home (OR 6.3) and traveling together by car (OR 7.1) were significantly associated with infection 19 . Already in the early stages of the epidemic in China, a history of close contact with a case of COVID-19 was identified as a risk factor for disease 20-22 and was therefore included among outbreak prevention measures 23 .
With regard to health professionals, concern about the increased risk of SARS-CoV-2 infection in this subgroup of the population was raised at the international level from the early stages of the pandemic, as they constitute the front line in fighting the epidemic. The findings of this study agree with research carried out in different countries 24-26 . In our cohort, when the ratio of the number of people to the total number of sleeping rooms in the home was 2 or more people per room, the risk of infection doubled. This is plausible, since in these circumstances the possibilities of isolation in the event of a positive case among cohabitants are reduced, which makes it harder to reduce the concentration of virus-carrying particles in the air and, consequently, the number of people exposed. These findings are consistent with other studies that found that the number of residents is one of the socioeconomic factors associated with an increased risk of COVID-19 27-29 . As with other aerosol-borne respiratory diseases, this link had already been shown for tuberculosis and influenza 30,31 .
Regarding the differences by gender, in relation to antibody response, there are studies that showed higher titers in women than in men after a serious infection by COVID-19 32 . Although our work indicates a higher probability of seroconversion in women, this risk was not observed in the adjusted analysis.
Although most of the investigations that address the relationship between the presence of comorbidities and COVID-19 have highlighted the impact in terms of disease progression and fatal outcomes 33,34 , the pathophysiological mechanisms involved in the increased risk of SARS-CoV-2 infection in patients with comorbidities have also been addressed 35 . The present study found no significant differences in the risk of infection among participants with comorbidities, including heart disease, diabetes mellitus, asthma/chronic obstructive pulmonary disease (COPD), immunodeficiency, cancer, obesity, and chronic kidney disease. The sanitary protection measures implemented in Argentina during the first pandemic year, coinciding with the study period, included among their priorities the protection of vulnerable groups, such as older people and people with comorbidities. Our hypothesis is that this may have influenced the results observed in this research, as these groups had less mobility and, therefore, reduced exposure to the virus.
The main strength of our study lies in the extensive follow-up of participants and in the frequency of measurement, with visits scheduled at a minimum interval of 21 days. The former made it possible to monitor the probability of seroconversion in different pandemic scenarios; the latter, to assign each event to a limited time window. In turn, the dropout rate was low: the team worked intensively to avoid loss to follow-up, through repeated phone calls and WhatsApp messages when a participant missed a scheduled appointment.
As a limitation, participants were not selected by probability sampling, so the results should not be extrapolated to the general population. Compared with the general population of the PGP 36 , our sample overrepresented the age group between 35 and 64 years old, women, people with high school or higher education, and those with some type of health coverage (social, mutual or prepaid), while unemployed people were underrepresented. Based on the final number of participants, and taking close contact history as the covariate of interest because it is time-dependent, the power calculated from the HR found ranged between 74.10 and 74.26 37 .
Regarding the serological determinations themselves, the technique used was robust, both for the detection of cases that were positive by RT-PCR, and for those that were self-perceived as asymptomatic. Likewise, a report published by the developers of the assay maintains that the total IgG titers against spike using COVIDAR are highly informative to estimate the neutralizing capacity of these antibodies 38 .
In this study's cohort, people over 46 years of age exhibited a higher proportion of high antibody titers. This is consistent with the findings of another study carried out in our country with the same technique, in which antibody measurements were also correlated with age and the highest antibody levels were associated with older patients. In that study, synchronous IgG and IgM seroconversion was observed in most cases (72%), IgM seroconversion before IgG in 21% of patients, and IgG before IgM in 7% of patients. These results point in the same direction as those obtained in our cohort, except that we could not reconstruct the latter situation, since our participants left the cohort once they were IgG positive 39 .
In conclusion, serological testing showed that about a third of infected participants were asymptomatic and had not been detected by the epidemiological surveillance system, which implies that the number of SARS-CoV-2 infections could be greater than the number of officially confirmed cases. The risk factors for infection found in this cohort support the protection policies and protocols adopted by the Argentine health authorities for the general population, as well as the care programs for health workers in the pre-vaccination stage.