Gene-environment interactions and preterm birth predictors: A Bayesian network approach

Abstract Preterm birth (PTB) is the leading condition associated with perinatal morbidity and mortality worldwide. The aim of this study was to identify gene-environment interactions associated with spontaneous PTB or its predictors. We carried out a retrospective case-control study including parental sociodemographic and obstetric data as well as newborn genetic variants of 69 preterm and 61 term newborns born at a maternity hospital in Tucumán, Argentina, between 2005 and 2010. We built a data-driven Bayesian network of the main PTB predictors and used it to identify gene-environment interactions. We used logistic regressions to calculate the odds ratios and confidence intervals of the interactions. Among the main PTB predictors (nine exposures and six genetic variants), we identified an interaction between low neighbourhood socioeconomic status and the rs2074351 variant (PON1, genotype GG) that was associated with an increased risk of toxoplasmosis (odds ratio 12.51, 95% confidence interval: 1.71 - 91.36). The results of this exploratory study suggest that structural social disparities could influence PTB risk by increasing the frequency of exposures that potentiate the risk associated with individual characteristics such as genetic traits. Future studies with larger sample sizes are necessary to confirm these findings.


Stressful situations
Divorce, lack of family support, death of a family member, changing residence, job loss, domestic violence, and other criminal situations.

Diseases and health complications
The following illnesses and complications prior to or during pregnancy were included: asthma, chronic hypertension, difficulty in conceiving, vaginal bleeding (in the first and second trimester), vaginal discharge, urinary tract infection (UTI, including symptomatic cystitis and pyelonephritis), Chagas' disease, toxoplasmosis, anemia, dental treatment (e.g., dental cavities and extractions), periodontal disease, sexually transmitted diseases, lupus, other autoimmune diseases, type 1 and 2 diabetes mellitus, arthritis, psychological disorders, fibroids/myomas, surgery before pregnancy, and major medical procedures during pregnancy (e.g., blood transfusion).

Medications and supplements
The following were considered: iron, magnesium, or folic acid supplement; vitamins; anemia medication; UTI treatment; and medication intake before pregnancy.

Habits and activities
Tobacco smoking before and during pregnancy; passive smoking; alcohol consumption before and during pregnancy; coffee, tea, or mate intake; illicit drug use; physical activity before and during pregnancy; sexual activity during the last month of pregnancy; hypocaloric diet during pregnancy; and work during pregnancy.

Imputation of individual level variables
Variables with more than 20% missing data were discarded. For the remaining variables, missing values were imputed using bagged decision trees, that is, decision trees combined by bootstrap aggregation (Kuhn, 2008).
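The filtering rule above can be illustrated with a minimal Python sketch. It covers only the 20% missing-data threshold and not the tree-based imputation itself. The variable names and toy values are hypothetical and do not come from the study, and the original analysis was not necessarily implemented this way.

```python
from typing import Dict, List, Optional

# From the text: variables with MORE than 20% missing data are discarded.
MISSING_THRESHOLD = 0.20


def missing_fraction(values: List[Optional[float]]) -> float:
    """Return the share of entries that are missing (encoded here as None)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def drop_sparse_variables(
    table: Dict[str, List[Optional[float]]],
    threshold: float = MISSING_THRESHOLD,
) -> Dict[str, List[Optional[float]]]:
    """Keep only the variables whose missing share does not exceed the threshold."""
    return {
        name: values
        for name, values in table.items()
        if missing_fraction(values) <= threshold
    }


# Hypothetical toy data (not from the study).
toy = {
    "maternal_age": [23, 31, None, 28, 35],   # 1 of 5 missing (20%): kept
    "coffee_intake": [1, None, None, 0, 1],   # 2 of 5 missing (40%): discarded
}
kept = drop_sparse_variables(toy)
```

A variable with exactly 20% missing values is retained here because the text discards only variables with more than 20% missing data.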

Proportion of neighbourhood households without Unsatisfied Basic Needs
To estimate neighbourhood socioeconomic status, the Unsatisfied Basic Needs (UBN) index from the Argentine 2010 national census was used (Instituto Nacional de Estadística y Censos, 2010). A housing unit has UBN when at least one of the following conditions is present (Feres and Mancero, 2001; Instituto Nacional de Estadística y Censos, 2010): the family resides in a pension, tenancy, hotel, precarious dwelling, or facility not intended for housing purposes; the house does not have a toilet; there are more than three people per room; there is at least one child, 6 to 12 years old, who does not attend school; or the household has four or more people per employed family member and its head has not completed third grade of primary school (<4 years of schooling).
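The five conditions can be read as a simple "any of" rule. The Python sketch below is only an illustration of that reading. All field names are hypothetical, and the handling of households with zero rooms or zero employed members is an assumption that the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Household:
    # All field names are hypothetical; they paraphrase the census criteria.
    inadequate_dwelling: bool       # pension, tenancy, hotel, precarious or non-housing facility
    has_toilet: bool
    people: int
    rooms: int
    child_6_12_out_of_school: bool  # at least one child aged 6 to 12 not attending school
    employed_members: int
    head_years_of_schooling: int


def has_ubn(h: Household) -> bool:
    """Return True when at least one Unsatisfied Basic Needs condition holds."""
    # More than three people per room. Written as a product to avoid dividing by zero;
    # assumption: an occupied household with zero rooms counts as overcrowded.
    overcrowded = h.people > 3 * h.rooms
    # Four or more people per employed member AND head with <4 years of schooling.
    # Assumption: a household with no employed member satisfies the ratio part.
    low_subsistence = (
        h.people >= 4 * h.employed_members and h.head_years_of_schooling < 4
    )
    return any([
        h.inadequate_dwelling,
        not h.has_toilet,
        overcrowded,
        h.child_6_12_out_of_school,
        low_subsistence,
    ])


# Hypothetical households (not from the study).
without_ubn = Household(False, True, 4, 2, False, 1, 7)
with_ubn = Household(False, False, 4, 2, False, 1, 7)  # no toilet
```

Here `has_ubn(without_ubn)` evaluates to `False` and `has_ubn(with_ubn)` to `True`, since a single unmet condition is enough to classify the housing unit as having UBN.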
The percentage of households without UBN linked to each maternal domicile was determined using the census radii. The census radius is a census unit that represents the smallest territorial entity with available data and comprises 100 to 300 dwellings on average (Instituto Nacional de Estadística y Censos, 2021). The census radii close to each maternal domicile were identified by superimposing 100 m Euclidean buffers around the domiciles with a specified street name and number, and 500 m Euclidean buffers around the estimated neighbourhood centroid for domiciles that specified only the neighbourhood. For each maternal domicile, the percentage of households without UBN was weighted by the percentage of the surface of each census radius overlapping the defined buffer. The R sf package was used (Pebesma, 2018).
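The weighting step can be sketched in Python as an area-weighted mean, assuming the overlap areas between the buffer and each census radius have already been computed (the authors did this in R with sf). The function name and the toy numbers are hypothetical. The sketch also assumes that each weight is the overlap area normalised by the total overlap, which is one plausible reading of the description rather than a confirmed detail.

```python
from typing import List, Tuple


def weighted_pct_without_ubn(overlaps: List[Tuple[float, float]]) -> float:
    """Area-weighted percentage of households without UBN for one buffer.

    Each tuple is (overlap_area_m2, pct_households_without_ubn) for a census
    radius that intersects the buffer around a maternal domicile.
    """
    total_area = sum(area for area, _ in overlaps)
    if total_area <= 0:
        raise ValueError("the buffer does not overlap any census radius")
    return sum(area * pct for area, pct in overlaps) / total_area


# Hypothetical buffer overlapping two census radii (values are not from the study).
example = [
    (20000.0, 90.0),  # 20,000 m2 of overlap, 90% of households without UBN
    (10000.0, 60.0),  # 10,000 m2 of overlap, 60% of households without UBN
]
result = weighted_pct_without_ubn(example)  # (20000*90 + 10000*60) / 30000 = 80.0
```

Normalising by the total overlap keeps the result on the same 0 to 100 scale as the census percentages, even when the buffer extends beyond the mapped radii.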

Urban conglomerate
To account for the heterogeneity between urban and rural populations, the general population was stratified according to the number of inhabitants. To do this, the centroids of the census radii from the 2010 National Census were grouped using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm (Instituto Nacional de Estadística y Censos, 2010; Hahsler et al., 2019). We used a minimum number of points equal to 2 and an epsilon of 1000 m, which was estimated by analysing the distribution of 2-nearest neighbour distances. The