On the use of the E-value for sensitivity analysis in epidemiologic studies

This study illustrates the use of a recently developed sensitivity index, the E-value, which helps strengthen causal inferences in observational epidemiological studies. The E-value aims to determine the minimum strength of association between an unmeasured confounder and the exposure/outcome required to explain the observed association as non-causal. This parameter is defined as $E\text{-value} = RR + \sqrt{RR \times (RR - 1)}$, where RR is the risk ratio between the exposure and the outcome. Our work illustrates the E-value using observational data from a recently published study on the relationship between indicators of prenatal care adequacy and the outcome low birthweight. The E-value ranged between 1.45 and 5.63 according to the category and prenatal care index evaluated, showing the highest value for the “no prenatal care” category of the GINDEX index and the minimum value for “intermediate prenatal care” of the APNCU index. For “inappropriate prenatal care” (all indexes), the E-value ranged between 2.76 (GINDEX) and 4.99 (APNCU). These findings indicate that only strong confounder/low birthweight associations (more than 400% increased risk) would be able to fully explain the prenatal care vs. low birthweight association observed. The E-value is a useful, intuitive sensitivity analysis tool that may help strengthen causal inferences in epidemiological observational studies.

Measures of Association, Exposure, Risk or Outcome; Observational Studies as Topic; Health Care Outcome Assessment

Correspondence
R. M. V. R. Almeida
Programa de Engenharia Biomédica, Instituto Alberto Luiz Coimbra de Pós-graduação e Pesquisa de Engenharia, Universidade Federal do Rio de Janeiro. C.P. 68510, Cidade Universitária, Rio de Janeiro, RJ 21945-970, Brasil.
renan@peb.ufrj.br

1 Instituto Alberto Luiz Coimbra de Pós-graduação e Pesquisa de Engenharia, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brasil.
2 Departamento de Estatística, Universidade Federal Fluminense, Niterói, Brasil.
doi: 10.1590/0102-311X00294720
Cad. Saúde Pública 2021; 37(6):e00294720
METHODOLOGICAL ISSUES
This article is published in Open Access under the Creative Commons Attribution license, which allows use, distribution, and reproduction in any medium, without restrictions, as long as the original work is correctly cited.


Introduction
Sensitivity analyses are commonly used in observational epidemiologic studies to quantify the robustness of an investigated association to unmeasured or uncontrolled confounders 1 . Traditional sensitivity analysis methods require estimates of the strength of association between the unmeasured confounder and the outcome ($RR_{UD}$) and between the unmeasured confounder and the exposure ($RR_{EU}$). After specifying these associations, one can calculate the influence of a given pair $RR_{UD}$ and $RR_{EU}$ on the risk ratio between exposure and outcome ($RR_{ED}$) (Figure 1). The confounding factor (B), the maximum relative amount by which unmeasured confounders could reduce an observed $RR_{ED}$, is calculated as follows 2 :

$$B = \frac{RR_{UD} \times RR_{EU}}{RR_{UD} + RR_{EU} - 1} \qquad (1)$$

By dividing the observed $RR_{ED}$ by B, one obtains the value to which a set of confounding factors could, at most, shift the observed $RR_{ED}$. However, some authors express concern about the subjectivity underlying the choice of the sensitivity parameters ($RR_{UD}$ and $RR_{EU}$) 1,2 . These parameters also entail simplifications related to unmeasured confounders, such as defining the confounder as a binary variable or assuming a single confounder 3,4,5 , which negatively impact the robustness of sensitivity analyses and the causal inferences sought in observational studies.
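As a minimal sketch of the traditional approach described above, the Python snippet below computes the confounding factor as B = RR_UD × RR_EU / (RR_UD + RR_EU − 1) and the corresponding corrected risk ratio. The function names and the numerical inputs are hypothetical and chosen only for illustration; they do not come from the study data.

```python
def bounding_factor(rr_ud: float, rr_eu: float) -> float:
    """Confounding factor B for a confounder with the given associations.

    rr_ud: confounder-outcome risk ratio (assumed >= 1)
    rr_eu: exposure-confounder risk ratio (assumed >= 1)
    """
    if rr_ud < 1 or rr_eu < 1:
        raise ValueError("both sensitivity parameters are assumed to be >= 1")
    return (rr_ud * rr_eu) / (rr_ud + rr_eu - 1)


def corrected_rr(rr_observed: float, rr_ud: float, rr_eu: float) -> float:
    """Observed risk ratio divided by B, i.e. the most the confounder could reduce it to."""
    return rr_observed / bounding_factor(rr_ud, rr_eu)


# Hypothetical inputs: observed RR_ED = 2.0, RR_UD = 3.0, RR_EU = 2.0
b = bounding_factor(3.0, 2.0)            # 6 / 4 = 1.5
print(f"B = {b:.2f}")
print(f"corrected RR_ED = {corrected_rr(2.0, 3.0, 2.0):.2f}")
```

In this hypothetical case the corrected risk ratio remains above 1, so a confounder of that strength would not, by itself, explain the association away. The result depends entirely on the two chosen sensitivity parameters, which is the subjectivity the text refers to.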
Seeking to develop a simple and intuitive tool that waives the need for strong assumptions, Ding & VanderWeele 2 proposed a new sensitivity analysis technique for observational studies: the E-value. This tool aims to determine the "...minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need to have with both the treatment and the outcome to fully explain away a specific treatment-outcome association, conditional on the measured covariates" 1 (p. 268). The E-value can be calculated as shown in Equation 2:

$$E\text{-value} = RR + \sqrt{RR \times (RR - 1)} \qquad (2)$$

where RR is the risk ratio between exposure and outcome. The E-value is conditional on the measured covariates and is calculated on the risk ratio scale used in the analysis. When the effect measure is the odds ratio (OR) and the outcome is relatively rare (prevalence below 15% in the population), the OR can be used in place of the RR in Equation 2, giving:

$$E\text{-value} = OR + \sqrt{OR \times (OR - 1)} \qquad (3)$$
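The E-value formula, E-value = RR + sqrt(RR × (RR − 1)), can be sketched in a few lines of Python. The function name `e_value` is hypothetical. Inverting estimates below 1 before applying the formula follows the convention described by VanderWeele & Ding and is an addition not discussed in this text; the input value is illustrative only.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a point estimate on the risk ratio scale.

    For a rare outcome, an odds ratio may be passed in place of the risk ratio.
    """
    if rr <= 0:
        raise ValueError("a risk ratio must be positive")
    if rr < 1:
        # Assumption: protective estimates are inverted before applying the formula.
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


# Illustrative value only: RR = 2.0 gives 2 + sqrt(2)
print(f"E-value for RR = 2.0: {e_value(2.0):.2f}")
```

A risk ratio of exactly 1 yields an E-value of 1, meaning that no unmeasured confounding is needed to explain a null association.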

Figure 1
Scheme of a traditional sensitivity analysis (adapted from VanderWeele & Ding 1 ).
$RR_{ED}$: risk ratio between exposure and outcome; $RR_{EU}$: strength of the association between exposure and unmeasured confounders; $RR_{UD}$: strength of the association between unmeasured confounders and outcome.
Equation 3 is also applicable to the limits of a confidence interval (CI). When the lower limit (LL) of the CI is lower than or equal to 1, the E-value is taken to be 1; otherwise, the E-value is determined as follows:

$$E\text{-value} = LL + \sqrt{LL \times (LL - 1)} \qquad (4)$$

For large E-values, the unmeasured confounder would need a considerable impact to fully explain the effect estimate. Conversely, small values indicate that even weak confounding could explain the effect estimate, so the evidence for a causal relation between the study variables is weak.
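The rule for the confidence interval can be sketched as follows: the E-value is 1 when the lower limit is at or below 1, and LL + sqrt(LL × (LL − 1)) otherwise. The function name and the estimate used here (an OR of 1.8 with a lower limit of 1.2) are hypothetical, and the sketch assumes an estimate above 1, as in the associations examined in this study.

```python
import math


def e_value_lower_limit(ll: float) -> float:
    """E-value for the lower limit (LL) of a confidence interval, for estimates above 1."""
    if ll <= 1.0:
        # The interval includes the null value, so no confounding is needed.
        return 1.0
    return ll + math.sqrt(ll * (ll - 1.0))


# Hypothetical estimate: OR = 1.8 with a 95%CI lower limit of 1.2
point = 1.8 + math.sqrt(1.8 * (1.8 - 1.0))
print(f"E-value (point estimate): {point:.2f}")
print(f"E-value (lower limit):    {e_value_lower_limit(1.2):.2f}")
print(f"E-value (LL = 0.9):       {e_value_lower_limit(0.9):.2f}")
```

Reporting both numbers shows how much confounding would be needed to move the point estimate to the null and how much would suffice to make the interval include it.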
Next, we illustrate the use of the E-value with observational data from a recently published study on the relationship between indicators of prenatal care adequacy and the outcome low birthweight 6 .

Methods
An observational study conducted by Vale et al. 6 used multiple logistic modeling to investigate low birthweight in 368,093 singleton term live births in Rio de Janeiro, Brazil, from 2015 to 2016. Box 1 summarizes the study covariates and prenatal care indexes. The E-value was used to determine the minimum strength of association between possible unmeasured confounders and the outcome capable of altering the interpretation of the results. E-values were calculated based on the adjusted OR and the lower limit of the 95% confidence interval (95%CI) of each prenatal care adequacy index (only statistically significant categories). The "Adequate prenatal care" category was used as reference (OR = 1.00). Analyses were performed using RStudio v.1.2.5001 (http://www.r-project.org) and SPSS v.23 (https://www.ibm.com/).

Results
The estimated E-value ranged between 1.45 and 5.63 according to the category and index evaluated (Table 1), showing the highest value for the "no prenatal care" category of the GINDEX index and the minimum value for "intermediate prenatal care" of the adequacy of prenatal care utilization (APNCU) index. For "inappropriate prenatal care" (all indexes), the E-value ranged between 2.76 (GINDEX) and 4.99 (APNCU).

Discussion
Based on the E-value parameter, researchers were able to determine the minimum required association between unmeasured potential confounders and low birthweight for explaining the observed associations. For instance, within the "inappropriate prenatal care" category, the E-value reached its maximum (4.99) for the APNCU index (the most discriminatory index), so that only very strong confounder/low birthweight associations would be able to explain the prenatal care vs. low birthweight association observed.
The E-value method allows these results to be contrasted with association values for known risk factors not included in a traditional analysis. Studies addressing smoking during pregnancy, for example, reported adjusted ORs ranging from 1.23 to 2.63 7,8,9,10,11 , lower than the E-values found for the "inadequate prenatal care" category of all evaluated indexes, except for those of Ciari Jr. et al. 12 and Kessner et al. 13 . This suggests that, alone, smoking during pregnancy is not capable of explaining the prenatal care vs. low birthweight association observed, thus strengthening causal inferences.
Table 1
E-value parameters calculated as a function of the adjusted odds ratio (OR) in logistic regression models for predicting low birthweight by prenatal care adequacy indexes.

Likewise, studies reported adjusted ORs for alcohol and drug use during pregnancy ranging between 1.04 and 1.68 9 , so that similar causal inferences may be made: the consumption of alcohol and drugs during pregnancy would not, per se, be a confounding factor capable of fully explaining the effect estimate.
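The comparison made in this discussion can be sketched as a simple check of whether the largest reported association for a known risk factor falls short of the E-value. The numbers below are the ones quoted in the text (E-values for "inappropriate prenatal care" and the upper ends of the reported OR ranges); the names and structure are hypothetical. The check follows the E-value reading, in which the confounder is assumed to be equally associated with exposure and outcome.

```python
# E-values quoted in the text for the "inappropriate prenatal care" category
E_VALUES = {"GINDEX": 2.76, "APNCU": 4.99}

# Upper ends of the adjusted OR ranges quoted in the text for known risk factors
KNOWN_RISK_FACTORS = {
    "smoking during pregnancy": 2.63,
    "alcohol and drug use during pregnancy": 1.68,
}


def falls_short(confounder_or: float, e_value: float) -> bool:
    """True when the confounder's reported association is weaker than the E-value."""
    return confounder_or < e_value


def compare(e_values: dict, risk_factors: dict) -> dict:
    """Check every (index, risk factor) pair against the corresponding E-value."""
    return {
        (index, factor): falls_short(or_upper, ev)
        for index, ev in e_values.items()
        for factor, or_upper in risk_factors.items()
    }


for (index, factor), short in compare(E_VALUES, KNOWN_RISK_FACTORS).items():
    verdict = "below the E-value" if short else "at or above the E-value"
    print(f"{index}: {factor} is {verdict}")
```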

Limitations
Like any new metric, the E-value can be misused, and it should only be applied when researchers have a clear understanding of its scope and limitations. The recent literature on the E-value has pointed out the following caveats for its use 14,15,16,17,18 :
(1) The E-value is strictly concerned with the impact of unobserved confounders and does not evaluate other biases, such as sample bias, selective reporting, or other design flaws. These factors should be considered when interpreting an E-value, since a good study with a low E-value may produce more reliable results than poorly designed and controlled studies with a high E-value.
(2) The E-value may be less useful in the presence of multiple, possibly interacting unmeasured confounders, in which case "...one should perhaps question whether the data available are in fact adequate to get a reasonable estimate of the causal effect at all (...) it is perhaps time to leave that study data alone and pursue other data sources more adequate", as stated by VanderWeele et al. 17 (p. 4).
(3) Another limitation concerns the assumption of the same value for the confounder-exposure and confounder-outcome associations. For cases in which this assumption does not hold, more complicated methods have been developed for applying E-value-like metrics 1,2 . Using the index under these circumstances is valid upon the assumption that the E-value is a heuristic filter for the total maximum effect of all unknown confounders 14,15,16 .
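To illustrate caveat (3), the sketch below drops the equal-association assumption. It starts from the confounding factor B = RR_UD × RR_EU / (RR_UD + RR_EU − 1), requires B to be at least the observed RR, and solves for RR_UD, which gives RR_UD ≥ RR × (RR_EU − 1) / (RR_EU − RR) whenever RR_EU > RR. This rearrangement is a derivation made here for illustration, not a method described in the text, and the function name and inputs are hypothetical.

```python
import math


def min_rr_ud(rr_observed: float, rr_eu: float) -> float:
    """Smallest confounder-outcome risk ratio able to explain away rr_observed,
    given a confounder-exposure association of rr_eu.
    """
    if rr_observed <= 1.0:
        return 1.0  # nothing to explain away
    if rr_eu <= rr_observed:
        # B stays below RR_EU for any finite RR_UD, so it can never reach RR.
        return math.inf
    return rr_observed * (rr_eu - 1.0) / (rr_eu - rr_observed)


# Hypothetical observed RR = 2.0 with different confounder-exposure strengths
for rr_eu in (2.0, 2.5, 4.0, 10.0):
    print(f"RR_EU = {rr_eu:>4}: RR_UD must be at least {min_rr_ud(2.0, rr_eu):.2f}")

# When both associations are equal, the requirement coincides with the E-value.
e = 2.0 + math.sqrt(2.0 * (2.0 - 1.0))
print(f"RR_EU = E-value = {e:.2f}: RR_UD must be at least {min_rr_ud(2.0, e):.2f}")
```

The output shows the trade-off hidden by a single E-value: a confounder strongly associated with the exposure needs a weaker association with the outcome, and vice versa.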

Conclusion
Many procedures are available for conducting sensitivity analyses. However, because they are considered "...too complicated to describe in reports, (...) too difficult to present, occupy too much space" and given that "reviewers and editors were often unsympathetic and believed that they could not be understood", as emphasized by VanderWeele et al. 16 (p. 131-2), these procedures are not commonly used. Against this background, our study illustrated the use of a recently developed sensitivity index: the E-value, an intuitive and easily implemented tool that adds to the toolbox for dealing with causal inference in nonexperimental settings. "Statistical significance" metrics such as the p-value indicate the possible existence of relationships between exposure and outcome, but fail to address potential bias arising from unmeasured confounders, a purpose for which the E-value could be used.
Contributors
C. C. R. Vale contributed to data collection, analysis, and writing. N. K. O. Almeida and R. M. V. R. Almeida contributed to the study design, data analysis, and writing. All authors approved the final version for publication.

Conflicts of interest
The authors declare no conflicts of interest related to this study.