Sensitivity analysis for unmeasured confounders using an electronic spreadsheet

In studies assessing the effect of a given exposure variable on a specific outcome of interest, confounding may arise from the mistaken impression that the exposure variable is producing the outcome of interest, when in fact the observed effect is due to an existing confounder. However, quantitative techniques are rarely used to determine the potential influence of unmeasured confounders. Sensitivity analysis is a statistical technique that allows the quantitative measurement of the impact of an unmeasured confounding variable on the association of interest being assessed. The purpose of this study was to make it feasible to apply two sensitivity analysis methods available in the literature, developed by Rosenbaum and Greenland, using an electronic spreadsheet. This should make it easier for researchers to include this quantitative tool in the set of procedures commonly used at the stage of result validation.


DESCRIPTORS: Data interpretation, statistical. Sensitivity analysis. Confounding factors (Epidemiology). Observational studies.

INTRODUCTION
There is much epidemiological interest in establishing causes and relationships. While science is concerned with the frequency, distribution and determinants of disease, methodological procedures based on statistical models have been developed to identify causes of diseases. 3,4,7 However, these models rely on assumptions that frequently cannot be tested through the observed data; that is, the discussion of causality involves assessing the validity of the findings obtained in the studies.
A study is considered valid, with a resulting causal interpretation, if it is bias-free, i.e., there are no systematic errors that explain the association found as an alternative to the causal hypothesis. 4 In studies assessing the effect of a given exposure variable on a specific outcome of interest, confounding may result from the mistaken impression that the exposure variable produces the outcome of interest when the effect observed is actually due to an existing confounding factor. According to Koopman* (1997), confounding occurs when a non-causal association is observed between the exposure and the outcome of interest in a reference population. Two types of bias resulting from confounding may arise: overt bias, caused by confounders that are measured in the study, and hidden bias, caused by confounders that are not measured in the study 5 (1991).
When analyzing observational studies, the measured potential confounders are usually analytically "adjusted" using statistical techniques such as stratification and matching, among others. However, quantitative techniques are rarely used to determine the potential impact of unmeasured confounders. According to Greenland 2 (1996), the random errors and confounders measured in the data generation process often constitute only a fraction of the total error, and are rarely the only important sources of uncertainty. It is thus convenient to develop and use an appropriate statistical tool that allows a quantitative evaluation of such errors. Sensitivity analysis is a statistical technique that allows the quantitative measurement of the impact of an unmeasured confounding variable on the association of interest being assessed.
Although conceptually well-developed, the two sensitivity analysis methods available in the literature, developed by Rosenbaum 6 (1995) and Greenland 2 (1996), require laborious calculations not handled by currently available software programs. However, such methods may be fully applied through an electronic spreadsheet. The purpose of the present study is to make it feasible to apply each of these methods using an electronic spreadsheet, in order to make it easier for researchers to include this quantitative tool in the set of procedures commonly used at the stage of result validation. A spreadsheet was chosen because of its widespread use.

SENSITIVITY ANALYSIS METHODS
Rosenbaum 5 and Greenland 2 developed two sensitivity analysis methods, applicable to dichotomous variables, that allow analyses of the behavior of study results in the presence of unmeasured confounders. Also known as the external adjustment method, the Greenland method tries to quantify the variation in the association observed in a specific study when adjusted for a potential unmeasured confounding variable. The method consists of simulating various plausible values for the confounder prevalences by exposure level, specifically in those individuals who do not show the outcome, as well as for the magnitude of the association between the confounder and the outcome, and then calculating, for each combination studied, an estimate of the association between the exposure and the outcome "adjusted" for the specified confounding variable.
In contrast to the Greenland method, which considers the classic confounding scheme (i.e., the confounder must be associated with the exposure and be an independent predictor of the outcome), the Rosenbaum method works only with the association between the confounder and the exposure. This method quantifies the magnitude of the association between the unmeasured confounder and the exposure variable required to make the association found between the exposure and the outcome statistically non-significant, assuming that the association between the confounder and the outcome is strong enough for confounding to occur.

ELEMENTS AND NOTATIONS FOR THE APPLICATION OF A SENSITIVITY ANALYSIS
To formalize the Greenland 2 and Rosenbaum 6 methods, a hypothetical study is considered in which the exposure (E), the outcome (D) and the unmeasured confounder (Z) are dichotomous variables (1 = present; 0 = not present). Table 1 shows the general scheme for presenting the findings obtained in this hypothetical study.
In order to apply the Greenland method, Table 1 should be "stratified" by the unmeasured confounding variable Z, according to the scheme presented in Table 2.
The following magnitudes are now considered:
P_Z1: prevalence of the unmeasured confounding variable among exposed individuals;
P_Z0: prevalence of the unmeasured confounding variable among non-exposed individuals;
OR_DE: odds ratio between the outcome and the exposure;
OR_DZ: odds ratio between the outcome and the confounding variable;
OR_EZ: odds ratio between the exposure and the confounding variable.
The Greenland method speculates on the plausible values for OR_DZ, P_Z1 and P_Z0 and, consequently, on the possible values for the association between E and Z (OR_EZ), because OR_EZ is determined by the values of P_Z1 and P_Z0 according to formula (1):
OR_EZ = [P_Z1 (1 - P_Z0)] / [P_Z0 (1 - P_Z1)]   (1)
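The link between the two prevalences and OR_EZ can be checked numerically. The following Python sketch assumes the usual odds-ratio definition, OR_EZ = [P_Z1 / (1 - P_Z1)] / [P_Z0 / (1 - P_Z0)], and also solves it for P_Z0, which appears to be how P_Z0 is derived once OR_EZ and P_Z1 are fixed. The function names and input values are illustrative and are not taken from the spreadsheets.

```python
def odds_ratio_ez(p_z1, p_z0):
    """OR_EZ implied by the confounder prevalence among exposed (p_z1)
    and non-exposed (p_z0) individuals."""
    return (p_z1 * (1.0 - p_z0)) / (p_z0 * (1.0 - p_z1))


def p_z0_from_or(p_z1, or_ez):
    """Prevalence among the non-exposed that yields or_ez for a given p_z1
    (the odds-ratio definition solved for p_z0)."""
    return p_z1 / (p_z1 + or_ez * (1.0 - p_z1))


# Hypothetical values, chosen only for illustration.
p_z0 = p_z0_from_or(p_z1=0.5, or_ez=2.5)
print(round(p_z0, 4), round(odds_ratio_ez(0.5, p_z0), 4))
```

Fixing OR_EZ and P_Z1 and deriving P_Z0 keeps the three quantities mutually consistent, so only two of them need to be speculated.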
In order to find the values to complete Table 2, the hypothesis formulated is that the odds ratio between E and D has the same value in both Z strata (Z is a confounding variable for the association between E and D). Thus, by speculating about plausible values for these three (or four) magnitudes, various OR_DE values "adjusted" for Z are obtained, allowing an analysis of whether the resulting variations are epidemiologically relevant and point to findings other than those observed.
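The external adjustment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the spreadsheet itself: the function name, argument names and example counts are hypothetical, the prevalences are taken among individuals who do not show the outcome, and the expected number of cases with Z = 1 follows the same form as the formula quoted for cell C19 of the Greenland spreadsheet.

```python
def greenland_adjusted_or(a1, b1, a0, b0, or_dz, p_z1, p_z0):
    """Return (crude OR_DE, OR_DE within Z = 1, OR_DE within Z = 0).

    a1, b1: exposed individuals with / without the outcome
    a0, b0: non-exposed individuals with / without the outcome
    or_dz:  speculated odds ratio between the outcome and the confounder
    p_z1, p_z0: speculated prevalence of Z among exposed / non-exposed
                individuals who do not show the outcome
    """
    crude = (a1 * b0) / (a0 * b1)
    # Expected non-cases with Z = 1 at each exposure level.
    b11 = p_z1 * b1
    b01 = p_z0 * b0
    # Expected cases with Z = 1, assuming the same OR_DZ at both exposure levels.
    a11 = or_dz * a1 * b11 / (or_dz * b11 + b1 - b11)
    a01 = or_dz * a0 * b01 / (or_dz * b01 + b0 - b01)
    # Stratum-specific odds ratios between exposure and outcome.
    or_z1 = (a11 * b01) / (a01 * b11)
    or_z0 = ((a1 - a11) * (b0 - b01)) / ((a0 - a01) * (b1 - b11))
    return crude, or_z1, or_z0


# Hypothetical 2x2 table and speculated magnitudes, for illustration only.
crude, or_z1, or_z0 = greenland_adjusted_or(
    a1=30, b1=70, a0=20, b0=80, or_dz=5.0, p_z1=0.5, p_z0=0.2857
)
print(round(crude, 3), round(or_z1, 3), round(or_z0, 3))
```

Under the common-OR_DZ assumption the two stratum-specific odds ratios coincide, so either one can be read as the OR_DE "adjusted" for Z.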
On the other hand, the Rosenbaum method 5,6 speculates on the value of Γ, the magnitude of the association between the unmeasured confounder and the exposure that makes the observed association of interest (OR_DE) statistically non-significant. For dichotomous variables, the method is based on the Mantel-Haenszel statistic (T), a test statistic normally used in analyses that take into consideration a third variable that may "mask" the association found between the exposure and the outcome of interest 1 (1981). T is the total number of exposed individuals showing the outcome (T = A in the hypothetical case presented in Table 1). The expectation and variance of T are calculated approximately, using the normal distribution, with the marginal totals of individuals showing the outcome and of exposed individuals fixed at R and M in Table 1. The expectation is given by a second-degree equation, obtained under the null hypothesis that the exposure is not associated with the outcome by setting the odds ratio between the exposure and the outcome equal to the speculated value of the association between the exposure and the unmeasured confounder (Γ). The variance calculation considers the expectation value and the A 1+ , R, M, N and Γ values. Once the expectation and variance are obtained, the standardized value of T (T_std) is calculated and the p-value is obtained for the upper limit. To calculate the p-value for the lower limit, Γ is replaced by 1/Γ in the odds ratio equation between the exposure and the outcome, and the expectation and variance calculations are repeated. The value sought by this method is the lowest value of Γ that makes the observed association of interest (OR_DE) statistically non-significant at a 95% confidence level. The formulas for calculating the expectation, variance and standardized T were developed by Stevens 8 (1951).
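The calculation for a single 2x2 table can be sketched in Python as follows. This is an illustration under assumptions rather than a transcription of the spreadsheet formulas: it uses the standard large-sample approximation in which the expectation is the admissible root of the quadratic obtained by setting the table odds ratio equal to Γ, the variance is the reciprocal of the sum of reciprocals of the four expected cell counts, and the p-value is a one-sided upper-tail normal probability without continuity correction. Function names and the example counts are hypothetical.

```python
import math


def rosenbaum_moments(n_total, n_exposed, n_outcome, gamma):
    """Approximate expectation and variance of T when the odds ratio is gamma."""
    N, M, R = n_total, n_exposed, n_outcome
    if math.isclose(gamma, 1.0):
        mean = R * M / N  # the quadratic degenerates to a linear equation
    else:
        # Quadratic from mean*(N-R-M+mean) = gamma*(R-mean)*(M-mean).
        a = gamma - 1.0
        b = -((gamma - 1.0) * (R + M) + N)
        c = gamma * R * M
        disc = math.sqrt(b * b - 4.0 * a * c)
        lo, hi = max(0, R + M - N), min(R, M)
        roots = [(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)]
        mean = next(r for r in roots if lo <= r <= hi)  # admissible root
    var = 1.0 / (1.0 / mean + 1.0 / (R - mean)
                 + 1.0 / (M - mean) + 1.0 / (N - R - M + mean))
    return mean, var


def upper_p_value(t_obs, mean, var):
    """One-sided upper-tail normal p-value for the standardized statistic."""
    t_std = (t_obs - mean) / math.sqrt(var)
    return 0.5 * math.erfc(t_std / math.sqrt(2.0))


# Hypothetical table: N = 200, M = 100 exposed, R = 50 with outcome, T = 35.
for gamma in (1.0, 1.5, 2.0):
    p_up = upper_p_value(35, *rosenbaum_moments(200, 100, 50, gamma))
    p_low = upper_p_value(35, *rosenbaum_moments(200, 100, 50, 1.0 / gamma))
    print(gamma, round(p_low, 4), round(p_up, 4))
```

Scanning increasing Γ values until the upper-limit p-value exceeds 0.05 gives the smallest Γ at which the association would no longer be significant, which is the quantity the method looks for.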

Table 1. General scheme of the frequencies observed for exposure and outcome (1 = present; 0 = not present). [Table body not recovered; headings: Outcome, Exposure.]

Table 2. General scheme (expected data) for the sensitivity analysis (Greenland) of an unmeasured dichotomous variable Z. [Table body not recovered; headings: Outcome, Unmeasured variable.]

SENSITIVITY ANALYSIS PERFORMANCE SPREADSHEETS
To make available the two sensitivity analysis methods under consideration, two spreadsheets were developed that allow the calculations to be carried out as required for their application.
Figure 1 shows the spreadsheet for applying the Greenland method. All cells in this spreadsheet should be completed as described in Table 3. Cells C6, C7, E6 and E7 must be filled in with the data observed in the study, and the magnitudes to be speculated should be entered into cells B12, B13 and B14. Once completed as described in Table 1, all findings will be automatically generated by the spreadsheet. Cells B23 and B24 show the odds ratio between the exposure variable and the outcome among individuals exposed and not exposed to Z, respectively, adjusted for the speculated values in cells B12, B13 and B14. These values are identical, as Z is considered a confounding variable for E and D in the development of the method. Cell B22 provides the value of the association between the confounder and the exposure variable (OR_EZ).
In turn, the spreadsheet in Figure 2 (Rosenbaum method) should be completed as described in Table 4. When the specified cells are completed, the odds ratio observed between the exposure and the outcome (OR_DE) is calculated automatically in cell H6; the T expectation, T variance, T_std statistic and p-value for Γ = 1 are automatically calculated in cells B16, C16, G16 and H16, respectively. Cell B20 should be completed with the values to be speculated for Γ other than 1. When completed, the T expectation, T variance, T_std statistic and p-value of the upper limit are automatically calculated in cells G24, H24, C27 and D27, respectively. The expectation and variance required to calculate the p-value for the lower limit are obtained in the same way, replacing the Γ value by 1/Γ in cell B20.
If there are two or more strata, the expectations and variances for each stratum should be calculated for each Γ value considered. The T statistic is given by the sum, over all strata, of the exposed individuals showing the outcome, and the T expectation and T variance are given by the sums of the expectations and variances over all strata, respectively. After obtaining the T statistic, T expectation and T variance, the T_std value and the p-value for the upper limit are calculated using formulas similar to those described in cells C27 and D27, respectively. Once again, to calculate the p-value of the lower limit, the calculations are repeated, replacing Γ by 1/Γ.
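The pooling over strata can be sketched in Python as below. The sketch assumes that the expectation and variance of T have already been obtained for each stratum at the Γ value under study, and that the p-value is a one-sided upper-tail normal probability; the function name and the numbers in the example are hypothetical.

```python
import math


def pooled_upper_p_value(strata):
    """strata: sequence of (t_obs, expectation, variance), one tuple per stratum.

    Returns the standardized statistic and its upper-tail normal p-value,
    using sums of T, expectations and variances over all strata.
    """
    t_sum = sum(t for t, _, _ in strata)
    e_sum = sum(e for _, e, _ in strata)
    v_sum = sum(v for _, _, v in strata)
    t_std = (t_sum - e_sum) / math.sqrt(v_sum)
    p_upper = 0.5 * math.erfc(t_std / math.sqrt(2.0))
    return t_std, p_upper


# Two hypothetical strata, values chosen only for illustration.
print(pooled_upper_p_value([(35, 31.4, 8.9), (20, 15.0, 6.0)]))
```

Calling the same function with the per-stratum moments computed at 1/Γ gives the lower-limit p-value.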

EXAMPLE OF A SENSITIVITY ANALYSIS APPLICATION
As an example of the use of the spreadsheets presented, a hypothetical observational study is considered, analyzing the association between exposure to a factor E and an outcome of interest D, whose findings are presented in Table 5.
In order to verify the behavior of the association found in the presence of a potential unmeasured confounder (Z), it was decided to apply a sensitivity analysis to the observed data. Due to the importance of the two methods available, it is suggested that they be applied in an integrated manner* (2005). Initially, the Rosenbaum method was applied in order to obtain the Γ value that makes the OR_DE value adjusted for the unmeasured confounding variable statistically non-significant. The spreadsheet shown in Figure 2 was used for Γ values equal to 1.0, 1.5, 1.8, 1.9, 2.0 and 3.0, and also for the corresponding 1/Γ values, and the findings are shown in Table 6.
According to Table 6, the lowest Γ value that makes the adjusted OR_DE value statistically non-significant at a 95% confidence level is Γ = 1.9. The suggestion is to start the Greenland method by taking the minimum value of Γ found with the Rosenbaum method as the initial value for speculating on the association between the exposure variable and the unmeasured confounder Z (OR_EZ). Thus, the spreadsheet in Figure 1 was used with the OR_EZ values set at 1.9, 2.5 and 3.0, the OR_DZ values "speculated" at 3.0, 5.0, 10.0 and 15.0, the P_Z1 values varying between 0.1 and 0.9, and the corresponding P_Z0 values obtained through formula (2).
Analyzing the findings shown in Table 7, based on a hypothetical study, it can be noted that the adjusted OR_DE values move away from the observed OR_DE value (2.57) when the unmeasured confounder increases the odds of exposure by a factor of 2.5 and also has an odds ratio with the outcome of at least 10.

FINAL CONSIDERATIONS
The two spreadsheets presented in Figures 1 and 2 are intended to provide an operating tool that streamlines the application of a sensitivity analysis by researchers, allowing quantitative measurements of the impact of an unmeasured confounding variable on the association of interest that is being assessed.
The spreadsheets provided are easy to use and allow the immediate application of the Greenland 2 (1996) and Rosenbaum 5 (1995) methods. The Greenland method focuses more on the epidemiological elements of the study, while the Rosenbaum method addresses the statistical significance of the findings observed.
As these two approaches are important for observational studies, the example presented suggests a way of integrating the two techniques in order to direct and reduce the number of calculations required for a sensitivity analysis. The calculations presented for these two methods address dichotomous exposure, outcome and confounder variables.
It should be stressed that, with the Greenland method, should it prove necessary to stratify for a measured confounder, the calculations in the method description should be repeated for each stratum, and the findings obtained for each of them should then be combined. Moreover, the spreadsheet provided to apply the Rosenbaum method may be used when the marginal totals for each stratum are large, i.e., when M, N - M, R and N - R are large. Otherwise, other expressions for the exact expectation and variance of the T distribution should be used, which may be found in Rosenbaum 6 (1995).
* Koopman NJ. Stratification of exposure-disease relationships upon a third variable and the assessment of joint effects [monograph on the Internet]. Ann Arbor; 1997. Available at: http://www.sph.umich.edu/group/epid/ [Accessed 17 May 2006]

C6: Value observed for the total number of exposed individuals showing the outcome
C7: Value observed for the total number of exposed individuals not showing the outcome
E6: Value observed for the total number of non-exposed individuals showing the outcome
E7: Value observed for the total number of non-exposed individuals not showing the outcome
B12: Speculated value of the odds ratio between the outcome and the confounder
B13: Proportion of the confounder Z among exposed individuals
B22: Speculated value of the odds ratio between the exposure and the confounder
B9: =(C6*E7)/(E6*C7)
C19: =(B12*C6*C20)/(B12*C20+C7-C20)

Figure 1. Excel spreadsheet for the application of a sensitivity analysis using the Greenland method.

Figure 2. Excel spreadsheet for the application of a sensitivity analysis using the Rosenbaum method.

Table 4. Description of the spreadsheet cells when using the Rosenbaum method, shown in Figure 2.

Table 3. Description of the spreadsheet cells when using the Greenland method presented in