Bias of using odds ratio estimates in multinomial logistic regressions to estimate relative risk or prevalence ratio and alternatives

METHODOLOGICAL ISSUES http://dx.doi.org/10.1590/0102-311X00077313


Abstract
Recent studies have emphasized that there is no justification for using the odds ratio (OR) as an approximation of the relative risk (RR) or prevalence ratio (PR). Erroneous interpretations of the OR as RR or PR must be avoided, as several studies have shown that the OR is not a good approximation for these measures when the outcome is common (> 10%). For multinomial outcomes it is usual to use the multinomial logistic regression. In this context, there are no studies showing the impact of the approximation of the OR in the estimates of RR or PR. This study aimed to present and discuss alternative methods to multinomial logistic regression based upon robust Poisson regression and the log-binomial model. The approaches were compared by simulating various possible scenarios. The results showed that the proposed models have more precise and accurate estimates for the RR or PR than the multinomial logistic regression, as in the case of the binary outcome. Thus also for multinomial outcomes the OR must not be used as an approximation of the RR or PR, since this may lead to incorrect conclusions.
Odds Ratio; Prevalence Ratio; Logistic Models; Relative Risk

Introduction
In clinical and epidemiological studies one often wishes to estimate the effects of factors on binary outcomes. For these cases, when one is interested in estimating the relative risk (RR) or prevalence ratio (PR), it is already well established that logistic regression is not the most suitable statistical analysis, particularly when the outcome is common (> 10%). Several authors have shown that the odds ratio (OR) obtained by the logistic model can underestimate or overestimate the RR or the PR 1,2,3,4. Alternatives have also been suggested and discussed, such as the use of the Poisson regression with robust variance and the log-binomial model 1,3,4,5,6,7,8,9,10.
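The size of the discrepancy follows directly from the definitions of the two measures. As a short worked derivation, write $p_1$ and $p_0$ for the probability of the outcome among exposed and non-exposed individuals (notation introduced here for illustration, not taken from the original text):

```latex
\[
\mathrm{RR} = \frac{p_1}{p_0},
\qquad
\mathrm{OR} = \frac{p_1/(1-p_1)}{p_0/(1-p_0)}
            = \mathrm{RR} \times \frac{1-p_0}{1-p_1}.
\]
```

With a common outcome, say $p_0 = 0.30$ and $p_1 = 0.45$, the RR is 1.5 while the OR is about 1.91. With a rare outcome, say $p_0 = 0.01$ and $p_1 = 0.015$, the RR is again 1.5 and the OR is about 1.51, which is why the approximation is only tolerable when the outcome is rare.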
Although recommended methods seem to be well established for dichotomous outcomes, there is still little discussion concerning which method should be used to estimate the RR or the PR for outcomes with three or more categories. In such circumstances, one usually uses the multinomial logistic regression which, like the binary logistic model, estimates the OR, which is then used as an approximation of the RR or the PR.
Blizzard & Hosmer 11 proposed the log-multinomial regression model, which directly estimates the RR or PR when the outcome is multinomial. To evaluate whether this leads to an underestimation or overestimation of the RR or PR when we use the OR obtained through the multinomial logistic regression, ideally one would compare the results obtained using this method with those obtained using the log-multinomial model. However, we are not aware of the availability of a computational routine for this model. Moreover, Blizzard & Hosmer 11 concluded that separate robust Poisson regressions produce estimates very similar to those of the log-multinomial model.
Therefore, this work aims to study through simulation the accuracy and precision of the OR, obtained through the multinomial logistic regression, to estimate the RR and to compare it with the RR estimated by separate robust Poisson regressions and by separate log-binomial models. The separate log-binomial models have been included in the study as they have proved to be the most suitable for estimating the RR or the PR for binary outcomes. Another reason for choosing the robust Poisson regression and the log-binomial model is that both of them are implemented in various statistical programs.

Methodology
It is well known that RR and PR have very different epidemiological meanings, but the mathematical definitions are equivalent. The accuracy and precision of the model to estimate the RR are the same as for estimating the PR, and so henceforth we will only refer to the RR.
In order to illustrate the three techniques evaluated, without loss of generality, we will consider a multinomial outcome Y with the categories A, B and C, and a binary exposure factor X (yes vs. no). We are interested in estimating the RR for the exposure X with the categories B (RR_B) and C (RR_C) of the outcome and we are not interested in the association with category A.
The first method studied in this work was the multinomial logistic regression. The definition of the model and details about the properties of its estimators can be found in Hosmer & Lemeshow 12. To fit it, we used the VGAM library of the R program, version 2.9.0 (The R Foundation for Statistical Computing, Vienna, Austria; http://www.r-project.org) 13. The syntax for fitting the model is:
As in Blizzard & Hosmer 11, the use of separate Poisson regressions, the second method evaluated, consists of creating binary outcomes from the original multinomial outcome, and then fitting separate robust Poisson regressions between each binary outcome and the factors. For the aforementioned outcome Y, we create the dichotomous variables Y_B and Y_C as follows:
To fit the models we used the libraries lmtest 14 and sandwich 15,16.
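With a single binary exposure the fitted models are saturated, so the point estimates can be sketched in closed form: the RR from a separate log-binomial or Poisson fit reduces to a ratio of sample proportions of the dichotomized outcome, and the OR from the multinomial logistic fit reduces to a cross-product ratio against the reference category. The Python sketch below is an illustration under those assumptions, not the authors' R syntax. The coding Y_B = 1 if Y = B and 0 otherwise, the choice of A as reference category, and the counts are all assumptions made for the example.

```python
def dichotomize(counts, category):
    """Collapse one row of multinomial counts into (events, non-events).

    Assumed coding: Y_category = 1 if Y == category, 0 otherwise.
    """
    events = counts[category]
    return events, sum(counts.values()) - events


def relative_risk(exposed, unexposed, category):
    """Ratio of sample proportions of the dichotomized outcome."""
    e1, ne1 = dichotomize(exposed, category)
    e0, ne0 = dichotomize(unexposed, category)
    return (e1 / (e1 + ne1)) / (e0 / (e0 + ne0))


def odds_ratio(exposed, unexposed, category, reference="A"):
    """Cross-product ratio of `category` against the reference category."""
    return (exposed[category] / exposed[reference]) / (
        unexposed[category] / unexposed[reference]
    )


# Hypothetical 2x3 table (rows: exposed / not exposed; columns: A, B, C).
exposed = {"A": 40, "B": 36, "C": 24}
unexposed = {"A": 64, "B": 24, "C": 12}

for cat in ("B", "C"):
    rr = relative_risk(exposed, unexposed, cat)
    oratio = odds_ratio(exposed, unexposed, cat)
    print(cat, round(rr, 2), round(oratio, 2))
```

For this hypothetical table the RRs are 1.5 and 2.0, whereas the corresponding ORs are about 2.4 and 3.2, so reading the OR as an RR would overstate both effects.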

Simulation
In the simulation study we considered an outcome (Y) with three categories (A, B and C) and a binary exposure factor (X). The RR between exposed and non-exposed individuals was set to 1.5 for outcome B (RR_B = 1.5) and to 2 for outcome C (RR_C = 2). The simulated scenarios combined three probabilities of exposure, P(X=+), equal to 0.1, 0.5 and 0.9, and three marginal distributions for the outcome Y: P(Y=A) = 0.90, P(Y=B) = 0.05 and P(Y=C) = 0.05; P(Y=A) = 0.60, P(Y=B) = 0.35 and P(Y=C) = 0.05; and P(Y=A) = 0.40, P(Y=B) = 0.30 and P(Y=C) = 0.30. Thus, nine different scenarios were simulated, each one being repeated for three sample sizes (100, 500 and 1,000).
Considering these fixed RRs and probabilities for each response category, the conditional probabilities of each outcome given the exposure factor were calculated. Using the R program, for each sample size and scenario, 10,000 2x3 contingency tables were simulated, in which the two rows are defined as presence and absence of the exposure factor (X), and the three columns are the categories of the outcome (Y). These tables were simulated using two independent multinomial distributions (one for those exposed and the other for those not exposed) with probabilities specified in Table 1.
In order to avoid the occurrence of RR estimates equal to zero or infinity, if a simulated table contained a cell with zero frequency, it was discarded and another one generated. The consequences of this will be discussed further on.
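The conditional probabilities can be recovered from the marginal distribution and the fixed RRs through the law of total probability: P(Y=k) = P(Y=k | X=-) [1 - P(X=+) + RR_k P(X=+)]. The Python sketch below is one way to do this calculation, assuming this is how the conditional probabilities were obtained; the function names are hypothetical, and the resulting values are not copied from Table 1.

```python
def conditional_probs(p_x, marginals, rr):
    """Conditional outcome probabilities given exposure.

    p_x       -- P(X=+)
    marginals -- marginal probabilities of the non-reference categories,
                 e.g. {"B": 0.30, "C": 0.30}
    rr        -- true relative risks, e.g. {"B": 1.5, "C": 2.0}
    """
    exposed, unexposed = {}, {}
    for k in rr:
        # Solve P(Y=k) = p0 * (1 - p_x) + rr_k * p0 * p_x for p0.
        p0 = marginals[k] / (1 - p_x + rr[k] * p_x)
        unexposed[k] = p0
        exposed[k] = rr[k] * p0
    # Category A takes the remaining probability in each row.
    exposed["A"] = 1 - sum(exposed.values())
    unexposed["A"] = 1 - sum(unexposed.values())
    return exposed, unexposed


def implied_odds_ratio(exposed, unexposed, category, reference="A"):
    """True OR (against the reference category) implied by the probabilities."""
    return (exposed[category] / exposed[reference]) / (
        unexposed[category] / unexposed[reference]
    )


# One scenario from the text: P(X=+) = 0.5, P(Y=B) = P(Y=C) = 0.30.
p_exposed, p_unexposed = conditional_probs(
    0.5, {"B": 0.30, "C": 0.30}, {"B": 1.5, "C": 2.0}
)
for cat in ("B", "C"):
    print(cat, round(p_exposed[cat], 3), round(p_unexposed[cat], 3),
          round(implied_odds_ratio(p_exposed, p_unexposed, cat), 2))
```

In this scenario the true RR of 1.5 for outcome B corresponds to a true OR of 3.5 against category A, which gives a sense of what the multinomial logistic model is actually targeting.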
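The table-generation step, including the rejection of tables with empty cells, might be sketched as follows in Python (the study itself used R). Two points are assumptions of this sketch rather than details given in the text: the number of exposed individuals is fixed at n x P(X=+) instead of being drawn at random, and the row probabilities are those implied by RR_B = 1.5 and RR_C = 2 for P(X=+) = 0.5 and P(Y=B) = P(Y=C) = 0.30. The function names and the seed are hypothetical.

```python
import random


def draw_row(n, probs, rng):
    """Draw one multinomial row of size n; probs maps category -> probability."""
    cats = list(probs)
    draws = rng.choices(cats, weights=[probs[c] for c in cats], k=n)
    return {c: draws.count(c) for c in cats}


def simulate_table(n, p_x, exposed_p, unexposed_p, rng):
    """Simulate one 2x3 table, redrawing until no cell is empty."""
    n1 = round(n * p_x)  # ASSUMPTION: fixed number of exposed individuals
    n0 = n - n1
    while True:
        exposed = draw_row(n1, exposed_p, rng)
        unexposed = draw_row(n0, unexposed_p, rng)
        # Discard tables with a zero cell, as described in the text.
        if min(exposed.values()) > 0 and min(unexposed.values()) > 0:
            return exposed, unexposed


rng = random.Random(2014)  # arbitrary seed for reproducibility
exposed_p = {"A": 0.24, "B": 0.36, "C": 0.40}
unexposed_p = {"A": 0.56, "B": 0.24, "C": 0.20}

# 200 tables instead of 10,000, to keep the example quick.
tables = [simulate_table(500, 0.5, exposed_p, unexposed_p, rng) for _ in range(200)]
print(tables[0])
```

Because rejected tables are redrawn, the retained tables are a conditional sample; with small n and rare outcomes this conditioning removes a non-negligible share of the draws.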
For each simulated table, the point estimate and respective 95% confidence interval (95%CI) of the RRs obtained by each method were recorded. From the 10,000 estimates resulting for each RR (RR_B and RR_C) and each method, the following measures of accuracy and precision were calculated: averages of estimates, absolute average biases and coverage of the confidence intervals as a percentage. The absolute average bias of the estimators for each parameter $\theta$, i.e., each true RR, was calculated as $\left| \frac{1}{10{,}000} \sum_{i=1}^{10{,}000} \hat{\theta}_i - \theta \right|$, where $\hat{\theta}_i$ is the estimated RR of the i-th table from the respective model. The coverage percentage of the confidence intervals was calculated as: %Coverage = (number of confidence intervals which contained $\theta$ / 10,000) x 100.
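The three summary measures can be sketched as a small Python function. It reads the absolute average bias as the absolute difference between the mean estimate and the true value, which is an assumed interpretation of the definition, and the input values are invented purely to show the arithmetic.

```python
def summarize(estimates, intervals, theta):
    """Return (mean estimate, absolute average bias, coverage in %).

    estimates -- list of point estimates of the RR
    intervals -- list of (lower, upper) confidence limits, one per estimate
    theta     -- true value of the RR
    """
    m = len(estimates)
    mean_estimate = sum(estimates) / m
    # ASSUMPTION: bias is taken as |mean of estimates - true value|.
    abs_bias = abs(mean_estimate - theta)
    covered = sum(1 for lower, upper in intervals if lower <= theta <= upper)
    return mean_estimate, abs_bias, 100 * covered / m


# Invented toy inputs: three estimates of a true RR of 1.5.
toy_estimates = [1.4, 1.6, 1.7]
toy_intervals = [(1.0, 2.0), (1.1, 2.3), (1.55, 1.9)]
mean_est, bias, coverage = summarize(toy_estimates, toy_intervals, 1.5)
print(round(mean_est, 3), round(bias, 3), round(coverage, 1))
```

Here two of the three intervals contain 1.5, so the coverage is about 66.7%, and the mean estimate of about 1.567 gives an absolute average bias of about 0.067.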

Results
The separate robust Poisson regression and separate log-binomial models had practically identical measures of accuracy and precision. Therefore, we will only display here the results of the separate log-binomial model and the multinomial logistic model. The tables with the results for the robust Poisson regression can be obtained at www.mat.ufrgs.br/~camey/Poisson. The results of the simulations for the parameters RR_B and RR_C are summarized in Tables 2 and 3, respectively.
For both RR_B and RR_C, in nearly all the scenarios, the log-binomial model had more accurate estimates (smaller averages of absolute biases) than the multinomial logistic regression model. In some cases the average absolute bias of the multinomial logistic regression model was more than 2,000 times the value of the average absolute bias of the log-binomial model (Tables 2 and 3, scenario 7, n = 500 and n = 1,000). The exceptions occurred only for RR_C in scenarios 3 and 6 with n = 500 and n = 100, respectively. Generally speaking, the ratio between the biases is nearer 1 when the incidence of the outcomes is low [ and/or ].

Table 2
Averages of the estimates, absolute bias, and coverage percentage of the 95% confidence interval (95%CI) for the relative risk of outcome B, RR B = 1.5, obtained by the two models and ratio between average absolute biases.

For the parameter RR_B, except for scenario 3, the averages of the absolute biases diminished as we increased the sample size. For the parameter RR_C, in contrast, there were many cases in which increasing the sample size did not reduce the bias: scenarios 2, 3, 5, 6 and 9.
In the log-binomial model, for both parameters, in only three situations did the coverage percentage of the intervals remain below the nominal value of 95% (see Table 2, n = 100, third scenario and Table 3, n = 100, third and sixth scenarios), always with a difference of less than 10%. However, for the multinomial logistic model, the coverage percentage of the confidence interval was only near the nominal value of 95% when there was a low prevalence of outcomes and/or sample size of n = 100. In some cases the coverage percentage was near zero (see scenarios 7 and 8, in Tables 2 and 3).
It should be pointed out that, for all the tables, there were no problems of convergence in the log-binomial model or any unallowable solutions for the probabilities estimated by the robust Poisson regression model, i.e., all the probabilities assumed values between 0 and 1.
A fourth approach was also studied in this work, which was the robust Poisson regression with offset 17. We have not detailed this method, or displayed its results here (available at www.mat.ufrgs.br/~camey/Poisson), as they were practically identical to those obtained using the separate robust Poisson regression without offset. Moreover, the calculation of the offset is a practical disadvantage, as it becomes increasingly complex as categorical exposure (or confounding) factors are added, and it cannot be calculated in the presence of continuous exposure (or confounding) factors.

Comparison using real data
As an example, we used a data set of the Study of Food Intake and Eating Behavior in Pregnancy (ECCAGE), which is a prospective cohort study of pregnant women followed until the puerperium 18. These pregnant women were receiving care in 18 primary care units in two cities in Southern Brazil. They were first evaluated between the 16th and 36th week of pregnancy at a prenatal visit. In this example we use a sample of 667 pregnant women.
Total weight gain was classified according to the recommendation of the Institute of Medicine 2009 19. Total weight gain between 12.5 and 18kg was considered adequate for women with pre-pregnancy body mass index (BMI) below 18.5kg/m², between 11.5 and 16kg for women with pre-pregnancy BMI between 18.5 and 24.9kg/m², and between 7 and 11.5kg for women with pre-pregnancy BMI between 25.0 and 29.9kg/m². Total weight gain between 5 and 9kg was considered appropriate when pre-pregnancy BMI was greater than or equal to 30.0kg/m². Incidences of pregnancy weight gain according to the categories of pre-pregnancy BMI 20 are shown in Table 4.
The results of fitting the separate log-binomial models and the multinomial logistic regression are shown in Table 5. In this example we can perceive that the choice of model influenced the direction, magnitude and significance of the effect (RR).
The first difference between the models can be observed in the RR of insufficient weight gain for the women who had a pre-pregnancy BMI greater than or equal to 30 (obese), which in spite of not being significant is rather illustrative. When the multinomial regression was used, being obese was a risk factor (RR = 1.42) for insufficient weight gain, whereas by the log-binomial model it becomes a protective factor (RR = 0.81).
The influence of the choice of model on the magnitude of the effect can be seen in Table 5 in several comparisons between the estimates of the two models, as for example in the RR of excessive gain for women with a BMI greater than or equal to 25 and less than 30kg/m² (overweight). In this case there is a reduction of approximately 70% in the magnitude of the effect when the log-binomial model is used. That the choice of model influences the significance of the RR can be seen through the RR for insufficient weight gain of women with a BMI greater than or equal to 25 and less than 30kg/m² (overweight). By the multinomial logistic model, being overweight is not a statistically significant protective factor for insufficient weight gain, unlike what occurs when the log-binomial model is used.
This example makes it clear that the suitable choice of model influences the result of a study in several ways.

Conclusions
As in the case in which the outcome is dichotomous, the results showed that the multinomial logistic model should not be used to estimate the RR or PR with polytomous outcomes, as its estimates generally have greater bias. As a result of the bias of the estimator, the confidence interval (CI) is centered upon a value distant from the parameter, reducing the coverage of the CI.
The ratio between the biases of the RR estimated by the multinomial logistic model compared with those estimated by the log-binomial model is nearly always greater than 1, and this ratio increases as the incidence of the outcomes increases. This also occurs when the outcome is dichotomous.
In spite of it being generally agreed that increasing the sample size improves the features of the estimates, we can clearly see that when the multinomial logistic regression is used, increasing the sample size produces a CI which in most cases does not contain the parameter. Three cases occurred with the larger sample size in which the coverage is near to zero, meaning that of the 10,000 CIs estimated practically none contained the parameter.
As mentioned above, non-convergence of the log-binomial model did not occur in our simulations, although it is known to occur at times in practice 5,6,7,21. We believe that this was due to the design of our simulation, with only one binary factor, and to the discarding of contingency tables with null cells. For the same reason, there were no situations in which the estimated probabilities were greater than one using the robust Poisson regression. It is worth noting that when there is a convergence problem in the log-binomial model, the Poisson regression in general estimates probabilities greater than one 5. This occurs because the Poisson regression models the mean of the Poisson distribution, which is a positive number but is not restricted to being less than one.
It is known that there already exist proposals for addressing the problem of non-convergence of the log-binomial model, such as the COPY method 6,21 and the use of non-linear programming 22,23. More realistic simulations, including more exposure factors, both discrete and quantitative, are already being executed.
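The point about unrestricted fitted values can be illustrated with a small Python sketch of a log-linear mean. The coefficients below are hypothetical (a baseline probability of 0.4 and an RR of 1.5 per unit of a quantitative factor) and are not estimates from this study.

```python
from math import exp, log


def log_linear_mean(b0, b1, x):
    """Fitted mean under a log link: exp(b0 + b1 * x)."""
    return exp(b0 + b1 * x)


# Hypothetical coefficients: baseline 0.4, RR of 1.5 per unit of x.
b0, b1 = log(0.4), log(1.5)
fitted = [log_linear_mean(b0, b1, x) for x in range(4)]
print([round(value, 2) for value in fitted])
```

The fitted values are about 0.4, 0.6, 0.9 and 1.35: nothing in the log link prevents the last one from exceeding one, which is exactly the region where a log-binomial fit would run into the boundary of its parameter space.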
We would like to stress that for the choice of method of analysis, it is important to review the conceptual aspects of the measure of association which one wishes to estimate, as already pointed out by other authors 1,3,8,24, avoiding or minimizing the erroneous interpretation of OR as RR or PR, especially by readers. For a multinomial outcome this becomes even more important, as the OR is interpreted relative to a reference category of the outcome, whereas the RR is the risk ratio for the category of interest between levels of the factor, with no reference category. Indeed, we could estimate the RR even for the reference category. But as the main purpose of this paper is to compare the results with multinomial logistic regression, where this is impossible, we did not estimate it.
This study is relevant regarding the discussion of the estimate of RR or PR when the outcomes are multinomial, showing that there is no justification for using the multinomial logistic regression to estimate them, since the latter estimates the OR. Our present recommendation is analogous to that of Spiegelman & Hertzmark 8 for binary outcomes: one should first try to fit the separate log-binomial models and, if they do not converge, fit the separate robust Poisson regressions.

Contributors
S. A. Camey and A. Vigo participated in the literature review, simulation design, supervision of implementation and preparation of the text. V. B. L. Torman contributed towards the literature review, writing simulations and text preparation. V. N. Hirakata collaborated on the literature review and preparing the text. R. X. Cortes contributed to writing and implementing the simulations, as well as preparing the text.

Cad. Saúde Pública, Rio de Janeiro, 30(1):21-29, jan, 2014
It should be pointed out that the Poisson regression with robust variance and the log-binomial model are also available in other statistical software such as SAS (SAS Inst., Cary, USA), SPSS (SPSS Inc., Chicago, USA) and Stata (Stata Corp., College Station, USA).

Table 1
Probabilities used in the different simulation scenarios.

Table 3
Averages of the estimated values, absolute bias and coverage percentage of the 95% confidence interval (95%CI) for the relative risk of outcome C, RR C = 2.0, obtained by the two models and ratio between average absolute biases.

Table 4
Incidence of total pregnancy weight gain according to pre-pregnancy body mass index (BMI) of women receiving care at primary care services in southern Brazil. Porto Alegre and Bento Gonçalves, Rio Grande do Sul State, Brazil, 2007 (N = 667).

Table 5
Estimated relative risk of insufficient and excessive gain, by the log-binomial model and by multinomial logistic regression for pregnant women receiving care at primary care services in southern Brazil. Porto Alegre and Bento Gonçalves, Rio Grande do Sul State, Brazil, 2007 (N = 667).