Validity of the Brazilian version of WHOQOL-BREF in depressed patients using Rasch modelling

Rocha NS & Fleck MP

OBJECTIVE: To assess the validity of the Brazilian version of the World Health Organization Quality of Life Instrument, abbreviated version (WHOQOL-BREF) in adults with major depression, using Rasch modelling.
METHODS: Study analyzing data from the baseline sample of the Longitudinal Investigation of Depression Outcomes in Brazil, including a total of 208 patients with major depression recruited in a primary care service in Porto Alegre (Southern Brazil), in 1999. The Center for Epidemiological Studies Depression Scale was used to assess intensity of depression; the WHOQOL-BREF to assess generic quality of life; and the Composite International Diagnostic Interview version 2.1 for the diagnosis of depression.
RESULTS: In the Rasch analysis, the four domains of WHOQOL-BREF showed appropriate fit to this model. Some items needed adjustments: four items were rescored (pain, finances, services, and transport); two items (work and activity) were identified as having dependency of responses; and one item was deleted (sleep) due to multidimensionality.
CONCLUSIONS: The validation of the WHOQOL-BREF Brazilian version using Rasch analysis complements previous validation studies, evidencing the robustness of this instrument as a generic cross-cultural quality of life measure.
DESCRIPTORS: Depression. Quality of Life. Questionnaires. World Health Organization. Translations. Validation Studies.

The World Health Organization Quality of Life Instrument, abbreviated version (WHOQOL-BREF) is a generic quality of life (QoL) measure that was developed simultaneously in many cultures and languages by the World Health Organization. Although the WHOQOL-BREF was developed using this methodology to ensure its cross-cultural validity, that validity was established in previous research using only Classical Test Theory (CTT).23 In consonance with international WHOQOL-BREF validation studies, the Brazilian Portuguese version was also validated using CTT.7,10
Modern statistical analyses such as Rasch analysis have been pointed to as a useful way to yield measures that depend neither on the sample nor on the scale.2,20,24 This method may also be a complementary tool for validation studies that used CTT.17,18 Based on the assumption of invariability posed by Rasch analysis, we can identify whether or not items that are part of a scale are affected by external factors such as presence of a depressive episode, age, gender and culture. Some authors have questioned the validity of QoL measures in depressed patients,6 given a possible overlap between the depression and QoL constructs.4,11,12,15,16

INTRODUCTION
Since major depression is an important public health problem in Brazil, where the estimates of point prevalence of this condition are between 3.5% and 9.7% and the lifetime prevalence rate may be as high as 15%,13 a valid QoL instrument is of great interest for Brazilian people. Regardless of the research context, any health policy or test of a new treatment for these patients would benefit from the information given by a measure that is not only focused on symptoms or functionality.
The objective of the present study was to assess the validity of the Brazilian version of WHOQOL-BREF in adults with major depression, using Rasch modeling.

METHODS
This study used secondary data from the Longitudinal Investigation of Depression Outcomes (LIDO). The LIDO is a multicenter, cross-national observational study which followed patients with depressive disorders in primary care settings for 12 months in six countries.8 Patients attending a primary care service in the city of Porto Alegre, Southern Brazil, were screened for depression symptoms. Those meeting the inclusion criteria (a new and/or untreated episode of depression and a score over 16 on the Center for Epidemiological Studies Depression Scale [CES-D]19) were interviewed and assessed with a standardized diagnostic instrument for major depression, the Composite International Diagnostic Interview (CIDI).21 Continuous variables were age and years of education; binary variables were gender, marital status, and self-report of health status.
The CES-D is a 20-item scale designed to measure symptoms of depression in community populations19 and was applied to measure the intensity of depression. The WHOQOL-BREF10,23 is a 26-item questionnaire distributed into four domains (physical, psychological, social relationships, and environment), and answers are scored using individualized five-point scales. Each subscale is scored positively. The CIDI, version 2.1, is a fully structured psychiatric diagnostic assessment developed for use in cross-national epidemiologic studies. Data from the CIDI were used to assess the diagnostic criteria for depression of the American Psychiatric Association (DSM-IV).1
Rasch analysis was performed. The Rasch model is a one-dimensional model, first used in educational assessment,20 which asserts that the easier the item the more likely a person will give a correct response, and the more able the person, the more likely she/he will give a correct response on an item compared with a less able person. In the assessment of QoL, patients are presented with a range of items corresponding to differing facets of QoL. Thus, a person with higher QoL will have a greater probability of answering positively (where positive reflects better QoL) than someone with lower QoL. This model can be extended to analyze items with more than two categories, and this involves a "threshold" parameter, represented by the equal probability point between any two adjacent categories within an item. The model used in the present analysis is a further derivation, the Partial Credit Model.14
Three overall fit statistics are considered to determine the model fit. Two are item-person interaction statistics distributed as z-statistics, for which a mean of zero and an SD of 1 indicate perfect fit to the model. The third is an item-trait interaction statistic reported as χ², reflecting invariance across the trait (indicated by a non-significant χ²). In addition, individual item-fit statistics are presented as residuals (acceptable within the range ±2.5) and as a χ² statistic (which must also be non-significant).
The boundaries between categories of responses are called "thresholds", and "disordered thresholds" may indicate that it will be necessary to collapse adjacent categories. Following this, data are fitted to the model to determine overall fit, and how well each item fits the model.
The Rasch model has some assumptions that need to be evaluated to ensure that an instrument has Rasch properties. The most commonly assessed Rasch assumptions are: a) unidimensionality; b) local independence; and c) invariability.
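As an illustration of how thresholds enter the model, the following minimal Python sketch computes Partial Credit Model category probabilities for a single item. It is not the RUMM implementation; the function names and the threshold values are hypothetical, and categories are indexed from 0 (the lowest response option) purely for convenience. It is meant only to show that a threshold is the point on the logit scale where two adjacent categories are equally likely, and what an ordered set of thresholds looks like.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities (scores 0..m) for one item under the Partial Credit Model.

    theta: person location in logits.
    thresholds: list of m threshold locations in logits (hypothetical values).
    """
    # Exponent for score x is the sum over k <= x of (theta - tau_k); score 0 gets 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(exponents)
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_ordered(thresholds):
    """True when each threshold lies above the previous one on the logit scale."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))


# Illustrative five-category item with four ordered thresholds (made-up values).
example_thresholds = [-1.5, -0.5, 0.5, 1.5]
probs = pcm_probabilities(0.5, example_thresholds)
# With theta placed exactly on the third threshold, categories 2 and 3 are equally likely.
print(round(probs[2], 3), round(probs[3], 3))
print(thresholds_ordered(example_thresholds))
```

When `thresholds_ordered` returns False for estimated thresholds, that is the situation in which collapsing adjacent categories would be considered.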
Unidimensionality refers to whether a single latent trait can explain all the data variance. The residuals are what remains when the "Rasch factor" has been removed from the data; therefore, the first component of a Principal Component Analysis (PCA) of the residuals is the primary contributor to the remaining variance, with the "Rasch factor" discounted. We take the items showing the highest positive correlations with the first component of the PCA of the residuals, and the items with the highest negative loadings, and derive person estimates from each of these two sets. These are compared to test whether the assumption of unidimensionality holds by applying an independent t-test to each person's pair of estimates. If less than 5% of the t-values are outside the range of ±1.96, the scale is considered unidimensional.
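The person-level comparison described above can be sketched as follows. The t formula (difference between the two estimates divided by the square root of the summed squared standard errors) and the normal-approximation confidence interval for the proportion are assumptions about how this test is commonly computed, not details reported for this analysis; all names and values are illustrative.

```python
import math


def proportion_significant_t(est_a, se_a, est_b, se_b, critical=1.96):
    """Share of persons whose two estimates differ by more than the critical t-value.

    est_a, est_b: person estimates (logits) from the two item subsets.
    se_a, se_b: the corresponding standard errors.
    """
    t_values = [
        (a - b) / math.sqrt(sa ** 2 + sb ** 2)
        for a, sa, b, sb in zip(est_a, se_a, est_b, se_b)
    ]
    n_significant = sum(1 for t in t_values if abs(t) > critical)
    return n_significant / len(t_values)


def wald_interval(proportion, n, z=1.96):
    """Normal-approximation 95% interval for a proportion (assumed here for illustration)."""
    half_width = z * math.sqrt(proportion * (1 - proportion) / n)
    return proportion - half_width, proportion + half_width


# Made-up estimates for four persons; real analyses would use the full sample.
positive_set = [0.0, 1.0, 2.0, -1.0]
negative_set = [0.1, 1.2, 0.0, -1.1]
errors = [0.5, 0.5, 0.5, 0.5]

share = proportion_significant_t(positive_set, errors, negative_set, errors)
low, high = wald_interval(share, len(positive_set))
# Unidimensionality is supported when the interval includes 0.05.
print(share, low <= 0.05 <= high)
```

Under this reading, an interval whose lower bound lies above 0.05 points to multidimensionality, while an interval that includes 0.05 is compatible with a unidimensional scale.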
Local independence means that when the ability influencing the performance is constant, responses to any pair of items are statistically independent. We check the residual correlation matrix to see if any values exceed +0.3. This will indicate the presence of local dependency. If items are correlated in the residuals, we merge them into a "super" item through the subtest procedure, and see if improved fit is obtained. If so, it is a sign of local dependency and a violation of one of the Rasch assumptions.
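A minimal sketch of the residual correlation check is given below. It assumes that standardized person-item residuals are already available for each item (here as made-up lists keyed by hypothetical item names) and simply flags item pairs whose Pearson correlation exceeds the +0.3 cutoff; it does not reproduce the subtest procedure itself.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long lists of residuals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def locally_dependent_pairs(residuals, cutoff=0.3):
    """Return (item_a, item_b, r) for every item pair with residual correlation above cutoff."""
    flagged = []
    for item_a, item_b in combinations(sorted(residuals), 2):
        r = pearson(residuals[item_a], residuals[item_b])
        if r > cutoff:
            flagged.append((item_a, item_b, r))
    return flagged


# Illustrative residuals for five persons on three items (values are invented).
example_residuals = {
    "work": [0.5, -0.2, 1.1, -0.9, 0.3],
    "activity": [0.4, -0.1, 0.9, -1.0, 0.2],
    "pain": [-0.3, 0.8, -0.1, 0.2, -0.6],
}
for item_a, item_b, r in locally_dependent_pairs(example_residuals):
    print(item_a, item_b, round(r, 2))
```

Pairs flagged in this way would be the candidates for merging into a "super" item before the model is refitted.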
Invariability implies that the parameters that characterize an item are not dependent on the distribution of persons' abilities and the parameters that characterize the persons are not dependent on the set of test items. To ensure the invariability of this measure, an analysis known as Differential Item Functioning (DIF) is performed. The statistical test used for detecting DIF is an analysis of variance (ANOVA) of the person-item deviation residuals with person factors (e.g., age, gender, country) and class intervals (e.g., group along the trait) as factors. All items were checked for DIF by gender, age and educational level as person factors. Items that do not yield the same item response function for two or more groups are violating the requirement for unidimensionality and invariability.
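The analysis described above is a two-way ANOVA (person factor by class interval). The sketch below is a deliberate simplification: it computes only the one-way F statistic for a single person factor on one item's residuals, which corresponds to the main effect of the person factor and ignores class intervals. The function name and the residual values are hypothetical, and the p-value lookup against the F distribution is left out.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over groups of residuals.

    groups: list of lists, one list of person-item residuals per level of the
    person factor (for example, one list per gender).
    """
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of residuals around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented residuals for one item, split by a two-level person factor.
group_one = [0.4, -0.2, 0.9, 0.1]
group_two = [-0.6, -0.1, -0.8, 0.2]
f_stat, df_b, df_w = one_way_f([group_one, group_two])
print(round(f_stat, 2), df_b, df_w)
```

A large F relative to the F distribution with the returned degrees of freedom would suggest that the item behaves differently across the levels of the person factor; a full analysis would also test the interaction with class interval.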
The internal consistency reliability of the scale was also determined based on the Person Separation Index (PSI), where the estimates on the logit scale for each person were used to calculate reliability.
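The formula for the PSI is not given here; a common definition, assumed in the sketch below, is the proportion of the observed variance in person estimates that is not attributable to measurement error. The function name and the input values are illustrative only.

```python
import statistics


def person_separation_index(estimates, standard_errors):
    """PSI under the assumed definition: (observed variance - error variance) / observed variance.

    estimates: person locations in logits.
    standard_errors: standard error of each person estimate.
    """
    observed_variance = statistics.variance(estimates)
    # Mean squared standard error is taken as the error variance.
    error_variance = statistics.mean([se ** 2 for se in standard_errors])
    return (observed_variance - error_variance) / observed_variance


# Invented person estimates and standard errors.
example_estimates = [-1.0, 0.0, 1.0, 2.0]
example_errors = [0.5, 0.5, 0.5, 0.5]
print(round(person_separation_index(example_estimates, example_errors), 2))
```

Under this definition the index is read like a reliability coefficient: values closer to 1 mean the scale separates persons along the trait more reliably.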
Rasch analysis was undertaken using the Rasch Unidimensional Measurement Models (RUMM) 2020 package.3
All patients agreeing to participate in the study signed a written consent form that included the objectives of the study. The local Research Ethics Committee approved the study.

RESULTS
Table 1 summarizes the characteristics of the baseline sample. Our sample consisted predominantly of female, married, middle-aged, elementary school educated subjects with good health status and moderate levels of depression.
Of all the WHOQOL-BREF domains, only "physical" did not meet the requirements of the Rasch model, as assessed by the summary of overall fit statistics: total item χ² was 86.9; χ² p-value was 0.02; PSI was 0.79; and P (independent t-test) ranged between 0.06 and 0.12.
Similarly, in the analysis of individual item fit, only the "sleep" item (residual = 3.09; χ² = 23.66; p = 0.0001) did not fit the Rasch model (Table 2).
Categories of responses were checked. Of the 26 items of the WHOQOL-BREF, "pain," "finances," "services," and "transport" displayed disordered thresholds of response categories and had to be rescored to meet Rasch properties. Suppressing the middle response category in all these items ordered their thresholds. Note that after rescoring, the response scale of these items was shortened to 1-4, while all remaining items maintained their original 1-5 scoring.
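The rescoring step can be illustrated with a small sketch that collapses the middle category of a five-point item so that the scale runs from 1 to 4. Which neighbouring category absorbed the middle response is not stated, so the `merge_into` parameter and both mappings below are assumptions made for illustration, not the rescoring actually applied.

```python
def collapse_middle(score, merge_into="lower"):
    """Map a 1-5 response onto a 1-4 scale by suppressing the middle category.

    merge_into="lower" merges response 3 with response 2 (1,2,3,4,5 -> 1,2,2,3,4);
    merge_into="upper" merges response 3 with response 4 (1,2,3,4,5 -> 1,2,3,3,4).
    Both options are hypothetical.
    """
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("score must be an integer from 1 to 5")
    if merge_into not in ("lower", "upper"):
        raise ValueError("merge_into must be 'lower' or 'upper'")
    if score < 3:
        return score
    if score == 3:
        return 2 if merge_into == "lower" else 3
    # Responses 4 and 5 shift down by one to close the gap.
    return score - 1


# Rescore an invented set of responses to one of the affected items.
raw_responses = [1, 3, 5, 4, 2, 3]
print([collapse_middle(r) for r in raw_responses])
```

Only the items with disordered thresholds would be passed through such a mapping; the remaining items keep their original 1-5 scores.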
All items were checked for DIF by age (younger than 45 years old vs. older than 45 years old), gender, and educational level (at least elementary education vs. elementary education and more). The items "positive feelings" and "support" displayed DIF for age and "energy" for gender. No item displayed DIF for educational level. No item showed DIF for more than one factor (Table 3).
As the physical domain items did not fit the Rasch model, all items were re-analyzed to conform to the Rasch model assumptions. In the analysis of the physical domain, the "activity" and "work" items showed a correlation of 0.34 in the person-item residual correlation matrix, indicating local dependency of responses (correlation >0.3). After the subtest analysis was performed, the overall measures of fit for the physical domain improved: total item χ² changed from 86.9 to 51.6; P from 0.02 to 0.57; PSI from 0.73 to 0.76; and P (independent t-test) from 0.06-0.12 to -0.02-0.04. Despite the improvement of all overall measures, the "sleep" item continued to show signs of misfit to the model (residual = 2.58; χ² = 10.4; p = 0.32). The deletion of the "sleep" item resulted in the best overall measures for the physical domain: total item χ² changed from 51.6 to 38.9; P from 0.57 to 0.72; PSI remained 0.76; and P (independent t-test) changed from -0.02-0.04 to 0.04-0.10. The "sleep" item misfit was due to multidimensionality.
The items of the psychological, environment, and social domains maintained their fit to the Rasch model, as illustrated in Table 4.

DISCUSSION
Our findings indicate the validity of WHOQOL-BREF as a measure of generic subjective QoL in depressed primary care patients in Brazil. Other studies that had similar purposes used CTT, small sample sizes (from 41 to 81 subjects) and patients from clinical settings.6,7,12,16 The analysis of all WHOQOL-BREF items in this Brazilian sample showed that only seven items (sleep, activity, work, pain, finances, services, and transport) did not meet Rasch requirements. Interestingly, four of them are items of the physical domain. This finding may be associated with the primary care setting, where patients were recruited during a visit to a general practitioner for physical complaints which may or may not have been related to their depressive episode.5 An international study reported rates of depression with somatic manifestations from 45% to 95% in the countries studied.22 The deletion of the "sleep" item and its inclusion as a separate item may be justifiable by its multidimensionality, since "sleep" may belong to the physical and/or psychological domains.
Reconciling the conceptual model with empirical evidence is a challenge to researchers. Although the original inclusion of the excluded items had a conceptual reason, the exclusion and inclusion of items clearly reduce the comparability of measures between different populations, thus leading researchers to make concessions, depending on the research purpose.
The present study did not re-test the original four-domain structure of the WHOQOL-BREF, which can be a limitation, although there is no consensus in the literature on the most suitable method to evaluate instrument dimensionality.9 Hence, we opted to maintain the four-domain structure because, besides being the most conservative approach, it is the most studied one.25
In conclusion, the validation of the WHOQOL-BREF Brazilian version using Rasch analysis complements previous validation studies, evidencing the robustness of this instrument as a generic cross-cultural QoL measure.

Table 1. General characteristics of the baseline sample of depressed patients enrolled in the Longitudinal Investigation of Depression Outcomes Study. Porto Alegre, Southern Brazil, 1999.
CES-D: Center for Epidemiological Studies Depression Scale

Table 2. Fit of WHOQOL-BREF items to the Rasch model of the baseline sample of depressed patients enrolled in the Longitudinal Investigation of Depression Outcomes Study. Porto Alegre, Southern Brazil, 1999.

Table 4. Fit of WHOQOL-BREF items adjusted by the Rasch model of the baseline sample of depressed patients enrolled in the Longitudinal Investigation of Depression Outcomes Study. Porto Alegre, Southern Brazil, 1999.
P: test for unidimensionality, shown as a 95% CI, which must include a probability of 0.05.