Validity Study of the Beck Anxiety Inventory (Portuguese Version) by the Rasch Rating Scale Model

Our objective was to conduct a validation study of the Portuguese version of the Beck Anxiety Inventory (BAI) by means of the Rasch Rating Scale Model, and then compare it with the most used anxiety scales in Portugal. The sample consisted of 1,160 adults (427 men and 733 women), aged 18-82 years (M = 33.39; SD = 11.85). The instruments were the Beck Anxiety Inventory, the State-Trait Anxiety Inventory and the Zung Self-Rating Anxiety Scale. The BAI's four-category response system, the data-model fit, and the person reliability were found to be adequate. The measure can be considered unidimensional. Gender and age-related differences were not a threat to validity. The BAI correlated significantly with the other anxiety measures. In conclusion, the BAI shows good psychometric quality.

Anxiety is a prevalent emotional disorder that interferes with psychosocial functioning (Balestrieri, Isola, Quartaroli, Roncolato, & Bellantuono, 2010). Thus, it is not surprising that most anxiety assessment tools have been developed in clinical settings.
Anxiety measuring instruments can be classified into those that assess only the neurovegetative components of the anxious response and those that combine the evaluation of physiological components with cognitive and behavioral components. The Beck Anxiety Inventory (BAI; Beck, Epstein, Brown, & Steer, 1988) is one of the most widely used clinical rating scales. In previous studies, BAI scores have shown high internal consistency, with a Cronbach's α of .92, and moderate one-week test-retest reliability, with r = .75. The BAI discriminated groups diagnosed as anxious (panic disorders, generalized anxiety, etc.) from groups diagnosed as not anxious (major depression, atypical depression, etc.).
Correspondence address: Faculdade de Ciências Médicas, Universidade Nova de Lisboa, Campo Mártires da Pátria, 130, Lisboa, Portugal 1169-056. E-mail: sonia.quintao@gmail.com
In the study of the Brazilian BAI version, the scale had adequate reliability, with a Cronbach's α of .91 for psychiatric samples, .86 for clinical samples, and .86 for non-clinical samples. The one-week test-retest correlation ranged from .53 for a sample of 115 students to .99 for a sample of 65 subjects from the general population (Cunha, 2001). Another study (Sanz & Navarro, 2003) examined the psychometric properties of a Spanish BAI version in a sample of 590 Spanish university students. The BAI showed a high level of internal consistency, with a Cronbach's α of .88, and factor analyses revealed a dimension formed by two closely interrelated factors, corresponding to somatic and affective-cognitive symptoms. Taking the DSM-IV as the standard, the content validity of the BAI was appropriate because its items covered 45% of the symptomatic criteria specific to anxiety disorders and 78% of the symptoms of panic attacks.
For Leyfer, Ruberg and Woodruff-Borden (2006), the BAI is not a diagnostic tool, but its brevity and simplicity make it an ideal instrument for use as a pretest for the presence of an anxiety disorder. The State-Trait Anxiety Inventory (STAI; Spielberger, Gorsuch, & Lushene, 1970) is one of the most widely used self-assessment instruments internationally (Andrade & Gorenstein, 1998). In previous studies, Cronbach's α values have been found to range from .86 to .95 for the STAI-state subscale and from .89 to .91 for the STAI-trait subscale (Spielberger et al., 1970), and the scores have excellent test-retest reliability over multiple time intervals (Barnes, Harp, & Jung, 2002). Scores from the Zung Anxiety Scale (Zung, 1971) have also shown adequate internal consistency. The Zung scale and the BAI measure similar constructs, with emphasis on the somatic aspects of anxiety.
The objective of this study was to validate the BAI in Portugal with a modern psychometric model and then compare the BAI, STAI-trait, STAI-state and Zung, the most used anxiety scales in Portugal. The limitations of classical test theory, the usual model for the construction and analysis of tests, have led to the emergence of alternative models, among which one of the most parsimonious is the Rasch model, which allows the conjoint measurement of persons and items (Bond & Fox, 2001; Rasch, 1960). A well-known extension of this model for polytomous data is the Rating Scale Model (Andrich, 1978; Prieto, Delgado, Perea, & Ladera, 2010; Stone, 2003). To fulfill our objective, we had to analyze the response categories, estimate the model parameters together with their precision and degree of fit, test the scale dimensionality and the differential item functioning, and correlate the scores from the BAI, STAI-trait, STAI-state and Zung.

Instruments
We used a demographic questionnaire designed for this research, which asked about gender, age, residence, ethnicity, education level, religion and status, and the following anxiety instruments:
Beck Anxiety Inventory (BAI; Beck et al., 1988). It consists of 21 items, which are statements describing anxiety symptoms that participants evaluate with reference to themselves on a four-point Likert scale. Total scores can range from 0 to 63 (Beck et al., 1988; Cunha, 2001).
State-Trait Anxiety Inventory (STAI-trait, STAI-state; Spielberger et al., 1970). This questionnaire is composed of two blocks of 20 statements, evaluated on a four-point Likert scale. Form 1, STAI-state, evaluates transient or temporary anxiety, and Form 2, STAI-trait, evaluates dispositional or general anxiety.
Zung Anxiety Scale (Zung, 1971). It was designed to assess situational anxiety. The scale consists of 20 statements evaluated on a four-point Likert scale. Scores range from a minimum of 20 to a maximum of 80. The 20 items are distributed across four anxiety subscales (Cognitive, Motor, Vegetative and Central nervous system), but only the total score was used in this study.

Procedure
Test application followed ethical standards. The tests were administered in various universities, companies and public facilities. Participants who did not respond to at least one BAI item were removed from the database. Missing values were replaced by item averages. Reversed items were recoded. Data were analyzed with the program Winsteps, version 3.68 (Linacre, 2009).
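The two data-preparation steps named in the procedure (item-mean replacement of missing values and recoding of reversed items) can be illustrated with a minimal Python sketch. The function names, the toy response matrix, and the use of `None` to mark a missing answer are assumptions made for illustration; the text does not say whether imputed means were rounded, so the sketch leaves them unrounded.

```python
def impute_item_means(responses):
    """Replace missing answers (None) with the mean of the observed answers to that item.

    responses: list of rows (one per person), each a list of item scores or None.
    Assumes every item has at least one observed answer.
    """
    n_items = len(responses[0])
    item_means = []
    for j in range(n_items):
        observed = [row[j] for row in responses if row[j] is not None]
        item_means.append(sum(observed) / len(observed))
    return [
        [item_means[j] if row[j] is None else row[j] for j in range(n_items)]
        for row in responses
    ]


def reverse_code(score, low, high):
    """Recode a reversed item so that the lowest category becomes the highest and vice versa."""
    return low + high - score


# Hypothetical toy data: 3 persons x 3 items, two missing answers.
toy = [[0, 3, None],
       [2, None, 1],
       [1, 1, 3]]
imputed = impute_item_means(toy)
recoded = [reverse_code(s, 1, 4) for s in (1, 2, 3, 4)]
```

Here `imputed` fills each gap with the corresponding item mean, and `recoded` shows how a 1-4 item is flipped.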

Data Analysis
The model proposed by Rasch (1960) is based on two major assumptions: the attribute can be represented on a single dimension where people and items are conjointly located; and person level and item location are the only (probabilistic) predictors of a correct answer. The formula to model this relationship is:
$$\ln\left(\frac{P_{si}}{1 - P_{si}}\right) = B_s - D_i$$
where $P_{si}$ is the probability that person s answers item i correctly, $B_s$ is the person parameter and $D_i$ the item location.
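As a numerical illustration of how person level and item location jointly determine the response probability, the following Python sketch assumes the standard logistic form of the dichotomous Rasch model. The function name and the example values are hypothetical.

```python
import math


def rasch_probability(b_person, d_item):
    """Probability of a correct answer for a person at b_person on an item located at d_item (logits)."""
    return 1.0 / (1.0 + math.exp(-(b_person - d_item)))


# A person located exactly at the item has a 50% chance of a correct answer;
# a person one logit above the item has a higher chance.
p_equal = rasch_probability(0.0, 0.0)
p_above = rasch_probability(1.0, 0.0)
```

Only the difference between person and item location matters, which is what allows persons and items to be placed on the same scale.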
With polytomous data, the formula for the Rating Scale Model is (Andrich, 1978):
$$\ln\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = B_n - D_i - F_k$$
where $P_{nik}$ is the probability that person n responds to item i in category k; $P_{ni(k-1)}$ is the probability that the response is in category k-1; $B_n$ is the skill, attitude, trait, etc. of person n; $D_i$ is the location of item i; and $F_k$ is the transition point (step) between categories k-1 and k. This model is widely used in the analysis of Likert-format scales, in which all items are answered with the same set of ordered categories. The analysis of the functioning of the response categories followed the criteria proposed by Linacre (2002): (a) sufficient frequency and regular distribution of the chosen categories; (b) the average measures by category should increase monotonically along the rating scale; (c) no category should show misfit; and (d) the transition points (steps) must increase monotonically.
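To make the role of the step parameters concrete, the sketch below computes the category probabilities implied by the Rating Scale Model for one person and one item. It assumes the adjacent-category log-odds form with the symbols defined above (B_n, D_i, F_k) and categories scored 0 to m. The function name and the step values are hypothetical, not estimates from this study.

```python
import math


def rsm_category_probabilities(b_person, d_item, steps):
    """Category probabilities under the Rating Scale Model.

    steps: transition points F_1..F_m (in logits); the result has m + 1 entries,
    one for each category 0..m.
    """
    # Cumulative sums of (B - D - F_j); the lowest category is the reference (0).
    cumulative = [0.0]
    for f in steps:
        cumulative.append(cumulative[-1] + (b_person - d_item - f))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cumulative)
    weights = [math.exp(c - top) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Four categories (as in the BAI), with hypothetical, monotonically increasing steps.
probs = rsm_category_probabilities(0.0, 0.0, [-1.0, 0.0, 1.0])
```

For any two adjacent categories, the log of the probability ratio returned here equals B_n - D_i - F_k, which is the defining property of the model.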
Model fit (with Pearsonian residual-based statistics) and score unidimensionality were then evaluated. Although strict unidimensionality is never achieved in practice (Zickar & Broadfoot, 2009), a principal component analysis of the residuals makes it possible to assess whether the lack of unidimensionality is large enough to threaten score validity. The least stringent criterion is that of Reckase (1979, cited in Zickar & Broadfoot, 2009), according to whom the percentage of variance explained should be over 20% and there should not be a second dominant factor.
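Residual-based fit statistics of this kind are commonly reported as Infit and Outfit mean squares. The Python sketch below assumes their usual definitions (Outfit as the mean of squared standardized residuals, Infit as the information-weighted version), which the text does not spell out. The function name and the toy data are hypothetical, and categories are assumed to be scored 0 to m.

```python
def fit_mean_squares(observed, category_probs):
    """Infit and Outfit mean squares for one item.

    observed: list of observed category scores, one per person.
    category_probs: list of model probability vectors (categories 0..m), one per person.
    Assumes every probability vector has non-zero variance.
    """
    squared_residual_sum = 0.0
    variance_sum = 0.0
    standardized_sum = 0.0
    for x, probs in zip(observed, category_probs):
        expected = sum(k * p for k, p in enumerate(probs))
        variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        squared_residual = (x - expected) ** 2
        squared_residual_sum += squared_residual
        variance_sum += variance
        standardized_sum += squared_residual / variance
    infit = squared_residual_sum / variance_sum
    outfit = standardized_sum / len(observed)
    return infit, outfit


# Toy dichotomous example: responses as noisy as the model expects give values of 1.
infit, outfit = fit_mean_squares([0, 1, 0, 1], [[0.5, 0.5]] * 4)
```

Values near 1 indicate that the responses are about as variable as the model predicts; larger values indicate more noise than expected.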
Differential Item Functioning (DIF) indicates a lack of validity because the likelihood of an answer is determined by factors other than the construct measured. Currently, DIF analysis is an obligatory step in the validation of a test. Accordingly, we carried out DIF analyses with respect to gender and age (30 or less and more than 30). The procedure implemented in Winsteps estimates, for each item, the difference between the item difficulty in each group (focal and reference). The contrast is carried out with the formula proposed by Wright and Panchapakesan (1969):
$$t = \frac{B_f - B_r}{\sqrt{SE_f^2 + SE_r^2}}$$
where $B_f$ and $B_r$ are the item locations for the focal and reference groups, and $SE_f^2$ and $SE_r^2$ are the squares of their standard errors. According to Wright and Douglas (1975), the DIF values that degrade the measures correspond to differences ($B_f - B_r$) over .5 logits. However, the Bonferroni correction is currently recommended for establishing a posteriori significant differences (Linacre, 2010). Finally, factorial ANOVAs were carried out to test differences (impact) by sex and age in the Rasch-model scores. We first verified that the assumptions for the use of parametric tests, i.e., normal distribution (Kolmogorov-Smirnov test) and homogeneity of variances (Levene test), were fulfilled.
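The DIF contrast and the two decision rules mentioned above (a difference over .5 logits, and a Bonferroni-adjusted significance level) can be sketched in Python as follows. The sketch assumes the contrast is the difference in item locations divided by the joint standard error. The function names and the item estimates are hypothetical and are not taken from the Winsteps output of this study.

```python
import math


def dif_contrast(b_focal, se_focal, b_reference, se_reference):
    """Return the DIF size (logits) and its t-type contrast for one item."""
    difference = b_focal - b_reference
    joint_se = math.sqrt(se_focal ** 2 + se_reference ** 2)
    return difference, difference / joint_se


def bonferroni_alpha(alpha, n_tests):
    """Per-item significance level when n_tests items are examined."""
    return alpha / n_tests


# Hypothetical item: located at 0.30 logits in the focal group, -0.30 in the reference group.
size, t = dif_contrast(0.30, 0.08, -0.30, 0.06)
exceeds_half_logit = abs(size) > 0.5
alpha_per_item = bonferroni_alpha(0.05, 21)  # 21 items, as in the BAI
```

With 21 items, the Bonferroni rule divides the nominal level by 21, so an item must show a much stronger contrast to be flagged than under an unadjusted test.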

Results
Every category system met Linacre's (2002) criteria, as can be seen in Table 1. Once the adequacy of the categories had been checked, unidimensionality was put to the test. The BAI Rasch dimension, analogous to a first factor in a common factor analysis, explained 41.2% of the variance: not optimal according to Linacre (2010), but still acceptable following Reckase (1979, cited in Zickar & Broadfoot, 2009). STAI-state, STAI-trait and Zung results were similar to the BAI's, with 47.6%, 46.2%, and 38.9% of variance explained, respectively. Thus, scores are essentially unidimensional. As to model fit, no items exceeded 1.5 in Infit and/or Outfit, except for BAI item 16 (Fear of dying), STAI-state item 4 (Feeling tired) and item 7 (Currently, I am concerned about possible woes), and Zung item 19 (I can only get a good rest during the night). Severe misfit was only found for STAI-trait item 24 (I wish I could be so happy as others seem to be) and Zung item 13 (I can inspire and expire with ease). The remaining items had values around unity (Linacre, 2009).
For the BAI, 9.31% of the participants showed moderate misfit and 5.60% high misfit. Item reliability was very high for every scale, close to 1.00. As to person reliability, the BAI (.79) is reasonably good, STAI-state and STAI-trait are very good (.91 both) and Zung (.71) is moderate. These values are comparable to the Cronbach's α of classical test theory. Table 2 shows the summary of the BAI results.
Table 3 shows the BAI person-item conjoint representation.It can be seen that the person mean is much lower than the item mean, showing the low anxiety level of the sample.
No item showed DIF related to gender, and only two showed age-related DIF: STAI-trait item 32 and STAI-state item 18 (-.54 and -.65 logits, respectively). These items did not work equally for participants aged 30 or less and those over 30, even when they had the same level of anxiety. They should be excluded from the test if these results are replicated in subsequent studies.
Finally, correlations between BAI scores and the remaining anxiety measures were large and significant.

Discussion
Our main goal was to carry out an initial validation of the BAI for the Portuguese population and to compare it with other commonly applied anxiety measures (STAI-state, STAI-trait and Zung). A psychometric model with optimal properties, the Rasch Rating Scale Model, was used to test the functioning of the response category systems. This is seldom taken into account in classical test theory, in which the categories are usually determined a priori. All evaluated scales showed good category functioning following Linacre's (2002) criteria.
The Beck Anxiety Inventory is a scale with good psychometric characteristics, and in some contexts, such as the clinical one, in which the physiological symptoms are important, more appropriate than other scales used in Portugal.
The BAI showed reasonably good person reliability (similar to Cronbach's α), but lower than the internal consistency reported for the original version (Beck et al., 1988) and in some countries such as Brazil (Cunha, 2001) and Spain (Sanz & Navarro, 2003).
Although several studies point to the existence of more than one factor in the BAI (Beck & Steer, 1990, 1991; Cox et al., 1996; Steer et al., 1993), the previously studied samples come from diverse populations, so generalization is risky. From a practical point of view, a unidimensional measure makes sense when one of the factors is clearly dominant. Our analyses show that the BAI, STAI-state, STAI-trait and Zung can be treated as unidimensional.
With a few exceptions, item-model fit was good enough. In the BAI and STAI-state, no items with severe misfit were found. Severe person-model misfit never exceeded ten percent. Likewise, reliability estimates were high enough for every scale. It is worth noting that, although the BAI measures do not show higher reliability (Person Separation Reliability) than the other anxiety measures, this instrument presents the lowest total percentage of misfitting items and the lowest percentage of severely misfitting persons.
No item showed gender-related DIF and only two items, one from the STAI-trait and one from the STAI-state, showed age-related DIF. As to impact, women had on average higher anxiety values, which is consistent with the scientific literature (Grillon, 2008). In relation to age, the BAI, STAI-trait and Zung showed that the younger subsample had higher anxiety values, results that are also consistent with past research (Spence, Rapee, McDonald, & Ingram, 2001).
Given that the instruments were originally designed to measure the intensity of anxiety symptoms, especially physiological symptoms (Beck et al., 1988; Leyfer et al., 2006; Spielberger et al., 1970; Zung, 1971), it is not surprising that most of the participants were below the mean range of the variable. The person-item conjoint representation is a useful way of comparing anxiety levels and communicating results.
The Beck Anxiety Inventory is a measure widely used in international research, but it is not used in Portugal because its psychometric characteristics have not been evaluated there. In this study, the BAI showed good evidence of validity and reliability.
The main contribution of this research is to allow future research in Portugal to use the BAI as a tool for the evaluation of anxiety as a general construct. This is of great importance, since anxiety has been associated with an increased risk for other diseases and plays an important role in quality of life in general, as well as in the capacity to drive in normal daily life. In addition, anxiety disorders involve high individual and social costs, tend to be chronic, and can be as disabling as somatic disorders (Lepine, 2002).
A limitation of this study is that no clinical sample was used. Future studies should use clinical samples with medical or psychiatric disorders.

Table 2
Summary of the BAI Results