Internal consistency of the self-reporting questionnaire-20 in occupational groups

ABSTRACT OBJECTIVE To assess the internal consistency of the measurements of the Self-Reporting Questionnaire (SRQ-20) in different occupational groups. METHODS A validation study was conducted with data from four surveys with groups of workers, using similar methods. A total of 9,959 workers were studied. In all surveys, the common mental disorders were assessed via SRQ-20. The internal consistency considered the items belonging to dimensions extracted by tetrachoric factor analysis for each study. Item homogeneity assessment compared estimates of Cronbach’s alpha (KD-20), the alpha applied to a tetrachoric correlation matrix and stratified Cronbach’s alpha. RESULTS The SRQ-20 dimensions showed adequate values, considering the reference parameters. The internal consistency of the instrument items, assessed by stratified Cronbach’s alpha, was high (> 0.80) in the four studies. CONCLUSIONS The SRQ-20 showed good internal consistency in the professional categories evaluated. However, there is still a need for studies using alternative methods and additional information able to refine the accuracy of latent variable measurement instruments, as in the case of common mental disorders.


INTRODUCTION
In the early 1970s, the World Health Organization a built the Self-Reporting Questionnaire (SRQ) to assess the impacts of mental health problems in primary health care in periphery countries. This instrument was composed of 30 questions evaluating psycho-emotional symptoms, alcohol abuse, psychotic disorders and seizures. The target population was primary health service users. In 1980 12 , a version with 20 questions (SRQ-20) was developed, covering only psycho-emotional aspects. It was proposed for screening of common mental disorders (CMD), which are nonpsychotic symptoms characterized by insomnia, fatigue, irritability, forgetfulness, trouble concentrating, and somatic complaints 8 .
The SRQ-20 has been widely used and the performance of its measurements has been evaluated in populations of health service users 22 . However, few studies have assessed the validity and consistency of SQR-20 measurements regarding occupation 21 .
Instruments for assessment of behavioral traits or sets of symptoms need item homogeneity to capture different aspects of the same variable. Internal consistency evaluations of instruments are considered a reliability criterion for a measurement. However, this interpretation has its limits because measurements of internal consistency are based on management of the scale at a specific point, not considering sources of variation throughout time or among observers 24 .
The concept of validity is related to the quality of a given measurement. Reliability reflects the set of random and systematic errors inherent in a measurement. The relationship between validity and reliability can be analyzed by the consistency of external (validity) and internal (reliability) criteria 25 .
Reliability reflects a necessary but insufficient condition for the validity of a particular measurement. This condition does not characterize a property of a research instrument; essentially, reliability refers to the ability of the measurement produced by the instrument to be consistent in time and space or among different observers 16 . In addition, the conditions to evaluate reliability are relative, since every measurement has a degree of reliability when applied to a particular population under specific conditions 24 .
In the reliability assessment, three procedures can be adopted, depending on the type of the instrument and form of measurement. The stability evaluation of the measurement considers the consistency over time (test-retest), a technique related to the concept of reproducibility of measurements. The second procedure concerns measurement equivalence by considering different forms of measurement (interobserver). The measurement can also be evaluated by the internal consistency of the set of items used to estimate a certain latent variable or dimension 15 . Thus, the internal consistency of the instrument shows the degree of homogeneity of the measurement when the items or subscales measure the same variable 16 .
This study aimed to assess the internal consistency of SRQ-20 measurements in occupational groups. Study 3 -Health care workers. A cross section of a multicenter study using the same sampling procedures of a survey carried out in Belo Horizonte 2 , Minas Gerais, among primary health care workers from four municipalities of Bahia (Feira de Santana, Jequié, Santo Antônio de Jesus and a health district of Salvador). Its proportional stratified sampling considered: the distribution of the number of workers by geographical area; definition of events of interest estimates; composition of the sample according to the percentage of workers in each geographical area of the participating municipalities; and the draw, by random procedure, of the workers included in the study in each region. Throughout the 2012-2013 period, 2,448 workers were evaluated. In all studies, common mental disorders were evaluated with the use of the Self-Reporting Questionnaire (SRQ-20). To analyze the internal consistency of the dimensions (subscales) of the instrument, a tetrachotric correlation factor analysis was previously performed using the method of principal components with factor extraction, according to the Kaiser criterion of eigenvalue greater than 1. The scree plot technique was used to confirm the amount of factors to be extracted. The items presenting load greater than 0.40 were retained for composition of the factors 11 . PROMAX oblique rotation was applied for better interpretation of the values, using the Stata program, version 11.0.

METHODS
To compare estimates of internal consistency of subscales (dimensions) extracted by tetrachoric factor analysis, we used alpha (α) values by the Kuder-Richardson formula (KD-20), α for the tetrachoric correlations matrix, and stratified α 6 , to investigate a possible underestimation of internal consistency.
Stratified α was estimated using Microsoft Excel  (2007), employing the formula in which S 2 i is the variance of the items that constitute the factor i (i = 1, ..., f ), α i is Cronbach's alpha for the factor i and S 2 T is the total variance of the instrument, according to the equation below. The variation of 0.65-0.90 was adopted as the reference parameter of satisfactory performance 17 .
The four studies were evaluated and approved by research ethics committees when they were conducted. The present study was approved by the Research Ethics Committee of the Instituto de Saúde Coletiva da Universidade Federal da Bahia (CAAE 18723813.9.0000.5030).

RESULTS
Most evaluated workers were female, especially among teachers (92.0%; study 2) and health workers (80.6%; study 3). In the four studies, the predominant age group was 30 to 45 years ( Table 1). The educational attainment of the workers differed among studies. Most informal workers (study 1) reported less than primary educational attainment (95.9%). In study 3, 42.9% reported elementary educational attainment and 41.3% secondary technical education or higher education. Secondary technical education or higher education was reported by 82.1% of teachers (study 2) and 55.9% of workers in general (study 4).
Filling of SRQ-20 items varied among studies (Table 2). Study 2 had the greatest number of losses, with 605 missing data (13.5%), followed by study 3, which showed a loss of 60 data (2.4%). Smaller loss percentages were found in studies 1 and 4 (0.3% and 1.2%, respectively).
The α values of SRQ-20 dimensions of the occupational groups evaluated showed significant variations. The use of the alpha adjusted by the tetrachoric correlation matrix allowed more robust coefficients, considering the sample size.
In studies 1 and 2, we observed similar KD-20 and tetrachoric matrix α values (0.85 and 0.94, respectively). Study 3 (health workers) showed the lowest value of internal consistency for both estimators (0.82 and 0.92). In study 4, standardized values of the item set were 0.84 for the KD-20 estimate and 0.93 for the tetrachoric correlation α. In all studies, removing items did not change substantially the global values of internal consistency estimates of the instrument.
The factor analysis allowed the extraction of three factors that differed as to classification of the dimensions and number of items in each dimension among studies ( Table 3).
The Cronbach's α estimated for tetrachoric correlation matrix produced higher values for all dimensions of the studies. The internal consistency of the instrument items, assessed by stratified Cronbach's α, was high (> 0.80) in most studies.
In study 1, the dimension represented by factor 1 (F1 -depressive mood or anxiety symptoms) concentrated the largest number of items (11) and higher internal consistency according to both estimation methods (0.79 and 0.90). Factor 2 (F2) represented the somatic dimension and factor 3 (F3), the dimension "decreased energy". These last two dimensions showed identical Cronbach' s α values (0.65), but differed in estimates assessed for tetrachoric matrix correlation, with a higher value for F2 (α = 0.79).
The three dimensions extracted by factor analysis in study 3 presented similar estimates of reliability (α = 0.87) when the tetrachoric matrix was considered.

DISCUSSION
Internal consistency estimates of SRQ-20 dimensions and global scores, using a tetrachoric correlation matrix, showed adequate values consistent with the literature 17 . However, the high proportion of losses in study 2 may have compromised Cronbach's alpha estimates. Both losses affecting α values and differences among occupational groups show that the reliability of a measurement, evaluated by the reproducibility or homogeneity of measurement items, cannot be interpreted as an inherent or immutable property of an instrument because it depends on the interaction between the instrument and a specific assessed group 24 .
Internal consistency has been considered a suitable measurement to describe traits, characteristics of behaviors or disorders in a particular context, but not necessarily to identify groups possessing such attributes or not 25 . Thus, group characteristics may affect the homogeneity of measurement items. Intra-or inter-subject differences related to the variability of the data affects the α value. Generally, the smaller the variability of intra-subject responses and the larger the variability of inter-subject responses, the greater the α value 16 .
In this study, most occupational groups had a predominance of females, with an average age of 30 to 45 years and differences in educational attainment. Low educational attainment can be a barrier to express emotional disorders 7 . However, the stratification used here hampers a detailed analysis of this characteristic in the interpretation of the scale.
In addition to the differences between the groups, the high degree of consistency in the measurements of a scale denotes ease of interpretation of the final score, as a reflection of the items composing the instrument. To do this, the items should be moderately correlated with each other and maintain a correlation with the total score of scale 13 . Some aspects should be considered when evaluating the results related to internal consistency estimates of an instrument. Uncritical acceptance of α values, as a reflex of high levels of internal consistency, can lead to impaired judgment of the real scale homogeneity.
The α value depends not only on the magnitude of the correlation of the items in a scale, but also on the number of items that compose it. Thus, the greater the number of items in a scale, the greater the estimated internal consistency. Another condition that requires greater caution when interpretating internal consistency is the combination of scales that assess independent constructs, because the increase in the items will elevate α estimates. In addition, high α values may be related to redundancy of the scale items, compromising content validity, since an item set can assess the same conditions in different ways 25 .
In this study, most dimensions extracted by tetrachoric factor analysis showed internal consistency estimates within the assumed reference parameter (α = 0.65-0.90). However, the parameters used as reference for the estimates of α are also the target of criticism.
There are several reference values to evaluate the internal consistency of scales 17 This study provides evidence to characterize SRQ-20 as a multidimensional instrument, with dimensions varying among the different occupational groups investigated. Estimates of α cannot be interpreted as a property of the instrument, since they are conditioned by scale scores in a given population 24 .
In general, one-dimensional measurements feature high levels of internal consistency (high homogeneity). However, high internal consistency values represent conditions necessary but insufficient to ensure the unicity of a scale 5 . Thus, multidimensional instruments featuring high internal consistency levels for a particular measurement show that, although there are different dimensions, the components are strongly interrelated 16 .
The α values may underestimate the true internal consistency of the measurementss of a multidimensional instrument because the α estimate suggests equivalent distinction among the questionnaire items 18,19,b . Congeneric scales, characterized by correlation of items among themselves, are also affected by the underestimation of alpha values 10 . For multidimensional instruments, the stratified α has showed better performance for the estimates than conventional estimators, although the differences are not so expressive b . Therefore, the principle of tau equivalence must be considered when analyzing estimates of Cronbach' s α among multidimensional instruments. In these cases, the assumption that each item of test measures the same latent trait in an instrument is violated by the dimensions extracted by factor analysis. In this way, the α values of a multidimensional instrument will be underestimated 26 .
In the context of psychometric assessments, we highlight the debate on the superiority of measurements used as reliability criteria 3 . Cronbach's α measurements are widely used uncritically and often considered a reference for scale reliability. However, it is necessary to deepen the analysis of α values and to compare the indexes based on repetition of measurements (test-retest) 10 .
Operational and theoretical limitations reflect a low threshold of internal consistency for estimates produced by Cronbach's α. There are also the difficulties of interpretation and judgement of its measurements. Alpha estimates correlate with other statistics, which can confuse results when very low and very high values of this coefficient are found in one-dimensional or multidimensional instruments. Thus, additional information is required to assess α estimates separatedly as an internal consistency measurement 23 .
Despite the extensive use of Cronbach's α in the internal consistency evaluation of SRQ-20, the discussion about the real implications of its measurements for multidimensional instruments is still recent. Studies are needed to critically assess Cronbach's α estimates, comparing them with alternative methods and additional information to refine the accuracy of latent variable measurement instruments, as in the case of mental disorders.