Reliability of Screening Tests for Health-related Problems among Low-income Elderly

Screening tests for health problems can identify elderly people who should undergo the Comprehensive Geriatric Assessment, enabling the planning of actions to prevent disability. The aim of this study was to analyze the inter-rater reliability (IRR) of self-assessment questions (SAQ) and performance tests (PT) recommended in Brazil in a sample of low-income elderly people. In this exploratory study, 165 elderly people were assessed by two professionals on different days. IRR was evaluated using the intraclass correlation coefficient (ICC) for continuous variables and the kappa statistic for categorical ones. The IRR for the PT (muscle strength, mobility, body mass index, vision) was excellent, with ICC values greater than 0.75. By contrast, the IRR for the SAQ (urinary incontinence, self-perceived health and hearing impairment) was intermediate. Only the fall-related item presented a good IRR. In this study, single SAQ had poorer reliability than PT, suggesting that subjective self-assessment items with low reproducibility should be revised before implementation.


Introduction
The increase in the proportion of Brazil's elderly population has contributed to the non-communicable disease epidemic, with concomitant increases in morbidity and mortality 1. The consequence of these demographic and epidemiological changes is an increase in the occurrence of disability. Older people are likely to have multiple conditions interacting in different ways, leading to difficulties in performing important tasks 2. The Comprehensive Geriatric Assessment (CGA), a time-demanding multidimensional diagnostic process, is aimed at detecting the biological, psychological and social disorders that affect this group 3, but due to the high demand for care it cannot be applied to all elderly people. The challenge is to establish a rapid examination that detects health problems, which could guide the application of the CGA and postpone disability.
In Brazil, the Ministry of Health recommends the use of the Rapid Multidimensional Evaluation of Elderly Persons in primary care, containing items related to multiple capabilities 4. However, most questions have not been validated, and some of them address matters whose appropriateness is questionable for a screening test.
This work is concerned with identifying rapid and reliable screening tests already applied in Brazil to select those elderly patients who should be referred for the CGA.
The reliability of measurements may vary depending on the context 5, differing among population subgroups. Furthermore, the higher the subjectivity of the test, the greater the possibility of variation in the interpretation of the results by different observers 6.
Given the importance of psychometric and sociocultural considerations when administering an instrument, this study aimed to assess the inter-rater reliability (IRR) of health-related tests in a low-income elderly community.

Study and sample
This research was part of an exploratory study aimed at developing a strategy for rapid assessment of the elderly. It was performed at the primary healthcare unit of the Oswaldo Cruz Foundation (Fiocruz), in Manguinhos, in the city of Rio de Janeiro, Brazil. In Manguinhos, most houses have a single room, monthly family income is usually lower than the minimum wage, and more than 50% of residents have no more than an elementary school education 7.
The non-probabilistic sample was drawn from users aged 60 years or older who received care from the Family Health Team. Individuals with advanced cognitive and sensory deficits or impaired locomotion were excluded. The sample size was calculated using the prevalence of depression, estimated at 20% in the elderly, as a reference 8. A kappa coefficient of 0.6 with a 95% confidence interval was used to generate a conservative sample size of 180 individuals, estimated using the Win-Pepi application, version 2009 (http://www.brixtonhealth.com/pepi4windows.html).

Procedures
Data collection occurred from June to December 2010. Three health professionals standardized the techniques used in the study during a three-hour meeting. Assessments were independently administered in two sessions. First, a geriatrician applied the CGA; 7 to 15 days later, either a psychomotor specialist or a social worker applied the tests whose IRR would be assessed. This interval was intended to avoid memory bias, which would favor higher reliability, and to avoid exhausting the patient after the lengthy application of the CGA.
Both assessments included rapid performance tests (PT) and self-assessment questions (SAQ) recommended for screening of health problems in Brazil 9,10 (Figure 1). Self-perceived health (SPH) is a measure of health associated with disability in the elderly 11 and could indicate the need to refer them for the CGA.
IRR was evaluated using the intraclass correlation coefficient (ICC) for continuous variables and the kappa statistic for categorical variables, adjusting for prevalence and asymmetries with the Prevalence-Adjusted Bias-Adjusted Kappa (PABAK) technique 12. The classification of IRR followed the recommendations of Landis and Koch: kappa values greater than 0.75 or below 0.40 represent excellent or poor agreement, respectively 13. Values between these levels denote intermediate to good agreement. We adopted the same rule for the ICC. SPSS version 13 (SPSS Inc., Chicago, U.S.A.) was used to perform statistical analyses.
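The kappa statistic, its PABAK adjustment and the agreement cut-offs described above can be illustrated with a minimal Python sketch. It is not the authors' SPSS procedure. The function names and the example ratings are hypothetical, and PABAK is written in its common form for k categories, (k·Po − 1)/(k − 1), which reduces to 2·Po − 1 for a yes/no item.

```python
from collections import Counter


def observed_agreement(rater_a, rater_b):
    """Proportion of subjects that both raters place in the same category."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement corrected for chance using each rater's marginals."""
    p_o = observed_agreement(rater_a, rater_b)
    n = len(rater_a)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters use one category only")
    return (p_o - p_e) / (1.0 - p_e)


def pabak(rater_a, rater_b, n_categories=2):
    """Prevalence-adjusted bias-adjusted kappa: chance agreement fixed at 1/k."""
    p_o = observed_agreement(rater_a, rater_b)
    return (n_categories * p_o - 1.0) / (n_categories - 1.0)


def classify_agreement(value):
    """Cut-offs stated in the text: > 0.75 excellent, < 0.40 poor."""
    if value > 0.75:
        return "excellent"
    if value < 0.40:
        return "poor"
    return "intermediate to good"


# Hypothetical yes/no item (1 = problem reported) rated twice in 20 subjects.
# The condition is rare, so the two raters agree mostly on the "no" answers.
first_session = [0] * 18 + [1, 0]
second_session = [0] * 18 + [0, 1]

print("kappa:", round(cohen_kappa(first_session, second_session), 3))
print("PABAK:", round(pabak(first_session, second_session), 3))
print("class:", classify_agreement(pabak(first_session, second_session)))
```

In this made-up example the raters agree on 18 of 20 subjects, yet the unadjusted kappa is slightly negative (about -0.05) because nearly all agreement falls in one category, whereas PABAK is 0.8. This is the kind of prevalence effect the adjustment is meant to address.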

Study ethics
The Ethics Research Committee of the Sergio Arouca National Public Health School/Fiocruz (report number 126/10) approved the research. All participants signed an informed consent form, which pledged anonymity and confidentiality of the information.

Results
The first and the second sessions evaluated 185 and 165 individuals, respectively. Five were excluded due to visual or cognitive impairment. No significant differences in sociodemographic characteristics were detected between the elderly who completed the study and those who did not. The majority of participants were female (73.0%) and there was a slight majority of single or widowed participants (Table 1).
For the PT items, IRR was excellent (K > 0.75). By contrast, for SAQ items IRR was intermediate (0.75 > K > 0.40), except for the fall-related item, in which IRR was good (Table 2).
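For continuous performance tests, the ICC can be derived from a two-way analysis of variance. The sketch below is an illustration only: the text does not state which ICC model was used, so it assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name and the grip-strength values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater.

    Assumed model: two-way random effects, absolute agreement, single rater.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("at least two subjects are required")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("every subject needs the same number (>= 2) of ratings")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    rater_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into subjects, raters and residual error.
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when the ratings show no variation")
    return (ms_subjects - ms_error) / denominator


# Hypothetical grip-strength readings (kg) for five subjects, two raters each.
grip_strength = [
    [22.0, 23.5],
    [30.0, 29.0],
    [18.5, 19.0],
    [26.0, 27.5],
    [34.0, 33.0],
]
print("ICC(2,1):", round(icc_2_1(grip_strength), 3))
```

A value above 0.75 would be classified as excellent under the cut-offs adopted in this study. Other ICC forms (for example, consistency rather than absolute agreement) can give different values for the same data, so the model should be reported alongside the coefficient.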

Discussion
The IRR for the PT items was excellent, unlike that of the measures assessed by the SAQ. Subjective measures can show low reliability despite adequate training of examiners. Sociocultural factors, psychological issues, memory lapses and lack of insight of informants lead to variation in how people communicate information about symptoms 14. In addition, the reliability of the information may be compromised when there is no socially acceptable environment in which to talk about intimate questions 15,16. The perception of "invasion of privacy" may lead respondents to admit a problem in a first interview but deny it in the second one 5.
All the above factors may have influenced our results. The SAQ are inherently subjective and susceptible to communication problems. The CGA is more likely to generate a context of confidence between the professional and the individual examined, allowing him or her to admit difficulties such as urinary incontinence. Additionally, it is important to underline that our study was conducted with elderly people with little or no formal education, in whom age and educational factors influence cognitive performance 11.

Table 1. Sociodemographic data and health problems identified in the geriatric assessment (N = 180).
Screening for hearing impairment with a single question has been recommended in Brazil based on studies that evaluated the sensitivity and specificity of the single item 10,17, but only one study has examined its IRR, finding a kappa coefficient of 0.65 18. In relation to the SPH, two studies have assessed only test-retest reliability; although they did not examine IRR, these studies identified discrepancies between assessments in racial and ethnic minorities, individuals with lower levels of education and the elderly 11,19. Regarding urinary incontinence, the social stigma related to the problem may have contributed to our results 15. Finally, falls lead to functional decline in the life of an elderly person, which may explain why information about them is more precise, even among individuals with low levels of education. In our study, it was the only subjective question with good IRR.
Our study has limitations. It is known that observers tend to change their approaches over time, so that periodic replication should be planned 5. This was not done here. Furthermore, we assessed elderly people with low levels of education in a primary care setting, and our results may apply only to similar populations.
Aging often results in insidious changes in functional capacity. Screening for health problems allows a large portion of the elderly to be examined, indicating those who should be referred for the CGA. Our study revealed that single SAQ addressing SPH, urinary incontinence, and hearing loss had poor reliability in older adults. Although high item reproducibility does not guarantee high accuracy, it is clear that subjective self-assessments with low reliability should be reviewed before implementation.

Contributors
V. T. S. Lino and M. C. Portela contributed to the conception, design, drafting and revision of the manuscript. L. A. B. Camacho and N. C. P. Rodrigues contributed to the conception, design, data analysis and revision of the manuscript.