Combining Cognitive Screening Tests for the Evaluation of Mild Cognitive Impairment in the Elderly

OBJECTIVE To determine the accuracy of the Mini-Mental State Examination combined with the Verbal Fluency Test and Clock Drawing Test for the identification of patients with mild cognitive impairment and Alzheimer’s disease (AD). METHOD These tests were used to evaluate cognitive function in 247 older adults. Subjects were divided into three groups according to their cognitive state, based on clinical and neuropsychological data: mild cognitive impairment (n=83), AD (n=81) and cognitively unimpaired controls (n=83). The diagnostic accuracy of each test for discriminating between these diagnostic groups (mild cognitive impairment or AD vs. controls) was examined with the aid of Receiver Operating Characteristic (ROC) curves. Additionally, we evaluated the benefit of the combination of tests on diagnostic accuracy. RESULTS Although the tests were accurate enough for the identification of Alzheimer’s disease, none of them alone proved adequate for the correct separation of patients with mild cognitive impairment from healthy subjects. Combining these tests did not improve diagnostic accuracy, as compared to the Mini-Mental State Examination alone, in the identification of patients with mild cognitive impairment or Alzheimer’s disease. CONCLUSIONS The present data do not warrant the combined use of the Mini-Mental State Examination, the Verbal Fluency Test and the Clock Drawing Test as a sufficient diagnostic schedule in screening for mild cognitive impairment. Nor do they support the notion that the combination of test scores is better than the use of Mini-Mental State Examination scores alone in the screening for Alzheimer’s disease.


INTRODUCTION
The term mild cognitive impairment (MCI) refers to a transitional state between normal cognitive aging and pathological decline, i.e., the propensity to develop incipient dementia syndromes. Subjects with MCI are non-demented individuals, yet they have demonstrable impairment in cognitive functions according to their performance on tests adjusted for age and educational level. These deficits minimally, if at all, affect the ability to undertake activities of daily living. 1 Individuals diagnosed with MCI have a higher risk for developing dementia, most notably Alzheimer's disease (AD); for this reason, at least a subset of MCI patients may in fact represent a prodromal stage of AD. 2,3 Despite its clinical importance, the proper identification of MCI remains a challenging issue in non-specialized settings. The evaluation of patients with MCI usually involves a comprehensive neuropsychological assessment, along with sophisticated laboratory tests (e.g., analysis of cerebrospinal fluid biomarkers) and neuroimaging exams (e.g., structural magnetic resonance with volumetric measures, functional exams such as SPECT/PET). These methods are expensive, need highly specialized professionals, and are not routinely available in most clinical settings.
The majority of cognitive screening tests that are commonly used in clinical practice (e.g., the Mini-Mental State Examination, the Verbal Fluency Test and the Clock Drawing Test), as well as informant-based questionnaires (e.g., the Informant Questionnaire on Cognitive Decline in the Elderly, IQCODE), have been developed for the diagnosis of dementia and are thus not sensitive enough to discriminate MCI patients from cognitively unimpaired older adults. 4,5 As a consequence, a large number of elderly subjects with MCI remain unidentified in clinical practice in spite of having presented their concerns to the clinician. 6 This challenge is even more difficult to overcome if one considers populations with varying degrees of schooling, given the relevant education bias that affects most cognitive tests. The combination of tests may add diagnostic accuracy to cognitive screening. The rationale for this strategy is that different tests may provide supplementary information about the cognitive functioning of a given patient, increasing the probability of identifying those with mild deficits. Several different combination strategies may be used, such as the "and"/"or" rule and the weighted sum of test scores, 7 which allow the examiner to improve the sensitivity and/or specificity of individual tests according to the combination rule. Brief cognitive batteries, such as the Cambridge Cognitive Test (CAMCOG), 8 also lack sensitivity to discriminate patients with MCI from cognitively unimpaired subjects, in spite of having a better diagnostic accuracy as compared to their subcomponents, e.g., the Mini-Mental State Examination (MMSE), Verbal Fluency Test (VFT) and Clock Drawing Test (CDT). 9 In clinical settings, the qualitative analysis of the performance on specific cognitive screening tests may help to improve diagnostic accuracy to detect mild but relevant deficits. 10
Although the combination of screening tests can improve the detection of mild to moderate cases of dementia, 11 few data exist about the validity of these instruments for the correct identification of elderly persons with MCI. In one study, the combination of MMSE and CDT reached good sensitivity and specificity values for multiple-domain MCI; however, this combination was not appropriate for identifying single-domain (amnestic or non-amnestic) MCI subtypes. 4,12 In another study, the association of an informant-based screening test (the IQCODE) and the MMSE showed low sensitivity for screening MCI patients, regardless of their subtype. 5 Few studies have been conducted so far with this objective, and there is room for several other combinations to be investigated. Therefore, the aim of the present study was to determine the diagnostic properties of the combination of three widely used cognitive screening tests (the MMSE, CDT and VFT) to identify subjects with MCI. The tests are time-efficient, and their results are easy to interpret. We hypothesized that the combination of these instruments may yield a higher accuracy in the identification of MCI subjects than each test alone.

Patients
All subjects included in this analysis are participants of an ongoing prospective study on ageing and cognition carried out at a specialized memory clinic at the Institute of Psychiatry, University of São Paulo, São Paulo, Brazil. Recruitment started in June of 2003, and subjects were continuously enrolled for cognitive evaluation. Patients and controls were recruited from the hospital catchment area through information provided by media advertisements and lectures about health and aging. We also included subjects presenting with spontaneous demand for memory assessment and referrals from other medical services. All subjects were living in the community at the time of recruitment. This study was approved by the local ethics committee and was performed in accordance with the Helsinki Declaration. For the purpose of this study, we included 247 elderly outpatients (73% women) (mean age: 71.3±7.2 years; mean educational level: 10.6±5.9 years) with the baseline diagnosis of mild cognitive impairment (MCI, n=83), Alzheimer's disease (AD, n=81) and normal cognition (controls, n=83).

Assessments
Clinical and neuropsychological evaluations were carried out according to a standardized protocol by trained physicians and neuropsychologists. Detailed information about the medical evaluation, cognitive and neuropsychological assessments, and diagnostic algorithms of this cohort can be found elsewhere. 13 In brief, the cognitive evaluation consisted of the administration of the Brazilian version of the Cambridge Examination for Mental Disorders in the Elderly (CAMDEX) semistructured interview, 8,14 which yields scores for the Cambridge Cognitive Test (CAMCOG), the Mini-Mental State Examination (MMSE), 15 and a Verbal Fluency Test. The Clock Drawing Test, which is part of the CAMCOG schedule, was additionally scored according to Sunderland's guidelines. 16 The 21-item Hamilton Depression Scale (HAM-D) was administered to all subjects prior to neuropsychological assessment in order to rule out the presence of clinically relevant depressive symptoms; 17 patients with a HAM-D score of 7 or more were referred for psychiatric care and not included in the study.
A comprehensive neuropsychological evaluation included the culturally adapted, Brazilian Portuguese versions of the following tests: the Rivermead Behavioral Memory Test (RBMT), 18 the Fuld Object-Memory Evaluation (FOME), 19 a Verbal Fluency Test (category: fruits), the Trail Making Test (TMT) A and B, 20 the Short Cognitive Test (SKT), 21,22 and the Wechsler Adult Intelligence Scale-Revised (WAIS-R) Vocabulary and Block Design tests. 23 Evidence of functional decline was based on the scores of the Informant Questionnaire on Cognitive Decline in the Elderly (IQCODE), 24 as well as on all available evidence concerning difficulties performing basic and instrumental activities of daily living, as reported by a close relative or caregiver and by the patient.

Clinical diagnosis
Consensus diagnoses were reached at expert multidisciplinary sessions, taking into account all information about the current medical history and evidence of objective cognitive decline as assessed by the neuropsychological exams. The scores on the MMSE, VFT and CDT were obtained at the screening assessment of every patient, and this information was not included in the diagnostic protocol. Objective cognitive impairment was defined as performance more than 1.5 standard deviations below the mean in the neuropsychological evaluation, adjusted for age and educational level in the São Paulo elderly population. 9,25 Diagnosis of amnestic MCI was made according to the following criteria 26,27 : (1) subjective cognitive complaint, preferably corroborated by an informant; (2) objective memory impairment in the neuropsychological assessment; (3) preserved global intellectual function; (4) preserved or minimally impaired activities of daily living; (5) no signs of dementia. Diagnosis of Alzheimer's disease was made according to established diagnostic criteria. 28 The control group comprised individuals without objective evidence of cognitive impairment, including subjects with cognitive complaints but normal performance in neuropsychological tests (i.e., subjective cognitive complainers).

Statistical analysis
We performed a Pearson's Chi-square analysis to assess the differences in the distribution of gender among the diagnostic groups. Kolmogorov-Smirnov tests were used to evaluate the normality of the distribution of each continuous variable. Since these analyses showed that all variables had a normal or near-normal distribution, we carried out parametric statistical tests for all analyses. For the purpose of this study, we analyzed data only for the Mini-Mental State Examination, the Verbal Fluency Test and the Clock Drawing Test. We carried out a univariate analysis of variance (ANOVA) to assess the mean differences in socio-demographic data, clinical variables, and scores on cognitive and neuropsychological tests among the diagnostic groups. Receiver Operating Characteristic (ROC) analyses were performed to calculate the sensitivity, specificity and accuracy of the education-adjusted cut-off points suggestive of cognitive impairment for the MMSE (illiterate, <20; 1-4 years of education, <25; 4-8 years of education, <26; 9 years of education and above, <28), the VFT (illiterate, <10; one year of education and above, <14) and the CDT (0-8 years of education, <6; above 8 years of education, <8) 29 to identify MCI or AD cases versus controls. We also examined whether different combinations of the above-mentioned tests were more accurate for identifying MCI and AD cases in this cohort. The following combinations were assessed: (1) subjects who scored below the cut-off points in the MMSE, the VFT and the CDT were regarded as cognitively impaired (the "and" rule); (2) subjects who scored below the cut-off point in at least one of the MMSE, the VFT or the CDT were regarded as cognitively impaired (the "or" rule); (3) the pairwise combinations of the MMSE and VFT, the MMSE and CDT, and the CDT and VFT.
All statistical analyses were performed using SPSS v14.0 for Windows (SPSS Inc., Chicago, IL), and the α value was set at 5%.
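The cut-off and combination rules described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in the study: the `Scores` container, the function names, the coding of "illiterate" as 0 years of education, and the handling of the overlapping MMSE bands ("1-4" and "4-8" years) are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    education_years: int  # 0 stands for "illiterate" (an assumption of this sketch)
    mmse: int
    vft: int
    cdt: int


def mmse_cutoff(edu: int) -> int:
    # The text lists overlapping bands ("1-4" and "4-8"); placing exactly
    # 4 years in the lower band is an assumption.
    if edu == 0:
        return 20
    if edu <= 4:
        return 25
    if edu <= 8:
        return 26
    return 28


def vft_cutoff(edu: int) -> int:
    return 10 if edu == 0 else 14


def cdt_cutoff(edu: int) -> int:
    return 6 if edu <= 8 else 8


def screen_flags(s: Scores) -> dict:
    """True means the score falls below the education-adjusted cut-off."""
    edu = s.education_years
    return {
        "MMSE": s.mmse < mmse_cutoff(edu),
        "VFT": s.vft < vft_cutoff(edu),
        "CDT": s.cdt < cdt_cutoff(edu),
    }


def and_rule(s: Scores) -> bool:
    # Impaired only if every test is below its cut-off.
    return all(screen_flags(s).values())


def or_rule(s: Scores) -> bool:
    # Impaired if at least one test is below its cut-off.
    return any(screen_flags(s).values())


def sensitivity_specificity(predicted, impaired):
    """predicted, impaired: equal-length sequences of booleans.

    Assumes at least one impaired and one unimpaired subject.
    """
    pairs = list(zip(predicted, impaired))
    tp = sum(p and d for p, d in pairs)
    fn = sum((not p) and d for p, d in pairs)
    tn = sum((not p) and (not d) for p, d in pairs)
    fp = sum(p and (not d) for p, d in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical subject: 11 years of schooling, below cut-off on the MMSE only.
subject = Scores(education_years=11, mmse=27, vft=15, cdt=9)
```

For this hypothetical subject the "or" rule flags impairment while the "and" rule does not, which is the trade-off examined in the Results.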

RESULTS
Demographic and cognitive variables are summarized in Table 1. As expected, patients with AD were older, less educated and performed worse on the cognitive tests as compared to MCI and control individuals (ANOVA, p<0.001). Patients with MCI showed intermediate performance on the cognitive screening tests (MMSE, CDT and VFT) as compared to AD and control individuals. Although significant, the magnitude of the difference between subjects with MCI and healthy controls on these test scores was small, and scores were above the cut-off values that are usually accepted as positive screening for dementia. All tests (separately and in combination) showed good accuracy for the identification of AD. The cut-off scores of the MMSE alone and the combination of MMSE and VFT showed the best accuracy for identifying AD patients (Table 2). The MMSE scores alone yielded an area under the ROC curve (AUC) of 0.85 ± 0.03 (p<0.001), indicating the best combination of sensitivity and specificity (98% and 71%, respectively). The "and rule" (i.e., MMSE and CDT and VFT scores) had maximum specificity (100%), and the "or rule" (i.e., MMSE or CDT or VFT) had maximum sensitivity (100%). However, the AUC obtained from the separation of patients with MCI from controls was smaller regardless of whether the tests were analyzed separately or in combination. All tests alone, the combinations of two tests (MMSE + VFT, MMSE + CDT, and VFT + CDT) and the "and rule" had good specificity for separating MCI from controls, albeit with very low sensitivity.
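When a test is dichotomized at a single cut-off, its ROC curve has only one operating point, and the area under the resulting two-segment curve equals the mean of sensitivity and specificity. The MMSE figures reported above are consistent with this reading, although the text does not state how the AUC was computed in SPSS; the sketch below is therefore an assumption about the calculation, with a hypothetical function name.

```python
def single_cutoff_auc(sensitivity: float, specificity: float) -> float:
    """Trapezoidal area under the ROC curve through the points
    (0, 0), (1 - specificity, sensitivity) and (1, 1)."""
    fpr = 1.0 - specificity
    left = 0.5 * fpr * sensitivity                  # triangle from 0 to fpr
    right = 0.5 * (1.0 - fpr) * (sensitivity + 1.0)  # trapezoid from fpr to 1
    return left + right


# MMSE for AD vs. controls, values as reported in the Results.
auc_mmse = single_cutoff_auc(sensitivity=0.98, specificity=0.71)
# Algebraically this equals (0.98 + 0.71) / 2, i.e. about 0.845.
```

Under the same reading, a rule with 100% specificity but very low sensitivity (or the reverse) can only reach an AUC slightly above 0.5, which matches the pattern reported for MCI.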

DISCUSSION
Our results indicate that the MMSE, the Verbal Fluency Test (VFT), and the Clock Drawing Test (CDT) do not have good diagnostic accuracy for identifying cases of MCI, in spite of their usefulness in the diagnostic screening for dementia. In addition, the combined use of these tests did not satisfactorily increase overall diagnostic accuracy when separating MCI from controls. Our results are in accordance with previous studies from our group and others in which the association of cognitive and/or functional tests did not provide good sensitivity in the diagnostic screening for MCI. 4,5,12 Regarding the identification of cases of dementia in this sample, our ROC curves do not demonstrate any additional benefit of the combination of tests as compared to the analysis of each test score separately. This finding is somewhat surprising because other authors have proposed that this strategy may improve overall diagnostic accuracy. In fact, the present data demonstrate that the combination of tests did not change diagnostic accuracy, or changed it minimally at best, as shown by the AUCs. This effect was primarily due to a loss of sensitivity, since specificity was substantially increased through the combination of test scores. We also observed this phenomenon in the analysis of non-demented subjects, where the separation of patients with MCI from cognitively healthy controls yielded AUCs between 0.5 and 0.6 irrespective of the analysis strategy. In other words, the combination of tests substantially reduced sensitivity, but it increased specificity up to 97-100%. The only exception to this tendency was the "or rule", which yielded a specificity below 50%.
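The direction of this trade-off follows from how the rules are constructed. As a minimal sketch, assume (unrealistically) that the tests are conditionally independent given diagnostic status; the per-test sensitivities and specificities below are hypothetical and are not values from this study.

```python
from math import prod


def and_rule_accuracy(sens, spec):
    """Positive only if all tests are positive (independence assumed)."""
    combined_sens = prod(sens)                       # every test must detect the case
    combined_spec = 1.0 - prod(1.0 - s for s in spec)  # false alarm needs all tests to err
    return combined_sens, combined_spec


def or_rule_accuracy(sens, spec):
    """Positive if any test is positive (independence assumed)."""
    combined_sens = 1.0 - prod(1.0 - s for s in sens)  # case missed only if all tests miss
    combined_spec = prod(spec)                        # every test must be negative
    return combined_sens, combined_spec


# Hypothetical per-test values for separating mildly impaired from unimpaired subjects.
sens = [0.30, 0.25, 0.20]
spec = [0.90, 0.85, 0.90]

and_sens, and_spec = and_rule_accuracy(sens, spec)
or_sens, or_spec = or_rule_accuracy(sens, spec)
```

Real screening tests are correlated, so the shift is smaller than under independence, but the direction holds in general: the "and" rule can never be more sensitive than its least sensitive component, and the "or" rule can never be more specific than its least specific component.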
A few considerations must be made with respect to the clinical implications of the present findings. The combination of two or three cognitive screening tests is a common approach used by physicians to substantiate the clinical diagnosis of cognitive impairment, particularly in settings where a thorough neuropsychological evaluation is not available or because of time constraints. We understand that the qualitative observation of the performance on such tests, in addition to the score itself, may add important insights to the diagnostic workup, as we have previously shown that the analysis of MMSE sub-scores supports the identification of MCI subtypes 10 . However, when translating this clinical impression into the screening for cognitive impairment in larger patient groups, we recommend that the output of this strategy be interpreted with caution, since the combination of test scores according to the "and rule" significantly impairs diagnostic sensitivity. In other words, caseness will be met only when a given patient shows abnormal performance on all tests utilized for assessment. This strategy will deliver a high number of false negative cases that correspond to patients who in fact have subtle abnormalities that might have been detected by one single test. On the other hand, the specificity of estimates based on the "and rule" will be significantly increased in the presence of abnormal performance in all tests together. Brief cognitive tests are developed to provide good sensitivity in the screening for dementia, whereas specificity is usually attained with more inclusive batteries or through formal neuropsychological evaluation. Specificity can be further increased by adding information from laboratory and neuroimaging tests, both for the purpose of ruling out comorbidities that concur with a high prevalence of cognitive impairment 30 and to ascertain underlying pathological features of AD 31 . 
The complexity of this procedure is a limitation for the large-scale diagnosis of MCI, particularly in primary care settings. The basic purpose of the tests that we addressed in this study (i.e., screening) was met by all three tests when the target was identifying dementia (AD), but not when attempting to detect subtle deficits (i.e., MCI). One possible explanation is that these tests are prone to ceiling effects, particularly among more educated MCI patients. In other words, subjects with mild deficits may still be able to perform well in spite of the existence of symptoms compatible with incipient AD. 32 Since most available cognitive screening tests have been developed to separate dementia from non-dementia, the development of new tests specifically designed to screen for mild cognitive deficits, such as the Montreal Cognitive Assessment (MoCA) 33 and the Computer-assisted Neuropsychological Screening for MCI (CANS-MCI), 34 may overcome this limitation. In addition, the combination of cognitive tests and functional scales, or functional scales alone, may serve as alternatives for screening subjects with cognitive complaints. Bottino et al. 35 found that the association of FOME and MMSE with two informant-based functional scales (IQCODE and Bayer Activities of Daily Living Scale) was sensitive enough to identify cases with mild to moderate dementia. Perroco and colleagues 36 recently reported that the IQCODE (and its shorter version) was also sensitive enough to screen for patients with mild dementia. The objective functional assessment provided by the Direct Assessment of Functional Status (DAFS) discriminated MCI patients from patients with dementia with high accuracy; nonetheless, it had a lower accuracy for discriminating patients with MCI from cognitively unimpaired elderly subjects. 37
Therefore, despite the fact that several screening strategies may be useful for the identification of mild to moderate dementia cases, there is still a long road ahead when addressing the screening for pre-dementia cases.
The sample of patients and controls on which the present analysis was based has certain characteristics that may explain the discrepancy between our findings and previous notions that supported the advantage of combining tests. First, the AD patients included in this study had mild or incipient dementia, as opposed to the predominance of patients with mild to moderate dementia in previous studies. Patients in our AD group were therefore less impaired, which narrowed the difference between their mean scores and those obtained by subjects in the comparison groups (MCI and controls). Second, our controls are not only non-demented but also cognitively unimpaired, since we separated cognitively healthy subjects (controls) from those with MCI through a neuropsychological assessment. Most studies conducted in the late 1980s and 1990s did not separate these two states, leading to contamination of the control group with subtle forms of cognitive impairment; in this case, the scores obtained by historical controls tend to be lower than the ones observed in our sample. In both instances (i.e., less impaired AD patients and above-average controls), the range of cognitive deficits will be smaller, affecting the cut-off scores that best separate groups. We understand that educational level is an important issue when assessing patients with cognitive complaints or impairment, particularly in populations with heterogeneous backgrounds. In this analysis, we controlled for this effect by using different cut-off scores for the cognitive screening tests, which have been established for the Brazilian population. 29 In addition, the clinical diagnoses provided by the multidisciplinary expert consensus meeting yield important insights as to whether any given finding is clinically relevant or not.

CONCLUSION
The present results highlight three important issues that have practical implications for the diagnostic workup of cognitive disorders. First, the MMSE alone may be sufficient in the screening for dementia, since the combinations of tests did not increase diagnostic accuracy. Second, the combination of the MMSE, the VFT and the CDT (by means of several different analytical strategies) did not accurately identify cases of MCI in a clinical setting. Finally, the diagnosis of MCI still depends on the information provided by a comprehensive neuropsychological assessment. Alternatively, the clinical evaluation along with other supportive diagnostic techniques, such as the measurement of AD-related biomarkers in cerebrospinal fluid and/or structural and functional neuroimaging, may be necessary to substantiate the diagnosis of MCI and the subsequent risk of developing AD. 39-41 Of course, these resources are restricted to specialized settings and are not appropriate for large-scale screening of cognitive impairment. From the clinician's perspective, the identification of patients with MCI, as well as the evaluation of the risk for progression to dementia, still relies on careful clinical judgment and the longitudinal determination of cognitive status with the available tools. 42

FUNDING: Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP, 02/12633-7). The Laboratory of Neuroscience is supported by Associação Beneficente Alzira Denise Hertzog da Silva (ABADHS).