Psychometric properties of cognitive screening for patients with cerebrovascular diseases: A systematic review

ABSTRACT. Screening instruments are ideal for acute clinical settings because they are easy to apply, fast, inexpensive and sensitive for specific samples. However, there is a need to verify the psychometric properties of screening in stroke patients. Objective: This study investigated the psychometric properties (methodological procedures) of cognitive screening for patients with cerebrovascular diseases. Methods: A systematic review of papers published on PsycINFO, Web of Knowledge, PubMed and Science Direct (2005 to 2016) was performed. Results: A total of 55 articles remained after applying exclusion criteria. The samples ranged from 20 to 657 patients. Most articles evaluated elderly individuals with four to 13 years of education who had experienced ischemic or hemorrhagic stroke. There was a tendency to find evidence of validity for criteria and to analyze the sensitivity/specificity of the instruments. Although the studies frequently used the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA) to seek evidence of validity and reliability, the use of these instruments among stroke patients has been criticized due to their psychometric properties and the neuropsychological functions evaluated. Conclusion: Although there is no gold standard screen for assessing adults post-stroke, instruments devised specifically for this population have shown promise. This review helps both researchers and clinicians to select the most appropriate screen for identifying cognitive impairment in adults post-stroke.

Cognitive impairment is a common consequence of stroke, occurring in approximately 45% to 83% of subjects depending on the follow-up time, neurological characteristics and instruments used. 1,2 Notably, cognitive impairment is observed in more than 50% of patients six months after stroke. 2,3 These patients can develop vascular dementia, which affects both functional independence and quality of life. [4][5][6] The most common deficits in vascular cognitive impairment include reduced processing speed, executive dysfunction, hemineglect, inattention, aphasia, apraxia and amnesia. 3,5,7 There is no consensus on which tests should be used to evaluate performance on these functions in post-stroke patients. 8 The selection of tools usually depends on an instrument's availability and on the neuropsychologist's preference and familiarity with the tasks. 9 Using an extensive neuropsychological assessment battery is impractical in many clinical settings, where evaluation with simpler cognitive screening instruments is required. 9 Screening instruments are therefore ideal for an acute clinical setting because they are easy to apply, fast, inexpensive and sensitive for specific samples. 10,11 Ideally, neuropsychologists should be aware of whether the selected screen has adequate psychometric properties for stroke populations in their countries. However, most neuropsychologists have based their diagnosis on instruments psychometrically tested in patients with nonvascular cognitive impairment. 6,[12][13][14] The Neuropsychological Working Group of the National Institute of Neurological Disorders and Stroke (NINDS) and the Canadian Stroke Network (CSN) have recommended three protocols (60-, 30- and 5-minute protocols) for assessing vascular cognitive impairment, 15 and their psychometric properties have been tested in many studies. [16][17][18][19][20]
Regarding the instruments' psychometric properties, neuropsychological tests should exhibit evidence of specific forms of validity: evidence based on test content, evidence based on response processes, evidence regarding internal structure (dimensionality and relationships between scores of the same test), evidence regarding relationships with conceptually related constructs (convergent and discriminant evidence), evidence regarding relationships with criteria (contrasting groups, effect size, concurrent and predictive validity), and evidence based on the consequences of testing. In addition, it is important for the tests to demonstrate reliability in the form of temporal stability and internal consistency. 21 Furthermore, the instrument should be constructed in a manner that aims to determine cognitive deficits in a specific population. Thus, we recognize the need to verify the psychometric properties of screening instruments in stroke patients. In this context, the present systematic review aims to identify the cognitive screening instruments with adequate psychometric properties for use in stroke patient samples. The specific aims of this review study were: (a) to analyze the quality of the methodological information reported (sample size, age and education of participants, neurological data such as cerebrovascular disease and time post-stroke); and (b) to identify cognitive screening instruments that have adequate validity and reliability evidence. This systematic review reports the methodological limitations of psychometric studies of adults post-stroke and investigates which screening instruments are most adequate for identifying cognitive deficits in these patients. This review article can be distinguished from other studies in this field that tend to discuss only the sensitivity and specificity of screening instruments 22,23 or fail to examine the psychometric properties of the tests in stroke patient samples. 8

METHODS
We performed a systematic review of papers published from January 2005 to December 2016 on the following databases: PsycINFO (refined by the terms in the abstract), Web of Knowledge (refined by the terms in the subject of the article), PubMed (refined to include the terms in the title and abstract), and Science Direct (refined by the terms in the abstract, title, or keywords). The refinements varied because the databases use different advanced search tools. The following combinations of keywords were applied: "stroke", "cerebrovascular accident", "vascular cognitive impairment", and "cerebrovascular disease" versus "neuropsychological assessment", "neuropsychological evaluation", "cognitive screening", "neuropsychological screening", "cognitive assessment", and "cognitive evaluation". These keywords were selected from the most commonly used terms in the health databases to include all articles that reported neuropsychological evaluations in stroke patients. After excluding repeated articles, the remaining articles were divided and analyzed by two judges. Four judges selected only empirical studies in English, Portuguese, French or Spanish that assessed adults with cerebrovascular disease using cognitive screening. If the two judges disagreed on the selection of a particular study, a third judge was recruited. The judges had experience in neuropsychological assessment post-stroke and knowledge about the instruments used.
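The keyword pairing described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the helper name `build_queries` is hypothetical, and joining each pair with a boolean AND is an assumption, since the text says only that the first group of terms was combined "versus" the second and that each database applied its own refinements.

```python
from itertools import product

# Population and assessment terms exactly as listed in the Methods section.
POPULATION_TERMS = [
    "stroke",
    "cerebrovascular accident",
    "vascular cognitive impairment",
    "cerebrovascular disease",
]
ASSESSMENT_TERMS = [
    "neuropsychological assessment",
    "neuropsychological evaluation",
    "cognitive screening",
    "neuropsychological screening",
    "cognitive assessment",
    "cognitive evaluation",
]


def build_queries(population, assessment):
    """Pair every population term with every assessment term.

    Hypothetical helper: the AND operator is an assumption, not a
    detail reported by the review.
    """
    return [f'"{p}" AND "{a}"' for p, a in product(population, assessment)]


queries = build_queries(POPULATION_TERMS, ASSESSMENT_TERMS)
print(len(queries))  # 4 population terms x 6 assessment terms
for query in queries[:3]:
    print(query)
```

Listing the pairs explicitly makes it easy to check that every combination was run in each database, whatever field restriction that database required.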
Many studies have evaluated neuropsychological deficits in stroke patients with cognitive screening tests, but failed to explicitly report that their analysis was psychometric. In these situations, we assumed that these articles claimed to analyze validity evidence based on relations to other variables. 21

RESULTS
After performing searches and excluding repeated articles, 74 studies that evaluated neuropsychological functions in stroke groups using screening instruments were selected. These articles were read in full, with a focus on the methods and results sections. Subsequently, 19 non-psychometric studies were excluded (Figure 1). The results and discussion will be presented in two sections: (1) characteristics of the samples; and (2) psychometric properties of the cognitive screening.

Characteristics of the samples
In our review, the samples ranged from 20 to 657 stroke patients, but only one study calculated the sample size. 24 Most of the articles evaluated individuals between 50 and 80 years old (Table 1), and only two studies included younger samples (i.e., patients under 30 years of age). 25,26 The majority of the studies evaluated patients with four to 13 years of education (9 years on average). However, 27.27% of the articles did not specify the educational levels of the participants (Table 1). With respect to cerebrovascular disease, 30.90% of studies evaluated ischemic and hemorrhagic stroke samples, 21.81% evaluated transient ischemic attack (TIA) and stroke patients, 10.90% ischemic stroke only, 9.09% cerebral small vessel diseases, 7.27% hemorrhagic stroke, 3.63% vascular dementia and 16.36% did not report this information (Table 1). The time between stroke onset and neuropsychological assessment varied. A total of 50.90% of studies assessed patients at 3 months post-stroke, 25.45% included patients who were assessed from 3 to 12 months post-stroke and 14.54% assessed patients more than 12 months post-stroke. A total of 9.09% of the articles did not report time post-stroke (Table 1).

Table 2 (fragment).
[Instrument name missing]: inter-rater (internal consistency reliability), 29 test-retest (temporal stability reliability), 29 comparison between contrasting groups (relation with criteria), 29 convergent validity (relation with related constructs). 29
Birmingham Cognitive Screen (BCoS): comparison between contrasting groups (relation with criteria), 30 predictive validity (relation with criteria). 30
Brief Memory and Executive Test (BMET): inter-rater (internal consistency reliability), 31 test-retest (temporal stability reliability), 31

Psychometric properties of the instruments
In the last few years, many studies have demonstrated the psychometric properties of the instruments according to the tripartite model of validity: content, criteria and construct. However, in our systematic review, we classified the evidence of validity and reliability in accordance with recently established definitions. 21 Most of the studies determined validity based on relationships with criteria (60%), relationships with related constructs (22.50%), and content-oriented evidence (0.83%) (Table 2). Only 19 articles presented data on the reliability of the instruments, 10.83% of which discussed internal consistency reliability and 5% temporal stability reliability (Table 2). With respect to the procedures adopted to determine validity and reliability, many of the studies included a sensitivity and specificity analysis (28.33%), considered convergent validity (or relations to other tests) (21.66%), compared contrasting groups (14.16%), examined predictive validity (9.16%), performed inter-rater analysis (7.5%), considered concurrent validity (5.83%), considered test-retest reliability (4.16%), tested for correlation with other measures (3.33%) and identified effect size (2.5%). The alternate form, discriminant evidence, response times and an analytical judgment of the instrument (face validity) were each investigated once (0.83% each).
As can be observed in Table 2, the Montreal Cognitive Assessment (MoCA) followed by the Mini-Mental State Examination (MMSE) were the instruments most analyzed to find validity and reliability evidence (40.90% and 18.18%, respectively). Other instruments were investigated once or twice per instrument. The studies were classified according to appropriate or inappropriate values present in the Discussion section of the articles (Table 2).

DISCUSSION
Characteristics of the samples
Regarding sample size, we identified wide variability in the number of participants, and only one study presented a sample calculation. Calculating the sample size in psychometric studies is recommended both to avoid finding differences between groups by chance and to increase the likelihood of detecting true, clinically significant differences. 68 Therefore, the results of many papers should be interpreted with caution because they do not use representative samples of stroke patients.
It is essential to ensure the sample's representativeness by providing a detailed description of its sociodemographic and developmental characteristics in empirical studies. 21 Most of the investigations involved elderly stroke patients (>60 years), and the psychometric properties of the screening are shown only for this age group. An increasing number of young people affected by this injury exhibit cognitive impairment, which is present in approximately 20% to 30% of young stroke patients. 3 Age influences patient performance on cognitive tasks. 27,28,60 Therefore, it is important to verify whether validity and reliability evidence vary according to this variable for each test.
Educational background may influence both patient performance and test sensitivity/specificity. 1,9 However, several studies included in this review did not discuss the education of participants and did not control for this variable, which is a limitation. 7,12,17,25,29,35,36,38,39,40,47,59,63,64,67,69 Adults with high educational levels usually have better performance on neuropsychological assessments, and the cut-off points of tests should take this into account. 27,28,45,58,60,61 Years of education should always be considered in empirical studies in neuropsychology.
In relation to neurological variables, many studies did not report the cerebrovascular disease of the participants (16.36%). Patients present vascular cognitive impairment regardless of stroke type, 49 although there are differences in the neuropsychological performance of patients with vascular dementia (VD), subcortical ischemic vascular disease (SIVD) and mild cognitive impairment (MCI). 45 Therefore, future studies could provide validity evidence and cut-off points for the screening according to cerebrovascular disease (when differences are found between groups). This would enable clinicians to know when significant deficits are present in each case.
Lastly, the time post-stroke is important to note in empirical studies because instruments have shown different cut-off points and because patients recover some neuropsychological functions approximately six months post-stroke. 9,18,30,31,49,55,61 Neuropsychological assessment is indicated after acute stroke. The early recognition of cognitive deficits leads to improved interventions and thus prognosis. 7

Psychometric properties of the instruments
Most instruments have shown validity regarding relationships with criteria, and the studies typically used age, education, stroke type and neuropsychological performance differentiations between clinical and control groups as criteria. This evidence is important in determining whether a neuropsychological instrument can predict either the performance of a specific group of individuals or whether there will be differences in the scores of contrasting groups. 21 However, a stroke may produce different behavioral changes in individuals, thus complicating the definition of a criterion group. Although heterogeneity of performance is important for identifying the test's psychometric properties, heterogeneity of lesions can limit the interpretation of the results for all types of cerebrovascular diseases.
Evidence based on relationships with related constructs was also one of the most common forms of validity evidence found for the screening instruments. Correlation with other tests and measures (related constructs) is important for showing that an instrument assesses the intended cognitive domains. 21 In general, cognitive screening instruments have been related to other instruments that evaluate similar neuropsychological functions. However, the strength of the correlation between instruments varied widely due to the different characteristics of the tests. For example, the CASP showed weak correlation with the MoCA and the MMSE, likely because it has visual items that can be administered to patients with severe expressive aphasia, while the other screening instruments are language-dependent. 35 Therefore, interpreting evidence of validity based on conceptually related constructs should be carried out with caution.
Other psychometric procedures, such as seeking content validity, may not have been found frequently because most of the screening instruments were not specifically devised for stroke samples. Further evidence of validity should be found in the manuals of the tests published in each country. Our study is limited by a failure to describe these data.
Most of the studies analyzed only the validity, not the reliability, of the instruments. We suggest that psychometric studies include analyses of reliability to broaden their evidence base and reduce measurement error. For example, some studies with test-retest reliability (temporal stability reliability) demonstrated that patients have better performance on the reevaluation. 9,18,33,34 Other studies show temporal score stability. 29,31,55,60,61,67 Several studies did not specify the time of cognitive evaluation. 20,39,51,55,58,66 Therefore, future studies should clarify the timing of the evaluation and show evidence in accordance with this variable.
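The distinction drawn above between improved performance on reevaluation and temporal score stability can be illustrated with a short sketch. It is not taken from any of the reviewed studies: the scores are invented for illustration, the function names are hypothetical, and the Pearson correlation is only one of several indices (the intraclass correlation coefficient is another) that a study might report.

```python
from math import sqrt
from statistics import mean


def pearson_r(first, second):
    """Pearson correlation between paired test and retest scores."""
    mean_first, mean_second = mean(first), mean(second)
    covariance = sum(
        (a - mean_first) * (b - mean_second) for a, b in zip(first, second)
    )
    var_first = sum((a - mean_first) ** 2 for a in first)
    var_second = sum((b - mean_second) ** 2 for b in second)
    return covariance / sqrt(var_first * var_second)


def mean_change(first, second):
    """Average retest-minus-test difference; positive values mean higher retest scores."""
    return mean([b - a for a, b in zip(first, second)])


# Hypothetical total scores for six patients (illustrative values only).
baseline = [18, 22, 25, 20, 27, 23]
retest = [20, 23, 27, 21, 28, 26]

print(round(pearson_r(baseline, retest), 2))    # rank-order stability
print(round(mean_change(baseline, retest), 2))  # systematic shift at retest
```

In this invented sample the correlation is high while the mean change is positive, which shows why a stable rank order and a gain at retest can coexist and why both are worth reporting together with the retest interval.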
Regarding psychometric procedures, sensitivity and specificity analyses were the most commonly used in the studies. The sensitivity of a test is the percentage of individuals with deficits that the instrument correctly identifies (true positive rate). In contrast, the specificity is the percentage of individuals without deficits in the neuropsychological functions measured that the instrument correctly identifies as unimpaired (true negative rate). According to Blake et al., 70 a cognitive screening instrument should have values superior to 80% and 60% for good sensitivity and acceptable specificity, respectively. However, many screening instruments did not reach these values. 6,7,14,19,20,24,27,38,[42][43][44][45]48,61,62,66,69 Therefore, items need to be better studied and replaced to improve the quality of the instruments.
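The two rates defined above can be computed directly from a 2x2 classification table. The following is a minimal sketch rather than a procedure taken from the reviewed studies: the function names and the counts are hypothetical, and the default thresholds follow the values attributed to Blake et al. in the text (sensitivity above 80%, specificity above 60%).

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as proportions.

    tp/fn: impaired patients flagged / missed by the screen.
    tn/fp: unimpaired patients cleared / wrongly flagged by the screen.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


def meets_thresholds(sensitivity, specificity, min_sens=0.80, min_spec=0.60):
    """Check both rates against the thresholds cited in the text (strictly above)."""
    return sensitivity > min_sens and specificity > min_spec


# Hypothetical counts: 50 impaired and 50 unimpaired patients.
sens, spec = sensitivity_specificity(tp=42, fn=8, tn=35, fp=15)
print(sens, spec, meets_thresholds(sens, spec))

# A screen that misses more impaired patients falls below the sensitivity threshold.
sens_low, spec_low = sensitivity_specificity(tp=38, fn=12, tn=35, fp=15)
print(sens_low, spec_low, meets_thresholds(sens_low, spec_low))
```

The proportion of impaired patients a screen misses is one minus its sensitivity, so a sensitivity of 80% leaves 20% of impaired patients unidentified.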
Notably, convergent validity and comparisons between contrasting groups were frequently executed. These procedures are important to seek evidence of validity based on relationship with criteria, as previously discussed. Differences between contrasting groups with various degrees of severity of vascular cognitive impairment were highlighted in many studies. 14,19,24,31,41,45,46,49,58,60,62,66 However, studies need to improve the control of variables such as sociodemographic (age and education) and neurological data (cerebrovascular disease) that influence patient cognitive performance. 4,46
In this review article, most cognitive screening instruments used in stroke samples were originally developed to evaluate MCI and Alzheimer dementia patients, such as the MMSE, MoCA, WCFT, R-CAMCOG, ACE and CDT. However, there is no theoretical basis to justify the use of such screening instruments, and they do not contain specific tasks for stroke patients. The application of neuropsychological instruments with a theoretical base is important both to justify patient deficits and to plan their rehabilitation.
The NINDS and the CSN recommended the use of the MoCA to evaluate vascular cognitive impairment as an alternative to the MMSE. 15 These instruments are correlated. 20,45,55,56,61 However, one advantage of the MoCA is that its ceiling effects are substantially less evident than those of the MMSE in stroke patients. 14,24,26,62 Although both instruments are commonly investigated, their applicability to stroke samples has been discussed. 12,14,24,25,44,[47][48][49][50]53,59,62 Some studies support the high sensitivity of the MoCA, 18,46,51,54 but reveal its low specificity. 12,14,24,42,48,49,52,57,62 Chan et al. 12 found that 77% of patients were classified as cognitively intact on the MoCA but were impaired for one or more cognitive domains on a neuropsychological assessment (intellectual functioning, processing speed, and visual memory) not evaluated by the screen. The MoCA also failed to identify patients without problems in daily life functioning after mild stroke 26,44,47,59 and to predict discharge destination; 50 however, a relationship between the MoCA and functional measures was found post-stroke. 17,26 The MoCA has demonstrated wide validity and reliability in several languages. However, researchers should exercise caution with MoCA cut-off points in each country because this test is influenced by educational level, 45,58,60,61 age, 60 cerebrovascular disease 14,19,24,45,58,60 and time post-stroke. 49,57 Additionally, deficits in language (comprehension and expression) and perception (hemineglect), which are common post-stroke, may negatively affect the performance of participants on MoCA tasks.
A limitation of the studies on the MoCA is that they generally classify patients as cognitively impaired using cut-off points derived from elderly samples without vascular disease or from other countries, 49 which underestimates the possible deficits post-stroke. It is also important to show cut-off points by subtest (cognitive function), which could contribute to understanding the impact of brain injury on specific skills. 62
The MMSE is more specific than the MoCA, 24,46 but is less sensitive for stroke patients. 6,43,45 This instrument can show differences between clinical and control groups 16,17,41 and between various cerebrovascular diseases, 16,24,46 but underestimates cognitive impairment post-stroke. 40 In addition, the MMSE has shown low prediction ability for functional outcomes. 44 According to Pendlebury et al., 14 the MMSE showed a ceiling effect in many subtests (naming, registration, reading and writing reaching near maximal scores) in amnestic, TIA and stroke groups. Moreover, the MMSE is insensitive for evaluating abstract reasoning, executive functioning, and visual perception/construction deficits that are present in subcortical lacunar strokes. 6 Compared to a detailed neuropsychological battery of tests, the MMSE did not present adequate levels of sensitivity and specificity. 13 However, refining cut-off scores by age and education can improve the sensitivity of the MMSE (at the cost of specificity). 48 Studies that performed sensitivity/specificity analyses with the MMSE showed that these values were no higher than 80%. 42,43 As indicated by Stolwyk et al., 22 these scores suggest that 20% of patients with vascular cognitive impairments are not identified, which is unacceptable in clinical practice. Therefore, the MMSE is not recommended 6,37 because it does not exhibit adequate psychometric properties for stroke patients.
Other cognitive screening instruments developed for MCI and dementia have been tested in stroke samples, but none are specific for this population (CDT, WCFT, ACE, ACE-R, R-CAMCOG, MEAMS, Cog-4, SIS, SINS and RBANS). The psychometric properties of these instruments are weak and insufficient in clinical practice. The BNIS, 28 ZüMAX 67 and Cognistat 7 are cognitive screening instruments developed for acquired brain lesions in general. The BNIS shows adequate psychometric properties, 28 but does not measure the neuropsychological functions usually impaired post-stroke. In contrast, the ZüMAX and Cognistat present little evidence of validity in small stroke samples. Psychometric studies with these tests require further evidence of validity and reliability and should determine the optimal cut-off level for stroke patients.
This review study found only seven cognitive screening instruments that are specifically designed to evaluate stroke patients: the BCoS, OCS, BNS, CASP, MVCI, BMET and NPEC. The BCoS assesses attention, executive function, language, memory, numeric abilities and praxis and exhibits wide validity and reliability. 29,30 However, evidence for the BCoS is available only from the country in which it was developed and for a Cantonese version. Therefore, researchers from other countries (with different cultures and languages) should test it in their regions before applying it. The OCS is based on the BCoS and avoids the confounding effects of aphasia and neglect that are frequent in stroke patients. 9,64 Demeyere et al. 64 showed higher sensitivity for the OCS than the MoCA in detecting cognitive impairments in stroke patients (88% vs. 79%). Future studies could build on the evidence of validity and reliability of this instrument, as well as provide broad normative data for other countries.
The format of the CASP appears better suited than the MMSE or MoCA for use in stroke patients with severe neurovisual disorders 36 and aphasia because it can be administered without using language. 35 However, its psychometric properties have yet to be studied. 36 The BNS 32 and NPEC 63 discriminated acute stroke patients with cognitive impairments from those without cognitive problems and can be used to determine different cognitive profiles according to the location of the lesion. The MVCI exhibits good validity and reliability, and the overall probability of correctly discriminating vascular cognitive impairment was 90.0%. 39 The BMET correctly identified 78% of patients with cognitive impairment, 31 but was tested only in cerebral small vessel disease patients and shows modest sensitivity.
Impairments in reasoning and executive functioning are the most frequent cognitive deficits in the early phase post-stroke. 6 Executive functions, attention and processing speed are also the most impaired functions in long-term stroke patients. 5 However, most neuropsychological screening instruments do not include tasks that evaluate these functions because they were not developed for stroke patients, which justifies the construction of specific instruments. Moreover, the majority of the studies may underestimate patient deficits.
In summary, the psychometric properties of neuropsychological screening for stroke patients have been explored by initial analyses that did not use representative samples. Although the studies most frequently used the MMSE and the MoCA to find evidence of validity and reliability, the use of these instruments in stroke patients has been criticized due to their psychometric properties and the neuropsychological functions evaluated. Therefore, more studies involving specific instruments for stroke patients are necessary to confirm the validity and reliability of the cognitive screening.
Authors' contributions. All authors drafted and critically revised the manuscript.