Evidence of the validity of a novel version of the computerized cognitive screening battery CompCog

ABSTRACT Although the availability of computer-based assessment has increased over the years, neuropsychology has not gone through a significant paradigm shift since the personal computer's popularization in the 1980s. To keep up with the technological advances of healthcare and neuroscience in general, more effort must be made in the field of clinical neuropsychology to develop and validate new, more technology-based instruments, especially ones that consider new variables and paradigms compared to paper and pencil tests. Objective: This study's objective was to produce concurrent validity evidence for the novel version of the computerized cognitive screening battery CompCog. Methods: Participants completed a traditional paper and pencil neuropsychological testing session and another session in which CompCog was administered. The data of a total of 50 young adult college students were used in the analyses. Results: The results showed moderate and strong correlations between CompCog's tasks and their paper and pencil equivalents. Items clustered in agreement with the subtest division in a principal component analysis. Conclusions: The findings suggest that CompCog is valid for measuring the cognitive processes its tasks intend to evaluate.


INTRODUCTION
New technologies are powerful tools to aid in diagnosis and health treatments. In the clinical neuropsychological context, technology can be incorporated into cognitive evaluation, stimulation, training, and rehabilitation. However, since the personal computer's popularization in the 1980s, neuropsychology has not gone through a significant paradigm shift. 1 The availability of computer-based assessment has increased over the years, especially over the past decade, 2 but neuropsychological testing still relies primarily on paper and pencil tasks. 3 The use of computerized tests is still rare, 1 and many of the efforts made to keep up with technological advancements are related to developing a computerized version of a traditional paper and pencil test. 4 A meta-analysis found that the two formats are usually equivalent. 5 There is also the criticism that both are based on theoretical concepts developed decades ago, resulting in a merely "cosmetic" change. 3 Some effort has also been made in software development related to the scoring of those traditional tests. 1 Thus, in general, studies are divided into two groups: (1) the adaptation of existing standardized tests to the computer and (2) the development of new tests and batteries for evaluating cognitive functions, ranging from the design and build to the validation of new computerized instruments. Examples of traditional and widely used neuropsychological tests that were adapted to computerized formats are the Wisconsin Card Sorting Test (WCST), 4 the Category Test (a section of the Halstead-Reitan Neuropsychological Test Battery), 6,7 versions of Raven's Progressive Matrices, 8 and the Corsi Block task. 9

Examples of computerized batteries developed for the evaluation of cognitive functions are the Automated Neuropsychological Assessment Metrics (ANAM), which evaluates attention, concentration, reaction speed, memory, mathematical ability, and decision-making; 10 the Cambridge Neuropsychological Test Automated Battery (CANTAB), which evaluates working memory and planning, visuospatial memory, and attention; 11 the Central Nervous System (CNS) Vital Signs, which evaluates memory, cognitive flexibility, psychomotor speed, reaction time, and complex attention; 12 the Computerized Neuropsychological Test Battery (CNTB), which evaluates information processing, motor speed, verbal and spatial memory, attention, language, and spatial abilities; 13 Mindstreams, which evaluates memory, executive functions, attention, visuospatial skills, verbal fluency, motor skills, and information processing; 14 the NIH Toolbox Cognition assessment, which evaluates executive function, episodic memory, language, processing speed, working memory, and attention; 15 and the Computerized Neurocognitive Scanning, which evaluates executive, memory, intellectual, and sensorimotor functions. 16 However, to keep up with the technological advances of healthcare and cognitive neuroscience in general, more effort must be made in the field of clinical neuropsychology to develop and validate new, more technology-based instruments that consider variables known to be valuable and that technology has made possible to measure, such as processing speed and reaction time. 17 Paper and pencil neuropsychological tests can hardly provide precise reaction time measurements, and this is an advantage that stands out among computerized tests: specific and complex variables, such as reaction time, can be measured down to the millisecond. 18

Reaction time is a variable that seems to be affected across many types of neurodegenerative conditions, 19-22 and the present computerized battery focuses on reaction time as its primary measure, although it also measures errors. Moreover, reaction time is computed as the median reaction time throughout all tests, so that concerns that have been raised about variability can be addressed. 23 Other computerized batteries using reaction time variables already exist and have presented good validity evidence, 24 but they are available only in English and use the mean reaction time instead of the median. In this context, this study's objective was to produce concurrent validity evidence for the novel version of the computerized cognitive screening battery CompCog. CompCog was initially called Bateria de Testes Neuropsicológicos Computadorizados 25 (BTNC; Brazilian Portuguese version). It was created using MEL Professional version 2.0 26 to evaluate anterograde episodic memory, attention, visual perception, information processing speed, and short-term memory. The first study on the battery investigated clinical markers of early Alzheimer's disease (AD), comparing 40 individuals with mild AD and 73 controls paired for age and education. The battery had six tests, and administration lasted 40 min on average. It was run on an IBM-PC-compatible microcomputer with a 14-inch SVGA color monitor, using a keypad with five buttons, labeled 1 to 5, as the response input device. The AD group showed a significantly lower percentage of correct responses on episodic memory and short-term memory and higher response latencies on all other tests when compared to controls. Receiver operating characteristic (ROC) analysis showed that the episodic memory, short-term memory, and choice reaction time tests were sensitive and specific in discriminating the groups and were, therefore, clinical markers of early AD. 18

After that study, a screening version of the instrument was created. It took only 15 min to administer and was named Computerized Cognitive Screening test (CompCogs). CompCogs used the same material as the BTNC, was developed with MEL Professional, ran on an IBM-PC, and used the same five-button keypad as the response input device. CompCogs was applied to 47 individuals with probable mild AD and 97 controls to investigate its validity for the early diagnosis of AD. Based on the ROC analysis of the AD diagnosis probability derived by logistic regression, CompCogs presented 91.8% sensitivity and 93.6% specificity for AD diagnosis. It showed high validity for early AD diagnosis and, therefore, may be a useful alternative screening instrument. 27 In 2011, CompCog was developed for mobile devices running the iOS operating system, maintaining the original evaluation concepts but with a new form of interaction (the touchscreen), a more dynamic interface, and the possibility of being administered on an iPad. This new version is broad and flexible: although there is a suggested order of administration, the examiner can select the tests and their order. It comprises eight tests that evaluate different cognitive domains, such as information processing speed and reaction time, implicit memory, episodic memory, working memory, attention, and inhibitory control. Battery administration lasts about 40 min in healthy individuals.
The test is available in Portuguese, Spanish, and English. Demographic data, such as full name, age, education, sex, and handedness, are collected at the beginning. The results of each test are presented at the end of each administration and at the end of the whole battery. All data are stored in the cloud and are available in an Excel spreadsheet accessible through the test's website. All answers are issued via the touchscreen and recorded. All tests generate reaction time measures registered in milliseconds for each touch, both as a total time and as a median, the latter to attenuate the effect of eventual discrepant data within each test. Furthermore, the percentage of correct responses, the errors, and the differences in reaction time between errors and correct answers are also registered. All test stimuli are visuospatial, except for one test, the Stroop test, which contains written words in order to maintain the original paradigm.
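CompCog's per-touch timing scheme described above (a total time plus a median to damp discrepant touches) can be sketched as follows; the `reaction_time_summary` helper and the sample values are hypothetical illustrations, not CompCog's actual implementation:

```python
from statistics import median

def reaction_time_summary(touch_times_ms):
    """Summarize one test's touch reaction times (hypothetical helper).

    The median is reported alongside the total so that a single
    discrepant touch does not distort the test's score.
    """
    return {
        "total_time_ms": sum(touch_times_ms),
        "median_rt_ms": median(touch_times_ms),
    }

# One very slow touch (2,900 ms) barely moves the median
summary = reaction_time_summary([310, 295, 320, 305, 2900])
```

With these values the mean would be 826 ms, while the median stays at 310 ms, which is why a median-based score is more robust to occasional lapses.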

METHODS

Setting and participants
The study took place at the Pontifical Catholic University of Rio de Janeiro, where undergraduate psychology students were recruited; data collection was performed at the university's psychology clinic. Participants completed a traditional paper and pencil neuropsychological testing session and another session in which CompCog was administered.
Initially, 64 young adults were selected, but 14 were excluded: 10 for psychiatric disorders, 1 for a neurological disorder, 2 for metabolic disorders, and 1 for drug use. Thus, the data of a total of 50 participants were used in the concurrent validity analyses. The mean age was 21.18 years (SD=4.02), the mean schooling was 13.5 years (SD=1.7), and 86% were women.

CompCog
This study used the tests' standard order, in Portuguese, during the data collection phase. The tests, the cognitive functions they assess, and their variables are described in Table 1.

Paper and pencil tests
The neuropsychological assessment was performed through a paper and pencil battery that included traditional tests commonly used in clinical neuropsychology in Brazil. Tests used and their variables are shown in Table 2.

Equivalence
The cognitive measures assessed through the traditional tests were compared to the ones evaluated by CompCog tasks, as shown in Table 3.

Ethics
This study was registered with the university's Research Ethics Committee and authorized by Favorable Opinion 2012-31. Volunteers signed a free and informed consent form, in accordance with Resolution 196/96 of Brazil's National Health Council, which establishes the guidelines and standards for research involving human subjects. Participation was voluntary and unpaid. The study posed no risk to volunteers' health, and they could refuse to participate and/or withdraw consent at any time.

Statistical methods
All data entry and analysis were carried out using SPSS 22.0 for Windows. Pearson's correlations were run between traditional paper and pencil variables and CompCog variables; p-values under 0.05 were considered statistically significant. A principal component analysis (PCA) with oblimin rotation (delta=0) was conducted on all time-measure items. Components were extracted based on eigenvalues greater than 1. Since the sample size was small, the Kaiser-Meyer-Olkin (KMO) measure was used to verify sampling adequacy for the analysis; KMO was 0.662, above the 0.5 threshold for adequacy. Bartlett's test of sphericity (χ²(300)=1658.970, p<0.001) indicated that the correlations between items were sufficiently large for PCA.
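The analyses above were run in SPSS. As a rough, self-contained illustration (omitting the oblimin rotation step), the Kaiser eigenvalue-greater-than-1 criterion and Bartlett's test of sphericity can be computed from a correlation matrix as follows; the data here are simulated, not the study's:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p = 50, 6  # participants x time-measure items (simulated)
latent = rng.normal(size=(n, 2))  # two underlying components
X = latent @ rng.normal(size=(2, p)) + rng.normal(scale=0.5, size=(n, p))

R = np.corrcoef(X, rowvar=False)  # item correlation matrix

# Kaiser criterion: retain components with eigenvalue > 1
eigenvalues = np.linalg.eigvalsh(R)[::-1]
n_components = int(np.sum(eigenvalues > 1))

# Bartlett's test of sphericity: H0 is that R is an identity matrix
chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
df = p * (p - 1) // 2
p_value = stats.chi2.sf(chi2, df)
```

A small p-value rejects the hypothesis that the items are uncorrelated, which is the precondition for running a PCA at all.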

Table 1. CompCog tests: cognitive functions assessed, task descriptions, and variables.

Simple Reaction Time
Cognitive functions: processing speed. As soon as a white square appears in the middle of the screen, the person should touch the rectangle at the bottom of the screen.
Variables: median reaction time, total time.

Choice Reaction Time
Cognitive functions: processing speed. As a white or orange square appears in the middle of the screen, the person should touch the rectangle of the same color at the bottom of the screen.
Variables: median reaction time, total time, correct answers.

Implicit Learning Test
Cognitive functions: implicit learning. As 1 of 10 gray squares distributed on the screen turns white, the person must press it. There is a fixed sequence of 25 squares that is repeated four times (sequences 1-4) and one last random sequence for control (sequence 5).
Variables: implicit learning (median reaction time in sequence 4 minus median reaction time in sequence 1) and implicit learning interference (median reaction time in sequence 5 minus median reaction time in sequence 1).

Visual and Spatial Short-Term Memory
Cognitive functions: working memory. There are 10 gray squares distributed on the screen. One becomes white at a time, forming a sequence that must be reproduced.
Variables: correct answers, direct-order span, reaction time in direct order, inverse-order span, reaction time in inverse order.

Face Recognition and Memory
Cognitive functions: episodic memory. A total of 10 drawings of unknown faces are presented for 30 s. Then, across 4 attempts, the participant should choose, between 10 pairs of faces, which face was among those initially shown for memorization.
Variables: total time and correct answers, and median reaction time and correct answers for each of the four tasks.

Inhibitory Control Test
Cognitive functions: attention, inhibitory control. Squares of different colors appear in the middle of the screen for 1 s each; the white ones should be avoided.
Variables: total time, median reaction time, correct answers' median reaction time, errors' median reaction time, and errors.

Stroop Test
Cognitive functions: attention, inhibitory control. All tasks have four colored rectangles located at the bottom of the screen. The person should touch the one matching the color of the stimulus that appears in the middle of the screen, without (task 1) and with distracters (tasks 2 and 3).
Variables: total time, interference, and median reaction time and errors for each of the three tasks.

Survey Test
Cognitive functions: attention. Squares of different colors appear in the middle of the screen for 1 s each. Participants should press the white ones in the first task, white and blue ones in the second, and yellow ones in the third.
Variables: median reaction time and correct answers for each of the three tasks.
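The Implicit Learning Test's two derived variables can be written out directly from their definitions; the reaction times below are hypothetical, and each sequence is shortened to five touches for brevity (the actual test uses 25 squares per sequence):

```python
from statistics import median

# Hypothetical reaction times (ms) per sequence
seq1 = [520, 510, 530, 515, 525]   # first pass over the fixed sequence
seq4 = [430, 425, 440, 420, 435]   # fourth pass: faster if learning occurred
seq5 = [505, 515, 500, 520, 510]   # random control sequence

# Negative values indicate speeding up across the fixed sequence
implicit_learning = median(seq4) - median(seq1)
# Interference: how much the random sequence removes the learned benefit
implicit_learning_interference = median(seq5) - median(seq1)
```

Here the fixed sequence is answered 90 ms faster by the fourth pass, while the random control sequence recovers almost all of the initial slowness, as expected when the speed-up reflects sequence learning rather than general practice.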

RESULTS
The results showed moderate and strong correlations between CompCog's tasks and their paper and pencil equivalents. The correlations are shown in Table 4, relating the comparisons reported in Table 1 to the variables reported in Tables 2 and 3. Participants' performance on each test is reported in Table 5. Regarding the PCA, Table 6 shows the pattern matrix, with the extracted components' loadings (i.e., regression coefficients) after oblimin rotation. The component correlation (Phi) matrix is also reported.

DISCUSSION
The results showed moderate (0.30<r<0.49) and strong (r>0.5) correlations 28 between CompCog's tasks and their paper and pencil equivalents. Some factors can explain the small number of strong correlations. First, it is important to highlight the differences between the two assessments used. Besides all the particularities that computerized tests have relative to paper and pencil tasks, CompCog is highly dependent on visual and motor stimuli, something unusual among traditional cognitive tasks, which rely on paper, pencil, and oral interactions. Other studies that aimed to produce concurrent validity evidence for computerized tests by comparing them to traditional analogous tests reported similar results. 29,30 Second, the target group for which CompCog was designed, i.e., elderly individuals and AD patients, 18 is not the group that composed the study sample. This made the study easier to carry out and facilitated comparisons, given the higher availability of participants and their normal cognitive performance on both batteries, but it also created a ceiling effect in some tasks that may have influenced the analyses. The high diagnostic accuracy reported in other studies is itself further evidence of the validity of CompCog's tasks. 18,27

Simple reaction time and choice reaction time
Both tests assess processing speed and attention, and their variables were correlated with TEACO-FF, TECON 1, and the WAIS Processing Speed Index. The correlations are negative because CompCog's measures are expressed in time, while the analogous tests' variables are correct answers. Choice Reaction Time seems to be a better measure of processing speed than Simple Reaction Time. One explanation is the possibility of developing an automatic pattern of tapping the screen in the Simple Reaction Time test, something that cannot happen in Choice Reaction Time, which requires choosing between options.
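The sign logic can be illustrated with hypothetical paired scores: CompCog variables are times (smaller is better), while the analogous paper and pencil variables are correct answers (larger is better), so the expected Pearson correlation between the two is negative.

```python
from scipy.stats import pearsonr

# Hypothetical scores for eight participants
median_rt_ms = [412, 388, 455, 340, 501, 430, 365, 480]   # CompCog (time)
correct_answers = [48, 50, 44, 53, 40, 46, 51, 42]        # paper and pencil

r, p = pearsonr(median_rt_ms, correct_answers)
# r < 0: faster responders also score more correct answers
```

The magnitude of r still indicates how strongly the two instruments track the same ability; only the sign is flipped by the opposite scoring directions.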

Implicit learning test
There are not many traditional paper and pencil tests that specifically measure implicit learning, possibly because of the difficulty of assessing this cognitive function with paper and pencil. Nevertheless, the WCST and the R-1 Test are options that, even though not primarily, involve implicit processes. 31,32 CompCog's implicit learning variable was correlated with the R-1 score (r=-0.305, p=0.031). The R-1 involves insight and implicit perception, 31 as does the implicit learning variable. The correlation is negative because the smaller the variable, the greater the implicit learning. The implicit learning interference measured through CompCog had a significant negative correlation with the WCST variable trials to complete the first category (r=-0.410, p=0.003). This can be explained because learning the rule to complete the first WCST category involves implicit learning. 32 The more trials one needs to complete the first WCST category, the less implicit learning occurred in the CompCog task and, consequently, the smaller the interference of this learning when completing the last sequence. The other variables are less influenced by implicit learning, which probably explains the absence of other correlations.
Visual and spatial short-term memory
There were almost no correlations between CompCog's short-term memory task and traditional paper and pencil short-term memory tests. However, it is important to notice that CompCog's task is based on visual and spatial stimuli, while the WAIS-III variables are based on auditory stimuli. It is widely accepted that, although there is a common component for processing both kinds of stimuli in working memory (i.e., the central executive), 33 there are different processing pathways for each of them: auditory stimuli are processed in the phonological loop, and visual and spatial stimuli in the visuospatial sketchpad, 33 making the two tests challenging to compare.

Face recognition and memory
The Rey Auditory Verbal Learning Test (RAVLT) and CompCog showed moderate correlations only between CompCog's first task and the total percentage of correct answers. An explanation for the lack of correlations between the other tasks and the RAVLT variables is the ceiling effect seen in the computerized test. As it was developed considering the memory capacity of elderly individuals and AD patients, university students could quickly learn the stimuli presented on the first try, producing a nearly constant variable in the subsequent tasks. In addition, the RAVLT involves auditory stimuli, while CompCog assesses memory for visual stimuli. The Rey Figure Test, in turn, assesses visual memory, but it is strongly influenced by executive functions, 34 which is not the case for CompCog. Even so, moderate correlations were found between the recall score of the Rey Figure Test and the median reaction time of the computerized test's 1st and 2nd tasks. Again, the ceiling effect may have contributed to the lack of correlations in the other two tasks.

Inhibitory control test
Inhibitory control is an executive function that requires inhibiting automatic processes in order to activate controlled processes. This cognitive function is evaluated in the Stroop's 3rd task and the Color Trails test, with which CompCog's Inhibitory Control Test had moderate correlations. Total time, median reaction time, and the correct answers' median reaction time were correlated with the Stroop's 3rd task. This makes sense, since the task takes longer to complete when participants need more time to produce their answers due to slower processing related to inhibitory control. The errors' median reaction time, in turn, was correlated with the Color Trails' 2nd task. A larger errors' median reaction time reflects situations in which inhibitory control actually did not work; the same can be said of the time to completion in the Color Trails' 2nd task.

Stroop test
The Stroop paper and pencil test and its computerized version are based on the same paradigm, but there are still some differences between the two versions. Correlations were found between the 2nd tasks of each presentation, the 3rd tasks, and interference. An explanation for the lack of correlation between the 1st tasks (the only ones not correlated) is that the paper and pencil version involves a more automatic process (naming colors) than the computerized version, which involves the time to choose between the color buttons at the bottom of the screen, a less automatic process than naming.

Survey test
Survey is a component of attention, and the Survey Test was mainly correlated with TEACO-FF, the traditional test most closely related to it. Both involve searching for a simple stimulus (a figure or colors). The correlation is negative because CompCog's test is measured through time (the smaller, the better), while the paper and pencil test is measured through correct answers (the higher, the better). TECON 1 did not present correlations with the computerized test, possibly because of the higher complexity of the stimulus to be searched, which demands working memory and divided attention in addition to search capacity.

A limitation of this study is, as already mentioned, the differences between the characteristics of the sample used and the population for which CompCog was created. 18 Nevertheless, the chosen sample also brought benefits, such as preserved cognitive abilities, allowing the assessment and comparison of the processes underlying each neuropsychological test. The differences between the two assessments (computerized and paper and pencil) can also be considered a limitation, but, at the same time, this is the option available for assessing the concurrent validity of new computerized neuropsychological batteries. Nevertheless, the high number of measures based on reaction time and total time is an important advantage of CompCog and other computerized tests. The precise measurement given by time variables, with high sensitivity and millisecond precision, is even more accurate than correct answers for evaluating cognitive processes, 19 as it may capture subtle changes in cognition that have not yet impacted the outcome. Moreover, the PCA supported the differentiation between the cognitive domains measured by the battery's subtests.

In conclusion, the results suggest that CompCog is a valid measure of memory, attention, implicit learning, inhibitory control, and processing speed.