Translation, Adaptation and Construct Validation of the Clock Test among Elderly in Brazil

OBJECTIVE: In Brazil, few studies have investigated the validation of cognitive tests in the ageing population, and none has analyzed the psychometric properties of Tuokko's Clock Test. The objective of the study was to translate and adapt the test to the Brazilian context and to assess its construct validity. METHODS: This was a cross-sectional population-based study involving 353 elderly individuals from Juiz de Fora (Southeastern Brazil), carried out in 2004-2005. Pearson's correlation was used to assess convergent and divergent validity. To evaluate convergent validity, the Clock Test (CT) subtests were correlated with the following reference instruments: the Mini-Mental State Examination (MMSE), Digits, and Block Design. Divergent validity was assessed by comparing the subtests with the Center for Epidemiologic Studies Depression Scale. RESULTS: In the sample, 74.1% were women; age ranged from 63 to 107 years (73.8±8.5), and average schooling was 7.4 years (SD=4.7). Regarding convergent validity, significant correlations were found between all CT subtests and the MMSE, Digits, and Block Design (p≤0.01). As for divergent validity, the only subtest with a significant association with the reference scale was "clock setting" (p≤0.05). CONCLUSIONS: The Clock Test, translated and validated in a community sample of elderly individuals, proved to be a brief screening instrument with good construct validity when compared with other studies. Future research should investigate other psychometric properties, such as content and criterion validity.


INTRODUCTION
Population ageing was at first observed mainly in developed countries, but nowadays it is in developing countries that the elderly population is soaring. 24 This demographic transition goes hand in hand with a transition in the morbidity and mortality profile, which is mirrored by the increase in the prevalence of chronic degenerative diseases, such as Alzheimer's disease.
Dementia processes are among the main health problems in the elderly population. An early diagnosis is crucial to obtain better results during therapy. Some actions make it possible to delay the period of functional loss, mitigating unnecessary suffering of patients and their families and improving the quality of life of the people involved. These actions include pharmacological measures (galantamine, rivastigmine and donepezil) 15 and non-pharmacological measures, such as cognitive rehabilitation. 3
Diagnosing dementia syndromes is frequently a problem, especially in Alzheimer's disease, where certainty can only be achieved in post mortem studies. This being the case, the adopted standard for clinical diagnosis is based on diagnostic criteria established by international agencies (DSM-IV, 2 ICD-10 13 and National Institute of Neurological and Communicative Disorders and Stroke and Alzheimer's Disease and Related Disorders Association).
Neuropsychological tests are tools to aid in the early diagnosis of dementia processes. Tests used for cognitive screening are known to be easy to apply and can be carried out in a short period of time. They are important because they enable most dementia cases to be identified in places where there is a shortage of time and/or qualified human resources, such as primary healthcare services.
Despite the absence of a standardized way of applying and scoring the Clock Test, it is widely accepted as a cognitive screening tool. The most commonly used methods in medical practice, according to Braunberger, a are the ones proposed by Shulman et al, 17 Sunderland et al, 21 Wolf-Klein et al, 26 Mendez et al, 12 Tuokko et al, 23 Manos & Wu, 11 and Shua-Haim et al. 16 They differ in the instructions given for the task, the time patients have to write down the answers, and the scoring system adopted by each test. With regard to cognitive impairment scores, it is assumed that the clock test assesses visuospatial abilities, constructive abilities, and executive functions. 20
In a review study, Shulman 18 found that most clock test studies had sensitivity and specificity rates of approximately 85%, and high acceptability rates on the part of patients. b
Construct validity, since it assesses the legitimacy of the behavioral representation of latent features, 14 is probably the most fundamental way of studying neuropsychological tools, and it has been employed by many authors in clock test assessment.
A more comprehensive way of using the clock test in cognitive assessment was developed by Tuokko et al. 23 The authors took into consideration that the concept of time is abstract and that the abilities to draw, set and read certain times seem to be particularly sensitive in identifying cognitive impairment. This new test, herein called the Tuokko Clock Test (CT), is a clinical and research tool developed to detect cognitive impairment, with emphasis on investigating visuospatial, construction, visuoperceptive and abstract-conceptual abilities. The authors included two further tasks, setting and reading the time, believing that these additional tasks would enable a better understanding of cognitive impairment during its initial stages, when compared to clock drawing in isolation.
According to Tuokko et al, 23 the CT is a low-cost, reliable, valid and very useful tool in diagnosing dementia, especially when used in tandem with other cognitive assessment techniques and interviews with the patient and his/her family members.
Since it is a broader tool and uses tasks involving drawing, setting and reading the time, the CT can also be useful in the early diagnosis of suspected dementia in individuals with little or no schooling. This matters because performance in cognitive assessment tools is affected by schooling, a very significant variable in population studies such as the one addressed in this paper. Therefore, it is important to develop Brazilian studies to further investigate the CT.
The objective of the present study was to translate and operationally adapt the Tuokko clock test model to the Brazilian context and to assess its construct validity.

METHODS
The CT was applied in a population of elderly individuals living in the city of Juiz de Fora (Southeastern Brazil), between June 2004 and February 2005. This study is part of the Estudo dos Processos de Envelhecimento Saudável (PENSA) [Healthy Ageing Process Study], carried out in the city of Juiz de Fora.
CT translation, adaptation and validation were authorized by the author of the original tool. 8 One of the authors (KCAS) translated the tool, and the translation was submitted to a second translator (a Brazilian proficient in English).
The way the CT is applied and scored makes it costly in the Brazilian setting, because each kit is made up of seven detachable cards with carbon paper. In the original version the subject was not able to see what he/she had just done. This was possible because the cards were detached as soon as the subject finished the drawing.
In operationally adapting the CT, the number of cards and the material used in making the test kit were reduced, aiming at adapting the tool to the study's operational boundaries.
Pre-testing involved a sample of 20 elderly patients at the outpatient clinic of the university hospital, and resulted in making changes to the frame in the clock drawing subtest.
The frame consists of windows that open and close. When using the frame, the examiner must lift one window at a time (from biggest to smallest) and close each one as soon as the patient has finished drawing, so that the patient does not see what he/she has just drawn or what the next drawing will be (Figure 1). After the frame was adjusted, the CT was administered to 20 other patients at a center for the elderly connected to the research activities of a local university. After carrying out the test, the individuals were asked about the instructions for the tasks in the CT. No additional operational change was needed, since no patient reported difficulties in understanding or performing the tasks.
Between June and July 2004, the field researchers (four interviewers and one recruiter) were trained to apply the instrument. In the last stage, two assessors scored the CT. Standardization of scoring procedures among assessors yielded inter- and intra-assessor agreement of kappa=0.95 in a sample of 20 patients. Test-retest agreement based on two assessors and 20 tests, over a two-month period, had kappa values of 1.0 and 0.94 per assessor.
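For readers who wish to reproduce this kind of agreement statistic, Cohen's kappa for two assessors can be sketched as follows. This is only a minimal illustration, assuming that each assessor assigns one categorical label per test; the function name `cohen_kappa` and the example ratings are hypothetical and are not the study's data or the procedure used by the authors.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same items.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(ratings_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)

# Made-up ratings: 1 = "impaired", 0 = "not impaired".
full_agreement = cohen_kappa([1, 1, 0, 0], [1, 1, 0, 0])
chance_level = cohen_kappa([1, 1, 0, 0], [1, 0, 1, 0])
```

Kappa equals 1 when the assessors agree on every test and 0 when agreement is no better than chance, so it cannot exceed 1.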
The sample of the larger study (PENSA) was made up of elderly individuals in the community, residing in Juiz de Fora districts that accounted for 15% or more of the total city population. By using the door-to-door method of the Instituto Brasileiro de Geografia e Estatística (Brazilian Institute for Geography and Statistics), 956 elderly individuals were included in this study. 4 From this total, a randomized sample of 40% of the cases was selected to take part in this study. The inclusion criteria were: being able to hear and understand well enough to take part in an interview, and signing the informed consent form.
The total number of subjects selected to take part in this study was 382, of which 353 agreed to participate (92.4%).The reasons for not participating were: 27.7% death; 24.1% incapacity; 34.5% refusal and loss to follow up; and 13.7% moved.
The CT involves three empirical tasks: clock drawing, clock setting, and clock reading. The clock drawing subtest consists of a printed circle where the examinee must place the numbers and the hands of the clock showing 11.10 (Figure 2). Scoring this subtest relies on a detailed qualitative and quantitative system for classifying the mistakes made by the examinee. The scoring system classifies the mistakes into seven broad categories: omission, absence, perseveration, distortion, substitution, addition and rotation. To classify the mistakes and obtain the score, mistakes in drawing the face of the clock, the numbers, the hands and the spaces between the numbers are analyzed. One point is attributed to each mistake and there is no maximum score, although one rarely sees scores above 31 points.
The clock setting subtest consists of five circles on which the examinee is asked to draw the hands showing 3 o'clock, 9.15, 1 o'clock, 11.10, and 7.30 (Figure 1). Scoring this subtest consists of giving one point for every hand placed correctly, and an additional point when the hour hand is shorter than the minute hand (total=15 points).
The clock reading subtest (Figure 3) consists of a notebook where clocks (set to the same times as in the clock setting subtest) are shown to the examinee, who must state the time shown on each clock. Scoring consists of attributing one point to each hand read correctly and one additional point when both hands are read correctly (total=15 points).
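The scoring rules for the clock setting and clock reading subtests can be illustrated with a short sketch. It assumes that each of the five clocks has already been judged by an assessor and is recorded as a tuple of booleans, and that the extra clock setting point (hour hand shorter than minute hand) is awarded independently of hand placement, a detail the description above does not specify. The function names are hypothetical.

```python
def score_clock_setting(clocks):
    """Score the clock setting subtest.

    clocks: iterable of (hour_hand_ok, minute_hand_ok, hour_hand_shorter)
    booleans, one tuple per clock. One point per correctly placed hand,
    plus one point when the hour hand is shorter than the minute hand.
    Assumption: the third point does not depend on the first two.
    """
    return sum(int(h) + int(m) + int(s) for h, m, s in clocks)

def score_clock_reading(clocks):
    """Score the clock reading subtest.

    clocks: iterable of (hour_read_ok, minute_read_ok) booleans.
    One point per hand read correctly, plus one point when both are correct.
    """
    return sum(int(h) + int(m) + int(h and m) for h, m in clocks)

# Five clocks answered perfectly reach the stated maximum of 15 points.
perfect_setting = score_clock_setting([(True, True, True)] * 5)
perfect_reading = score_clock_reading([(True, True)] * 5)
```

With five clocks and up to three points per clock, both functions reproduce the 15-point maximum reported for each subtest.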
To assess CT construct validity, we used the following tools: the Mini-Mental State Examination (MMSE), Digits, Block Design and the Center for Epidemiologic Studies Depression Scale (CES-D). The MMSE is made up of 30 items, with subtests that assess spatial-temporal orientation, immediate memory, evocation, procedural memory, and language. The score ranges from 0 to 30 points. The cut-off points adopted in this study were those of Almeida, 1 namely 19/20 for individuals with no schooling and 23/24 for individuals with some formal schooling.
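The schooling-dependent MMSE cut-off can be written as a small rule. The sketch below assumes the usual reading of a cut-off written as "19/20", namely that a score of 19 or below falls on the impaired side and 20 or above does not; this reading, the function name and the use of years of schooling as the input are assumptions for illustration.

```python
def mmse_suggests_impairment(score, years_of_schooling):
    """Apply the schooling-adjusted MMSE cut-offs (19/20 and 23/24).

    Assumption: "19/20" means scores <= 19 suggest impairment for people
    with no schooling, and "23/24" means scores <= 23 suggest impairment
    for people with any formal schooling.
    """
    if not 0 <= score <= 30:
        raise ValueError("MMSE scores range from 0 to 30")
    if years_of_schooling < 0:
        raise ValueError("years of schooling cannot be negative")
    cutoff = 19 if years_of_schooling == 0 else 23
    return score <= cutoff

# The same score of 21 is classified differently depending on schooling.
no_schooling = mmse_suggests_impairment(21, 0)
some_schooling = mmse_suggests_impairment(21, 4)
```

A screening rule of this kind only flags suspected impairment; it is not a diagnosis.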
The Digits and Block Design subtests are part of the Wechsler Adult Intelligence Scale - Third Edition (WAIS-III). The first aims at assessing memory, immediate repetition, attention and working memory. The score is obtained by adding together the totals achieved in the forward and backward conditions, and the maximum score is 30. The Block Design subtest assesses perceptual and visual organization, the ability for abstract thinking, non-verbal concept formation and spatial visualization. The maximum score for this subtest is 68.
The CES-D consists of items that assess symptoms of depression in the week prior to the application of the tool. The score ranges from zero to 60. It was validated for Brazil by Silveira & Jorge. 19 The cut-off point adopted in this study was 16.
The clock-drawing subtest score, which counts mistakes, was converted into a total of correct answers to match the scale of the other subtests. To compare the studied population with the original PENSA population, we used the paired t-test for the continuous variable "schooling" and the McNemar test for the dichotomous variable "gender".
Convergent and discriminant validity were analyzed statistically through Pearson's correlation. To assess convergent validity, CT subtests were correlated with the Block Design and Digits subtests of the WAIS-III and with the MMSE. Discriminant validity was assessed through the correlation between the CT subtests and the CES-D.
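As an illustration of the statistic used here, Pearson's correlation coefficient between two sets of scores can be computed as below. This is a minimal sketch in plain Python (the study itself used SPSS); the function name `pearson_r` and the example scores are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation between two score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)

# Made-up scores: a perfectly increasing and a perfectly decreasing relation.
r_positive = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
r_negative = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])
```

For convergent validity one expects sizeable positive coefficients between the CT subtests and the cognitive reference instruments; for discriminant validity one expects coefficients close to zero with the depression scale.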
Data were analyzed through SPSS version 10.0 statistics software.
The study was approved by the Research Ethics Committee at the University Hospital of Juiz de Fora (#463.148.2004).

RESULTS
All individuals completed the test, and the data were analyzed for the entire sample, of which 74.1% were women. Age ranged from 63 to 107 years (mean 73.8 years; standard deviation 8.5; median 73; mode 63). Older ages prevailed (63.1% were older than 70 years of age). Approximately 7.9% of the elderly individuals reported having a university degree (Table 1).
When compared to the original PENSA population, there were no differences in the gender and schooling variables. The mean and the standard deviation for schooling in the PENSA sample were 7.1±4.4, and in the studied sample they were 7.4±4.6 (t=0.907; p<0.36). Women accounted for 71.8% of the PENSA sample, whereas in this study they represented 74.1% of the population (agreement rate=84%; p<0.38).
In the clock-setting subtest there was a concentration of higher scale values (mean 10.3 and SD=3.9). The same was noted in the clock-drawing subtest (mean 3.9 and SD=5.7). The clock-reading subtest presented the most homogeneous results among the elderly (mean 12.5 and SD=3.4). In regard to time required, the average CT time was 4.9 minutes. The sample's cognitive performance when assessed by the MMSE averaged 25.3 (SD=3.3), median 26, and mode 27. At the 23/24 cut-off point, 21.2% of the sample was cognitively impaired.
Performance according to the CES-D averaged 15.8 (SD=8.5), median 12, and mode 10. At the cut-off point of 16, which was suggested by the Brazilian validation study of the scale, 31.5% of the sample would be suspected of depression. Correlations among the CT subtests, and between them and the MMSE, Block Design, Digits, and CES-D, can be seen in Table 2.
Internal consistency among CT subtests, according to Cronbach's alpha, was 0.773.
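For reference, Cronbach's alpha for k subtests is k/(k-1) multiplied by one minus the ratio between the sum of the subtest variances and the variance of the total score. The sketch below is a hedged illustration of that formula; the function name `cronbach_alpha` and the example scores are hypothetical, and the layout of one score list per subtest is an assumption about how the data might be arranged.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of k item (subtest) score lists.

    items: list of k lists, each holding one score per respondent,
    in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("every item needs the same number (>= 2) of scores")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Made-up data: three subtests that rank four respondents identically.
alpha_identical = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

When the subtests move together perfectly, as in the made-up example, alpha approaches 1; weaker agreement among subtests lowers it.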

DISCUSSION
This is the first Brazilian study to apply the CT to a sample of the elderly population, translating it, adapting it operationally, and assessing its construct validity.
Herdman et al 7 suggest that in a universalist approach to transcultural adaptation it is necessary to investigate the conceptual, item, semantics, operational, measurement and functional equivalences of the tool.
In the present study we did not consider it necessary to carry out all the stages proposed by these authors, for the following reasons: applying the CT is simple, there is no semantic diversity, and it is not a scale of written items. The CT consists of straightforward instructions for tasks that do not require subjective interpretation on the part of the examiner or the examinee. On the other hand, we assumed the conceptual equivalence of the CT in the Canadian and Brazilian cultures, where the concept of hours and minutes as a measurement of time and the use of analog clocks as a means of measuring time are basic premises.
For the same reason, our initial impression was that it would not be necessary to have one group of independent translators carry out the translation and another group carry out the back translation. During pre-testing, examinees said that the test instructions were easy to understand, and there were no problems in performing the task after instructions were given for the first time. The lack of difficulties seems to confirm our initial hypothesis concerning the kind of translation adopted. However, it is possible that the lack of difficulties in understanding the instructions resulted from the fact that the individuals in the sample had a higher level of schooling than that found in other Brazilian population studies. 10 Thus, conducting focus groups with elderly individuals with different levels of schooling would probably yield more objective elements to support our hypothesis.
However, the possibility of adopting similar forms of the instrument concerning format, instructions, application and scoring (operational equivalence) was an essential task. Due to the high costs of the original CT version, adapting the test operationally was essential because of the limited resources available for this study.
In any case, operationally adapting cognitive tests in general is also necessary to ensure the high cost of these tests does not prevent them from being acquired and employed in Brazil's public health services.The adapted version of this study seems not to have interfered in the results, because the construct validity found was similar to the Canadian one.
The average CT test time (4.9 minutes) positions the tool as a "short" screening instrument. 9 Another positive aspect is that, due to its simplicity and speedy application, the CT can aid the diagnosis of cognitive impairment in places where there is a shortage of time and of professionals qualified in cognitive assessment. However, no other studies were found that measured the time taken by the CT; thus, this study is the first to investigate this variable.
As our first approach to the CT measurement equivalence, we examined its construct validity. Seeing that this is the first Brazilian study on the tool, we believed that investigating a population-based sample would be more interesting because it would enable us to describe its construct validity and performance profile in a sample of individuals who were supposedly not cognitively impaired. Based on the data gathered, the normative data will be addressed in future papers. In the future, as soon as clinical data on the cognitive performance of the individuals in the sample are available, criterion validity will be addressed.
The correlation with other cognitive assessment scales that have already been validated in Brazil (MMSE and WAIS-III subtests) suggests that the CT is able to measure what it aims at measuring. We found positive and significant correlations; however, they presented weak to moderate association strength. This leads us to believe that the constructs assessed by the CT, MMSE, Digits and Block Design may be similar, but not identical. The Digits subtest is the most specific for assessing working memory, and Block Design for assessing visuospatial planning. This is a commonly found problem in psychometrics, seeing that there are no specific tests to assess identical constructs. However, other studies 22,23 assessing the correlation between these subtests also found weak to moderate association strength. This may show that the association among these variables truly behaves in the fashion found in this study, since there really are differences among the constructs assessed by each instrument.
On the other hand, Kurzman a states that, in general, convergent validity correlations are high when they are calculated based on clinical samples, contrary to what is found in population samples. This may happen because the score variation in normal individuals is limited. The Canadian CT study also found moderate associations for its correlations. 23
As for the discriminant validity, there was no correlation between the clock drawing and clock reading subtests and the measure for symptoms of depression. In the Canadian CT study, none of the subtests were correlated with the mood scale. 23 However, in this study, the clock setting subtest presented a negative and significant correlation with the CES-D (r=-0.118; p≤0.05), meaning that the more symptoms of depression individuals have, the worse their performance in this subtest. This may be explained by the fact that this subtest requires more attention capacity, which is frequently altered in individuals suffering from depression. However, the correlation between the CES-D and the clock setting subtest was very weak, which suggests a spurious association.
In conclusion, the translated and adapted CT applied to a population sample of non-clinical Brazilian elderly individuals proved to be a short cognitive screening tool, which presented good construct validity when compared to other data in the literature. Its psychometric features concerning content validity and criterion validity deserve to be addressed in future studies, as do its sensitivity and specificity values.

Figure 1. Clock-setting subtest model and frame developed for the subtest.

a Kurzman D. The construct validity of the Clock Test in normal and demented adults [doctoral thesis]. Victoria: University of Victoria; 1992.

Table 1. Sociodemographic and self-reported and functional health characteristics of the sample. Juiz de Fora, Southeastern Brazil, 2005. N=353
BDLA: Basic Daily Life Activities; IDLA: Instrumental Daily Life Activities

Table 2. Correlation between the Clock Test and other cognitive assessment and depression tests. Juiz de Fora,
** p≤0.01; * p≤0.05; MMSE = Mini-Mental State Examination; CES-D = Center for Epidemiologic Studies Depression Scale