Introduction
Early cognitive psychological findings1 regarding mental disorders suggest that patients hold personal beliefs about the world and themselves, which are expressed behaviorally through symptoms (e.g., aggressiveness, delusions, lability). Moreover, there is evidence suggesting links between self-schemas and cognitive background,2 which is also related to several psychiatric conditions.3 In this sense, the role of cognition has been highlighted in psychiatry, comprising not only classical functions (e.g., memory, language), but also those involved in handling social environmental stimuli (e.g., face emotion recognition, Theory of Mind [ToM], moral judgment).4 This field of cognitive science is known as social cognition, which has been considered a new target for therapeutic interventions.5
One of the major issues that has been addressed in social cognition, alongside recognition of facial emotions, is ToM. ToM is the ability to “mentalize” - to reason about our own thoughts, feelings, and intentions, as well as those of others. Studies of ToM are now widely conducted in humans, although the term originated in primate research.6 At the ages of 4 to 6, children are able to depict mental states in false-belief stories (an assessment paradigm of ToM); at the age of 7, they are able to understand metaphors; and at the age of 9 to 11 years, they perform ToM tasks similarly to adults. However, the course of ToM is not dependent on other cognitive abilities, as patients with Down syndrome or Williams syndrome do not exhibit impairments in social cognition.7
In adults, psychiatric conditions are related to impairments in ToM. For example, comparisons of participants with mood disorders,8 alcohol abuse,9 or dementia10 against healthy controls showed differences in favor of the controls. However, the conditions with the strongest evidence regarding ToM impairments are schizophrenia11 and autism.12 Considering schizophrenia and its associations with ToM performance, Frith argued that the disease leads to an impairment in ToM capacity, which could make false beliefs lead to false conclusions, thus increasing delusional symptoms.13 However, data about impairments in ToM are also available in people at high risk of psychosis.14 Hence, it is still unclear whether ToM functioning is a trait or a state marker of schizophrenia. Moreover, it is also unknown whether ToM can be a “social-endophenotype” for psychosis or a social-cognitive side effect of psychosis.15
The first methodological strategy used to assess ToM in schizophrenia research was the false-belief paradigm, based on vignettes that provided contextual information and required the participant to state what the characters in the story were thinking.16 This method is still used, but another method has become influential and is enriching results: the Reading the Mind in the Eyes Test (RMET). The RMET assesses one's ability to read the intentions/feelings/thoughts of others through facial expressions, specifically in the region of the eyes.17,18
The RMET was first described by Baron-Cohen et al.,19 who realized that information about context alone is not enough to figure out other people's intentions. They argued that people use resources from facial expressions to construct ideas about others' feelings and desires. The interpretation of such signs should be mentally represented according to individual characteristics, and this mental representation would characterize the mental states of others. Findings such as these led to the present understanding of ToM, which combines appropriate input (catching several stimuli), recognition, and representation, thus entailing adoption of resources from previous knowledge.20 However, the full apparatus can be abbreviated if an immediate response is required; therefore, ToM probably works in two modes - one controlled and one implicit.21
The first version of the RMET contains pictures from the eye region and two options; the participant must then choose the option that better represents the feeling expressed. Later, Baron-Cohen et al.19 upgraded the original version, adding two options to each item. Results from the use of this method supported their theory about the inability of an autistic person to capture subtle differences in emotion expression in this region of the face.18 This new version (Revised RMET) facilitated advances in psychiatric research, as it made it possible to combine the RMET with imaging techniques. Now, results are discussed with regard to behavioral consequences. In autism, for example, the difficulties involved in accurately understanding the feelings of others may cause isolation effects and the maintenance of avoidance of eye contact with others.22
The RMET can be completed either using paper and pencil, or as a computerized version. It is simple and its administration is not restricted to specialized professionals. The original revised version contains 36 pictures, which are all black-and-white and of the same region of the face (midway along the nose to just above the eyebrow), plus one to be used in a training condition. Each image is surrounded by four words describing mental states. The participants must choose the word that correctly depicts the mental state expressed in the picture. The test also includes a glossary, which is a list of all words the test contains, appropriate synonyms for each, and an example of their use in a phrase. The glossary must be presented before the test starts, a procedure adopted to prevent language biases. The RMET is often interpreted in terms of a total score; however, some studies have added subscales to the original version. The most common subdivision consists of splitting the items into positive, negative, and neutral emotions,23 although the psychometric properties of this procedure have not been officially recognized.24
The RMET has been translated into several languages, including French and German (see ARC website for more: http://www.autismresearchcentre.com/arc_tests), and has a documented validation process for the Turkish25 and Italian24 versions, with adequate internal and external consistency. Nevertheless, it is important to note discrepancies in cultural comparisons, which may reflect a familiarity effect related to the ethnicity of the actors in the pictures. For example, when the performance of American subjects was compared to the performance of Japanese participants in two versions of the RMET (the original version and a version with pictures depicting Asian actors), the authors found an important familiarity effect.26
Considering the strengths of the RMET and the lack of Brazilian Portuguese instruments for ToM assessment, we sought to translate the original English RMET into Brazilian Portuguese and adapt it to the Brazilian reality. Although an early version of the RMET was previously translated into Brazilian Portuguese by one of the authors of this paper (H.A.T.), this first version did not include all the semantic validation steps, and was thus beset with uncontrolled language issues. To overcome these issues, we conducted the translation and content validation of the RMET in Brazilian Portuguese in both paper-and-pencil and computerized versions.
Method
The Brazilian version of the RMET includes 37 items, as did the original test.18 Regarding material and legal matters, we contacted the Autism Research Centre (ARC, Cambridge, United Kingdom) and obtained their formal authorization to conduct the adaptation of the instrument to Brazilian Portuguese.
The final version was organized through standardized translation procedures. Special attention was paid to a valid semantic adaptation to the Brazilian sociolinguistic reality.27 The first step consisted of semantic and conceptual translations and back-translation. The second step sought to ascertain whether the translated Brazilian Portuguese target-words were correctly matched with the facial expressions in the pictures (acceptability trial). Finally, the third step was a pilot study, as recommended by the ARC. Each procedure is further illustrated in Figure 1.

Figure 1 Flowchart of methodological steps of adaptation. BT = back-translation; FV = final version; O = original version; T = translation by two researchers; V1 = version 1; V2 = version 2.
Semantic and conceptual translation
Two independent language professionals translated the original version of the test into Brazilian Portuguese. The two versions were submitted for semantic and conceptual evaluation by a psychology researcher to verify the general meaning of the terms in light of the theoretical model.27 These procedures led to the construction of a consensus version by the authors (version 1).
Following ARC recommendations and guidelines for test adaptations,27 a back-translation of version 1 was performed by a bilingual native English speaker. Later, to avoid repeated words and perform semantic validation, a professional linguist screened both version 1 and the back-translation. Any differences were analyzed and corrected; there were no conceptual conflicts. Despite these strict methodological procedures, some accuracy issues could still remain (for instance, some pictures might not correspond to their target words). To address this, the authors conducted a judging phase to test the acceptability of the final target words and their matching with the corresponding pictures.
Acceptability procedures
To test target-word vs. picture acceptability, an acceptability trial was conducted. The trial was computerized, performed on a 15-inch color monitor using E-Prime software version 2.0.28 Participants were seated in front of the computer screen so that their eyes were approximately 50 cm from the display and all response keys were located on a standard QWERTY keyboard. The same pictures from Baron-Cohen's final version of the RMET were presented in the center of the screen. Each picture remained on the screen until the participant responded. The presentation of options was randomized to avoid laterality-biased answers.
Ten raters (six female, four male), all mental health professionals, were invited to take part in the acceptability trial. This method is based on choosing between two words: the target word and a foil word. Foil words were selected from the glossary list, with the sole criterion of having a different meaning from the target word. The acceptability trial presented herein is similar to that adopted for the original version of the RMET, which had only two word options.19 The results were analyzed descriptively, looking for items on which two or more raters made a mistake. If the raters still disagreed with the target-word vs. picture match after the mistake was presented, the incorrect items were revised.
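The randomized two-option presentation described above can be sketched in a few lines of Python. This is only an illustration, not the E-Prime implementation used in the study: the function name `build_trials`, the left/right screen positions, and the placeholder words are assumptions introduced here.

```python
import random

def build_trials(items, seed=None):
    """Assign the target and foil word of each item to a random side.

    `items` is a list of (item_id, target_word, foil_word) tuples.
    The left/right layout is an assumption made for this sketch.
    """
    rng = random.Random(seed)  # a seed makes the order reproducible
    trials = []
    for item_id, target, foil in items:
        options = [target, foil]
        rng.shuffle(options)  # random side per item, to limit laterality bias
        trials.append({
            "item": item_id,
            "left": options[0],
            "right": options[1],
            "target": target,  # kept for scoring the response
        })
    return trials

# Hypothetical placeholder words, not the words of the actual instrument.
example_items = [(1, "target_a", "foil_a"), (2, "target_b", "foil_b")]
trials = build_trials(example_items, seed=1)
```

Keeping the target word alongside the shuffled positions lets each response be scored regardless of the side on which the correct word appeared.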
Once the acceptability trial was complete and all corrections had been made, the authors conducted a pilot study to look for problems in the RMET behavioral outcomes in a non-clinical sample. The pilot study was also an ARC recommendation, as previously described.
Pilot study
The final version of the RMET consisted of one block with 37 items (one item being the training trial) programmed on E-Prime 2.0 software. Each item began with a 500-ms presentation of the “+” sign in the center of the screen, which served as a fixation cue. Immediately following the termination of this display, the image stimuli (measuring 23.5 x 9.4 centimeters) were presented and kept on the screen until the participant responded (Figure 2). Unlike in the acceptability trial, the response was obtained using the DoHit Test function of E-Prime, which is a procedure to obtain responses with the mouse.
The pilot study was conducted with a sample of 10 individuals (five male, five female), whose mean age was 26.19 (±3.02) years, mean educational attainment was 14.53 (±2.86) years, and mean monthly household income was US$5206.77 (±2110.07), which, in the Brazilian reality, is considered a high socioeconomic class. In the pilot study, pictures were presented with four options, one in each corner of the picture (one being the target word and the others being foils), as illustrated in Figure 2. The foil words were those corresponding to the foils of the original version. The participant had to choose one option with the mouse.
The correction criteria for this step were based on the Turkish adaptation of the RMET.25 As in that study, an item was corrected if more than 50% of the sample answered it incorrectly or if 25% or more of the subjects incorrectly selected the same foil option. Remaining conflicts were re-evaluated and re-tested using the whole sample.
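The two correction criteria can be expressed as a short check. The following Python sketch is an illustration under stated assumptions: the function name `needs_revision`, its parameters, and the placeholder response labels are hypothetical, and the thresholds simply encode the rule described above.

```python
from collections import Counter

def needs_revision(responses, target, error_cutoff=0.5, foil_cutoff=0.25):
    """Return True if an item meets either correction criterion.

    `responses` holds the option chosen by each participant for one item.
    """
    n = len(responses)
    errors = [r for r in responses if r != target]
    # Criterion 1: more than 50% of the sample answered incorrectly.
    too_many_errors = len(errors) / n > error_cutoff
    # Criterion 2: 25% or more of the sample chose the same foil.
    most_chosen_foil = max(Counter(errors).values(), default=0)
    shared_foil = most_chosen_foil / n >= foil_cutoff
    return too_many_errors or shared_foil

# Hypothetical responses from 10 participants (placeholder labels).
item_a = ["target"] * 7 + ["foil_1"] * 3                  # 30% chose the same foil
item_b = ["target"] * 8 + ["foil_1", "foil_2"]            # errors are scattered
print(needs_revision(item_a, "target"))  # True
print(needs_revision(item_b, "target"))  # False
```

With a sample of 10, the second criterion is met as soon as three participants choose the same foil, so a single attractive foil can flag an item even when most answers are correct.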
Results
Semantic and conceptual translation
The previously planned steps were all completed, as presented in Table 1, which shows the original words, version 1, the back-translation, version 2, and the final version (completed after the acceptability trial and pilot study).
Table 1 Main changes to instrument terms through the translation and adaptation process
Original words are copyrighted material owned by John Wiley & Sons, Inc.18
The process, however, was conducted with some difficulties. The major issue was the need to maintain the same number of words, which was critical in some cases. Faced with this situation, the authors had two options: a) to use words which were not so common in standard Brazilian Portuguese; or b) to use fewer words than the original version of the RMET. Since the instructions permitted the respondent to use a glossary to recognize words clearly, the authors decided on option “a.” However, exceptions remained due to the limited number of synonyms in Portuguese. Therefore, eight words were repeated in version 1, three in version 2, and four in the final version.
After back-translation, a professional linguist compared the translated and original versions. This process revealed some mismatching terms. Such terms were revised in the Portuguese form, for example, “dispirited” and “encouraging,” which matched “humiliated” and “animated” in the back-translation version, respectively. The corrections were completed and the authors established version 2, which was used to conduct the acceptability and pilot trials.
Acceptability procedures
This procedure revealed a number of patterns regarding mistakes in some items. Among the 37 items, two raters got six items wrong (1, 4, 17, 25, 31, 36), three raters got one item wrong (10), and more than five raters got three items wrong (7, 9, 13).
Results from the acceptability trial alerted the authors to other issues; therefore, items 7, 9, and 13 were revised, changing the words “inquieto” to “apreensivo” (7), “nervoso” to “preocupado” (9), and “torcedor” to “aflito” (13). The remaining mistakes were analyzed together with the raters who conducted the trial. In these cases, raters agreed that errors were their fault, but once they knew the correct option, they were able to recognize it. Therefore, the authors and raters could reject the hypothesis that mistakes occurred because the word did not fit the picture properly or because there were too many confusing options. Once this phase was completed, the last procedure to be carried out was a pilot trial.
Pilot trial
The trial run detected few errors. Items 8, 9, 10, 12, 17, 23 and 32 were run a second time, because in the first run more than 25% of the mistakes in each of these items involved the same foil word. For this reason, these items were analyzed and modifications were made, especially to the foil words. For example, in item 9, the foil word “horrified” (translated as “amedrontado”) conflicted with the right answer (“preoccupied,” translated as “preocupado”). In this item, the foil word was replaced with another foil word, “relieved,” translated as “aliviado.”
Table 2 presents the frequency of responses in the final application of the RMET for each option of the task. The changes are described in Table 1, with the final version in the right-hand column. The retest was run with the whole first sample, and this time, the results showed no mistakes in items 10, 17, 23, and 32; items 8 and 12 still contained errors, but these reached neither 50% of the sample nor 25% for the same foil word option. Hence, this was considered the final version of the Brazilian adaptation of the Revised RMET. Of the 37 items, five reached 100% agreement, 15 had 90%, eight had 80%, seven had 70%, and two had 60%.
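As a quick consistency check on the agreement distribution reported above, the counts can be tallied in Python. The item-averaged agreement computed here is derived only for illustration from the reported counts; it is not a figure reported in the study, and the variable names are assumptions.

```python
# Number of items at each agreement level (%), as reported for the final version.
agreement_counts = {100: 5, 90: 15, 80: 8, 70: 7, 60: 2}

# The counts should account for all 37 items (36 test items plus training item).
total_items = sum(agreement_counts.values())

# Item-averaged agreement: each level weighted by the number of items at it.
mean_agreement = sum(
    level * n_items for level, n_items in agreement_counts.items()
) / total_items

print(total_items)               # 37
print(round(mean_agreement, 1))  # 83.8
```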
Table 2 Item option frequency in the pilot study (n=10)
Percentage of choices shown in parentheses; bold words refer to the correct option of each item; “-” means none of the participants chose the corresponding option.
Discussion
This project sought to construct a Brazilian version of the RMET, following recommended methodological steps. All procedures were successfully completed, resulting in a final Brazilian version of the Revised RMET. It adds an advanced method to social cognition research.29-31 Hence, the main contribution of this article is to add criteria for the translation and adaptation of the test. Furthermore, additional adaptations were made to the instrument to enable its administration using computer resources. Here, the translated methods were updated and an effort was made to follow translation guidelines27 and ARC recommendations (pilot running), to maintain the properties of the test.
As noted above, our adaptation of the RMET employed computerized methods. We stress that the use of computers allowed us to obtain more accurate results, in addition to other outcomes (e.g., reaction time, response time, bias). Computerized assessment is also an advanced method in neuroimaging studies investigating ToM functioning.32,33 However, the test can also be administered using paper and pencil, as the development of this Brazilian version was based on standardized procedures for translation of instruments and scales,27 including additional linguistic adaptation methods.
The revised Brazilian version of the RMET has 36 items plus the training item, preserving the format of the original English version. The instructions for administration are accessible to any trained health professional and consist of a glossary, which includes all the words used in the test. After presenting the glossary, the evaluator provides instructions concerning what must be answered and the need to accurately reason on the mental state depicted by the eyes in each photo. One answer out of four options is presented together with the picture. In this study, the answers were given by mouse response, although the response method can be adapted to the needs of the investigator.
The glossary presentation sought to diminish vocabulary limitation bias in the test outcomes. The words in the glossary are accompanied by their meanings, as well as synonyms and examples provided as statements (the glossary is available from the authors on request). In the glossary, we adapted some statements (e.g., character names) to correspond to Brazilian reality. Our glossary only contains the words we used, omitting duplicate translations, such as “annoyed” and “irritated” (both of which were rendered as the Portuguese word “irritado” in the final version).
All the steps followed in our adaptation aimed to achieve the most appropriate meaning for the translated words to ensure adequate fit to the corresponding picture. Furthermore, we sought to avoid conflicting words that could confuse respondents; the pilot study was conducted to identify such cases. In such situations, the authors had to conduct a theoretical analysis of the mixed outputs. These analyses always followed the search for emotions with a similar valence (e.g., “preocupado” and “nervoso,” both of which refer to anxiety states).
During the translation, limitations arose because the ranges of synonyms differ between languages and, in some cases, because of incompatibility between the grammatical classes of words. Some of the original words, out of context, could be considered nouns or even verbs in Portuguese (e.g., “comforting” and “disappointed”). The first translation committed this type of error, which was caught early on during back-translation and avoided in the final version. This observation revealed the need to restrict the options to adjectives. This recommendation was fully accepted, as the task consists of choosing the right description for the emotion depicted in each photo, that is, of describing the eyes in each picture with an adjective.
The acceptability trial presented in this study recreates the procedure of the original paper by Baron-Cohen et al.,19 in which the respondent chose between two options to describe the expression in the picture. In the present study, the respondent had unlimited time to respond to each item in the acceptability trial, whereas in the original work, the pictures were displayed for 3 seconds, after which they disappeared and the participant was forced to choose between the two options. Unlimited response time was chosen for this phase of the adaptation because, under a time limit, the participant needs to use other resources, such as working memory and attention, to provide a correct answer; limiting the display time to 3 seconds is therefore restrictive. Because the image must be held in mind while the participant selects an option,34 the outcomes could be biased. The authors wanted to avoid any potential bias (especially in an acceptability trial setting) by diminishing interference from other cognitive processes, as previous research has already found significant relations between complex cognitive functions and ToM.20
Having detailed the strengths of the methodology of this study, some limitations of relevance to the interpretation of this work must now be noted. The sample had a high level of educational attainment, which is not representative of Brazilian reality. Therefore, the possibility of unfamiliarity with some expressions must be considered, although the glossary tried to minimize this effect. Caution should be exercised when interpreting the results due to the lack of an evaluation of the back-translation by a native English-speaking health professional. However, a native English speaker adjusted the translation to make corrections as necessary, which helped improve linguistic control of the stimuli.
In conclusion, future studies should aim to validate the RMET among the Brazilian population. Furthermore, photos of Brazilians should be used in the test, as culture-driven familiarity effects have already been recorded in previous research.26 Validation should be conducted in larger samples, together with clinical samples that are known to reflect ToM impairments (e.g., autism, schizophrenia), so as to compare Brazilian RMET results with those observed in foreign data. After validation, ToM screening could be added to neuropsychological clinical evaluations. The Revised Brazilian RMET is a version of an established, elegant, and sophisticated method for assessment of social cognition. This subject is neglected in most psychiatric studies focusing on cognition conducted in our country, with the exception of theoretical works35 and reviews.36 To our knowledge, there is also a work by Jou and Sperb,37 who developed an instrument (using the false-belief paradigm) for the assessment of children. In addition, a translation of False Belief tests was recently carried out by our own group.38
Finally, considering future research, ToM instruments must continue to be developed and adapted to the Brazilian reality to enable a deeper understanding of misperceptions about the social environment and their relation to symptoms. Instruments addressing theories other than ToM must also be considered, as well as tasks that assess other social cognitive abilities, such as emotion recognition, social-value attribution, and social rules. In this sense, a common instrument in recent research, which has not yet been adapted to Portuguese, is The Awareness of Social Inference Test Revised (TASIT). This test has excellent ecological validity and comprises visual, verbal, and non-verbal issues, which other tasks/tests fail to investigate.39 TASIT has three parts, the third of which is considered to assess ToM through video presentations. Continued advancement in methods for the evaluation of social cognition is required to shed light on new perspectives that are increasingly being considered as crucial to changes in psychiatry and, consequently, to improvement of pharmacological and non-pharmacological treatments and even avoidance strategies for mental disorders.40