Psychometric properties of the Jefferson Empathy Scale in four nursing student faculties

1 Universidad Andres Bello, Faculty of Nursing, Santiago, Chile. 2 Universidad Bernardo O’Higgins, Health Faculty, Santiago, Chile. 3 Pontificia Universidad Católica de Chile, School of Medicine, Department of Psychiatry, Santiago, Chile. 4 Universidad San Sebastián, Dentistry Faculty, Concepción, Chile. 5 Universidad de Atacama, Faculty of Health Sciences, Copiapó, Chile. 6 Universidad Andres Bello, School of Dentistry, Research Department, Santiago, Chile.

ABSTRACT
Objective: To evaluate the psychometric properties of the Jefferson Medical Empathy Scale, Spanish version (JSE-S), its factorial structure, reliability, and the presence of invariance between genders in the behavior of empathy levels among Chilean nursing students. Method: Instrumental research design. The JSE-S was applied to 1,320 nursing students. A confirmatory factor analysis was used. An invariance study between genders was carried out. Descriptive statistics were estimated. Genders were compared using Student's t-test alongside a homoscedasticity analysis. The level of significance was α ≤ 0.05. Results: The confirmatory factor analysis determined the existence of three dimensions in the matrix. The statistical results of the invariance tests were significant, and allowed comparison between genders. Differences between genders were found in mean empathy values, as well as in some of its dimensions. Conclusion: The factor structure of the empathy data and its dimensions corresponds to the underlying three-dimensional model. There are differences in empathy levels and their dimensions between genders, with the exception of the compassionate care dimension, which was distributed similarly. Women were more empathetic than men.


INTRODUCTION
Empathy is a multidimensional construct with both cognitive and emotional components (1). The literature has shown a positive association between high levels of empathy and positive results in treatment and patient care from several points of view (1).
Empathy is a central part of nursing work as it is inherent in the therapeutic relationship (2). This makes it an essential component in delivering quality healthcare focused on the patient and family. The development of empathy allows nursing professionals to fulfill several care goals, such as alleviating loneliness and isolation, providing support, and understanding and validating patients in their health situation, to name a few. Despite this, evidence has shown that patients perceive low levels of empathy in interactions with health providers. This is a wake-up call for those who participate in nursing education, especially when technological development of care services has increased significantly, and may threaten the humanization of care (3).
Because of its relevance, empathy has become the subject of a substantial body of research. A relatively large number of these studies have focused on three principal factors: gender, year of study, and health specialty (4). Among dentistry and medical students in Latin America, several types of distribution have been found, which indicates variability, and not only decline, in the distribution of empathy. Variability has also been found with regard to gender. Empathic decline and gender differences therefore remain controversial, at least in Latin America (4).
The development of empathy is shaped by both evolution and ontogeny (1), and at present the latter seems to predominate over evolutionary factors in determining the empathic make-up of subjects, whether studied individually or socially. Therefore, family factors such as the mother-child relationship, together with complex social networks, psychological factors, moral factors and stress (among others), have a greater impact on empathic formation (5).
Although there are various measures of empathy, such as the Hogan Empathy Scale (6), the Emotional Empathy Scale (7) and the Interpersonal Reactivity Index (IRI) (8), the Jefferson Empathy Scale (JSE) (9)(10) is undoubtedly the most widely used measure of empathy in the context of health. It has been translated into 56 languages and used in more than 80 countries (11).
The JSE is a 20-item instrument specifically developed to measure empathy in the context of health-profession education and patient care, for administration to health professionals, students, and practitioners. Items are answered on a 7-point Likert-type scale (i.e., 1 = strongly disagree, 7 = strongly agree) (12). "Empathy in patient care was defined as a predominantly cognitive, rather than an affective, attribute that involves an understanding of pain and suffering of the patient, combined with a capacity to communicate this understanding, and an intention to help" (9). The JSE is conceptualized as a multidimensional construct comprising three related factors: (a) perspective adoption; (b) compassionate care; and (c) walking in patient's shoes (13)(14). These factors were identified by exploratory factor analysis, the preferred method in the early years of research on the scale.
The JSE is one of the instruments most commonly used in nursing to measure empathy (5,15). However, in Latin America the psychometric properties of the Jefferson Scale of Empathy, Spanish version (JSE-S), have been little studied, especially as regards nursing students.
As a consequence, the purpose of the present study is to evaluate the following psychometric properties of the JSE-S in Chilean nursing students: factorial structure, reliability, presence of invariance between genders and the behavior of empathy levels. There are two hypotheses: a) that the latent three-factor empathy model fits the sample data, and b) that empathy does not vary by gender.

Study design
Instrumental research design (16) using secondary data from four cohorts of nursing students.

Population
The population was made up of the students of the Faculties of Nursing of the Universidad San Sebastián (USS), Santiago and Concepción, Chile; Universidad Mayor (UM), Temuco, Chile; and Universidad de Atacama (UDA), Copiapó, Chile.

Sample definition
The sample sizes (n), population sizes (N) and corresponding coverage were as follows: USS (Santiago): n = 479, N = 740 (64.7%); USS (Concepción): n = 396, N = 589 (67.2%); UM: n = 277, N = 403 (68.7%); and UDA: n = 168, N = 255 (65.9%). Stratified samples were obtained by gender and course in each population analyzed (female: n = 1,073; male: n = 247). All students who attended classes on the day the instrument was applied were included, without restriction. Students who were absent were not evaluated later, in order not to contaminate the answers.
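The coverage figures reported above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the `coverage` helper are hypothetical names introduced here, while the counts are those given in the text.

```python
# Sample size (n) and enrolled population (N) per site, as reported in the text.
sites = {
    "USS Santiago": (479, 740),
    "USS Concepción": (396, 589),
    "UM": (277, 403),
    "UDA": (168, 255),
}


def coverage(n, N):
    """Percentage of the enrolled population that answered the scale."""
    return round(100 * n / N, 1)


# Coverage per site and overall number of respondents.
per_site = {name: coverage(n, N) for name, (n, N) in sites.items()}
total_n = sum(n for n, _ in sites.values())

for name, pct in per_site.items():
    print(f"{name}: {pct}%")
print("total sample:", total_n)
```

The total number of respondents agrees with the gender breakdown (1,073 women plus 247 men).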

Selection criteria
The inclusion criteria were the same as those reported by several authors (5)(6)(7)(8)(9)(10)(11)(12)(13)(14)(15): the scale was answered by those students who were participating in the class or clinic. There were no exclusion criteria, since the objective was to evaluate the variable of interest in the greatest possible number of students. However, because students could be visiting different clinical areas or attending classes in different places, in addition to absences and other circumstances, it was not possible to apply the scale to all students. The scale was not applied a second time, to avoid possibly biased responses.

Data collection
The data were collected between July 2016 and November 2018. The JSE-S was applied. Prior to its application, the JSE-S was submitted to a committee composed of five relevant faculty members from the fields of psychology, nursing and higher education, in order to verify cultural and content validity. Subsequently, a pilot study was carried out to verify students' understanding of the culturally adapted scale. Application was confidential with a neutral operator, after signing informed consent.

Instrument
The JSE-S is an instrument to measure empathy in medical students. It consists of 20 items, each answered on a Likert-type scale from one to seven points (140 points in total). It is composed of three dimensions: Compassionate Care (CC), Perspective Adoption (PA) and "Walking in Patient's Shoes" (Wips). The essential property of these dimensions is that they interact dialectically (1).

Data analysis and treatment
Item-test correlations were computed using Pearson's coefficient (17). The study of extreme groups was carried out using the standardized difference, for each item, between the scores of the 25% of the sample with the highest scores and the 25% with the lowest scores. The value obtained is similar to Cohen's d. The sample was randomly divided into two groups (n1 = 674 and n2 = 640). An exploratory factor analysis (EFA) was performed on n1, and a confirmatory factor analysis (CFA) was performed on n2. The Kaiser-Meyer-Olkin and Bartlett sphericity tests were applied to the EFA data. The number of latent factors was determined using three criteria: (a) the Kaiser criterion (eigenvalue > 1); (b) scree plot analysis; and (c) previous evidence on the factorial structure. Factor extraction employed weighted least squares mean- and variance-adjusted (WLSMV) estimation. The solution was rotated obliquely using Promax. A factor loading ≥ 0.3 was considered adequate. The goodness-of-fit indices used were the root mean square error of approximation (RMSEA) and the standardized root mean square residual (SRMR). RMSEA < 0.08 and SRMR < 0.08 indicated a good fit to the model (18). The CFA specified a model of 20 items and three latent variables: PA (10 items), CC (8 items) and Wips (2 items).
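The extreme-groups index described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the groups are formed on the total score and that the difference is standardized by the pooled standard deviation of the two groups (as in Cohen's d). The function name and the eight-respondent data set are hypothetical.

```python
import statistics


def extreme_group_index(item_scores, total_scores, fraction=0.25):
    """Standardized mean difference on one item between high and low scorers.

    Assumption: groups are the top and bottom `fraction` of respondents
    ranked by total score; the pooled SD is the root of the mean variance.
    """
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(2, int(len(order) * fraction))  # at least 2 people per group
    low = [item_scores[i] for i in order[:k]]
    high = [item_scores[i] for i in order[-k:]]
    pooled_sd = ((statistics.variance(low) + statistics.variance(high)) / 2) ** 0.5
    return (statistics.mean(high) - statistics.mean(low)) / pooled_sd


# Hypothetical data: one item (1-7 points) and the total score (20-140 points).
item = [2, 3, 3, 4, 5, 6, 6, 7]
total = [80, 85, 90, 100, 110, 120, 125, 130]
d = extreme_group_index(item, total)
print(f"extreme-groups index: {d:.2f}")
```

Larger values indicate that the item separates high and low scorers more sharply; the function is undefined when both groups have zero variance on the item.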
The general fit of the model was evaluated using the RMSEA, the Tucker-Lewis index (TLI), the comparative fit index (CFI), and the weighted root mean square residual (WRMR). The values that suggest a good fit are (19)(20): CFI > 0.9, TLI > 0.9, RMSEA < 0.08. Since the chi-square statistic is sensitive to sample size, the ratio between the model's chi-square and its degrees of freedom was used. Values below 3.0 suggested an acceptable fit. Reliability was estimated using Cronbach's alpha coefficient for the full scale and its dimensions. Changes in reliability upon eliminating items were also analyzed (adequate value ≥ 0.7).
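As an illustration of the reliability analysis described above, the following sketch computes Cronbach's alpha from raw item scores and recomputes it with each item removed. The response matrix is hypothetical (five respondents, three items) and the function name is introduced here for illustration only; the study itself used SPSS and Mplus.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` is a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [statistics.variance([r[j] for r in rows]) for j in range(k)]
    total_var = statistics.variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses on a 1-7 scale.
rows = [
    [5, 6, 5],
    [3, 4, 3],
    [6, 7, 7],
    [2, 3, 2],
    [4, 4, 5],
]

alpha = cronbach_alpha(rows)

# Alpha recomputed after dropping each item in turn.
alpha_without = [
    cronbach_alpha([r[:j] + r[j + 1:] for r in rows]) for j in range(len(rows[0]))
]

print(f"alpha = {alpha:.3f}")
for j, a in enumerate(alpha_without, start=1):
    print(f"alpha without item {j} = {a:.3f}")
```

An item whose removal raises alpha noticeably is a candidate for review; with the 0.7 threshold used in the text, both the full-scale value and the item-deleted values can be compared against the same cut-off.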
Estimation of factor invariance allowed valid comparisons between these groups (21). Three models were estimated (configural, metric, and scalar), and each was compared sequentially with the previous one (i.e., metric vs. configural, scalar vs. metric). A new level of invariance was accepted if the difference in the comparative fit index (CFI) between the two models was less than 0.01. To make valid comparisons between groups, it was necessary to have scalar invariance (21). The mean and standard deviation were estimated for each of the universities and then by gender within them, as well as for the total, unified data. The comparison between genders was carried out using Student's t-test, and homoscedasticity was assessed with Levene's F test. The significance used was α ≤ 0.05 and β ≥ 0.2. SPSS 25.0 ® and Mplus 8 software were used.
A source of bias in this study was that the sample was not random, and was simply made up of all the students who attended classes on the day of the evaluation. The others were not evaluated because they were not attending classes or were in the clinical area, among other reasons. All surveys were fully answered.
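The sequential decision rule for invariance described above can be expressed compactly. The sketch below is an assumption-laden illustration: the function name and the CFI values are hypothetical, and only the 0.01 threshold and the order of the models come from the text.

```python
def highest_invariance_level(cfi_by_model, max_change=0.01):
    """Return the most restrictive invariance level supported by the CFI rule.

    Each model is compared with the previous, less restrictive one; a level is
    accepted when the CFI difference between the two is below `max_change`.
    """
    levels = ["configural", "metric", "scalar"]
    reached = levels[0]
    for previous, current in zip(levels, levels[1:]):
        change = abs(cfi_by_model[previous] - cfi_by_model[current])
        if change >= max_change:
            break  # stop at the first level that fails the criterion
        reached = current
    return reached


# Hypothetical CFI values for two scenarios.
supported = highest_invariance_level(
    {"configural": 0.970, "metric": 0.968, "scalar": 0.963}
)
not_scalar = highest_invariance_level(
    {"configural": 0.970, "metric": 0.968, "scalar": 0.950}
)
print(supported, not_scalar)
```

Only when the function returns "scalar" would mean comparisons between groups be considered valid under the criterion stated in the text.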

Ethical aspects
This study was conducted in accordance with the Declaration of Helsinki. Student participation was voluntary and confidential, and students signed informed consent prior to completing the instrument. This study was approved by the Research Ethics Committee of the Faculty of Dentistry of the Universidad San Sebastián (N° 2015-02 and N° 2020-83).

RESULTS
The percentages of students by university and gender, and the combined totals observed, are as follows: UM […]
The normality and homoscedasticity tests were not significant (p > 0.05), from which it is inferred that the data are normally distributed with equal variances between the groups compared.
Corrected correlations and discrimination indices are shown in Table 1. The item-test correlations ranged between 0.2 and 0.5. A value of 0.3 indicated adequate discriminative capacity, and 13 of the 20 JSE-S items were above this value. However, when analyzing the behavior of extreme groups, all items showed a moderate to large capacity to discriminate between people with high and low levels of empathy (12). The Kaiser-Meyer-Olkin index was above 0.6 (KMO = 0.86) and Bartlett's sphericity test was significant (χ²(190) = 4130.9, p < 0.01), which implies that the data were suitable for EFA. Three factors were extracted, with eigenvalues of 6.65, 1.69 and 1.39 respectively. Together they explain 48.6% of the variance present in the data. The items were distributed across the factors as previously reported in the literature (Table 2). The first factor (compassionate care) included 7 items; the second factor (perspective adoption) included 10 items; and the third factor (walking in patient's shoes, Wips) was made up of 3 items. The factor loadings were greater than 0.3 in all cases but one: only item 18 showed a loading of less than 0.3.
However, its discrimination coefficient was high, suggesting that it was able to distinguish between people with high and low empathy scores. Therefore, as in a previous study, we decided to keep it in the Wips dimension (17). The three-factor model fit the data adequately, with significant standardized estimates for all items (p < 0.001) (Table 3). The RMSEA was 0.04, the ratio of chi-square to degrees of freedom was 1.85 [χ²(167) = 306.729], and the CFI and TLI both reached a value of 0.97. Items 18 and 19 showed factor loadings slightly below 0.3. Considering the good overall fit of the model, the discriminative capacity of the items and the theoretical sense of the factorial solution, we decided to retain both items. The reliability of the JSE-S was estimated using the study's total sample. Cronbach's alpha coefficient for the full scale was 0.75 (standardized Cronbach's alpha = 0.79) and McDonald's omega was 0.81, which indicates good reliability. The coefficients for each dimension were 0.70 (CC), 0.68 (PA), and 0.63 (Wips).
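The share of variance attributed to the three extracted factors follows directly from the reported eigenvalues. The sketch below assumes the eigenvalues come from a correlation matrix of the 20 items, so that the total variance equals the number of items; that assumption, and the variable names, are introduced here for illustration.

```python
# Eigenvalues of the three extracted factors, as reported in the text.
eigenvalues = [6.65, 1.69, 1.39]
n_items = 20  # assumed total variance of a 20-item correlation matrix

# Kaiser criterion: retain factors with eigenvalue > 1.
retained = [ev for ev in eigenvalues if ev > 1]

# Percentage of total variance accounted for by each retained factor.
explained = [100 * ev / n_items for ev in retained]
total_explained = sum(explained)

for i, pct in enumerate(explained, start=1):
    print(f"factor {i}: {pct:.2f}%")
print(f"total: {total_explained:.2f}%")
```

Under this assumption the first factor alone accounts for about a third of the variance, and the three factors together for just under half, in line with the figure reported above.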
The JSE-S reached a level of scalar invariance between men and women, which allows valid comparisons between both groups. Such a scalar level implies that men and women are equivalent in the number of latent factors present in the JSE-S (configural invariance), the factor loadings of the items on each factor (metric invariance), and each item's mean (scalar invariance). This indicates that the scale measures the same construct in the same way in both groups.
The mean and standard deviation values for empathy and its dimensions for the universities examined are presented in Table 4. Although there are no cut-off points by which to assess the levels indicated, these values were relatively high for all universities and both genders examined. In fact, the average total value of E (118.75 points) represents 84.82% of the total possible score (140 points); in CC the average score was 42.96, 87.67% of a possible 49 points; in PA it was 63.86, 91.22% of a possible 70; and finally in Wips the observed value was 11.92, 56.76% of a possible 21, which was a relatively low value (Table 4).
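The percentages of the maximum possible score can be reproduced from the reported means. The sketch below assumes the item counts per dimension found in the exploratory solution (7 items for CC, 10 for PA and 3 for Wips) and a maximum of 7 points per item; the variable names are hypothetical.

```python
MAX_PER_ITEM = 7
items_per_dimension = {"CC": 7, "PA": 10, "Wips": 3}  # assumed from the EFA solution

# Mean scores reported in the text for total empathy (E) and each dimension.
means = {"E": 118.75, "CC": 42.96, "PA": 63.86, "Wips": 11.92}

# Maximum attainable score per dimension and for the full scale.
max_score = {dim: n * MAX_PER_ITEM for dim, n in items_per_dimension.items()}
max_score["E"] = sum(items_per_dimension.values()) * MAX_PER_ITEM

# Mean expressed as a percentage of the maximum attainable score.
pct_of_max = {dim: 100 * means[dim] / max_score[dim] for dim in means}

for dim, pct in pct_of_max.items():
    print(f"{dim}: {means[dim]} of {max_score[dim]} points ({pct:.2f}%)")
```

Expressing each mean relative to its own maximum makes the dimensions comparable despite their different numbers of items, which is what allows Wips to be identified as the lowest-scoring dimension.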
Table 4 - Descriptive statistics of empathy and each of its dimensions arranged by university and gender - Copiapó, Santiago, Concepción, Temuco, CH, Chile, 2016-2018.

When comparing empathy and its dimensions by gender, it was found that women presented higher levels of E (t = 3.224, p = 0.001), PA (t = 3.036, p < 0.01) and Wips (t = 2.229, p ≤ 0.05) than men. The exception was the CC dimension, in which no significant differences were found (t = 1.625, p = 0.104).

DISCUSSION
The two hypotheses of the present study were that the latent factorial structure of the theoretical empathy construct would not differ from that found in the present work, and that there would be invariance between genders. As a consequence, the aim of this study was to evaluate the psychometric properties of the JSE-S, the presence of invariance between genders, and the behavior of empathy levels among Chilean nursing students.
The JSE-S has been validated for use in nursing students in different contexts (2). However, this is the first time that its psychometric properties have been explored in nursing students in Latin America, and in Chile in particular. Our analysis shows that the items of the Chilean version of the JSE are able to discriminate between students who have different levels of empathy. In the CFA, the three-factor solution showed a good overall fit, with two dimensions associated with the cognitive aspects of the professional nurse-patient relationship (PA and Wips) and one with the emotional aspects (CC) (17). The first factor, compassionate care (CC), included 7 items. The second factor, perspective adoption (PA), included 10 items. Finally, a third factor, walking in patient's shoes (Wips), included only two items, consistent with what was previously reported in the original (17). Only item P18, belonging to the first, showed a low factor loading (0.168). However, the discrimination coefficient for this item, as well as its correlation with the total test score, suggests that it is able to predict the JSE-S scores and distinguish between people with high and low empathy scores. Therefore, it was retained in this study, as it has been in others (17,22-24). Reliability levels in general were satisfactory, especially for the global scale. These findings show that the empathy scale is reliable and valid for Chilean nursing students.
Regarding the specific results of the distribution of empathy and its components, the values are relatively high, although no cut-off points exist as yet. For empathy, the score was 84.82% of the possible total (118.75 of 140 points). This result is similar to previous reports on nursing students, which have shown average scores of between 104 and 115 (2). Regarding specific dimensions, the CC (87.67%) and PA (91.22%) dimensions obtained higher scores than the Wips dimension, which obtained the lowest score: 56.76% on average (11.92 out of a maximum of 21 points) (Table 4). Despite these differences between subscales, from a theoretical point of view, empathy should be considered a system with three elements: CC, PA and Wips. Empathy is therefore based on the interaction among these elements, a relationship characterized mainly by active, positive correlation. Any alteration of their natural relationship (for example, the decrease or absence of positive correlation of an element, such as Wips in our case) may alter the system itself; consequently, it can no longer function as before, or it changes into another type of system (1,25,26). In other words, empathy is a dialectical synthesis of cognitive (PA and Wips) and affective (CC) attributes.
Different studies have shown interest in the variation of nurses' empathy levels between men and women when measured with the JSE-S (2,27). However, to make valid comparisons in this sense, it is necessary to check the equivalence of the structure (invariance) of the scales used between the two groups. Our analysis supports the gender invariance of the JSE-S among nursing students: therefore, it is possible to make valid comparisons of empathy scores across gender. The results observed in this study show that women were more empathetic than men in general, in terms of E scores, as well as PA and Wips (both cognitive dimensions). However, there were no differences in the CC dimension, associated with the emotional aspect. These findings are partially consistent with previous research.
On one hand, the differences in the general score are in line with the tendency to show women as more empathetic than men (28). However, studies report three possible forms of distribution: greater empathy in women than in men, greater empathy in men than in women, and no differences (either statistically or in absolute values) (29)(30). The explanation for such variability has not yet been found. However, there seems to be agreement that the expression of empathy and the neuronal response differ between men and women (31). On the other hand, the results of the subscale analysis are not consistent with the claims that women are more emotional than men, or that men are more rational than women. Indeed, women seem to have greater emotional responses, reflect the pain responses of others, better recognize emotions and show more prosocial and altruistic behavior (32). Men, in contrast, seem to have more developed cognitive empathy and a greater number of areas related to cognitive control and cognition (32). However, empathy cannot be reduced to the neurobiological structures that support it, and its development could well be influenced by social, contextual and cultural conditions (33). This would influence the difference in the behavior of effective empathy between men and women, without ruling out the possibility that these differences between genders in empathy may be the consequence of different selective evolutionary pressures (32).
Given the relevance of empathy in the relationship between diseases and patients, various interventions have been proposed to develop it in nursing students (34). These interventions must include the interaction between the three dimensions of empathy (PA, CC, Wips). This is a significant challenge, as it implies profound changes in the structure of teaching curricula and their dynamics, right up to the level of empathic teacher training. Many researchers have analyzed studies of interventions for developing empathy in nursing students, concluding that, in general, the effect of these interventions on levels of empathy was small to moderate (25,26,34).
Most of the studies analyzed looked at limited interventions that did not necessarily represent a curricular change; this may partially explain the results obtained. The incorporation of empathy in nursing curricula has been gradual (3). An analysis of the curricula of different nursing pre-registration programs in the UK, conducted in 2000, showed inconsistencies in approach and emphasis in teaching skills related to empathy. Only in 2007 were standards for the delivery of empathy-based care made part of the curriculum in that country (3). Some authors have suggested strategies for incorporating this aspect into nursing curricula; among them are the promotion of reflective teaching practice both in and outside clinical settings, as well as forms of evaluation that would establish connections between the actions of students and their learning of a therapeutic relationship (3).
These interventions must be combined with serious, deep empathic diagnoses; valid, reliable, and culturally appropriate tools will contribute to this goal. The empathic diagnoses should relate strongly to pedagogical actions, curricular modification, and the introduction of active teaching-learning processes, among many other aspects. All of these strategies need to be applied simultaneously and must function alongside and in accordance with empathic diagnoses (4,35-37), as various authors have long held (1,4,25).
A limitation of the present study is the bias generated by using secondary data that do not constitute a random sample of the population of Chilean nursing students, which limits control over sampling error. Regarding the nursing faculties included, the sample represents 68.71% of this population of students, with a sample size that favors the stability of the estimates.

CONCLUSION
Empathy data in Chilean nursing students correspond to a three-dimensional factor structure that matches the original instrument. The factor structure is invariant between genders, so data for empathy and its dimensions are comparable between them. There is variability in the distribution of empathy values and their dimensions among universities. Women were more empathetic than men in general, but in the compassionate care dimension there were no differences between them.