The Postgraduate Hospital Educational Environment Measure (PHEEM) Questionnaire Identifies Quality of Instruction as a Key Factor Predicting Academic Achievement

OBJECTIVE This study analyzes the reliability of the PHEEM questionnaire translated into Portuguese. We present the results of PHEEM following distribution to doctors in three different medical residency programs at a university hospital in Brazil. INTRODUCTION Efforts to understand environmental factors that foster effective learning resulted in the development of a questionnaire to measure medical residents’ perceptions of the level of autonomy, teaching quality and social support in their programs. METHODS The questionnaire was translated using the modified Brislin back-translation technique. Cronbach’s alpha test was used to assess reliability and ANOVA was used to compare PHEEM results among residents from the Surgery, Anesthesiology and Internal Medicine departments. The Kappa coefficient was used as a measure of agreement, and factor analysis was employed to evaluate the construct strength of the three domains suggested by the original PHEEM questionnaire. RESULTS The PHEEM survey was completed by 306 medical residents and the resulting Cronbach’s alpha was 0.899. The weighted Kappa showed excellent reliability. Autonomy was rated most highly by Internal Medicine residents (63.7% ± 13.6%). Teaching was rated highest in Anesthesiology (66.7% ± 15.4%). Residents across the three areas had similar perceptions of social support (59.0% ± 13.3% for Surgery; 60.5% ± 13.6% for Internal Medicine; 61.4% ± 14.4% for Anesthesiology). Factor analysis suggested that nine factors explained 58.9% of the variance. CONCLUSIONS This study indicates that PHEEM is a reliable instrument for measuring the quality of medical residency programs at a Brazilian teaching hospital. The results suggest that quality of teaching was the best indicator of overall response to the questionnaire.


INTRODUCTION
Medical residents receive most of their training within hospital programs. In Brazil, after completing a medical undergraduate education, students enroll in residency programs that involve three (basic) to seven years of training. Students who complete their residency ultimately receive certification as a specialist. Resident physicians undertake supervised training in areas of the hospital that include the ambulatory care facilities, surgical centers, laboratories, radiology departments, and in-patient wards.
Previous efforts to study effective learning environments resulted in the development of a questionnaire for undergraduates. The 50-item Dundee Ready Education Environment Measure (DREEM) used a standard methodology grounded in education theory together with a Delphi panel of nearly 100 professional health educators from around the world. It was translated into a variety of languages and has been used worldwide. 1, 2 The DREEM was translated and adapted for Portuguese and has been used with Brazilian undergraduate medical students as well as residents. It evaluates first-year medical students' experiences during outpatient consultation observation. This is typically done in Brazil to introduce students to the concept of patient interaction at an early stage in their training. 3 The version of DREEM used in this study also featured a psychometric performance evaluation component. This questionnaire was given to medical residents from different programs at six institutions in three Brazilian cities. 4 A methodology similar to that of DREEM was used to develop the Postgraduate Hospital Educational Environment Measure (PHEEM) questionnaire. 5 This 40-question survey assesses the level of autonomy, quality of teaching and social support during the hospital-based training period undertaken by all new physicians. PHEEM can identify specific strengths and weaknesses within a given educational environment. Because medical residency programs in Brazil have rarely been comprehensively evaluated to date, we aimed to assess PHEEM as a possible tool for performing future evaluations. 5 The objective of this investigation was to validate the PHEEM questionnaire translated into Portuguese and to study its reliability. We also aimed to compare PHEEM results among residents from the Internal Medicine, Surgery and Anesthesiology departments at a university teaching hospital in Brazil.

METHODS AND MATERIALS
The method used to translate the PHEEM questionnaire was based on the modified Brislin back-translation technique. 6 Briefly, two Brazilian teachers of English independently translated the original PHEEM into Portuguese. The resulting versions were then back-translated into English (En1). A second translation was independently undertaken to generate a second Portuguese draft copy from the En1 version, and the resulting document was again back-translated by a native English speaker (En2). A final Portuguese version was generated by comparing the two back-translated English versions (En1 and En2) with the original and the two Portuguese drafts.

Reliability
Our study was approved by the appropriate institutional review board and all participants were required to sign an informed consent document. Each questionnaire featured an anonymity barcode which was not linked to the identity of the participant. The 40-question Portuguese PHEEM survey was given to a randomized sample of medical residents from the Hospital das Clínicas, University of São Paulo Medical School. Our test group included physicians finishing their first year (n=174) or at the end of their second year of residency (n=89). In addition, residents from the Hospital Governador Celso Ramos (n=78) participated. This group included a wide range of medical specialties. We also gave the survey to a number of medical residents who had entered a local competition in two main areas: Internal Medicine (n=459) and Surgery (n=298). Respondents were asked to indicate their agreement with each statement using a five-level Likert-type scale, which ranged from 'strongly disagree' (0) to 'strongly agree' (4). Higher levels of agreement were taken to indicate a more beneficial educational environment.
In order to determine the needed sample size, a pilot study was conducted with the DREEM questionnaire to identify the impacts of differences in means. It suggested that a sample of 27 questionnaires across three groups would yield a statistical power of 80% at a significance level of 5%. 4,7 Cronbach's alpha coefficients were used to assess reliability and internal consistency. Hotelling's T-squared test was used as a multivariate analysis tool to evaluate the null hypothesis that all of the items on the scale would have the same mean. ANOVA was used to compare PHEEM-derived data from residents in the Surgery, Anesthesiology and Internal Medicine departments.
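As an illustration of the internal-consistency statistic named above, the following is a minimal sketch of Cronbach's alpha using only the Python standard library. It is not the software used in this study; the function name `cronbach_alpha` and the toy data are hypothetical, and responses are assumed to be stored as one list of item scores (0 to 4) per respondent, with negatively worded items already reverse-scored.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(rows[0])  # number of items on the scale
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): three respondents, two perfectly consistent items.
toy = [[0, 0], [1, 1], [2, 2]]
alpha = cronbach_alpha(toy)
```

With real data, `rows` would hold one 40-element list per completed questionnaire. The function is undefined when every respondent has the same total score, because the variance of the totals is then zero.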
A subsample of medical residents (n=50) from the Clinics Hospital at the University of São Paulo Medical School answered the PHEEM twice in a period of 30 days. The Kappa coefficient was used as a measure of agreement between these two samples. In general, excellent agreement was indicated by κ > 0.74; good agreement, κ = 0.60 to 0.73; fair agreement, κ = 0.40 to 0.59 and poor agreement, κ < 0.40. 8 The Spearman correlation test was used to assess equivalence between the two samples. These medical residents completed an additional survey to assess their experience in completing the PHEEM questionnaire. This additional survey is known as the WHOQOL-bref. 9 We evaluated correlations across data from the two questionnaires.
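The test-retest agreement measure and its interpretation bands can be sketched as follows. This is an illustrative example, not the procedure used in this study: the weighting scheme is not stated in the text, so linear disagreement weights are assumed here, and the function names `weighted_kappa` and `agreement_label` are hypothetical. Values falling between the published bands (for example 0.735) are assigned to the lower band, which is also an assumption.

```python
def weighted_kappa(first, second, n_levels=5):
    """Weighted kappa for two sets of ratings coded 0 .. n_levels - 1.

    Assumes linear disagreement weights |i - j| / (n_levels - 1).
    """
    n = len(first)
    # Observed joint proportions of (first rating, second rating).
    obs = [[0.0] * n_levels for _ in range(n_levels)]
    for a, b in zip(first, second):
        obs[a][b] += 1.0 / n
    row = [sum(obs[i]) for i in range(n_levels)]
    col = [sum(obs[i][j] for i in range(n_levels)) for j in range(n_levels)]

    def weight(i, j):
        return abs(i - j) / (n_levels - 1)

    cells = [(i, j) for i in range(n_levels) for j in range(n_levels)]
    d_obs = sum(weight(i, j) * obs[i][j] for i, j in cells)
    d_exp = sum(weight(i, j) * row[i] * col[j] for i, j in cells)
    # Undefined (division by zero) if all ratings fall in a single category.
    return 1.0 - d_obs / d_exp


def agreement_label(kappa):
    """Map kappa to the bands quoted in the text (boundary handling assumed)."""
    if kappa > 0.74:
        return "excellent"
    if kappa >= 0.60:
        return "good"
    if kappa >= 0.40:
        return "fair"
    return "poor"
```

For a single item, `first` and `second` would hold each resident's answer at the first and second administration, in the same respondent order.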

Factor analysis
Principal component factor analysis was applied across the three sections in the original PHEEM questionnaire. This analysis aimed to explain a considerable amount of the variance that was present in the 40-item survey. Factor loadings were obtained following varimax rotation.

RESULTS
The PHEEM Portuguese questionnaire was answered by 30.7% of the Internal Medicine residents (n=141) and by 20.5% of the Surgery residents (n=61) who had entered a local competition. In addition, the survey was completed by 19.5% (n=34) of the first-year residents at Clinics Hospital and by 17.9% (n=16) of the second-year residents. Residents from Governor Celso Ramos Hospital achieved a higher response rate of 69.2% (n=54).
The PHEEM survey in Portuguese (n=306) showed a Cronbach's alpha of 0.899. The alpha value did not vary by more than 5% across groups and genders. As expected, the null hypothesis was rejected according to Hotelling's T-squared test. This result indicates that the item means were not all equal.
The Cronbach's item-total correlation identified item one, "contract of employment;" item seven, "racism;" item thirteen, "sexual discrimination;" and item 25, "no-blame culture," as essentially uncorrelated with the total score.
The weighted Kappa coefficient was used as a measure of the agreement in responses over an interval of 30 days. The return rate for the retest was 56% (n=28). The majority of questions were in excellent agreement, while items one, "contract of employment;" two, "clear expectations;" eight, "inappropriate tasks;" ten, "good communication skills;" and 26, "catering facilities," had coefficients that suggested good agreement. None of the questions scored in the "fair agreement" range or lower. The Spearman correlation coefficient r was greater than 0.8 in all cases except for question sixteen, "collaboration with other doctors" (r=0.659).
The lowest recorded score was 1.2 (for item 26: "There are adequate catering facilities when I am on call") while the highest was 3.6 (the opposite of item 7: "I can sense the existence of racism in my position"). A rating higher than 2.0 indicates a more supportive/suitable educational environment. Question numbers 1, 14, 17 and 32 within the "Autonomy" section were found to have relatively low ratings. Teaching quality questions 3 and 39 also showed low ratings. Social support showed low ratings for question numbers 20, 25, 26, and 38 (Table 1). The questionnaire presented four items that featured negative statements (items 7, 8, 11 and 13); the scores for these items were inverted in order to calculate total score and percentile results from the questionnaire. The autonomy section total score was 33.9 ± 8.6 (60.5% ± 15.3%), the teaching quality score was 35.0 ± 10.0 (58.3% ± 16.6%) and the social support section score was 26.6 ± 6.0 (60.4% ± 13.6%).
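The scoring arithmetic described above (inverting the four negatively worded items and expressing a section total as a percentage of its maximum) can be sketched as follows. This is an illustrative example only: the function names are hypothetical, the item-to-section assignment is passed in by the caller rather than assumed, and the section maxima of 56 and 60 are those quoted in the Discussion (33.9/56, 35.0/60).

```python
NEGATIVE_ITEMS = {7, 8, 11, 13}  # negatively worded items named in the text
MAX_ITEM_SCORE = 4               # Likert scale runs from 0 to 4


def adjusted_score(item, raw):
    """Invert the raw score of a negatively worded item; leave others as-is."""
    return MAX_ITEM_SCORE - raw if item in NEGATIVE_ITEMS else raw


def section_score(responses, items):
    """Return (total, percent of maximum) for one section.

    responses: dict mapping item number -> raw score (0 to 4)
    items:     item numbers belonging to the section (supplied by the caller)
    """
    total = sum(adjusted_score(i, responses[i]) for i in items)
    percent = 100.0 * total / (MAX_ITEM_SCORE * len(items))
    return total, percent


def percent_of_max(total, max_score):
    """Express a section total as a percentage of its maximum possible score."""
    return 100.0 * total / max_score


# Reproducing two of the reported percentages from the reported totals.
autonomy_pct = round(percent_of_max(33.9, 56), 1)  # reported as 60.5%
teaching_pct = round(percent_of_max(35.0, 60), 1)  # reported as 58.3%
```

A raw answer of 0 on item 7 therefore contributes 4 points, which is why the mean for that item is reported as "the opposite of item 7".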
A subset of questionnaires was selected for further analysis. Results from the three sections of the PHEEM survey were compared among residents from Internal Medicine, Surgery and Anesthesiology programs by ANOVA followed by the Holm-Sidak test. The level of autonomy was generally perceived to be higher by residents in Internal Medicine than by those in Surgery (p=0.000001). However, there was no difference in perception between Internal Medicine and Anesthesiology residents (p=0.0573). Perceived levels of teaching quality were higher for residents in Anesthesiology as compared with Internal Medicine (p=0.017) or Surgery (p=0.00048). Internal Medicine residents also rated quality of teaching significantly higher than Surgery residents (p=0.0308). There were no differences in the perception of social support among any of the groups (Table 2).
Exploratory factor analysis followed by varimax rotation identified nine factors with eigenvalues higher than 1 (Table 3). The first factor had an eigenvalue of 12.0, which accounted for 30.0% of the variance. The next eight factors had eigenvalues lower than 2.3; together, the nine factors accounted for 58.9% of the variance in the data. These findings suggest that the questionnaire is essentially a one-dimensional scale. The first factor included questions 10, 28, 23, 6, 2, and 22 from the teaching quality section of the PHEEM questionnaire, and items 26 and 35 from the social support section.
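The link between an eigenvalue and its share of the variance can be checked with a short calculation. The sketch below assumes that the principal component analysis was run on the correlation matrix of the 40 items, so that the eigenvalues sum to 40; this is a common default but is not stated in the text. The function names and the example eigenvalue list are hypothetical.

```python
N_ITEMS = 40  # number of PHEEM items


def variance_explained(eigenvalue, n_items=N_ITEMS):
    """Percent of total variance carried by one component.

    Assumes a correlation-matrix analysis, where eigenvalues sum to n_items.
    """
    return 100.0 * eigenvalue / n_items


def retained_by_eigenvalue_rule(eigenvalues, threshold=1.0):
    """Keep components whose eigenvalue exceeds the threshold (here, 1)."""
    return [e for e in eigenvalues if e > threshold]


first_factor_pct = variance_explained(12.0)  # 12.0 / 40 -> 30.0%

# Hypothetical eigenvalue list, for illustration only (not study data).
example = [12.0, 2.2, 1.4, 0.9, 0.5]
kept = retained_by_eigenvalue_rule(example)
```

Under this assumption, the reported eigenvalue of 12.0 corresponds to the reported 30.0% of the variance.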
There was a 48% return rate (n=24) for the WHOQOL-bref survey. The WHOQOL's domains were considered adequate, reaching 60% for the environmental and greater than 70% for the psychological, social and physical domains. They did not correlate with any other results from the PHEEM questionnaire.

DISCUSSION
This study evaluated the applicability of the PHEEM survey translated into Portuguese with a sample of resident physicians in Brazil. Reliability of the Portuguese-language translation was high. Our PHEEM results suggest that the teaching quality section was the most important part of the questionnaire.
The questionnaire showed a high internal consistency. Cronbach's alpha was above 0.89 for all 40 questions. Four items (1, 7, 13 and 25) did not correlate well with the total score and, from a statistical point of view, they were found to not influence the overall results. Despite the observed lack of influence, we suggest that these statements be retained to allow for comparisons between programs or institutions. Indeed, items 7 and 13 deal with racial and gender discrimination, respectively, and this sample of residents indicated that these issues were not problematic during their training programs. On the other hand, items 1 ("information about hours of work") and 25 ("no-blame culture") reached a mean of just below 2.0, which suggests that these issues constituted an educational obstacle.
[Table 1, final row and footnote] Item 38: There are good counseling opportunities for residents who fail, so that they can satisfactorily complete their training. 1.7 ± 0.9. * Items 7, 8, 11 and 13 must be switched to properly match the perception/sector to which they belong.
The Kappa coefficient suggests excellent inter-sample agreement. The good agreement observed for five items may be attributed to the fact that they queried different areas of the residency workload. More specifically, during the 30-day test interval, many residents switched between programs and this may have impacted their answers to these questions. 10 PHEEM can be used to identify strengths and weaknesses of a medical residency program. When the published guide suggested for this questionnaire is applied to interpret mean scores, all residents had "a more positive perception" (33.9/56) with regard to the level of autonomy; teaching quality seemed to be "moving in the right direction" (35.0/60) and there were "more pros than cons" (26.6/44) in terms of the provision of social support. 5 Such results should be taken into account by curriculum planners as they consider improvements to educational programs. Nevertheless, we note very positive responses in terms of the statements that confirmed minimal discrimination (items 7 and 13). Relevant items showed that residents were given appropriate levels of responsibility (item 5) and opportunities (item 30). Each of the latter items exhibited a mean score of greater than 3.
Items with a mean of between 2 and 3 identify elements of the educational program that could be enhanced. 5 It is interesting, albeit disappointing, to note that 28 out of 40 questions for this sample fit in this classification. Specifically, the net perception of teaching quality was within this range or, in some cases, below 2 (question 3 -"educational time is safeguarded" and question 39 -"clinical teachers provide feedback"). Significant attention should be directed to those questions with means below 2 as they may indicate serious problem areas. In addition to the two items that pertain to perception of teaching quality, those in the most urgent need of improvement were related to social support. Certain specific responses pointed to a lack of personal and sometimes professional support (question 25 -"no-blame culture", 26 -"catering facilities" and 38 -"available counseling for junior doctors who fail"). Similar shortcomings were seen with regard to the level of autonomy (question 1 -"I receive adequate information about my work hours"; 14 -"I am always clear on which clinical protocols are acceptable" and 17 -"my working hours are in accordance with the limits specified by the National Board").
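The item-level reading used in this discussion (means above 3 as clear strengths, means between 2 and 3 as areas that could be enhanced, means below 2 as potential problem areas) can be written as a small helper. This is a sketch for illustration: the function name and labels are hypothetical, and the treatment of means exactly equal to 2 or 3 is an assumption, since the text does not specify it.

```python
def classify_item_mean(mean):
    """Classify a PHEEM item mean (0 to 4 scale) into the bands used above.

    Boundary handling (means of exactly 2 or 3) is an assumption.
    """
    if mean > 3:
        return "strength"
    if mean >= 2:
        return "could be enhanced"
    return "problem area"


# Item means quoted in the text: item 7 (reverse-scored) and item 26.
labels = {
    "item 7": classify_item_mean(3.6),
    "item 26": classify_item_mean(1.2),
}
```

Applied to all 40 item means, such a helper would reproduce the count of items falling in the middle band that is discussed above.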
It is also important to highlight weaker items related to social support. Residency training is often correlated with stress, depression and burnout due mainly to excessive work hours, sleep deprivation, challenging patients and an aggressive and challenging work environment. 11- 15 The PHEEM did not ask questions that directly addressed these aspects, but respondents suggested that their social life during residency could be uninspiring, and that their level of social interaction was unsatisfactory. Other researchers have evaluated how medical residents experience personal growth and have highlighted the need for an environment that fosters supportive relationships and encourages reflection. 16 On the basis of our study, it would be interesting to further investigate the positive associations of teaching achievement and social interactions. This is a possible area for future studies.
The PHEEM teaching quality section identified two problem areas: safeguarded educational time and feedback from instructors. Certainly, these two elements are important to obtaining a meaningful learning experience during medical residency. Indeed, in order to acquire learning skills, an apprentice needs safeguarded time away from the institutional schedule. In the same way, feedback from instructors is critical to the learning process. 17,18 The item related to work hours being consistent with limits specified by the National Board requires discussion, at least in the case of Brazil's system of medical residency. It has been observed that violations of residency program requirements correlate with very low perceptions of quality of life and a poor educational environment. 19 However, this study did not identify a low quality of life, considering the results of WHOQOL-bref, but did identify a weak educational environment. It has been suggested that reliance on such data is not appropriate when studying associations of working hours with quality of life and patient care. 20 Educational as well as management theories support the understanding that the development and maintenance of a learning-oriented culture should be a high priority for residency programs and their institutions. 21 It has been suggested that a lack of supervision can contribute to medical errors, but despite this fact, best practices in medical communication and supervision skills have received little evaluation. 22 It would also be interesting to consider how network analysis may facilitate the mapping of ties between residents' peers and supervisors and the nature and rules of their relationship in order to best understand how to create and maintain an appropriate academic environment. 23 Finally, the factor analysis of this sample supports Boor et al., who suggest the use of PHEEM as a one-dimensional scale instead of the three original sub-scales. 5, 24
The main factor, which explains 30% of the variance, includes seven items that are related to the perception of teaching quality and one that relates to social support and instructor mentoring skills. Even more interesting, these eight items are strongly linked to the role of the instructor. Our factor analysis is consistent with other published data in which four of seven items composed a one-dimensional scale that also addressed the importance of the instructor during residency. 25 Considering various other studies that have also used this questionnaire, we would reemphasize that PHEEM should be used in its original format with all 40 basic questions in order to allow for comparisons between programs and to permit evaluations of the three sections during different phases of a physician's medical residency.
The results from this study support the use of PHEEM as a reliable instrument to identify issues related to the clinical educational environment. Our data suggest that the results from the section on teaching quality are very important.