Evaluation strategies in active learning in higher education in health: integrative review

Objectives: to analyze scientific evidence on evaluation strategies for active learning methods in undergraduate health programs. Methods: integrative literature review in the Medical Literature Analysis and Retrieval System Online, Latin American and Caribbean Literature in Health Sciences, Nursing Database, Scopus, Web of Science and Education Resources Information Center databases. Results: different evaluation strategies are used: Presentation of seminars, Self-evaluation, Evaluation of student performance in Tutotest-Lite tutoring, Peer Evaluation, Active Learning and Critical Thinking Self-evaluation Scale, Objective Structured Clinical Examination, Portfolio, Progressive Disclosure Questions, Modified Essay Questions, Progress Test, Essay Test, Objective Test, Immediate Learning Checks, Clinical Case Resolution and Cumulative Test. Final Considerations: evaluation strategies in active learning are used in combination, aiming at the affective, cognitive and psychomotor development of the student. However, studies with a higher level of scientific evidence are needed. Descriptors: Learning; Problem Educational Education; Literature


INTRODUCTION
The term "evaluation" has several meanings and, in general terms, refers to the verification and validation of the teaching-learning process through the monitoring of the student's biopsychosocial dimensions. It should be understood as a cooperative action between teacher and student, in which both benefit and are guided during the construction of knowledge; it should therefore be dissociated from any punitive character (1)(2)(3).
In the teaching-learning process, evaluation allows questioning and the identification of advances and difficulties, and it supports decision making. Thus, to guarantee the success of this process, evaluation must be given a prominent role and be planned in accordance with the curriculum, so that the educational objectives are met (2)(3).
Didactically, Bloom et al. (4) divided the educational objectives into domains, each consisting of categories arranged in a hierarchical and interdependent manner, so that the student needs to master the current level in order to advance to the next. The domains and their main attributes are: 1) the Affective, represented by attitudes, behaviors, respect, feelings and values, whose five categories are Receptivity, Response, Appreciation, Organization and Characterization; 2) the Cognitive, which concerns the ability to recognize facts, patterns and concepts, as well as the willingness to develop constantly, and whose objectives form six categories: Knowledge, Understanding, Application, Analysis, Synthesis and Evaluation; and 3) the Psychomotor, which is related to physical abilities, although other scholars have broadened its characteristics to include reflection, perception, the ability to develop improved movements and non-verbal communication, and whose categories are Imitation, Manipulation, Articulation and Naturalization (4)(5).
Thus, given the wide range of skills and competencies covered by the educational objectives above, it is evident that no single evaluation method can cover everything that a comprehensive teaching and learning process requires (6).
From this perspective, traditional models of education do not address each of the educational objectives and their domains in depth, because they tend to emphasize cognitive aspects to the detriment of the others. For this reason, the adoption of active learning methods has been growing in undergraduate health programs, with the aim of developing skills such as autonomy, proactivity, teamwork and the ability to reflect on and problematize reality and to solve problems, among others, which comprise ethical, technical and political skills, in a student-centered teaching-learning movement (7)(8).
In Brazil, the National Curricular Guidelines (NCG) for undergraduate health courses propose the use of active learning methods in order to prepare future professionals to provide comprehensive care and solve problems by bringing them closer to reality (9)(10)(11)(12)(13). Such skills are in line with the principles and guidelines of the Brazilian Unified Health System (UHS). Going beyond theory and articulating it with practice, previous knowledge and experiences in health services and the community are also objectives of active learning methods (7)(8).
Considering that the NCGs of health courses suggest the use of active learning methods and that evaluation is an extremely important aspect in the learning process, it has become pertinent to carry out this integrative review of the literature.

OBJECTIVES
To analyze scientific evidence on evaluation strategies in active learning methods in undergraduate health programs.

METHODS
This is an integrative literature review (ILR) on evaluation strategies used in the development of active learning. The ILR is grounded in evidence-based practice and gives the researcher a consistent and understandable overview of the phenomenon analyzed, synthesizing knowledge from a broad and diverse sample of studies. In this review approach, it is valid to include both experimental and non-experimental studies, as well as theoretical and empirical literature (14)(15).
This process is carried out in six phases: 1) elaboration of the guiding question; 2) search or sampling in the literature; 3) data collection; 4) critical analysis of the included studies; 5) discussion of the results; 6) presentation of the integrative review (14)(15). The guiding question adopted for this study was elaborated with the PICo strategy, an acronym for Population (P), Phenomenon of Interest (I) and Context (Co). This strategy is used for qualitative reviews and helps to identify the key words and/or descriptors most coherent with the objective of the study, so that they lead to relevant primary studies in the databases (16).
In this study, the PICo strategy was established as follows: P - evaluation strategies; I - evaluation in active learning methods; and Co - undergraduate courses in the health area. The guiding question adopted was: What is the scientific evidence on evaluation strategies in active learning methods in undergraduate courses in the health area?
For the selection of articles, the descriptors "Educational Measurement" AND "Problem-Based Learning" were used, after consultation of the Health Science Descriptors (DeCS) and Medical Subject Headings (MeSH). Although the second descriptor seems to restrict the search to the Problem-Based Learning (PBL) method alone, it is synonymous with "active learning method" and refers to the use of modalities other than the traditional one. The articles were searched in the databases most likely to contain the worldwide literature on the desired information, namely Medical Literature Analysis and Retrieval System Online (MEDLINE), Scopus (a multidisciplinary database), Education Resources Information Center (ERIC), Web of Science (a set of databases also known as the Science Citation Indexes) and Latin American and Caribbean Literature in Health Sciences (LILACS).
The following inclusion criteria were adopted: original articles available in full that answered the study question; published in Portuguese, English or Spanish; and published between 2013 and 2018, a time frame chosen to include more contemporary articles. Articles of theoretical reflection, dissertations, theses, reviews and editorials were excluded. Initially, 1,117 articles were found and imported into the reference management software EndNote, in which duplicates were eliminated, leaving 1,098 articles. Next, taking the inclusion criteria into consideration, the articles were selected sequentially by title, abstract and full text, and the final sample consisted of 14 articles, as shown in Figure 1. Data collection and critical analysis of the articles were performed separately by the authors; consensus was then reached, leading to the definition of the articles to be analyzed. For the analysis, a script was constructed containing the following items: title of the article, journal, year of publication, authors, country of origin, objective, type of study, level of evidence, number of participants, teaching method, course, evaluation tool, main results and conclusions/observations, with the purpose of providing a comparative analysis.
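The selection flow reported above can be checked with simple arithmetic. The following is a minimal sketch, not the procedure actually used in the review (which relied on EndNote and manual screening): the record fields `kind`, `full_text_available`, `language` and `year`, and the function name `meets_inclusion_criteria`, are hypothetical and introduced only for illustration.

```python
# Counts reported in the Methods section.
RECORDS_FOUND = 1117
AFTER_DEDUPLICATION = 1098
FINAL_SAMPLE = 14

# Derived figures: duplicates removed, and records excluded
# during title, abstract and full-text screening.
duplicates_removed = RECORDS_FOUND - AFTER_DEDUPLICATION
excluded_in_screening = AFTER_DEDUPLICATION - FINAL_SAMPLE

# Hypothetical encoding of the objective inclusion criteria.
ALLOWED_LANGUAGES = {"pt", "en", "es"}  # Portuguese, English, Spanish
FIRST_YEAR, LAST_YEAR = 2013, 2018


def meets_inclusion_criteria(record: dict) -> bool:
    """Return True if a record passes the objective criteria.

    Whether an article answers the guiding question still needs
    human judgement and is deliberately not modeled here.
    """
    return (
        record["kind"] == "original article"
        and record["full_text_available"]
        and record["language"] in ALLOWED_LANGUAGES
        and FIRST_YEAR <= record["year"] <= LAST_YEAR
    )


if __name__ == "__main__":
    print("Duplicates removed:", duplicates_removed)
    print("Excluded during screening:", excluded_in_screening)
```

Under these assumptions, 19 duplicates were removed and 1,084 records were excluded during screening.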
The articles in the corpus of analysis were classified by level of evidence: Level I: evidence from meta-analyses of randomized controlled clinical studies; Level II: evidence obtained from experimental design studies; Level III: evidence obtained from quasi-experimental research; Level IV: evidence obtained from descriptive studies or studies with a qualitative methodological approach; Level V: evidence obtained from case reports or experience reports; Level VI: evidence based on expert opinions or on standards or legislation (14).
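The six-level classification above lends itself to a simple lookup table. The sketch below is illustrative only: the dictionary wording paraphrases the levels listed in the text, the helper `most_common_levels` is a hypothetical name, and the sample list in the usage section is made up to show the call, not taken from the reviewed articles.

```python
from collections import Counter

# Levels of evidence as described in the Methods section (paraphrased).
EVIDENCE_LEVELS = {
    "I": "meta-analyses of randomized controlled clinical studies",
    "II": "experimental design studies",
    "III": "quasi-experimental research",
    "IV": "descriptive studies or qualitative approach",
    "V": "case reports or experience reports",
    "VI": "expert opinions, standards or legislation",
}


def most_common_levels(levels):
    """Return the level(s) that occur most often, sorted alphabetically.

    Raises ValueError if a label is not one of the six defined levels.
    """
    for level in levels:
        if level not in EVIDENCE_LEVELS:
            raise ValueError(f"unknown level of evidence: {level!r}")
    counts = Counter(levels)
    top = max(counts.values())
    return sorted(level for level, n in counts.items() if n == top)


if __name__ == "__main__":
    # Hypothetical sample, for illustration only.
    sample = ["IV", "V", "IV", "III"]
    for level in most_common_levels(sample):
        print(level, "-", EVIDENCE_LEVELS[level])
```

A table of this kind makes it easy to summarize how a corpus is distributed across levels once each article has been classified by the reviewers.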
The results and discussion were presented in a descriptive manner, categorizing the data extracted from the selected studies into thematic areas, by identifying variables of interest and key concepts (in this case, the evaluation domains), in accordance with the recommended literature on ILR (14,18) .

RESULTS
The 14 articles selected for the discussion of this integrative review meet all the pre-established inclusion criteria, as shown in Chart 1.
In the characterization of the articles, four were from Brazil, four from the United States, and the others from different countries and continents. Of these, two are from 2013, two from 2014, two from 2015, four from 2016, two from 2017 and two from 2018. As for the level of evidence, most are at Levels IV and V.
The active learning methods used in the analyzed articles were: PBL in seven articles; case discussion, collaborative learning and "differentiated methodology" in one article each; four articles did not specify the active learning modality. Regarding the courses, nine were in Medicine, three in Dentistry, one in Nursing and one in Speech Therapy.
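The counts reported in the two paragraphs above can be cross-checked against the total of 14 articles. The following is a small sketch under the assumption that each article is counted once per characteristic; the variable and function names are hypothetical and used only for this check.

```python
TOTAL_ARTICLES = 14

# Tallies as reported in the Results section.
by_year = {2013: 2, 2014: 2, 2015: 2, 2016: 4, 2017: 2, 2018: 2}
by_method = {
    "Problem-Based Learning": 7,
    "not specified": 4,
    "case discussion": 1,
    "collaborative learning": 1,
    "differentiated methodology": 1,
}
by_course = {"Medicine": 9, "Dentistry": 3, "Nursing": 1, "Speech Therapy": 1}

# Only Brazil and the United States are counted explicitly in the text;
# the remainder come from other countries.
named_countries = {"Brazil": 4, "United States": 4}
other_countries = TOTAL_ARTICLES - sum(named_countries.values())


def is_consistent(tally: dict, total: int = TOTAL_ARTICLES) -> bool:
    """Check that a tally adds up to the number of included articles."""
    return sum(tally.values()) == total


if __name__ == "__main__":
    for name, tally in [("year", by_year), ("method", by_method), ("course", by_course)]:
        print(name, "consistent:", is_consistent(tally))
    print("Articles from other countries:", other_countries)
```

Each tally adds up to 14, and six articles come from countries other than Brazil and the United States.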
Different modalities and evaluation instruments were identified: Presentation of seminars, Self-evaluation, Evaluation of student performance in Tutotest-Lite tutoring, Peer Evaluation, Scale of Self-evaluation in Active Learning and Critical Thinking, Objective Structured Clinical Examination, Portfolio, Progressive Disclosure Questions, Modified Essay Questions, Progress Test, Essay Test, Objective Test, Immediate Learning Checks, Clinical Case Resolution and Cumulative Test.

Chart 1 - Presentation of the articles included in the integrative literature review according to title, year, country, type of study, level of evidence, teaching method, undergraduate course and evaluation strategy, Marília, São Paulo, Brazil, 2020

Title | Year/Country | Type of study/Level of evidence | Teaching method/Course | Evaluation strategies
How to develop a competency-based examination blueprint for longitudinal standardized patient clinical skills assessments (19) | 2013/United States | [...] | Active learning/Medicine | Objective Structured Clinical Examination, formative
New virtual case-based assessment method for decision making in undergraduate students: a scale development and validation (20) | 2013/Slovenia | Observational cross-sectional study/IV | Case-based discussion/Medicine | Real clinical case resolution on online platform
Multiple tutorial-based assessments: a generalizability study (21) | [...] | [...] | [...] | [...]
[...] (22) | 2014/Argentina | Action research/III | Collaborative learning/Dentistry | Portfolio
Enhancing students' learning in problem based learning: Validation of a self-assessment scale for active learning and critical thinking (23) | 2015/Indonesia | Mixed method for scale development/IV | Problem-Based Learning/Medicine | Scale of Self-evaluation in Active Learning and Critical Thinking
[...]
Validation of a performance assessment instrument in problem-based learning tutorials using two cohorts of medical students (27) | [...] | [...] | [...] | [...]
To be continued

DISCUSSION
The literature on evaluation strategies in active learning methods is still incipient, which may indicate resistance from schools to adopting evaluation methods that differ from traditional ones, even though the same literature points to the use of active learning methods on different continents. This ILR shows a predominance of articles with a low level of evidence, which suggests the need for research on this subject with a higher level of scientific evidence in order to support its improvement and, consequently, contribute more effectively to the training of health professionals.
In order to better present the findings of this review, it was decided to group them into three categories. Each one of them contains evaluation strategies that emphasize some of the necessary domains for the acquisition of professional competence, in active learning methodology: 1) Evaluation strategies with predominance of the affective dimension; 2) Evaluation strategies with predominance of the cognitive dimension and 3) Evaluation strategies with predominance of the psychomotor dimension.

1) Evaluation strategies with predominance of the affective dimension
In the articles analyzed, we found some PBL performance evaluation tools that highlight the affective dimension (21,27,29). For use at the end of the tutoring sessions, a formative tool suited to the stages of PBL was identified. When the problem is opened, the evaluation criteria include the ability to identify learning issues, use previous knowledge, generate hypotheses, synthesize ideas and communicate in a clear and organized way. When the problem is closed, the criteria are the relevance of the information brought, the ability to synthesize, the presentation of information in a clear and organized way and the student's critical attitude towards the information shared. In addition, at both moments the student is evaluated on interaction with the group, punctuality, role in group work, ability to give and receive criticism and interpersonal relationships with colleagues and tutor (29,33).
For use at the end of the unit as a summative evaluation, there is an instrument filled in by the tutors, using a Likert-type scale, covering the following skills: reasoning and expression, personal development, teamwork and clinical skills (21). Another instrument was also identified, filled in by tutors trained in PBL, which evaluates the so-called behavioral indicators (solution of the problem, use of information, group process and professionalism) on a 7-point scale ranging from unsatisfactory (1) to excellent (7) (27).
Also within PBL, the use of self-evaluation and peer evaluation was observed. Self-evaluation allows students to explore their own strengths and weaknesses in the process of learning and to reflect on their practice. In this way, they can perceive their progress throughout the course and set goals to improve their performance (23,34). The Self-assessment Scale on Active Learning and Critical Thinking (SSACT) (23) lists the following aspects for students to consider during self-evaluation: definition of personal learning objectives, application of multiple learning strategies during individual study, ability to synthesize the key points of the study during discussion, effectiveness of individual study in solving the problem discussed and in achieving the established goals, concern for other members of the group, ability to formulate questions and ability to link previous knowledge with newly acquired knowledge. This tool was evaluated by the significance of its items, and its use is indicated at the end of each tutorial meeting, as a strategy to help students better understand each phase of PBL and to inform them on how to improve self-directed learning.
Peer evaluation (30), in turn, aims to share responsibility for the group's learning among all participants. This democratic model of evaluation allows each student to evaluate colleagues and the tutor. It is considered more reliable than self-evaluation and tutor evaluation, because students can provide a more accurate picture of their colleagues, with whom they also interact in other learning scenarios, while the tutor sees them only during tutoring. Its main function is to diagnose the gap between the current behavior of the group members and the desirable behavior, in order to narrow it (34). However, some factors make this practice unreliable if applied in isolation: the evaluator (another student) often lacks rigor, and the judgement is subjective. Even so, a set of such evaluations helps to generate a more coherent reflection in the student, since it softens the factors responsible for the variance. In the approach identified (30), each student evaluates their peers (about ten students) using an online instrument composed of nine items on a Likert-type scale, and writes constructive feedback for at least four members of their tutoring group. Although the evaluation was formative, student participation was mandatory.
The presentation of seminars (25) is also an instrument of affective evaluation, since it mobilizes not only cognitive aspects but also affective ones and communication, and reveals the student's posture. It can be carried out individually or in groups and can contribute to the construction of knowledge, since it allows participants to research information, synthesize it and build debates (25,35,36).
Thus, for the evaluation of professional competence in the affective dimension, there are numerous instruments to monitor and identify the student's development, both formative and summative. This is of utmost importance, since both types of evaluation must be included because of their specific contributions to the student's training. It is also worth pointing out that the articles compiled presented instruments focused only on the PBL method.

2) Evaluation strategies with predominance of the cognitive dimension
The evaluation of cognitive aspects predominates in evaluation processes (24)(25)(28)(29)(32), both because of its tradition and because of the support it gives teachers in their decisions about students, since it provides physical evidence (a written record) of student performance.
Moreover, it provides a sense of justice, since the same instrument is applied to all students in a similar way. This instrument can be presented in a discursive or objective manner (35,37) .
It is worth mentioning that a theoretical evaluation in PBL can contextualize the problems addressed in the tutorials, in order to bring the student closer to real experience (29).
Objective tests are most often based on memorization, but can also reach more complex levels of cognition such as interpretation, application, analysis, synthesis and judgment. Their structure may require the student to fill in gaps, write short answers, choose the correct alternative among several presented (multiple-choice questions) or judge certain items (true or false, ordering) (25,35,37).
In this perspective, the Progress Test (32), also called progressive assessment, is an objective multiple-choice test, usually containing between 100 and 200 questions, which are formulated considering the content required by the course curriculum. This form of evaluation is applied in all years of the course, with the aim of identifying the student's progress over time; that is, besides allowing students to check their performance at a given point in time (diagnostic evaluation), it is also a longitudinal evaluation. It is considered a valuable academic management tool, as it helps in curricular adjustments and allows students to reflect on their performance and progression over the years (32,38). Similarly, the cumulative test (28) is employed in active learning methods and is intended to identify the progression of the student's knowledge, but it does not necessarily compare it across years, since it can be applied during the teaching of a given subject over a short period of time. In the study in question, the cumulative test was applied in three moments: the first two parts contained half of all the items; and the third part, the remainder.
The essay (discursive) test (24)(25) consists of questions to be answered in a descriptive and free way, but its correction must be based on previously established objectives. It allows the identification of abilities such as synthesis, judgment, creativity, exemplification, argumentation, memorization and correlation between areas of knowledge, among others, which should be developed and evaluated in higher education (25,35,37).
Still regarding the evaluation of cognitive aspects in PBL, Progressive Disclosure Questions (PDQs) and Modified Essay Questions (MEQs) were also found (24). These are designed with different levels of difficulty (according to Bloom's Taxonomy), require a reflective process from the student to construct the answer and mobilize different skills, including writing ability, as opposed to multiple-choice evaluations, in which the student only has to identify the correct answer. This modality is intended to help develop clinical and logical reasoning. However, students were found to still perform better on basic-level (memorization) questions than on those involving clinical and logical reasoning (24).
The ILA (25), which consists of applying essay or objective tests at the end of each class on the content covered that day, aims at the early identification of gaps in the subject addressed within the critical-reflective learning method, so that these gaps can be worked on in subsequent classes, enhancing the teaching-learning process (25).
Clinical case resolution (20), a form of evaluation based on case discussion, means that the student constructs an outcome for a case presented, in a predefined format, taking into consideration the available semiological data and their relation to previously learned knowledge; that is, it combines pattern recognition with hypothetical-deductive reasoning. The skills associated with the essay modality can also be evaluated through this resolution (20,39). This strategy was evaluated positively as a means of verifying the clinical decision-making capacity of medical students (20).
From a broader perspective on the forms of evaluation, there is the portfolio (22,25), an instrument focused on the student's learning process, which tracks critical-reflective capacity, creativity, mastery of written and formal language norms, as well as the construction of narrative and text sequence (40). In addition, it allows students to analyze their commitment, posture and participation in activities, and is considered a space for self-evaluation. It is therefore an important tool for promoting the student-centered approach and a space for teacher-student dialogue (22,25,41). This strategy was used in collaborative learning and in the critical-reflective teaching method.
In this sense, although many cognitive evaluation proposals are similar to the traditional ones, it is relevant to highlight the adjustments made, such as the use of real cases to contextualize the evaluation, the use of questions at different levels of complexity that require increasing cognitive skills, and strategies that require reflection and the student's own constructions, such as clinical case resolution and the portfolio. It is also worth mentioning the use of diagnostic evaluations, which measure the acquisition of knowledge at a given moment of the module or course and are of great importance for adjustments in the educational process.

3) Evaluation strategies with predominance of the psychomotor dimension
In the context of active learning methodologies, practical evaluation has represented a great advance, since it allows the verification of tasks, attitudes and procedures or the judgment of the product of an action (29,42). An example of this modality is the Objective Structured Clinical Examination (OSCE) (19,26,31), whose purpose is to evaluate the student's clinical skills, knowledge, professional posture and communication. The execution of tasks such as anamnesis and physical examination, the quality of the explanation given to the patient and the way the encounter is closed are verified by the observer through a checklist and, at the end of the activity, the student receives feedback on his or her performance (26,31). One feedback strategy is for the evaluator, at the end of the feedback session, to document in a handwritten portfolio the strengths and weaknesses of the student's performance, make suggestions for the next OSCE, and give a copy to the student (26). The OSCE has been applied among health students around the world, in both summative and formative modalities (42). As strengths of this modality, the analyzed articles cited the opportunity to evaluate skills and abilities that are seldom evaluated in daily practice, to link competencies to skills and to give feedback to students (19,26,31). As disadvantages, they recognized the short time at each station, student anxiety, high cost and the difficulty of gathering the required number of examiners and simulated patients (31).
Another strategy to evaluate student performance in practical skills activities was also identified. In a course that employs PBL, the daily, formative evaluation of practical skills activities considers technical and personal aspects, such as the association between theoretical and practical knowledge, attendance, discipline, ethics, execution of tasks, punctuality and responsibility. A summative theoretical-practical evaluation is also carried out, based on the simulation of real situations contextualized in practical experiences (29).
Thus, possibilities of evaluation for the acquisition of professional competence in the psychomotor dimension, both formative and summative, were exposed.
It is well known, and even expected, that evaluation tools address more than one dimension, but it is important that all of these are examined with due attention. Among the studies analyzed, the combination of evaluative instruments stands out, as it supports the student's involvement in the learning process and the student's growing development in all aspects: affective, cognitive and psychomotor (25,29).
The articles that address evaluation in active learning methods point, for the most part, to forms and instruments that go beyond the cognitive and technical aspects of professional training and also involve affective and attitudinal aspects as well as critical reasoning, with emphasis on the capacity for group work and the full understanding of the person being cared for. This movement, which comes from different countries, reiterates that effective evaluation practice must be based on multiple possibilities and instruments, so that all the dimensions necessary for contemporary professional training are covered. In addition, it emphasizes that evaluation must occur concomitantly with the teaching and learning process and use different instruments to provide adequate feedback to students and enable their improvement in each of the areas of action (19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30)(31)(32).

Limitations of the study
The study is limited by the fact that it included only articles from certain databases, although care was taken to search the main ones in the areas of health and education, and by the exclusion of articles that were not in English, Portuguese or Spanish.

Contribution to the health area
The study can contribute to reflections on the training of health professionals and support adjustments in undergraduate courses in this area, in order to train professionals in accordance with the provisions of the NCG.

FINAL CONSIDERATIONS
This research presents a compilation of 14 articles with different evaluation strategies for active learning methods in undergraduate health programs. Among the findings, it is evident that such strategies are directly related to the affective, cognitive and psychomotor dimensions and that combining them is essential to achieve the desired training. An evaluative proposal that integrates multiple strategies and evaluators, occurs at different moments of learning and uses different types of evaluation (diagnostic, summative and formative) is fundamental when the intention is to cover all the abilities and competences expected to be developed during the undergraduate course with the use of active learning methods. For this, traditional models of evaluation need to be replaced or improved, so that the monitoring of the cognitive domain is not restricted to the memorization of contents but advances to enhance the ability to construct reasoning, as is desired in active learning methods, and so that a balance is achieved between the development of this domain, the affective and the psychomotor.
Thus, this ILR broadens and deepens the understanding of such a complex subject and offers support for curricular advances and educational practice, since it gives researchers an overview of what already exists in this field of knowledge. Furthermore, the results of this review showed that studies with a higher level of scientific evidence are needed.