Are Behavior Rating Scales Able to Identify Behavioral Changes in Preschool Children Undergoing a Dental Intervention? A Systematic Review

Objective: To evaluate, through a systematic review, the scientific evidence on the efficiency of behavior rating scales in identifying behavioral changes in preschool children undergoing dental treatment. Material and Methods: The MEDLINE/PubMed, Scopus, Cochrane Library, Web of Science and BVS databases and the grey literature were searched. The reference lists of the included studies were also searched by hand. Studies that evaluated the behavior of healthy preschoolers before and after invasive dental treatments to observe behavioral changes were included. Two independent reviewers selected studies, extracted data and analyzed the risk of bias with a tool for before-and-after studies. The certainty of the evidence was evaluated with the GRADE approach. Results: Three studies were included. The Frankl Scale and the North Carolina Behavior Rating Scale were used in these studies. Both scales were able to identify behavioral changes in preschool children undergoing a dental intervention, although two of the included studies were rated fair, with a high risk of bias, and one was rated good, with a low risk of bias. Conclusion: Although the Frankl and North Carolina behavior scales were able to identify changes in children's behavior during dental treatment, these findings are not supported by strong evidence. Thus, further well-designed studies are needed to confirm this evidence.


Introduction
Dental treatment has great potential to generate fear, anxiety and uncooperative behavior, mainly in pediatric dentistry patients [1]. Therefore, pediatric dentists need knowledge and competence to prevent and treat dental anxiety and behavior management problems in children and adolescents [2]. A proper assessment of the child's behavior and cooperative potential helps dentists with treatment planning and makes dental appointments more effective and efficient. In addition, an effective treatment strategy should result in more cooperative behavior from the child. Thus, the evaluation of children's behavior is necessary in the dental setting [3,4] and the development of evaluation measures is essential [5].
There are many different assessment tools available for behavior evaluation, such as hormonal and physiological measures, self-report questionnaires, projective techniques and behavior rating scales [6][7][8][9].
Rating scales play a special role in pediatric dentistry, as they may provide an aid to classify behavior and cooperation of patients. They can be used in children at an early stage, making it possible to prevent further development of behavioral problems [2]. Moreover, the most consistently employed measure of behavior in dental setting involves analysis of the child's behavior through the use of rating scales, especially in young children. Several rating scales have been used quite effectively [10]. Among them, the Frankl Scale [11] is probably the most frequently used. This tool has high reliability, is considered the gold standard in the literature and is widely used both in the clinical and in the research areas [2,8,9].
Behavior rating scales should have clear definitions of behavior and must be uncomplicated, for easy implementation. It is essential that these tools are well standardized, with documented reliability, validity, and psychometric properties [2,12,13]. No single assessment method or tool has all of these attributes [13].
Despite the wide use of behavior rating scales and their importance in pediatric dentistry, the scientific evidence regarding the efficiency of these instruments in identifying children's behavioral changes is still unclear. Thus, there is a need for a study that not only summarizes the scientific evidence on this issue but also appraises the methodological quality of the existing literature, in order to guide future studies. Therefore, the aim of this study was to conduct a systematic review to assess whether behavior rating scales are, in fact, able to identify behavioral changes in preschool children undergoing dental treatment, considering their importance in helping dentists with treatment planning.

Protocol and Registration
The study protocol was registered on the PROSPERO database (http://www.crd.york.ac.uk/PROSPERO) under the number CRD42018084208.

Eligibility Criteria
Original studies involving healthy preschool children (from 2 to 6 years) who had their behavior assessed during dental treatment were considered eligible for the present systematic review. In addition, the study had to evaluate the behavior of the same child at two different moments: at baseline, without an invasive procedure (entering the dental office, sitting in the dental chair, oral exam, prophylaxis, fluoride therapy), and at another moment, under an invasive procedure (local anesthesia, rubber dam isolation, restoration). The inclusion criteria were based on the PICO [14] strategy and comprised studies in humans that met the following criteria: population (P): healthy preschoolers without special needs; intervention (I): invasive dental treatment, after which behavior was evaluated; control (C): behavioral assessment at the beginning of dental care, before performing invasive procedures (baseline

Study Selection
Two trained researchers (J.A.S. and L.G.P.) independently performed the review process. A calibration exercise consisting of two steps was performed: initially, researchers discussed the eligibility criteria and then independently evaluated the titles and the abstracts of all studies retrieved in both the electronic and hand searches. Titles and abstracts that did not meet the inclusion criteria were excluded from this review. In cases of disagreement, eligibility criteria were further discussed with a third reviewer (A.F.G.). Those complying with the eligibility criteria had their full-text retrieved and once again, the same examiners (J.A.S. and L.G.P.) evaluated the compliance of these investigations with inclusion criteria. Studies with only the title and without abstract, as well as those in which researchers were in doubt, were included in the full-text analysis. In cases of disagreement, eligibility criteria were further discussed with a third reviewer (A.F.G.).

Data Collection Process
The authors extracted data on the following items to characterize the included studies: (1) author(s), year of publication and place where the study was undertaken; (2) study design; (3) sample; (4) participants' age range and/or mean age; (5) baseline and/or first visit and/or before treatment; (6) dental intervention; (7) behavior rating scale used; (8) children's behavior at baseline; (9) children's behavior after invasive dental intervention; (10) outcome.
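As an illustration only, the ten extraction items could be held in a simple record such as the sketch below. The class name `ExtractionRecord` and its field names are hypothetical and are not taken from the review; the example values for Kamel et al. [18] are those reported in the text, and the items the text does not report are left empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractionRecord:
    """Hypothetical container mirroring extraction items (1) to (10)."""
    authors_year_place: str                            # item 1
    study_design: str                                  # item 2
    sample_size: int                                   # item 3
    age_range: str                                     # item 4
    baseline_visit: Optional[str] = None               # item 5
    dental_intervention: Optional[str] = None          # item 6
    rating_scale: Optional[str] = None                 # item 7
    behavior_at_baseline: Optional[str] = None         # item 8
    behavior_after_intervention: Optional[str] = None  # item 9
    outcome: Optional[str] = None                      # item 10


# Values taken from the text of this review; unreported items stay None.
kamel = ExtractionRecord(
    authors_year_place="Kamel et al. [18], Egypt",
    study_design="randomized clinical trial",
    sample_size=30,
    age_range="4 to 6 years",
    dental_intervention="pulpotomy followed by stainless crown",
    rating_scale="Frankl scale",
    outcome="behavioral change identified (p=0.003)",
)
```

Keeping the unreported items as `None` rather than guessing them makes gaps in the source studies visible at a glance.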

Scopus
#1 (TITLE-ABS-KEY("Behavior Rating Scale" OR "Evaluation scales" OR "Scale Behavioral Rating" OR "Tests Behavior" OR "Behavior scale" OR "Frankl Scale" OR "Frankl behavior Rating Scale" OR "Frankl scores" OR "Frankl Behavioral Scale" OR fbrs OR "Global Rating" OR houpt OR "Wright Classification" OR "Behavior Profile Rating Scale" OR "Venham Behaviour Rating Scale" OR "Venham Scale" OR "Kurosu Behaviour Evaluation Scale" OR "Ohio State University Behavior Rating Scale" OR osubrs OR "North Carolina Behavior Rating scale" OR ncbrs OR "Visual Analog Scale" OR vas OR "Visual Analogue"))
#2 (TITLE-ABS-KEY(dentistry OR dentist OR tooth OR teeth OR "Dentist-Patient Relations" OR "Relation Dentist-Patient" OR "Relations Dentist-Patient" OR "Pediatric Dentistry" OR pedodontics OR "Dentistry Pediatric"))
#3 (TITLE-ABS-KEY(child OR "child preschool"))
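A minimal sketch of how search blocks of this form might be assembled programmatically is given below. The helper names `title_abs_key` and `combine_blocks` are hypothetical, and joining the blocks with AND is an assumption, since the row that combines the Scopus blocks is not shown here. `TITLE-ABS-KEY` is the Scopus field code used in the strategy above.

```python
def title_abs_key(terms):
    """Build one TITLE-ABS-KEY block, quoting multi-word phrases."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "TITLE-ABS-KEY(" + " OR ".join(quoted) + ")"


def combine_blocks(blocks):
    """Join blocks with AND (assumed; the combining row is not shown)."""
    return " AND ".join(f"({block})" for block in blocks)


# Abbreviated term lists, for illustration only.
scale_block = title_abs_key(["Frankl Scale", "ncbrs"])
child_block = title_abs_key(["child", "child preschool"])
query = combine_blocks([scale_block, child_block])
```

Building the blocks from term lists keeps quoting consistent and makes it easier to adapt the same terms to other databases.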

The records were classified according to the answers obtained in the quality assessment, using the following criteria: 'good' quality for records with 6 to 9 'yes' answers, indicating a low risk of bias; 'fair' quality for records with 3 to 5 'yes' answers, indicating a high risk of bias; and 'poor' quality for records with 1 to 2 'yes' answers, indicating either a lack of information or uncertainty over the potential for bias [17]. For studies in which randomization was not possible, 'not applicable' (NA) was recorded for criterion number five. In this case, the study was classified as 'good' when 5 to 8 'yes' answers were counted, 'fair' when 2 to 4 'yes' answers were obtained, and 'poor' with fewer than 2 'yes' answers.
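The classification rule above can be sketched as a small function. This is an illustrative reading of the thresholds, not code used by the authors; the function name `classify_quality` and its parameters are hypothetical, and treating zero 'yes' answers as 'poor' is an assumption, since the text does not cover that case when randomization applies.

```python
def classify_quality(yes_count: int, randomization_applicable: bool = True) -> str:
    """Map the number of 'yes' answers to 'good', 'fair' or 'poor'.

    When criterion five (randomization) is not applicable, one fewer
    item can be answered and every threshold drops by one.
    """
    max_yes = 9 if randomization_applicable else 8
    if not 0 <= yes_count <= max_yes:
        raise ValueError(f"yes_count must be between 0 and {max_yes}")

    good_min, fair_min = (6, 3) if randomization_applicable else (5, 2)
    if yes_count >= good_min:
        return "good"   # low risk of bias
    if yes_count >= fair_min:
        return "fair"   # high risk of bias
    return "poor"       # lack of information or uncertainty (0 assumed poor)
```

Under this reading, a study with six 'yes' answers is rated 'good', and a study with three 'yes' answers and criterion five marked NA is rated 'fair'.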
During data extraction and risk of bias assessment, any disagreements or doubts between the reviewers were resolved through discussion and, if needed, by consulting a third reviewer (L.G.P.).

Results
The search strategy yielded 1330 records. After exclusion of duplicates, 914 investigations were screened based on title and abstract and 48 of those had their full text retrieved for analysis. One of the 49 articles selected for full-text reading fulfilled the eligibility criteria and was included [18]. Two other studies were identified manually [11,13] and were also included. Most full texts were excluded because of the age range, the absence of a baseline assessment with non-invasive procedures, or the publication type (literature reviews, case reports and pilot studies).
Three articles [11,13,18] were included in the present systematic review. The PRISMA flow diagram of the study selection process is presented in Figure 1.

Characteristics and Quality Assessment of the Selected Studies
The qualitative synthesis and the risk of bias of the included studies are summarized in Table 2, and all the extracted data are displayed in Table 3. The studies were published between 1962 and 2017. One study was a randomized clinical trial conducted in Egypt [18] and two were cross-sectional studies conducted in the United States of America [11,13].
Among the three selected records, one got six 'yes' and achieved a 'good' score [18], and two [13,11] obtained a 'fair' score, with four and three 'yes' respectively. No article obtained a 'poor' score, according to the 'Before and after' assessment tool [15,16].
The study developed by Kamel et al. [18], classified as good and with a low risk of bias, aimed to evaluate the impact of positive images versus neutral images on child behavior during dental treatment. It included children from 4 to 6 years and the Frankl scale was used to evaluate behavioral changes in these patients undergoing dental treatment. The study control group (neutral images) was chosen as the evaluation group for this review since it was considered the most neutral one (n=30). In the first behavior evaluation . These results demonstrate that the scale used was able to identify behavioral changes in the studied population (p=0.003), since in the second behavior evaluation when an invasive procedure was conducted, just a few children were considered definitely positive. Although this article did not report clear eligibility selection criteria for the study population, a blind operator and a calibrated examiner were presented, which allowed it to be categorized as 'good' according to the quality assessment (Table 3).
In contrast, the study developed by Chambers et al. [13] was considered fair, with a high risk of bias. It aimed to develop a scale for assessing the amount of disruptive behavior in children undergoing dental treatment. It included children from 3 to 5 years, whose behavior was evaluated with the North Carolina Behavior Rating Scale. This study did not clearly report the eligibility selection criteria for the study population, the sample size calculation, the randomization or the blinding of the operator. Since all dentists were trained and calibrated, the children always attended by 'dentist 1' were chosen as the evaluation group for this review (n=40). After restorative treatment under local anesthesia, there was a decrease in 'High-Hands' and an increase in 'Leg movement' and 'Crying protest', showing that the scale is able to identify behavioral changes in preschoolers under dental treatment (Table 3).
The last included study [11], considered fair with a high risk of bias, aimed to investigate children's reactions in the dental office with or without separation from the mother. It included children from 3.5 to 5.5 years, and the Frankl scale was used to assess their behavior. This study did not report a clear objective, eligibility selection criteria for the study population, a sample size calculation, or a blinded and calibrated operator. Domain five of the 'Before and after' tool [16] was not applicable to this study design. There was an increase in negative behavior [-] during local anesthesia (17.83%) and also during cavity preparation (5.33%). On the other hand, there was an increase in definitely positive behavior [++] during the insertion of the restorative material (5.44%), as well as on the children's departure (5.44%) (Table 3).
Table 3. Summary of characteristics extracted from the selected studies.
Thus, two included studies [11,13] were considered "fair" in the quality assessment because they lacked clear eligibility selection criteria, a sample size calculation, randomization, and a blinded and calibrated operator. One included article [18] was classified as "good" in the quality assessment. This study, conducted by Kamel et al. [18], found a statistically significant difference (p=0.003) in children's behavior during dental treatment (pulpotomy followed by stainless crown), suggesting that the Frankl scale is efficient in identifying behavioral changes in preschool children undergoing dental treatment.
The conclusions regarding the efficacy of children's behavior rating scales in identifying behavioral changes in preschool children undergoing dental treatment are based on the results of three studies containing 126 preschool children. Considering the limitations regarding risk of bias, inconsistency, indirectness, imprecision and other considerations, the certainty of the evidence was graded as very low (Table 4).
a Two studies [11,13] were classified as having a high risk of bias, while one [18] was classified as having a low risk of bias. Thus, the proportion of information from studies at high risk of bias is enough to affect the interpretation of the results. In summary, there is a crucial risk of bias for one or more criteria that is likely to seriously alter the results.
b The evidence summarized in the review comes from studies that partially address the question of interest to the review, and therefore the conclusions may not directly answer the review question with respect to PICO. The evidence found is thus more restrictive than the review question and some indirectness exists.
c The number of children analyzed (N=126) is not enough to obtain a precise estimate of the effect.

Discussion
Child behavior is a complex and multifactorial phenomenon, through which children express their feelings and reactions to dental care [19]. It is important that dentists are able to assess behavior problems in pediatric dentistry patients as early as possible to identify patients who need special care [20]. Cadermatori et al. [9] believe that evaluating children's reactions is very important for identifying non-collaborative behavior, allowing dentists to adopt appropriate techniques for behavior management in dental offices [2]. Considering that one of the primary goals in pediatric dentistry is managing the child's behavior in the chair, which improves the speed and quality of dental care [21], assessing children based on their behavior is one of the most important skills for the dentist [22]. Therefore, it is essential to gather evidence on whether children's behavior rating scales are able to identify behavioral changes in preschoolers undergoing dental treatment.
Since no systematic reviews addressing the focused question of the present study were found in the literature, this systematic review was performed, as it could inform clinical decisions on which scale to use.
Our results demonstrated that two behavior rating scales, the Frankl and North Carolina scales, were able to identify changes in the behavior of children undergoing dental treatment.
To answer the focused question, this systematic review followed the PRISMA [23] recommendations. In addition, the "Before-and-after" quality assessment tool was used [15,16].
This tool is a useful method for comparing the effects of a procedure with the baseline values of the outcome assessed [17]. To better suit the methodology of the included articles, the authors of the present study modified item 9 of the "Before-and-after" quality assessment. In addition, the criterion 'not applicable' was used for item 5, which evaluates the quality of study randomization. This was done because the review included studies in humans in which the same sample was evaluated before and after dental treatment.
Many behavior rating scales for evaluating children's behavior in the dental setting have been reported in the literature [11][12][13]24,25]. However, only two of them, the Frankl scale and the North Carolina behavior rating scale, were evaluated in the present systematic review. This shows that more well-designed studies in humans should be carried out to allow a better comparison across a wider range of scales. The Frankl scale [11] is one of the most widely used behavior evaluation scales in pediatric dental research and in daily clinical practice [20]. It is considered one of the most reliable tools for behavior rating and is the gold standard according to the literature, so several studies have used this four-category scale to validate instruments that assess children's reactions during dental care [8,9,26]. However, according to Shindova et al. [20] and Wright et al. [27], a shortcoming of the Frankl scale is that it does not provide definite items for observation and does not describe the specific type of negative behavior in enough detail. On the other hand, the study conducted by Kamel et al. [18] found a statistically significant difference (p=0.003) in children's behavior during dental treatment (pulpotomy followed by stainless crown), suggesting that the Frankl scale is efficient in identifying behavioral changes in preschool children undergoing dental treatment.
The North Carolina Behavior Rating Scale [13] is a four-category tool which, according to its authors, is reliable and requires little time for training and implementation. Many studies have used this scale to categorize children's behavior in the dental setting [28][29][30]. Although the study using the North Carolina behavior rating scale was rated "fair" in the quality assessment of the present systematic review, this tool was able to identify behavioral changes, such as 'leg movements' and 'crying protest', in preschool children undergoing dental treatment. Nevertheless, this scale was developed mainly to identify and define observable negative behavior [13].
Studies have shown that child behavior is influenced by age [21,31-33], and the preschool age range was chosen for this review. Preschool children generally cooperate less with dental procedures and report greater related fears, and the frequency of behavioral problems is considerably high in this group [21,32]. Children younger than preschoolers have limited communication abilities and lack cooperative behavior. On the other hand, children over six years of age are able to acquire abilities in communication, independence and self-control. Thus, these children are expected to present fewer behavior management problems during dental care [32].
Since two of the three included studies were considered "fair" in the quality assessment and the certainty of the evidence was rated as very low, this could be considered a limitation of this review, even though the quality assessment was performed to obtain the best evidence available on the subject [17]. However, it should be emphasized that these two studies were conducted in 1962 and 1981, respectively, and study design standards have since improved.
This systematic review demonstrates an urgent need for additional research on changes in children's behavior before and after dental treatment. Researchers should bear in mind the need for high-quality investigations that can address these issues with methodological rigor.

Conclusion
Among the behavior rating scales, the Frankl and North Carolina scales were able to identify changes in the behavior of children undergoing dental treatment. These findings are not supported by strong evidence, since two studies were classified as having a high risk of bias and the certainty of the evidence was rated as very low. Thus, well-designed studies should be conducted with appropriate sample size calculation, clear eligibility selection criteria, matching, and blinding of outcome analyses.

Financial Support
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001 and Federal University of Rio de Janeiro.