Validity of two occlusal indices for determining orthodontic treatment needs of patients treated in a public university in Belo Horizonte, Minas Gerais State, Brazil

The aim of the present study was to validate the dental aesthetic index (DAI) and index of complexity, outcome and need (ICON) based on the opinions of a panel of Brazilian orthodontists. A comparison of these two orthodontic treatment need indices was carried out based on the consensus of a panel of 20 experienced orthodontists. A set of 108 study casts representing the full spectrum of malocclusions was selected. A calibrated examiner scored the casts for both indices. The orthodontists individually rated the casts regarding the degree of orthodontic treatment need. The panel's mean rating of the need for treatment was used as the gold standard for evaluating the validity of the indices. The accuracy of the indices, as reflected in the area under receiver-operating characteristic curves, was high: DAI = 81.83% (95%CI: 71.21-92.44); ICON = 88.75% (95%CI: 78.57-98.92). Although the accuracy of the ICON was higher than that of the DAI, both indices are recommended for determining orthodontic treatment need in Brazil.

Malocclusion; ROC Curve; Validation Studies


Introduction
There is a high prevalence of malocclusion in the Brazilian population, with a national value of 58.14% among children aged twelve years, varying from 53.7% to 64.1% across regions 1. In the city of Belo Horizonte, Minas Gerais State, the prevalence is 61.9% among children aged 10 to 14 years 2. Malocclusion is a public health problem because of its prevalence and its impact on quality of life 2,3. Given the limited treatment resources and the difficulty dentists have in diagnosing malocclusion and correctly assessing its severity, a valid instrument is needed for objectively evaluating the criteria used to recommend and prioritize orthodontic treatment.
Validation and reliability tests have been carried out on a limited number of indices by a large group of orthodontists and have demonstrated that some indices precisely reflect the opinion of specialists, who, in turn, have demonstrated high degrees of inter-examiner agreement 4,5,6,7,8,9,10. It is possible that the convergence of opinions among American specialists is due to the standardization of the educational process, the clinical curriculum of which is directed toward accreditation standards and criteria defined by the American Association of Orthodontists, which has led to a common notion of what constitutes the ideal occlusion 7. Validation studies on the dental aesthetic index (DAI) have been carried out in the USA and Australia using different strategies for obtaining the gold standard 8,11,12. Validation studies on the index of complexity, outcome and need (ICON) have been carried out in the USA and Holland 9,10,13. Thus, these indices have only been validated in three countries.
The use of a panel of orthodontists from a specific geographic region makes it difficult to generalize the results. There is evidence that the country in which a specialist works has an effect on his/her assessment of orthodontic treatment needs 14. Thus, the validity of an index depends on the panel of orthodontists that serves as the gold standard. Consequently, validation studies on the DAI and ICON are needed, especially in developing countries.
The DAI is an occlusal screening index recommended by the World Health Organization (WHO) and has been used in an epidemiological survey carried out by the Brazilian Ministry of Health as well as in local studies in Brazil 1,2,3. The ICON offers the advantage of simplicity of use, as it has few characteristics to be measured and integrates both the aesthetic component and aspects related to the morbidity of malocclusion in the same equation 4. However, the ICON has not been used in epidemiological studies in Brazil.
The aim of the present investigation was to validate the DAI and ICON based on the opinion of a panel of Brazilian orthodontists.

Material and methods
The DAI is a quantitative index that relates the perception of patients regarding dental aesthetics to the objective physical measurements of characteristics associated with malocclusions. To obtain the DAI score, ten occlusal characteristics are assessed based on socially defined occlusal standards of dental aesthetics. The score of the index ranges from 13 to approximately 80 and is categorized by cutoff points 8. The ICON is a quantitative index used to determine the need for orthodontic treatment. To obtain the ICON score, each characteristic is measured and categorized based on a protocol that defines the score for each item. The values attributed to the five characteristics (four occlusal components and one aesthetic component) are multiplied by their respective coefficients (weights) and summed. The total score is a single value situated on an interval scale ranging from seven to 128 points 4.
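The weighted sum described for the ICON can be sketched as follows. This is an illustration only: the component names, score ranges and weights are assumptions taken from the published description of the ICON, not values stated in this paper, and the function name is hypothetical. With these assumed weights the minimum and maximum totals are 7 and 128, which agrees with the range given above.

```python
# Illustrative ICON-style weighted sum.
# ASSUMPTION: component names, score ranges and weights follow the published
# ICON description; none of them are listed in this paper.
ICON_WEIGHTS = {
    "aesthetic": 7,          # aesthetic component, scored 1-10
    "crowding_spacing": 5,   # upper arch crowding/spacing, scored 0-5
    "crossbite": 5,          # scored 0-1
    "overbite_openbite": 4,  # scored 0-4
    "buccal_segment": 3,     # left and right sides combined, scored 0-4
}


def icon_score(components):
    """Return the weighted sum of the five component scores."""
    missing = set(ICON_WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(ICON_WEIGHTS[name] * components[name] for name in ICON_WEIGHTS)


# Lowest and highest possible totals under the assumed score ranges.
lowest = icon_score({"aesthetic": 1, "crowding_spacing": 0, "crossbite": 0,
                     "overbite_openbite": 0, "buccal_segment": 0})
highest = icon_score({"aesthetic": 10, "crowding_spacing": 5, "crossbite": 1,
                      "overbite_openbite": 4, "buccal_segment": 4})
```

Keeping the weights in a single dictionary makes it easy to see how much each component can contribute to the total; the aesthetic component alone accounts for up to 70 of the 128 points under these assumptions.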
The validation of the indices of orthodontic treatment need was carried out in two main stages: the obtainment of the gold standard and the measurement of the indices.
The sample size was calculated to achieve a power of 75% and type II error of 25%, α = 0.05 and δ² = 0.01. Based on the kappa agreement of 0.81 (σ² = 0.15) obtained from the ICON validation study, it was determined that 108 cast models were needed 9,15.
The sample was selected from a set of 445 pairs of study casts in the archives of the Orthodontics Department of the Federal University of Minas Gerais. Each of the 455 casts was scored with the Standardized Rating Scale of Dental Attractiveness (SCAN) by an orthodontist with theoretical and practical knowledge of the use of this scale 16. Each of the ten categories of the SCAN scale contained varieties of dentition types as well as a wide variety of malocclusions. Thus, the sample was representative of a wide range of types and severities of malocclusion.
The pairs of casts were distributed among the categories of the SCAN scale, representing ten different degrees of severity for the different dentition types and malocclusions. Mixed dentition accounted for 14.8% of the casts and permanent dentition accounted for 85.2%. The number of casts varied between severity categories such that there was a greater number of casts in the intermediate categories, as cases classified at the extremities of the treatment need scale (maximal need or no need) would not test the discriminatory capacity of the indices. For example, severity categories 5 and 6 each had 13 pairs of casts, whereas categories 1 and 10 each had six pairs. A randomized numeration was determined on an electronic spreadsheet and all pairs of casts were numerically recoded and labeled (Table 1).
The opinion of a panel of orthodontists was the reference for the gold standard regarding the recommendation for orthodontic treatment for the validation of the DAI and ICON indices. The panel was made up of 20 orthodontists from the city of Belo Horizonte, who fulfilled the following criteria: (a) master's or doctoral degree registered at the Brazilian Ministry of Education; (b) professor affiliated with a university located in Belo Horizonte; (c) a minimum of five years' experience in private orthodontic practice in Belo Horizonte.
Following official contact with the respective universities, the specialists who fulfilled the inclusion criteria were contacted by telephone and asked to participate voluntarily in the study. The study and its objectives were explained and an account of the form of participation of the collaborators was given. All participants signed a statement of informed consent. The study was conducted with professors who met the above-mentioned inclusion criteria. No professor refused to participate in the study. The need for dedication and precision in the recording of the opinions was emphasized. The participants were to appear on the dates defined for the initial and final evaluations of the study casts.
All 108 pairs of casts were assessed to obtain the opinion of the panel regarding the need for orthodontic treatment. The evaluations were scheduled on the days when the participants were at the universities. Between one and four sessions were held, depending on the day of the week and the time available in the professors' schedules. To facilitate access for the participants, the casts were arranged in rooms offered by the universities, in the order of the previously randomized numeration.
The evaluation forms contained the number corresponding to the case, the coded identification of the orthodontist, necessary instructions and a numerical scale to be marked. The specialists were instructed to rate the occlusion in each pair of casts on a seven-item Likert scale, on which 1 represented "no or minimal recommendation for treatment" and 7 represented "very highly recommended for treatment". The participants worked at their own pace and within their own time constraints 7,8,9. Each study model received a single score on the seven-item Likert scale, obtained as the mean of the values given by all specialists.
Each specialist was then asked to use the same seven-point scale to define the value at or above which he/she thought treatment would be recommended. A "recommended treatment point" (RTP) was thus obtained from each specialist. The mean of the 20 RTP values was used as the single cutoff point, thereby reflecting the opinion of the specialists in a dichotomous manner for each model. Casts for which the mean opinion was lower than the mean RTP were classified as "treatment not recommended", whereas those equal to or higher than the mean RTP were classified as "treatment recommended" 8.
To assess intra-examiner and inter-examiner agreement, a subset of forty pairs of casts was randomly selected 8 for re-evaluation by the 20 specialists. The examinations were repeated with a minimal interval of three weeks between sessions for each index. Inter-examiner agreement in 20 randomly defined combinations of pairs of specialists and intra-examiner agreement for the 20 specialists were determined using the quadratic weighted kappa statistic.
The results of the agreement coefficients for the seven-item Likert scale (obtained from the kappa) were accompanied by the lower bound of the 95% confidence interval (95%CI). Given its equivalence in the evaluation of levels of agreement for the sensitivity and specificity model and agreement coefficients, the scale proposed by Fleiss and modified by Cicchetti was used for the assessment of the level of agreement 17,18,19.
A calibrated orthodontist (R.N.C.; intraclass correlation coefficients for DAI ≥ 0.978 and ICON ≥ 0.931) measured each study cast. The determination of the ICON and DAI was made on the set of 108 pairs of casts six weeks following the last calibration session for each index. The measurements on each cast were recorded on specific charts. The index values, which the researcher attributed to the casts blindly, were compared to the gold standard. The cutoff points were 31 for the DAI and 43 for the ICON. The DAI was measured based on an adaptation for mixed dentition, as proposed by the WHO, which considers variations stemming from the development of the occlusion in the following components: missing visible teeth, spacing in the anterior segments and anteroposterior molar relation.
Specificity, sensitivity and accuracy were evaluated. Positive and negative predictive values were calculated from the sensitivity and specificity values of the indices and the prevalence of the need for orthodontic treatment in the local population 2. The accuracy of each index was assessed through analysis of the receiver-operating characteristic (ROC) curve, and the performances were compared by testing the statistical significance of the difference between the areas under the two ROC curves using the non-parametric method proposed by DeLong et al. 20.
All mean values were tabulated on an electronic spreadsheet for the purpose of statistical analysis 3. The data entered were verified against the original charts with the aid of a digital text reader program (Free Natural Reader; NaturalSoft Ltd., Richmond, Canada). The calculations were carried out either on the spreadsheet itself or in a statistical program (Stata/SE for Windows, version 8.0; Stata Corp., College Station, USA). The present study and the informed consent form received approval from the Ethics Committee of the Federal University of Minas Gerais (process no. ETIC 068/06). The study casts used in this investigation belong to the Department of Pediatric Dentistry and Orthodontics and their use was authorized.
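The dichotomization step just described can be sketched as below, assuming the ratings are held in a simple mapping from cast to the list of panel ratings. The function name, variable names and the data are hypothetical and serve only to illustrate the rule; they are not the study data.

```python
from statistics import mean


def dichotomize(ratings_by_cast, rtp_by_specialist):
    """Classify each cast by comparing the panel's mean rating with the mean RTP.

    ratings_by_cast: dict mapping cast id -> list of 1-7 ratings, one per specialist
    rtp_by_specialist: list with each specialist's recommended treatment point
    Returns the cutoff and a dict of cast id -> True ("treatment recommended").
    """
    cutoff = mean(rtp_by_specialist)
    labels = {}
    for cast_id, ratings in ratings_by_cast.items():
        # Equal to or higher than the mean RTP counts as "treatment recommended".
        labels[cast_id] = mean(ratings) >= cutoff
    return cutoff, labels


# Made-up ratings for three casts and four specialists (illustration only).
ratings = {
    "cast_01": [2, 3, 2, 2],
    "cast_02": [5, 6, 4, 5],
    "cast_03": [3, 3, 3, 3],
}
rtps = [3, 3, 2, 4]
cutoff, labels = dichotomize(ratings, rtps)
```

In this toy example the third cast sits exactly on the cutoff and is therefore classified as "treatment recommended", which mirrors the "equal to or higher than" rule in the text.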
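For readers unfamiliar with the quadratic weighted kappa, a minimal pure-Python sketch is given below. It assumes ratings coded as integers from 1 to 7 and uses the standard definition with quadratic disagreement weights; it is not the authors' code (the paper reports using a spreadsheet and Stata), and the function name is hypothetical.

```python
def quadratic_weighted_kappa(first, second, n_categories=7):
    """Cohen's kappa with quadratic weights for ratings coded 1..n_categories."""
    if len(first) != len(second) or not first:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(first)
    k = n_categories
    # Observed joint proportions of (first rating, second rating).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[a - 1][b - 1] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = ((i - j) ** 2) / ((k - 1) ** 2)  # quadratic disagreement weight
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - weighted_observed / weighted_expected


# Toy checks: identical ratings and completely opposed ratings.
same = quadratic_weighted_kappa([1, 2, 3, 4, 5, 6, 7], [1, 2, 3, 4, 5, 6, 7])
opposed = quadratic_weighted_kappa([1, 7], [7, 1])
```

The quadratic weights penalize a disagreement of several scale points much more than a disagreement of one point, which suits an ordinal seven-point scale. The same function covers both uses in the study: two different specialists (inter-examiner) or the same specialist at two sessions (intra-examiner).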
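The step from sensitivity, specificity and prevalence to predictive values follows Bayes' theorem and can be sketched as below. The function name and the input values in the example are hypothetical and are not results of this study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence (all in 0..1)."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    # Expected proportions of the population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive values are undefined for these inputs")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical inputs chosen for illustration only.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.90, prevalence=0.50)
```

Because prevalence enters both formulas, the same index yields different predictive values in a clinic sample and in the general population, which is why the local population prevalence is used here rather than the proportion observed among the casts.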

Results
There was variability in the opinions of the specialists regarding the 108 pairs of casts rated on a seven-point scale. The mean was 4.795 (±0.76), with a minimum of 3.25, maximum of 6.44 and median value of 5.13. The coefficient of variation was 15.85%. The RTP used to dichotomize the opinion regarding treatment need determined by the 20 specialists was 2.90. This cutoff point classified 86.11% of the cases as "with need" and 13.89% as "without need" for treatment.
Table 2 displays the inter-examiner agreement of the specialists. The level of agreement ranged from poor to good. The level of intra-examiner agreement ranged from poor to excellent. The lowest level of intra-examiner agreement was obtained by examiners 2 and 10. Only one specialist achieved an excellent level of agreement 21 (examiner 11) (Table 3).
The measurement of the 108 pairs of casts based on the DAI and ICON criteria resulted in mean values of 34.033 (±8.422) and 57.565 (±23.509), respectively. For the DAI, this value was below the central value of the index scale. The likelihood ratios for the DAI and ICON were 5.08 and 11.12, respectively. The area under the ROC curve was 81.83% (95%CI: 71.21-92.44) for the DAI and 88.75% (95%CI: 78.57-98.92) for the ICON. The comparison of the accuracy measures (area under the ROC curve) of the two indices revealed that the ICON appears to have better discriminatory power regarding the need for treatment than the DAI, as its area under the ROC curve was significantly greater (p = 0.0435) (Figure 1).
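The summary figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch assumes that the coefficient of variation is the standard deviation divided by the mean and that 93 of the 108 cases were classified as needing treatment, the count given later in the Discussion.

```python
# Values reported in the text.
mean_rating, sd_rating = 4.795, 0.76
n_casts, n_need = 108, 93  # 93 is the count reported in the Discussion

cv_percent = 100 * sd_rating / mean_rating           # coefficient of variation
need_percent = 100 * n_need / n_casts                # "with need"
no_need_percent = 100 * (n_casts - n_need) / n_casts  # "without need"

print(round(cv_percent, 2), round(need_percent, 2), round(no_need_percent, 2))
# prints: 15.85 86.11 13.89
```

The three results match the percentages reported above, so the coefficient of variation and the classification split are internally consistent with the mean, standard deviation and case count.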
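As an illustration of the two accuracy measures reported here, the sketch below computes an area under the ROC curve from index scores using the rank-based (Mann-Whitney) formulation, together with a positive likelihood ratio. It assumes that the reported likelihood ratios are positive likelihood ratios, which the text does not state; the function names and data are hypothetical, and the DeLong comparison of two curves is not reproduced.

```python
def auc_from_scores(scores, needs_treatment):
    """Area under the ROC curve: probability that a randomly chosen case with
    treatment need scores higher than one without (ties count one half)."""
    positives = [s for s, y in zip(scores, needs_treatment) if y]
    negatives = [s for s, y in zip(scores, needs_treatment) if not y]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    if specificity >= 1.0:
        raise ValueError("LR+ is undefined when specificity equals 1")
    return sensitivity / (1.0 - specificity)


# Made-up index scores and gold-standard labels (illustration only).
toy_scores = [20, 25, 30, 35, 40]
toy_labels = [False, False, True, False, True]
toy_auc = auc_from_scores(toy_scores, toy_labels)
toy_lr = positive_likelihood_ratio(sensitivity=0.80, specificity=0.90)
```

The pairwise formulation makes the interpretation of the area explicit: an area of 88.75% means that, for a randomly chosen pair of casts with and without treatment need according to the panel, the index ranks them correctly about 89% of the time.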

Discussion
The assessment of specialists for the definition of treatment need among a set of orthodontic casts has demonstrated variable results in the literature 13. A similar finding was identified in the present study. The set of casts was selected in such a way that the cases were distributed throughout the treatment need scale, with a greater frequency of cases of intermediate need and a lower proportion of cases at the extremes of the scale. This distribution enabled a better evaluation of the discriminatory power of the indices in the range of greater decision-making difficulty. The mean obtained from the panel of specialists was higher than the central value of the opinion scale (four). This may be due to the use of a sample that, in order to represent a wide variety of types of malocclusion, included a large number of casts with treatment needs. The variability in the opinion of the specialists regarding the need for orthodontic treatment may be considered a reflection of the difficulty in defining objective signs of malocclusion. As treatment need is a measure of deviation from an arbitrary ideal, there should be a consensus regarding this ideal 5,7,22,23. RTP values of 3.25 and 3.53 have been reported in previous studies using the same methodology 7,8,9. The studies by Beglin et al. 8 and Firestone et al. 9, who respectively assessed 156 and 170 study casts, found more balanced classification percentages. In both studies, the panel classified 64% of the cases in the treatment need category. For the validation of the DAI and ICON, respectively, these two studies obtained better reliability values for the panel of specialists. Beglin et al. 8 found an inter-examiner agreement of κw = 0.81 (lower bound of the 95%CI: κw ≥ 0.81) and an intra-examiner agreement of κw = 0.92 (κw ≥ 0.90). Firestone et al. 9 found an inter-examiner agreement of κw = 0.81 (κw ≥ 0.81) and an intra-examiner agreement of κw = 0.92 (κw ≥ 0.90).
In the present investigation, the removal of the most discordant specialists who determined extreme values altered the variability values, while the mean opinion and RTP values were not altered to the point of affecting the number of cases classified as having treatment need, which remained 93 (data not shown). Therefore, the original panel of 20 specialists was maintained as the gold standard of opinion regarding the need for orthodontic treatment.
Agreement regarding the opinion of the specialists was similar to that reported by Richmond et al. 22, who found intra-examiner agreement values ranging from 0.54 to 0.97 for aesthetic need and from 0.12 to 0.89 for dental health need. Similar levels of agreement were described by Richmond & Daniels 14, who report values for the lower bound of the 95%CI of the kappa coefficient for intra-examiner agreement of 0.58 and 0.55 for need based on aesthetics and health, respectively.
Younis et al. 7 suggest that the lack of agreement among specialists whose training in orthodontics occurred outside the USA may be due to a lack of common curricula and accreditation standards. In the present study, the DAI classified fewer cases with a treatment need than the panel of specialists. It is possible that the tendency of this Brazilian panel to recommend treatment for more cases reflects a geographic difference in decision making. However, this is an issue that requires further investigation.
Considering the 26.4% prevalence rate in the city of Belo Horizonte, a good screening index for orthodontic treatment need would be one with a high positive predictive value, which would offer greater specificity and a lower number of false positives. False positives can lead to emotional and financial burdens, whereas false negatives generally do not result in a serious problem, as malocclusion does not pose a serious health risk. Thus, a higher specificity value should be prioritized and the sensitivity value should be balanced in order not to excessively increase the number of false negatives.
Comparing the values related to the validity of the DAI and ICON using their original cutoff points, the ICON better reflected the opinion of the panel of specialists. A possible cause for the lower validity of the DAI is its failure to consider characteristics related to the impact of malocclusion on oral health 24,25. Thus, cases considered by the panel as needing treatment due to factors related not only to aesthetics but also to the concept of ideal functional occlusion may not have been classified as such by the DAI. As the ICON considers both aesthetic and functional characteristics, it has a greater chance of reflecting the opinion of the panel of specialists.
Another aspect related to the construction of the DAI that may limit the generalization of its use is how the cutoff points were obtained and the recommendation of its validation in both developed and developing countries. The cutoff points were obtained through the correlation between the frequency distribution of the numeric value of the DAI and the distribution of categories of malocclusion severity according to the report from the American National Center for Health Statistics. The degrees of treatment need were deduced from a comparison of the accumulated percentage among the samples 8,26,27. The recommendation of the generalization of its validity was based on a correlation between the opinion of students in four countries regarding the aesthetic impact of malocclusion, as assessed in 25 photographs 28.
The use of a panel of orthodontists from a specific geographical region means it is not possible to generalize the results. There is evidence that the country in which a specialist works has an effect on his/her evaluation regarding the need for orthodontic treatment 23. Thus, the validity of an index depends upon the panel of specialists that serves as the gold standard. The quality of this validation depends on the minimization of errors of different origins. The low degree of reliability of the validating panel makes the gold standard very subjective. As long as there are no more objective, evidence-based standards for the decision of orthodontic treatment, the opinion of specialists continues to be the best reference. Considering the variance in the objective opinion of the panel, indices may be useful educational tools for improving the agreement among professionals regarding the need for orthodontic treatment.
Although the ICON achieved significantly greater accuracy than the DAI, both may be considered valid for use as screening measures of the normative need for orthodontic treatment in epidemiological studies. In situations involving the use of these indices on a real population, a number of issues should be raised regarding their validity: whether the values measured by the index are altered during occlusal development; whether the validity of the measurements determined in study casts corresponds to those in actual patients; and whether the decision regarding the need for treatment corresponds to an actual normative need or may be related to the perception of the patient or his/her culture.
According to Summers 23, the development of the occlusion may alter the values measured by an index. A substantial change in occlusion was observed in the critical period between 12 and 16 years of age, which reduced the value measured by the DAI in this period by an average of 5.1 points. Such alterations depend on what components are measured by the index. In the case of the DAI, for instance, the components that most contributed to this temporal alteration were overjet, crowding, anterior gaps, midline diastema and inter-molar relation. These components are very mutable characteristics in the development of the occlusion. As each index uses measures of different occlusal characteristics in its calculation, the impact of changes during the development of the occlusion on the final value may not be the same for different indices 29. In the present study, the DAI was measured based on an adaptation for mixed dentition, as proposed by the WHO. In order to minimize the effect of occlusal development on the reduction of the DAI score, some physiological characteristics of the mixed dentition (especially in the "ugly duckling phase") should be discarded or modified in the evaluation: midline diastema, anterior maxilla irregularity, anterior mandible irregularity and anterior open bite. Further studies are needed to determine the magnitude of these specific temporal alterations for each index and establish the most adequate correction for each case: adaptations to the index structure or weights of specific categories or even a correction in the final value, depending on the phase of the dentition. In the present study, 16 cases (14.8%) with mixed dentition were used. Considering the temporal effect, the score for these cases should be reduced, which would potentially reduce the number of cases classified by the DAI as needing treatment.
The reliability and validity of evaluations made with study casts in comparison to clinical situations may be questioned. These aspects require further study, given the importance of the comparison of retrospective data (which are generally obtained from study casts) and current, clinically obtained data. Objective measurements from study casts are believed to offer high reliability in comparison to those obtained in clinical exams. Substantial agreement has been identified when comparing the dental health component of the index of orthodontic treatment need (IOTN) between study casts and clinical examination. Substantial agreement has also been found when comparing the aesthetic component of the IOTN (SCAN scale) between study casts and clinical examination 30. Although studies of this nature have not been carried out for the DAI and ICON, Beglin et al. 8 state that the majority of occlusal indices may be administered on study casts, just as general diagnostic tests do not require the presence of the patient once the source of the data (blood, tissue or image) has been acquired.
The adjustment of the cutoff points affected the sensitivity and specificity of both indices. However, based on the confidence intervals, the sensitivity and specificity were similar to the original values for these indices, which signifies that the DAI and ICON are applicable to the Brazilian population studied.
It should be pointed out that there was a higher prevalence of orthodontic treatment need in the present study than that considered for the Brazilian population as a whole, which hinders the generalization of the results of this study 1,2. Validation studies are generally carried out at orthodontic clinics 8,9. It is likely that validation studies involving representative samples of the general population would achieve different results from those of the present study. However, an effort was made to minimize this aspect by selecting study casts from all categories of the SCAN scale. Moreover, indices tend to be administered by clinicians in epidemiological surveys rather than specialists. It is therefore important to carry out validation studies involving general dentists.
The determination of malocclusion in public health allows the establishment of public policies directed at oral health and the organization of public dental services that offer orthodontic treatment. The use of occlusal indices for the determination of orthodontic treatment need generally has limitations inherent to the procedure, from the determination of the gold standard to the incorporation of the patient's perception regarding treatment need. As well as being structurally different, the DAI and ICON have significant differences regarding their validity. Due to its greater accuracy and the fact that it addresses important health-related components along with aesthetic aspects, the ICON could replace the DAI as an index for orthodontic treatment need. Other characteristics, such as simplicity, temporal reliability and duration of the clinical exam, should be considered when choosing indices for oral surveys.

Conclusion
Although the accuracy of the ICON was greater than that of the DAI, both indices are recommended for determining orthodontic treatment needs in Brazil.

Contributors
R. N. Costa was responsible for the study design, statistical analysis and interpretation of data, as well as organizing and writing the paper. M. H. N. G. Abreu was responsible for analysis and interpretation of the data and drafted the paper. C. S. Magalhães was responsible for the study design, supervision, and data collection, and also assisted with the analysis, interpretation of data and with writing the paper. A. N. Moreira was responsible for the study design, supervision, and data collection, and also assisted with the analysis, interpretation of data and writing the paper.

Table 1
Frequency of malocclusion classification in relation to classification using the Standardized Rating Scale of Dental Attractiveness (SCAN scale) from a set of 108 casts.

Table 2
Sampling of inter-examiner agreement by pairs. Belo Horizonte, Minas Gerais State, Brazil.

Table 3
Intra-examiner agreement for values of the opinion of the specialists. Belo Horizonte, Minas Gerais State, Brazil.

Table 4
Frequency of orthodontic treatment need comparing diagnosis performed by panel opinion (gold standard) and dental aesthetic index (DAI) and index of complexity, outcome and need (ICON).

Figure 1
ROC curve for the dental aesthetic index (DAI) and index of complexity, outcome and need (ICON). Belo Horizonte, Minas Gerais State, Brazil.