CLASSIFYING RADIUS FRACTURES WITH X-RAY AND TOMOGRAPHY IMAGING

Acta Ortop Bras. 2009; 17(2):9-13. Received 09/26/07; approved 09/26/07. Department of Orthopaedics and Traumatology, University of São Paulo Medical School, Hospital das Clínicas, and Musculoskeletal Investigation Laboratory (LIM 41). Correspondence to: Rua Antenor Guirlanda, 92, apto 122, Casa Verde, São Paulo, SP, Brasil, CEP 02514-010; e-mail: paulorobertomiziara@uol.com.br


ABSTRACT
Introduction: This study evaluated the interobserver reliability of plain radiography versus computed tomography (CT) for the Universal and AO classification systems for distal radius fractures. Patients and methods: Five observers independently classified 21 sets of distal radius fractures using plain radiographs and CT. Kappa statistics were used to establish the relative level of agreement between observers for both readings. Results: Interobserver agreement was moderate for the Universal classification and poor for the AO classification. Reducing the AO system to 9 categories, and then to its three main types, raised reliability to a "moderate" level. No difference was found in interobserver reliability between the Universal classification using plain radiographs and the Universal classification using computed tomography. Interobserver reliability of the AO classification system using plain radiographs was significantly higher than its interobserver reliability using only computed tomography. Conclusion: From these data, we conclude that classifying distal radius fractures using CT scanning without plain radiographs is not beneficial.


INTRODUCTION
Fractures are a public health problem affecting a significant portion of the population. A study published in Scotland in 2006 found an incidence of 11.67/1000 fractures/year in men and 10.65/1000/year in women; the most frequent were fractures of the distal third of the radius, at 1.95/1000/year. 1 Another study, conducted in Sweden in 2007, found an even greater incidence: 2.6/1000/year. 2 The way these injuries are treated has changed dramatically over the last two decades, from the almost universal use of plaster casts to a large variety of surgical techniques. 3 These changes followed the demonstration that appropriately restoring the joint surface of the distal radius improves the prognosis of these fractures. 4 Joint congruence and the other parameters used in therapeutic decisions about these fractures are mainly evaluated on plain X-ray images. 4,15-18 Studies assessing specific parameters such as step and diastasis have also presented conflicting results. 5,7,9,19 Most studies assessing the reliability of classifications of distal radius fractures involve the use of plain X-ray images only; few use computed tomography images for this kind of assessment. 6,8,20 In Brazil, according to our search of the PubMed, Lilacs and Embase databases with the keywords classification, tomography and radius, we found no interobserver reliability study of the Universal and AO classifications using computed tomography images. The purpose of this study is to investigate the interobserver reliability of the AO and Universal classifications using plain X-ray and computed tomography images in patients with fractures of the distal third of the radius.

MATERIALS AND METHODS
We captured X-ray and tomography images of 21 adult patients of both genders with distal radius fractures. Only patients with acute fractures not previously treated were included. X-ray images were taken in anteroposterior and lateral views, while tomography images were captured in sagittal, coronal and axial planes. The images were evaluated by five third-year residents in Orthopaedics and Traumatology. Fractures were first classified from the X-ray images, presented without patient identification and in random order. The fractures were then classified from the tomography images, also without patient identification and in random order, so that observers could not associate X-ray images with tomography images. The AO classification (Chart 1) and the Universal classification (Chart 2) were used in this study. 21,22

Universal classification (Chart 2):
Type 1 - Extra-articular fracture, without displacement
Type 2 - Extra-articular fracture, with displacement (2A reducible and stable; 2B reducible and unstable; 2C irreducible)
Type 3 - Intra-articular fracture, without displacement
Type 4 - Intra-articular fracture, with displacement (4A reducible and stable; 4B reducible and unstable; 4C irreducible)

The AO classification was assessed by stratification, with three levels of detail: the first level corresponds to types A, B or C, i.e., the observer only needs to determine whether the fracture is extra-articular, partially intra-articular or fully intra-articular; the second level corresponds to the 9 subtypes, A1 to C3; and the third level is the full classification, with its 27 sub-items, A1.1 to C3.3. (Chart 1)
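The stratification described above can be sketched in code. A minimal illustration, assuming a full AO code is written as a string such as "C3.2" (the function name is ours, for illustration only):

```python
def ao_levels(code):
    """Collapse a full AO code (e.g. 'C3.2') into the study's three
    levels of detail: type only (A/B/C), the nine subtypes (A1..C3),
    and the full 27 sub-items (A1.1..C3.3)."""
    level1 = code[0]   # extra-, partially intra-, or fully intra-articular
    level2 = code[:2]  # one of the nine subtypes
    level3 = code      # the complete classification
    return level1, level2, level3

print(ao_levels("C3.2"))  # ('C', 'C3', 'C3.2')
```

Collapsing each observer's full code this way allows the same set of readings to be scored at all three levels of detail.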

STATISTICAL ANALYSIS
These data were assessed using the Kappa statistical method. The Kappa coefficient measures consistency between observers after subtracting the agreement that would be expected by chance. Values were interpreted as recommended by Landis and Koch, 23 as in all studies employing the Kappa coefficient on this topic: values above 0.80 indicate excellent consistency; between 0.61 and 0.80, good reproducibility; between 0.41 and 0.60, moderate reproducibility; between 0.21 and 0.40, low reproducibility; and between zero and 0.20, poor reproducibility. Negative values represent inconsistency.
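As a minimal sketch, the Kappa coefficient for two observers, and its Landis and Koch band, can be computed as follows (the readings below are hypothetical, not data from this study):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa: agreement between two raters, corrected for chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of cases where the raters give the same class.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal class frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

def landis_koch(kappa):
    """Qualitative band for a Kappa value, per Landis and Koch."""
    if kappa < 0:
        return "inconsistency"
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "low"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"

# Hypothetical readings: two observers classifying ten fractures (Universal types).
obs1 = ["1", "2", "4", "4", "3", "2", "1", "4", "2", "3"]
obs2 = ["1", "2", "4", "3", "3", "2", "2", "4", "2", "3"]
k = cohens_kappa(obs1, obs2)
print(round(k, 2), landis_koch(k))  # 0.73 good
```

Note that the raw agreement here is 8/10, but Kappa is lower because some of that agreement would occur by chance alone.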

RESULTS
The mean interobserver reliability of the Universal classification using plain X-ray images was 0.42 (Table 1). For the AO classification, the mean rate was 0.47 at the first level, 0.32 at the second level, and 0.21 at the third level (Table 1). The mean interobserver reliability of the Universal classification using computed tomography was 0.37; for the AO classification, 0.34 at the first level, 0.21 at the second level, and 0.11 at the third level (Table 2). The differences between the interobserver reliability rates for the Universal classification based on plain X-ray images and those based on computed tomography images were not statistically significant by Wilcoxon's non-parametric paired test.
The differences between the interobserver reliability rates for the AO classification based on X-ray images and those based on tomography images were statistically significant at all detail levels of the AO classification.
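The mean rates above can be reproduced with a short sketch, under the assumption (our reading of the methods) that the mean interobserver reliability is the average of Cohen's Kappa over all observer pairs; the readings below are hypothetical, not the study's data:

```python
from collections import Counter
from itertools import combinations

def cohens_kappa(a, b):
    """Cohen's Kappa for two raters over the same cases."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    fa, fb = Counter(a), Counter(b)
    p_e = sum(fa[c] * fb[c] for c in fa) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical readings: three observers classifying six fractures (AO types).
readings = [
    ["A", "B", "C", "C", "A", "B"],
    ["A", "B", "C", "B", "A", "B"],
    ["A", "C", "C", "C", "A", "A"],
]

# One Kappa per observer pair, then the mean over all pairs.
pairwise = [cohens_kappa(r1, r2) for r1, r2 in combinations(readings, 2)]
mean_kappa = sum(pairwise) / len(pairwise)
print(round(mean_kappa, 2))  # 0.53
```

With five observers, as in this study, the mean would be taken over the ten observer pairs.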

DISCUSSION
Good interobserver reliability is of great importance for any classification system. An appropriate classification should capture the aspects that define the severity of the injury, serving as a basis for deciding on the kind of treatment, evaluating its result, and predicting prognosis.
The number of patients in this study is consistent with the average sample of studies assessing interobserver reliability of distal radius fracture classifications, and proved adequate to provide statistically significant results.
Observers were selected from among third-year residents in order to assess reliability among observers with a general orthopaedic education and no specific specialization in Hand Surgery.
Studies addressing interobserver reliability in the evaluation of distal radius fractures use several methodologies, making comparisons difficult. 10-19 Some use measurements such as steps and diastasis, 5,11 while others assess the fractures with a well-known classification: Frykman, AO, Universal, Melone, Mayo or Older's. Studies using both plain X-ray and computed tomography images usually compare plain X-ray imaging alone versus plain X-ray imaging combined with tomography. 6,7,9 Johnston et al. 8 gave one observer access only to tomography images and another observer access to both plain X-ray and tomography images, but did not perform a comparative analysis between these observers or calculate the consistency between them. Cole et al. 5 analyzed X-ray and tomography images separately for step and joint diastasis, finding a significant difference between the reliability of X-ray images (Kappa: 0.31-0.47) and tomography images (0.69-0.83). Pruitt et al. 11 also evaluated tomography images independently of X-ray images, but did not calculate the Kappa index. In the present study, observers evaluated X-ray and tomography images independently, a different approach to tomography images compared with other studies estimating interobserver reliability for the AO and Universal classifications.
The Universal classification of distal radius fractures is divided into four types. Types 2 and 4 can be subdivided into subtypes A, B and C, which refer, respectively, to reducible and stable, reducible and unstable, and irreducible fractures. These subtypes were not used in this study because they would require a standardized pair of images, a baseline examination and an assessment after closed reduction, which was not performed.

X-RAY IMAGES
In this study, the result for the Universal classification (0.47, moderate reproducibility) was of similar magnitude to that reported by Oliveira-Filho et al. 16 (0.33, low reproducibility).
For the first level of the AO classification, this study found a mean reliability rate of 0.47 (moderate). Andersen et al. 13 reported a rate of 0.64, Kreder et al. 15 0.68, and Oskam et al. 18 (who reduced the AO classification to 4 types) 0.65; these values correspond to good reproducibility. Flinkkilä et al. 6, with 5 types, reported a rate of 0.23, a low reproducibility level, and, with 2 types, 0.48, which represents moderate reproducibility.
For the second level (nine subtypes), we found a mean rate of 0.32 (low reliability).Kreder et al. 15 reported, in their study, a Kappa value of 0.48 (moderate).
For the AO classification at its third level (integral, with 27 subtypes), the present study found a mean Kappa of 0.21 (low reliability). Oliveira-Filho et al. 16 reported a Kappa index of 0.21, Illarramendi et al. 14 indexes between 0.31 and 0.40, Andersen et al. 13 0.25, and Kreder et al. 15 0.33, all in the "low" reliability range. Flinkkilä et al. 6 presented a Kappa index in the "poor" reliability range: 0.18.
As expected, the mean Kappa index decreased progressively as the level of detail of the AO classification increased.

TOMOGRAPHY IMAGES
Contrary to our expectations, the reproducibility of the Universal and AO classifications using tomography images was systematically lower than the reproducibility calculated from X-ray images.
Using tomography images, we found a mean rate of 0.37 (low reliability) for the Universal classification. Despite the lower rate with tomography compared to X-ray images (0.47), the difference was not statistically significant by Wilcoxon's non-parametric paired test. We found no other study for comparison. The first level of the AO classification (three types) showed a mean Kappa index of 0.34 (low reproducibility). Flinkkilä et al. 6 reduced the AO classification to five types, obtaining a rate of 0.25 (low reproducibility); reducing it to two types, they found 0.78 (good reproducibility). We found a statistically significant difference (p<0.05, Wilcoxon's test) between the rates calculated from tomography images and those calculated from X-ray images (mean index of 0.47), reflecting better reproducibility of the classification from X-ray images than from tomography images alone.
For the second level of the AO classification (nine subtypes), the rate in this study was 0.21 (low reproducibility); there is no other study for comparison. Again, a significant difference was found in favor of reliability using X-ray images (mean Kappa of 0.32). For the third level (integral, with 27 subtypes), the mean index was 0.11 (poor reproducibility); no other studies are available for comparison. The difference from the X-ray images (0.21) is statistically significant.
A potential reason for this difference between the reliability rates for the AO classification from X-ray images and from tomography images is the difficulty an observer has in determining the three-dimensional morphology of the fracture line from tomography images without the prior aid of X-ray images. Flinkkilä et al. 6 commented on this in their study, adding that it is even harder to determine the degree of metaphyseal comminution from tomography images than from plain X-ray images. The parameters of fracture line morphology and degree of comminution seem to be critical for correctly determining types in the AO classification.
The Universal classification, on the other hand, only requires determining whether the fracture is intra-articular or not, and whether displacement is present or not. This could explain why the difference in reliability rates between X-ray and tomography images was not statistically significant for the Universal classification. This is also consistent with the results reported by Cole et al. 5 In their study, the authors compared the measurement of fracture steps and diastasis from X-ray images with the same measurements from tomography images, and found better reliability with tomography. Their task, similarly, did not require determining fracture line morphology or degree of comminution, favoring the use of tomography over X-ray imaging.
After the calculations were made, we reviewed the images and retrieved examples of cases with significant inconsistency between observers. Figure 1 shows a case in which observers disagreed, on plain X-ray images, about whether the fracture line was intra- or extra-articular. All observers identified an intra-articular fracture when assessing that patient's tomography images.
Figure 2 shows an example of inconsistency between observers when assessing a computed tomography image.
The observers did not agree on whether the fracture was partially or fully articular. Interestingly, in this case, the observers were unanimous in judging the fracture as fully articular when assessing it on X-ray images.

CONCLUSIONS
This study corroborates the low reliability rates reported by other studies of the AO and Universal classifications of distal radius fractures using plain X-ray images. The use of computed tomography alone is associated with a low reliability rate, statistically worse than the reliability of classifications based on plain X-ray images. Therefore, we do not recommend using computed tomography alone to classify distal radius fractures. The Universal and AO classifications were historically designed for X-ray images of distal radius fractures, not for tomography. In the future, a classification designed specifically for tomography images might present better results.

Figure 1 - X-ray and computed tomography images of a wrist in anteroposterior and lateral views

Table 1 -
Mean interobserver consistency for the Universal and AO classifications using plain X-ray images

Table 2 -
Mean interobserver consistency for the Universal and AO classifications using computed tomography