RELIABILITY OF THE AO CLASSIFICATION OF THORACOLUMBAR FRACTURES COMPARED TO TLICS

Objective: To test the reliability of the new AO/2013 classification compared with AO/Magerl and TLICS. Methods: Four spine surgeons retrospectively and blindly evaluated imaging and clinical data from 98 patients with thoracolumbar fractures. Results: Using the Kappa coefficient, we obtained the best reproducibility for the AO/2013 classification compared to the other two, with a Kappa coefficient of 0.690. We also identified, with good reproducibility among the evaluators (Kappa 0.690), the most common subtypes of the AO/2013 classification with an indication for surgery. Conclusion: We believe that the new AO/2013 classification has proven to be a good communication tool among spine surgeons, with good reproducibility, but more studies should be conducted in several centers to consolidate it and to better understand the prognosis of each type of injury.


INTRODUCTION
As a consequence of advances in the initial treatment of polytraumatized patients, more and more victims of serious spine injuries are arriving at emergency care units alive and in need of immediate treatment. 1 The age group most frequently affected by spine injuries is the one with the highest productivity for society, i.e., from 20 to 59 years of age. 1 The region between the T12 and L2 vertebrae is the site of more than 50% of all spine fractures (Figures 1 and 2). The attention given to spine fractures in Brazil has increased in the last few years, with the rise in urban violence and high-energy traumas, traffic accidents, and falls from heights, in addition to the potential for neurological damage, which approaches 40% in cervical fractures and 15-20% in thoracic fractures. 2 With the goal of unifying communication for better treatment of spine fractures, and in particular those of the thoracolumbar segment, several classification systems have been developed. 3 Among the classifications based on pathomorphology, that of Magerl et al. 4 is the most detailed, with a total of 53 types of fractures. Up until now, this classification system has been used as an international reference. However, clinical application of this classification has been neither validated nor revised. 5,6 Watson-Jones 7 has already stated that the concept of instability would be critical to any algorithm regarding thoracolumbar fractures.
The classification known as TLICS (Thoracolumbar Injury Classification System) 8 assesses neurological status, the integrity of the posterior ligamentous complex (PLC), and the morphology of the lesion using descriptive categories. 8 It uses Magnetic Resonance Imaging (MRI) to evaluate the integrity of the PLC, an approach that has been evaluated in other studies. 9,10 We should note the importance of this classification, as it emerged in order to overcome several deficiencies of the AO/Magerl 1994 classification system. 4 With the recent publication of the new AO classification in 2013, 3 it needed to be validated in the medical community, and especially among the professionals who deal with this type of injury. The new AO/2013 classification maintains the group format (A, B, C) and the subtypes (A0, A1, A2, A3, A4, B1, B2, B3, and C). 3 The objective of this study was to test the reliability of the new AO/2013 classification in our service, comparing it to the two previously used classification systems, AO/Magerl 1994 4 and TLICS. 8 The research was developed at the Hospital do Trabalhador-UFPR, located in Curitiba-PR, which treats around 60% of the traumas of the capital and the metropolitan region.

METHODOLOGY
The study was approved by the Institutional Review Board as number CAAE: 42605915.5.0000.5225.
This was an observational, longitudinal, retrospective, and descriptive study in which we reviewed the medical reports of 100 cases of thoracolumbar spine fractures treated at the Hospital do Trabalhador-UFPR in Curitiba-PR, Brazil, during the period from January 2013 to December 2014. The following parameters were used for case selection:
Inclusion criteria: Patients with vertebral fractures at the T1 to L5 level, with radiographs and computed axial tomography (CAT) taken at hospital admission.
Exclusion criteria: Patients with pathological fractures, incomplete medical records, fractures outside the proposed levels, inadequate imaging exams, or fractures caused by firearm projectiles (FAP).
For each case we evaluated the radiographic images in anteroposterior (AP) and lateral orthogonal views and CAT images in coronal, sagittal, reconstruction, and axial views with 2 mm slices. A CD-ROM was distributed to each of four (4) physician/examiners (PE). All the examiners were orthopedists, specialized in spine surgery, accredited by the Sociedade Brasileira de Coluna (SBC) [Brazilian Spine Society (BSS)] and skilled in the treatment of spine fractures. Each CD-ROM contained 100 cases that were individually evaluated. There was no communication among the PEs. All the cases sent were in compliance with the inclusion and exclusion criteria mentioned above. The PEs received the original articles that describe in detail the AO/Magerl 1994, 4 TLICS, 8 and AO/2013 3 classifications, in addition to a table to be filled out individually for each case (Attachment 1). For each patient, their clinical history, trauma mechanism, age, neurological status, and data about the integrity of the PLC were available on the CD-ROM. Concordance for the AO/2013 3 classification was assessed among the 3 groups (A, B, and C) and among the eight subtypes (A1, A2, A3, A4, B1, B2, B3, and C).
The interobserver concordance for the TLICS classification was assessed based on three variables (fracture morphology, PLC injury, and neurological compromise).
For all the cases, the PEs were asked for a final decision between conservative or surgical treatment, and this information was analyzed using Cohen's Kappa test to determine the interobserver concordance. The literature was used as a reference for interpreting the values obtained. 11,12 Confidence intervals of 95% were constructed for this statistic.
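As an illustration of the statistic named above, the following is a minimal sketch of Cohen's Kappa for two examiners, together with one possible extension to several examiners. The text does not state how the coefficient was extended to four PEs, so averaging the pairwise values is an assumption made here for illustration only (Fleiss' Kappa is another common choice); the function names and the example answers are hypothetical and are not study data.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters who labeled the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases with identical labels.
    p_observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical label for every case.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


def mean_pairwise_kappa(ratings):
    """Average Cohen's Kappa over all pairs of raters (assumed extension)."""
    pairs = list(combinations(ratings, 2))
    if not pairs:
        raise ValueError("at least two raters are required")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical answers to "surgical treatment?" for four cases.
examiner_1 = ["yes", "yes", "no", "no"]
examiner_2 = ["yes", "no", "no", "no"]
print(cohen_kappa(examiner_1, examiner_2))
```

In this hypothetical example the two examiners agree on three of the four cases, while the agreement expected by chance is 0.5, so the coefficient corrects the raw agreement of 0.75 downward.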

RESULTS
For this study, 100 cases were used, 75% of which were men and 25% women, between 20 and 60 years of age, with an average age of 35 years. The most common trauma mechanism, in 65% of the cases, was a fall from a height. The results were evaluated individually for each classification.

AO/Magerl (1994) analysis
The concordance among the PEs was evaluated taking into consideration the set of 100 imaging exams that were evaluated by all of them.
The classifications considered corresponded to the combination of type and group in the AO/Magerl 1994 classification. 4 Based on the study results, we estimated a κ statistic of 0.385, indicating marginal reproducibility. The 95% confidence interval for the κ statistic was 0.363-0.407. The distribution of the results of the evaluations of the 4 PEs in the 100 cases can be seen in Figure 3.

TLICS analysis
The concordance among the examiners was evaluated considering the set of 100 imaging exams where there were evaluations from the 4 PEs.
The classification considered in this evaluation corresponded to the total score calculated using the TLICS classification, with the associated treatment options presented below:
- score from 0 to 3: conservative.
- score equal to 4: conservative or surgical.
- score greater than or equal to 5: surgical.
We estimated a κ statistic of 0.616 based on the results obtained in the study, indicating good reproducibility. The 95% confidence interval for the κ statistic was 0.554-0.679. The distribution of the results can be seen in Figure 4.
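The score thresholds listed above can be sketched as a simple lookup. This is only an illustration of the mapping as stated in the text; the function name and the returned labels are hypothetical, and the calculation of the total score itself is not shown.

```python
def tlics_treatment(total_score: int) -> str:
    """Map a TLICS total score to the treatment option stated in the text."""
    if total_score < 0:
        raise ValueError("the TLICS total score cannot be negative")
    if total_score <= 3:
        return "conservative"
    if total_score == 4:
        # A score of 4 leaves the decision to the surgeon.
        return "conservative or surgical"
    return "surgical"


for score in (2, 4, 6):
    print(score, tlics_treatment(score))
```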
The concordance among the PEs was evaluated considering the set of 100 imaging exams.
The classifications considered in this evaluation corresponded to the indication for treatment, the options available to each evaluator being:
- surgical treatment: yes.
- surgical treatment: no.
We estimated a κ statistic of 0.690 based on the results obtained in the study, indicating good reproducibility. The confidence interval for the κ statistic was 0.608-0.772.
Of the 100 cases, all 4 PEs classified 33 (33.0%) as requiring conservative treatment (answer of no) and 37 (37.0%) as requiring surgical treatment (answer of yes). Thus, there was concordance among all the evaluators for 70 (70.0%) of the imaging exams.

AO/2013 classification analysis
The concordance among the PEs was evaluated considering the set of 100 imaging exams that were evaluated by all of them.
The classification considered in this evaluation corresponded to a combination of type and group. We estimated a κ statistic of 0.621 based on the results obtained in the study, indicating good reproducibility. The 95% confidence interval for the κ statistic was 0.583-0.659.
The distribution of results can be seen in Figure 5.

Association between the AO/2013 classification and the indication for surgery
The results relative to the percentage of surgical indications according to the AO/2013 classification can be seen in Figure 6.

DISCUSSION
Developing a classification system that is useful to all professionals who wish to better guide treatment and better understand injury mechanisms has always been the goal of many medical researchers. 3,13,14 There has always been a tension between simpler systems, which end up omitting some information, and more complex systems, which cause considerable disagreement among professionals.
The study of thoracolumbar fractures by Blauth et al. 17 demonstrated low intra- and interobserver reliability for the AO/Magerl 1994 classification with a kappa coefficient of 0.385.
In this study, we observed a kappa coefficient of 0.385, which, although slightly higher than that from the Blauth et al. 17 study referenced above, also indicates low interobserver reliability for the AO/Magerl 1994 classification.
The AO/Magerl 1994 classification demonstrated a strong tendency towards surgical treatment for patients classified as group A3, types B and C, reserving conservative treatment for most of the A1 and A2 groups.
In the evaluation performed with the AO/2013 classification, we obtained indications for surgical treatment in 64.7% of the type A3 fractures, more than 80% of the type B fractures, and 100% of the type C fractures, a finding that should be compared with new studies that use the new classification and relate its treatment indications to patient prognosis.
The TLICS system was considered to be reliable and reproducible in smaller series, but raised questions about the cost of performing MRIs, doubts about the best treatment for a score of 4, and potential indication errors between surgical and conservative treatments. 8 The discussion about the need to perform an MRI to evaluate the integrity of the PLC arose from studies showing that these injuries may go unnoticed in obese patients and those with edema. 18 Denis et al. 16 had previously related PLC injury to a worsening neurological profile and poor conservative treatment outcomes. In a recent study, the new AO/2013 classification reached a Kappa score > 0.55 in the evaluation of PLC lesions using only a clinical examination, 12 thus showing that an MRI examination is not indispensable. In our study, we adopted the clinical examination as a parameter to assess PLC injury. In

Figure 1. Axial computed tomography in sagittal and axial slices showing a type A4 fracture of vertebra L1.

Figure 2. Anteroposterior and lateral radiography showing a type A4 fracture of vertebra L1.

Figure 5. Percentage by type and group for the AO/2013 classification.

Figure 3. Percentages obtained for each subtype and group of the AO/Magerl 1994 classification.