Analysis of the sensitivity and reproducibility of the Basso, Beattie, Bresnahan (BBB) scale in Wistar rats.

OBJECTIVE To evaluate the sensitivity and reproducibility of the Basso, Beattie, Bresnahan functional scale in the assessment of the locomotor capacity of rats after spinal cord injury. METHODS Thirty male Wistar rats underwent laminectomy and mild, moderate or severe spinal cord contusions using the New York University Weight Drop Impactor. The rats were followed for 28 days, after which time each rat was placed in an 80 x 80 x 30 cm clear box lined with a blue non-slippery material and stimulated to move. Their movement was video-recorded by three digital cameras operating simultaneously. Identical copies of the edited videos were given to six independent evaluators who were blinded to the degree of injury severity. Each evaluator scored the locomotor capacity of the rats using the Basso, Beattie, Bresnahan functional scale. RESULTS We determined the sensitivity of the method to differences among the evaluators as well as between the results achieved on the left and right hind paws of rats subjected to either mild, moderate or severe injuries by comparing the functional outcomes and reproducibility using non-parametric correlation tests. CONCLUSIONS The Basso, Beattie, Bresnahan scale showed high reproducibility and satisfactory sensitivity for identifying mild injuries; satisfactory reproducibility and non-satisfactory sensitivity for moderate injuries; and reduced reproducibility and non-satisfactory sensitivity for severe injuries.


INTRODUCTION
It is important to understand the physiological mechanisms of spinal cord injury in order to reach a consensus on the most appropriate tool for behavioral evaluation and functional recovery analysis. Although some functional recovery evaluation tests are easy to use, they depend on subjective observation and therefore offer limited sensitivity 1,2 .
The experimental model of the weight-drop medullar contusion developed by Allen between 1911 and 1914 (see Tarlov and Klinger) 1 has served as the basis for various other methods to induce injury in different species [3][4][5] . These tests, which measure the capacity of locomotor recovery, produce varying results, ranging from qualitative descriptions of the walking act to combined methods 6 for detecting changes in the CNS. Contradictory outcomes for various methods, types of animals and statistical analyses make it difficult to reproduce results between laboratories. Moreover, the methods of behavioral evaluation, definitions, and criteria used to evaluate the locomotor capacity of rats vary significantly in literature 7 .
Systematic evaluation of functional behavior after spinal cord injury should yield reproducible information on the sensitivity of a given method to small changes, and thereby assist in identifying a pathophysiological recovery procedure that shows therapeutic efficacy. Commonly employed methods that use qualitative or semiquantitative scales have low reproducibility and lack accuracy and sensitivity 1,2 , especially to small responses or differences [8][9][10][11][12] .
Of the various experimental models available, we adopted the model for spinal cord injury in rats used in the Multicenter Animal Spinal Cord Injury Study (MASCIS) 13 , and used the Basso, Beattie, Bresnahan (BBB) scale 14 for functional evaluation of locomotor capacity recovery. The BBB is a semiquantitative scale based on the locomotor response of rats, with scores ranging from zero to 21.
We attempted to evaluate the sensitivity and reproducibility of the BBB scale for different types of injury (mild, moderate or severe) in 30 Wistar rats that were subjected to a spinal cord contusion injury induced by the New York University (NYU) Weight-Drop Impactor.

METHODS
In order to determine the sensitivity and reproducibility of the BBB scale 14 , we compared and correlated the average values of the scores assigned by six independent evaluators to the locomotor functional recovery of 30 male Wistar rats 28 days after laminectomy and induced spinal cord injury (mild, moderate or severe). The movement of each animal was recorded simultaneously by three digital cameras and subsequently reviewed by the evaluators, who were blinded to the severity of the injuries.
The experimental model we adopted to induce spinal cord injuries was first developed by MASCIS; our version, standardized for Wistar rats 15 , consisted of the following stages:
A - Receipt and selection of animals. Because a high number of rats died within 28 days of spinal cord injury, it was necessary to begin work with 60 rats. Ultimately, eight rats were excluded from the mild group, nine from the moderate group and 13 from the severe group.
B - Random formation of experimental groups
C - Spinal cord injury induced by weight drop controlled by the NYU Impactor
1 Anesthesia
2 Laminectomy
3 Spinal cord contusion
D - General standard procedures after spinal cord contusion
E - Postoperative antibiotic therapy
F - Maintenance of the animals
G - Locomotor evaluation (simultaneous filming of the motricity of each rat by three digital cameras 28 days after mild, moderate or severe spinal cord injury; video-based analysis and corresponding BBB scale assessment of the locomotor functional capacity was conducted by six independent evaluators blinded to the degree of severity of each rat's injury).
H - Euthanasia 29 days after the injury
I - Statistical analysis

BBB SCALE
0. No observable movement of the hindlimbs.
1. Slight (limited) movement of one or two joints, usually hip and/or knee.
2. Extensive movement of one joint or extensive movement of one joint and slight movement of the other.
3. Extensive movement of two joints.
4. Slight movement of all three joints of the hindlimbs.
5. Slight movement of two joints and extensive movement of the third joint.
6. Extensive movement of two joints and slight movement of the third joint.
7. Extensive movement of the three joints in the hindlimbs.
8. Sweeping without weight bearing or plantar support of the paw without weight bearing.
9. Plantar support of the paw with weight bearing only in the support stage (i.e., when static) or occasional, frequent or consistent dorsal stepping with weight bearing and no plantar stepping.
10. Plantar stepping with occasional weight bearing and no forelimb-hindlimb coordination.
11. Plantar stepping with frequent to consistent weight bearing and no forelimb-hindlimb coordination.
12. Plantar stepping with frequent to consistent weight bearing and occasional forelimb-hindlimb coordination.
13. Plantar stepping with frequent to consistent weight bearing and frequent forelimb-hindlimb coordination.
14. Plantar stepping with consistent weight support, consistent forelimb-hindlimb coordination and predominantly rotated paw position (internally or externally) during locomotion both at the instant of initial contact with the surface as well as before moving the toes at the end of the support stage or frequent plantar stepping, consistent forelimb-hindlimb coordination and occasional dorsal stepping.
15. Consistent plantar stepping, consistent forelimb-hindlimb coordination and no movement of the toes or occasional movement during forward movement of limb; predominant paw position is parallel to the body at the time of initial contact.
16. Consistent plantar stepping and forelimb-hindlimb coordination during gait and movement of the toes occurs frequently during forward movement of the limb; the predominant paw position is parallel to the body at the time of initial contact and curved at the instant of movement.
17. Consistent plantar stepping and forelimb-hindlimb coordination during gait and movement of the toes occurs frequently during forward movement of limb; the predominant paw position is parallel to the body at the time of initial contact and at the instant of movement of the toes.
18. Consistent plantar stepping and forelimb-hindlimb coordination during gait and movement of the toes occurs consistently during forward movement of limb; the predominant paw position is parallel to the body at the time of initial contact and curved during movement of the toes.
19. Consistent plantar stepping and forelimb-hindlimb coordination during gait and movement of the toes occurs consistently during forward movement of limb; the predominant paw position is parallel to the body at the instant of contact and at the time of movement of the toes, and the animal presents a downward tail some or all of the time.
20. Consistent plantar stepping and forelimb-hindlimb coordination during gait and movement of the toes occurs consistently during forward movement of limb; the predominant paw position is parallel to the body at the instant of contact and at the time of movement of toes, and the animal presents consistent elevation of the tail and trunk instability.
21. Consistent plantar stepping and coordinated gait, consistent movement of the toes; paw position is predominantly parallel to the body during the whole support stage; consistent trunk stability; consistent tail elevation.

STATISTICAL ANALYSIS
In order to analyze the sensitivity and reproducibility of the results, we checked for consistency between the results achieved on the right and left sides of the rats and between the grouped results achieved by the different evaluators.
The normality of the distributions was tested using the Kolmogorov-Smirnov test for continuous variables and by examining the Pearson's correlation coefficient (less than 30%). Since no normal distributions were found, non-parametric tests were adopted. We used Wilcoxon's test for non-parametric paired samples to infer the difference of the means between the right (R) and left (L) sides of each rat, and the one-tailed Spearman's correlation coefficient (r) to check for pairing effectiveness. We used the Friedman test for non-parametric paired samples to infer the difference of the means among the different evaluators. The paired differences were discriminated using the multiple comparison test modified by Dunn. In those cases where Dunn's test did not show enough statistical power (power effectiveness) to discriminate the differences, we reported at minimum the difference between the evaluators who presented the greatest difference in ranking.
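As an illustration of the interevaluator comparison, the Friedman statistic can be sketched in a few lines of standard-library Python. This is a minimal sketch and not the software used in the study: the function names `average_ranks` and `friedman_statistic` and the example scores are hypothetical, and the statistic is computed without the correction for ties that statistical packages usually apply.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def friedman_statistic(scores):
    """Friedman chi-square, uncorrected for ties.

    scores: one row per rat, one column per evaluator (assumed layout).
    """
    n = len(scores)        # number of rats (blocks)
    k = len(scores[0])     # number of evaluators (treatments)
    rank_sums = [0.0] * k
    for row in scores:
        # Scores are ranked within each rat, across evaluators.
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    sum_sq = sum(s * s for s in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


if __name__ == "__main__":
    # Hypothetical BBB scores: 4 rats scored by 3 evaluators.
    example = [[18, 19, 19], [20, 20, 21], [17, 18, 18], [19, 19, 20]]
    print(friedman_statistic(example))
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom; a post hoc procedure such as Dunn's test is then needed to locate which evaluators differ.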

RESULTS
The evaluators' assessments of locomotor functional capacity revealed differences between the right and left sides of rats subjected to mild injury, and a lack of pairing in the measurements for rats subjected to moderate and severe injury (Table 1).
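The right-versus-left comparison rests on the Wilcoxon signed-rank statistic, which can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis code of the study: the function names and the example scores are hypothetical, zero differences are dropped (one common convention), and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def signed_rank_statistic(right, left):
    """Return (W_plus, W_minus) for paired right/left scores.

    Pairs with a zero difference are discarded before ranking.
    """
    diffs = [r - l for r, l in zip(right, left) if r != l]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(rank for d, rank in zip(diffs, ranks) if d > 0)
    w_minus = sum(rank for d, rank in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


if __name__ == "__main__":
    # Hypothetical BBB scores for the right and left hind paws of 5 rats.
    right = [10, 12, 9, 14, 11]
    left = [9, 12, 11, 11, 10]
    print(signed_rank_statistic(right, left))
```

The smaller of the two sums is the usual test statistic; a small value relative to its null distribution indicates a systematic difference between sides.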
The grouped right- and left-side measurements of the functional capacity evaluations and the interevaluator analysis revealed no difference between the results for rats subjected to mild injuries; the results were repeated and equivalent. In the moderate to severe injury cases, the re... (Table 2).

Table 1 - Evaluations of the right and left sides using Wilcoxon's test and evaluation of effective pairing using Spearman's one-tailed correlation test (alpha = 0.05).

Spearman's correlation coefficient was used to infer agreement among the measurements of locomotor capacity between pairs of evaluators. According to the intensity of the spinal cord injury, the evaluators observed excellent and good reproducibility for mild injuries, good and moderate reproducibility for moderate injuries, and only moderate reproducibility for severe injuries.
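Pairwise agreement of this kind can be sketched by computing Spearman's coefficient for every pair of evaluators. The sketch below is illustrative only: the function names, evaluator labels and scores are hypothetical, and Spearman's coefficient is obtained as the Pearson correlation of average ranks, which is undefined when an evaluator gives every rat the same score.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's coefficient as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def pairwise_agreement(scores_by_evaluator):
    """Map each pair of evaluator labels to their Spearman coefficient."""
    return {
        (a, b): spearman(scores_by_evaluator[a], scores_by_evaluator[b])
        for a, b in combinations(scores_by_evaluator, 2)
    }


if __name__ == "__main__":
    # Hypothetical BBB scores from three evaluators for the same five rats.
    scores = {"E1": [8, 10, 12, 14, 19], "E2": [9, 8, 13, 12, 20], "E3": [7, 11, 12, 15, 18]}
    for pair, rho in pairwise_agreement(scores).items():
        print(pair, round(rho, 3))
```

With six evaluators this yields 15 coefficients per injury group, which can then be summarized by the verbal categories used in the text.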

DISCUSSION
Many different experimental models have been developed to induce controlled and reproducible spinal cord injuries in animals, as have methods to evaluate their functional recovery 16 . Most studies seek to determine the pathophysiology of the spinal cord injury, that is, the effects of ischemia and anatomical pathological changes in the spinal cord; however, the lack of standardization of contusion mechanisms and the use of animals of different types, strains, sizes and ages can impair functional recovery evaluations and make it difficult to compare the results among evaluators. Reproducible, accurate and low-cost research leads to acceptance and diffusion of experimental models; however, currently used models have problems producing controlled, reproducible spinal cord injuries.
Basso et al. 14 introduced a scale to evaluate the functional recovery of locomotor capacity in rats after spinal cord contusion. This scale provides predictive measurements based on specific observational criteria concerning the movement of the animal and assigns sequential, cumulative scores corresponding to the criteria (motor qualities or functions).
Proof of significant functional recovery comes from better knowledge of the spinal cord injury pathophysiology, anabolic and catabolic regulation factors, tissue engineering and cell therapies. Thus, improving the knowledge of the current experimental methods and developing new, more effective models is of fundamental importance. Although presently there are no methods for direct, objective, accurate and effective evaluation, available methods must still be considered. The BBB scale is the most frequently used method of functional recovery evaluation because it is simple, easy to use and practical, having been adopted by the MASCIS and by many evaluators.
Despite its wide utilization, the BBB scale presents important discontinuities, different characteristics, and controversial issues regarding the best statistical method to be used. The scores obtained in the upper or lower ranges of the scale have distinct characteristics and do not permit accurate comparisons 6,7,17 . Since no gold standard exists for evaluating, either directly or indirectly, the efficacy of scales, combined methods have been proposed to improve the sensitivity and reproducibility of the BBB scale 7,18 . Here we proposed to combine the BBB scale with complementary methods in order to improve the efficiency and quantification of motor deficit 7 .
The controversy surrounding the validity of such a combination of methods relates to the verification of reproducibility. Statistical methods have been proposed as complementary procedures that can be implemented to increase the efficacy of the quantification of the BBB scale 17 . However, both parametric and non-parametric statistical tests have been used in conjunction with the BBB scale in different studies, which confirms the lack of homogeneity of the criteria and fuels the controversy over the most appropriate statistical method. Our analysis of the BBB scale was conducted in order to validate the results of this study, introduce the properties of the test and define the limits of its practical applicability. We evaluated the reproducibility (i.e., homogeneity) of the results by comparing the averages of the measurements made by the evaluators and the level of agreement (correlation) among them. Tests were performed to infer differences among the interevaluator measurement averages.
The measurements made using the BBB scale 14 were not considered interval measurements, nor did they present a normal distribution, which prevented the use of conventional parametric tests. The reduced reproducibility of the BBB scale in the evaluations of rats with moderate to severe injuries (i.e., results occurring in the lower score range of the BBB scale) was confirmed. Differences in the evaluations of rats with moderate to severe injuries and reduced locomotor capacities in comparison with the evaluations of rats with mild injuries and higher motor capacity were also confirmed. High (good and excellent) reproducibility, as well as satisfactory sensitivity of the BBB scale, was observed in rats with mild injuries.
The results shown herein are part of a general project of our service 19 . Proper planning in future studies should reduce the variability of the results and work in the highest range of the BBB scale by increasing, whenever possible, the number of rats and the care taken during the laminectomy and contusion in order to reduce errors associated with the method. We have shown that the BBB scale has excellent reproducibility and satisfactory sensitivity to evaluate the locomotor capacity in rats with mild spinal cord injury, in spite of the small sample size. It is therefore a good parameter for research in this injury range.

CONCLUSIONS
-The BBB scale has satisfactory sensitivity and high (good and excellent) reproducibility in rats with mild injuries.
-The BBB scale has satisfactory, albeit moderate, reproducibility and unsatisfactory sensitivity to moderate injuries.
-The BBB scale has reduced reproducibility and unsatisfactory sensitivity to severe injuries.