Motor skills assessments: support for a general motor factor for the Movement Assessment Battery for Children-2 and the Bruininks-Oseretsky Test of Motor Proficiency-2

OBJECTIVE
To evaluate the construct validity and model-based reliability of general and specific contributions of the subscales of the Movement Assessment Battery for Children-2 (MABC-2) and Bruininks-Oseretsky Test of Motor Proficiency-2 (BOT-2) when evaluating motor skills across a range of psychiatric disorders.


METHODS
Confirmatory factor analysis (CFA) and bifactor analysis were conducted on BOT-2 data from 187 elementary school students (grades 1 to 6) (mean age: 113 ± 20 months; boys: n = 117, 62.56%) and on MABC-2 data from 127 elementary school students (grade 1) (mean age: 76 ± 2 months; boys: n = 58, 45.67%).


RESULTS
The CFA supported a multidimensional structure for the BOT-2 but yielded poor fit indices for the MABC-2. For both tests, the bifactor model showed that the reliability of the subscales was poor.


CONCLUSIONS
The BOT-2 exhibited factorial validity with a multidimensional structure among the current samples, but the MABC-2 showed poor fit indices, insufficient to confirm its multidimensional structure. For both tests, most of the reliable variance came from a general motor factor (M-factor); therefore, the scoring and reporting of subscale scores were not justified for either test.


Introduction
Motor skills not only serve as the basis for sports and recreation but are also embedded in all activities of daily living. The identification of movement difficulties in children is crucial to understanding the biological basis of neurodevelopmental disorders, such as developmental coordination disorder 1 and neurological soft signs. 2 Recently, as signs of motor dysfunction are evidenced across a range of psychiatric disorders, especially schizophrenia, the Research Domain Criteria Initiative (R-DoC) proposed a domain of motor systems in an attempt to understand and explain the relations between motor circuits and the pathophysiology of psychiatric disorders. 3 Hence, motor skills assessment is fundamental for identifying and understanding the pathophysiology underlying neurodevelopmental and psychiatric disorders, for implementing early intervention and effective rehabilitation treatment plans, and for verifying the potential relationship between them.
While there is no gold standard to measure children's motor abilities, the Movement Assessment Battery for Children, Second Edition (MABC-2) 4 and the Bruininks-Oseretsky Test of Motor Proficiency, Second Edition (BOT-2) 5 are the tools most commonly used in both clinical and research settings. The former assesses three dimensions: manual dexterity, aiming and catching, and balance. 7-9 Although the MABC-2 has been assessed for validity in the Brazilian population, 10 construct validity was not examined.
The BOT-2 assesses four dimensions: fine manual control, manual coordination, body coordination, and strength and agility. It also comprises eight items/tasks. Although construct validity has not been assessed by external researchers, factorial validity is provided in the assessment manual, 5 with good fit statistics that provide validity evidence for the four motor areas (Table 6.10 in the manual).
The conceptual model underlying the items of both tools is multidimensional; however, being a multidimensional construct per se does not guarantee that each subscale is reliable (how well a latent variable is represented by a given set of items [i.e., the quality of its indicators]) or replicable across studies. 11 There is only one study in the area of motor assessment 7 that suggests a common "general motor ability" construct underlying the subscales. General motor ability comprises a general factor that underlies the subscales within a battery/test, which may influence performance on the subscales, cutting across all the items and capturing their shared content with a unifying concept, whereas the specific factors (subscales) account for response variation that is unique or particular to item subsets. 12 Formal procedures to evaluate the reliability and viability of the subscales in the presence of a general motor ability factor, which are conducted via bifactor modeling, have not previously been tested for the MABC-2 and BOT-2. Clinically, it is fundamental to determine whether the variance (i.e., information) captured by the motor subtests is reliable and viable when controlled for general motor ability, as such information has a direct effect on how motor assessment scores are computed (justification for the scoring and its reporting) and how subscales are interpreted. 11
The formal procedure that enables the investigation of the psychometric features of specific factors in the presence of a general factor is the bifactor model (also known as the nested factors/direct hierarchical/general-specific model). 13 Bifactor models are a type of specification of confirmatory factor models. 14 Regarding adequacy, bifactor models are less restrictive because they have more free parameters, 15 and consequently they tend to have better fit indices than other, more restrictive multidimensional solutions, such as correlated-factor models or second-order models. 11,13
Consequently, we aimed to answer the following questions: 1) Are the subscales for MABC-

Sample for the MABC-2
For the MABC-

Procedures
The BOT-2 is an objective instrument widely used in clinical and research settings to measure gross and fine motor functioning for individuals aged 4-21 years. 5 The BOT-2 provides scores in four domains of motor competence and a total motor composite score, which includes all four domains: 1) fine manual control: fine motor precision and integration; 2) manual coordination:
The MABC-2 provides scores in three domains of motor competence and a total motor score, which includes all three domains, named as follows (as described in the manual): 1) manual dexterity: posting coins with one's preferred hand, posting coins with one's non-preferred hand, threading beads, and drawing a trail; 2) aiming and catching: catching beanbag and throwing beanbag onto mat; and 3) balance: one-leg balance best leg, one-leg balance other leg, walking heels raised, and jumping on mats.
Scoring is based on the results of goal-directed activities and errors, where the raw score for each item is converted to a standard score for the items; then, the pairs of items that make up the three domains are converted to a standard score and percentile, and the sum of the domain standard scores is converted to a total motor performance score. All these scores were considered continuous variables.

Data analyses
All statistical analyses were conducted with Mplus 7.4. 18 To verify the dimensional solution of the BOT-2 and MABC-2, a confirmatory factor analysis (CFA) was conducted. Robust maximum-likelihood estimation was used. 18 The following fit indices were used to evaluate the model fit for CFA when all observed variables were continuous: chi-square, comparative fit index (CFI), Tucker-Lewis index (TLI), and root mean square error of approximation (RMSEA). To demonstrate a good fit to the data, an estimated model should have an RMSEA near or below 0.06 and a CFI and TLI near or above 0.95. 19
The viability and reliability of BOT-2 and MABC-2 subscales were evaluated using a bifactor model. 14,20 Finally, the bifactor model also provides output that allows a more direct interpretation of factor loadings for the M-factor vs. the specific factors. 13,17
The following indices were used to assess the viability of the BOT-2 and MABC-2 subscales: a) coefficient omega (ω), 14,22,23 which is a reliability estimate based on a factorial model that estimates the proportion of the observed variance in the total score attributable to all sources of common variance; b) coefficient omega hierarchical (ωh), 20,24 which is a reliability index that judges the degree to which composite scale scores are interpretable as a measure of a single common factor (the coefficient omega hierarchical is computed by dividing the squared sum of the factor loadings on the general factor [model estimated] by the variance of total scores); c) coefficient omega subscale (ωs), 20,25 which is the percentage of the subscale score variance attributable to a specific group factor of items after removing the reliable variance due to the general factor, i.e., an index reflecting the reliability of a subscale score after controlling for the variance due to the general factor; and d) explained common variance (ECV), which is the percentage of common variance explained by the general factor, i.e., a type of unidimensionality index directly related to the relative strength of the general factor. It can be defined as the ratio between the variance explained by the general factor and the variance explained by the general and specific factors. Details of these calculations can be found elsewhere. 20,24
Coefficient omega (ω), coefficient omega hierarchical (ωh), and coefficient omega subscale (ωs) scores > 0.8 indicate a strong relationship between the latent variable and item scores. An ECV > 0.70 indicates that the instruments should be treated as essentially unidimensional; correspondingly, single common factors were specified. 24
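The indices defined above can be illustrated with a minimal sketch. It assumes standardized loadings from an orthogonal bifactor solution, so each item's unique variance is one minus its squared loadings. The loadings, the subscale names "A" and "B", and the function name `bifactor_indices` are hypothetical and are not taken from the study or from Mplus output.

```python
from collections import defaultdict

# Hypothetical standardized loadings (NOT taken from the study):
# (subscale, loading on the general factor, loading on the specific factor)
ITEMS = [
    ("A", 0.7, 0.3), ("A", 0.7, 0.3), ("A", 0.7, 0.3),
    ("B", 0.7, 0.3), ("B", 0.7, 0.3), ("B", 0.7, 0.3),
]


def bifactor_indices(items):
    """Compute omega, omega hierarchical, omega subscale, and ECV."""
    groups = defaultdict(list)
    for name, g, s in items:
        groups[name].append((g, s))

    # Variance of the unit-weighted total score, split by source.
    general_var = sum(g for _, g, _ in items) ** 2
    specific_var = sum(sum(s for _, s in rows) ** 2 for rows in groups.values())
    unique_var = sum(1.0 - g * g - s * s for _, g, s in items)
    total_var = general_var + specific_var + unique_var

    # Omega subscale: specific-factor share of each subscale score variance.
    omega_s = {}
    for name, rows in groups.items():
        g_part = sum(g for g, _ in rows) ** 2
        s_part = sum(s for _, s in rows) ** 2
        u_part = sum(1.0 - g * g - s * s for g, s in rows)
        omega_s[name] = s_part / (g_part + s_part + u_part)

    # ECV: general-factor share of the common (explained) variance.
    sum_g_sq = sum(g * g for _, g, _ in items)
    sum_s_sq = sum(s * s for _, _, s in items)

    return {
        "omega": (general_var + specific_var) / total_var,
        "omega_h": general_var / total_var,
        "omega_s": omega_s,
        "ecv": sum_g_sq / (sum_g_sq + sum_s_sq),
    }


result = bifactor_indices(ITEMS)
print(result)
```

With these made-up loadings, ωh is above 0.8 and ECV is above 0.70, while ωs is 0.125 for each subscale. This is the kind of pattern under which a total score is interpretable but subscale scores carry little reliable variance of their own.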

Results
Descriptive results for the BOT-2 and MABC-2 are available in the online-only supplementary material, Tables S1 and S2. The complete data and the computing code (outputs) can be requested from the corresponding author.

Bruininks-Oseretsky Test of Motor Proficiency, Second Edition (BOT-2)
For the BOT-2, excellent fit indices were found using the CFA (Figure 1).
As observed in Figure 3, four MABC-2 tasks (drawing trail, throwing beanbag onto mat, walking heels raised, and jumping on mats) exhibited factor loadings below 0.3. According to the literature, 26 for an item to remain in the model, its factor loading should be greater than 0.3. Despite this, all tasks were kept in the model because, in addition to representing the original model, one of the factors (m2, aiming and catching) was already at the limit of the minimum number of items (two items) to be considered a latent trait.
Four tasks presented low factor loadings and needed to be excluded; therefore, there were fewer than three dimensions (subtests).
Regarding the results of the bifactor analysis for both the BOT-2 and MABC-2 models, we observed that the viability (ability to sustain the scores of the subscales after variation due to the M-factor) and reliability (the quality of the indicators) of the subscales were poor, and almost all the reliable variance in total scores could be attributed to the M-factor. This factor may reflect individual differences in motor performance.
The M-factor is robustly reliable even though it is a multidimensional construct, and the specific subdomains displayed weak viability beyond the M-factor, indicating that the use of the subscales for BOT-2 and MABC-2 is not appropriate for estimation of motor skills; rather, the raw sum of item scores should be considered as an outcome measure. These results are consistent with the findings of the bifactor model applied to other areas of child evaluation and different scales assessing various aspects of psychopathology and personality. 20,24,28,29
From a clinical perspective, these results suggest that the motor performance assessed by BOT-2 or MABC-2 reflects a general latent trait (M-factor); therefore, the reporting and interpretation of these tests should be restricted to the total motor composite for BOT-2 and the total score for the MABC-2. Moreover, the subtests should not be analyzed separately, since the values of the omegas of the subscales showed little reliable variation beyond the M-factor, limiting the use of the subtest scores as precise indicators of unique constructs. 14,20 Thus, in accordance with what has been suggested by the R-DoC, 3 when using these two instruments (BOT-2 and MABC-2) to verify the relationship between motor skills and pathophysiology in psychiatric and neurodevelopmental disorders, the most robust and precise way to verify and report motor performance is through the total score.

Limitations of the study
This research was conducted only with Brazilian students and did not address all the age groups proposed by BOT-2 and MABC-2, nor was the sample representative. Therefore, it is necessary to reproduce the analyses in other samples and age groups. However, the samples were robust, given the number of items in the tests evaluated, and the poor reliability and viability results found for the subscales are in accordance with recent and diverse studies using bifactor models across various areas of knowledge, including psychiatry and psychology. 20,24,28,29 Therefore, the results regarding the poor viability and reliability of the subscales are independent of the sampling.

Conclusions
Psychometric features of multidimensional constructs, such as whether subscales are viable (capable of being sustained in the model), are evaluated through indices derived from a bifactor model. These indices are useful to describe (a) the quality of unit-weighted total and subscale score composites, and (b) the specification and quality of a measurement model in structural equation modeling. 11

20; 10.7%) and learning disorders (dyslexia: n = 20, 10.7%; language-based learning disability: n = 20, 10.7%) were included in this sample. All students were assessed with the full version of the BOT-2 in single 50 to 60-minute sessions, applied by a trained occupational therapist, in a classroom or courtyard provided by the school or in the attendance rooms of the Center of Education and Health Studies (Centro de Estudos em Educação e Saúde, CEES), Universidade Estadual Paulista (UNESP), Marília, and at the Outpatient Clinic of Child Neurology - Learning Disorders, at Hospital das Clínicas da Faculdade de Medicina de Botucatu, UNESP. A detailed description of the diagnostic criteria used for the sample can be found in the online-only supplementary material, Appendix 1.
manual dexterity and upper-limb coordination; 3) body coordination: bilateral coordination and balance; and 4) strength and agility: running speed and agility, and strength. Scoring is based on the results of goal-directed activities, where the total score in each item is converted to a scale score for each item; then, the pairs of items that form the domains are converted to a standard score, and the sum of the domain standard scores is converted to a total composite score. All these scores were considered continuous variables.
The MABC-2 is also an objective instrument widely used in clinical and research settings to measure gross and fine motor skills with normative data for three age bands. For this study, only age band 1 (3 years to 6 years and 11 months) was used given the participants' ages.
The BOT-2 model consisted of four specific factors, while the MABC-2 model consisted of three specific factors. In contrast to CFA, where all factors are free to correlate with each other, the bifactor model should be an orthogonal model, where the relationships among the specific factors themselves and with the general factor are fixed to zero (i.e., no correlation between the factors). 21 Bifactor models have several potential advantages when compared to other forms of specifying a confirmatory model, particularly when researchers are interested in the predictive relationships between domain-specific factors over and above the general factor. A bifactor model can be used as a less restricted baseline model. It can be used to study the role of domain-specific factors that are independent of the general factor and to directly examine the strength of the relationship between the domain-specific factors and their associated items, because the relationship is reflected in the factor loadings. Bifactor models can be particularly useful in testing whether a subset of the domain-specific factors predicts external variables, over and above the general factor, since the domain-specific factors are directly represented as independent factors.
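The orthogonal specification described above can be written compactly. The following is a sketch in standard factor-analytic notation; the symbols (item score x_i, general factor G, specific factor S_k, loadings λ, unique variance θ_i) are assumptions introduced for illustration and do not appear in the original text.

```latex
% Item i belongs to subscale k; factors are standardized and mutually orthogonal.
\begin{align}
  x_i &= \lambda_{iG}\, G + \lambda_{ik}\, S_k + \varepsilon_i, \\
  \operatorname{Var}(G) &= \operatorname{Var}(S_k) = 1, \qquad
  \operatorname{Cov}(G, S_k) = 0, \qquad
  \operatorname{Cov}(S_k, S_{k'}) = 0 \quad (k \neq k'), \\
  \operatorname{Var}(x_i) &= \lambda_{iG}^{2} + \lambda_{ik}^{2} + \theta_i, \\
  \operatorname{Cov}(x_i, x_j) &=
  \begin{cases}
    \lambda_{iG}\lambda_{jG} + \lambda_{ik}\lambda_{jk} & \text{if } i \text{ and } j \text{ share subscale } k, \\
    \lambda_{iG}\lambda_{jG} & \text{otherwise.}
  \end{cases}
\end{align}
```

Under this specification, items from different subscales covary only through the general factor, which is why the loadings on each specific factor can be read as what that subscale contributes beyond the general factor.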

Figure 2 shows the bifactor model for the BOT-2 with its respective standardized factor loadings and standard errors. It is important to note that the values of the factor loadings are considerably lower than in the models shown in Figure 1.

Following the same procedure used for the BOT-2, a bifactor model (a general factor and three specific factors) was derived from the three factors of the MABC-2. Although the previous model did not exhibit any good fit indices, the new model specification resulted in excellent fit indices: χ²(26) = 25.560, p = 0.4875; CFI = 1.000; TLI = 1.004; RMSEA = 0.000 (90%CI = 0.000 to 0.069). As evident in Figure 4, there was a reduction in the values of the standard factor loadings of specific
The BOT-2 exhibited factorial validity with a multidimensional structure in the current samples, but the MABC-2 presented poor fit indices, insufficient to confirm its multidimensional structure. Results of the bifactor model revealed that most of the reliable variance was derived from a general M-factor. Therefore, use of the BOT-2 and MABC-2 subscales is not supported in clinical or research settings. Even though our data are not representative of the entire country, this study is the first to use bifactor models with the BOT-2 and MABC-2, and our findings offer new insights regarding the use and interpretation of the assessment instruments most widely used with children.